HTML parser release notes
2.2.1
Enhancements
- Performance improvements
2.2.0
Enhancements
- Introduced “at” filter support in BasicElementFilter so matching rules can easily include or discard elements based on their position inside their parent node.
- Introduced “underHeaderAtRow” filter support in BasicElementFilter to match values at a given row under arbitrary header cells in any position of a table.
- Added no-varargs method alternatives in BasicElementFilter, including:
  - allowing the user to provide a space-separated sequence of CSS class names in the classes method
  - providing multiple possible attribute values of any type in the attribute method
- Added support for filtering rows of an entity via the newly introduced interface HtmlRecordFilter.
- Parser ignores duplicate paths that might have been accidentally defined for the same field.
- Log messages now display group rules applied to any given matching rule
Bug fixes
- Fixed handling of matching rules applied over a sequence of sibling elements
- @Validate annotation is ignored in some cases
- Getting a list of values matched directly from an HtmlElement may return nulls along with the desired values
- Persistent fields should have their values cleared if they are defined from a group and it becomes active.
2.1.1
Enhancements
- Made the license manager dialog appear automatically if a license can’t be found or if it is invalid.
- Added support for regex validation and custom validations on class attributes and methods annotated with @Validate
Bug fixes
- Fixed inconsistent result rows produced by link follower fields added in between the parent entity fields.
- Fixed issues parsing stored files that originated from link followers configured to save under a non-standard file location.
2.1.0
Enhancements
- Implemented support for the file:// protocol to allow transforming locally stored pages and resources via fetchResources
- Added support for the @Validate annotation on annotated Java beans.
- Added support for including fields from the “parent” row into linked entity records: github issue #3
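
For the file:// support listed above, a locally stored page has to be addressed by a file URL. The following is a minimal sketch, using only the JDK, of building such a URL from a filesystem path. The class name, the helper method and the example path are hypothetical, and how the URL is then passed to the parser is deliberately not shown because these notes do not describe it.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalPageUrl {

    // Converts a filesystem path (relative or absolute) into an absolute file: URL string.
    static String toFileUrl(String path) {
        Path absolute = Paths.get(path).toAbsolutePath().normalize();
        return absolute.toUri().toString();
    }

    public static void main(String[] args) {
        // "pages/index.html" is a hypothetical location of a previously saved page.
        System.out.println(toFileUrl("pages/index.html"));
    }
}
```

Going through Path.toUri() rather than concatenating "file://" by hand lets the JDK handle platform-specific separators and character escaping.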
Bug fixes
- Fetch resources will alter CSS files already downloaded in a previous run, which can potentially break the resource paths used in them.
- Fetch resources does not create daemon threads and keeps the main thread alive if users don’t explicitly shut down the executor service.
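
To illustrate why non-daemon worker threads keep the JVM alive, here is a minimal JDK-only sketch of a thread pool whose workers are daemon threads. It is not the parser's actual implementation; the class and method names are hypothetical and only standard java.util.concurrent calls are used.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class DaemonExecutorSketch {

    // Creates a fixed-size pool whose worker threads are daemon threads,
    // so idle workers do not prevent the JVM from exiting.
    static ExecutorService newDaemonPool(int threads) {
        ThreadFactory factory = runnable -> {
            Thread thread = new Thread(runnable);
            thread.setDaemon(true);
            return thread;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }

    // Runs one task on the pool and reports whether it executed on a daemon thread.
    static boolean runsOnDaemonThread() {
        ExecutorService pool = newDaemonPool(2);
        try {
            Callable<Boolean> task = () -> Thread.currentThread().isDaemon();
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runsOnDaemonThread());
    }
}
```

With the default thread factory the workers are non-daemon, which is the situation the bug above describes: the process stays alive until the pool is shut down explicitly.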
2.0.1
Bug fixes
- Group constants not applied when declared last: github issue #1
- Parser won’t let the JVM shut down without explicitly calling HtmlParserSettings.getExecutorService().shutDown(): github issue #2
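
For readers on an affected version, the explicit shutdown mentioned above can be sketched as follows. This assumes the object returned by HtmlParserSettings.getExecutorService() is a standard java.util.concurrent.ExecutorService, which these notes do not confirm; on that JDK interface the method is spelled shutdown(). The class and helper names are hypothetical, and a plain JDK executor stands in for the parser's own.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ExecutorShutdownSketch {

    // Stops accepting new tasks, waits for running ones to finish,
    // and forces termination if the timeout elapses first.
    static boolean shutdownGracefully(ExecutorService executor, long timeoutSeconds) {
        executor.shutdown();
        try {
            if (!executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return executor.isShutdown();
    }

    public static void main(String[] args) {
        // Stand-in executor; in real use this would be the parser's executor service (assumption).
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> System.out.println("work done"));
        System.out.println(shutdownGracefully(executor, 5));
    }
}
```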
2.0.0
- First public release