
Releases: kermitt2/grobid

Version 0.5.5

28 May 20:55
  • Using pdfalto instead of pdf2xml for the first PDF parsing stage, with many improvements in robustness, ICU support, and unknown glyph/font normalization
  • Improvement and full review of the integration of consolidation services, supporting biblio-glutton (additional identifiers and Open Access links) and the Crossref REST API (adding a specific user agent, email and token for Crossref Metadata Plus)
  • Fix bounding box issues for some PDFs #330
  • Updated lexicon #396
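
As an illustration of the Crossref item above, the sketch below builds the HTTP headers a client would typically send to the Crossref REST API: a descriptive user agent carrying a contact email, plus an optional Metadata Plus token. This is not GROBID's own code. The function name and parameters are hypothetical, and the header conventions (a `mailto:` entry in `User-Agent`, a `Crossref-Plus-API-Token: Bearer ...` header) follow Crossref's public guidance as understood here.

```python
from typing import Dict, Optional


def crossref_headers(app_name: str, app_version: str, mailto: str,
                     token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for the Crossref REST API (hypothetical helper)."""
    # A user agent with a contact email identifies the caller to Crossref.
    headers = {"User-Agent": f"{app_name}/{app_version} (mailto:{mailto})"}
    # Assumption: Metadata Plus subscribers pass their token as a Bearer value.
    if token:
        headers["Crossref-Plus-API-Token"] = f"Bearer {token}"
    return headers


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(crossref_headers("my-client", "0.1", "me@example.org", token="SECRET"))
```

The token is left optional so the same helper covers both the public API and Metadata Plus.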

Version 0.5.4

12 Feb 19:23

Changes:

  • transparent usage of DeLFT deep learning models (usual BidLSTM-CRF) instead of Wapiti CRF models, with native integration via JEP

  • support of biblio-glutton as DOI/metadata matching service, as an alternative to the Crossref REST API

  • improvement of citation context identification and matching (+9% recall with similar precision on the PMC sample of 1943 articles, from 43.35 to 49.98 correct citation contexts per article)

  • citation callouts are now also identified in abstracts, figure captions and table captions

  • structured abstract (including update of TEI schema)

  • bug fixes and some more parameters: training uses all available threads by default, and models can be loaded at service start-up

Version 0.5.3

10 Dec 00:06

Changes:

  • Improvement of consolidation options and processing (better handling of CrossRef API, but the best is coming soon ;)
  • Better recall for figure and table identification (thanks to @detonator413)
  • Support of proxy for calling Crossref with Apache HttpClient
  • Minor bug fixes

Version 0.5.2

17 Oct 16:12

Changes:

  • Restored the correct status code from the REST API when no engine is available (503 is returned again to tell the client to wait; it was removed by mistake in versions 0.5.0 and 0.5.1 for the PDF processing services only, see the documentation of the REST API)
  • Added metrics in the REST entrypoint (accessible via http://localhost:8071)
  • Added Grobid clients for Java, Python and NodeJS
  • Added counters for consolidation tasks and consolidation results
  • Added a case-sensitivity option in lexicon/FastMatcher
  • Updated documentation
  • Bug fixes: #339, #322, #300, and others
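
Since a 503 from the REST API tells the client to wait, a client can simply retry after a pause. The following is a minimal sketch of that pattern, not the logic of the official Grobid clients. The `send` callable, the attempt limit and the wait time are all assumptions chosen for illustration; `send` stands in for whatever performs the actual HTTP request and returns a `(status, body)` pair.

```python
import time
from typing import Callable, Tuple


def call_with_retry(send: Callable[[], Tuple[int, str]],
                    max_attempts: int = 5,
                    wait_seconds: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it returns something other than 503, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = 503, ""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            # Success or a non-retryable error: hand it back to the caller.
            return status, body
        if attempt < max_attempts - 1:
            # Service busy: wait before trying again.
            sleep(wait_seconds)
    # Still busy after the last attempt: return the final 503 response.
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.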

Version 0.5.1 of GROBID

29 Jan 07:48

Version 0.5.0 of GROBID

09 Nov 18:11

The latest stable release of GROBID is version 0.5.0. Compared to the previous version 0.4.3, this version brings:

  • Migration from Maven to Gradle for faster, more flexible and more stable builds, releases, etc.
  • Usage of Dropwizard for web services
  • Move the Grobid service manual to readthedocs
  • (thanks to @detonator413 and @lfoppiano for this release! future work in versions 0.5.* will focus again on improving PDF parsing and structuring accuracy)

Version 0.4.4 of GROBID

13 Oct 14:53

Fixed an issue that prevented the release build from working

Version 0.4.3 of GROBID

07 Oct 00:55

The latest stable release of GROBID is version 0.4.3. Compared to the previous version 0.4.2, this version brings:

  • New models: f-score improvement on the PubMed Central sample, bibliographical references +2.5%, header +7%
  • New training data and features for bibliographical references, in particular for covering the HEP domain (INSPIRE), arXiv identifiers, DOIs and URLs (thanks @iorala and @michamos!)
  • Support for CrossRef REST API (instead of the slow OpenURL-style API which requires a CrossRef account), in particular for multithreading usage (thanks @Vi-dot)
  • Improve training data generation and documentation (thanks @jfix)
  • Unicode normalisation and more robust body extraction (thanks @aoboturov)
  • Fixes, tests, documentation and update of the pdf2xml fork for Windows (thanks @lfoppiano)
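
To give an idea of what Unicode normalisation does to text extracted from PDFs, here is a small sketch using Python's standard library. The release note does not say which normalisation form or rules GROBID applies, so the choice of NFKC and the function name are assumptions made purely for illustration.

```python
import unicodedata


def normalise_text(text: str) -> str:
    """Apply NFKC normalisation (an assumed form, not necessarily GROBID's)."""
    # NFKC folds compatibility characters such as ligatures into plain letters
    # and composes base letters with their combining marks.
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # "\ufb01" is the single-character "fi" ligature often found in PDF text.
    print(normalise_text("\ufb01nal"))
    # "e" followed by a combining acute accent becomes one composed character.
    print(normalise_text("e\u0301") == "\u00e9")
```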

Version 0.4.2 of GROBID

05 Aug 18:58


Version 0.4.1 of GROBID

02 Oct 02:00
grobid-parent-0.4.1

[maven-release-plugin]  copy for tag grobid-parent-0.4.1