Christoph Auer
a8c6b29a67
feat: Upgrade docling-parse PDF backend and interface to use page-by-page parsing ( #44 )
...
* Use docling-parse page-by-page
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Propagate document_hash to PDF backends, use docling-parse 1.0.0
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Upgrade lockfile
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* repin after more packages on pypi
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-22 13:49:37 +02:00
Michele Dolfi
78347bf679
feat: allow computing page images on-demand with scale and cache them ( #36 )
...
* feat: allow computing page images on-demand and cache them
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* feat: expose scale for export of page images and document elements
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix comment
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-20 13:27:19 +02:00
Michele Dolfi
90dd676422
feat: update parser with bytesio interface and set as new default backend ( #32 )
...
* update parser with bytesio interface
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* change default backend
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update DEFAULT_BACKEND
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-14 12:30:00 +02:00
Panos Vagenas
d2d9543415
fix: set page number using 1-based indexing ( #22 )
...
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-07-31 14:28:44 +02:00
Panos Vagenas
eb0b208272
chore: switch to docling-core Markdown export ( #14 )
...
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-07-18 16:10:05 +02:00
Michele Dolfi
d1d1724537
fix: missing type for default values ( #12 )
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-07-17 15:54:43 +02:00
Christoph Auer
e2d996753b
Initial commit
2024-07-15 09:42:42 +02:00