Commit Graph

4 Commits

Author SHA1 Message Date
Christoph Auer
c253dd743a
Add redbooks to test data, small additions (#35)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2024-08-20 12:36:00 +02:00
Michele Dolfi
90dd676422
feat: update parser with bytesio interface and set as new default backend (#32)
* update parser with bytesio interface

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* change default backend

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update DEFAULT_BACKEND

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-14 12:30:00 +02:00
Michele Dolfi
794b20a50a
fix: type of path_or_stream in PdfDocumentBackend (#28)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-07 17:20:44 +02:00
Maxim Lysak
b8f5e38a8c
feat: introducing docling_backend (#26)
Uses our own docling_parse to reliably get PDF cells
To get page images, this backend uses pypdfium2

Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
Co-authored-by: Maxim Lysak <mly@zurich.ibm.com>
2024-08-07 16:22:36 +02:00