Christoph Auer
c253dd743a
Add redbooks to test data, small additions (#35)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2024-08-20 12:36:00 +02:00
Michele Dolfi
90dd676422
feat: update parser with bytesio interface and set as new default backend (#32)
* update parser with bytesio interface
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* change default backend
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update DEFAULT_BACKEND
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-14 12:30:00 +02:00
Michele Dolfi
794b20a50a
fix: type of path_or_stream in PdfDocumentBackend (#28)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2024-08-07 17:20:44 +02:00
Maxim Lysak
b8f5e38a8c
feat: introducing docling_backend (#26)
Uses our own docling_parse to reliably get PDF cells.
To get page images, this backend uses pypdfium2.
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
Co-authored-by: Maxim Lysak <mly@zurich.ibm.com>
2024-08-07 16:22:36 +02:00