Christoph Auer
2a2c65bf4f
feat: Add pipeline timings and toggle visualization, establish debug settings ( #183 )
...
* Add settings to turn visualization on or off
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Add profiling code to all models
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Refactor and fix profiling codes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Visualization codes output PNG to debug dir
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Fixes for time logging
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Optimize imports
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Update lockfile
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Add start_timestamps to ProfilingItem
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2024-10-30 15:04:19 +01:00
Christoph Auer
a00c937e19
Ensure all models work only on valid pages ( #158 )
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2024-10-18 08:54:06 +02:00
Christoph Auer
7d3be0edeb
feat!: Docling v2 ( #117 )
...
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com >
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com >
Co-authored-by: Maxim Lysak <mly@zurich.ibm.com >
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com >
Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com >
2024-10-16 21:02:03 +02:00
Michele Dolfi
f96ea86a00
feat: add options for choosing OCR engines ( #118 )
...
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com >
Signed-off-by: Peter Staar <taa@zurich.ibm.com >
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com >
Co-authored-by: Peter Staar <taa@zurich.ibm.com >
2024-10-08 19:07:08 +02:00
Maxim Lysak
2422f706a1
feat: new torch-based docling models ( #120 )
...
---------
Signed-off-by: Maxim Lysak <mly@zurich.ibm.com >
Co-authored-by: Maxim Lysak <mly@zurich.ibm.com >
2024-10-03 18:42:33 +02:00
Christoph Auer
d6df76f90b
feat: Support tableformer model choice ( #90 )
...
* Support tableformer model choice
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Update datamodel structure
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Update docs
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Cleanup
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Add test unit for table options
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Ensure import backwards-compatibility for PipelineOptions
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Update README
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Adjust parameters on custom_convert
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com >
* Update Dockerfile
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com >
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com >
2024-09-26 21:37:08 +02:00
Michele Dolfi
8aa476ccd3
test: improve typing definitions (part 1) ( #72 )
...
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
2024-09-12 15:56:29 +02:00
Christoph Auer
e94d317c02
feat: Add adaptive OCR, factor out treatment of OCR areas and cell filtering ( #38 )
...
* Introduce adaptive OCR
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
* Factor out BaseOcrModel, add docling-parse backend tests, fixes
* Make easyocr default dep
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
2024-08-20 15:28:03 +02:00
Christoph Auer
e9526bb11e
feat: Optimize table extraction quality, add configuration options ( #11 )
...
Signed-off-by: Christoph Auer <cau@zurich.ibm.com >
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com >
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com >
Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com >
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com >
Co-authored-by: Christoph Auer <cau@zurich.ibm.com >
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com >
Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com >
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-07-17 16:13:21 +02:00
Christoph Auer
e2d996753b
Initial commit
2024-07-15 09:42:42 +02:00