Docling/docling/models at eb97357b0560b59c14a8be3fb52d6a1362ad0a1d

NeoAnd/Docling

Christoph Auer eb97357b05 feat: Use new TableFormer model weights and default to accurate model version (#1100)

* feat: New tableformer model weights [WIP]

Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>

* Updated TF version

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Updated tests, after merging with Main, Switched to Accurate TF model by default

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>

2025-03-11 10:53:49 +01:00

__init__.py

Initial commit

2024-07-15 09:42:42 +02:00

base_model.py

feat: Describe pictures using vision models (#259)

2025-02-07 16:30:42 +01:00

base_ocr_model.py

fix: Improve OCR results, stricten criteria before dropping bitmap areas (#719)

2025-01-10 10:38:49 +01:00

code_formula_model.py

perf: New revision code formula model and document picture classifier (#1140)

2025-03-11 10:15:28 +01:00

document_picture_classifier.py

perf: New revision code formula model and document picture classifier (#1140)

2025-03-11 10:15:28 +01:00

easyocr_model.py

fix: remove unused httpx (#919)

2025-02-07 17:51:31 +01:00

hf_vlm_model.py

feat: [Experimental] Introduce VLM pipeline using HF AutoModelForVision2Seq, featuring SmolDocling model (#1054)

2025-02-26 14:43:26 +01:00

layout_model.py

refactor: use org--name in artifacts-path (#912)

2025-02-07 13:58:05 +01:00

ocr_mac_model.py

feat: add support for ocrmac OCR engine on macOS (#276)

2024-11-20 12:51:19 +01:00

page_assemble_model.py

feat: Implement new reading-order model (#916)

2025-02-20 17:51:17 +01:00

page_preprocessing_model.py

feat: Add pipeline timings and toggle visualization, establish debug settings (#183)

2024-10-30 15:04:19 +01:00

picture_description_api_model.py

feat: Introduce the enable_remote_services option to allow remote connections while processing (#941)

2025-02-12 15:18:01 +01:00

picture_description_base_model.py

feat: Describe pictures using vision models (#259)

2025-02-07 16:30:42 +01:00

picture_description_vlm_model.py

fix: vlm using artifacts path (#1057)

2025-02-26 08:33:50 +01:00

rapid_ocr_model.py

feat(ocr): expose rec_keys_path in RapidOcrOptions to support custom dictionaries (#786)

2025-01-27 13:38:15 +01:00

readingorder_model.py

feat: Implement new reading-order model (#916)

2025-02-20 17:51:17 +01:00

table_structure_model.py

feat: Use new TableFormer model weights and default to accurate model version (#1100)

2025-03-11 10:53:49 +01:00

tesseract_ocr_cli_model.py

fix: Runtime error when Pandas Series is not always of string type (#1024)

2025-02-20 15:41:41 +01:00

tesseract_ocr_model.py

fix: Fix the initialization of the TesseractOcrModel (#935)

2025-02-11 12:27:12 +01:00