Docling/docling/models
Christoph Auer eb97357b05
feat: Use new TableFormer model weights and default to accurate model version (#1100)
* feat: New tableformer model weights [WIP]

Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>

* Updated TF version

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Updated tests, after merging with Main, Switched to Accurate TF model by default

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
2025-03-11 10:53:49 +01:00
__init__.py Initial commit 2024-07-15 09:42:42 +02:00
base_model.py feat: Describe pictures using vision models (#259) 2025-02-07 16:30:42 +01:00
base_ocr_model.py fix: Improve OCR results, stricten criteria before dropping bitmap areas (#719) 2025-01-10 10:38:49 +01:00
code_formula_model.py perf: New revision code formula model and document picture classifier (#1140) 2025-03-11 10:15:28 +01:00
document_picture_classifier.py perf: New revision code formula model and document picture classifier (#1140) 2025-03-11 10:15:28 +01:00
easyocr_model.py fix: remove unused httpx (#919) 2025-02-07 17:51:31 +01:00
hf_vlm_model.py feat: [Experimental] Introduce VLM pipeline using HF AutoModelForVision2Seq, featuring SmolDocling model (#1054) 2025-02-26 14:43:26 +01:00
layout_model.py refactor: use org--name in artifacts-path (#912) 2025-02-07 13:58:05 +01:00
ocr_mac_model.py feat: add support for ocrmac OCR engine on macOS (#276) 2024-11-20 12:51:19 +01:00
page_assemble_model.py feat: Implement new reading-order model (#916) 2025-02-20 17:51:17 +01:00
page_preprocessing_model.py feat: Add pipeline timings and toggle visualization, establish debug settings (#183) 2024-10-30 15:04:19 +01:00
picture_description_api_model.py feat: Introduce the enable_remote_services option to allow remote connections while processing (#941) 2025-02-12 15:18:01 +01:00
picture_description_base_model.py feat: Describe pictures using vision models (#259) 2025-02-07 16:30:42 +01:00
picture_description_vlm_model.py fix: vlm using artifacts path (#1057) 2025-02-26 08:33:50 +01:00
rapid_ocr_model.py feat(ocr): expose rec_keys_path in RapidOcrOptions to support custom dictionaries (#786) 2025-01-27 13:38:15 +01:00
readingorder_model.py feat: Implement new reading-order model (#916) 2025-02-20 17:51:17 +01:00
table_structure_model.py feat: Use new TableFormer model weights and default to accurate model version (#1100) 2025-03-11 10:53:49 +01:00
tesseract_ocr_cli_model.py fix: Runtime error when Pandas Series is not always of string type (#1024) 2025-02-20 15:41:41 +01:00
tesseract_ocr_model.py fix: Fix the initialization of the TesseractOcrModel (#935) 2025-02-11 12:27:12 +01:00