Docling/docling/models at 56a0e104f76c5ba30ac0fcd247be61f911b560c1

NeoAnd/Docling

Christoph Auer 56a0e104f7 feat: Integrate ListItemMarkerProcessor into document assembly (#1825)

* Integrate ListItemMarkerProcessor into document assembly

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update to final version

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update all test cases

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Upgrade deps

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

2025-07-01 10:04:58 +02:00

ci: add coverage and ruff (#1383)

2025-04-14 18:01:26 +02:00

feat: add factory for ocr engines via plugins (#1010)

2025-03-18 13:58:05 +01:00

feat: new vlm-models support (#1570)

2025-06-02 17:01:06 +02:00

vlm_models_inline

feat: Maximum image size for Vlm models (#1802)

2025-06-18 12:57:37 +02:00

__init__.py

Initial commit

2024-07-15 09:42:42 +02:00

api_vlm_model.py

feat: Maximum image size for Vlm models (#1802)

2025-06-18 12:57:37 +02:00

base_model.py

fix: formula conversion with page_range param set (#1791)

2025-06-17 13:58:45 +02:00

base_ocr_model.py

feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)

2025-06-13 19:01:55 +02:00

code_formula_model.py

feat: new vlm-models support (#1570)

2025-06-02 17:01:06 +02:00

document_picture_classifier.py

feat: new vlm-models support (#1570)

2025-06-02 17:01:06 +02:00

easyocr_model.py

feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)

2025-06-13 19:01:55 +02:00

layout_model.py

feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)

2025-06-13 19:01:55 +02:00

ocr_mac_model.py

feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)

2025-06-13 19:01:55 +02:00

page_assemble_model.py

feat: Establish confidence estimation for document and pages (#1313)

2025-05-21 12:32:49 +02:00

page_preprocessing_model.py

feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)

2025-06-13 19:01:55 +02:00

picture_description_api_model.py

feat: new vlm-models support (#1570)

2025-06-02 17:01:06 +02:00

picture_description_base_model.py

feat: new vlm-models support (#1570)

2025-06-02 17:01:06 +02:00

picture_description_vlm_model.py

feat: new vlm-models support (#1570)

2025-06-02 17:01:06 +02:00

rapid_ocr_model.py

feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)

2025-06-13 19:01:55 +02:00

readingorder_model.py

feat: Integrate ListItemMarkerProcessor into document assembly (#1825)

2025-07-01 10:04:58 +02:00

table_structure_model.py

feat: new vlm-models support (#1570)

2025-06-02 17:01:06 +02:00

tesseract_ocr_cli_model.py

feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)

2025-06-13 19:01:55 +02:00

tesseract_ocr_model.py

fix: Ensure that TesseractOcrModel does not crash in case OSD is not installed (#1866)

2025-06-30 10:55:56 +02:00