Docling

Christoph Auer ec6cf6f7e8 feat: Introduce LayoutOptions to control layout postprocessing behaviour (#1870) Signed-off-by: Christoph Auer <cau@zurich.ibm.com>		2025-07-04 15:36:13 +02:00
factories	ci: add coverage and ruff (#1383)	2025-04-14 18:01:26 +02:00
plugins	perf: Move expensive imports closer to usage (#1863)	2025-07-01 22:27:17 +02:00
utils	feat: new vlm-models support (#1570)	2025-06-02 17:01:06 +02:00
vlm_models_inline	feat: Maximum image size for Vlm models (#1802)	2025-06-18 12:57:37 +02:00
__init__.py	Initial commit	2024-07-15 09:42:42 +02:00
api_vlm_model.py	feat: Maximum image size for Vlm models (#1802)	2025-06-18 12:57:37 +02:00
base_model.py	fix: formula conversion with page_range param set (#1791)	2025-06-17 13:58:45 +02:00
base_ocr_model.py	perf: Move expensive imports closer to usage (#1863)	2025-07-01 22:27:17 +02:00
code_formula_model.py	feat: new vlm-models support (#1570)	2025-06-02 17:01:06 +02:00
document_picture_classifier.py	feat: new vlm-models support (#1570)	2025-06-02 17:01:06 +02:00
easyocr_model.py	feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)	2025-06-13 19:01:55 +02:00
layout_model.py	feat: Introduce LayoutOptions to control layout postprocessing behaviour (#1870)	2025-07-04 15:36:13 +02:00
ocr_mac_model.py	feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)	2025-06-13 19:01:55 +02:00
page_assemble_model.py	feat: Establish confidence estimation for document and pages (#1313)	2025-05-21 12:32:49 +02:00
page_preprocessing_model.py	feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)	2025-06-13 19:01:55 +02:00
picture_description_api_model.py	feat: new vlm-models support (#1570)	2025-06-02 17:01:06 +02:00
picture_description_base_model.py	feat: new vlm-models support (#1570)	2025-06-02 17:01:06 +02:00
picture_description_vlm_model.py	fix: Secure torch model inits with global locks (#1884)	2025-07-04 07:27:26 +02:00
rapid_ocr_model.py	feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)	2025-06-13 19:01:55 +02:00
readingorder_model.py	feat: Integrate ListItemMarkerProcessor into document assembly (#1825)	2025-07-01 10:04:58 +02:00
table_structure_model.py	perf: Move expensive imports closer to usage (#1863)	2025-07-01 22:27:17 +02:00
tesseract_ocr_cli_model.py	feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)	2025-06-13 19:01:55 +02:00
tesseract_ocr_model.py	fix: Ensure that TesseractOcrModel does not crash in case OSD is not installed (#1866)	2025-06-30 10:55:56 +02:00