Docling

Maxim Lysak 1c26769785 feat(SmolDocling): Support MLX acceleration in VLM pipeline (#1199)	2025-03-19 15:38:54 +01:00
* Initial implementation to support MLX for VLM pipeline and SmolDocling
* mlx_model unit
* Add CLI choices for VLM pipeline and model
* Updated minimal vlm pipeline example
* make vlm_pipeline python3.9 compatible
* Fixed extract_text_from_backend definition
* Updated README
* Updated example
* Updated documentation
* corrections in the documentation
* Cosmetic changes
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
factories	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
plugins	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
__init__.py	Initial commit	2024-07-15 09:42:42 +02:00
base_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
base_ocr_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
code_formula_model.py	perf: New revision code formula model and document picture classifier (#1140)	2025-03-11 10:15:28 +01:00
document_picture_classifier.py	perf: New revision code formula model and document picture classifier (#1140)	2025-03-11 10:15:28 +01:00
easyocr_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
hf_mlx_model.py	feat(SmolDocling): Support MLX acceleration in VLM pipeline (#1199)	2025-03-19 15:38:54 +01:00
hf_vlm_model.py	feat: [Experimental] Introduce VLM pipeline using HF AutoModelForVision2Seq, featuring SmolDocling model (#1054)	2025-02-26 14:43:26 +01:00
layout_model.py	refactor: use org--name in artifacts-path (#912)	2025-02-07 13:58:05 +01:00
ocr_mac_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
page_assemble_model.py	feat: Implement new reading-order model (#916)	2025-02-20 17:51:17 +01:00
page_preprocessing_model.py	feat: Add DoclingParseV4 backend, using high-level docling-parse API (#905)	2025-03-18 10:38:19 +01:00
picture_description_api_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
picture_description_base_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
picture_description_vlm_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
rapid_ocr_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
readingorder_model.py	feat: Implement new reading-order model (#916)	2025-02-20 17:51:17 +01:00
table_structure_model.py	feat: Add DoclingParseV4 backend, using high-level docling-parse API (#905)	2025-03-18 10:38:19 +01:00
tesseract_ocr_cli_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00
tesseract_ocr_model.py	feat: add factory for ocr engines via plugins (#1010)	2025-03-18 13:58:05 +01:00