Docling/docling/models
Michele Dolfi ed74fe2ec0
feat: new artifacts path and CLI utility (#876)
* fix artifacts path

* add docling-models utility

* missing formatting

* rename utility to docling-tools

* rename download methods and deprecation warnings

* propagate artifacts path usage for ocr models

* move function to utils

* remove unused file

* update docs

* simplify downloading specific model(s)

* minor refactor

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2025-02-06 15:46:32 +01:00
__init__.py Initial commit 2024-07-15 09:42:42 +02:00
base_model.py fix: enrichment models batch size and expose picture classifier (#878) 2025-02-05 11:46:01 +01:00
base_ocr_model.py fix: Improve OCR results, stricten criteria before dropping bitmap areas (#719) 2025-01-10 10:38:49 +01:00
code_formula_model.py feat: new artifacts path and CLI utility (#876) 2025-02-06 15:46:32 +01:00
document_picture_classifier.py feat: new artifacts path and CLI utility (#876) 2025-02-06 15:46:32 +01:00
ds_glm_model.py feat: Updated Layout processing with forms and key-value areas (#530) 2024-12-17 17:32:24 +01:00
easyocr_model.py feat: new artifacts path and CLI utility (#876) 2025-02-06 15:46:32 +01:00
layout_model.py feat: new artifacts path and CLI utility (#876) 2025-02-06 15:46:32 +01:00
ocr_mac_model.py feat: add support for ocrmac OCR engine on macOS (#276) 2024-11-20 12:51:19 +01:00
page_assemble_model.py feat: Code and equation model for PDF and code blocks in markdown (#752) 2025-01-24 16:54:22 +01:00
page_preprocessing_model.py feat: Add pipeline timings and toggle visualization, establish debug settings (#183) 2024-10-30 15:04:19 +01:00
rapid_ocr_model.py feat(ocr): expose rec_keys_path in RapidOcrOptions to support custom dictionaries (#786) 2025-01-27 13:38:15 +01:00
table_structure_model.py feat: new artifacts path and CLI utility (#876) 2025-02-06 15:46:32 +01:00
tesseract_ocr_cli_model.py feat: Introduce automatic language detection in TesseractOcrCliModel (#800) 2025-01-26 08:07:56 +01:00
tesseract_ocr_model.py feat: Introduce automatic language detection in TesseractOcrCliModel (#800) 2025-01-26 08:07:56 +01:00