Docling/docling/models at ed74fe2ec0a702834f0deacfdb5717c8c587dab1

NeoAnd/Docling

Michele Dolfi ed74fe2ec0 feat: new artifacts path and CLI utility (#876)

* fix artifacts path

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add docling-models utility

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* missing formatting

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename utility to docling-tools

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename download methods and deprecation warnings

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* propagate artifacts path usage for ocr models

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* move function to utils

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* remove unused file

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update docs

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* simplify downloading specific model(s)

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>

* minor refactor

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>

2025-02-06 15:46:32 +01:00

| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | Initial commit | 2024-07-15 09:42:42 +02:00 |
| `base_model.py` | fix: enrichment models batch size and expose picture classifier (#878) | 2025-02-05 11:46:01 +01:00 |
| `base_ocr_model.py` | fix: Improve OCR results, stricten criteria before dropping bitmap areas (#719) | 2025-01-10 10:38:49 +01:00 |
| `code_formula_model.py` | feat: new artifacts path and CLI utility (#876) | 2025-02-06 15:46:32 +01:00 |
| `document_picture_classifier.py` | feat: new artifacts path and CLI utility (#876) | 2025-02-06 15:46:32 +01:00 |
| `ds_glm_model.py` | feat: Updated Layout processing with forms and key-value areas (#530) | 2024-12-17 17:32:24 +01:00 |
| `easyocr_model.py` | feat: new artifacts path and CLI utility (#876) | 2025-02-06 15:46:32 +01:00 |
| `layout_model.py` | feat: new artifacts path and CLI utility (#876) | 2025-02-06 15:46:32 +01:00 |
| `ocr_mac_model.py` | feat: add support for ocrmac OCR engine on macOS (#276) | 2024-11-20 12:51:19 +01:00 |
| `page_assemble_model.py` | feat: Code and equation model for PDF and code blocks in markdown (#752) | 2025-01-24 16:54:22 +01:00 |
| `page_preprocessing_model.py` | feat: Add pipeline timings and toggle visualization, establish debug settings (#183) | 2024-10-30 15:04:19 +01:00 |
| `rapid_ocr_model.py` | feat(ocr): expose rec_keys_path in RapidOcrOptions to support custom dictionaries (#786) | 2025-01-27 13:38:15 +01:00 |
| `table_structure_model.py` | feat: new artifacts path and CLI utility (#876) | 2025-02-06 15:46:32 +01:00 |
| `tesseract_ocr_cli_model.py` | feat: Introduce automatic language detection in TesseractOcrCliModel (#800) | 2025-01-26 08:07:56 +01:00 |
| `tesseract_ocr_model.py` | feat: Introduce automatic language detection in TesseractOcrCliModel (#800) | 2025-01-26 08:07:56 +01:00 |