Docling/docling
Guilhem VERMOREL b3d111a3cd
fix: Tesseract OCR CLI can't process images composed with numbers only (#1201)
fix wrong type text extracted by tesseract_ocr_cli_model

Signed-off-by: gvl4 <Guilhem.VERMOREL@3ds.com>
Co-authored-by: gvl4 <Guilhem.VERMOREL@3ds.com>
2025-03-31 10:53:49 +02:00
backend | fix: improve HTML layer detection, various MD fixes (#1241) | 2025-03-26 16:07:14 +01:00
chunking | feat: expose new hybrid chunker, update docs (#384) | 2024-12-09 08:28:29 +01:00
cli | feat(SmolDocling): Support MLX acceleration in VLM pipeline (#1199) | 2025-03-19 15:38:54 +01:00
datamodel | feat(SmolDocling): Support MLX acceleration in VLM pipeline (#1199) | 2025-03-19 15:38:54 +01:00
models | fix: Tesseract OCR CLI can't process images composed with numbers only (#1201) | 2025-03-31 10:53:49 +02:00
pipeline | feat(SmolDocling): Support MLX acceleration in VLM pipeline (#1199) | 2025-03-19 15:38:54 +01:00
utils | feat: Add DoclingParseV4 backend, using high-level docling-parse API (#905) | 2025-03-18 10:38:19 +01:00
__init__.py | Initial commit | 2024-07-15 09:42:42 +02:00
document_converter.py | fix(converter): Cache same pipeline class with different options (#1152) | 2025-03-25 12:18:44 +01:00
exceptions.py | feat: Introduce the enable_remote_services option to allow remote connections while processing (#941) | 2025-02-12 15:18:01 +01:00
py.typed | fix: Add py.typed marker file (#531) | 2024-12-06 13:42:14 +01:00