Michele Dolfi
|
ed74fe2ec0
|
feat: new artifacts path and CLI utility (#876)
* fix artifacts path
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add docling-models utility
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* missing formatting
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename utility to docling-tools
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename download methods and deprecation warnings
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* propagate artifacts path usage for ocr models
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* move function to utils
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove unused file
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update docs
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* simplify downloading specific model(s)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
* minor refactor
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
|
2025-02-06 15:46:32 +01:00 |
|
Michele Dolfi
|
5ad6de0560
|
fix: enrichment models batch size and expose picture classifier (#878)
* expose picture classifier in CLI
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use different batch size in each model
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove batch size from CLI
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* cleanup imports
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2025-02-05 11:46:01 +01:00 |
|
Matteo
|
3213b247ad
|
feat: Code and equation model for PDF and code blocks in markdown (#752)
* propagated changes for new CodeItem class
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
* Rebased branch on latest main; changes for CodeItem
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
* removed unused files
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
* chore: update lockfile
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* pin latest docling-core
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update docling-core pinning
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* pin docling-core
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use new add_code in backends and update typing in MD backend
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* added if statement for backend
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
* removed unused import
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
* removed print statements
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
* ground truth for new PDF
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
* Update docling/pipeline/standard_pdf_pipeline.py
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Signed-off-by: Matteo <43417658+Matteo-Omenetti@users.noreply.github.com>
* fixed doc comment of __call__ function of code_formula_model
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
* fix artifacts_path type
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* move imports
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* move expansion_factor to base class
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Matteo Omenetti <omenetti.matteo@gmail.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Matteo <43417658+Matteo-Omenetti@users.noreply.github.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
|
2025-01-24 16:54:22 +01:00 |
|