Commit Graph

560 Commits

Author SHA1 Message Date
github-actions[bot]
f4a1c06937 chore: bump version to 2.40.0 [skip ci] 2025-07-04 15:31:36 +00:00
Christoph Auer
ec6cf6f7e8
feat: Introduce LayoutOptions to control layout postprocessing behaviour (#1870)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-07-04 15:36:13 +02:00
Christoph Auer
598c9c53d4
fix: Secure torch model inits with global locks (#1884)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-07-04 07:27:26 +02:00
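The commit title above names the technique but not its details. As a rough illustration only, serialising model construction behind one process-wide lock could look like the sketch below. `_MODEL_INIT_LOCK`, `FakeModel`, `init_model` and `run_concurrent_inits` are hypothetical names, not taken from the repository, and the actual change may be structured differently.

```python
import threading
import time

# Hypothetical process-wide lock; every model constructor call goes through it.
_MODEL_INIT_LOCK = threading.Lock()


class FakeModel:
    """Stand-in for a model whose constructor is not safe to run concurrently."""

    _active = 0          # constructors currently running
    max_concurrent = 0   # highest overlap ever observed

    def __init__(self) -> None:
        FakeModel._active += 1
        FakeModel.max_concurrent = max(FakeModel.max_concurrent, FakeModel._active)
        time.sleep(0.01)  # simulate slow weight loading
        FakeModel._active -= 1


def init_model() -> FakeModel:
    """Construct a model while holding the global lock."""
    with _MODEL_INIT_LOCK:
        return FakeModel()


def run_concurrent_inits(n: int = 4) -> int:
    """Start n threads that each build a model; return the observed overlap."""
    threads = [threading.Thread(target=init_model) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return FakeModel.max_concurrent
```

Because every construction holds the same lock, the observed overlap stays at one even when several threads request a model at once. The trade-off is that initialisation becomes strictly sequential.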
Qiefan Jiang
13865c06f5
perf(msexcel): _find_table_bounds use iter_rows/iter_cols instead of Worksheet.cell (#1875)
* perf(msexcel): _find_table_bounds use iter_rows/iter_cols instead of sheet.cell

* DCO Remediation Commit for Qiefan Jiang <jiangqiefan@bytedance.com>

I, Qiefan Jiang <jiangqiefan@bytedance.com>, hereby add my Signed-off-by to this commit: 274102a8d4db5d2da8c7ca603e1eb039c1e07967

Signed-off-by: Qiefan Jiang <jiangqiefan@bytedance.com>

* fix lint

* DCO Remediation Commit for Qiefan Jiang <jiangqiefan@bytedance.com>

I, Qiefan Jiang <jiangqiefan@bytedance.com>, hereby add my Signed-off-by to this commit: b6b5b090a99ba7ba23c1facf0317f7e9f95039e5

Signed-off-by: Qiefan Jiang <jiangqiefan@bytedance.com>

---------

Signed-off-by: Qiefan Jiang <jiangqiefan@bytedance.com>
2025-07-03 13:12:06 +02:00
William Easton
3089cf2d26
perf: Move expensive imports closer to usage (#1863)
* Move expensive imports closer to usage

Signed-off-by: William Easton <bill.easton@elastic.co>

* DCO Remediation Commit for William Easton <bill.easton@elastic.co>

I, William Easton <bill.easton@elastic.co>, hereby add my Signed-off-by to this commit: 8a7412ce5bb131a01bb6403067aeb948c9093b0b

Signed-off-by: William Easton <bill.easton@elastic.co>

* formatting fixes

Signed-off-by: William Easton <bill.easton@elastic.co>

* DCO Remediation Commit for William Easton <bill.easton@elastic.co>

I, William Easton <bill.easton@elastic.co>, hereby add my Signed-off-by to this commit: 8a7412ce5bb131a01bb6403067aeb948c9093b0b
I, William Easton <bill.easton@elastic.co>, hereby add my Signed-off-by to this commit: 963e34325071db5e844841f10c27b396a054a0a1

Signed-off-by: William Easton <bill.easton@elastic.co>

* Fix baseocrmodel test issue

Signed-off-by: William Easton <bill.easton@elastic.co>

---------

Signed-off-by: William Easton <bill.easton@elastic.co>
2025-07-01 22:27:17 +02:00
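For readers unfamiliar with the pattern named in this commit title, a minimal sketch of moving an import closer to its usage follows. The function `render_table` is hypothetical, and the standard-library `csv` and `io` modules merely stand in for a genuinely expensive dependency; nothing here is taken from the repository.

```python
def render_table(rows):
    """Format rows as CSV text, importing dependencies only when called."""
    # Deferred imports: the import cost is paid on the first call,
    # not when the enclosing module is imported.
    import csv
    import io

    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Code paths that never call `render_table` never pay for the import, which is the usual motivation for this kind of change.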
Christoph Auer
56a0e104f7
feat: Integrate ListItemMarkerProcessor into document assembly (#1825)
* Integrate ListItemMarkerProcessor into document assembly

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update to final version

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update all test cases

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Upgrade deps

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-07-01 10:04:58 +02:00
Christoph Auer
bdfee4e2d0
chore: Safer unloading of DPv4 backend (#1867)
fix: Safer unloading of DPv4 backend

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-06-30 14:41:21 +02:00
Nikos Livathinos
ae39a9411a
fix: Ensure that TesseractOcrModel does not crash in case OSD is not installed (#1866)
fix: Ensure that TesseractOcrModel does not crash if tesseract OSD is not installed

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
2025-06-30 10:55:56 +02:00
github-actions[bot]
bb99be6c24 chore: bump version to 2.39.0 [skip ci] 2025-06-27 15:37:53 +00:00
Panos Vagenas
0533da1923
feat: leverage new list modeling, capture default markers (#1856)
* chore: update docling-core & regenerate test data

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

* update backends to leverage new list modeling

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

* repin docling-core

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

* ensure availability of latest docling-core API

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

---------

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-06-27 16:37:15 +02:00
Michael Honaker
e79e4f0ab6
fix(markdown): make parsing of rich table cells valid (#1821)
* fix: update md table classification

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

* Fix ground truth header changes

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

* Fix merge issues

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

* Fix minor ground truth errors

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

---------

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
2025-06-26 19:50:45 +02:00
github-actions[bot]
ee4781075a chore: bump version to 2.38.1 [skip ci] 2025-06-25 16:27:46 +00:00
pranaymiri
d337825b8e
fix: updated granite vision model version for picture description (#1852)
* updated granite model version

* DCO Remediation Commit for Miriyala Pranay <miriyalapranay146@gmail.com>
I, Miriyala Pranay <miriyalapranay146@gmail.com>, hereby add my Signed-off-by to this commit: 5de0d5034c5988613bc1c42a2dab043ba0106956

Signed-off-by: Miriyala Pranay <miriyalapranay146@gmail.com>

---------

Signed-off-by: Miriyala Pranay <miriyalapranay146@gmail.com>
2025-06-25 17:49:56 +02:00
Panos Vagenas
7c5614a37a
fix(markdown): fix single-formatted headings & list items (#1820)
* fix(markdown): fix formatting & inline edge cases (show behavior before change)

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

* add change and updated test data

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

* update lock

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

* improve test case

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

---------

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-06-25 13:05:06 +02:00
Michele Dolfi
41e8cae26b
fix: fix response type of ollama (#1850)
fix response type of ollama

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-25 11:33:09 +02:00
Allen N.
4002de1f92
fix: Handle missing runs to avoid out of range exception (#1844)
Fixes #1681 on upstream

Signed-off-by: Allen Nikka <allennikka@gmail.com>
2025-06-25 07:55:27 +02:00
github-actions[bot]
1dc63d0aa9 chore: bump version to 2.38.0 [skip ci] 2025-06-23 18:14:24 +00:00
Peter W. J. Staar
f3ae3029b8
docs: update readme and add ASR example (#1836)
* updated the README

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added minimal_asr_pipeline

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* Updated README and added ASR example

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* Updated docs.index.md

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* updated CI and mkdocs

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added link to existing audio file

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added link to existing audio file

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* reformatting

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-06-23 18:55:16 +02:00
Peter W. J. Staar
1557e7ce3e
feat: Support audio input (#1763)
* scaffolding in place

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* doing scaffolding for audio pipeline

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* WIP: got first transcription working

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* all working, time to start cleaning up

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* first working ASR pipeline

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added openai-whisper as a first transcription model

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* updating with asr_options

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* finalised the first working ASR pipeline with Whisper

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* use whisper from the latest git commit

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* Update docling/datamodel/pipeline_options.py

Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>

* Update docling/datamodel/pipeline_options.py

Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>

* updated comment

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* AudioBackend -> DummyBackend

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* file rename

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Rename to NoOpBackend, add test for ASR pipeline

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Support every format in NoOpBackend

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add missing audio file and test

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Install ffmpeg system dependency for ASR test

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-06-23 14:47:26 +02:00
Cesar Berrospi Ramis
d26dac61a8
fix(docx): ensure list items have a list parent (#1827)
Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-06-20 14:47:25 +02:00
mkrssg
1350a8d3e5
fix(msword_backend): Identify text in the same line after an image #1425 (#1610)
* fix(msword_backend): Identify text in the same line after an image / image anchor #1425

Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com>

* test: add test file and case for fix(msword_backend): Identify text in the same line after an image / image anchor #1425

Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com>

* test: added groundtruth test files for fix(msword_backend): Identify text in the same line after an image / image anchor #1425

Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com>

* fix: extraneous empty paragraphs for test files

Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com>

---------

Signed-off-by: Michael Krissgau <michael.krissgau@ibm.com>
Co-authored-by: Michael Krissgau <michael.krissgau@ibm.com>
2025-06-20 10:55:30 +02:00
Michele Dolfi
64ac043786
docs: support running examples from root or subfolder (#1816)
support running examples from root or subfolder

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-19 11:10:40 +02:00
Christoph Auer
dd7f64ff28
fix: Ensure uninitialized pages are removed before assembling document (#1812)
Ensure uninitialized pages are removed before assembling document

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-06-19 07:33:25 +02:00
Panos Vagenas
861abcdcb0
feat(markdown): add formatting & improve inline support (#1804)
feat(markdown): support formatting & hyperlinks

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-06-18 15:57:57 +02:00
Shkarupa Alex
215b540f6c
feat: Maximum image size for Vlm models (#1802)
* Image scale moved to base vlm options.
Added max_size image limit (options and vlm models).

* DCO Remediation Commit for Shkarupa Alex <shkarupa.alex@gmail.com>

I, Shkarupa Alex <shkarupa.alex@gmail.com>, hereby add my Signed-off-by to this commit: e93602a0d02fdb6f6dea1f65686cffcc4c616011

Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com>

---------

Signed-off-by: Shkarupa Alex <shkarupa.alex@gmail.com>
2025-06-18 12:57:37 +02:00
Mahafuzur Rahman
dbab30e92c
fix: formula conversion with page_range param set (#1791)
When page_range param is used for formula conversion,
the system throws list index out of range error.

Included tests to validate that the fix works.

Signed-off-by: Masum <masumsofts@yahoo.com>
2025-06-17 13:58:45 +02:00
Michele Dolfi
c2ef69718a
chore: dco advisor (#1795)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-17 09:45:56 +02:00
github-actions[bot]
7bae3b6c06 chore: bump version to 2.37.0 [skip ci] 2025-06-16 11:02:54 +00:00
Martin Wind
f28d23cf03
fix: pptx line break and space handling (#1664)
Signed-off-by: Martin Wind <martin.wind@im-c.at>
2025-06-16 10:44:30 +02:00
Cesar Berrospi Ramis
b886e4df31
fix(asciidoc): set default size when missing in image directive (#1769)
The AsciiDoc backend should not create an ImageRef with Size equal to None, instead use default size values.
Refactor static methods as such and add the staticmethod decorator.
Extend the regression test for this fix.

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-06-16 10:38:46 +02:00
Christoph Auer
7d3302cb48
feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)
* Keep page.parsed_page.textline_cells and page.cells in sync, including OCR

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Make page.parsed_page the only source of truth for text cells

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Small fix

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Correctly compute PDF boxes from pymupdf

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Use different OCR engine order

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add type hints and fix mypy

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* One more test fix

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Remove with pypdfium2_lock from caller sites

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix typing

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-06-13 19:01:55 +02:00
Michele Dolfi
0432a31b2f
docs: update vlm models api examples with LM Studio (#1759)
update vlm models api examples

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-12 12:58:44 +02:00
Bruno Rigal
7a275c7637
fix: Handle NoneType error in MsPowerpointDocumentBackend (#1747)
fix: NoneType error in pptx backend

Signed-off-by: Bruno Rigal <bruno.rigal@probayes.com>
Co-authored-by: Bruno Rigal <bruno.rigal@probayes.com>
2025-06-10 19:43:20 +02:00
Ayraf
df140227c3
feat: support xlsm files (#1520)
* code for xlsm support

* updated support for xlsm

* updated code for xlsm support

* Update docling_parse_v4_backend.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update docling_parse_v4_backend.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update test_backend_msexcel_xlsm.py

Updated tests/test_backend_msexcel_xlsm.py:

- have a function starting with test
- removed all print statements
- to add an explicit assert {test}=={pred}

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update base_models.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update test_backend_msexcel.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update test_backend_msexcel_xlsm.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Update document_converter.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* Delete tests/test_backend_msexcel_xlsm.py

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* xlsm file

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>

* run tests

* ran tests

* Fix tests, upgrade XLSM example to a valid file

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: ShiroYasha18 <85089952+ShiroYasha18@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-06-10 16:55:59 +02:00
Peter W. J. Staar
6613b9e98b
fix: prov for merged-elems (#1728)
* fix: prov for merged-elems

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* reformatted the code

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* Reset pyproject.toml

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix tests

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-06-10 11:22:42 +02:00
Maras Ioannis
e979750ce9
fix(tesseract): initialize df_osd to avoid uninitialized variable error (#1718)
* fix: initialize df_osd to avoid uninitialized variable error

Signed-off-by: IoannisMaras <maras2002@gmail.com>

* Fix formatting

Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>

* Satisfy mypy, regenerate OCR tests

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: IoannisMaras <maras2002@gmail.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-06-10 10:57:45 +02:00
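The class of bug named in this commit title (a variable assigned only inside a block that may fail) can be illustrated with a generic sketch. `detect_orientation` and `missing_osd` are hypothetical and do not reproduce the repository's code; the point is only that binding the name before the `try` block keeps later reads safe.

```python
from typing import Callable, Optional


def detect_orientation(run_osd: Callable[[], dict]) -> Optional[dict]:
    """Return the OSD result, or None when detection is unavailable."""
    # Bind the name up front so it exists even if the call below raises.
    # Without this line, the return statement would raise UnboundLocalError
    # whenever run_osd() fails.
    result: Optional[dict] = None
    try:
        result = run_osd()
    except RuntimeError:
        # Hypothetical failure mode: orientation data is not installed.
        pass
    return result


def missing_osd() -> dict:
    """Simulate an environment where orientation detection cannot run."""
    raise RuntimeError("OSD data not available")
```

Calling `detect_orientation(missing_osd)` returns `None` instead of raising, so callers can fall back to a default orientation.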
Michele Dolfi
f7f31137f1
fix: allow custom torch_dtype in vlm models (#1735)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-10 10:52:15 +02:00
Michele Dolfi
49b10e7419
docs: add open webui (#1734)
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-10 09:35:20 +02:00
AndrewTsai0406
9dbcb3d7d4
fix: Improve extraction from textboxes in Word docs (#1701)
* fix/docx_text_box_extraction

Signed-off-by: JiunAn Tsai <andrew@JiunAns-Mac-mini.local>

* fix/docx_text_box_extraction

Signed-off-by: JiunAn Tsai <andrew@JiunAns-Mac-mini.local>

---------

Signed-off-by: JiunAn Tsai <andrew@JiunAns-Mac-mini.local>
Co-authored-by: JiunAn Tsai <andrew@JiunAns-Mac-mini.local>
2025-06-06 11:37:46 +02:00
Eugene
a2b83fe4ae
fix: Add WEBP to the list of image file extensions (#1711)
feat: Add WEBP to the list of image file extensions

Signed-off-by: Eugene <fogaprod@gmail.com>
2025-06-05 09:09:27 +02:00
github-actions[bot]
40df0d74ad chore: bump version to 2.36.1 [skip ci] 2025-06-04 11:43:13 +00:00
Michele Dolfi
8846f1a393
fix: remove typer and click constraints (#1707)
release typer and click constraints

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-04 13:06:23 +02:00
Michele Dolfi
be42b03f9b
docs: flash-attn usage and install (#1706)
* docs: flash-attn usage and install

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix link

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-04 11:09:54 +02:00
github-actions[bot]
96c54dba91 chore: bump version to 2.36.0 [skip ci] 2025-06-03 13:54:25 +00:00
Michele Dolfi
cdd401847a
feat: simplify dependencies, switch to uv (#1700)
* refactor with uv

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* constraints for onnxruntime

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* more constraints

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-03 15:18:54 +02:00
Panos Vagenas
61d0d6c755
test: mark flaky test (#1698)
* test: cleanse Word test file

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

* mark textbox file test as flaky

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

* fix path usage

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

---------

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-06-03 13:13:44 +02:00
Peter W. J. Staar
cfdf4cea25
feat: new vlm-models support (#1570)
* feat: adding new vlm-models support

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the transformers

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* got microsoft/Phi-4-multimodal-instruct to work

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* working on vlm's

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* refactoring the VLM part

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* all working, now serious refactoring necessary

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* refactoring the download_model

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added the formulate_prompt

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* pixtral 12b runs via MLX and native transformers

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added the VlmPredictionToken

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* refactoring minimal_vlm_pipeline

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the MyPy

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added pipeline_model_specializations file

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* need to get Phi4 working again ...

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* finalising last points for vlms support

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the pipeline for Phi4

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* streamlining all code

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* reformatted the code

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixing the tests

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added the html backend to the VLM pipeline

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the static load_from_doctags

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* restore stable imports

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use AutoModelForVision2Seq for Pixtral and review example (including rename)

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* remove unused value

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* refactor instances of VLM models

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* skip compare example in CI

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use lowercase and uppercase only

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add new minimal_vlm example and refactor pipeline_options_vlm_model for cleaner import

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename pipeline_vlm_model_spec

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* move more argument to options and simplify model init

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add supported_devices

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* remove not-needed function

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* exclude minimal_vlm

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* missing file

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add message for transformers version

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename to specs

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use module import and remove MLX from non-darwin

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* remove hf_vlm_model and add extra_generation_args

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use single HF VLM model class

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* remove torch type

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add docs for vision models

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2025-06-02 17:01:06 +02:00
github-actions[bot]
08dcacc5cb chore: bump version to 2.35.0 [skip ci] 2025-06-02 12:30:26 +00:00
Edgar Hipp
11ca4f7a7b
docs: fix typo in index.md (#1676)
Signed-off-by: Edgar Hipp <hipp.edg@gmail.com>
2025-06-02 12:35:59 +02:00
Panos Vagenas
1c8a1283c4
test: ensure utf-8 in test data utils (#1691)
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-06-02 12:13:19 +02:00