Docling/docling/backend
Peter W. J. Staar 1557e7ce3e
feat: Support audio input (#1763)
* scaffolding in place

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* doing scaffolding for audio pipeline

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* WIP: got first transcription working

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* all working, time to start cleaning up

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* first working ASR pipeline

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added openai-whisper as a first transcription model

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* updating with asr_options

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* finalised the first working ASR pipeline with Whisper

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* use whisper from the latest git commit

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* Update docling/datamodel/pipeline_options.py

Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>

* Update docling/datamodel/pipeline_options.py

Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>

* updated comment

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* AudioBackend -> DummyBackend

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* file rename

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Rename to NoOpBackend, add test for ASR pipeline

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Support every format in NoOpBackend

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add missing audio file and test

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Install ffmpeg system dependency for ASR test

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-06-23 14:47:26 +02:00
..
docx ci: add coverage and ruff (#1383) 2025-04-14 18:01:26 +02:00
json feat: add Docling JSON ingestion (#783) 2025-01-24 18:05:23 +01:00
xml chore: typo fix (#1465) 2025-04-28 08:52:09 +02:00
__init__.py Initial commit 2024-07-15 09:42:42 +02:00
abstract_backend.py feat: add Docling JSON ingestion (#783) 2025-01-24 18:05:23 +01:00
asciidoc_backend.py fix(asciidoc): set default size when missing in image directive (#1769) 2025-06-16 10:38:46 +02:00
csv_backend.py ci: add coverage and ruff (#1383) 2025-04-14 18:01:26 +02:00
docling_parse_backend.py feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745) 2025-06-13 19:01:55 +02:00
docling_parse_v2_backend.py feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745) 2025-06-13 19:01:55 +02:00
docling_parse_v4_backend.py feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745) 2025-06-13 19:01:55 +02:00
html_backend.py fix(HTML): handle row spans in header rows (#1536) 2025-05-09 15:14:32 +02:00
md_backend.py feat(markdown): add formatting & improve inline support (#1804) 2025-06-18 15:57:57 +02:00
msexcel_backend.py ci: add coverage and ruff (#1383) 2025-04-14 18:01:26 +02:00
mspowerpoint_backend.py fix: pptx line break and space handling (#1664) 2025-06-16 10:44:30 +02:00
msword_backend.py fix(docx): ensure list items have a list parent (#1827) 2025-06-20 14:47:25 +02:00
noop_backend.py feat: Support audio input (#1763) 2025-06-23 14:47:26 +02:00
pdf_backend.py ci: add coverage and ruff (#1383) 2025-04-14 18:01:26 +02:00
pypdfium2_backend.py feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745) 2025-06-13 19:01:55 +02:00