| Name | Last commit | Date |
| --- | --- | --- |
| data | feat: Integrate ListItemMarkerProcessor into document assembly (#1825) | 2025-07-01 10:04:58 +02:00 |
| data_scanned | feat: leverage new list modeling, capture default markers (#1856) | 2025-06-27 16:37:15 +02:00 |
| __init__.py | fix: Add unit tests (#51) | 2024-08-30 14:08:20 +02:00 |
| test_asr_pipeline.py | feat: Support audio input (#1763) | 2025-06-23 14:47:26 +02:00 |
| test_backend_asciidoc.py | fix(asciidoc): set default size when missing in image directive (#1769) | 2025-06-16 10:38:46 +02:00 |
| test_backend_csv.py | chore: fix or ignore runtime and deprecation warnings (#1660) | 2025-05-28 17:55:31 +02:00 |
| test_backend_docling_json.py | feat: add Docling JSON ingestion (#783) | 2025-01-24 18:05:23 +01:00 |
| test_backend_docling_parse_v2.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00 |
| test_backend_docling_parse_v4.py | chore: Safer unloading of DPv4 backend (#1867) | 2025-06-30 14:41:21 +02:00 |
| test_backend_docling_parse.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00 |
| test_backend_html.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00 |
| test_backend_jats.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00 |
| test_backend_markdown.py | feat(markdown): add formatting & improve inline support (#1804) | 2025-06-18 15:57:57 +02:00 |
| test_backend_msexcel.py | feat: support xlsm files (#1520) | 2025-06-10 16:55:59 +02:00 |
| test_backend_msword.py | fix(docx): ensure list items have a list parent (#1827) | 2025-06-20 14:47:25 +02:00 |
| test_backend_patent_uspto.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00 |
| test_backend_pdfium.py | fix(pypdfium): resolve overlapping text when merging bounding boxes (#1549) | 2025-05-19 15:26:00 +02:00 |
| test_backend_pptx.py | feat: leverage new list modeling, capture default markers (#1856) | 2025-06-27 16:37:15 +02:00 |
| test_backend_webp.py | feat: support image/webp file type (#1415) | 2025-05-14 09:47:28 +02:00 |
| test_cli.py | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00 |
| test_code_formula.py | fix: formula conversion with page_range param set (#1791) | 2025-06-17 13:58:45 +02:00 |
| test_data_gen_flag.py | fix(markdown): handle nested lists (#910) | 2025-02-07 12:55:12 +01:00 |
| test_document_picture_classifier.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00 |
| test_e2e_conversion.py | feat: new vlm-models support (#1570) | 2025-06-02 17:01:06 +02:00 |
| test_e2e_ocr_conversion.py | feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745) | 2025-06-13 19:01:55 +02:00 |
| test_input_doc.py | fix: guess HTML content starting with script tag (#1673) | 2025-06-02 08:43:24 +02:00 |
| test_interfaces.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00 |
| test_invalid_input.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00 |
| test_legacy_format_transform.py | chore: fix or ignore runtime and deprecation warnings (#1660) | 2025-05-28 17:55:31 +02:00 |
| test_options.py | feat: new vlm-models support (#1570) | 2025-06-02 17:01:06 +02:00 |
| test_settings_load.py | fix(settings): fix nested settings load via environment variables (#1551) | 2025-05-14 13:42:10 +02:00 |
| verify_utils.py | test: ensure utf-8 in test data utils (#1691) | 2025-06-02 12:13:19 +02:00 |