data | fix: pptx line break and space handling (#1664) | 2025-06-16 10:44:30 +02:00
data_scanned | feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745) | 2025-06-13 19:01:55 +02:00
__init__.py | fix: Add unit tests (#51) | 2024-08-30 14:08:20 +02:00
test_backend_asciidoc.py | fix(asciidoc): set default size when missing in image directive (#1769) | 2025-06-16 10:38:46 +02:00
test_backend_csv.py | chore: fix or ignore runtime and deprecation warnings (#1660) | 2025-05-28 17:55:31 +02:00
test_backend_docling_json.py | feat: add Docling JSON ingestion (#783) | 2025-01-24 18:05:23 +01:00
test_backend_docling_parse_v2.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_backend_docling_parse_v4.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_backend_docling_parse.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_backend_html.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_backend_jats.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_backend_markdown.py | fix(markdown): handle nested lists (#910) | 2025-02-07 12:55:12 +01:00
test_backend_msexcel.py | feat: support xlsm files (#1520) | 2025-06-10 16:55:59 +02:00
test_backend_msword.py | test: mark flaky test (#1698) | 2025-06-03 13:13:44 +02:00
test_backend_patent_uspto.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_backend_pdfium.py | fix(pypdfium): resolve overlapping text when merging bounding boxes (#1549) | 2025-05-19 15:26:00 +02:00
test_backend_pptx.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_backend_webp.py | feat: support image/webp file type (#1415) | 2025-05-14 09:47:28 +02:00
test_cli.py | fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) | 2025-02-07 08:43:31 +01:00
test_code_formula.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_data_gen_flag.py | fix(markdown): handle nested lists (#910) | 2025-02-07 12:55:12 +01:00
test_document_picture_classifier.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_e2e_conversion.py | feat: new vlm-models support (#1570) | 2025-06-02 17:01:06 +02:00
test_e2e_ocr_conversion.py | feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745) | 2025-06-13 19:01:55 +02:00
test_input_doc.py | fix: guess HTML content starting with script tag (#1673) | 2025-06-02 08:43:24 +02:00
test_interfaces.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_invalid_input.py | ci: add coverage and ruff (#1383) | 2025-04-14 18:01:26 +02:00
test_legacy_format_transform.py | chore: fix or ignore runtime and deprecation warnings (#1660) | 2025-05-28 17:55:31 +02:00
test_options.py | feat: new vlm-models support (#1570) | 2025-06-02 17:01:06 +02:00
test_settings_load.py | fix(settings): fix nested settings load via environment variables (#1551) | 2025-05-14 13:42:10 +02:00
verify_utils.py | test: ensure utf-8 in test data utils (#1691) | 2025-06-02 12:13:19 +02:00