Docling/tests
Cesar Berrospi Ramis 0cd81a8122
fix(docx): merged table cells not properly converted (#857)
* fix(docx): merged cells not properly converted

Fix conversion issue of merged cells in Word tables leading to repeated text.
Simplify Word table conversion code.
Add docx file with several table formats for regression tests.

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

* chore: add type hinting to docx backend

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>

---------

Signed-off-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com>
2025-02-03 10:20:03 +01:00
data | fix(docx): merged table cells not properly converted (#857) | 2025-02-03 10:20:03 +01:00
data_scanned | docs: Add example for inspection of picture content (#624) | 2025-01-29 10:39:00 +01:00
__init__.py | fix: Add unit tests (#51) | 2024-08-30 14:08:20 +02:00
test_backend_asciidoc.py | feat: Add pipeline timings and toggle visualization, establish debug settings (#183) | 2024-10-30 15:04:19 +01:00
test_backend_docling_json.py | feat: add Docling JSON ingestion (#783) | 2025-01-24 18:05:23 +01:00
test_backend_docling_parse_v2.py | chore: make tests lighter (#228) | 2024-11-04 14:02:28 +01:00
test_backend_docling_parse.py | chore: make tests lighter (#228) | 2024-11-04 14:02:28 +01:00
test_backend_html.py | fix: parse html with omitted body tag (#818) | 2025-01-27 16:59:00 +01:00
test_backend_markdown.py | fix: fix single newline handling in MD backend (#824) | 2025-01-28 19:05:55 +01:00
test_backend_msexcel.py | chore: add missing imports to Office type tests (#826) | 2025-01-28 16:17:44 +01:00
test_backend_msword.py | fix(docx): merged table cells not properly converted (#857) | 2025-02-03 10:20:03 +01:00
test_backend_patent_uspto.py | docs: description of supported formats and backends (#788) | 2025-01-26 08:10:33 +01:00
test_backend_pdfium.py | chore: make tests lighter (#228) | 2024-11-04 14:02:28 +01:00
test_backend_pptx.py | chore: add missing imports to Office type tests (#826) | 2025-01-28 16:17:44 +01:00
test_backend_pubmed.py | docs: description of supported formats and backends (#788) | 2025-01-26 08:10:33 +01:00
test_cli.py | test: generate file from CLI in a temporary directory (#618) | 2024-12-17 16:35:42 +01:00
test_code_formula.py | feat: Code and equation model for PDF and code blocks in markdown (#752) | 2025-01-24 16:54:22 +01:00
test_document_picture_classifier.py | feat: New document picture classifier (#805) | 2025-01-24 18:05:51 +01:00
test_e2e_conversion.py | docs: Add example for inspection of picture content (#624) | 2025-01-29 10:39:00 +01:00
test_e2e_ocr_conversion.py | feat: Python 3.13 support (#841) | 2025-01-30 17:26:42 +01:00
test_input_doc.py | feat: Add option to define page range (#852) | 2025-01-31 15:23:00 +01:00
test_interfaces.py | fix: improve handling of disallowed formats (#429) | 2024-12-03 12:45:32 +01:00
test_invalid_input.py | fix: improve handling of disallowed formats (#429) | 2024-12-03 12:45:32 +01:00
test_legacy_format_transform.py | fix: fix duplicate title and heading + add e2e tests for html and docx (#186) | 2024-10-30 13:14:56 +01:00
test_options.py | feat: Add option to define page range (#852) | 2025-01-31 15:23:00 +01:00
verify_utils.py | feat(OCR): Introduce the OcrOptions.force_full_page_ocr parameter that forces a full page OCR scanning (#290) | 2024-11-12 09:46:14 +01:00