Docling/tests
Tobias Strebitzer 00d9405b0a
feat: Add support for CSV input with new backend to transform CSV files to DoclingDocument (#945)
* feat: Implement csv backend and format detection

Signed-off-by: Tobias Strebitzer <tobias.strebitzer@magloft.com>

* test: Implement csv parsing and format tests

* docs: Add example and CSV format documentation

* feat: Add support for various CSV dialects and update documentation

* feat: Add validation for delimiters and tests for inconsistent csv files
2025-02-14 08:55:09 +01:00
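
The commit message above mentions CSV dialect support, delimiter validation, and tests for inconsistent CSV files. As a rough illustration of those ideas only, here is a minimal sketch built on Python's standard `csv` module. It is not the Docling backend: the names `ALLOWED_DELIMITERS`, `sniff_and_parse`, and `inconsistent_rows` are hypothetical, and the set of accepted delimiters is an assumption made for this example.

```python
import csv
import io

# Assumption for this sketch: the delimiters we are willing to accept.
ALLOWED_DELIMITERS = ",;\t|:"


def sniff_and_parse(text: str) -> tuple[str, list[list[str]]]:
    """Guess the delimiter from the header line, then parse all rows."""
    lines = text.splitlines()
    header = lines[0] if lines else ""
    try:
        # Sniff only the header so ragged data rows cannot confuse detection.
        dialect = csv.Sniffer().sniff(header, delimiters=ALLOWED_DELIMITERS)
        delimiter = dialect.delimiter
    except csv.Error:
        # No allowed delimiter found (e.g. a single column): fall back to comma.
        delimiter = ","
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return delimiter, rows


def inconsistent_rows(rows: list[list[str]]) -> list[int]:
    """Return indices of rows whose column count differs from the header's."""
    if not rows:
        return []
    width = len(rows[0])
    return [i for i, row in enumerate(rows) if len(row) != width]
```

A caller could, for instance, run `sniff_and_parse` on the file contents and then decide whether to warn or pad cells when `inconsistent_rows` returns a non-empty list. How the real backend treats such rows is not stated on this page.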
data feat: Add support for CSV input with new backend to transform CSV files to DoclingDocument (#945) 2025-02-14 08:55:09 +01:00
data_scanned feat: Add content_layer property to items to address body, furniture and other roles (#735) 2025-02-10 12:07:49 +01:00
__init__.py fix: Add unit tests (#51) 2024-08-30 14:08:20 +02:00
test_backend_asciidoc.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_backend_csv.py feat: Add support for CSV input with new backend to transform CSV files to DoclingDocument (#945) 2025-02-14 08:55:09 +01:00
test_backend_docling_json.py feat: add Docling JSON ingestion (#783) 2025-01-24 18:05:23 +01:00
test_backend_docling_parse_v2.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_backend_docling_parse.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_backend_html.py fix: parse html with omitted body tag (#818) 2025-01-27 16:59:00 +01:00
test_backend_markdown.py fix(markdown): handle nested lists (#910) 2025-02-07 12:55:12 +01:00
test_backend_msexcel.py chore: add missing imports to Office type tests (#826) 2025-01-28 16:17:44 +01:00
test_backend_msword.py fix(docx): merged table cells not properly converted (#857) 2025-02-03 10:20:03 +01:00
test_backend_patent_uspto.py feat: Add content_layer property to items to address body, furniture and other roles (#735) 2025-02-10 12:07:49 +01:00
test_backend_pdfium.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_backend_pptx.py chore: add missing imports to Office type tests (#826) 2025-01-28 16:17:44 +01:00
test_backend_pubmed.py docs: description of supported formats and backends (#788) 2025-01-26 08:10:33 +01:00
test_cli.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_code_formula.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_data_gen_flag.py fix(markdown): handle nested lists (#910) 2025-02-07 12:55:12 +01:00
test_document_picture_classifier.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_e2e_conversion.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_e2e_ocr_conversion.py feat: Python 3.13 support (#841) 2025-01-30 17:26:42 +01:00
test_input_doc.py feat: Add support for CSV input with new backend to transform CSV files to DoclingDocument (#945) 2025-02-14 08:55:09 +01:00
test_interfaces.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_invalid_input.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_legacy_format_transform.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
test_options.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00
verify_utils.py fix: Test cases for RTL programmatic PDFs and fixes for the formula model (#903) 2025-02-07 08:43:31 +01:00