Docling/tests/data/groundtruth/docling_v2/tablecell.docx.itxt
Maxim Lysak d0a1180478
fix: Fixes for wordx (#432)
* fixes for referencing drawing blip in wordx

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added safety try-except when trying to load pillow image from a docx blob. Added explicit dependency on lxml.

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added test for word file with embedded emf images, re-generated full tests for docx, eased up dependency on lxml

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Updated lxml dependency version

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
2024-11-26 14:44:43 +01:00

item-0 at level 0: unspecified: group _root_
item-1 at level 1: list: group list
item-2 at level 2: list_item: Hello world1
item-3 at level 2: list_item: Hello2
item-4 at level 1: paragraph:
item-5 at level 1: paragraph: Some text before
item-6 at level 1: table with [3x3]
item-7 at level 1: paragraph:
item-8 at level 1: paragraph:
item-9 at level 1: paragraph: Some text after
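
The dump above lists one document item per line in the form "item-N at level L: label: text". A minimal sketch of reading such lines back into structured records follows. The line format is inferred only from the ten lines shown here, not from any Docling specification, and the names Item and parse_line are hypothetical helpers, not part of the Docling API.

```python
import re
from typing import NamedTuple, Optional


class Item(NamedTuple):
    """One line of the dump (hypothetical record type, not a Docling class)."""
    index: int   # N in "item-N"
    level: int   # nesting depth in "at level L"
    label: str   # e.g. "list_item", "paragraph", "table with [3x3]"
    text: str    # trailing text, empty when the line has none


# Assumed line shape, inferred from the lines shown above.
_LINE = re.compile(r"^\s*item-(\d+) at level (\d+): (.*)$")


def parse_line(line: str) -> Optional[Item]:
    """Parse one dump line; return None if it does not match the assumed shape."""
    match = _LINE.match(line)
    if match is None:
        return None
    rest = match.group(3)
    # Split on the first colon only, so any later colons stay in the text.
    label, _, text = rest.partition(":")
    return Item(int(match.group(1)), int(match.group(2)), label.strip(), text.strip())
```

Lines without a second colon, such as the table entry, end up with the whole remainder as the label and an empty text, which keeps the "[3x3]" shape available for a caller that wants to check it.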