Docling/tests/data/groundtruth/docling_v2/word_sample.docx.itxt
Maxim Lysak 8533039b0c
fix: Fixing images in the input Word files (#330)
* Fixing images identification in the input Word files

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Populating extracted image data into docling picture for wordx backend

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Updated tests

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* removed base64 dependency in msword_backend

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
2024-11-14 13:33:34 +01:00

item-0 at level 0: unspecified: group _root_
item-1 at level 1: paragraph: Summer activities
item-2 at level 1: title: Swimming in the lake
item-3 at level 2: paragraph: Duck
item-4 at level 2: picture
item-5 at level 2: paragraph: Figure 1: This is a cute duckling
item-6 at level 2: section_header: Let’s swim!
item-7 at level 3: paragraph: To get started with swimming, fi ... down in a water and try not to drown:
item-8 at level 3: list: group list
item-9 at level 4: list_item: You can relax and look around
item-10 at level 4: list_item: Paddle about
item-11 at level 4: list_item: Enjoy summer warmth
item-12 at level 3: paragraph: Also, don’t forget:
item-13 at level 3: list: group list
item-14 at level 4: list_item: Wear sunglasses
item-15 at level 4: list_item: Don’t forget to drink water
item-16 at level 4: list_item: Use sun cream
item-17 at level 3: paragraph: Hmm, what else…
item-18 at level 3: section_header: Let’s eat
item-19 at level 4: paragraph: After we had a good day of swimm ... , it’s important to eat something nice
item-20 at level 4: paragraph: I like to eat leaves
item-21 at level 4: paragraph: Here are some interesting things a respectful duck could eat:
item-22 at level 4: table with [4x3]
item-23 at level 4: paragraph:
item-24 at level 4: paragraph: And let’s add another list in the end:
item-25 at level 4: list: group list
item-26 at level 5: list_item: Leaves
item-27 at level 5: list_item: Berries
item-28 at level 5: list_item: Grain
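
The listing above follows a regular line shape: `item-<index> at level <depth>: <label>`, optionally followed by `: <text>`. A minimal sketch of reading such lines back into records, for example when comparing two groundtruth dumps, could look like the following. The grammar is inferred only from the lines shown here, not from Docling's own code, and `Item`, `parse_item` and `parse_dump` are hypothetical names that are not part of the Docling API.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed line shape, inferred from the dump above (not taken from Docling):
#   item-<index> at level <depth>: <label>[: <text>]
ITEM_RE = re.compile(r"^\s*item-(\d+) at level (\d+): ([^:]+?)(?::\s?(.*))?$")


class Item(NamedTuple):
    """One line of the dump (hypothetical helper type)."""
    index: int
    level: int
    label: str
    text: str


def parse_item(line: str) -> Optional[Item]:
    """Parse a single dump line; return None if it does not match the shape."""
    match = ITEM_RE.match(line)
    if match is None:
        return None
    index, level, label, text = match.groups()
    # Lines such as "picture" or "table with [4x3]" carry no text part.
    return Item(int(index), int(level), label.strip(), (text or "").strip())


def parse_dump(dump: str) -> List[Item]:
    """Parse every matching line of a dump, skipping blank or foreign lines."""
    items = []
    for line in dump.splitlines():
        item = parse_item(line)
        if item is not None:
            items.append(item)
    return items
```

Only the first colon after the label separates label from text, so a caption such as `Figure 1: This is a cute duckling` stays intact in the `text` field.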