Docling/tests/data/groundtruth/docling_v2/word_tables.docx.html
Peter W. J. Staar c0ba88edf1
feat(cli): add option for html with split-page mode (#1355)
* updated the cli to output html in split-page mode

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* add pin for new docling-core with html split argument

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* relock with fixed html export in docling-core

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update test results

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update more tests

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update example

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update lock with docling-core fixes

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update test results

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add again chunking extras

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2025-04-14 08:41:50 +02:00


<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>word_tables</title>
<meta name="generator" content="Docling HTML Serializer">
<style>
html {
    background-color: #f5f5f5;
    font-family: Arial, sans-serif;
    line-height: 1.6;
}
body {
    max-width: 800px;
    margin: 0 auto;
    padding: 2rem;
    background-color: white;
    box-shadow: 0 0 10px rgba(0,0,0,0.1);
}
h1, h2, h3, h4, h5, h6 {
    color: #333;
    margin-top: 1.5em;
    margin-bottom: 0.5em;
}
h1 {
    font-size: 2em;
    border-bottom: 1px solid #eee;
    padding-bottom: 0.3em;
}
table {
    border-collapse: collapse;
    margin: 1em 0;
    width: 100%;
}
th, td {
    border: 1px solid #ddd;
    padding: 8px;
    text-align: left;
}
th {
    background-color: #f2f2f2;
    font-weight: bold;
}
figure {
    margin: 1.5em 0;
    text-align: center;
}
figcaption {
    color: #666;
    font-style: italic;
    margin-top: 0.5em;
}
img {
    max-width: 100%;
    height: auto;
}
pre {
    background-color: #f6f8fa;
    border-radius: 3px;
    padding: 1em;
    overflow: auto;
}
code {
    font-family: monospace;
    background-color: #f6f8fa;
    padding: 0.2em 0.4em;
    border-radius: 3px;
}
pre code {
    background-color: transparent;
    padding: 0;
}
.formula {
    text-align: center;
    padding: 0.5em;
    margin: 1em 0;
    background-color: #f9f9f9;
}
.formula-not-decoded {
    text-align: center;
    padding: 0.5em;
    margin: 1em 0;
    background: repeating-linear-gradient(
        45deg,
        #f0f0f0,
        #f0f0f0 10px,
        #f9f9f9 10px,
        #f9f9f9 20px
    );
}
.page-break {
    page-break-after: always;
    border-top: 1px dashed #ccc;
    margin: 2em 0;
}
.key-value-region {
    background-color: #f9f9f9;
    padding: 1em;
    border-radius: 4px;
    margin: 1em 0;
}
.key-value-region dt {
    font-weight: bold;
}
.key-value-region dd {
    margin-left: 1em;
    margin-bottom: 0.5em;
}
.form-container {
    border: 1px solid #ddd;
    padding: 1em;
    border-radius: 4px;
    margin: 1em 0;
}
.form-item {
    margin-bottom: 0.5em;
}
.image-classification {
    font-size: 0.9em;
    color: #666;
    margin-top: 0.5em;
}
</style>
</head>
<body>
<div class='page'>
<h2>Test with tables</h2>
<p>A uniform table</p>
<table><tbody><tr><th>Header 0.0</th><th>Header 0.1</th><th>Header 0.2</th></tr><tr><td>Cell 1.0</td><td>Cell 1.1</td><td>Cell 1.2</td></tr><tr><td>Cell 2.0</td><td>Cell 2.1</td><td>Cell 2.2</td></tr></tbody></table>
<p></p>
<p>A non-uniform table with horizontal spans</p>
<table><tbody><tr><th>Header 0.0</th><th>Header 0.1</th><th>Header 0.2</th></tr><tr><td>Cell 1.0</td><td colspan="2">Merged Cell 1.1 1.2</td></tr><tr><td>Cell 2.0</td><td colspan="2">Merged Cell 2.1 2.2</td></tr></tbody></table>
<p></p>
<p>A non-uniform table with horizontal spans in inner columns</p>
<table><tbody><tr><th>Header 0.0</th><th>Header 0.1</th><th>Header 0.2</th><th>Header 0.3</th></tr><tr><td>Cell 1.0</td><td colspan="2">Merged Cell 1.1 1.2</td><td>Cell 1.3</td></tr><tr><td>Cell 2.0</td><td colspan="2">Merged Cell 2.1 2.2</td><td>Cell 2.3</td></tr></tbody></table>
<p></p>
<p>A non-uniform table with vertical spans</p>
<table><tbody><tr><th>Header 0.0</th><th>Header 0.1</th><th>Header 0.2</th></tr><tr><td>Cell 1.0</td><td rowspan="2">Merged Cell 1.1 2.1</td><td>Cell 1.2</td></tr><tr><td>Cell 2.0</td><td>Cell 2.2</td></tr><tr><td>Cell 3.0</td><td rowspan="2">Merged Cell 3.1 4.1</td><td>Cell 3.2</td></tr><tr><td>Cell 4.0</td><td>Cell 4.2</td></tr></tbody></table>
<p></p>
<p>A non-uniform table with all kinds of spans and empty cells</p>
<table><tbody><tr><th>Header 0.0</th><th>Header 0.1</th><th>Header 0.2</th><th></th><th></th></tr><tr><td>Cell 1.0</td><td rowspan="2">Merged Cell 1.1 2.1</td><td>Cell 1.2</td><td></td><td></td></tr><tr><td>Cell 2.0</td><td>Cell 2.2</td><td></td><td></td></tr><tr><td>Cell 3.0</td><td rowspan="2">Merged Cell 3.1 4.1</td><td>Cell 3.2</td><td rowspan="3"></td><td></td></tr><tr><td>Cell 4.0</td><td>Cell 4.2</td><td rowspan="2">Merged Cell 4.4 5.4</td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td><td></td><td></td></tr><tr><td colspan="5"></td></tr><tr><td></td><td></td><td></td><td></td><td>Cell 8.4</td></tr></tbody></table>
<p></p>
<p></p>
</div>
</body>
</html>