Docling/tests/data/2305.03393v1-pg9.pages.json
Maxim Lysak 2422f706a1
feat: new torch-based docling models (#120)
---------

Signed-off-by: Maxim Lysak <mly@zurich.ibm.com>
Co-authored-by: Maxim Lysak <mly@zurich.ibm.com>
2024-10-03 18:42:33 +02:00


[{"page_no": 0, "page_hash": "16ccd0a495625bd9c7a28a4b353d85137f3e6b09508a0d2280663478de9c9b25", "size": {"width": 612.0, "height": 792.0}, "cells": [{"id": 0, "text": "Optimized Table Tokenization for Table Structure Recognition", "bbox": {"l": 194.478, "t": 91.49352999999996, "r": 447.54476999999997, "b": 102.78223000000003, "coord_origin": "1"}}, {"id": 1, "text": "9", "bbox": {"l": 475.98441, "t": 91.49352999999996, "r": 480.59314, "b": 102.78223000000003, "coord_origin": "1"}}, {"id": 2, "text": "order to compute the TED score. Inference timing results for all experiments", "bbox": {"l": 134.765, "t": 116.46301000000005, "r": 480.59067, "b": 128.99597000000006, "coord_origin": "1"}}, {"id": 3, "text": "were obtained from the same machine on a single core with AMD EPYC 7763", "bbox": {"l": 134.765, "t": 128.41803000000004, "r": 480.59665, "b": 140.95099000000005, "coord_origin": "1"}}, {"id": 4, "text": "CPU @2.45 GHz.", "bbox": {"l": 134.765, "t": 140.37401999999997, "r": 210.78761, "b": 152.90697999999998, "coord_origin": "1"}}, {"id": 5, "text": "5.1", "bbox": {"l": 134.765, "t": 166.70514000000003, "r": 149.40306, "b": 179.20818999999995, "coord_origin": "1"}}, {"id": 6, "text": "Hyper Parameter Optimization", "bbox": {"l": 160.85905, "t": 166.70514000000003, "r": 318.45145, "b": 179.20818999999995, "coord_origin": "1"}}, {"id": 7, "text": "We have chosen the PubTabNet data set to perform HPO, since it includes a", "bbox": {"l": 134.765, "t": 183.11505, "r": 479.74982000000006, "b": 195.64801, "coord_origin": "1"}}, {"id": 8, "text": "highly diverse set of tables. Also we report TED scores separately for simple and", "bbox": {"l": 134.765, "t": 195.07007, "r": 480.58765, "b": 207.60303, "coord_origin": "1"}}, {"id": 9, "text": "complex tables (tables with cell spans). Results are presented in Table. 1. 
It is", "bbox": {"l": 134.765, "t": 207.02502000000004, "r": 480.58859000000007, "b": 219.55798000000004, "coord_origin": "1"}}, {"id": 10, "text": "evident that with OTSL, our model achieves the same TED score and slightly", "bbox": {"l": 134.765, "t": 218.98004000000003, "r": 480.59567, "b": 231.51300000000003, "coord_origin": "1"}}, {"id": 11, "text": "better mAP scores in comparison to HTML. However OTSL yields a", "bbox": {"l": 134.765, "t": 230.93506000000002, "r": 440.9425, "b": 243.46802000000002, "coord_origin": "1"}}, {"id": 12, "text": "2x speed", "bbox": {"l": 444.86800999999997, "t": 230.98486000000003, "r": 480.58792, "b": 243.46802000000002, "coord_origin": "1"}}, {"id": 13, "text": "up", "bbox": {"l": 134.765, "t": 242.94086000000004, "r": 145.19585, "b": 255.42400999999995, "coord_origin": "1"}}, {"id": 14, "text": "in the inference runtime over HTML.", "bbox": {"l": 149.149, "t": 242.89104999999995, "r": 311.22256, "b": 255.42400999999995, "coord_origin": "1"}}, {"id": 15, "text": "Table", "bbox": {"l": 134.765, "t": 272.79474000000005, "r": 159.22983, "b": 284.1999799999999, "coord_origin": "1"}}, {"id": 16, "text": "1.", "bbox": {"l": 167.34442, "t": 272.79474000000005, "r": 174.71301, "b": 284.1999799999999, "coord_origin": "1"}}, {"id": 17, "text": "HPO performed in OTSL and HTML representation on the same", "bbox": {"l": 188.133, "t": 272.85748, "r": 480.58101999999997, "b": 284.14618, "coord_origin": "1"}}, {"id": 18, "text": "transformer-based TableFormer [9] architecture, trained only on PubTabNet [22]. 
Ef-", "bbox": {"l": 134.765, "t": 283.81647, "r": 480.59890999999993, "b": 295.10516000000007, "coord_origin": "1"}}, {"id": 19, "text": "fects of reducing the # of layers in encoder and decoder stages of the model show that", "bbox": {"l": 134.765, "t": 294.77547999999996, "r": 480.59887999999995, "b": 306.06418, "coord_origin": "1"}}, {"id": 20, "text": "smaller models trained on OTSL perform better, especially in recognizing complex", "bbox": {"l": 134.765, "t": 305.73447, "r": 480.59180000000003, "b": 317.02316, "coord_origin": "1"}}, {"id": 21, "text": "table structures, and maintain a much higher mAP score than the HTML counterpart.", "bbox": {"l": 134.765, "t": 316.69348, "r": 480.58471999999995, "b": 327.98218, "coord_origin": "1"}}, {"id": 22, "text": "#", "bbox": {"l": 160.37, "t": 339.45749, "r": 168.04523, "b": 350.74619, "coord_origin": "1"}}, {"id": 23, "text": "enc-layers", "bbox": {"l": 144.592, "t": 352.40848, "r": 183.82895, "b": 363.69717, "coord_origin": "1"}}, {"id": 24, "text": "#", "bbox": {"l": 207.974, "t": 339.45749, "r": 215.64923000000002, "b": 350.74619, "coord_origin": "1"}}, {"id": 25, "text": "dec-layers", "bbox": {"l": 192.19501, "t": 352.40848, "r": 231.42303, "b": 363.69717, "coord_origin": "1"}}, {"id": 26, "text": "Language", "bbox": {"l": 239.79799999999997, "t": 344.93649, "r": 278.3338, "b": 356.22519000000005, "coord_origin": "1"}}, {"id": 27, "text": "TEDs", "bbox": {"l": 324.67001, "t": 339.45749, "r": 348.26419, "b": 350.74619, "coord_origin": "1"}}, {"id": 28, "text": "mAP", "bbox": {"l": 396.271, "t": 339.45749, "r": 417.12595, "b": 350.74619, "coord_origin": "1"}}, {"id": 29, "text": "(0.75)", "bbox": {"l": 394.927, "t": 350.41647, "r": 418.46921, "b": 361.70517, "coord_origin": "1"}}, {"id": 30, "text": "Inference", "bbox": {"l": 430.771, "t": 339.45749, "r": 467.14142000000004, "b": 350.74619, "coord_origin": "1"}}, {"id": 31, "text": "time (secs)", "bbox": {"l": 427.14801, "t": 350.41647, "r": 470.76955999999996, 
"b": 361.70517, "coord_origin": "1"}}, {"id": 32, "text": "simple", "bbox": {"l": 286.686, "t": 352.40848, "r": 312.32812, "b": 363.69717, "coord_origin": "1"}}, {"id": 33, "text": "complex", "bbox": {"l": 320.702, "t": 352.40848, "r": 353.71539, "b": 363.69717, "coord_origin": "1"}}, {"id": 34, "text": "all", "bbox": {"l": 369.306, "t": 352.40848, "r": 379.02914, "b": 363.69717, "coord_origin": "1"}}, {"id": 35, "text": "6", "bbox": {"l": 161.90601, "t": 371.23849, "r": 166.51474, "b": 382.52719, "coord_origin": "1"}}, {"id": 36, "text": "6", "bbox": {"l": 209.509, "t": 371.23849, "r": 214.11774, "b": 382.52719, "coord_origin": "1"}}, {"id": 37, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 365.75848, "r": 271.41064, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 38, "text": "0.965", "bbox": {"l": 289.017, "t": 365.75848, "r": 310.00732, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 39, "text": "0.934", "bbox": {"l": 326.71701, "t": 365.75848, "r": 347.70734, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 40, "text": "0.955", "bbox": {"l": 363.67599, "t": 365.75848, "r": 384.66632, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 41, "text": "0.88", "bbox": {"l": 397.26999, "t": 365.69571, "r": 416.12634, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 42, "text": "2.73", "bbox": {"l": 439.52701, "t": 365.69571, "r": 458.38336, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 43, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 378.71048, "r": 272.94495, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 44, "text": "0.969", "bbox": {"l": 289.017, "t": 378.71048, "r": 310.00732, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 45, "text": "0.927", "bbox": {"l": 326.71701, "t": 378.71048, "r": 347.70734, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 46, "text": "0.955", "bbox": {"l": 363.67599, "t": 378.71048, "r": 384.66632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 47, 
"text": "0.857", "bbox": {"l": 396.20599, "t": 378.71048, "r": 417.19632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 48, "text": "5.39", "bbox": {"l": 440.767, "t": 378.71048, "r": 457.15039, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 49, "text": "4", "bbox": {"l": 161.90601, "t": 397.53949, "r": 166.51474, "b": 408.82819, "coord_origin": "1"}}, {"id": 50, "text": "4", "bbox": {"l": 209.509, "t": 397.53949, "r": 214.11774, "b": 408.82819, "coord_origin": "1"}}, {"id": 51, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 392.06049, "r": 271.41064, "b": 403.34918, "coord_origin": "1"}}, {"id": 52, "text": "0.938", "bbox": {"l": 289.017, "t": 392.06049, "r": 310.00732, "b": 403.34918, "coord_origin": "1"}}, {"id": 53, "text": "0.904", "bbox": {"l": 326.71701, "t": 392.06049, "r": 347.70734, "b": 403.34918, "coord_origin": "1"}}, {"id": 54, "text": "0.927", "bbox": {"l": 363.67599, "t": 392.06049, "r": 384.66632, "b": 403.34918, "coord_origin": "1"}}, {"id": 55, "text": "0.853", "bbox": {"l": 394.61801, "t": 391.99771, "r": 418.77798, "b": 403.40298, "coord_origin": "1"}}, {"id": 56, "text": "1.97", "bbox": {"l": 439.52701, "t": 391.99771, "r": 458.38336, "b": 403.40298, "coord_origin": "1"}}, {"id": 57, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 405.01147, "r": 272.94495, "b": 416.30017, "coord_origin": "1"}}, {"id": 58, "text": "0.952", "bbox": {"l": 289.017, "t": 405.01147, "r": 310.00732, "b": 416.30017, "coord_origin": "1"}}, {"id": 59, "text": "0.909", "bbox": {"l": 326.71701, "t": 405.01147, "r": 347.70734, "b": 416.30017, "coord_origin": "1"}}, {"id": 60, "text": "0.938", "bbox": {"l": 362.08801, "t": 404.9486999999999, "r": 386.24799, "b": 416.35397, "coord_origin": "1"}}, {"id": 61, "text": "0.843", "bbox": {"l": 396.20599, "t": 405.01147, "r": 417.19632, "b": 416.30017, "coord_origin": "1"}}, {"id": 62, "text": "3.77", "bbox": {"l": 440.767, "t": 405.01147, "r": 457.15039, "b": 416.30017, "coord_origin": "1"}}, 
{"id": 63, "text": "2", "bbox": {"l": 161.90601, "t": 423.84048, "r": 166.51474, "b": 435.12918, "coord_origin": "1"}}, {"id": 64, "text": "4", "bbox": {"l": 209.509, "t": 423.84048, "r": 214.11774, "b": 435.12918, "coord_origin": "1"}}, {"id": 65, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 418.3614799999999, "r": 271.41064, "b": 429.65018, "coord_origin": "1"}}, {"id": 66, "text": "0.923", "bbox": {"l": 289.017, "t": 418.3614799999999, "r": 310.00732, "b": 429.65018, "coord_origin": "1"}}, {"id": 67, "text": "0.897", "bbox": {"l": 326.71701, "t": 418.3614799999999, "r": 347.70734, "b": 429.65018, "coord_origin": "1"}}, {"id": 68, "text": "0.915", "bbox": {"l": 363.67599, "t": 418.3614799999999, "r": 384.66632, "b": 429.65018, "coord_origin": "1"}}, {"id": 69, "text": "0.859", "bbox": {"l": 394.61801, "t": 418.29871, "r": 418.77798, "b": 429.70398, "coord_origin": "1"}}, {"id": 70, "text": "1.91", "bbox": {"l": 439.52701, "t": 418.29871, "r": 458.38336, "b": 429.70398, "coord_origin": "1"}}, {"id": 71, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 431.31246999999996, "r": 272.94495, "b": 442.60117, "coord_origin": "1"}}, {"id": 72, "text": "0.945", "bbox": {"l": 289.017, "t": 431.31246999999996, "r": 310.00732, "b": 442.60117, "coord_origin": "1"}}, {"id": 73, "text": "0.901", "bbox": {"l": 326.71701, "t": 431.31246999999996, "r": 347.70734, "b": 442.60117, "coord_origin": "1"}}, {"id": 74, "text": "0.931", "bbox": {"l": 362.08801, "t": 431.24969, "r": 386.24799, "b": 442.65497, "coord_origin": "1"}}, {"id": 75, "text": "0.834", "bbox": {"l": 396.20599, "t": 431.31246999999996, "r": 417.19632, "b": 442.60117, "coord_origin": "1"}}, {"id": 76, "text": "3.81", "bbox": {"l": 440.767, "t": 431.31246999999996, "r": 457.15039, "b": 442.60117, "coord_origin": "1"}}, {"id": 77, "text": "4", "bbox": {"l": 161.90601, "t": 450.14248999999995, "r": 166.51474, "b": 461.43118, "coord_origin": "1"}}, {"id": 78, "text": "2", "bbox": {"l": 209.509, "t": 
450.14248999999995, "r": 214.11774, "b": 461.43118, "coord_origin": "1"}}, {"id": 79, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 444.66248, "r": 271.41064, "b": 455.95117, "coord_origin": "1"}}, {"id": 80, "text": "0.952", "bbox": {"l": 289.017, "t": 444.66248, "r": 310.00732, "b": 455.95117, "coord_origin": "1"}}, {"id": 81, "text": "0.92", "bbox": {"l": 329.021, "t": 444.66248, "r": 345.40439, "b": 455.95117, "coord_origin": "1"}}, {"id": 82, "text": "0.942", "bbox": {"l": 362.08801, "t": 444.5996999999999, "r": 386.24799, "b": 456.00497, "coord_origin": "1"}}, {"id": 83, "text": "0.857", "bbox": {"l": 394.61801, "t": 444.5996999999999, "r": 418.77798, "b": 456.00497, "coord_origin": "1"}}, {"id": 84, "text": "1.22", "bbox": {"l": 439.52701, "t": 444.5996999999999, "r": 458.38336, "b": 456.00497, "coord_origin": "1"}}, {"id": 85, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 457.61447, "r": 272.94495, "b": 468.90317, "coord_origin": "1"}}, {"id": 86, "text": "0.944", "bbox": {"l": 289.017, "t": 457.61447, "r": 310.00732, "b": 468.90317, "coord_origin": "1"}}, {"id": 87, "text": "0.903", "bbox": {"l": 326.71701, "t": 457.61447, "r": 347.70734, "b": 468.90317, "coord_origin": "1"}}, {"id": 88, "text": "0.931", "bbox": {"l": 363.67599, "t": 457.61447, "r": 384.66632, "b": 468.90317, "coord_origin": "1"}}, {"id": 89, "text": "0.824", "bbox": {"l": 396.20599, "t": 457.61447, "r": 417.19632, "b": 468.90317, "coord_origin": "1"}}, {"id": 90, "text": "2", "bbox": {"l": 446.65302, "t": 457.61447, "r": 451.26175, "b": 468.90317, "coord_origin": "1"}}, {"id": 91, "text": "5.2", "bbox": {"l": 134.765, "t": 505.67111, "r": 149.40306, "b": 518.17419, "coord_origin": "1"}}, {"id": 92, "text": "Quantitative Results", "bbox": {"l": 160.85905, "t": 505.67111, "r": 264.40829, "b": 518.17419, "coord_origin": "1"}}, {"id": 93, "text": "We picked the model parameter configuration that produced the best prediction", "bbox": {"l": 134.765, "t": 522.08005, "r": 
479.72983, "b": 534.61301, "coord_origin": "1"}}, {"id": 94, "text": "quality (enc=6, dec=6, heads=8) with PubTabNet alone, then independently", "bbox": {"l": 134.765, "t": 534.03604, "r": 480.5897499999999, "b": 546.569, "coord_origin": "1"}}, {"id": 95, "text": "trained and evaluated it on three publicly available data sets: PubTabNet (395k", "bbox": {"l": 134.765, "t": 545.99104, "r": 480.72003, "b": 558.524, "coord_origin": "1"}}, {"id": 96, "text": "samples), FinTabNet (113k samples) and PubTables-1M (about 1M samples).", "bbox": {"l": 134.765, "t": 557.94604, "r": 480.60577, "b": 570.479, "coord_origin": "1"}}, {"id": 97, "text": "Performance results are presented in Table. 2. It is clearly evident that the model", "bbox": {"l": 134.765, "t": 569.90103, "r": 480.5936899999999, "b": 582.43399, "coord_origin": "1"}}, {"id": 98, "text": "trained on OTSL outperforms HTML across the board, keeping high TEDs and", "bbox": {"l": 134.765, "t": 581.85603, "r": 480.59158, "b": 594.38899, "coord_origin": "1"}}, {"id": 99, "text": "mAP scores even on difficult financial tables (FinTabNet) that contain sparse", "bbox": {"l": 134.765, "t": 593.81204, "r": 480.58080999999993, "b": 606.345, "coord_origin": "1"}}, {"id": 100, "text": "and large tables.", "bbox": {"l": 134.765, "t": 605.76704, "r": 206.79959, "b": 618.3, "coord_origin": "1"}}, {"id": 101, "text": "Additionally, the results show that OTSL has an advantage over HTML", "bbox": {"l": 149.709, "t": 617.72205, "r": 480.59479, "b": 630.255, "coord_origin": "1"}}, {"id": 102, "text": "when applied on a bigger data set like PubTables-1M and achieves significantly", "bbox": {"l": 134.765, "t": 629.6770300000001, "r": 480.59857000000005, "b": 642.2099900000001, "coord_origin": "1"}}, {"id": 103, "text": "improved scores. 
Finally, OTSL achieves faster inference due to fewer decoding", "bbox": {"l": 134.765, "t": 641.63203, "r": 480.59384000000006, "b": 654.16499, "coord_origin": "1"}}, {"id": 104, "text": "steps which is a result of the reduced sequence representation.", "bbox": {"l": 134.765, "t": 653.58704, "r": 405.7995, "b": 666.12, "coord_origin": "1"}}], "predictions": {"layout": {"clusters": [{"id": 0, "label": "Page-header", "bbox": {"l": 193.83700561523438, "t": 91.49352999999996, "r": 447.54476999999997, "b": 102.78223000000003, "coord_origin": "1"}, "confidence": 0.9235936999320984, "cells": [{"id": 0, "text": "Optimized Table Tokenization for Table Structure Recognition", "bbox": {"l": 194.478, "t": 91.49352999999996, "r": 447.54476999999997, "b": 102.78223000000003, "coord_origin": "1"}}]}, {"id": 1, "label": "Page-header", "bbox": {"l": 475.3370056152344, "t": 91.49352999999996, "r": 480.59314, "b": 102.78223000000003, "coord_origin": "1"}, "confidence": 0.7262580394744873, "cells": [{"id": 1, "text": "9", "bbox": {"l": 475.98441, "t": 91.49352999999996, "r": 480.59314, "b": 102.78223000000003, "coord_origin": "1"}}]}, {"id": 2, "label": "Text", "bbox": {"l": 134.23260498046875, "t": 116.46301000000005, "r": 480.63818359375, "b": 152.90697999999998, "coord_origin": "1"}, "confidence": 0.9810057878494263, "cells": [{"id": 2, "text": "order to compute the TED score. 
Inference timing results for all experiments", "bbox": {"l": 134.765, "t": 116.46301000000005, "r": 480.59067, "b": 128.99597000000006, "coord_origin": "1"}}, {"id": 3, "text": "were obtained from the same machine on a single core with AMD EPYC 7763", "bbox": {"l": 134.765, "t": 128.41803000000004, "r": 480.59665, "b": 140.95099000000005, "coord_origin": "1"}}, {"id": 4, "text": "CPU @2.45 GHz.", "bbox": {"l": 134.765, "t": 140.37401999999997, "r": 210.78761, "b": 152.90697999999998, "coord_origin": "1"}}]}, {"id": 3, "label": "Section-header", "bbox": {"l": 134.15780639648438, "t": 166.70514000000003, "r": 318.45145, "b": 179.20818999999995, "coord_origin": "1"}, "confidence": 0.9181773662567139, "cells": [{"id": 5, "text": "5.1", "bbox": {"l": 134.765, "t": 166.70514000000003, "r": 149.40306, "b": 179.20818999999995, "coord_origin": "1"}}, {"id": 6, "text": "Hyper Parameter Optimization", "bbox": {"l": 160.85905, "t": 166.70514000000003, "r": 318.45145, "b": 179.20818999999995, "coord_origin": "1"}}]}, {"id": 4, "label": "Text", "bbox": {"l": 134.27206420898438, "t": 183.11505, "r": 480.8331604003906, "b": 255.42400999999995, "coord_origin": "1"}, "confidence": 0.9886466860771179, "cells": [{"id": 7, "text": "We have chosen the PubTabNet data set to perform HPO, since it includes a", "bbox": {"l": 134.765, "t": 183.11505, "r": 479.74982000000006, "b": 195.64801, "coord_origin": "1"}}, {"id": 8, "text": "highly diverse set of tables. Also we report TED scores separately for simple and", "bbox": {"l": 134.765, "t": 195.07007, "r": 480.58765, "b": 207.60303, "coord_origin": "1"}}, {"id": 9, "text": "complex tables (tables with cell spans). Results are presented in Table. 1. 
It is", "bbox": {"l": 134.765, "t": 207.02502000000004, "r": 480.58859000000007, "b": 219.55798000000004, "coord_origin": "1"}}, {"id": 10, "text": "evident that with OTSL, our model achieves the same TED score and slightly", "bbox": {"l": 134.765, "t": 218.98004000000003, "r": 480.59567, "b": 231.51300000000003, "coord_origin": "1"}}, {"id": 11, "text": "better mAP scores in comparison to HTML. However OTSL yields a", "bbox": {"l": 134.765, "t": 230.93506000000002, "r": 440.9425, "b": 243.46802000000002, "coord_origin": "1"}}, {"id": 12, "text": "2x speed", "bbox": {"l": 444.86800999999997, "t": 230.98486000000003, "r": 480.58792, "b": 243.46802000000002, "coord_origin": "1"}}, {"id": 13, "text": "up", "bbox": {"l": 134.765, "t": 242.94086000000004, "r": 145.19585, "b": 255.42400999999995, "coord_origin": "1"}}, {"id": 14, "text": "in the inference runtime over HTML.", "bbox": {"l": 149.149, "t": 242.89104999999995, "r": 311.22256, "b": 255.42400999999995, "coord_origin": "1"}}]}, {"id": 5, "label": "Caption", "bbox": {"l": 134.35366821289062, "t": 272.79474000000005, "r": 480.59890999999993, "b": 327.98218, "coord_origin": "1"}, "confidence": 0.9766141772270203, "cells": [{"id": 15, "text": "Table", "bbox": {"l": 134.765, "t": 272.79474000000005, "r": 159.22983, "b": 284.1999799999999, "coord_origin": "1"}}, {"id": 16, "text": "1.", "bbox": {"l": 167.34442, "t": 272.79474000000005, "r": 174.71301, "b": 284.1999799999999, "coord_origin": "1"}}, {"id": 17, "text": "HPO performed in OTSL and HTML representation on the same", "bbox": {"l": 188.133, "t": 272.85748, "r": 480.58101999999997, "b": 284.14618, "coord_origin": "1"}}, {"id": 18, "text": "transformer-based TableFormer [9] architecture, trained only on PubTabNet [22]. 
Ef-", "bbox": {"l": 134.765, "t": 283.81647, "r": 480.59890999999993, "b": 295.10516000000007, "coord_origin": "1"}}, {"id": 19, "text": "fects of reducing the # of layers in encoder and decoder stages of the model show that", "bbox": {"l": 134.765, "t": 294.77547999999996, "r": 480.59887999999995, "b": 306.06418, "coord_origin": "1"}}, {"id": 20, "text": "smaller models trained on OTSL perform better, especially in recognizing complex", "bbox": {"l": 134.765, "t": 305.73447, "r": 480.59180000000003, "b": 317.02316, "coord_origin": "1"}}, {"id": 21, "text": "table structures, and maintain a much higher mAP score than the HTML counterpart.", "bbox": {"l": 134.765, "t": 316.69348, "r": 480.58471999999995, "b": 327.98218, "coord_origin": "1"}}]}, {"id": 6, "label": "Table", "bbox": {"l": 139.21253967285156, "t": 336.4130859375, "r": 475.24322509765625, "b": 469.6602783203125, "coord_origin": "1"}, "confidence": 0.9847328066825867, "cells": [{"id": 22, "text": "#", "bbox": {"l": 160.37, "t": 339.45749, "r": 168.04523, "b": 350.74619, "coord_origin": "1"}}, {"id": 23, "text": "enc-layers", "bbox": {"l": 144.592, "t": 352.40848, "r": 183.82895, "b": 363.69717, "coord_origin": "1"}}, {"id": 24, "text": "#", "bbox": {"l": 207.974, "t": 339.45749, "r": 215.64923000000002, "b": 350.74619, "coord_origin": "1"}}, {"id": 25, "text": "dec-layers", "bbox": {"l": 192.19501, "t": 352.40848, "r": 231.42303, "b": 363.69717, "coord_origin": "1"}}, {"id": 26, "text": "Language", "bbox": {"l": 239.79799999999997, "t": 344.93649, "r": 278.3338, "b": 356.22519000000005, "coord_origin": "1"}}, {"id": 27, "text": "TEDs", "bbox": {"l": 324.67001, "t": 339.45749, "r": 348.26419, "b": 350.74619, "coord_origin": "1"}}, {"id": 28, "text": "mAP", "bbox": {"l": 396.271, "t": 339.45749, "r": 417.12595, "b": 350.74619, "coord_origin": "1"}}, {"id": 29, "text": "(0.75)", "bbox": {"l": 394.927, "t": 350.41647, "r": 418.46921, "b": 361.70517, "coord_origin": "1"}}, {"id": 30, "text": "Inference", 
"bbox": {"l": 430.771, "t": 339.45749, "r": 467.14142000000004, "b": 350.74619, "coord_origin": "1"}}, {"id": 31, "text": "time (secs)", "bbox": {"l": 427.14801, "t": 350.41647, "r": 470.76955999999996, "b": 361.70517, "coord_origin": "1"}}, {"id": 32, "text": "simple", "bbox": {"l": 286.686, "t": 352.40848, "r": 312.32812, "b": 363.69717, "coord_origin": "1"}}, {"id": 33, "text": "complex", "bbox": {"l": 320.702, "t": 352.40848, "r": 353.71539, "b": 363.69717, "coord_origin": "1"}}, {"id": 34, "text": "all", "bbox": {"l": 369.306, "t": 352.40848, "r": 379.02914, "b": 363.69717, "coord_origin": "1"}}, {"id": 35, "text": "6", "bbox": {"l": 161.90601, "t": 371.23849, "r": 166.51474, "b": 382.52719, "coord_origin": "1"}}, {"id": 36, "text": "6", "bbox": {"l": 209.509, "t": 371.23849, "r": 214.11774, "b": 382.52719, "coord_origin": "1"}}, {"id": 37, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 365.75848, "r": 271.41064, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 38, "text": "0.965", "bbox": {"l": 289.017, "t": 365.75848, "r": 310.00732, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 39, "text": "0.934", "bbox": {"l": 326.71701, "t": 365.75848, "r": 347.70734, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 40, "text": "0.955", "bbox": {"l": 363.67599, "t": 365.75848, "r": 384.66632, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 41, "text": "0.88", "bbox": {"l": 397.26999, "t": 365.69571, "r": 416.12634, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 42, "text": "2.73", "bbox": {"l": 439.52701, "t": 365.69571, "r": 458.38336, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 43, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 378.71048, "r": 272.94495, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 44, "text": "0.969", "bbox": {"l": 289.017, "t": 378.71048, "r": 310.00732, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 45, "text": "0.927", "bbox": {"l": 326.71701, "t": 378.71048, "r": 
347.70734, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 46, "text": "0.955", "bbox": {"l": 363.67599, "t": 378.71048, "r": 384.66632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 47, "text": "0.857", "bbox": {"l": 396.20599, "t": 378.71048, "r": 417.19632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 48, "text": "5.39", "bbox": {"l": 440.767, "t": 378.71048, "r": 457.15039, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 49, "text": "4", "bbox": {"l": 161.90601, "t": 397.53949, "r": 166.51474, "b": 408.82819, "coord_origin": "1"}}, {"id": 50, "text": "4", "bbox": {"l": 209.509, "t": 397.53949, "r": 214.11774, "b": 408.82819, "coord_origin": "1"}}, {"id": 51, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 392.06049, "r": 271.41064, "b": 403.34918, "coord_origin": "1"}}, {"id": 52, "text": "0.938", "bbox": {"l": 289.017, "t": 392.06049, "r": 310.00732, "b": 403.34918, "coord_origin": "1"}}, {"id": 53, "text": "0.904", "bbox": {"l": 326.71701, "t": 392.06049, "r": 347.70734, "b": 403.34918, "coord_origin": "1"}}, {"id": 54, "text": "0.927", "bbox": {"l": 363.67599, "t": 392.06049, "r": 384.66632, "b": 403.34918, "coord_origin": "1"}}, {"id": 55, "text": "0.853", "bbox": {"l": 394.61801, "t": 391.99771, "r": 418.77798, "b": 403.40298, "coord_origin": "1"}}, {"id": 56, "text": "1.97", "bbox": {"l": 439.52701, "t": 391.99771, "r": 458.38336, "b": 403.40298, "coord_origin": "1"}}, {"id": 57, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 405.01147, "r": 272.94495, "b": 416.30017, "coord_origin": "1"}}, {"id": 58, "text": "0.952", "bbox": {"l": 289.017, "t": 405.01147, "r": 310.00732, "b": 416.30017, "coord_origin": "1"}}, {"id": 59, "text": "0.909", "bbox": {"l": 326.71701, "t": 405.01147, "r": 347.70734, "b": 416.30017, "coord_origin": "1"}}, {"id": 60, "text": "0.938", "bbox": {"l": 362.08801, "t": 404.9486999999999, "r": 386.24799, "b": 416.35397, "coord_origin": "1"}}, {"id": 61, "text": "0.843", "bbox": {"l": 
396.20599, "t": 405.01147, "r": 417.19632, "b": 416.30017, "coord_origin": "1"}}, {"id": 62, "text": "3.77", "bbox": {"l": 440.767, "t": 405.01147, "r": 457.15039, "b": 416.30017, "coord_origin": "1"}}, {"id": 63, "text": "2", "bbox": {"l": 161.90601, "t": 423.84048, "r": 166.51474, "b": 435.12918, "coord_origin": "1"}}, {"id": 64, "text": "4", "bbox": {"l": 209.509, "t": 423.84048, "r": 214.11774, "b": 435.12918, "coord_origin": "1"}}, {"id": 65, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 418.3614799999999, "r": 271.41064, "b": 429.65018, "coord_origin": "1"}}, {"id": 66, "text": "0.923", "bbox": {"l": 289.017, "t": 418.3614799999999, "r": 310.00732, "b": 429.65018, "coord_origin": "1"}}, {"id": 67, "text": "0.897", "bbox": {"l": 326.71701, "t": 418.3614799999999, "r": 347.70734, "b": 429.65018, "coord_origin": "1"}}, {"id": 68, "text": "0.915", "bbox": {"l": 363.67599, "t": 418.3614799999999, "r": 384.66632, "b": 429.65018, "coord_origin": "1"}}, {"id": 69, "text": "0.859", "bbox": {"l": 394.61801, "t": 418.29871, "r": 418.77798, "b": 429.70398, "coord_origin": "1"}}, {"id": 70, "text": "1.91", "bbox": {"l": 439.52701, "t": 418.29871, "r": 458.38336, "b": 429.70398, "coord_origin": "1"}}, {"id": 71, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 431.31246999999996, "r": 272.94495, "b": 442.60117, "coord_origin": "1"}}, {"id": 72, "text": "0.945", "bbox": {"l": 289.017, "t": 431.31246999999996, "r": 310.00732, "b": 442.60117, "coord_origin": "1"}}, {"id": 73, "text": "0.901", "bbox": {"l": 326.71701, "t": 431.31246999999996, "r": 347.70734, "b": 442.60117, "coord_origin": "1"}}, {"id": 74, "text": "0.931", "bbox": {"l": 362.08801, "t": 431.24969, "r": 386.24799, "b": 442.65497, "coord_origin": "1"}}, {"id": 75, "text": "0.834", "bbox": {"l": 396.20599, "t": 431.31246999999996, "r": 417.19632, "b": 442.60117, "coord_origin": "1"}}, {"id": 76, "text": "3.81", "bbox": {"l": 440.767, "t": 431.31246999999996, "r": 457.15039, "b": 442.60117, 
"coord_origin": "1"}}, {"id": 77, "text": "4", "bbox": {"l": 161.90601, "t": 450.14248999999995, "r": 166.51474, "b": 461.43118, "coord_origin": "1"}}, {"id": 78, "text": "2", "bbox": {"l": 209.509, "t": 450.14248999999995, "r": 214.11774, "b": 461.43118, "coord_origin": "1"}}, {"id": 79, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 444.66248, "r": 271.41064, "b": 455.95117, "coord_origin": "1"}}, {"id": 80, "text": "0.952", "bbox": {"l": 289.017, "t": 444.66248, "r": 310.00732, "b": 455.95117, "coord_origin": "1"}}, {"id": 81, "text": "0.92", "bbox": {"l": 329.021, "t": 444.66248, "r": 345.40439, "b": 455.95117, "coord_origin": "1"}}, {"id": 82, "text": "0.942", "bbox": {"l": 362.08801, "t": 444.5996999999999, "r": 386.24799, "b": 456.00497, "coord_origin": "1"}}, {"id": 83, "text": "0.857", "bbox": {"l": 394.61801, "t": 444.5996999999999, "r": 418.77798, "b": 456.00497, "coord_origin": "1"}}, {"id": 84, "text": "1.22", "bbox": {"l": 439.52701, "t": 444.5996999999999, "r": 458.38336, "b": 456.00497, "coord_origin": "1"}}, {"id": 85, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 457.61447, "r": 272.94495, "b": 468.90317, "coord_origin": "1"}}, {"id": 86, "text": "0.944", "bbox": {"l": 289.017, "t": 457.61447, "r": 310.00732, "b": 468.90317, "coord_origin": "1"}}, {"id": 87, "text": "0.903", "bbox": {"l": 326.71701, "t": 457.61447, "r": 347.70734, "b": 468.90317, "coord_origin": "1"}}, {"id": 88, "text": "0.931", "bbox": {"l": 363.67599, "t": 457.61447, "r": 384.66632, "b": 468.90317, "coord_origin": "1"}}, {"id": 89, "text": "0.824", "bbox": {"l": 396.20599, "t": 457.61447, "r": 417.19632, "b": 468.90317, "coord_origin": "1"}}, {"id": 90, "text": "2", "bbox": {"l": 446.65302, "t": 457.61447, "r": 451.26175, "b": 468.90317, "coord_origin": "1"}}]}, {"id": 7, "label": "Section-header", "bbox": {"l": 134.34051513671875, "t": 505.67111, "r": 264.40829, "b": 518.17419, "coord_origin": "1"}, "confidence": 0.9198964834213257, "cells": [{"id": 91, 
"text": "5.2", "bbox": {"l": 134.765, "t": 505.67111, "r": 149.40306, "b": 518.17419, "coord_origin": "1"}}, {"id": 92, "text": "Quantitative Results", "bbox": {"l": 160.85905, "t": 505.67111, "r": 264.40829, "b": 518.17419, "coord_origin": "1"}}]}, {"id": 8, "label": "Text", "bbox": {"l": 134.35081481933594, "t": 522.08005, "r": 480.72003, "b": 618.3, "coord_origin": "1"}, "confidence": 0.98807293176651, "cells": [{"id": 93, "text": "We picked the model parameter configuration that produced the best prediction", "bbox": {"l": 134.765, "t": 522.08005, "r": 479.72983, "b": 534.61301, "coord_origin": "1"}}, {"id": 94, "text": "quality (enc=6, dec=6, heads=8) with PubTabNet alone, then independently", "bbox": {"l": 134.765, "t": 534.03604, "r": 480.5897499999999, "b": 546.569, "coord_origin": "1"}}, {"id": 95, "text": "trained and evaluated it on three publicly available data sets: PubTabNet (395k", "bbox": {"l": 134.765, "t": 545.99104, "r": 480.72003, "b": 558.524, "coord_origin": "1"}}, {"id": 96, "text": "samples), FinTabNet (113k samples) and PubTables-1M (about 1M samples).", "bbox": {"l": 134.765, "t": 557.94604, "r": 480.60577, "b": 570.479, "coord_origin": "1"}}, {"id": 97, "text": "Performance results are presented in Table. 2. 
It is clearly evident that the model", "bbox": {"l": 134.765, "t": 569.90103, "r": 480.5936899999999, "b": 582.43399, "coord_origin": "1"}}, {"id": 98, "text": "trained on OTSL outperforms HTML across the board, keeping high TEDs and", "bbox": {"l": 134.765, "t": 581.85603, "r": 480.59158, "b": 594.38899, "coord_origin": "1"}}, {"id": 99, "text": "mAP scores even on difficult financial tables (FinTabNet) that contain sparse", "bbox": {"l": 134.765, "t": 593.81204, "r": 480.58080999999993, "b": 606.345, "coord_origin": "1"}}, {"id": 100, "text": "and large tables.", "bbox": {"l": 134.765, "t": 605.76704, "r": 206.79959, "b": 618.3, "coord_origin": "1"}}]}, {"id": 9, "label": "Text", "bbox": {"l": 134.27769470214844, "t": 617.72205, "r": 480.59857000000005, "b": 666.12, "coord_origin": "1"}, "confidence": 0.9812840819358826, "cells": [{"id": 101, "text": "Additionally, the results show that OTSL has an advantage over HTML", "bbox": {"l": 149.709, "t": 617.72205, "r": 480.59479, "b": 630.255, "coord_origin": "1"}}, {"id": 102, "text": "when applied on a bigger data set like PubTables-1M and achieves significantly", "bbox": {"l": 134.765, "t": 629.6770300000001, "r": 480.59857000000005, "b": 642.2099900000001, "coord_origin": "1"}}, {"id": 103, "text": "improved scores. 
Finally, OTSL achieves faster inference due to fewer decoding", "bbox": {"l": 134.765, "t": 641.63203, "r": 480.59384000000006, "b": 654.16499, "coord_origin": "1"}}, {"id": 104, "text": "steps which is a result of the reduced sequence representation.", "bbox": {"l": 134.765, "t": 653.58704, "r": 405.7995, "b": 666.12, "coord_origin": "1"}}]}]}, "tablestructure": {"table_map": {"6": {"label": "Table", "id": 6, "page_no": 0, "cluster": {"id": 6, "label": "Table", "bbox": {"l": 139.21253967285156, "t": 336.4130859375, "r": 475.24322509765625, "b": 469.6602783203125, "coord_origin": "1"}, "confidence": 0.9847328066825867, "cells": [{"id": 22, "text": "#", "bbox": {"l": 160.37, "t": 339.45749, "r": 168.04523, "b": 350.74619, "coord_origin": "1"}}, {"id": 23, "text": "enc-layers", "bbox": {"l": 144.592, "t": 352.40848, "r": 183.82895, "b": 363.69717, "coord_origin": "1"}}, {"id": 24, "text": "#", "bbox": {"l": 207.974, "t": 339.45749, "r": 215.64923000000002, "b": 350.74619, "coord_origin": "1"}}, {"id": 25, "text": "dec-layers", "bbox": {"l": 192.19501, "t": 352.40848, "r": 231.42303, "b": 363.69717, "coord_origin": "1"}}, {"id": 26, "text": "Language", "bbox": {"l": 239.79799999999997, "t": 344.93649, "r": 278.3338, "b": 356.22519000000005, "coord_origin": "1"}}, {"id": 27, "text": "TEDs", "bbox": {"l": 324.67001, "t": 339.45749, "r": 348.26419, "b": 350.74619, "coord_origin": "1"}}, {"id": 28, "text": "mAP", "bbox": {"l": 396.271, "t": 339.45749, "r": 417.12595, "b": 350.74619, "coord_origin": "1"}}, {"id": 29, "text": "(0.75)", "bbox": {"l": 394.927, "t": 350.41647, "r": 418.46921, "b": 361.70517, "coord_origin": "1"}}, {"id": 30, "text": "Inference", "bbox": {"l": 430.771, "t": 339.45749, "r": 467.14142000000004, "b": 350.74619, "coord_origin": "1"}}, {"id": 31, "text": "time (secs)", "bbox": {"l": 427.14801, "t": 350.41647, "r": 470.76955999999996, "b": 361.70517, "coord_origin": "1"}}, {"id": 32, "text": "simple", "bbox": {"l": 286.686, "t": 352.40848, "r": 
312.32812, "b": 363.69717, "coord_origin": "1"}}, {"id": 33, "text": "complex", "bbox": {"l": 320.702, "t": 352.40848, "r": 353.71539, "b": 363.69717, "coord_origin": "1"}}, {"id": 34, "text": "all", "bbox": {"l": 369.306, "t": 352.40848, "r": 379.02914, "b": 363.69717, "coord_origin": "1"}}, {"id": 35, "text": "6", "bbox": {"l": 161.90601, "t": 371.23849, "r": 166.51474, "b": 382.52719, "coord_origin": "1"}}, {"id": 36, "text": "6", "bbox": {"l": 209.509, "t": 371.23849, "r": 214.11774, "b": 382.52719, "coord_origin": "1"}}, {"id": 37, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 365.75848, "r": 271.41064, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 38, "text": "0.965", "bbox": {"l": 289.017, "t": 365.75848, "r": 310.00732, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 39, "text": "0.934", "bbox": {"l": 326.71701, "t": 365.75848, "r": 347.70734, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 40, "text": "0.955", "bbox": {"l": 363.67599, "t": 365.75848, "r": 384.66632, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 41, "text": "0.88", "bbox": {"l": 397.26999, "t": 365.69571, "r": 416.12634, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 42, "text": "2.73", "bbox": {"l": 439.52701, "t": 365.69571, "r": 458.38336, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 43, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 378.71048, "r": 272.94495, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 44, "text": "0.969", "bbox": {"l": 289.017, "t": 378.71048, "r": 310.00732, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 45, "text": "0.927", "bbox": {"l": 326.71701, "t": 378.71048, "r": 347.70734, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 46, "text": "0.955", "bbox": {"l": 363.67599, "t": 378.71048, "r": 384.66632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 47, "text": "0.857", "bbox": {"l": 396.20599, "t": 378.71048, "r": 417.19632, "b": 389.99917999999997, "coord_origin": 
"1"}}, {"id": 48, "text": "5.39", "bbox": {"l": 440.767, "t": 378.71048, "r": 457.15039, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 49, "text": "4", "bbox": {"l": 161.90601, "t": 397.53949, "r": 166.51474, "b": 408.82819, "coord_origin": "1"}}, {"id": 50, "text": "4", "bbox": {"l": 209.509, "t": 397.53949, "r": 214.11774, "b": 408.82819, "coord_origin": "1"}}, {"id": 51, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 392.06049, "r": 271.41064, "b": 403.34918, "coord_origin": "1"}}, {"id": 52, "text": "0.938", "bbox": {"l": 289.017, "t": 392.06049, "r": 310.00732, "b": 403.34918, "coord_origin": "1"}}, {"id": 53, "text": "0.904", "bbox": {"l": 326.71701, "t": 392.06049, "r": 347.70734, "b": 403.34918, "coord_origin": "1"}}, {"id": 54, "text": "0.927", "bbox": {"l": 363.67599, "t": 392.06049, "r": 384.66632, "b": 403.34918, "coord_origin": "1"}}, {"id": 55, "text": "0.853", "bbox": {"l": 394.61801, "t": 391.99771, "r": 418.77798, "b": 403.40298, "coord_origin": "1"}}, {"id": 56, "text": "1.97", "bbox": {"l": 439.52701, "t": 391.99771, "r": 458.38336, "b": 403.40298, "coord_origin": "1"}}, {"id": 57, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 405.01147, "r": 272.94495, "b": 416.30017, "coord_origin": "1"}}, {"id": 58, "text": "0.952", "bbox": {"l": 289.017, "t": 405.01147, "r": 310.00732, "b": 416.30017, "coord_origin": "1"}}, {"id": 59, "text": "0.909", "bbox": {"l": 326.71701, "t": 405.01147, "r": 347.70734, "b": 416.30017, "coord_origin": "1"}}, {"id": 60, "text": "0.938", "bbox": {"l": 362.08801, "t": 404.9486999999999, "r": 386.24799, "b": 416.35397, "coord_origin": "1"}}, {"id": 61, "text": "0.843", "bbox": {"l": 396.20599, "t": 405.01147, "r": 417.19632, "b": 416.30017, "coord_origin": "1"}}, {"id": 62, "text": "3.77", "bbox": {"l": 440.767, "t": 405.01147, "r": 457.15039, "b": 416.30017, "coord_origin": "1"}}, {"id": 63, "text": "2", "bbox": {"l": 161.90601, "t": 423.84048, "r": 166.51474, "b": 435.12918, "coord_origin": "1"}}, 
{"id": 64, "text": "4", "bbox": {"l": 209.509, "t": 423.84048, "r": 214.11774, "b": 435.12918, "coord_origin": "1"}}, {"id": 65, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 418.3614799999999, "r": 271.41064, "b": 429.65018, "coord_origin": "1"}}, {"id": 66, "text": "0.923", "bbox": {"l": 289.017, "t": 418.3614799999999, "r": 310.00732, "b": 429.65018, "coord_origin": "1"}}, {"id": 67, "text": "0.897", "bbox": {"l": 326.71701, "t": 418.3614799999999, "r": 347.70734, "b": 429.65018, "coord_origin": "1"}}, {"id": 68, "text": "0.915", "bbox": {"l": 363.67599, "t": 418.3614799999999, "r": 384.66632, "b": 429.65018, "coord_origin": "1"}}, {"id": 69, "text": "0.859", "bbox": {"l": 394.61801, "t": 418.29871, "r": 418.77798, "b": 429.70398, "coord_origin": "1"}}, {"id": 70, "text": "1.91", "bbox": {"l": 439.52701, "t": 418.29871, "r": 458.38336, "b": 429.70398, "coord_origin": "1"}}, {"id": 71, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 431.31246999999996, "r": 272.94495, "b": 442.60117, "coord_origin": "1"}}, {"id": 72, "text": "0.945", "bbox": {"l": 289.017, "t": 431.31246999999996, "r": 310.00732, "b": 442.60117, "coord_origin": "1"}}, {"id": 73, "text": "0.901", "bbox": {"l": 326.71701, "t": 431.31246999999996, "r": 347.70734, "b": 442.60117, "coord_origin": "1"}}, {"id": 74, "text": "0.931", "bbox": {"l": 362.08801, "t": 431.24969, "r": 386.24799, "b": 442.65497, "coord_origin": "1"}}, {"id": 75, "text": "0.834", "bbox": {"l": 396.20599, "t": 431.31246999999996, "r": 417.19632, "b": 442.60117, "coord_origin": "1"}}, {"id": 76, "text": "3.81", "bbox": {"l": 440.767, "t": 431.31246999999996, "r": 457.15039, "b": 442.60117, "coord_origin": "1"}}, {"id": 77, "text": "4", "bbox": {"l": 161.90601, "t": 450.14248999999995, "r": 166.51474, "b": 461.43118, "coord_origin": "1"}}, {"id": 78, "text": "2", "bbox": {"l": 209.509, "t": 450.14248999999995, "r": 214.11774, "b": 461.43118, "coord_origin": "1"}}, {"id": 79, "text": "OTSL", "bbox": {"l": 
246.71000999999998, "t": 444.66248, "r": 271.41064, "b": 455.95117, "coord_origin": "1"}}, {"id": 80, "text": "0.952", "bbox": {"l": 289.017, "t": 444.66248, "r": 310.00732, "b": 455.95117, "coord_origin": "1"}}, {"id": 81, "text": "0.92", "bbox": {"l": 329.021, "t": 444.66248, "r": 345.40439, "b": 455.95117, "coord_origin": "1"}}, {"id": 82, "text": "0.942", "bbox": {"l": 362.08801, "t": 444.5996999999999, "r": 386.24799, "b": 456.00497, "coord_origin": "1"}}, {"id": 83, "text": "0.857", "bbox": {"l": 394.61801, "t": 444.5996999999999, "r": 418.77798, "b": 456.00497, "coord_origin": "1"}}, {"id": 84, "text": "1.22", "bbox": {"l": 439.52701, "t": 444.5996999999999, "r": 458.38336, "b": 456.00497, "coord_origin": "1"}}, {"id": 85, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 457.61447, "r": 272.94495, "b": 468.90317, "coord_origin": "1"}}, {"id": 86, "text": "0.944", "bbox": {"l": 289.017, "t": 457.61447, "r": 310.00732, "b": 468.90317, "coord_origin": "1"}}, {"id": 87, "text": "0.903", "bbox": {"l": 326.71701, "t": 457.61447, "r": 347.70734, "b": 468.90317, "coord_origin": "1"}}, {"id": 88, "text": "0.931", "bbox": {"l": 363.67599, "t": 457.61447, "r": 384.66632, "b": 468.90317, "coord_origin": "1"}}, {"id": 89, "text": "0.824", "bbox": {"l": 396.20599, "t": 457.61447, "r": 417.19632, "b": 468.90317, "coord_origin": "1"}}, {"id": 90, "text": "2", "bbox": {"l": 446.65302, "t": 457.61447, "r": 451.26175, "b": 468.90317, "coord_origin": "1"}}]}, "text": null, "otsl_seq": ["ched", "ched", "ched", "ched", "lcel", "lcel", "ched", "ched", "nl", "ched", "ched", "ucel", "ched", "ched", "ched", "ched", "ched", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl"], 
"num_rows": 7, "num_cols": 8, "table_cells": [{"bbox": {"l": 160.37, "t": 339.45749, "r": 168.04523, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "#", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 144.592, "t": 352.40848, "r": 183.82895, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "enc-layers", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 207.974, "t": 339.45749, "r": 215.64923000000002, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "#", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 192.19501, "t": 352.40848, "r": 231.42303, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "dec-layers", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 239.79799999999997, "t": 344.93649, "r": 278.3338, "b": 356.22519000000005, "coord_origin": "1"}, "row_span": 2, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 2, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "Language", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 324.67001, "t": 339.45749, "r": 348.26419, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 3, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 3, "end_col_offset_idx": 6, "text": "TEDs", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 396.271, "t": 339.45749, "r": 
417.12595, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "mAP", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 394.927, "t": 350.41647, "r": 418.46921, "b": 361.70517, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "(0.75)", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 430.771, "t": 339.45749, "r": 467.14142000000004, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "Inference", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 427.14801, "t": 350.41647, "r": 470.76955999999996, "b": 361.70517, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "time (secs)", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 286.686, "t": 352.40848, "r": 312.32812, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "simple", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 320.702, "t": 352.40848, "r": 353.71539, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "complex", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 369.306, "t": 352.40848, "r": 379.02914, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, 
"start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "all", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 371.23849, "r": 166.51474, "b": 382.52719, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "6", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 371.23849, "r": 214.11774, "b": 382.52719, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "6", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 365.75848, "r": 272.94495, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 365.75848, "r": 310.00732, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.965 0.969", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 365.75848, "r": 347.70734, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.934 0.927", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 363.67599, "t": 365.75848, "r": 384.66632, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, 
"start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.955 0.955", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 396.20599, "t": 365.69571, "r": 417.19632, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.88 0.857", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 365.69571, "r": 458.38336, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "2.73 5.39", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 397.53949, "r": 166.51474, "b": 408.82819, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 397.53949, "r": 214.11774, "b": 408.82819, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 392.06049, "r": 272.94495, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 392.06049, "r": 310.00732, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.938 0.952", 
"column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 392.06049, "r": 347.70734, "b": 403.34918, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.904", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 392.06049, "r": 386.24799, "b": 416.35397, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.927 0.938", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 391.99771, "r": 418.77798, "b": 403.40298, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.853", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 391.99771, "r": 458.38336, "b": 403.40298, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "1.97", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 405.01147, "r": 347.70734, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.909 0.897 0.901", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 396.20599, "t": 405.01147, "r": 417.19632, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.843", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 440.767, 
"t": 405.01147, "r": 457.15039, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "3.77", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 423.84048, "r": 166.51474, "b": 435.12918, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "2", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 423.84048, "r": 214.11774, "b": 435.12918, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 246.71000999999998, "t": 418.3614799999999, "r": 271.41064, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 418.3614799999999, "r": 310.00732, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.923", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 363.67599, "t": 418.3614799999999, "r": 384.66632, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.915", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 418.29871, "r": 418.77798, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, 
"col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.859 0.834", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 418.29871, "r": 458.38336, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "1.91 3.81", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 431.31246999999996, "r": 272.94495, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 431.31246999999996, "r": 310.00732, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.945", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 431.24969, "r": 386.24799, "b": 442.65497, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.931", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 450.14248999999995, "r": 166.51474, "b": 461.43118, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 450.14248999999995, "r": 214.11774, "b": 461.43118, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 
7, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "2", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 444.66248, "r": 272.94495, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 444.66248, "r": 310.00732, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.952 0.944", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 444.66248, "r": 347.70734, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.92 0.903", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 444.5996999999999, "r": 386.24799, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.942 0.931", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 444.5996999999999, "r": 418.77798, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.857 0.824", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 444.5996999999999, "r": 458.38336, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 7, "end_col_offset_idx": 8, 
"text": "1.22 2", "column_header": false, "row_header": false, "row_section": false}]}}}, "figures_classification": null, "equations_prediction": null}, "assembled": {"elements": [{"label": "Page-header", "id": 0, "page_no": 0, "cluster": {"id": 0, "label": "Page-header", "bbox": {"l": 193.83700561523438, "t": 91.49352999999996, "r": 447.54476999999997, "b": 102.78223000000003, "coord_origin": "1"}, "confidence": 0.9235936999320984, "cells": [{"id": 0, "text": "Optimized Table Tokenization for Table Structure Recognition", "bbox": {"l": 194.478, "t": 91.49352999999996, "r": 447.54476999999997, "b": 102.78223000000003, "coord_origin": "1"}}]}, "text": "Optimized Table Tokenization for Table Structure Recognition"}, {"label": "Page-header", "id": 1, "page_no": 0, "cluster": {"id": 1, "label": "Page-header", "bbox": {"l": 475.3370056152344, "t": 91.49352999999996, "r": 480.59314, "b": 102.78223000000003, "coord_origin": "1"}, "confidence": 0.7262580394744873, "cells": [{"id": 1, "text": "9", "bbox": {"l": 475.98441, "t": 91.49352999999996, "r": 480.59314, "b": 102.78223000000003, "coord_origin": "1"}}]}, "text": "9"}, {"label": "Text", "id": 2, "page_no": 0, "cluster": {"id": 2, "label": "Text", "bbox": {"l": 134.23260498046875, "t": 116.46301000000005, "r": 480.63818359375, "b": 152.90697999999998, "coord_origin": "1"}, "confidence": 0.9810057878494263, "cells": [{"id": 2, "text": "order to compute the TED score. Inference timing results for all experiments", "bbox": {"l": 134.765, "t": 116.46301000000005, "r": 480.59067, "b": 128.99597000000006, "coord_origin": "1"}}, {"id": 3, "text": "were obtained from the same machine on a single core with AMD EPYC 7763", "bbox": {"l": 134.765, "t": 128.41803000000004, "r": 480.59665, "b": 140.95099000000005, "coord_origin": "1"}}, {"id": 4, "text": "CPU @2.45 GHz.", "bbox": {"l": 134.765, "t": 140.37401999999997, "r": 210.78761, "b": 152.90697999999998, "coord_origin": "1"}}]}, "text": "order to compute the TED score. 
Inference timing results for all experiments were obtained from the same machine on a single core with AMD EPYC 7763 CPU @2.45 GHz."}, {"label": "Section-header", "id": 3, "page_no": 0, "cluster": {"id": 3, "label": "Section-header", "bbox": {"l": 134.15780639648438, "t": 166.70514000000003, "r": 318.45145, "b": 179.20818999999995, "coord_origin": "1"}, "confidence": 0.9181773662567139, "cells": [{"id": 5, "text": "5.1", "bbox": {"l": 134.765, "t": 166.70514000000003, "r": 149.40306, "b": 179.20818999999995, "coord_origin": "1"}}, {"id": 6, "text": "Hyper Parameter Optimization", "bbox": {"l": 160.85905, "t": 166.70514000000003, "r": 318.45145, "b": 179.20818999999995, "coord_origin": "1"}}]}, "text": "5.1 Hyper Parameter Optimization"}, {"label": "Text", "id": 4, "page_no": 0, "cluster": {"id": 4, "label": "Text", "bbox": {"l": 134.27206420898438, "t": 183.11505, "r": 480.8331604003906, "b": 255.42400999999995, "coord_origin": "1"}, "confidence": 0.9886466860771179, "cells": [{"id": 7, "text": "We have chosen the PubTabNet data set to perform HPO, since it includes a", "bbox": {"l": 134.765, "t": 183.11505, "r": 479.74982000000006, "b": 195.64801, "coord_origin": "1"}}, {"id": 8, "text": "highly diverse set of tables. Also we report TED scores separately for simple and", "bbox": {"l": 134.765, "t": 195.07007, "r": 480.58765, "b": 207.60303, "coord_origin": "1"}}, {"id": 9, "text": "complex tables (tables with cell spans). Results are presented in Table. 1. It is", "bbox": {"l": 134.765, "t": 207.02502000000004, "r": 480.58859000000007, "b": 219.55798000000004, "coord_origin": "1"}}, {"id": 10, "text": "evident that with OTSL, our model achieves the same TED score and slightly", "bbox": {"l": 134.765, "t": 218.98004000000003, "r": 480.59567, "b": 231.51300000000003, "coord_origin": "1"}}, {"id": 11, "text": "better mAP scores in comparison to HTML. 
However OTSL yields a", "bbox": {"l": 134.765, "t": 230.93506000000002, "r": 440.9425, "b": 243.46802000000002, "coord_origin": "1"}}, {"id": 12, "text": "2x speed", "bbox": {"l": 444.86800999999997, "t": 230.98486000000003, "r": 480.58792, "b": 243.46802000000002, "coord_origin": "1"}}, {"id": 13, "text": "up", "bbox": {"l": 134.765, "t": 242.94086000000004, "r": 145.19585, "b": 255.42400999999995, "coord_origin": "1"}}, {"id": 14, "text": "in the inference runtime over HTML.", "bbox": {"l": 149.149, "t": 242.89104999999995, "r": 311.22256, "b": 255.42400999999995, "coord_origin": "1"}}]}, "text": "We have chosen the PubTabNet data set to perform HPO, since it includes a highly diverse set of tables. Also we report TED scores separately for simple and complex tables (tables with cell spans). Results are presented in Table. 1. It is evident that with OTSL, our model achieves the same TED score and slightly better mAP scores in comparison to HTML. However OTSL yields a 2x speed up in the inference runtime over HTML."}, {"label": "Caption", "id": 5, "page_no": 0, "cluster": {"id": 5, "label": "Caption", "bbox": {"l": 134.35366821289062, "t": 272.79474000000005, "r": 480.59890999999993, "b": 327.98218, "coord_origin": "1"}, "confidence": 0.9766141772270203, "cells": [{"id": 15, "text": "Table", "bbox": {"l": 134.765, "t": 272.79474000000005, "r": 159.22983, "b": 284.1999799999999, "coord_origin": "1"}}, {"id": 16, "text": "1.", "bbox": {"l": 167.34442, "t": 272.79474000000005, "r": 174.71301, "b": 284.1999799999999, "coord_origin": "1"}}, {"id": 17, "text": "HPO performed in OTSL and HTML representation on the same", "bbox": {"l": 188.133, "t": 272.85748, "r": 480.58101999999997, "b": 284.14618, "coord_origin": "1"}}, {"id": 18, "text": "transformer-based TableFormer [9] architecture, trained only on PubTabNet [22]. 
Ef-", "bbox": {"l": 134.765, "t": 283.81647, "r": 480.59890999999993, "b": 295.10516000000007, "coord_origin": "1"}}, {"id": 19, "text": "fects of reducing the # of layers in encoder and decoder stages of the model show that", "bbox": {"l": 134.765, "t": 294.77547999999996, "r": 480.59887999999995, "b": 306.06418, "coord_origin": "1"}}, {"id": 20, "text": "smaller models trained on OTSL perform better, especially in recognizing complex", "bbox": {"l": 134.765, "t": 305.73447, "r": 480.59180000000003, "b": 317.02316, "coord_origin": "1"}}, {"id": 21, "text": "table structures, and maintain a much higher mAP score than the HTML counterpart.", "bbox": {"l": 134.765, "t": 316.69348, "r": 480.58471999999995, "b": 327.98218, "coord_origin": "1"}}]}, "text": "Table 1. HPO performed in OTSL and HTML representation on the same transformer-based TableFormer [9] architecture, trained only on PubTabNet [22]. Effects of reducing the # of layers in encoder and decoder stages of the model show that smaller models trained on OTSL perform better, especially in recognizing complex table structures, and maintain a much higher mAP score than the HTML counterpart."}, {"label": "Table", "id": 6, "page_no": 0, "cluster": {"id": 6, "label": "Table", "bbox": {"l": 139.21253967285156, "t": 336.4130859375, "r": 475.24322509765625, "b": 469.6602783203125, "coord_origin": "1"}, "confidence": 0.9847328066825867, "cells": [{"id": 22, "text": "#", "bbox": {"l": 160.37, "t": 339.45749, "r": 168.04523, "b": 350.74619, "coord_origin": "1"}}, {"id": 23, "text": "enc-layers", "bbox": {"l": 144.592, "t": 352.40848, "r": 183.82895, "b": 363.69717, "coord_origin": "1"}}, {"id": 24, "text": "#", "bbox": {"l": 207.974, "t": 339.45749, "r": 215.64923000000002, "b": 350.74619, "coord_origin": "1"}}, {"id": 25, "text": "dec-layers", "bbox": {"l": 192.19501, "t": 352.40848, "r": 231.42303, "b": 363.69717, "coord_origin": "1"}}, {"id": 26, "text": "Language", "bbox": {"l": 239.79799999999997, "t": 344.93649, 
"r": 278.3338, "b": 356.22519000000005, "coord_origin": "1"}}, {"id": 27, "text": "TEDs", "bbox": {"l": 324.67001, "t": 339.45749, "r": 348.26419, "b": 350.74619, "coord_origin": "1"}}, {"id": 28, "text": "mAP", "bbox": {"l": 396.271, "t": 339.45749, "r": 417.12595, "b": 350.74619, "coord_origin": "1"}}, {"id": 29, "text": "(0.75)", "bbox": {"l": 394.927, "t": 350.41647, "r": 418.46921, "b": 361.70517, "coord_origin": "1"}}, {"id": 30, "text": "Inference", "bbox": {"l": 430.771, "t": 339.45749, "r": 467.14142000000004, "b": 350.74619, "coord_origin": "1"}}, {"id": 31, "text": "time (secs)", "bbox": {"l": 427.14801, "t": 350.41647, "r": 470.76955999999996, "b": 361.70517, "coord_origin": "1"}}, {"id": 32, "text": "simple", "bbox": {"l": 286.686, "t": 352.40848, "r": 312.32812, "b": 363.69717, "coord_origin": "1"}}, {"id": 33, "text": "complex", "bbox": {"l": 320.702, "t": 352.40848, "r": 353.71539, "b": 363.69717, "coord_origin": "1"}}, {"id": 34, "text": "all", "bbox": {"l": 369.306, "t": 352.40848, "r": 379.02914, "b": 363.69717, "coord_origin": "1"}}, {"id": 35, "text": "6", "bbox": {"l": 161.90601, "t": 371.23849, "r": 166.51474, "b": 382.52719, "coord_origin": "1"}}, {"id": 36, "text": "6", "bbox": {"l": 209.509, "t": 371.23849, "r": 214.11774, "b": 382.52719, "coord_origin": "1"}}, {"id": 37, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 365.75848, "r": 271.41064, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 38, "text": "0.965", "bbox": {"l": 289.017, "t": 365.75848, "r": 310.00732, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 39, "text": "0.934", "bbox": {"l": 326.71701, "t": 365.75848, "r": 347.70734, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 40, "text": "0.955", "bbox": {"l": 363.67599, "t": 365.75848, "r": 384.66632, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 41, "text": "0.88", "bbox": {"l": 397.26999, "t": 365.69571, "r": 416.12634, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 42, "text": 
"2.73", "bbox": {"l": 439.52701, "t": 365.69571, "r": 458.38336, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 43, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 378.71048, "r": 272.94495, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 44, "text": "0.969", "bbox": {"l": 289.017, "t": 378.71048, "r": 310.00732, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 45, "text": "0.927", "bbox": {"l": 326.71701, "t": 378.71048, "r": 347.70734, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 46, "text": "0.955", "bbox": {"l": 363.67599, "t": 378.71048, "r": 384.66632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 47, "text": "0.857", "bbox": {"l": 396.20599, "t": 378.71048, "r": 417.19632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 48, "text": "5.39", "bbox": {"l": 440.767, "t": 378.71048, "r": 457.15039, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 49, "text": "4", "bbox": {"l": 161.90601, "t": 397.53949, "r": 166.51474, "b": 408.82819, "coord_origin": "1"}}, {"id": 50, "text": "4", "bbox": {"l": 209.509, "t": 397.53949, "r": 214.11774, "b": 408.82819, "coord_origin": "1"}}, {"id": 51, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 392.06049, "r": 271.41064, "b": 403.34918, "coord_origin": "1"}}, {"id": 52, "text": "0.938", "bbox": {"l": 289.017, "t": 392.06049, "r": 310.00732, "b": 403.34918, "coord_origin": "1"}}, {"id": 53, "text": "0.904", "bbox": {"l": 326.71701, "t": 392.06049, "r": 347.70734, "b": 403.34918, "coord_origin": "1"}}, {"id": 54, "text": "0.927", "bbox": {"l": 363.67599, "t": 392.06049, "r": 384.66632, "b": 403.34918, "coord_origin": "1"}}, {"id": 55, "text": "0.853", "bbox": {"l": 394.61801, "t": 391.99771, "r": 418.77798, "b": 403.40298, "coord_origin": "1"}}, {"id": 56, "text": "1.97", "bbox": {"l": 439.52701, "t": 391.99771, "r": 458.38336, "b": 403.40298, "coord_origin": "1"}}, {"id": 57, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 405.01147, "r": 272.94495, "b": 
416.30017, "coord_origin": "1"}}, {"id": 58, "text": "0.952", "bbox": {"l": 289.017, "t": 405.01147, "r": 310.00732, "b": 416.30017, "coord_origin": "1"}}, {"id": 59, "text": "0.909", "bbox": {"l": 326.71701, "t": 405.01147, "r": 347.70734, "b": 416.30017, "coord_origin": "1"}}, {"id": 60, "text": "0.938", "bbox": {"l": 362.08801, "t": 404.9486999999999, "r": 386.24799, "b": 416.35397, "coord_origin": "1"}}, {"id": 61, "text": "0.843", "bbox": {"l": 396.20599, "t": 405.01147, "r": 417.19632, "b": 416.30017, "coord_origin": "1"}}, {"id": 62, "text": "3.77", "bbox": {"l": 440.767, "t": 405.01147, "r": 457.15039, "b": 416.30017, "coord_origin": "1"}}, {"id": 63, "text": "2", "bbox": {"l": 161.90601, "t": 423.84048, "r": 166.51474, "b": 435.12918, "coord_origin": "1"}}, {"id": 64, "text": "4", "bbox": {"l": 209.509, "t": 423.84048, "r": 214.11774, "b": 435.12918, "coord_origin": "1"}}, {"id": 65, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 418.3614799999999, "r": 271.41064, "b": 429.65018, "coord_origin": "1"}}, {"id": 66, "text": "0.923", "bbox": {"l": 289.017, "t": 418.3614799999999, "r": 310.00732, "b": 429.65018, "coord_origin": "1"}}, {"id": 67, "text": "0.897", "bbox": {"l": 326.71701, "t": 418.3614799999999, "r": 347.70734, "b": 429.65018, "coord_origin": "1"}}, {"id": 68, "text": "0.915", "bbox": {"l": 363.67599, "t": 418.3614799999999, "r": 384.66632, "b": 429.65018, "coord_origin": "1"}}, {"id": 69, "text": "0.859", "bbox": {"l": 394.61801, "t": 418.29871, "r": 418.77798, "b": 429.70398, "coord_origin": "1"}}, {"id": 70, "text": "1.91", "bbox": {"l": 439.52701, "t": 418.29871, "r": 458.38336, "b": 429.70398, "coord_origin": "1"}}, {"id": 71, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 431.31246999999996, "r": 272.94495, "b": 442.60117, "coord_origin": "1"}}, {"id": 72, "text": "0.945", "bbox": {"l": 289.017, "t": 431.31246999999996, "r": 310.00732, "b": 442.60117, "coord_origin": "1"}}, {"id": 73, "text": "0.901", "bbox": {"l": 
326.71701, "t": 431.31246999999996, "r": 347.70734, "b": 442.60117, "coord_origin": "1"}}, {"id": 74, "text": "0.931", "bbox": {"l": 362.08801, "t": 431.24969, "r": 386.24799, "b": 442.65497, "coord_origin": "1"}}, {"id": 75, "text": "0.834", "bbox": {"l": 396.20599, "t": 431.31246999999996, "r": 417.19632, "b": 442.60117, "coord_origin": "1"}}, {"id": 76, "text": "3.81", "bbox": {"l": 440.767, "t": 431.31246999999996, "r": 457.15039, "b": 442.60117, "coord_origin": "1"}}, {"id": 77, "text": "4", "bbox": {"l": 161.90601, "t": 450.14248999999995, "r": 166.51474, "b": 461.43118, "coord_origin": "1"}}, {"id": 78, "text": "2", "bbox": {"l": 209.509, "t": 450.14248999999995, "r": 214.11774, "b": 461.43118, "coord_origin": "1"}}, {"id": 79, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 444.66248, "r": 271.41064, "b": 455.95117, "coord_origin": "1"}}, {"id": 80, "text": "0.952", "bbox": {"l": 289.017, "t": 444.66248, "r": 310.00732, "b": 455.95117, "coord_origin": "1"}}, {"id": 81, "text": "0.92", "bbox": {"l": 329.021, "t": 444.66248, "r": 345.40439, "b": 455.95117, "coord_origin": "1"}}, {"id": 82, "text": "0.942", "bbox": {"l": 362.08801, "t": 444.5996999999999, "r": 386.24799, "b": 456.00497, "coord_origin": "1"}}, {"id": 83, "text": "0.857", "bbox": {"l": 394.61801, "t": 444.5996999999999, "r": 418.77798, "b": 456.00497, "coord_origin": "1"}}, {"id": 84, "text": "1.22", "bbox": {"l": 439.52701, "t": 444.5996999999999, "r": 458.38336, "b": 456.00497, "coord_origin": "1"}}, {"id": 85, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 457.61447, "r": 272.94495, "b": 468.90317, "coord_origin": "1"}}, {"id": 86, "text": "0.944", "bbox": {"l": 289.017, "t": 457.61447, "r": 310.00732, "b": 468.90317, "coord_origin": "1"}}, {"id": 87, "text": "0.903", "bbox": {"l": 326.71701, "t": 457.61447, "r": 347.70734, "b": 468.90317, "coord_origin": "1"}}, {"id": 88, "text": "0.931", "bbox": {"l": 363.67599, "t": 457.61447, "r": 384.66632, "b": 468.90317, "coord_origin": 
"1"}}, {"id": 89, "text": "0.824", "bbox": {"l": 396.20599, "t": 457.61447, "r": 417.19632, "b": 468.90317, "coord_origin": "1"}}, {"id": 90, "text": "2", "bbox": {"l": 446.65302, "t": 457.61447, "r": 451.26175, "b": 468.90317, "coord_origin": "1"}}]}, "text": null, "otsl_seq": ["ched", "ched", "ched", "ched", "lcel", "lcel", "ched", "ched", "nl", "ched", "ched", "ucel", "ched", "ched", "ched", "ched", "ched", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl"], "num_rows": 7, "num_cols": 8, "table_cells": [{"bbox": {"l": 160.37, "t": 339.45749, "r": 168.04523, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "#", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 144.592, "t": 352.40848, "r": 183.82895, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "enc-layers", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 207.974, "t": 339.45749, "r": 215.64923000000002, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "#", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 192.19501, "t": 352.40848, "r": 231.42303, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": 
"dec-layers", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 239.79799999999997, "t": 344.93649, "r": 278.3338, "b": 356.22519000000005, "coord_origin": "1"}, "row_span": 2, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 2, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "Language", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 324.67001, "t": 339.45749, "r": 348.26419, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 3, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 3, "end_col_offset_idx": 6, "text": "TEDs", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 396.271, "t": 339.45749, "r": 417.12595, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "mAP", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 394.927, "t": 350.41647, "r": 418.46921, "b": 361.70517, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "(0.75)", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 430.771, "t": 339.45749, "r": 467.14142000000004, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "Inference", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 427.14801, "t": 350.41647, "r": 470.76955999999996, "b": 361.70517, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "time (secs)", "column_header": true, "row_header": false, "row_section": 
false}, {"bbox": {"l": 286.686, "t": 352.40848, "r": 312.32812, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "simple", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 320.702, "t": 352.40848, "r": 353.71539, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "complex", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 369.306, "t": 352.40848, "r": 379.02914, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "all", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 371.23849, "r": 166.51474, "b": 382.52719, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "6", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 371.23849, "r": 214.11774, "b": 382.52719, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "6", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 365.75848, "r": 272.94495, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 365.75848, "r": 310.00732, "b": 389.99917999999997, "coord_origin": 
"1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.965 0.969", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 365.75848, "r": 347.70734, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.934 0.927", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 363.67599, "t": 365.75848, "r": 384.66632, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.955 0.955", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 396.20599, "t": 365.69571, "r": 417.19632, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.88 0.857", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 365.69571, "r": 458.38336, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "2.73 5.39", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 397.53949, "r": 166.51474, "b": 408.82819, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 397.53949, "r": 214.11774, "b": 408.82819, "coord_origin": "1"}, "row_span": 1, "col_span": 1, 
"start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 392.06049, "r": 272.94495, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 392.06049, "r": 310.00732, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.938 0.952", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 392.06049, "r": 347.70734, "b": 403.34918, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.904", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 392.06049, "r": 386.24799, "b": 416.35397, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.927 0.938", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 391.99771, "r": 418.77798, "b": 403.40298, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.853", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 391.99771, "r": 458.38336, "b": 403.40298, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 7, 
"end_col_offset_idx": 8, "text": "1.97", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 405.01147, "r": 347.70734, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.909 0.897 0.901", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 396.20599, "t": 405.01147, "r": 417.19632, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.843", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 440.767, "t": 405.01147, "r": 457.15039, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "3.77", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 423.84048, "r": 166.51474, "b": 435.12918, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "2", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 423.84048, "r": 214.11774, "b": 435.12918, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 246.71000999999998, "t": 418.3614799999999, "r": 271.41064, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL", "column_header": false, "row_header": false, 
"row_section": false}, {"bbox": {"l": 289.017, "t": 418.3614799999999, "r": 310.00732, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.923", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 363.67599, "t": 418.3614799999999, "r": 384.66632, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.915", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 418.29871, "r": 418.77798, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.859 0.834", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 418.29871, "r": 458.38336, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "1.91 3.81", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 431.31246999999996, "r": 272.94495, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 431.31246999999996, "r": 310.00732, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.945", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 
431.24969, "r": 386.24799, "b": 442.65497, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.931", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 450.14248999999995, "r": 166.51474, "b": 461.43118, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 450.14248999999995, "r": 214.11774, "b": 461.43118, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "2", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 444.66248, "r": 272.94495, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 444.66248, "r": 310.00732, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.952 0.944", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 444.66248, "r": 347.70734, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.92 0.903", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 444.5996999999999, "r": 386.24799, "b": 468.90317, "coord_origin": "1"}, 
"row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.942 0.931", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 444.5996999999999, "r": 418.77798, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.857 0.824", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 444.5996999999999, "r": 458.38336, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "1.22 2", "column_header": false, "row_header": false, "row_section": false}]}, {"label": "Section-header", "id": 7, "page_no": 0, "cluster": {"id": 7, "label": "Section-header", "bbox": {"l": 134.34051513671875, "t": 505.67111, "r": 264.40829, "b": 518.17419, "coord_origin": "1"}, "confidence": 0.9198964834213257, "cells": [{"id": 91, "text": "5.2", "bbox": {"l": 134.765, "t": 505.67111, "r": 149.40306, "b": 518.17419, "coord_origin": "1"}}, {"id": 92, "text": "Quantitative Results", "bbox": {"l": 160.85905, "t": 505.67111, "r": 264.40829, "b": 518.17419, "coord_origin": "1"}}]}, "text": "5.2 Quantitative Results"}, {"label": "Text", "id": 8, "page_no": 0, "cluster": {"id": 8, "label": "Text", "bbox": {"l": 134.35081481933594, "t": 522.08005, "r": 480.72003, "b": 618.3, "coord_origin": "1"}, "confidence": 0.98807293176651, "cells": [{"id": 93, "text": "We picked the model parameter configuration that produced the best prediction", "bbox": {"l": 134.765, "t": 522.08005, "r": 479.72983, "b": 534.61301, "coord_origin": "1"}}, {"id": 94, "text": "quality (enc=6, dec=6, heads=8) with PubTabNet alone, then independently", "bbox": {"l": 134.765, "t": 534.03604, "r": 480.5897499999999, 
"b": 546.569, "coord_origin": "1"}}, {"id": 95, "text": "trained and evaluated it on three publicly available data sets: PubTabNet (395k", "bbox": {"l": 134.765, "t": 545.99104, "r": 480.72003, "b": 558.524, "coord_origin": "1"}}, {"id": 96, "text": "samples), FinTabNet (113k samples) and PubTables-1M (about 1M samples).", "bbox": {"l": 134.765, "t": 557.94604, "r": 480.60577, "b": 570.479, "coord_origin": "1"}}, {"id": 97, "text": "Performance results are presented in Table. 2. It is clearly evident that the model", "bbox": {"l": 134.765, "t": 569.90103, "r": 480.5936899999999, "b": 582.43399, "coord_origin": "1"}}, {"id": 98, "text": "trained on OTSL outperforms HTML across the board, keeping high TEDs and", "bbox": {"l": 134.765, "t": 581.85603, "r": 480.59158, "b": 594.38899, "coord_origin": "1"}}, {"id": 99, "text": "mAP scores even on difficult financial tables (FinTabNet) that contain sparse", "bbox": {"l": 134.765, "t": 593.81204, "r": 480.58080999999993, "b": 606.345, "coord_origin": "1"}}, {"id": 100, "text": "and large tables.", "bbox": {"l": 134.765, "t": 605.76704, "r": 206.79959, "b": 618.3, "coord_origin": "1"}}]}, "text": "We picked the model parameter configuration that produced the best prediction quality (enc=6, dec=6, heads=8) with PubTabNet alone, then independently trained and evaluated it on three publicly available data sets: PubTabNet (395k samples), FinTabNet (113k samples) and PubTables-1M (about 1M samples). Performance results are presented in Table. 2. 
It is clearly evident that the model trained on OTSL outperforms HTML across the board, keeping high TEDs and mAP scores even on difficult financial tables (FinTabNet) that contain sparse and large tables."}, {"label": "Text", "id": 9, "page_no": 0, "cluster": {"id": 9, "label": "Text", "bbox": {"l": 134.27769470214844, "t": 617.72205, "r": 480.59857000000005, "b": 666.12, "coord_origin": "1"}, "confidence": 0.9812840819358826, "cells": [{"id": 101, "text": "Additionally, the results show that OTSL has an advantage over HTML", "bbox": {"l": 149.709, "t": 617.72205, "r": 480.59479, "b": 630.255, "coord_origin": "1"}}, {"id": 102, "text": "when applied on a bigger data set like PubTables-1M and achieves significantly", "bbox": {"l": 134.765, "t": 629.6770300000001, "r": 480.59857000000005, "b": 642.2099900000001, "coord_origin": "1"}}, {"id": 103, "text": "improved scores. Finally, OTSL achieves faster inference due to fewer decoding", "bbox": {"l": 134.765, "t": 641.63203, "r": 480.59384000000006, "b": 654.16499, "coord_origin": "1"}}, {"id": 104, "text": "steps which is a result of the reduced sequence representation.", "bbox": {"l": 134.765, "t": 653.58704, "r": 405.7995, "b": 666.12, "coord_origin": "1"}}]}, "text": "Additionally, the results show that OTSL has an advantage over HTML when applied on a bigger data set like PubTables-1M and achieves significantly improved scores. Finally, OTSL achieves faster inference due to fewer decoding steps which is a result of the reduced sequence representation."}], "body": [{"label": "Text", "id": 2, "page_no": 0, "cluster": {"id": 2, "label": "Text", "bbox": {"l": 134.23260498046875, "t": 116.46301000000005, "r": 480.63818359375, "b": 152.90697999999998, "coord_origin": "1"}, "confidence": 0.9810057878494263, "cells": [{"id": 2, "text": "order to compute the TED score. 
Inference timing results for all experiments", "bbox": {"l": 134.765, "t": 116.46301000000005, "r": 480.59067, "b": 128.99597000000006, "coord_origin": "1"}}, {"id": 3, "text": "were obtained from the same machine on a single core with AMD EPYC 7763", "bbox": {"l": 134.765, "t": 128.41803000000004, "r": 480.59665, "b": 140.95099000000005, "coord_origin": "1"}}, {"id": 4, "text": "CPU @2.45 GHz.", "bbox": {"l": 134.765, "t": 140.37401999999997, "r": 210.78761, "b": 152.90697999999998, "coord_origin": "1"}}]}, "text": "order to compute the TED score. Inference timing results for all experiments were obtained from the same machine on a single core with AMD EPYC 7763 CPU @2.45 GHz."}, {"label": "Section-header", "id": 3, "page_no": 0, "cluster": {"id": 3, "label": "Section-header", "bbox": {"l": 134.15780639648438, "t": 166.70514000000003, "r": 318.45145, "b": 179.20818999999995, "coord_origin": "1"}, "confidence": 0.9181773662567139, "cells": [{"id": 5, "text": "5.1", "bbox": {"l": 134.765, "t": 166.70514000000003, "r": 149.40306, "b": 179.20818999999995, "coord_origin": "1"}}, {"id": 6, "text": "Hyper Parameter Optimization", "bbox": {"l": 160.85905, "t": 166.70514000000003, "r": 318.45145, "b": 179.20818999999995, "coord_origin": "1"}}]}, "text": "5.1 Hyper Parameter Optimization"}, {"label": "Text", "id": 4, "page_no": 0, "cluster": {"id": 4, "label": "Text", "bbox": {"l": 134.27206420898438, "t": 183.11505, "r": 480.8331604003906, "b": 255.42400999999995, "coord_origin": "1"}, "confidence": 0.9886466860771179, "cells": [{"id": 7, "text": "We have chosen the PubTabNet data set to perform HPO, since it includes a", "bbox": {"l": 134.765, "t": 183.11505, "r": 479.74982000000006, "b": 195.64801, "coord_origin": "1"}}, {"id": 8, "text": "highly diverse set of tables. 
Also we report TED scores separately for simple and", "bbox": {"l": 134.765, "t": 195.07007, "r": 480.58765, "b": 207.60303, "coord_origin": "1"}}, {"id": 9, "text": "complex tables (tables with cell spans). Results are presented in Table. 1. It is", "bbox": {"l": 134.765, "t": 207.02502000000004, "r": 480.58859000000007, "b": 219.55798000000004, "coord_origin": "1"}}, {"id": 10, "text": "evident that with OTSL, our model achieves the same TED score and slightly", "bbox": {"l": 134.765, "t": 218.98004000000003, "r": 480.59567, "b": 231.51300000000003, "coord_origin": "1"}}, {"id": 11, "text": "better mAP scores in comparison to HTML. However OTSL yields a", "bbox": {"l": 134.765, "t": 230.93506000000002, "r": 440.9425, "b": 243.46802000000002, "coord_origin": "1"}}, {"id": 12, "text": "2x speed", "bbox": {"l": 444.86800999999997, "t": 230.98486000000003, "r": 480.58792, "b": 243.46802000000002, "coord_origin": "1"}}, {"id": 13, "text": "up", "bbox": {"l": 134.765, "t": 242.94086000000004, "r": 145.19585, "b": 255.42400999999995, "coord_origin": "1"}}, {"id": 14, "text": "in the inference runtime over HTML.", "bbox": {"l": 149.149, "t": 242.89104999999995, "r": 311.22256, "b": 255.42400999999995, "coord_origin": "1"}}]}, "text": "We have chosen the PubTabNet data set to perform HPO, since it includes a highly diverse set of tables. Also we report TED scores separately for simple and complex tables (tables with cell spans). Results are presented in Table. 1. It is evident that with OTSL, our model achieves the same TED score and slightly better mAP scores in comparison to HTML. 
However OTSL yields a 2x speed up in the inference runtime over HTML."}, {"label": "Caption", "id": 5, "page_no": 0, "cluster": {"id": 5, "label": "Caption", "bbox": {"l": 134.35366821289062, "t": 272.79474000000005, "r": 480.59890999999993, "b": 327.98218, "coord_origin": "1"}, "confidence": 0.9766141772270203, "cells": [{"id": 15, "text": "Table", "bbox": {"l": 134.765, "t": 272.79474000000005, "r": 159.22983, "b": 284.1999799999999, "coord_origin": "1"}}, {"id": 16, "text": "1.", "bbox": {"l": 167.34442, "t": 272.79474000000005, "r": 174.71301, "b": 284.1999799999999, "coord_origin": "1"}}, {"id": 17, "text": "HPO performed in OTSL and HTML representation on the same", "bbox": {"l": 188.133, "t": 272.85748, "r": 480.58101999999997, "b": 284.14618, "coord_origin": "1"}}, {"id": 18, "text": "transformer-based TableFormer [9] architecture, trained only on PubTabNet [22]. Ef-", "bbox": {"l": 134.765, "t": 283.81647, "r": 480.59890999999993, "b": 295.10516000000007, "coord_origin": "1"}}, {"id": 19, "text": "fects of reducing the # of layers in encoder and decoder stages of the model show that", "bbox": {"l": 134.765, "t": 294.77547999999996, "r": 480.59887999999995, "b": 306.06418, "coord_origin": "1"}}, {"id": 20, "text": "smaller models trained on OTSL perform better, especially in recognizing complex", "bbox": {"l": 134.765, "t": 305.73447, "r": 480.59180000000003, "b": 317.02316, "coord_origin": "1"}}, {"id": 21, "text": "table structures, and maintain a much higher mAP score than the HTML counterpart.", "bbox": {"l": 134.765, "t": 316.69348, "r": 480.58471999999995, "b": 327.98218, "coord_origin": "1"}}]}, "text": "Table 1. HPO performed in OTSL and HTML representation on the same transformer-based TableFormer [9] architecture, trained only on PubTabNet [22]. 
Effects of reducing the # of layers in encoder and decoder stages of the model show that smaller models trained on OTSL perform better, especially in recognizing complex table structures, and maintain a much higher mAP score than the HTML counterpart."}, {"label": "Table", "id": 6, "page_no": 0, "cluster": {"id": 6, "label": "Table", "bbox": {"l": 139.21253967285156, "t": 336.4130859375, "r": 475.24322509765625, "b": 469.6602783203125, "coord_origin": "1"}, "confidence": 0.9847328066825867, "cells": [{"id": 22, "text": "#", "bbox": {"l": 160.37, "t": 339.45749, "r": 168.04523, "b": 350.74619, "coord_origin": "1"}}, {"id": 23, "text": "enc-layers", "bbox": {"l": 144.592, "t": 352.40848, "r": 183.82895, "b": 363.69717, "coord_origin": "1"}}, {"id": 24, "text": "#", "bbox": {"l": 207.974, "t": 339.45749, "r": 215.64923000000002, "b": 350.74619, "coord_origin": "1"}}, {"id": 25, "text": "dec-layers", "bbox": {"l": 192.19501, "t": 352.40848, "r": 231.42303, "b": 363.69717, "coord_origin": "1"}}, {"id": 26, "text": "Language", "bbox": {"l": 239.79799999999997, "t": 344.93649, "r": 278.3338, "b": 356.22519000000005, "coord_origin": "1"}}, {"id": 27, "text": "TEDs", "bbox": {"l": 324.67001, "t": 339.45749, "r": 348.26419, "b": 350.74619, "coord_origin": "1"}}, {"id": 28, "text": "mAP", "bbox": {"l": 396.271, "t": 339.45749, "r": 417.12595, "b": 350.74619, "coord_origin": "1"}}, {"id": 29, "text": "(0.75)", "bbox": {"l": 394.927, "t": 350.41647, "r": 418.46921, "b": 361.70517, "coord_origin": "1"}}, {"id": 30, "text": "Inference", "bbox": {"l": 430.771, "t": 339.45749, "r": 467.14142000000004, "b": 350.74619, "coord_origin": "1"}}, {"id": 31, "text": "time (secs)", "bbox": {"l": 427.14801, "t": 350.41647, "r": 470.76955999999996, "b": 361.70517, "coord_origin": "1"}}, {"id": 32, "text": "simple", "bbox": {"l": 286.686, "t": 352.40848, "r": 312.32812, "b": 363.69717, "coord_origin": "1"}}, {"id": 33, "text": "complex", "bbox": {"l": 320.702, "t": 352.40848, "r": 353.71539, 
"b": 363.69717, "coord_origin": "1"}}, {"id": 34, "text": "all", "bbox": {"l": 369.306, "t": 352.40848, "r": 379.02914, "b": 363.69717, "coord_origin": "1"}}, {"id": 35, "text": "6", "bbox": {"l": 161.90601, "t": 371.23849, "r": 166.51474, "b": 382.52719, "coord_origin": "1"}}, {"id": 36, "text": "6", "bbox": {"l": 209.509, "t": 371.23849, "r": 214.11774, "b": 382.52719, "coord_origin": "1"}}, {"id": 37, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 365.75848, "r": 271.41064, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 38, "text": "0.965", "bbox": {"l": 289.017, "t": 365.75848, "r": 310.00732, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 39, "text": "0.934", "bbox": {"l": 326.71701, "t": 365.75848, "r": 347.70734, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 40, "text": "0.955", "bbox": {"l": 363.67599, "t": 365.75848, "r": 384.66632, "b": 377.04717999999997, "coord_origin": "1"}}, {"id": 41, "text": "0.88", "bbox": {"l": 397.26999, "t": 365.69571, "r": 416.12634, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 42, "text": "2.73", "bbox": {"l": 439.52701, "t": 365.69571, "r": 458.38336, "b": 377.10098000000005, "coord_origin": "1"}}, {"id": 43, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 378.71048, "r": 272.94495, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 44, "text": "0.969", "bbox": {"l": 289.017, "t": 378.71048, "r": 310.00732, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 45, "text": "0.927", "bbox": {"l": 326.71701, "t": 378.71048, "r": 347.70734, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 46, "text": "0.955", "bbox": {"l": 363.67599, "t": 378.71048, "r": 384.66632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 47, "text": "0.857", "bbox": {"l": 396.20599, "t": 378.71048, "r": 417.19632, "b": 389.99917999999997, "coord_origin": "1"}}, {"id": 48, "text": "5.39", "bbox": {"l": 440.767, "t": 378.71048, "r": 457.15039, "b": 389.99917999999997, "coord_origin": "1"}}, 
{"id": 49, "text": "4", "bbox": {"l": 161.90601, "t": 397.53949, "r": 166.51474, "b": 408.82819, "coord_origin": "1"}}, {"id": 50, "text": "4", "bbox": {"l": 209.509, "t": 397.53949, "r": 214.11774, "b": 408.82819, "coord_origin": "1"}}, {"id": 51, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 392.06049, "r": 271.41064, "b": 403.34918, "coord_origin": "1"}}, {"id": 52, "text": "0.938", "bbox": {"l": 289.017, "t": 392.06049, "r": 310.00732, "b": 403.34918, "coord_origin": "1"}}, {"id": 53, "text": "0.904", "bbox": {"l": 326.71701, "t": 392.06049, "r": 347.70734, "b": 403.34918, "coord_origin": "1"}}, {"id": 54, "text": "0.927", "bbox": {"l": 363.67599, "t": 392.06049, "r": 384.66632, "b": 403.34918, "coord_origin": "1"}}, {"id": 55, "text": "0.853", "bbox": {"l": 394.61801, "t": 391.99771, "r": 418.77798, "b": 403.40298, "coord_origin": "1"}}, {"id": 56, "text": "1.97", "bbox": {"l": 439.52701, "t": 391.99771, "r": 458.38336, "b": 403.40298, "coord_origin": "1"}}, {"id": 57, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 405.01147, "r": 272.94495, "b": 416.30017, "coord_origin": "1"}}, {"id": 58, "text": "0.952", "bbox": {"l": 289.017, "t": 405.01147, "r": 310.00732, "b": 416.30017, "coord_origin": "1"}}, {"id": 59, "text": "0.909", "bbox": {"l": 326.71701, "t": 405.01147, "r": 347.70734, "b": 416.30017, "coord_origin": "1"}}, {"id": 60, "text": "0.938", "bbox": {"l": 362.08801, "t": 404.9486999999999, "r": 386.24799, "b": 416.35397, "coord_origin": "1"}}, {"id": 61, "text": "0.843", "bbox": {"l": 396.20599, "t": 405.01147, "r": 417.19632, "b": 416.30017, "coord_origin": "1"}}, {"id": 62, "text": "3.77", "bbox": {"l": 440.767, "t": 405.01147, "r": 457.15039, "b": 416.30017, "coord_origin": "1"}}, {"id": 63, "text": "2", "bbox": {"l": 161.90601, "t": 423.84048, "r": 166.51474, "b": 435.12918, "coord_origin": "1"}}, {"id": 64, "text": "4", "bbox": {"l": 209.509, "t": 423.84048, "r": 214.11774, "b": 435.12918, "coord_origin": "1"}}, {"id": 65, "text": 
"OTSL", "bbox": {"l": 246.71000999999998, "t": 418.3614799999999, "r": 271.41064, "b": 429.65018, "coord_origin": "1"}}, {"id": 66, "text": "0.923", "bbox": {"l": 289.017, "t": 418.3614799999999, "r": 310.00732, "b": 429.65018, "coord_origin": "1"}}, {"id": 67, "text": "0.897", "bbox": {"l": 326.71701, "t": 418.3614799999999, "r": 347.70734, "b": 429.65018, "coord_origin": "1"}}, {"id": 68, "text": "0.915", "bbox": {"l": 363.67599, "t": 418.3614799999999, "r": 384.66632, "b": 429.65018, "coord_origin": "1"}}, {"id": 69, "text": "0.859", "bbox": {"l": 394.61801, "t": 418.29871, "r": 418.77798, "b": 429.70398, "coord_origin": "1"}}, {"id": 70, "text": "1.91", "bbox": {"l": 439.52701, "t": 418.29871, "r": 458.38336, "b": 429.70398, "coord_origin": "1"}}, {"id": 71, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 431.31246999999996, "r": 272.94495, "b": 442.60117, "coord_origin": "1"}}, {"id": 72, "text": "0.945", "bbox": {"l": 289.017, "t": 431.31246999999996, "r": 310.00732, "b": 442.60117, "coord_origin": "1"}}, {"id": 73, "text": "0.901", "bbox": {"l": 326.71701, "t": 431.31246999999996, "r": 347.70734, "b": 442.60117, "coord_origin": "1"}}, {"id": 74, "text": "0.931", "bbox": {"l": 362.08801, "t": 431.24969, "r": 386.24799, "b": 442.65497, "coord_origin": "1"}}, {"id": 75, "text": "0.834", "bbox": {"l": 396.20599, "t": 431.31246999999996, "r": 417.19632, "b": 442.60117, "coord_origin": "1"}}, {"id": 76, "text": "3.81", "bbox": {"l": 440.767, "t": 431.31246999999996, "r": 457.15039, "b": 442.60117, "coord_origin": "1"}}, {"id": 77, "text": "4", "bbox": {"l": 161.90601, "t": 450.14248999999995, "r": 166.51474, "b": 461.43118, "coord_origin": "1"}}, {"id": 78, "text": "2", "bbox": {"l": 209.509, "t": 450.14248999999995, "r": 214.11774, "b": 461.43118, "coord_origin": "1"}}, {"id": 79, "text": "OTSL", "bbox": {"l": 246.71000999999998, "t": 444.66248, "r": 271.41064, "b": 455.95117, "coord_origin": "1"}}, {"id": 80, "text": "0.952", "bbox": {"l": 289.017, "t": 
444.66248, "r": 310.00732, "b": 455.95117, "coord_origin": "1"}}, {"id": 81, "text": "0.92", "bbox": {"l": 329.021, "t": 444.66248, "r": 345.40439, "b": 455.95117, "coord_origin": "1"}}, {"id": 82, "text": "0.942", "bbox": {"l": 362.08801, "t": 444.5996999999999, "r": 386.24799, "b": 456.00497, "coord_origin": "1"}}, {"id": 83, "text": "0.857", "bbox": {"l": 394.61801, "t": 444.5996999999999, "r": 418.77798, "b": 456.00497, "coord_origin": "1"}}, {"id": 84, "text": "1.22", "bbox": {"l": 439.52701, "t": 444.5996999999999, "r": 458.38336, "b": 456.00497, "coord_origin": "1"}}, {"id": 85, "text": "HTML", "bbox": {"l": 245.17598999999998, "t": 457.61447, "r": 272.94495, "b": 468.90317, "coord_origin": "1"}}, {"id": 86, "text": "0.944", "bbox": {"l": 289.017, "t": 457.61447, "r": 310.00732, "b": 468.90317, "coord_origin": "1"}}, {"id": 87, "text": "0.903", "bbox": {"l": 326.71701, "t": 457.61447, "r": 347.70734, "b": 468.90317, "coord_origin": "1"}}, {"id": 88, "text": "0.931", "bbox": {"l": 363.67599, "t": 457.61447, "r": 384.66632, "b": 468.90317, "coord_origin": "1"}}, {"id": 89, "text": "0.824", "bbox": {"l": 396.20599, "t": 457.61447, "r": 417.19632, "b": 468.90317, "coord_origin": "1"}}, {"id": 90, "text": "2", "bbox": {"l": 446.65302, "t": 457.61447, "r": 451.26175, "b": 468.90317, "coord_origin": "1"}}]}, "text": null, "otsl_seq": ["ched", "ched", "ched", "ched", "lcel", "lcel", "ched", "ched", "nl", "ched", "ched", "ucel", "ched", "ched", "ched", "ched", "ched", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "fcel", "nl"], "num_rows": 7, "num_cols": 8, "table_cells": [{"bbox": {"l": 160.37, "t": 339.45749, "r": 168.04523, "b": 350.74619, "coord_origin": "1"}, "row_span": 
1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "#", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 144.592, "t": 352.40848, "r": 183.82895, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "enc-layers", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 207.974, "t": 339.45749, "r": 215.64923000000002, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "#", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 192.19501, "t": 352.40848, "r": 231.42303, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "dec-layers", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 239.79799999999997, "t": 344.93649, "r": 278.3338, "b": 356.22519000000005, "coord_origin": "1"}, "row_span": 2, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 2, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "Language", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 324.67001, "t": 339.45749, "r": 348.26419, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 3, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 3, "end_col_offset_idx": 6, "text": "TEDs", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 396.271, "t": 339.45749, "r": 417.12595, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, 
"start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "mAP", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 394.927, "t": 350.41647, "r": 418.46921, "b": 361.70517, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "(0.75)", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 430.771, "t": 339.45749, "r": 467.14142000000004, "b": 350.74619, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 0, "end_row_offset_idx": 1, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "Inference", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 427.14801, "t": 350.41647, "r": 470.76955999999996, "b": 361.70517, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "time (secs)", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 286.686, "t": 352.40848, "r": 312.32812, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "simple", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 320.702, "t": 352.40848, "r": 353.71539, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "complex", "column_header": true, "row_header": false, "row_section": false}, {"bbox": {"l": 369.306, "t": 352.40848, "r": 379.02914, "b": 363.69717, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 1, "end_row_offset_idx": 2, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "all", "column_header": true, 
"row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 371.23849, "r": 166.51474, "b": 382.52719, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "6", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 371.23849, "r": 214.11774, "b": 382.52719, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "6", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 365.75848, "r": 272.94495, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 365.75848, "r": 310.00732, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.965 0.969", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 365.75848, "r": 347.70734, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.934 0.927", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 363.67599, "t": 365.75848, "r": 384.66632, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.955 0.955", "column_header": false, "row_header": false, "row_section": false}, {"bbox": 
{"l": 396.20599, "t": 365.69571, "r": 417.19632, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.88 0.857", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 365.69571, "r": 458.38336, "b": 389.99917999999997, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 2, "end_row_offset_idx": 3, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "2.73 5.39", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 397.53949, "r": 166.51474, "b": 408.82819, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 397.53949, "r": 214.11774, "b": 408.82819, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 392.06049, "r": 272.94495, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 392.06049, "r": 310.00732, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.938 0.952", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 392.06049, "r": 347.70734, "b": 403.34918, 
"coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.904", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 392.06049, "r": 386.24799, "b": 416.35397, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.927 0.938", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 391.99771, "r": 418.77798, "b": 403.40298, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.853", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 391.99771, "r": 458.38336, "b": 403.40298, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 3, "end_row_offset_idx": 4, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "1.97", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 405.01147, "r": 347.70734, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.909 0.897 0.901", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 396.20599, "t": 405.01147, "r": 417.19632, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.843", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 440.767, "t": 405.01147, "r": 457.15039, "b": 416.30017, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, 
"end_row_offset_idx": 5, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "3.77", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 423.84048, "r": 166.51474, "b": 435.12918, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "2", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 423.84048, "r": 214.11774, "b": 435.12918, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 246.71000999999998, "t": 418.3614799999999, "r": 271.41064, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 418.3614799999999, "r": 310.00732, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.923", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 363.67599, "t": 418.3614799999999, "r": 384.66632, "b": 429.65018, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.915", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 418.29871, "r": 418.77798, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": 
"0.859 0.834", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 418.29871, "r": 458.38336, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "1.91 3.81", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 431.31246999999996, "r": 272.94495, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 4, "end_row_offset_idx": 5, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 431.31246999999996, "r": 310.00732, "b": 442.60117, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.945", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 431.24969, "r": 386.24799, "b": 442.65497, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 5, "end_row_offset_idx": 6, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.931", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 161.90601, "t": 450.14248999999995, "r": 166.51474, "b": 461.43118, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 0, "end_col_offset_idx": 1, "text": "4", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 209.509, "t": 450.14248999999995, "r": 214.11774, "b": 461.43118, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 1, "end_col_offset_idx": 2, "text": "2", "column_header": false, "row_header": false, 
"row_section": false}, {"bbox": {"l": 245.17598999999998, "t": 444.66248, "r": 272.94495, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 2, "end_col_offset_idx": 3, "text": "OTSL HTML", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 289.017, "t": 444.66248, "r": 310.00732, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 3, "end_col_offset_idx": 4, "text": "0.952 0.944", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 326.71701, "t": 444.66248, "r": 347.70734, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 4, "end_col_offset_idx": 5, "text": "0.92 0.903", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 362.08801, "t": 444.5996999999999, "r": 386.24799, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 5, "end_col_offset_idx": 6, "text": "0.942 0.931", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 394.61801, "t": 444.5996999999999, "r": 418.77798, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 6, "end_col_offset_idx": 7, "text": "0.857 0.824", "column_header": false, "row_header": false, "row_section": false}, {"bbox": {"l": 439.52701, "t": 444.5996999999999, "r": 458.38336, "b": 468.90317, "coord_origin": "1"}, "row_span": 1, "col_span": 1, "start_row_offset_idx": 6, "end_row_offset_idx": 7, "start_col_offset_idx": 7, "end_col_offset_idx": 8, "text": "1.22 2", "column_header": false, "row_header": false, "row_section": false}]}, {"label": 
"Section-header", "id": 7, "page_no": 0, "cluster": {"id": 7, "label": "Section-header", "bbox": {"l": 134.34051513671875, "t": 505.67111, "r": 264.40829, "b": 518.17419, "coord_origin": "1"}, "confidence": 0.9198964834213257, "cells": [{"id": 91, "text": "5.2", "bbox": {"l": 134.765, "t": 505.67111, "r": 149.40306, "b": 518.17419, "coord_origin": "1"}}, {"id": 92, "text": "Quantitative Results", "bbox": {"l": 160.85905, "t": 505.67111, "r": 264.40829, "b": 518.17419, "coord_origin": "1"}}]}, "text": "5.2 Quantitative Results"}, {"label": "Text", "id": 8, "page_no": 0, "cluster": {"id": 8, "label": "Text", "bbox": {"l": 134.35081481933594, "t": 522.08005, "r": 480.72003, "b": 618.3, "coord_origin": "1"}, "confidence": 0.98807293176651, "cells": [{"id": 93, "text": "We picked the model parameter configuration that produced the best prediction", "bbox": {"l": 134.765, "t": 522.08005, "r": 479.72983, "b": 534.61301, "coord_origin": "1"}}, {"id": 94, "text": "quality (enc=6, dec=6, heads=8) with PubTabNet alone, then independently", "bbox": {"l": 134.765, "t": 534.03604, "r": 480.5897499999999, "b": 546.569, "coord_origin": "1"}}, {"id": 95, "text": "trained and evaluated it on three publicly available data sets: PubTabNet (395k", "bbox": {"l": 134.765, "t": 545.99104, "r": 480.72003, "b": 558.524, "coord_origin": "1"}}, {"id": 96, "text": "samples), FinTabNet (113k samples) and PubTables-1M (about 1M samples).", "bbox": {"l": 134.765, "t": 557.94604, "r": 480.60577, "b": 570.479, "coord_origin": "1"}}, {"id": 97, "text": "Performance results are presented in Table. 2. 
It is clearly evident that the model", "bbox": {"l": 134.765, "t": 569.90103, "r": 480.5936899999999, "b": 582.43399, "coord_origin": "1"}}, {"id": 98, "text": "trained on OTSL outperforms HTML across the board, keeping high TEDs and", "bbox": {"l": 134.765, "t": 581.85603, "r": 480.59158, "b": 594.38899, "coord_origin": "1"}}, {"id": 99, "text": "mAP scores even on difficult financial tables (FinTabNet) that contain sparse", "bbox": {"l": 134.765, "t": 593.81204, "r": 480.58080999999993, "b": 606.345, "coord_origin": "1"}}, {"id": 100, "text": "and large tables.", "bbox": {"l": 134.765, "t": 605.76704, "r": 206.79959, "b": 618.3, "coord_origin": "1"}}]}, "text": "We picked the model parameter configuration that produced the best prediction quality (enc=6, dec=6, heads=8) with PubTabNet alone, then independently trained and evaluated it on three publicly available data sets: PubTabNet (395k samples), FinTabNet (113k samples) and PubTables-1M (about 1M samples). Performance results are presented in Table. 2. It is clearly evident that the model trained on OTSL outperforms HTML across the board, keeping high TEDs and mAP scores even on difficult financial tables (FinTabNet) that contain sparse and large tables."}, {"label": "Text", "id": 9, "page_no": 0, "cluster": {"id": 9, "label": "Text", "bbox": {"l": 134.27769470214844, "t": 617.72205, "r": 480.59857000000005, "b": 666.12, "coord_origin": "1"}, "confidence": 0.9812840819358826, "cells": [{"id": 101, "text": "Additionally, the results show that OTSL has an advantage over HTML", "bbox": {"l": 149.709, "t": 617.72205, "r": 480.59479, "b": 630.255, "coord_origin": "1"}}, {"id": 102, "text": "when applied on a bigger data set like PubTables-1M and achieves significantly", "bbox": {"l": 134.765, "t": 629.6770300000001, "r": 480.59857000000005, "b": 642.2099900000001, "coord_origin": "1"}}, {"id": 103, "text": "improved scores. 
Finally, OTSL achieves faster inference due to fewer decoding", "bbox": {"l": 134.765, "t": 641.63203, "r": 480.59384000000006, "b": 654.16499, "coord_origin": "1"}}, {"id": 104, "text": "steps which is a result of the reduced sequence representation.", "bbox": {"l": 134.765, "t": 653.58704, "r": 405.7995, "b": 666.12, "coord_origin": "1"}}]}, "text": "Additionally, the results show that OTSL has an advantage over HTML when applied on a bigger data set like PubTables-1M and achieves significantly improved scores. Finally, OTSL achieves faster inference due to fewer decoding steps which is a result of the reduced sequence representation."}], "headers": [{"label": "Page-header", "id": 0, "page_no": 0, "cluster": {"id": 0, "label": "Page-header", "bbox": {"l": 193.83700561523438, "t": 91.49352999999996, "r": 447.54476999999997, "b": 102.78223000000003, "coord_origin": "1"}, "confidence": 0.9235936999320984, "cells": [{"id": 0, "text": "Optimized Table Tokenization for Table Structure Recognition", "bbox": {"l": 194.478, "t": 91.49352999999996, "r": 447.54476999999997, "b": 102.78223000000003, "coord_origin": "1"}}]}, "text": "Optimized Table Tokenization for Table Structure Recognition"}, {"label": "Page-header", "id": 1, "page_no": 0, "cluster": {"id": 1, "label": "Page-header", "bbox": {"l": 475.3370056152344, "t": 91.49352999999996, "r": 480.59314, "b": 102.78223000000003, "coord_origin": "1"}, "confidence": 0.7262580394744873, "cells": [{"id": 1, "text": "9", "bbox": {"l": 475.98441, "t": 91.49352999999996, "r": 480.59314, "b": 102.78223000000003, "coord_origin": "1"}}]}, "text": "9"}]}}]