feat: allow computing page images on-demand with scale and cache them (#36)

* feat: allow computing page images on-demand and cache them

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* feat: expose scale for export of page images and document elements

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix comment

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Michele Dolfi
2024-08-20 13:27:19 +02:00
committed by GitHub
parent c253dd743a
commit 78347bf679
9 changed files with 104 additions and 77 deletions
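
Note: the commit message describes computing page images on demand at a requested scale and caching them. A minimal sketch of that pattern follows, assuming a per-scale cache. Only the call shape `get_image(scale=...)` comes from the diff below; `CachedPage`, `render`, `_image_cache`, and `fake_render` are hypothetical names for illustration, not the project's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a rendered image: just its pixel dimensions.
Size = Tuple[int, int]


@dataclass
class CachedPage:
    """Illustrative only; not the project's actual Page class."""

    # Hypothetical renderer: takes a scale and returns the rendered size.
    render: Callable[[float], Size]
    # One cached result per requested scale.
    _image_cache: Dict[float, Size] = field(default_factory=dict)

    def get_image(self, scale: float = 1.0) -> Size:
        # Render only on the first request for a given scale, then reuse it.
        if scale not in self._image_cache:
            self._image_cache[scale] = self.render(scale)
        return self._image_cache[scale]


calls: List[float] = []


def fake_render(scale: float) -> Size:
    # Record each real render so cache hits can be told apart from misses.
    calls.append(scale)
    return (round(612 * scale), round(792 * scale))


page = CachedPage(render=fake_render)
page.get_image(scale=1.0)
page.get_image(scale=1.0)  # served from the cache, no second render
page.get_image(scale=2.0)  # new scale, rendered once
```

Keying the cache by scale means a caller that asks for the same scale repeatedly, as the layout model does with `scale=1.0`, pays the rendering cost once per page.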


@@ -267,7 +267,9 @@ class LayoutModel:
     def __call__(self, page_batch: Iterable[Page]) -> Iterable[Page]:
         for page in page_batch:
             clusters = []
-            for ix, pred_item in enumerate(self.layout_predictor.predict(page.image)):
+            for ix, pred_item in enumerate(
+                self.layout_predictor.predict(page.get_image(scale=1.0))
+            ):
                 cluster = Cluster(
                     id=ix,
                     label=pred_item["label"],