Docling/docs/integrations/data_prep_kit.md
Panos Vagenas 93fc1be61a
docs: add Data Prep Kit integration (#316)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-11-12 12:21:48 +01:00


## Get started
Docling is used by the [Data Prep Kit \[↗\]](https://ibm.github.io/data-prep-kit/) open-source toolkit, which prepares unstructured data for LLM application development at scales ranging from laptop to datacenter.
The Data Prep Kit modules powered by Docling are listed below.

## PDF ingestion to Parquet
- 💻 [GitHub \[↗\]](https://github.com/IBM/data-prep-kit/tree/dev/transforms/language/pdf2parquet)
- 📖 [API docs \[↗\]](https://ibm.github.io/data-prep-kit/transforms/language/pdf2parquet/python/)

## Document chunking
- 💻 [GitHub \[↗\]](https://github.com/IBM/data-prep-kit/tree/dev/transforms/language/doc_chunk)
- 📖 [API docs \[↗\]](https://ibm.github.io/data-prep-kit/transforms/language/doc_chunk/python/)