docs: extend integration docs & README (#456)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
parent 211f4f7570
commit 84c46fdeb3
24
README.md
@ -4,7 +4,7 @@
</a>
</p>

# Docling
# 🦆 Docling

<p align="center">
<a href="https://trendshift.io/repositories/12132" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12132" alt="DS4SD%2Fdocling | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
@ -29,7 +29,7 @@ Docling parses documents and exports them to the desired format with ease and sp
* 🗂️ Reads popular document formats (PDF, DOCX, PPTX, XLSX, Images, HTML, AsciiDoc & Markdown) and exports to Markdown and JSON
* 📑 Advanced PDF document understanding including page layout, reading order & table structures
* 🧩 Unified, expressive [DoclingDocument](https://ds4sd.github.io/docling/concepts/docling_document/) representation format
* 🤖 Easy integration with LlamaIndex 🦙 & LangChain 🦜🔗 for powerful RAG / QA applications
* 🤖 Easy integration with 🦙 LlamaIndex & 🦜🔗 LangChain for powerful RAG / QA applications
* 🔍 OCR support for scanned PDFs
* 💻 Simple and convenient CLI
@ -65,8 +65,24 @@ result = converter.convert(source)
print(result.document.export_to_markdown()) # output: "## Docling Technical Report[...]"
```

Check out [Getting started](https://ds4sd.github.io/docling/).
You will find lots of tuning options to leverage all the advanced capabilities.
More [advanced usage options](https://ds4sd.github.io/docling/usage/) are available in
the docs.

## Documentation

Check out Docling's [documentation](https://ds4sd.github.io/docling/), for details on
installation, usage, concepts, recipes, extensions, and more.

## Examples

Go hands-on with our [examples](https://ds4sd.github.io/docling/examples/),
demonstrating how to address different application use cases with Docling.

## Integrations

To further accelerate your AI application development, check out Docling's native
[integrations](https://ds4sd.github.io/docling/integrations/) with popular frameworks
and tools.

## Get help and support
BIN
docs/assets/docling_ecosystem.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 233 KiB |
BIN
docs/assets/docling_ecosystem.pptx
Normal file
Binary file not shown.
@ -1,5 +1,3 @@
# Docling

<p align="center">
<img loading="lazy" alt="Docling" src="assets/docling_processing.png" width="100%" />
<a href="https://trendshift.io/repositories/12132" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12132" alt="DS4SD%2Fdocling | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
@ -23,7 +21,7 @@ Docling parses documents and exports them to the desired format with ease and sp
* 🗂️ Reads popular document formats (PDF, DOCX, PPTX, XLSX, Images, HTML, AsciiDoc & Markdown) and exports to Markdown and JSON
* 📑 Advanced PDF document understanding incl. page layout, reading order & table structures
* 🧩 Unified, expressive [DoclingDocument](./concepts/docling_document.md) representation format
* 🤖 Easy integration with LlamaIndex 🦙 & LangChain 🦜🔗 for powerful RAG / QA applications
* 🤖 Easy integration with 🦙 LlamaIndex & 🦜🔗 LangChain for powerful RAG / QA applications
* 🔍 OCR support for scanned PDFs
* 💻 Simple and convenient CLI
9
docs/integrations/bee.md
Normal file
@ -0,0 +1,9 @@
Docling is available as an extraction backend in the [Bee][github] framework.

- 💻 [Bee GitHub][github]
- 📖 [Bee Docs][docs]
- 📦 [Bee NPM][package]

[github]: https://github.com/i-am-bee
[docs]: https://i-am-bee.github.io/bee-agent-framework/
[package]: https://www.npmjs.com/package/bee-agent-framework
@ -1 +1,6 @@
Use the navigation on the left to browse through Docling integrations with popular frameworks and tools.


<p align="center">
<img loading="lazy" alt="Docling" src="../assets/docling_ecosystem.png" width="100%" />
</p>
17
docs/integrations/instructlab.md
Normal file
@ -0,0 +1,17 @@
Docling is powering document processing in [InstructLab](https://instructlab.ai/),
enabling users to unlock the knowledge hidden in documents and present it to
InstructLab's fine-tuning for aligning AI models to the user's specific data.

More details can be found in this [blog post][blog].

- 🏠 [InstructLab Home][home]
- 💻 [InstructLab GitHub][github]
- 🧑🏻‍💻 [InstructLab UI][ui]
- 📖 [InstructLab Docs][docs]
<!-- - 📝 [Blog post]() -->

[home]: https://instructlab.ai
[github]: https://github.com/instructlab
[ui]: https://ui.instructlab.ai/
[docs]: https://docs.instructlab.ai/
[blog]: https://www.redhat.com/en/blog/docling-missing-document-processing-companion-generative-ai
9
docs/integrations/prodigy.md
Normal file
@ -0,0 +1,9 @@
Docling is available in [Prodigy][home] as a [Prodigy-PDF plugin][plugin] recipe.

- 🌐 [Prodigy Home][home]
- 🔌 [Prodigy-PDF Plugin][plugin]
- 🧑🏽‍🍳 [pdf-spans.manual Recipe][recipe]

[home]: https://prodi.gy/
[plugin]: https://prodi.gy/docs/plugins#pdf
[recipe]: https://prodi.gy/docs/plugins#pdf-spans.manual
@ -1,3 +1,5 @@
# spaCy

Docling is available in [spaCy](https://spacy.io/) as the "SpaCy Layout" plugin:

- 💻 [SpacyLayout GitHub][github]
@ -1,5 +1,7 @@
{% extends "base.html" %}

{#
{% block announce %}
<p>🎉 Docling has gone v2! <a href="{{ 'v2' | url }}">Check out</a> what's new and how to get started!</p>
{% endblock %}
#}
@ -52,8 +52,8 @@ theme:
- search.suggest
- toc.follow
nav:
- Get started:
- Home: index.md
- Home:
- "🦆 Docling": index.md
- Installation: installation.md
- Usage: usage.md
- CLI: cli.md
@ -85,10 +85,13 @@ nav:
# - CLI: examples/cli.md
- Integrations:
- Integrations: integrations/index.md
- "🐝 Bee": integrations/bee.md
- "Data Prep Kit": integrations/data_prep_kit.md
- "DocETL": integrations/docetl.md
- "🐶 InstructLab": integrations/instructlab.md
- "Kotaemon": integrations/kotaemon.md
- "LlamaIndex 🦙": integrations/llamaindex.md
- "🦙 LlamaIndex": integrations/llamaindex.md
- "Prodigy": integrations/prodigy.md
- "spaCy": integrations/spacy.md
# - "LangChain 🦜🔗": integrations/langchain.md
# - API reference: