docs: add coming-soon section (#235)

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
Panos Vagenas 2024-11-05 08:53:02 +01:00 committed by GitHub
parent d5e65aedac
commit 5ce02c5c59
2 changed files with 14 additions and 8 deletions

@@ -19,19 +19,22 @@
 Docling parses documents and exports them to the desired format with ease and speed.
 ## Features
 * 🗂️ Reads popular document formats (PDF, DOCX, PPTX, Images, HTML, AsciiDoc, Markdown) and exports to Markdown and JSON
 * 📑 Advanced PDF document understanding including page layout, reading order & table structures
 * 🧩 Unified, expressive [DoclingDocument](https://ds4sd.github.io/docling/concepts/docling_document/) representation format
-* 📝 Metadata extraction, including title, authors, references & language
-* 🤖 Seamless LlamaIndex 🦙 & LangChain 🦜🔗 integration for powerful RAG / QA applications
+* 🤖 Easy integration with LlamaIndex 🦙 & LangChain 🦜🔗 for powerful RAG / QA applications
 * 🔍 OCR support for scanned PDFs
 * 💻 Simple and convenient CLI
 Explore the [documentation](https://ds4sd.github.io/docling/) to discover plenty examples and unlock the full power of Docling!
+### Coming soon
+* ♾️ Equation & code extraction
+* 📝 Metadata extraction, including title, authors, references & language
+* 🦜🔗 Native LangChain extension
 ## Installation
@@ -57,7 +60,6 @@ result = converter.convert(source)
 print(result.document.export_to_markdown()) # output: "## Docling Technical Report[...]"
 ```
 Check out [Getting started](https://ds4sd.github.io/docling/).
 You will find lots of tuning options to leverage all the advanced capabilities.
@@ -66,7 +68,6 @@ You will find lots of tuning options to leverage all the advanced capabilities.
 Please feel free to connect with us using the [discussion section](https://github.com/DS4SD/docling/discussions).
 ## Technical report
 For more details on Docling's inner workings, check out the [Docling Technical Report](https://arxiv.org/abs/2408.09869).
@@ -95,5 +96,5 @@ If you use Docling in your projects, please consider citing the following:
 ## License
 The Docling codebase is under MIT license.
 For individual model usage, please refer to the model licenses found in the original packages.

@@ -22,7 +22,12 @@ Docling parses documents and exports them to the desired format with ease and sp
 * 🗂️ Reads popular document formats (PDF, DOCX, PPTX, Images, HTML, AsciiDoc, Markdown) and exports to Markdown and JSON
 * 📑 Advanced PDF document understanding incl. page layout, reading order & table structures
 * 🧩 Unified, expressive [DoclingDocument](./concepts/docling_document.md) representation format
-* 📝 Metadata extraction, including title, authors, references & language
-* 🤖 Seamless LlamaIndex 🦙 & LangChain 🦜🔗 integration for powerful RAG / QA applications
+* 🤖 Easy integration with LlamaIndex 🦙 & LangChain 🦜🔗 for powerful RAG / QA applications
 * 🔍 OCR support for scanned PDFs
 * 💻 Simple and convenient CLI
+### Coming soon
+* ♾️ Equation & code extraction
+* 📝 Metadata extraction, including title, authors, references & language
+* 🦜🔗 Native LangChain extension