docs: add coming-soon section (#235)

Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
Panos Vagenas 2024-11-05 08:53:02 +01:00 committed by GitHub
parent d5e65aedac
commit 5ce02c5c59
GPG Key ID: B5690EEEBB952194
2 changed files with 14 additions and 8 deletions

View File

@@ -19,19 +19,22 @@
 Docling parses documents and exports them to the desired format with ease and speed.
 ## Features
 * 🗂️ Reads popular document formats (PDF, DOCX, PPTX, Images, HTML, AsciiDoc, Markdown) and exports to Markdown and JSON
 * 📑 Advanced PDF document understanding including page layout, reading order & table structures
 * 🧩 Unified, expressive [DoclingDocument](https://ds4sd.github.io/docling/concepts/docling_document/) representation format
-* 📝 Metadata extraction, including title, authors, references & language
-* 🤖 Seamless LlamaIndex 🦙 & LangChain 🦜🔗 integration for powerful RAG / QA applications
+* 🤖 Easy integration with LlamaIndex 🦙 & LangChain 🦜🔗 for powerful RAG / QA applications
 * 🔍 OCR support for scanned PDFs
 * 💻 Simple and convenient CLI
 Explore the [documentation](https://ds4sd.github.io/docling/) to discover plenty examples and unlock the full power of Docling!
+### Coming soon
+* ♾️ Equation & code extraction
+* 📝 Metadata extraction, including title, authors, references & language
+* 🦜🔗 Native LangChain extension
 ## Installation
@@ -57,7 +60,6 @@ result = converter.convert(source)
 print(result.document.export_to_markdown())  # output: "## Docling Technical Report[...]"
 ```
 Check out [Getting started](https://ds4sd.github.io/docling/).
 You will find lots of tuning options to leverage all the advanced capabilities.
@@ -66,7 +68,6 @@ You will find lots of tuning options to leverage all the advanced capabilities.
 Please feel free to connect with us using the [discussion section](https://github.com/DS4SD/docling/discussions).
 ## Technical report
 For more details on Docling's inner workings, check out the [Docling Technical Report](https://arxiv.org/abs/2408.09869).

View File

@@ -22,7 +22,12 @@ Docling parses documents and exports them to the desired format with ease and sp
 * 🗂️ Reads popular document formats (PDF, DOCX, PPTX, Images, HTML, AsciiDoc, Markdown) and exports to Markdown and JSON
 * 📑 Advanced PDF document understanding incl. page layout, reading order & table structures
 * 🧩 Unified, expressive [DoclingDocument](./concepts/docling_document.md) representation format
-* 📝 Metadata extraction, including title, authors, references & language
-* 🤖 Seamless LlamaIndex 🦙 & LangChain 🦜🔗 integration for powerful RAG / QA applications
+* 🤖 Easy integration with LlamaIndex 🦙 & LangChain 🦜🔗 for powerful RAG / QA applications
 * 🔍 OCR support for scanned PDFs
 * 💻 Simple and convenient CLI
+### Coming soon
+* ♾️ Equation & code extraction
+* 📝 Metadata extraction, including title, authors, references & language
+* 🦜🔗 Native LangChain extension