Commit Graph

  • f542460af3
    fix: fix duplicate title and heading + add e2e tests for html and docx (#186) Peter W. J. Staar 2024-10-30 13:14:56 +0100
  • dda2645d4c chore: bump version to 2.2.1 [skip ci] github-actions[bot] 2024-10-28 17:18:41 +0000
  • b9f5c74a7d
    fix: fix header levels for DOCX & HTML (#184) Panos Vagenas 2024-10-28 17:02:52 +0100
  • 94d0729c50
    fix: handling of long sequence of unescaped underscore chars in markdown (#173) Maxim Lysak 2024-10-28 16:34:48 +0100
  • 2cece27208
    docs: update LlamaIndex docs for Docling v2 (#182) Panos Vagenas 2024-10-28 14:28:26 +0100
  • 189d3c2d44
    docs: fix batch convert (#177) Michele Dolfi 2024-10-26 05:50:34 +0200
  • 7d19418b77
    fix: HTML backend, fixes for Lists and nested texts (#180) Maxim Lysak 2024-10-25 20:14:04 +0200
  • 88c1673057
    fix: MD Backend, fixes to properly handle trailing inline text and emphasis in headers (#178) Maxim Lysak 2024-10-25 18:02:20 +0200
  • 77a89c3334
    chore: make auto-release on request (#179) Michele Dolfi 2024-10-25 10:47:25 +0200
  • 8d356aa247
    docs: add export with embedded images (#175) Michele Dolfi 2024-10-24 20:19:41 +0200
  • 8208c93e3a chore: bump version to 2.2.0 [skip ci] github-actions[bot] 2024-10-23 16:04:55 +0000
  • 4116819b51
    feat: Update to docling-parse v2 without history (#170) Peter W. J. Staar 2024-10-23 17:20:11 +0200
  • 3023f18ba0
    feat: Support AsciiDoc and Markdown input format (#168) Christoph Auer 2024-10-23 16:14:26 +0200
  • 3496b4838f
    fix: set valid=false for invalid backends (#171) Michele Dolfi 2024-10-23 15:52:30 +0200
  • b8d2286dd1
    chore: various minor docs fixes (#169) Panos Vagenas 2024-10-22 15:29:36 +0200
  • fa5f94ec10
    Fix Typo errors in CONTRIBUTING.md file (#164) Mohamed Ali 2024-10-22 10:31:48 +0530
  • d5460e2d1f chore: bump version to 2.1.0 [skip ci] github-actions[bot] 2024-10-18 13:21:15 +0000
  • b346faf622
    feat: add coverage_threshold to skip OCR for small images (#161) Michele Dolfi 2024-10-18 13:58:23 +0200
  • f799e777c1
    docs: typo fix (#155) ABHISHEK FADAKE 2024-10-18 17:26:48 +0530
  • 63bef59d9e
    fix: fix legacy doc ref (#162) Panos Vagenas 2024-10-18 13:11:20 +0200
  • bb7a58d45d
    ci: run ci also on forks (#160) Michele Dolfi 2024-10-18 12:32:27 +0200
  • a00c937e19
    Ensure all models work only on valid pages (#158) Christoph Auer 2024-10-18 08:54:06 +0200
  • 034a411057
    docs: add graphical band in readme (#154) Maxim Lysak 2024-10-17 18:15:40 +0200
  • 61c092f445
    docs: add use docling (#150) Michele Dolfi 2024-10-17 18:14:48 +0200
  • 24f949ada2
    chore: run apt-get update before install (#156) Michele Dolfi 2024-10-17 17:27:16 +0200
  • a29c256041 chore: bump version to 2.0.0 [skip ci] github-actions[bot] 2024-10-16 19:48:06 +0000
  • 7d3be0edeb
    feat!: Docling v2 (#117) Christoph Auer 2024-10-16 21:02:03 +0200
  • d504432c1e
    docs: introduce docs site (#141) Panos Vagenas 2024-10-14 14:13:13 +0200
  • 2b1e72d327
    refactor: fix type of tesseractocr options (#140) Michele Dolfi 2024-10-14 08:40:22 +0200
  • 4672b24c1a chore: bump version to 1.20.0 [skip ci] github-actions[bot] 2024-10-11 13:48:02 +0000
  • 5e4944f15f
    feat: new experimental docling-parse v2 backend (#131) Christoph Auer 2024-10-11 15:12:49 +0200
  • 2ec39636f0 chore: bump version to 1.19.1 [skip ci] github-actions[bot] 2024-10-11 08:52:09 +0000
  • dae2a3b667
    fix: remove stderr from tesseract cli and introduce fuzziness in the text validation of OCR tests (#138) Nikos Livathinos 2024-10-11 10:21:19 +0200
  • 5f1bd9e9c8
    docs: simplify LlamaIndex example using Docling extension (#135) Panos Vagenas 2024-10-09 22:17:56 +0200
  • 6924999f1f
    chore: explicitly manage pandas dependency (#134) Panos Vagenas 2024-10-09 14:50:39 +0200
  • 0ffc1708d2 chore: bump version to 1.19.0 [skip ci] github-actions[bot] 2024-10-08 17:42:29 +0000
  • f96ea86a00
    feat: add options for choosing OCR engines (#118) Michele Dolfi 2024-10-08 19:07:08 +0200
  • d412c363d7
    fixed unload pdf backend resources (#129) Fasal Shah 2024-10-08 14:16:43 +0530
  • 9b82ae3324 chore: bump version to 1.18.0 [skip ci] github-actions[bot] 2024-10-03 17:16:00 +0000
  • 2422f706a1
    feat: new torch-based docling models (#120) Maxim Lysak 2024-10-03 18:42:33 +0200
  • 9ebbbc1245 chore: bump version to 1.17.0 [skip ci] github-actions[bot] 2024-10-03 13:44:52 +0000
  • dde0aff8bd
    update examples (#123) Rui Dias Gomes 2024-10-03 13:28:25 +0100
  • d44c62d7ce
    feat: windows support (#122) Michele Dolfi 2024-10-03 14:23:47 +0200
  • cde671cf34 chore: bump version to 1.16.1 [skip ci] github-actions[bot] 2024-09-27 14:36:40 +0000
  • 34bd887a7f
    fix: allow usage of opencv 4.6.x (#110) Michele Dolfi 2024-09-27 15:51:43 +0200
  • c05b692d69
    docs: document chunking (#111) Panos Vagenas 2024-09-27 11:16:04 +0200
  • 6760571fe1 chore: bump version to 1.16.0 [skip ci] github-actions[bot] 2024-09-27 06:21:15 +0000
  • d6df76f90b
    feat: Support tableformer model choice (#90) Christoph Auer 2024-09-26 21:37:08 +0200
  • 39977b5631
    chore: move examples extras to respective group (#103) Panos Vagenas 2024-09-25 15:47:48 +0200
  • 3dfd02a7e9 chore: bump version to 1.15.0 [skip ci] github-actions[bot] 2024-09-24 15:58:16 +0000
  • 6a03c208ec
    feat: add figure in markdown (#98) Michele Dolfi 2024-09-24 17:28:23 +0200
  • 001d214a13 chore: bump version to 1.14.0 [skip ci] github-actions[bot] 2024-09-24 13:38:23 +0000
  • d96b96c848
    fix: fix OCR setting for pypdfium, minor refactor (#102) Panos Vagenas 2024-09-24 14:36:00 +0200
  • f8f2303348
    docs: document CLI, minor README revamp (#100) Panos Vagenas 2024-09-24 09:21:28 +0200
  • f555815343
    chore: add RAG notebook titles (#101) Panos Vagenas 2024-09-24 09:17:46 +0200
  • 3c46e4266c
    feat: add URL support to CLI (#99) Panos Vagenas 2024-09-24 08:47:53 +0200
  • c65a01c9b7 chore: bump version to 1.13.1 [skip ci] github-actions[bot] 2024-09-23 19:04:01 +0000
  • 4794ce460a
    fix: updated the render_as_doctags with the new arguments from docling-core (#93) Peter W. J. Staar 2024-09-23 20:12:18 +0200
  • dce9934a0f
    Updated to new, clean vector logo, svg and rendered png are provided (#96) Maxim Lysak 2024-09-23 15:31:21 +0200
  • 1f4b224ab6
    chore: switch to gh apps user (#92) Michele Dolfi 2024-09-20 17:02:27 +0200
  • 6dd1e91c4a chore: bump version to 1.13.0 [skip ci] github-actions[bot] 2024-09-18 09:26:03 +0000
  • 0da7519896
    docs: updated Docling logo.png with transparent background (#88) Maxim Lysak 2024-09-18 10:39:11 +0200
  • f19bd43798
    feat: add table exports (#86) Michele Dolfi 2024-09-18 08:44:13 +0200
  • 442443a102
    fix: bumped the glm version and adjusted the tests (#83) Peter W. J. Staar 2024-09-18 07:43:49 +0200
  • 8242bce4fa chore: bump version to 1.12.2 [skip ci] github-actions[bot] 2024-09-17 16:01:34 +0000
  • fa9699fa3c
    fix(tests): Adjust the test data to match the new version of LayoutPredictor (#82) Nikos Livathinos 2024-09-17 15:50:35 +0200
  • 30a0ef69b4
    chore: Add PR template (#81) Michele Dolfi 2024-09-16 18:36:26 +0200
  • f1932fd8c5 chore: bump version to 1.12.1 [skip ci] github-actions[bot] 2024-09-16 10:58:09 +0000
  • 2870fdc857
    fix: CLI compatibility with python 3.10 and 3.11 (#79) Michele Dolfi 2024-09-16 12:32:45 +0200
  • 34b2772a2e chore: bump version to 1.12.0 [skip ci] github-actions[bot] 2024-09-13 12:34:15 +0000
  • 98990784df
    feat: add docling cli (#75) Peter W. J. Staar 2024-09-13 14:03:09 +0200
  • 8aa476ccd3
    test: improve typing definitions (part 1) (#72) Michele Dolfi 2024-09-12 15:56:29 +0200
  • 53569a1023
    docs: showcase RAG with LlamaIndex and LangChain (#71) Panos Vagenas 2024-09-11 15:07:08 +0200
  • 79932b7d69
    test: check for stable obj_type (#70) Michele Dolfi 2024-09-11 12:53:59 +0200
  • e66dc53765 chore: bump version to 1.11.0 [skip ci] github-actions[bot] 2024-09-10 16:18:59 +0000
  • bdfdfbf092
    feat: adding txt and doctags output (#68) Peter W. J. Staar 2024-09-10 17:30:52 +0200
  • cd5b6293cc chore: bump version to 1.10.0 [skip ci] github-actions[bot] 2024-09-10 14:38:07 +0000
  • 27a7a152e1
    feat: linux arm64 support and reducing dependencies (#69) Michele Dolfi 2024-09-10 15:43:27 +0200
  • 1051eb9465
    chore: update README (#65) Panos Vagenas 2024-09-09 12:03:04 +0200
  • 6f1811e050
    chore: fix placeholders in license (#63) Michele Dolfi 2024-09-06 17:10:07 +0200
  • d3711437f6 chore: bump version to 1.9.0 [skip ci] github-actions[bot] 2024-09-03 13:33:40 +0000
  • 1de2e4f924
    feat: export document pages as multimodal output (#54) Michele Dolfi 2024-09-03 15:05:35 +0200
  • 69e5d951a3
    docs: Update MAINTAINERS.md (#59) Christoph Auer 2024-09-02 12:34:38 +0200
  • 85b7348846
    docs: Mention quackling on README (#58) Christoph Auer 2024-09-02 12:27:29 +0200
  • 66ed096c40 chore: bump version to 1.8.5 [skip ci] github-actions[bot] 2024-08-30 12:37:54 +0000
  • 48f4d1ba52
    fix: Add unit tests (#51) Peter W. J. Staar 2024-08-30 14:08:20 +0200
  • 256f4d504e chore: bump version to 1.8.4 [skip ci] github-actions[bot] 2024-08-30 08:47:57 +0000
  • de85e46ced
    fix: propagate row_section in tables (#57) Michele Dolfi 2024-08-30 10:36:00 +0200
  • a8a60d52b1
    docs: add instructions for cpu-only installation (#56) Michele Dolfi 2024-08-30 10:20:21 +0200
  • 5c46749e70 chore: bump version to 1.8.3 [skip ci] github-actions[bot] 2024-08-28 10:37:38 +0000
  • f49ee825c3
    fix: table cells overlap and model warnings (#53) Michele Dolfi 2024-08-28 12:30:42 +0200
  • d0403aaebf chore: bump version to 1.8.2 [skip ci] github-actions[bot] 2024-08-27 09:53:15 +0000
  • e46a66a176
    fix: refine conversion result (#52) Panos Vagenas 2024-08-27 11:50:43 +0200
  • fe817b11d7
    docs: update interface in README (#50) Michele Dolfi 2024-08-26 15:36:39 +0200
  • 7052bee999 chore: bump version to 1.8.1 [skip ci] github-actions[bot] 2024-08-26 11:55:37 +0000
  • 8cc147bc56
    fix: align output formats (#49) Michele Dolfi 2024-08-26 13:30:26 +0200
  • 053eae4bdf chore: bump version to 1.8.0 [skip ci] github-actions[bot] 2024-08-23 14:24:04 +0000
  • a294b7e64a
    feat: Page-level error reporting from PDF backend, introduce PARTIAL_SUCCESS status (#47) Christoph Auer 2024-08-23 16:18:41 +0200
  • 3226b20779 chore: bump version to 1.7.1 [skip ci] github-actions[bot] 2024-08-23 11:56:02 +0000
  • 8808463cec
    fix: Better raise exception when a page fails to parse (#46) Christoph Auer 2024-08-23 13:51:42 +0200