10 Commits

Author SHA1 Message Date
Michael Honaker
e79e4f0ab6
fix(markdown): make parsing of rich table cells valid (#1821)
* fix: update md table classification
* Fix ground truth header changes
* Fix merge issues
* Fix minor ground truth errors

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
2025-06-26 19:50:45 +02:00
Panos Vagenas
7c5614a37a
fix(markdown): fix single-formatted headings & list items (#1820)
* fix(markdown): fix formatting & inline edge cases (show behavior before change)
* add change and updated test data
* update lock
* improve test case

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-06-25 13:05:06 +02:00
Panos Vagenas
861abcdcb0
feat(markdown): add formatting & improve inline support (#1804)
feat(markdown): support formatting & hyperlinks

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-06-18 15:57:57 +02:00
Panos Vagenas
9210812bfa
fix: improve HTML layer detection, various MD fixes (#1241)
Markdown fixes:
- properly propagate section header levels
- improve handling of list subroots without text

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
2025-03-26 16:07:14 +01:00
Panos Vagenas
90b766e2ae
fix(markdown): handle nested lists (#910)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2025-02-07 12:55:12 +01:00
Panos Vagenas
5ac2887e4a
fix(markdown): fix parsing if doc ending with table (#873)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2025-02-03 14:38:38 +01:00
Panos Vagenas
94751a78f4
fix(markdown): add support for HTML content (#855)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2025-02-03 12:21:05 +01:00
Panos Vagenas
bccb022fc8
fix(markdown): fix empty block handling (#843)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2025-01-30 16:22:29 +01:00
Panos Vagenas
5aed9f8aeb
fix: fix single newline handling in MD backend (#824)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2025-01-28 19:05:55 +01:00
Panos Vagenas
c8ecdd987e
feat: expose new hybrid chunker, update docs (#384)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-12-09 08:28:29 +01:00