Document parsing that never loses provenance: Markdown + JSON output where every block knows its source page, section, and location.

0 stars, 0 forks, 0 watchers; Python; Apache License 2.0
ai-agents citations document-parsing docx llm markdown pdf provenance python rag
4 open issues need help. Last updated: Jul 5, 2026

Open Issues Need Help

Add RTF parser (about 2 hours ago)
good first issue

good first issue

good first issue

Add EPUB parser (about 2 hours ago)
good first issue
