Open Issues Need Help
Table detection in PDF files (2 months ago)
bug, help wanted, good first issue
Goldziher/kreuzberg
2.3K
Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
Python
#async #document-intelligence #mcp #metadata-extraction #ocr #pandoc #pdf-extraction #pdfium #python #rag #table-extraction #tesseract #text-extraction