Repository History
Explore all analyzed open source repositories

gptpdf: Effortlessly Parse PDFs into Markdown with GPT-4o
gptpdf is a powerful Python library that leverages large visual models like GPT-4o to accurately parse PDF documents into clean Markdown format. With just 293 lines of code, it excels at preserving typography, math formulas, tables, and images. This tool offers an efficient and cost-effective solution for converting complex PDFs.

Ollama-OCR: Advanced OCR with Vision Language Models via Ollama
Ollama-OCR is a robust Python package and Streamlit application for Optical Character Recognition. It leverages state-of-the-art vision language models, accessible through Ollama, to accurately extract text from both images and PDF documents. The tool offers extensive features including support for multiple models, various output formats, and batch processing capabilities.