Tag: pdf
All the articles with the tag "pdf".
Fast PDF Text Extraction for Embeddings - Switching from Unstructured to PyMuPDF
Processing 200+ audio equipment PDF manuals for embeddings exposed a significant performance bottleneck in text extraction. Switching from the unstructured library to PyMuPDF cut extraction time from 45 minutes to under 2 minutes.
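
As a rough illustration of what the PyMuPDF side of such a switch might look like, here is a minimal sketch. It is not the article's actual code: the function names `join_pages`, `extract_pdf_text`, and `extract_directory`, and the idea of keeping all manuals in one directory, are assumptions made for the example. It relies only on PyMuPDF's `fitz.open` and `page.get_text()`.

```python
from pathlib import Path


def join_pages(page_texts):
    """Join per-page text, trimming whitespace and skipping empty pages."""
    cleaned = (text.strip() for text in page_texts)
    return "\n\n".join(text for text in cleaned if text)


def extract_pdf_text(pdf_path):
    """Return the plain text of one PDF (hypothetical helper, not from the article)."""
    # PyMuPDF is imported lazily so join_pages can be used without it installed.
    import fitz  # PyMuPDF

    with fitz.open(str(pdf_path)) as doc:
        # get_text() with no arguments returns the page's plain text.
        return join_pages(page.get_text() for page in doc)


def extract_directory(manual_dir):
    """Map each PDF file name in manual_dir (an assumed layout) to its text."""
    pdf_paths = sorted(Path(manual_dir).glob("*.pdf"))
    return {path.name: extract_pdf_text(path) for path in pdf_paths}
```

The resulting strings would still need to be chunked before being passed to an embedding model; that step is outside the scope of this sketch.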