· 6 min read

Parsing printed text data is easy, but parsing data from PDFs, scans, and images is hard because they come in varying formats and quality, and may include handwriting. To tackle this challenge, we rely on computer vision and optical character recognition (OCR). Let's explore the most popular OCR text extraction tools and compare them.

· 8 min read

In the previous tutorial, we learned how to build a document search tool with Node.js and Pinecone that lets us query private data the way you would with ChatGPT. In this tutorial, we will learn how to do the same with Python and Faiss. We will also use Streamlit to quickly spin up a simple user interface where we can ask questions and get answers interactively.

· 2 min read

We will learn how to run a model from the Hugging Face Hub locally on our machine and ask it questions just like you would with ChatGPT. If you are worried about the privacy of models hosted by someone else, this is a good place to start! Hosting your own model on your machine gives you full control.