A local-first document management system built with Flask, SQLAlchemy, and Postgres+pgvector. Supports OCR, semantic search, and LLM-powered Q&A for academic research.

Knowledge Hub is a document management system I built to handle my MS in CS coursework at USC. The goal was simple: make it easy to search through course materials, research papers, and notes without digging through folders. It runs on Flask and PostgreSQL with the pgvector extension, so it supports both regular full-text search and vector-based semantic search. I added OCR processing for PDFs and images, document chunking, and question answering over uploaded docs through a local LLM served by Ollama. The whole thing is containerized with Docker. It has made finding relevant information across my coursework much faster than searching by hand.
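To make the document chunking step concrete, here is a minimal sketch of one common approach: splitting extracted text into fixed-size word windows that overlap slightly, so a sentence cut at a boundary still appears whole in one chunk. The function name `chunk_text` and the `chunk_size` and `overlap` defaults are assumptions for illustration, not necessarily what Knowledge Hub uses.

```python
from __future__ import annotations


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks that share `overlap` words with the previous chunk.

    Hypothetical helper: the sizes are illustrative defaults, not tuned values.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Each chunk would then be embedded and stored alongside a reference to its source document, which is what lets answers point back to where they came from.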


Upload, store, and organize documents with automatic metadata extraction and categorization
Automatic text extraction from PDFs and images using OpenCV, PyMuPDF, and Tesseract
Vector-based similarity search using pgvector and Sentence-Transformers for intelligent content discovery
RAG-powered Q&A system with local LLM integration for contextual answers with citations
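The semantic search and Q&A features above follow the usual retrieval-augmented pattern: rank stored chunks by distance to the question's embedding, then hand the best matches to the LLM as numbered sources. The sketch below shows that flow in plain Python so it runs without a database or model. In the real system the ranking would presumably happen inside Postgres (pgvector's `<=>` operator computes cosine distance) and the embeddings would come from Sentence-Transformers. The `Chunk` fields, function names, and prompt wording are all assumptions for illustration.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical record shape; the actual schema may differ.
    doc_title: str
    text: str
    embedding: list[float]


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity, the same quantity pgvector's `<=>` computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0  # sketch-only choice: treat a zero vector as unrelated
    return 1.0 - dot / norm


def top_k(query_embedding: list[float], chunks: list[Chunk], k: int = 3) -> list[Chunk]:
    """Return the k chunks closest to the query, nearest first."""
    ranked = sorted(chunks, key=lambda c: cosine_distance(query_embedding, c.embedding))
    return ranked[:k]


def build_prompt(question: str, retrieved: list[Chunk]) -> str:
    """Assemble a prompt that asks the model to cite the numbered sources."""
    sources = "\n".join(
        f"[{i}] ({c.doc_title}) {c.text}" for i, c in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the numbered sources below. "
        "Cite sources as [n].\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting prompt string is what would be sent to the local model. Numbering the sources in the prompt is one simple way to get citations back, since the model can refer to `[1]`, `[2]`, and so on, and each number maps to a known document.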


Technical documentation covering architecture, implementation details, and design decisions.