---
license: apache-2.0
title: Long Context Caching Gemini PDF QA
sdk: docker
emoji: 📄
colorFrom: yellow
---
# 📄 Smart Document Analysis Platform

A modern web application that leverages the Google Gemini API's caching capabilities to provide efficient document analysis. Upload documents once, ask questions forever!

## 🚀 Features

- **Document Upload**: Upload PDF files via drag-and-drop or URL
- **Gemini API Caching**: Documents are cached using Gemini's explicit caching feature
- **Cost-Effective**: Save on API costs by reusing cached document tokens
- **Real-time Chat**: Ask multiple questions about your documents
- **Beautiful UI**: Modern, responsive design with smooth animations
- **Token Tracking**: See how many tokens are cached for cost transparency
- **Smart Error Handling**: Graceful handling of small documents that don't meet caching requirements

## 🎯 Use Cases

This platform is perfect for:

- **Research Analysis**: Upload research papers and ask detailed questions
- **Legal Document Review**: Analyze contracts, legal documents, and policies
- **Academic Studies**: Study course materials and textbooks
- **Business Reports**: Analyze quarterly reports, whitepapers, and presentations
- **Technical Documentation**: Review manuals, specifications, and guides

## ⚡️ Deploy on Hugging Face Spaces

You can deploy this app on [Hugging Face Spaces](https://huggingface.co/spaces) using the **Docker** SDK.

### 1. **Select Docker SDK**
- When creating your Space, choose **Docker** (not Gradio, not Static).

### 2. **Project Structure**
Make sure your repo includes:
- `app.py` (Flask app)
- `requirements.txt`
- `Dockerfile`
- `.env.example` (for reference, do not include secrets)

### 3. **Dockerfile**
A sample Dockerfile is provided:
```dockerfile
FROM python:3.10-slim
WORKDIR /app
RUN apt-get update && apt-get install -y build-essential && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
```

### 4. **Port Configuration**
The app will run on the port provided by the `PORT` environment variable (default 7860), as required by Hugging Face Spaces.
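A minimal sketch of how `app.py` can resolve that port (the helper name is illustrative, not from the actual codebase):

```python
import os

def resolve_port(env):
    """Return the port Spaces injects via PORT, defaulting to 7860."""
    return int(env.get("PORT", "7860"))

port = resolve_port(os.environ)
# Then, at the bottom of app.py:
#   app.run(host="0.0.0.0", port=port)
```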

### 5. **Set Environment Variables**
- In your Space settings, add your `GOOGLE_API_KEY` as a secret environment variable.

### 6. **Push to Hugging Face**
- Push your code to the Space's Git repository.
- The build and deployment will happen automatically.

---

## 📋 Prerequisites

- Python 3.8 or higher
- Google Gemini API key
- Internet connection for API calls

## 🔧 Local Installation

1. **Clone the repository**
```bash
git clone <repository-url>
cd smart-document-analysis
```

2. **Install dependencies**
```bash
pip install -r requirements.txt
```

3. **Set up environment variables**
```bash
cp .env.example .env
```

Edit `.env` and add your Google Gemini API key:
```
GOOGLE_API_KEY=your_actual_api_key_here
```

4. **Get your API key**
- Visit [Google AI Studio](https://makersuite.google.com/app/apikey)
- Create a new API key
- Copy it to your `.env` file
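Assuming `app.py` reads the key from the environment (for example after `python-dotenv` has loaded `.env`), a fail-fast lookup might look like this; the helper name is illustrative:

```python
def read_api_key(env):
    """Fetch the Gemini API key, failing fast with a clear message."""
    key = env.get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; see .env.example")
    return key
```

Failing at startup rather than on the first upload makes a missing key much easier to diagnose.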

## 🚀 Running the Application Locally

1. **Start the server**
```bash
python app.py
```

2. **Open your browser**
Navigate to `http://localhost:7860`

3. **Upload a document**
- Drag and drop a PDF file, or
- Click to select a file, or
- Provide a URL to a PDF

4. **Start asking questions**
Once your document is cached, you can ask unlimited questions!

## 💡 How It Works

### 1. Document Upload
When you upload a PDF, the application:
- Uploads the file to Gemini's File API
- Checks if the document meets the minimum token requirement (4,096 tokens)
- If eligible, creates a cache with the document content
- If too small, returns a helpful error message with suggestions
- Stores cache metadata locally
- Returns a cache ID for future reference
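The upload-and-cache flow above can be sketched with the `google-genai` SDK. The function name, file path, and system instruction below are placeholders, and the SDK is imported lazily so the sketch can be read without it installed:

```python
def create_document_cache(pdf_path, api_key,
                          model="models/gemini-2.0-flash-001"):
    """Upload a PDF to the File API and create an explicit cache for it."""
    # Lazy import: the SDK is only needed when the function actually runs.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=api_key)
    document = client.files.upload(file=pdf_path)  # Gemini File API
    cache = client.caches.create(                  # explicit caching
        model=model,
        config=types.CreateCachedContentConfig(
            system_instruction="Answer questions about the uploaded document.",
            contents=[document],
        ),
    )
    return cache.name  # cache ID to store for later questions
```

The `caches.create` call fails if the document is below the minimum cacheable token count, which is where the small-document error handling hooks in.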

### 2. Question Processing
When you ask a question:
- The question is sent to the Gemini API
- The cached document content is automatically included
- You only pay for the question tokens, not the document tokens
- Responses are generated based on the cached content
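A corresponding sketch for answering a question against an existing cache, again with a lazy import and illustrative names:

```python
def ask_cached(question, cache_name, api_key,
               model="models/gemini-2.0-flash-001"):
    """Generate an answer grounded in a previously cached document."""
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=api_key)
    response = client.models.generate_content(
        model=model,
        contents=question,
        # Attach the cached document by name; its tokens are billed at
        # the cached rate instead of being resent on every request.
        config=types.GenerateContentConfig(cached_content=cache_name),
    )
    return response.text
```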

### 3. Cost Savings
- **Without caching**: You pay for document tokens + question tokens every time
- **With caching**: You pay for document tokens once + question tokens for each question

## 📚 API Endpoints

- `GET /` - Main application interface
- `POST /upload` - Upload PDF file
- `POST /upload-url` - Upload PDF from URL
- `POST /ask` - Ask question about cached document
- `GET /caches` - List all cached documents
- `DELETE /cache/<cache_id>` - Delete specific cache

## 📊 Cost Analysis

### Example Scenario
- Document: 10,000 tokens
- Question: 50 tokens
- 10 questions asked

**Without Caching:**
- Cost = (10,000 + 50) × 10 = 100,500 tokens

**With Caching:**
- Cost = 10,000 + (50 × 10) = 10,500 tokens
- **Savings: 90% cost reduction!**
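The arithmetic above can be checked with a simplified cost model that, like the example, counts the document tokens only once when caching is used (real billing also charges a reduced per-token rate for cached tokens plus cache storage):

```python
def total_tokens(doc_tokens, question_tokens, n_questions, cached):
    """Total billed tokens for n questions, with or without caching."""
    if cached:
        # Document tokens counted once, question tokens per question.
        return doc_tokens + question_tokens * n_questions
    # Document tokens resent with every question.
    return (doc_tokens + question_tokens) * n_questions

without_cache = total_tokens(10_000, 50, 10, cached=False)  # 100,500
with_cache = total_tokens(10_000, 50, 10, cached=True)      # 10,500
savings = 1 - with_cache / without_cache                    # ~0.90
```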

### Token Requirements
- **Minimum for caching**: 4,096 tokens
- **Recommended minimum**: 5,000 tokens for cost-effectiveness
- **Optimal range**: 10,000 - 100,000 tokens
- **Maximum**: Model-specific limits (check Gemini API docs)

## 🎨 Customization

### Changing the Model
Edit `app.py` and change the model name:
```python
model="models/gemini-2.0-flash-001"  # Current
model="models/gemini-2.0-pro-001"    # Alternative
```

### Custom System Instructions
Modify the system instruction in the cache creation:
```python
system_instruction="Your custom instruction here"
```

### Cache TTL
Add TTL configuration to cache creation (the `ttl` field takes a duration in seconds with an `s` suffix):
```python
config=types.CreateCachedContentConfig(
    system_instruction=system_instruction,
    contents=[document],
    ttl='86400s'  # Cache for 24 hours
)
```

## 🔒 Security Considerations

- API keys are stored in environment variables
- File uploads are validated for PDF format
- Cached content is managed securely through the Gemini API
- No sensitive data is stored locally

## 🚧 Production Deployment

For production deployment:

1. **Use a production WSGI server**
```bash
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:7860 app:app
```

2. **Add database storage**
- Replace in-memory storage with PostgreSQL/MySQL
- Add user authentication
- Implement session management

3. **Add monitoring**
- Log API usage and costs
- Monitor cache hit rates
- Track user interactions

4. **Security enhancements**
- Add rate limiting
- Implement file size limits
- Add input validation

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request

## 📝 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

## 🙏 Acknowledgments

- Google Gemini API for providing the caching functionality
- Flask community for the excellent web framework
- The open-source community for inspiration and tools

## 📞 Support

If you encounter any issues:

1. Check the [Gemini API documentation](https://ai.google.dev/docs)
2. Verify your API key is correct
3. Ensure your PDF files are valid
4. Check the browser console for JavaScript errors
5. **For small document errors**: Upload a larger document or combine multiple documents

## 🔮 Future Enhancements

- [ ] Support for multiple file formats (Word, PowerPoint, etc.)
- [ ] User authentication and document sharing
- [ ] Advanced analytics and usage tracking
- [ ] Integration with cloud storage (Google Drive, Dropbox)
- [ ] Mobile app version
- [ ] Multi-language support
- [ ] Advanced caching strategies
- [ ] Real-time collaboration features
- [ ] Document preprocessing to meet token requirements
- [ ] Batch document processing