gemma / README.md
w1r4
readme2
37a4db0

A newer version of the Gradio SDK is available: 6.6.0

Upgrade
metadata
title: Gemma-2 Multimodal Chat
emoji: ๐Ÿš€
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false

๐Ÿš€ Gemma-2 Multimodal Chat Application

A sophisticated Gradio-based chat application featuring multimodal capabilities with Google's Gemma-2 model.

โœจ Features

  • ๐Ÿ’ฌ Interactive Chat Interface: Persistent conversation history with context awareness
  • ๐Ÿ–ผ๏ธ Vision Capabilities: Upload and analyze images with AI-powered insights
  • ๐Ÿ“„ File Processing: Support for PDF and TXT file uploads with text extraction
  • ๐Ÿง  Contextual Responses: Maintains conversation context for follow-up questions
  • ๐ŸŽจ Modern UI: Clean, responsive interface built with Gradio
  • ๐Ÿ”„ State Management: Persistent chat history and file context across interactions

๐Ÿ› ๏ธ Technologies Used

  • Frontend: Gradio 4.0+
  • AI Model: Google's Gemma-2-2B-IT
  • File Processing: PyPDF2 for PDFs, PIL for images
  • Backend: Python with Hugging Face Transformers
  • Deployment: Hugging Face Spaces

๐Ÿš€ Quick Start

Local Development

  1. Clone the repository:

    git clone <repository-url>
    cd gemma
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Run the application:

    python app.py
    
  4. Open your browser and navigate to http://localhost:7860

Hugging Face Spaces Deployment

  1. Create a new Space on Hugging Face Spaces
  2. Choose "Gradio" as the SDK
  3. Upload the files from this repository
  4. The app will automatically deploy and be accessible via your Space URL

๐Ÿ“– How to Use

Basic Chat

  1. Type your message in the text input box
  2. Click "Submit" or press Enter
  3. View the AI response in the chat history

Image Analysis

  1. Upload an image using the image upload component
  2. Type a question about the image (e.g., "What do you see in this image?")
  3. Submit to get AI-powered image analysis

File Processing

  1. Upload a PDF or TXT file using the file upload component
  2. Ask questions about the file content
  3. The extracted text will be used as context for responses

Advanced Features

  • Persistent Context: Previous conversations are remembered
  • File Context: Uploaded file content persists for follow-up questions
  • Clear Chat: Reset conversation history and uploaded files

๐Ÿ”ง Configuration

Model Configuration

The application uses Google's Gemma-2-2B-IT model from Hugging Face. The model is loaded and used for inference in the gemma_3_inference function in app.py.

Customization

  • Modify the UI theme in the gr.Blocks configuration
  • Adjust file size limits and supported formats
  • Customize the chat history display format
  • Add additional file processing capabilities

๐Ÿ“ Project Structure

gemma/
โ”œโ”€โ”€ .gitattributes            # Git configuration
โ”œโ”€โ”€ .gitignore                # Git ignore file
โ”œโ”€โ”€ .huggingface/             # Hugging Face configuration
โ”‚   โ””โ”€โ”€ CODEOWNERS            # Space ownership configuration
โ”œโ”€โ”€ app.py                    # Main Gradio application
โ”œโ”€โ”€ app_config.yaml           # Hugging Face Space configuration
โ”œโ”€โ”€ HUGGINGFACE_DEPLOYMENT.md # Deployment instructions
โ”œโ”€โ”€ push_to_huggingface.bat   # Windows deployment script
โ”œโ”€โ”€ push_to_huggingface.py    # Python deployment script
โ”œโ”€โ”€ README.md                 # Project documentation (with Space config)
โ”œโ”€โ”€ README.space.md           # Hugging Face Space README
โ””โ”€โ”€ requirements.txt          # Python dependencies

๐Ÿ”ฎ Future Enhancements

  • Upgrade to Gemma-3 model when available
  • Support for additional file formats (DOCX, XLSX)
  • Advanced image processing capabilities
  • User authentication and personalized chat history
  • Export chat conversations
  • Multi-language support
  • Voice input/output capabilities

๐Ÿค Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Google for the Gemma model family
  • Hugging Face for the amazing ecosystem and Spaces platform
  • Gradio team for the intuitive UI framework

๐Ÿ“ž Support

If you encounter any issues or have questions, please open an issue on the repository or contact the maintainers.


Note: This application uses Google's Gemma-2-2B-IT model. The model doesn't have native vision capabilities, but the application is designed to handle image uploads with appropriate messaging.