Spaces:

w1r4
/

gemma

Sleeping

App Files Files Community

gemma / README.md

w1r4

readme2

37a4db0 8 months ago

preview code

raw

history blame contribute delete

4.87 kB

A newer version of the Gradio SDK is available: 6.6.0

Upgrade

metadata

title: Gemma-2 Multimodal Chat
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false

🚀 Gemma-2 Multimodal Chat Application

A sophisticated Gradio-based chat application featuring multimodal capabilities with Google's Gemma-2 model.

✨ Features

💬 Interactive Chat Interface: Persistent conversation history with context awareness
🖼️ Vision Capabilities: Upload and analyze images with AI-powered insights
📄 File Processing: Support for PDF and TXT file uploads with text extraction
🧠 Contextual Responses: Maintains conversation context for follow-up questions
🎨 Modern UI: Clean, responsive interface built with Gradio
🔄 State Management: Persistent chat history and file context across interactions

🛠️ Technologies Used

Frontend: Gradio 4.0+
AI Model: Google's Gemma-2-2B-IT
File Processing: PyPDF2 for PDFs, PIL for images
Backend: Python with Hugging Face Transformers
Deployment: Hugging Face Spaces

🚀 Quick Start

Local Development

Clone the repository:
```
git clone <repository-url>
cd gemma
```
Install dependencies:
```
pip install -r requirements.txt
```
Run the application:
```
python app.py
```
Open your browser and navigate to http://localhost:7860

Hugging Face Spaces Deployment

Create a new Space on Hugging Face Spaces
Choose "Gradio" as the SDK
Upload the files from this repository
The app will automatically deploy and be accessible via your Space URL

📖 How to Use

Basic Chat

Type your message in the text input box
Click "Submit" or press Enter
View the AI response in the chat history

Image Analysis

Upload an image using the image upload component
Type a question about the image (e.g., "What do you see in this image?")
Submit to get AI-powered image analysis

File Processing

Upload a PDF or TXT file using the file upload component
Ask questions about the file content
The extracted text will be used as context for responses

Advanced Features

Persistent Context: Previous conversations are remembered
File Context: Uploaded file content persists for follow-up questions
Clear Chat: Reset conversation history and uploaded files

🔧 Configuration

Model Configuration

The application uses Google's Gemma-2-2B-IT model from Hugging Face. The model is loaded and used for inference in the gemma_3_inference function in app.py.

Customization

Modify the UI theme in the gr.Blocks configuration
Adjust file size limits and supported formats
Customize the chat history display format
Add additional file processing capabilities

📁 Project Structure

gemma/
├── .gitattributes            # Git configuration
├── .gitignore                # Git ignore file
├── .huggingface/             # Hugging Face configuration
│   └── CODEOWNERS            # Space ownership configuration
├── app.py                    # Main Gradio application
├── app_config.yaml           # Hugging Face Space configuration
├── HUGGINGFACE_DEPLOYMENT.md # Deployment instructions
├── push_to_huggingface.bat   # Windows deployment script
├── push_to_huggingface.py    # Python deployment script
├── README.md                 # Project documentation (with Space config)
├── README.space.md           # Hugging Face Space README
└── requirements.txt          # Python dependencies

🔮 Future Enhancements

Upgrade to Gemma-3 model when available
Support for additional file formats (DOCX, XLSX)
Advanced image processing capabilities
User authentication and personalized chat history
Export chat conversations
Multi-language support
Voice input/output capabilities

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Google for the Gemma model family
Hugging Face for the amazing ecosystem and Spaces platform
Gradio team for the intuitive UI framework

📞 Support

If you encounter any issues or have questions, please open an issue on the repository or contact the maintainers.

Note: This application uses Google's Gemma-2-2B-IT model. The model doesn't have native vision capabilities, but the application is designed to handle image uploads with appropriate messaging.