CraigRoberts15's picture
Update README.md
de5693d verified

A newer version of the Gradio SDK is available: 6.1.0

Upgrade
metadata
title: Business Intelligence Dashboard
emoji: πŸ“Š
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 3.50.2
app_file: app.py
pinned: false

πŸ“Š Business Intelligence Dashboard

A professional, interactive Business Intelligence dashboard built with Gradio that enables non-technical stakeholders to explore and analyze business data.

🌟 Features

πŸ“‚ Data Management

  • Pre-loaded Datasets: Online Retail and Airbnb datasets included
  • Custom Upload: Support for CSV, Excel (.xlsx, .xls), JSON, and Parquet files (max 50MB)
  • Automatic Data Cleaning: Handles missing values, type conversions, and duplicate removal
  • Data Validation: Comprehensive error handling and user-friendly error messages

πŸ“ˆ Statistics & Profiling

  • Automated Data Profiling: Get instant insights into your dataset
  • Numerical Summary: Mean, median, std deviation, quartiles, min/max
  • Categorical Analysis: Unique values, value counts, mode
  • Missing Values Report: Identify data quality issues
  • Correlation Matrix: Visual correlation heatmap for numerical features

πŸ” Interactive Filtering

  • Dynamic Filters: Filter by numerical ranges, categorical values, or date ranges
  • Real-time Updates: See row counts update as you apply filters
  • Multiple Filters: Combine multiple filters for precise data exploration
  • Filter Management: Easy to add, view, and clear filters

πŸ“‰ Smart Visualizations

  • AI-Powered Recommendations: Get intelligent visualization suggestions based on your data
  • One-Click Creation: Create recommended visualizations with a single click
  • 5 Visualization Types:
    • Time Series Plots (with aggregation: sum, mean, count, median)
    • Distribution Plots (histogram, box plot)
    • Category Analysis (bar chart, pie chart)
    • Scatter Plots (with color coding and trend lines)
    • Correlation Heatmap
  • Dual Backend: Supports both Matplotlib and Plotly
  • Customization: Full control over columns, aggregations, and visual parameters

πŸ’‘ Automated Insights

  • Top/Bottom Performers: Identify highest and lowest values
  • Trend Analysis: Detect patterns over time with growth rate and volatility
  • Anomaly Detection: Find outliers using Z-score or IQR methods
  • Distribution Analysis: Understand data distributions with skewness and kurtosis
  • Correlation Insights: Discover strong relationships between variables

πŸ’Ύ Export Capabilities

  • Data Export: Export filtered data as CSV or Excel
  • Visualization Export: Save charts as PNG images

πŸ—οΈ Architecture & Design

SOLID Principles Implementation

  • Single Responsibility: Each class has one clear purpose
  • Open/Closed: Extensible through Strategy Pattern without modifying existing code
  • Liskov Substitution: All strategies are interchangeable
  • Interface Segregation: Specific interfaces for different operations
  • Dependency Inversion: Depends on abstractions, not concrete implementations

Design Patterns

  • Strategy Pattern: Used for data loading, visualizations, and insights
  • Facade Pattern: DataProcessor provides simple interface to complex operations
  • Factory Pattern: Dynamic strategy selection based on file type

Project Structure

Business-Intelligence-Dashboard/
β”œβ”€β”€ app.py                      # Main Gradio application with 6 tabs
β”œβ”€β”€ data_processor.py           # Data loading, cleaning, filtering (Strategy Pattern)
β”œβ”€β”€ visualizations.py           # Chart creation with multiple strategies
β”œβ”€β”€ insights.py                 # Automated insight generation
β”œβ”€β”€ utils.py                    # Utility functions and validators
β”œβ”€β”€ requirements.txt            # Python dependencies
β”œβ”€β”€ README.md                   # This file
β”œβ”€β”€ data/                       # Sample datasets
β”‚   β”œβ”€β”€ Online_Retail.xlsx
β”‚   └── Airbnb.csv
└── tests/                      # Comprehensive test suite
    β”œβ”€β”€ init.py
    β”œβ”€β”€ conftest.py
    β”œβ”€β”€ test_utils.py
    β”œβ”€β”€ test_data_processor.py
    β”œβ”€β”€ test_visualizations.py
    └── test_insights.py

πŸš€ Getting Started

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Installation

  1. Clone the repository
git clone https://github.com/CR1502/Business-Intelligence-Dashboard.git
cd Business-Intelligence-Dashboard
  1. Create a virtual environment
# On macOS/Linux
python3 -m venv venv
source venv/bin/activate

# On Windows
python -m venv venv
venv\Scripts\activate
  1. Install dependencies
pip install -r requirements.txt
  1. Run the application
python app.py

The dashboard will launch and open in your default browser at http://localhost:7860

πŸ“– Usage Guide

1. Loading Data

  • Option A: Select "Online Retail" or "Airbnb" from the dropdown
  • Option B: Upload your own dataset (CSV, Excel, JSON, or Parquet)

2. Exploring Statistics

  • Navigate to "Statistics & Profiling" tab
  • Click "Generate Data Profile" to see comprehensive statistics
  • View missing values, numerical summaries, and correlation matrix

3. Filtering Data

  • Go to "Filter & Explore" tab
  • Select filter type (Numerical, Categorical, or Date)
  • Choose column and set filter criteria
  • Click "Add Filter" and see real-time updates

4. Creating Visualizations

  • Navigate to "Visualizations" tab
  • Smart Recommendations: Click "Get Visualization Recommendations" for AI-powered suggestions
  • Custom Visualizations: Select visualization type and configure parameters
  • Supported charts: Time Series, Distribution, Category, Scatter, Correlation

5. Generating Insights

  • Go to "Insights" tab
  • Click "Generate All Insights" for automated analysis
  • Or select specific insight type for targeted analysis

6. Exporting Results

  • Navigate to "Export" tab
  • Choose format (CSV or Excel)
  • Click "Export Data" to download filtered dataset

πŸ§ͺ Testing

Run the comprehensive test suite:

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_utils.py -v

# Run with coverage
pytest tests/ --cov=. --cov-report=html

Test coverage includes:

  • 180+ test cases across all modules
  • Unit tests for all functions and classes
  • Strategy Pattern implementation tests
  • Edge case and error handling tests

πŸ› οΈ Technologies Used

  • Gradio: Web interface and interactive components
  • Pandas: Data manipulation and analysis
  • NumPy: Numerical computations
  • Matplotlib/Seaborn: Static visualizations
  • Plotly: Interactive visualizations
  • Python 3.10+: Core programming language

πŸ“Š Sample Datasets

Online Retail Dataset

  • 8 columns: InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, Country
  • Use case: E-commerce sales analysis, product trends, customer analysis

Airbnb Dataset

  • 26 columns: Including price, location, room type, reviews, availability
  • Use case: Pricing analysis, location trends, booking patterns

🀝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add docstrings to all functions
  • Include unit tests for new features
  • Update README.md for significant changes

πŸ‘¨β€πŸ’» Author

Craig Roberts

πŸ™ Acknowledgments

  • Northeastern University - CS5130 Course (Prof Lino)
  • Dataset sources: UCI ML Repository, Kaggle

⚑ Performance Notes

  • Handles datasets up to 50MB efficiently
  • Optimized for 1,000-10,000 rows
  • Tested with datasets containing 100+ columns
  • Real-time filtering with sub-second response times

πŸ› Known Issues

  • Large datasets (>100MB) may cause memory issues
  • Some complex visualizations may take time to render
  • Browser storage not available (by design for security)