---
title: Business Intelligence Dashboard
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 3.50.2
app_file: app.py
pinned: false
---
# Business Intelligence Dashboard
A professional, interactive Business Intelligence dashboard built with Gradio that enables non-technical stakeholders to explore and analyze business data.
## Features
### Data Management
- **Pre-loaded Datasets**: Online Retail and Airbnb datasets included
- **Custom Upload**: Support for CSV, Excel (.xlsx, .xls), JSON, and Parquet files (max 50MB)
- **Automatic Data Cleaning**: Handles missing values, type conversions, and duplicate removal
- **Data Validation**: Comprehensive error handling and user-friendly error messages
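The cleaning steps listed above could look roughly like the following pandas sketch. `clean_dataframe` is a hypothetical helper written for illustration, not the function in `data_processor.py`, and the fill rules (median for numbers, `"Unknown"` for text) are assumptions.

```python
import pandas as pd


def clean_dataframe(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning pass: drop duplicates, convert types, fill gaps."""
    cleaned = df.drop_duplicates().copy()

    for col in cleaned.columns:
        if not pd.api.types.is_numeric_dtype(cleaned[col]):
            # Try a numeric conversion; keep it only if no non-null value is lost.
            converted = pd.to_numeric(cleaned[col], errors="coerce")
            if converted.notna().sum() == cleaned[col].notna().sum():
                cleaned[col] = converted

    for col in cleaned.columns:
        if pd.api.types.is_numeric_dtype(cleaned[col]):
            # Assumed rule: numeric gaps take the column median.
            cleaned[col] = cleaned[col].fillna(cleaned[col].median())
        else:
            # Assumed rule: text gaps take a placeholder label.
            cleaned[col] = cleaned[col].fillna("Unknown")

    return cleaned.reset_index(drop=True)


# Made-up rows: one duplicate, one missing price, one missing city.
raw = pd.DataFrame(
    {
        "price": ["10", "20", None, "10"],
        "city": ["Boston", None, "Austin", "Boston"],
    }
)
clean = clean_dataframe(raw)
print(clean)
```

Checking that the conversion loses no values keeps genuinely textual columns (such as `city`) from being turned into all-missing numeric columns.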
### Statistics & Profiling
- **Automated Data Profiling**: Get instant insights into your dataset
- **Numerical Summary**: Mean, median, standard deviation, quartiles, min/max
- **Categorical Analysis**: Unique values, value counts, mode
- **Missing Values Report**: Identify data quality issues
- **Correlation Matrix**: Visual correlation heatmap for numerical features
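As a rough illustration of what such a profile contains, the sketch below builds the same kinds of summaries with plain pandas calls. The `profile` function and its dictionary keys are assumptions made for this example; the dashboard's own profiling code may be organized differently.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> dict:
    """Hypothetical profiling helper; not the project's actual API."""
    numeric = df.select_dtypes(include="number")
    categorical = df.select_dtypes(exclude="number")
    return {
        # describe() gives count, mean, std, min, quartiles (50% is the median), max.
        "numerical": numeric.describe().T,
        "categorical": {
            col: {
                "unique": int(categorical[col].nunique()),
                "mode": (
                    categorical[col].mode().iloc[0]
                    if categorical[col].notna().any()
                    else None
                ),
                "top_counts": categorical[col].value_counts().head(5),
            }
            for col in categorical.columns
        },
        # Number of missing cells per column.
        "missing": df.isna().sum(),
        # Pairwise Pearson correlation between numeric columns.
        "correlation": numeric.corr(),
    }


# Made-up sample rows for demonstration only.
sample = pd.DataFrame(
    {
        "price": [1.0, 2.0, 3.0, 4.0],
        "nights": [2, 4, 6, 8],
        "room_type": ["Private", "Entire", "Private", None],
    }
)
report = profile(sample)
print(report["numerical"])
print(report["missing"])
```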
### Interactive Filtering
- **Dynamic Filters**: Filter by numerical ranges, categorical values, or date ranges
- **Real-time Updates**: See row counts update as you apply filters
- **Multiple Filters**: Combine multiple filters for precise data exploration
- **Filter Management**: Easy to add, view, and clear filters
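One way to picture combined filters is as boolean masks joined with a logical AND. The sketch below assumes that model; the helper names (`numeric_range`, `categorical_in`, `date_range`, `apply_filters`) and the sample columns are hypothetical and chosen only for illustration.

```python
import pandas as pd


def numeric_range(column, low, high):
    """Keep rows whose value lies in [low, high] (inclusive)."""
    return lambda df: df[column].between(low, high)


def categorical_in(column, values):
    """Keep rows whose value is one of the given categories."""
    return lambda df: df[column].isin(values)


def date_range(column, start, end):
    """Keep rows whose date lies in [start, end] (inclusive)."""
    return lambda df: pd.to_datetime(df[column]).between(
        pd.Timestamp(start), pd.Timestamp(end)
    )


def apply_filters(df, filters):
    """Combine every filter mask with AND and return the matching rows."""
    mask = pd.Series(True, index=df.index)
    for make_mask in filters:
        mask &= make_mask(df)
    return df[mask]


# Made-up listings for demonstration only.
listings = pd.DataFrame(
    {
        "price": [50, 120, 200, 80],
        "room_type": ["Private", "Entire", "Entire", "Shared"],
        "last_review": ["2023-01-05", "2023-03-10", "2023-06-01", "2023-02-20"],
    }
)
active_filters = [
    numeric_range("price", 60, 250),
    categorical_in("room_type", ["Entire", "Shared"]),
    date_range("last_review", "2023-01-01", "2023-03-31"),
]
filtered = apply_filters(listings, active_filters)
print(f"{len(filtered)} of {len(listings)} rows match")
```

Keeping each filter as a small callable means clearing filters is just emptying the list, and the row count can be recomputed after every change.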
### Smart Visualizations
- **AI-Powered Recommendations**: Get intelligent visualization suggestions based on your data
- **One-Click Creation**: Create recommended visualizations with a single click
- **5 Visualization Types**:
  - Time Series Plots (with aggregation: sum, mean, count, median)
  - Distribution Plots (histogram, box plot)
  - Category Analysis (bar chart, pie chart)
  - Scatter Plots (with color coding and trend lines)
  - Correlation Heatmap
- **Dual Backend**: Supports both Matplotlib and Plotly
- **Customization**: Full control over columns, aggregations, and visual parameters
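The time series aggregations named above (sum, mean, count, median) map naturally onto pandas resampling. The following is a minimal sketch under that assumption; `aggregate_time_series` and its parameters are hypothetical and are not taken from `visualizations.py`.

```python
import pandas as pd

SUPPORTED_AGGREGATIONS = {"sum", "mean", "count", "median"}


def aggregate_time_series(df, date_col, value_col, freq="D", how="sum"):
    """Resample value_col over date_col at the given frequency (hypothetical helper)."""
    if how not in SUPPORTED_AGGREGATIONS:
        raise ValueError(f"Unsupported aggregation: {how}")
    # Index the values by parsed dates so that resample() can bucket them.
    series = df.set_index(pd.to_datetime(df[date_col]))[value_col]
    return series.resample(freq).agg(how)


# Made-up orders for demonstration only.
orders = pd.DataFrame(
    {
        "order_date": ["2023-01-01", "2023-01-01", "2023-01-02"],
        "amount": [10, 20, 5],
    }
)
daily_total = aggregate_time_series(orders, "order_date", "amount", how="sum")
daily_mean = aggregate_time_series(orders, "order_date", "amount", how="mean")
print(daily_total)
```

The resulting series can be passed to either plotting backend, since both Matplotlib and Plotly accept a date index with one value per period.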
### 💡 Automated Insights
- **Top/Bottom Performers**: Identify highest and lowest values
- **Trend Analysis**: Detect patterns over time with growth rate and volatility
- **Anomaly Detection**: Find outliers using Z-score or IQR methods
- **Distribution Analysis**: Understand data distributions with skewness and kurtosis
- **Correlation Insights**: Discover strong relationships between variables
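For readers unfamiliar with the two outlier methods mentioned above, here is a short sketch of each. The thresholds (3 standard deviations, 1.5 times the IQR) are common defaults assumed for illustration; the function names are hypothetical and `insights.py` may use different settings.

```python
import pandas as pd


def zscore_outliers(values: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return values more than `threshold` standard deviations from the mean."""
    z_scores = (values - values.mean()) / values.std(ddof=0)
    return values[z_scores.abs() > threshold]


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return values[(values < q1 - k * iqr) | (values > q3 + k * iqr)]


# Made-up prices with one obvious outlier.
prices = pd.Series([10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 11, 10, 13, 12, 100])
print(zscore_outliers(prices).tolist())
print(iqr_outliers(prices).tolist())
```

The Z-score method assumes roughly bell-shaped data and can miss outliers in small samples, because the outlier itself inflates the standard deviation. The IQR method relies on quartiles and is less sensitive to that effect.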
### 💾 Export Capabilities
- **Data Export**: Export filtered data as CSV or Excel
- **Visualization Export**: Save charts as PNG images
## 🏗️ Architecture & Design
### SOLID Principles Implementation
- **Single Responsibility**: Each class has one clear purpose
- **Open/Closed**: Extensible through Strategy Pattern without modifying existing code
- **Liskov Substitution**: All strategies are interchangeable
- **Interface Segregation**: Specific interfaces for different operations
- **Dependency Inversion**: Depends on abstractions, not concrete implementations
### Design Patterns
- **Strategy Pattern**: Used for data loading, visualizations, and insights
- **Facade Pattern**: DataProcessor provides simple interface to complex operations
- **Factory Pattern**: Dynamic strategy selection based on file type
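To make the Strategy and Factory combination concrete, the sketch below shows one possible shape for file loading by extension. All class and function names (`LoaderStrategy`, `CsvLoader`, `loader_for`, and so on) are invented for this example and are not claimed to match `data_processor.py`.

```python
from abc import ABC, abstractmethod
from pathlib import Path

import pandas as pd


class LoaderStrategy(ABC):
    """Common interface, so callers never depend on a concrete loader."""

    @abstractmethod
    def load(self, path: Path) -> pd.DataFrame:
        ...


class CsvLoader(LoaderStrategy):
    def load(self, path: Path) -> pd.DataFrame:
        return pd.read_csv(path)


class ExcelLoader(LoaderStrategy):
    def load(self, path: Path) -> pd.DataFrame:
        # read_excel needs an optional engine package to be installed.
        return pd.read_excel(path)


class JsonLoader(LoaderStrategy):
    def load(self, path: Path) -> pd.DataFrame:
        return pd.read_json(path)


class ParquetLoader(LoaderStrategy):
    def load(self, path: Path) -> pd.DataFrame:
        # read_parquet needs an optional engine package to be installed.
        return pd.read_parquet(path)


# Factory table: supporting a new format means adding one entry here.
_LOADERS = {
    ".csv": CsvLoader,
    ".xlsx": ExcelLoader,
    ".xls": ExcelLoader,
    ".json": JsonLoader,
    ".parquet": ParquetLoader,
}


def loader_for(path) -> LoaderStrategy:
    """Pick a loader strategy from the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in _LOADERS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return _LOADERS[suffix]()


# Usage (hypothetical path): df = loader_for("data/Airbnb.csv").load(Path("data/Airbnb.csv"))
```

Because every loader exposes the same `load` method, the calling code stays unchanged when a format is added, which is the Open/Closed property described above.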
### Project Structure
```
Business-Intelligence-Dashboard/
├── app.py                 # Main Gradio application with 6 tabs
├── data_processor.py      # Data loading, cleaning, filtering (Strategy Pattern)
├── visualizations.py      # Chart creation with multiple strategies
├── insights.py            # Automated insight generation
├── utils.py               # Utility functions and validators
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── data/                  # Sample datasets
│   ├── Online_Retail.xlsx
│   └── Airbnb.csv
└── tests/                 # Comprehensive test suite
    ├── __init__.py
    ├── conftest.py
    ├── test_utils.py
    ├── test_data_processor.py
    ├── test_visualizations.py
    └── test_insights.py
```
## Getting Started
### Prerequisites
- Python 3.8 or higher
- pip package manager
### Installation
1. **Clone the repository**
```bash
git clone https://github.com/CR1502/Business-Intelligence-Dashboard.git
cd Business-Intelligence-Dashboard
```
2. **Create a virtual environment**
```bash
# On macOS/Linux
python3 -m venv venv
source venv/bin/activate
# On Windows
python -m venv venv
venv\Scripts\activate
```
3. **Install dependencies**
```bash
pip install -r requirements.txt
```
4. **Run the application**
```bash
python app.py
```
The dashboard will launch and open in your default browser at `http://localhost:7860`.
## Usage Guide
### 1. Loading Data
- **Option A**: Select "Online Retail" or "Airbnb" from the dropdown
- **Option B**: Upload your own dataset (CSV, Excel, JSON, or Parquet)
### 2. Exploring Statistics
- Navigate to the "Statistics & Profiling" tab
- Click "Generate Data Profile" to see comprehensive statistics
- View missing values, numerical summaries, and the correlation matrix
### 3. Filtering Data
- Go to the "Filter & Explore" tab
- Select a filter type (Numerical, Categorical, or Date)
- Choose a column and set the filter criteria
- Click "Add Filter" and watch the row count update in real time
### 4. Creating Visualizations
- Navigate to the "Visualizations" tab
- **Smart Recommendations**: Click "Get Visualization Recommendations" for AI-powered suggestions
- **Custom Visualizations**: Select visualization type and configure parameters
- Supported charts: Time Series, Distribution, Category, Scatter, Correlation
### 5. Generating Insights
- Go to the "Insights" tab
- Click "Generate All Insights" for automated analysis
- Or select a specific insight type for targeted analysis
### 6. Exporting Results
- Navigate to the "Export" tab
- Choose a format (CSV or Excel)
- Click "Export Data" to download the filtered dataset
## 🧪 Testing
Run the comprehensive test suite:
```bash
# Run all tests
pytest tests/ -v
# Run specific test file
pytest tests/test_utils.py -v
# Run with coverage
pytest tests/ --cov=. --cov-report=html
```
Test coverage includes:
- **180+ test cases** across all modules
- Unit tests for all functions and classes
- Strategy Pattern implementation tests
- Edge case and error handling tests
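For contributors new to the suite, an edge-case test in this style might look like the sketch below. The `validate_upload` helper is hypothetical and defined inline so the example is self-contained; it mirrors the supported file types and 50MB limit listed under Features, and treating "50MB" as 50 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path
from typing import Tuple

ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls", ".json", ".parquet"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: "50MB" means 50 MiB


def validate_upload(filename: str, size_bytes: int) -> Tuple[bool, str]:
    """Hypothetical validator: return (is_valid, user-facing message)."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        return False, f"Unsupported file type '{suffix}'."
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "File exceeds the 50MB upload limit."
    return True, "OK"


# pytest collects any function whose name starts with `test_`.
def test_accepts_supported_extensions():
    for name in ("sales.csv", "sales.XLSX", "listings.parquet"):
        ok, _ = validate_upload(name, 1024)
        assert ok


def test_rejects_unsupported_extension():
    ok, message = validate_upload("notes.txt", 1024)
    assert not ok
    assert ".txt" in message


def test_rejects_oversized_file():
    ok, message = validate_upload("big.csv", MAX_UPLOAD_BYTES + 1)
    assert not ok
    assert "50MB" in message
```

Returning a message instead of raising keeps the check easy to surface in the UI, which fits the user-friendly error handling described under Data Management.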
## 🛠️ Technologies Used
- **Gradio**: Web interface and interactive components
- **Pandas**: Data manipulation and analysis
- **NumPy**: Numerical computations
- **Matplotlib/Seaborn**: Static visualizations
- **Plotly**: Interactive visualizations
- **Python**: Core programming language (3.8 or higher, per the Prerequisites above)
## Sample Datasets
### Online Retail Dataset
- **8 columns**: InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, Country
- **Use case**: E-commerce sales analysis, product trends, customer analysis
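As an example of the sales analysis this dataset supports, revenue per country can be derived from the `Quantity`, `UnitPrice`, and `Country` columns. The rows below are made up for illustration and are not taken from the actual file.

```python
import pandas as pd

# Made-up rows that reuse three of the dataset's column names.
retail = pd.DataFrame(
    {
        "Country": ["United Kingdom", "France", "United Kingdom", "Germany"],
        "Quantity": [6, 2, 3, 10],
        "UnitPrice": [2.5, 10.0, 4.0, 1.0],
    }
)

# Line revenue is quantity times unit price.
retail["Revenue"] = retail["Quantity"] * retail["UnitPrice"]

# Total revenue per country, highest first.
revenue_by_country = (
    retail.groupby("Country")["Revenue"].sum().sort_values(ascending=False)
)
print(revenue_by_country)
```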
### Airbnb Dataset
- **26 columns**: Including price, location, room type, reviews, availability
- **Use case**: Pricing analysis, location trends, booking patterns
## 🤝 Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
### Development Guidelines
- Follow PEP 8 style guidelines
- Add docstrings to all functions
- Include unit tests for new features
- Update README.md for significant changes
## 👨‍💻 Author
**Craig Roberts**
## Acknowledgments
- Northeastern University - CS5130 Course (Prof. Lino)
- Dataset sources: UCI ML Repository, Kaggle
## ⚡ Performance Notes
- Handles datasets up to 50MB efficiently
- Optimized for 1,000-10,000 rows
- Tested with datasets containing 100+ columns
- Real-time filtering with sub-second response times
## Known Issues
- Large datasets (>100MB) may cause memory issues
- Some complex visualizations may take time to render
- Browser storage not available (by design for security)
---