# MCP Architecture Documentation
## Overview
This document explains the Model Context Protocol (MCP) architecture used in the Competitive Analysis Agent system.
## What is the Model Context Protocol?
MCP is a standardized protocol for connecting AI models to the tools and logic they act through. It integrates:
- **AI Models** (Claude, GPT-4, etc.)
- **External Tools & Services** (web search, databases, APIs, etc.)
- **Custom Business Logic** (analysis, validation, report generation)
### Why MCP?
1. **Modularity**: Tools are isolated and reusable
2. **Scalability**: Add tools without modifying core agent code
3. **Standardization**: Common protocol across different AI systems
4. **Separation of Concerns**: Clear boundaries between reasoning and action
5. **Production Ready**: Built for enterprise-grade AI applications
---
## System Architecture
### Three-Tier Architecture
```
┌─────────────────────────────────────────────────────┐
│  PRESENTATION LAYER - Gradio UI                     │
│  • User input (company name, API key)               │
│  • Report display (formatted Markdown)              │
│  • Error handling and validation                    │
└──────────────────┬──────────────────────────────────┘
                   │ HTTP/REST
                   ▼
┌─────────────────────────────────────────────────────┐
│  APPLICATION LAYER - MCP Client                     │
│  • OpenAI Agent (GPT-4)                             │
│  • Strategic reasoning and planning                 │
│  • Tool orchestration and sequencing                │
│  • Result synthesis                                 │
└──────────────────┬──────────────────────────────────┘
                   │ MCP Protocol
                   ▼
┌─────────────────────────────────────────────────────┐
│  SERVICE LAYER - MCP Server (FastMCP)               │
│  Tools:                                             │
│    • validate_company()                             │
│    • identify_sector()                              │
│    • identify_competitors()                         │
│    • browse_page()                                  │
│    • generate_report()                              │
│                                                     │
│  External Services:                                 │
│    • DuckDuckGo API                                 │
│    • HTTP/BeautifulSoup scraping                    │
│    • OpenAI API (GPT-4)                             │
└─────────────────────────────────────────────────────┘
```
---
## Component Details
### 1. Presentation Layer (`app.py`)
**Gradio Interface**
- User-friendly web UI
- Input validation
- Output formatting
- Error messaging
```python
# Example flow:
#   User input: "Tesla"
#     ↓
#   Validate inputs
#     ↓
#   Call analyze_competitor_landscape() in the MCP client
#     ↓
#   Display Markdown report
```
### 2. Application Layer (`mcp_client.py`)
**MCP Client with OpenAI Agent**
The client implements:
- **System Prompt**: Defines agent role and goals
- **Message History**: Maintains conversation context
- **Tool Calling**: Translates agent decisions to MCP calls
- **Response Synthesis**: Compiles results into reports
```python
from openai import OpenAI

# api_key is supplied by the caller (entered in the Gradio UI)
client = OpenAI(api_key=api_key)

system_prompt = """
You are a competitive analysis expert.
Use available tools to:
1. Validate the company
2. Identify sector
3. Find competitors
4. Gather strategic data
5. Generate insights
"""

# Agent workflow:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Analyze Sony"},
]
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
)
# When tool definitions are passed via the `tools` parameter, the response
# may contain tool calls, which the client then executes
```
**Key Features**:
- Graceful fallback to simple analysis when MCP unavailable
- Handles API errors and timeouts
- Synthesizes multiple tool results
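The **Tool Calling** step can be illustrated with a small dispatcher. This is a minimal sketch under stated assumptions, not the project's actual client code: `TOOL_REGISTRY`, `dispatch_tool_call`, and the stub tool bodies are hypothetical, and the tool-call dictionaries only mimic the name-plus-JSON-arguments shape used by OpenAI function calling.

```python
import json
from typing import Callable, Dict

# Hypothetical stand-ins for tools that the MCP server would expose.
def validate_company(company_name: str) -> str:
    return f"VALID: {company_name}"

def identify_sector(company_name: str) -> str:
    return "Technology"

# Hypothetical registry mapping tool names to callables.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "validate_company": validate_company,
    "identify_sector": identify_sector,
}

def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run one requested tool call and always return a string result."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return f"Error: unknown tool '{name}'"
    try:
        kwargs = json.loads(arguments_json)
        return tool(**kwargs)
    except Exception as e:  # report failures to the agent instead of raising
        return f"Error: {e}"

# Simulated tool calls, as an agent might request them.
tool_calls = [
    {"name": "validate_company", "arguments": '{"company_name": "Sony"}'},
    {"name": "identify_sector", "arguments": '{"company_name": "Sony"}'},
]
results = [dispatch_tool_call(c["name"], c["arguments"]) for c in tool_calls]
```

Returning error strings rather than raising keeps the agent loop running, so the model can see the failure and decide whether to retry or fall back.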
### 3. Service Layer (`mcp_server.py`)
**FastMCP Server with Tools**
#### Tools Overview
| Tool | Purpose | Returns |
|------|---------|---------|
| `validate_company(name)` | Check if company exists | Bool + evidence |
| `identify_sector(name)` | Find industry classification | Sector name |
| `identify_competitors(sector, company)` | Discover top 3 rivals | "Comp1, Comp2, Comp3" |
| `browse_page(url, instructions)` | Extract webpage content | Relevant text |
| `generate_report(company, context)` | Create analysis report | Markdown report |
#### Tool Implementation Pattern
```python
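@mcp.tool()
def validate_company(company_name: str) -> str:
    """
    Docstring: Describes tool purpose and parameters
    """
    try:
        results = web_search_tool(f"{company_name} company")
        evidence_count = analyze_search_results(results)
        # Two or more evidence signals count as a valid company
        # (the return strings below are illustrative)
        if evidence_count >= 2:
            return f"VALID: {company_name} ({evidence_count} evidence signals)"
        return f"NOT VALID: {company_name} ({evidence_count} evidence signals)"
    except Exception as e:
        return f"Error: {e}"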
```
#### Web Search Integration
```python
from duckduckgo_search import DDGS

def web_search_tool(query: str) -> str:
    """Unified search interface for all tools"""
    with DDGS() as ddgs:
        results = list(ddgs.text(query, max_results=5))
    # format_results is a formatting helper not shown here
    return format_results(results)
```
---
## Message Flow
### Complete Analysis Request
```
1. USER INTERFACE (Gradio)
   │
   ├─ Company: "Apple"
   └─ OpenAI Key: "sk-..."

2. GRADIO → MCP CLIENT
   │
   ├─ analyze_competitor_landscape("Apple", api_key)
   │
   └─ Creates CompetitiveAnalysisAgent instance

3. MCP CLIENT → OPENAI
   │
   ├─ System: "You are a competitive analyst..."
   ├─ User: "Analyze Apple's competitors"
   │
   └─ OpenAI responds with:
      └─ "Call validate_company('Apple')"

4. MCP CLIENT → MCP SERVER
   │
   ├─ Calls: validate_company("Apple")
   ├─ Calls: identify_sector("Apple")
   ├─ Calls: identify_competitors("Technology", "Apple")
   │
   └─ Receives results for each tool

5. MCP SERVER
   │
   ├─ validate_company()
   │  └─ Web search → DuckDuckGo API → Parse results
   │
   ├─ identify_sector()
   │  └─ Multi-stage search → Keyword analysis → Return sector
   │
   ├─ identify_competitors()
   │  └─ Industry search → Competitor extraction → Ranking
   │
   └─ generate_report()
      └─ Format results → Markdown template → Return report

6. MCP CLIENT SYNTHESIS
   │
   ├─ Compile all tool results
   ├─ Add OpenAI insights
   └─ Return complete report

7. GRADIO DISPLAY
   │
   └─ Render Markdown report to user
```
---
## Data Flow Diagram
```
USER INPUT
        │
        ├─ company_name: "Company X"
        ├─ api_key: "sk-xxx"
        │
        ▼
┌──────────────────────┐
│  Input Validation    │
│  (Length, Format)    │
└───────┬──────────────┘
        │
        ▼
┌──────────────────────────────┐
│  OpenAI Agent Planning       │
│  (System + User Messages)    │
└───────┬──────────────────────┘
        │
        ├─────────────────┬─────────────────┬─────────────────┐
        │                 │                 │                 │
        ▼                 ▼                 ▼                 ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ validate_    │  │ identify_    │  │ identify_    │  │ browse_      │
│ company()    │  │ sector()     │  │ competitors()│  │ page()       │
└───────┬──────┘  └───────┬──────┘  └───────┬──────┘  └───────┬──────┘
        │                 │                 │                 │
        ▼                 ▼                 ▼                 ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ Web Search   │  │ Web Search   │  │ Web Search   │  │ HTTP Get     │
│ DuckDuckGo   │  │ Multi-stage  │  │ Industry     │  │ Parse        │
│ + Analysis   │  │              │  │ Leaders      │  │ HTML         │
└───────┬──────┘  └───────┬──────┘  └───────┬──────┘  └───────┬──────┘
        │                 │                 │                 │
        ▼                 ▼                 ▼                 ▼
   VALIDATION    →    SECTOR ID    →   COMPETITORS  →  ADDITIONAL DATA
        │                 │                 │                 │
        └─────────────────┴────────┬────────┴─────────────────┘
                                   │
                                   ▼
                        ┌─────────────────────┐
                        │  generate_report()  │
                        │  (Compile results)  │
                        └──────────┬──────────┘
                                   │
                                   ▼
                        ┌─────────────────────┐
                        │  OpenAI Final       │
                        │  Synthesis          │
                        └──────────┬──────────┘
                                   │
                                   ▼
                             FINAL REPORT
                           (Markdown format)
```
---
## Tool Implementation Details
### Tool 1: `validate_company()`
```python
# Multi-stage validation
search_results = web_search_tool("Tesla company business official site")

# Evidence signals:
#   ✓ Official website found (.com/.io)
#   ✓ "Official site" or "official website" mention
#   ✓ Company + sector description
#   ✓ Business terminology present
#   ✓ Wikipedia/news mentions

# Result: Evidence count >= 2 → Valid company
```
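The evidence-count rule can be sketched as a set of pattern checks. This is an illustration only: the `EVIDENCE_PATTERNS` table, the specific regular expressions, and both helper names are assumptions, and the real tool may detect its signals differently.

```python
import re

# Hypothetical signal patterns; the real tool's rules may differ.
EVIDENCE_PATTERNS = {
    "official_domain": re.compile(r"https?://\S+\.(?:com|io)\b", re.I),
    "official_mention": re.compile(r"official (?:site|website)", re.I),
    "business_terms": re.compile(
        r"\b(?:company|corporation|founded|headquartered)\b", re.I
    ),
    "reference_source": re.compile(r"\b(?:wikipedia|reuters|bloomberg)\b", re.I),
}

def count_evidence(search_text: str) -> int:
    """Count how many distinct evidence signals appear in the text."""
    return sum(1 for p in EVIDENCE_PATTERNS.values() if p.search(search_text))

def is_valid_company(search_text: str, threshold: int = 2) -> bool:
    """Apply the 'evidence count >= 2' rule described above."""
    return count_evidence(search_text) >= threshold

sample = (
    "Tesla official website https://www.tesla.com - "
    "Tesla is a company founded in 2003. Source: Wikipedia"
)
valid = is_valid_company(sample)
```

Counting distinct signals, rather than total matches, keeps one repetitive search snippet from passing the threshold on its own.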
### Tool 2: `identify_sector()`
```python
# Three search strategies:
#   1. "What does Tesla do?"  → Extract sector keywords
#   2. "Tesla industry type"  → Direct classification
#   3. "Tesla sector news"    → Financial/news sources

# Sector patterns:
{
    "Technology": ["software", "hardware", "cloud", "ai", ...],
    "Finance": ["banking", "fintech", "insurance", ...],
    "Manufacturing": ["automotive", "industrial", ...],
    # ...
}

# Weighted voting to determine primary sector
```
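One way to read "weighted voting" is that each search contributes keyword hits multiplied by a per-search weight. The sketch below assumes that reading. The `vote_sector` name, the weights, and the `"Unknown"` fallback are hypothetical, and the keyword lists are the abbreviated ones shown above.

```python
from collections import Counter

# Abbreviated keyword lists taken from the sector patterns above.
SECTOR_KEYWORDS = {
    "Technology": ["software", "hardware", "cloud", "ai"],
    "Finance": ["banking", "fintech", "insurance"],
    "Manufacturing": ["automotive", "industrial"],
}

def vote_sector(snippets_with_weights):
    """Return the sector with the highest weighted keyword count.

    snippets_with_weights: list of (text, weight) pairs, one per search.
    """
    scores = Counter()
    for text, weight in snippets_with_weights:
        words = text.lower().split()  # whole-word matching, punctuation not stripped
        for sector, keywords in SECTOR_KEYWORDS.items():
            hits = sum(words.count(k) for k in keywords)
            scores[sector] += weight * hits
    if not scores or max(scores.values()) == 0:
        return "Unknown"  # assumed fallback when no keyword matches
    return scores.most_common(1)[0][0]

snippets = [
    ("Tesla is an automotive and industrial manufacturer", 2.0),
    ("Tesla builds software and ai for cars", 1.0),
]
sector = vote_sector(snippets)
```

Here the first snippet carries a higher weight, so its two manufacturing keywords outvote the two technology keywords in the second.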
### Tool 3: `identify_competitors()`
```python
# Search strategy:
#   1. "Top technology companies" → Market leaders
#   2. "Tesla competitors"        → Direct rivals
#   3. "EV industry leaders"      → Sector players

# Extraction methods:
#   - Pattern matching for company names
#   - List parsing (comma-separated, bulleted)
#   - Frequency analysis and ranking

# Returns: Top 3 ranked competitors
```
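The list parsing and frequency ranking steps can be sketched together. This is a simplified illustration, not the tool's actual extraction logic: `rank_competitors` and `NAME_PATTERN` are hypothetical, and the name pattern is deliberately crude (one or two capitalized words).

```python
import re
from collections import Counter

# Assumed, deliberately crude pattern: one or two capitalized words.
NAME_PATTERN = re.compile(r"[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)?")

def rank_competitors(snippets, target, top_n=3):
    """Rank candidate names by the number of snippets that mention them."""
    counts = Counter()
    for snippet in snippets:
        # Split list-like text on commas, newlines, bullets and the word "and".
        parts = re.split(r",|\n|•|\band\b", snippet)
        # Deduplicate within a snippet so each snippet votes once per name.
        names = dict.fromkeys(part.strip(" .-") for part in parts)
        for name in names:
            if NAME_PATTERN.fullmatch(name) and name.lower() != target.lower():
                counts[name] += 1
    return [name for name, _ in counts.most_common(top_n)]

snippets = [
    "Ford, Rivian, BYD and Tesla",
    "• Ford\n• BYD\n• Lucid",
    "BYD, Ford",
]
top = rank_competitors(snippets, target="Tesla")
```

The target company is excluded so that it never appears in its own competitor list.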
### Tool 4: `browse_page()`
```python
# Content extraction workflow:
#   requests.get(url)
#     → BeautifulSoup parsing
#     → Remove scripts/styles/headers/footers
#     → Extract main content divs/articles/paragraphs
#     → Keyword matching against instructions
#     → Return top N relevant sentences

# Safety: Timeout=10s, max_content=5000 chars
```
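The last two steps of the workflow, keyword matching and returning the top N sentences, can be sketched with the standard library alone. The function name `top_relevant_sentences` and the word-overlap scoring are assumptions, and fetching and HTML parsing are left out so the example runs offline.

```python
import re

def top_relevant_sentences(text, instructions, top_n=3, max_chars=5000):
    """Return the top_n sentences sharing the most words with instructions."""
    text = text[:max_chars]  # cap content size, mirroring the 5000-char limit
    keywords = set(re.findall(r"[a-z0-9]+", instructions.lower()))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    scored = []
    for index, sentence in enumerate(sentences):
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        score = len(words & keywords)
        if score > 0:
            scored.append((score, index, sentence))
    # Highest score first; ties keep the original document order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sentence for _, _, sentence in scored[:top_n]]

page_text = (
    "Tesla sells electric cars. The weather is nice today. "
    "Pricing for the Model 3 starts lower than rivals. "
    "Tesla pricing strategy targets mass market."
)
relevant = top_relevant_sentences(page_text, "pricing strategy", top_n=2)
```

Sentences with no overlapping words are dropped entirely, so an unrelated page yields an empty list rather than arbitrary filler.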
### Tool 5: `generate_report()`
```python
# Template-based report generation
report = f"""
# Competitive Analysis Report: {company_name}
## Executive Summary
[Synthesized findings]
## Competitor Comparison
| Competitor | Strategy | Pricing | Products | Market |
|------------|----------|---------|----------|--------|
| [extracted competitors] | - | - | - | - |
## Strategic Insights
[Recommendations]
"""
```
---
## Error Handling Strategy
### Layered Error Handling
```
Layer 1: Input Validation (Gradio)
  ├─ Check company name length
  ├─ Validate API key format
  └─ Return user-friendly error

Layer 2: Tool Execution (MCP Server)
  ├─ Try/except on each tool
  ├─ Timeout protection (10s requests)
  ├─ Graceful degradation
  └─ Log detailed errors

Layer 3: Agent Logic (MCP Client)
  ├─ API timeout handling
  ├─ Rate limit handling
  ├─ Fallback to simple analysis
  └─ Return partial results

Layer 4: User Feedback (Gradio)
  ├─ Display error with context
  ├─ Suggest remediation
  └─ Allow retry
```
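Layer 1 could look roughly like the following. This is a sketch under assumptions: the `validate_inputs` name, the 100-character limit, and the `sk-` prefix check are not taken from the project (the prefix is inferred only from the example keys shown in this document).

```python
from typing import Optional

MAX_COMPANY_NAME_LENGTH = 100  # assumed limit, not taken from the project

def validate_inputs(company_name: str, api_key: str) -> Optional[str]:
    """Return a user-friendly error message, or None if the inputs look valid."""
    name = company_name.strip()
    if not name:
        return "Please enter a company name."
    if len(name) > MAX_COMPANY_NAME_LENGTH:
        return f"Company name must be at most {MAX_COMPANY_NAME_LENGTH} characters."
    # Assumed key format: the example keys in this document start with "sk-".
    if not api_key.strip().startswith("sk-"):
        return "The API key should start with 'sk-'."
    return None
```

Returning a message instead of raising lets the UI display the text directly and allow a retry, in line with Layer 4.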
---
## Performance Optimization
### Caching Strategy
```python
# Web search results: cached for 5 minutes
# Sector identification: reused across tools
# Competitor list: reused in reports
```
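A five-minute cache for search results could be as small as the class below. It is a hedged sketch, not the project's caching code: `TTLCache` and `get_or_compute` are hypothetical names, and the clock is injectable only so that expiry can be exercised without waiting.

```python
import time

class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    def get_or_compute(self, key, compute):
        """Return a fresh cached value, or call compute() and store the result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # still fresh: reuse the cached value
        value = compute()
        self._entries[key] = (now, value)
        return value
```

A call such as `cache.get_or_compute(query, lambda: web_search_tool(query))` would then hit the search backend at most once per query within the five-minute window.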
### Parallel Tool Execution
```python
# Future enhancement: run independent tools in parallel
#   validate_company()      (parallel)
#   identify_sector()       (parallel)
#   identify_competitors()  (sequential, depends on sector)
```
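The planned parallelism could be sketched with a thread pool, which suits I/O-bound work such as web searches. The three tool functions below are hypothetical stubs standing in for the real MCP tools, and `run_analysis` is an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stubs standing in for the real MCP tools.
def validate_company(name):
    return f"VALID: {name}"

def identify_sector(name):
    return "Technology"

def identify_competitors(sector, name):
    return "Comp1, Comp2, Comp3"

def run_analysis(name):
    # validate_company and identify_sector do not depend on each other,
    # so they can run concurrently in worker threads.
    with ThreadPoolExecutor(max_workers=2) as pool:
        validation_future = pool.submit(validate_company, name)
        sector_future = pool.submit(identify_sector, name)
        validation = validation_future.result()
        sector = sector_future.result()
    # identify_competitors needs the sector, so it runs afterwards.
    competitors = identify_competitors(sector, name)
    return {"validation": validation, "sector": sector, "competitors": competitors}

analysis = run_analysis("Apple")
```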
### Rate Limiting
```python
# DuckDuckGo: 2.0 second delays between searches
# OpenAI: Batched requests, monitoring quota
# HTTP: 10-second timeout, connection pooling
```
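The 2.0-second spacing between searches can be enforced with a small limiter. This is a sketch, assuming a single-threaded caller: `MinIntervalLimiter` is a hypothetical name, and the injectable `clock` and `sleep` exist only to make the behavior testable without real delays.

```python
import time

class MinIntervalLimiter:
    """Ensure at least min_interval seconds pass between calls to wait()."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until the minimum interval since the previous call has passed."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `limiter.wait()` immediately before each search keeps the delay logic in one place instead of scattering `time.sleep(2.0)` across tools.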
---
## Security Considerations
### API Key Handling
```python
# Keys accepted via:
#   ✓ UI input field (held temporarily in memory)
#   ✗ NOT stored in files
#   ✗ NOT logged in output
#   ✗ NOT persisted in database

# Environment variables (optional):
#   Load from .env via python-dotenv
```
### Data Privacy
```python
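# Web search results: Temporary; held only in the short-lived in-memory cache, then discarded
# Company data: Not persisted to disk or a database
# User queries: Not logged or tracked
# Report generation: Template rendering runs locally; final synthesis calls the OpenAI API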
```
### Web Scraping Safety
```python
# User-Agent header sent with every request
# robots.txt is meant to be respected (note: requests and BeautifulSoup do not check it on their own)
# Timeout protection (10 seconds)
# Error handling for blocked requests
```
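A robots.txt check can be done with Python's standard `urllib.robotparser`. The sketch below assumes the robots.txt content has already been downloaded. The `is_allowed` helper, the user-agent string, and the example rules are illustrative and not taken from the project.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules that were already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Illustrative rules, not taken from any real site.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

public_ok = is_allowed(EXAMPLE_ROBOTS, "CompetitiveAnalysisBot", "https://example.com/about")
private_ok = is_allowed(EXAMPLE_ROBOTS, "CompetitiveAnalysisBot", "https://example.com/private/data")
```

A tool like `browse_page()` could run such a check before issuing its HTTP request and return an explanatory message when the path is disallowed.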
---
## Extension Points
### Adding New Tools
```python
@mcp.tool()
def custom_tool(param1: str, param2: int) -> str:
    """
    Your custom tool description.

    Args:
        param1: Parameter 1 description
        param2: Parameter 2 description

    Returns:
        str: Result description
    """
    try:
        # Implementation (some_operation is a placeholder for your logic)
        result = some_operation(param1, param2)
        return result
    except Exception as e:
        return f"Error: {e}"
```
### Modifying Agent Behavior
```python
# In mcp_client.py, edit system_prompt:
system_prompt = """
Updated instructions for agent behavior
"""
# Or append an additional user message:
messages.append({
    "role": "user",
    "content": "Additional analysis request...",
})
```
### Customizing Report Generation
```python
# In mcp_server.py, edit generate_report() template:
report = f"""
# Custom Report Format
Your custom structure here...
"""
```
---
## Testing
### Manual Testing
```bash
# Test MCP Server
python mcp_server.py
# Test MCP Client functions
python -c "from mcp_client import analyze_competitor_landscape; print(analyze_competitor_landscape('Microsoft', 'sk-...'))"
# Test Gradio UI
python app.py
# Navigate to http://localhost:7860
```
### Validation Tests
```python
# Test validate_company()
assert "VALID" in validate_company("Google")
assert "NOT" in validate_company("FakeCompanyXYZ123")
# Test identify_sector()
assert "Technology" in identify_sector("Microsoft")
assert "Finance" in identify_sector("JPMorgan")
# Test competitor discovery (the tool returns a comma-separated string)
competitors = identify_competitors("Technology", "Google")
assert len(competitors.split(",")) <= 3
```
---
## Future Enhancements
1. **Real-time Market Data**: Integrate financial APIs (Alpha Vantage, etc.)
2. **Sentiment Analysis**: Analyze news sentiment about companies
3. **Patent Analysis**: Include R&D insights from patents
4. **Social Media**: Monitor competitor social media activity
5. **Pricing Intelligence**: Track price changes over time
6. **SWOT Matrix**: Generate structured SWOT analysis
7. **Visualization**: Create charts and graphs
8. **PDF Export**: Generate PDF reports
9. **Multi-company Batch**: Analyze multiple companies
10. **Integration APIs**: Connect to Slack, Salesforce, etc.
---
## Conclusion
The MCP architecture provides:
- ✅ Modularity and extensibility
- ✅ Clear separation of concerns
- ✅ Robust error handling
- ✅ Scalability for future enhancements
- ✅ Production-ready design
- ✅ Easy tool management
This design enables rapid development, maintenance, and deployment of AI-powered competitive analysis systems.
---
**Document Version**: 1.0
**Last Updated**: March 2026