MCP Architecture Documentation

Overview

This document explains the Model Context Protocol (MCP) architecture used in the Competitive Analysis Agent system.

What is the Model Context Protocol?

MCP is a standardized protocol that connects three kinds of components:

  • AI models (Claude, GPT-4, etc.)
  • External tools and services (web search, databases, APIs, etc.)
  • Custom business logic (analysis, validation, report generation)

Why MCP?

  1. Modularity: Tools are isolated and reusable
  2. Scalability: Add tools without modifying core agent code
  3. Standardization: Common protocol across different AI systems
  4. Separation of Concerns: Clear boundaries between reasoning and action
  5. Production Ready: Built for enterprise-grade AI applications

System Architecture

Three-Tier Architecture

┌─────────────────────────────────────────────────────┐
│          PRESENTATION LAYER - Gradio UI             │
│  • User input (company name, API key)               │
│  • Report display (formatted Markdown)              │
│  • Error handling and validation                    │
└──────────────────┬──────────────────────────────────┘
                   │ HTTP/REST
                   ▼
┌─────────────────────────────────────────────────────┐
│         APPLICATION LAYER - MCP Client              │
│  • OpenAI Agent (GPT-4)                             │
│  • Strategic reasoning and planning                 │
│  • Tool orchestration and sequencing                │
│  • Result synthesis                                 │
└──────────────────┬──────────────────────────────────┘
                   │ MCP Protocol
                   ▼
┌─────────────────────────────────────────────────────┐
│         SERVICE LAYER - MCP Server (FastMCP)        │
│  Tools:                                             │
│  • validate_company()                               │
│  • identify_sector()                                │
│  • identify_competitors()                           │
│  • browse_page()                                    │
│  • generate_report()                                │
│                                                     │
│  External Services:                                 │
│  • DuckDuckGo API                                   │
│  • HTTP/BeautifulSoup scraping                      │
│  • OpenAI API (GPT-4)                               │
└─────────────────────────────────────────────────────┘

Component Details

1. Presentation Layer (app.py)

Gradio Interface

  • User-friendly web UI
  • Input validation
  • Output formatting
  • Error messaging
# Example flow
User Input: "Tesla"
    ↓
Validate inputs
    ↓
Call MCP client: analyze_competitor_landscape()
    ↓
Display Markdown report
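
The "Validate inputs" step in the flow above could look roughly like the sketch below. The function name `validate_inputs`, the 100-character limit, and the `sk-` prefix check are illustrative assumptions, not taken from app.py.

```python
from typing import Optional

MAX_NAME_LENGTH = 100  # assumed limit, not specified in this document


def validate_inputs(company_name: str, api_key: str) -> Optional[str]:
    """Return an error message for the UI, or None when the inputs look usable."""
    name = company_name.strip()
    if not name:
        return "Please enter a company name."
    if len(name) > MAX_NAME_LENGTH:
        return f"Company name must be at most {MAX_NAME_LENGTH} characters."
    # Assumed format check: the examples in this document show keys like "sk-..."
    if not api_key.strip().startswith("sk-"):
        return "The API key does not look like an OpenAI key (expected 'sk-' prefix)."
    return None
```

Returning a message instead of raising keeps the Gradio handler simple: it can display the string directly and skip the analysis call.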

2. Application Layer (mcp_client.py)

MCP Client with OpenAI Agent

The client implements:

  • System Prompt: Defines agent role and goals
  • Message History: Maintains conversation context
  • Tool Calling: Translates agent decisions to MCP calls
  • Response Synthesis: Compiles results into reports
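from openai import OpenAI

system_prompt = """
You are a competitive analysis expert.
Use available tools to:
1. Validate the company
2. Identify sector
3. Find competitors
4. Gather strategic data
5. Generate insights
"""

# Agent workflow:
client = OpenAI(api_key=api_key)  # api_key comes from the Gradio UI
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Analyze Sony"}
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    # tool_schemas: hypothetical name for the JSON schemas describing the MCP tools;
    # without a tools argument the model cannot request tool calls
    tools=tool_schemas,
)
# When the model chooses a tool, response.choices[0].message.tool_calls
# lists the requested calls, which we execute against the MCP server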

Key Features:

  • Graceful fallback to simple analysis when the MCP server is unavailable
  • Handles API errors and timeouts
  • Synthesizes multiple tool results
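
The "Tool Calling" responsibility listed above can be sketched as a small dispatcher. This is a minimal illustration, assuming each tool call arrives as a tool name plus a JSON-encoded argument string; `TOOL_REGISTRY`, `execute_tool_call`, and the stubbed `validate_company` are hypothetical names, and a real client would forward the call to the MCP server instead of a local function.

```python
import json
from typing import Callable, Dict


def validate_company(company_name: str) -> str:
    # Stub standing in for the real MCP tool
    return f"VALID: {company_name}"


# Hypothetical registry mapping tool names to callables
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "validate_company": validate_company,
}


def execute_tool_call(name: str, arguments_json: str) -> str:
    """Run one tool call requested by the model and return its text result."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return f"Error: unknown tool '{name}'"
    try:
        kwargs = json.loads(arguments_json)
        return tool(**kwargs)
    except Exception as e:
        # Report the failure back to the model instead of crashing the agent loop
        return f"Error: {e}"
```

Returning errors as strings lets the agent see what went wrong and decide whether to retry or continue with partial data.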

3. Service Layer (mcp_server.py)

FastMCP Server with Tools

Tools Overview

| Tool | Purpose | Returns |
|------|---------|---------|
| validate_company(name) | Check if company exists | Bool + evidence |
| identify_sector(name) | Find industry classification | Sector name |
| identify_competitors(sector, company) | Discover top 3 rivals | "Comp1, Comp2, Comp3" |
| browse_page(url, instructions) | Extract webpage content | Relevant text |
| generate_report(company, context) | Create analysis report | Markdown report |

Tool Implementation Pattern

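@mcp.tool()
def validate_company(company_name: str) -> str:
    """
    Docstring: Describes tool purpose and parameters
    """
    # Implementation
    try:
        results = web_search_tool(f"{company_name} company")
        evidence_count = analyze_search_results(results)
        # Two or more evidence signals mark the company as valid;
        # the exact wording of the result strings here is illustrative
        if evidence_count >= 2:
            return f"VALID: {company_name} ({evidence_count} evidence signals)"
        return f"NOT VALID: {company_name} ({evidence_count} evidence signals)"
    except Exception as e:
        return f"Error: {str(e)}"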

Web Search Integration

from duckduckgo_search import DDGS

def web_search_tool(query: str) -> str:
    """Unified search interface for all tools"""
    with DDGS() as ddgs:
        results = list(ddgs.text(query, max_results=5))
    return format_results(results)
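
The `format_results` helper called above is not shown in this document. One possible shape is sketched below, assuming each search hit is a dict with `title`, `href`, and `body` keys (the keys `duckduckgo_search` text results normally carry); the output layout is an assumption.

```python
from typing import Dict, List


def format_results(results: List[Dict[str, str]]) -> str:
    """Flatten search hits into numbered plain-text entries."""
    lines = []
    for i, hit in enumerate(results, start=1):
        title = hit.get("title", "")
        url = hit.get("href", "")
        snippet = hit.get("body", "")
        lines.append(f"{i}. {title}\n   {url}\n   {snippet}")
    # An explicit marker makes empty searches easy for tools to detect
    return "\n".join(lines) if lines else "No results found."
```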

Message Flow

Complete Analysis Request

1. USER INTERFACE (Gradio)
   │
   ├─ Company: "Apple"
   └─ OpenAI Key: "sk-..."

2. GRADIO → MCP CLIENT
   │
   ├─ analyze_competitor_landscape("Apple", api_key)
   │
   └─ Creates CompetitiveAnalysisAgent instance

3. MCP CLIENT → OPENAI
   │
   ├─ System: "You are a competitive analyst..."
   ├─ User: "Analyze Apple's competitors"
   │
   ├─ OpenAI responds with:
   │  └─ "Call validate_company('Apple')"

4. MCP CLIENT → MCP SERVER
   │
   ├─ Calls: validate_company("Apple")
   ├─ Calls: identify_sector("Apple")
   ├─ Calls: identify_competitors("Technology", "Apple")
   │
   └─ Receives results for each tool

5. MCP SERVER
   │
   ├─ validate_company()
   │  └─ Web search → DuckDuckGo API → Parse results
   │
   ├─ identify_sector()
   │  └─ Multi-stage search → Keyword analysis → Return sector
   │
   ├─ identify_competitors()
   │  └─ Industry search → Competitor extraction → Ranking
   │
   └─ generate_report()
      └─ Format results → Markdown template → Return report

6. MCP CLIENT SYNTHESIS
   │
   ├─ Compile all tool results
   ├─ Add OpenAI insights
   └─ Return complete report

7. GRADIO DISPLAY
   │
   └─ Render Markdown report to user

Data Flow Diagram

USER INPUT
        │
        ├─ company_name: "Company X"
        ├─ api_key: "sk-xxx"
        │
        ▼
┌──────────────────────┐
│   Input Validation   │
│  (Length, Format)    │
└───────┬──────────────┘
        │
        ▼
┌──────────────────────────────┐
│   OpenAI Agent Planning      │
│  (System + User Messages)    │
└───────┬──────────────────────┘
        │
        ├───────────────────┬───────────────────┬───────────────────┐
        │                   │                   │                   │
        ▼                   ▼                   ▼                   ▼
┌────────────────┐  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐
│ validate_      │  │ identify_      │  │ identify_      │  │ browse_        │
│ company()      │  │ sector()       │  │ competitors()  │  │ page()         │
└───────┬────────┘  └───────┬────────┘  └───────┬────────┘  └───────┬────────┘
        │                   │                   │                   │
        ▼                   ▼                   ▼                   ▼
┌────────────────┐  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐
│ Web Search     │  │ Web Search     │  │ Web Search     │  │ HTTP Get       │
│ DuckDuckGo     │  │ Multi-stage    │  │ Industry       │  │ Parse          │
│ + Analysis     │  │                │  │ Leaders        │  │ HTML           │
└───────┬────────┘  └───────┬────────┘  └───────┬────────┘  └───────┬────────┘
        │                   │                   │                   │
        ▼                   ▼                   ▼                   ▼
   VALIDATION     →     SECTOR ID    →     COMPETITORS   →   ADDITIONAL DATA
        │                   │                   │                   │
        └───────────────────┴─────────┬─────────┴───────────────────┘
                                      │
                                      ▼
                           ┌─────────────────────┐
                           │  generate_report()  │
                           │  (Compile results)  │
                           └──────────┬──────────┘
                                      │
                                      ▼
                           ┌─────────────────────┐
                           │  OpenAI Final       │
                           │  Synthesis          │
                           └──────────┬──────────┘
                                      │
                                      ▼
                                FINAL REPORT
                              (Markdown format)

Tool Implementation Details

Tool 1: validate_company()

# Multi-stage validation
search_results = web_search_tool("Tesla company business official site")

# Evidence signals:
✓ Official website found (.com/.io)
✓ "Official site" or "official website" mention
✓ Company + sector description
✓ Business terminology present
✓ Wikipedia/news mentions

# Result: Evidence count >= 2 → Valid company
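
The evidence rule above can be sketched as a simple signal counter. This is an illustration only: it covers four of the five signals (the "company + sector description" check is omitted), and the keyword lists and function names are assumptions rather than the server's actual code.

```python
import re


def count_evidence(company: str, search_text: str) -> int:
    """Count how many evidence signals appear in the combined search text."""
    text = search_text.lower()
    name = re.escape(company.lower())
    signals = [
        # Official-looking domain such as tesla.com or tesla.io
        re.search(rf"{name}\.(com|io)\b", text) is not None,
        # Explicit "official site" wording
        "official site" in text or "official website" in text,
        # Business terminology (assumed keyword list)
        re.search(r"\b(company|corporation|inc|manufacturer)\b", text) is not None,
        # Third-party coverage
        "wikipedia" in text or "news" in text,
    ]
    return sum(signals)


def is_valid_company(company: str, search_text: str) -> bool:
    # Threshold from the rule above: two or more signals
    return count_evidence(company, search_text) >= 2
```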

Tool 2: identify_sector()

# Three search strategies:
1. "What does Tesla do?" → Extract sector keywords
2. "Tesla industry type" → Direct classification
3. "Tesla sector news" → Financial/news sources

# Sector patterns:
{
  "Technology": ["software", "hardware", "cloud", "ai", ...],
  "Finance": ["banking", "fintech", "insurance", ...],
  "Manufacturing": ["automotive", "industrial", ...],
  ...
}

# Weighted voting to determine primary sector
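
A minimal sketch of the weighted voting step follows. It assumes each search strategy contributes its text with a numeric weight; the weights, the `vote_sector` name, and the truncated keyword lists are illustrative assumptions.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Shortened version of the sector patterns shown above
SECTOR_KEYWORDS: Dict[str, List[str]] = {
    "Technology": ["software", "hardware", "cloud", "ai"],
    "Finance": ["banking", "fintech", "insurance"],
    "Manufacturing": ["automotive", "industrial"],
}


def vote_sector(weighted_texts: List[Tuple[str, float]]) -> str:
    """Pick the sector whose keywords collect the highest weighted count."""
    votes: Counter = Counter()
    for text, weight in weighted_texts:
        words = re.findall(r"[a-z]+", text.lower())
        for sector, keywords in SECTOR_KEYWORDS.items():
            hits = sum(words.count(keyword) for keyword in keywords)
            votes[sector] += weight * hits
    if not votes or max(votes.values()) == 0:
        return "Unknown"
    return votes.most_common(1)[0][0]
```

Matching on whole words avoids false hits such as "ai" inside "maintain".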

Tool 3: identify_competitors()

# Search strategy:
1. "Top technology companies" → Market leaders
2. "Tesla competitors" → Direct rivals
3. "EV industry leaders" → Sector players

# Extraction methods:
- Pattern matching for company names
- List parsing (comma-separated, bulleted)
- Frequency analysis and ranking

# Returns: Top 3 ranked competitors
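
The frequency analysis and ranking step might be implemented along these lines. The sketch assumes name extraction has already produced one candidate list per search; `rank_competitors` is a hypothetical helper, and ties are broken alphabetically as an arbitrary choice.

```python
from collections import Counter
from typing import List


def rank_competitors(candidate_lists: List[List[str]], company: str, top_n: int = 3) -> str:
    """Rank candidate names by the number of searches that mentioned them."""
    counts: Counter = Counter()
    for names in candidate_lists:
        # Count each name once per search so one long list cannot dominate
        for name in {n.strip() for n in names if n.strip()}:
            if name.lower() != company.lower():
                counts[name] += 1
    # Highest count first, then alphabetical for a stable order
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ", ".join(name for name, _ in ranked[:top_n])
```

The result uses the same comma-separated format that the tools overview lists for identify_competitors().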

Tool 4: browse_page()

# Content extraction workflow:
requests.get(url)
  → BeautifulSoup parsing
  → Remove scripts/styles/headers/footers
  → Extract main content divs/articles/paragraphs
  → Keyword matching against instructions
  → Return top N relevant sentences

# Safety: Timeout=10s, max_content=5000 chars
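
The last two steps of the workflow (keyword matching and returning the top N sentences) can be sketched without any network access. The scoring rule below, which counts words shared with the instructions, is an assumption, and `top_relevant_sentences` is a hypothetical helper operating on text that has already been extracted from the page.

```python
import re
from typing import List

MAX_CONTENT_CHARS = 5000  # content cap from the safety note above


def top_relevant_sentences(text: str, instructions: str, n: int = 3) -> List[str]:
    """Return up to n sentences sharing the most words with the instructions."""
    keywords = set(re.findall(r"[a-z]+", instructions.lower()))
    # Split after sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text[:MAX_CONTENT_CHARS].strip())
    scored = []
    for position, sentence in enumerate(sentences):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        score = len(words & keywords)
        if score > 0:
            # Negative score sorts best matches first; position keeps page order on ties
            scored.append((-score, position, sentence))
    return [sentence for _, _, sentence in sorted(scored)[:n]]
```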

Tool 5: generate_report()

# Template-based report generation
report = f"""
# Competitive Analysis Report: {company_name}

## Executive Summary
[Synthesized findings]

## Competitor Comparison
| Competitor | Strategy | Pricing | Products | Market |
|------------|----------|---------|----------|--------|
| [extracted competitors] | - | - | - | - |

## Strategic Insights
[Recommendations]
"""

Error Handling Strategy

Layered Error Handling

Layer 1: Input Validation (Gradio)
  └─ Check company name length
  └─ Validate API key format
  └─ Return user-friendly error

Layer 2: Tool Execution (MCP Server)
  └─ Try/except on each tool
  └─ Timeout protection (10s requests)
  └─ Graceful degradation
  └─ Log detailed errors

Layer 3: Agent Logic (MCP Client)
  └─ API timeout handling
  └─ Rate limit handling
  └─ Fallback to simple analysis
  └─ Return partial results

Layer 4: User Feedback (Gradio)
  └─ Display error with context
  └─ Suggest remediation
  └─ Allow retry
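
The "return partial results" behaviour in Layer 3 can be illustrated with a small runner that records a failure for one step and carries on with the rest. The function name and the step representation (name plus zero-argument callable) are assumptions made for this sketch.

```python
from typing import Callable, Dict, List, Tuple


def run_with_partial_results(steps: List[Tuple[str, Callable[[], str]]]) -> Dict[str, str]:
    """Run each step; store an error string instead of aborting the analysis."""
    results: Dict[str, str] = {}
    for name, step in steps:
        try:
            results[name] = step()
        except Exception as e:
            # A failed step is reported, and later steps still run
            results[name] = f"Error: {e}"
    return results
```

The UI layer can then render the successful sections and show the error text next to the ones that failed.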

Performance Optimization

Caching Strategy

# Web search results: cached for 5 minutes
# Sector identification: reused across tools
# Competitor list: reused in reports
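
A five-minute cache for search results could be as small as the sketch below. The `TTLCache` class is a hypothetical helper, not the project's code; the clock is injectable so expiry can be tested without waiting.

```python
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 300  # five minutes, as stated above


class TTLCache:
    """In-memory cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], str]) -> str:
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh
        value = compute()  # missing or expired: recompute and store
        self._entries[key] = (now, value)
        return value
```

A call such as `cache.get_or_compute(query, lambda: web_search_tool(query))` would then reuse recent results for repeated queries.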

Parallel Tool Execution

# Future enhancement: Run independent tools in parallel
validate_company() (parallel)
identify_sector()  (parallel)
identify_competitors() (sequential, depends on sector)
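
One way to realise this plan is with a thread pool from the standard library, sketched below. The three tool functions are stubs with fixed return values so the example runs on its own; the `analyze` wrapper and its result keys are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def validate_company(name: str) -> str:
    return "VALID"  # stub for the real tool


def identify_sector(name: str) -> str:
    return "Technology"  # stub for the real tool


def identify_competitors(sector: str, company: str) -> str:
    return "Comp1, Comp2, Comp3"  # stub for the real tool


def analyze(company: str) -> Dict[str, str]:
    # Validation and sector lookup do not depend on each other, so run them together
    with ThreadPoolExecutor(max_workers=2) as pool:
        validation_future = pool.submit(validate_company, company)
        sector_future = pool.submit(identify_sector, company)
        validation = validation_future.result()
        sector = sector_future.result()
    # Competitor discovery needs the sector, so it runs afterwards
    competitors = identify_competitors(sector, company)
    return {"validation": validation, "sector": sector, "competitors": competitors}
```

Threads suit this workload because the tools spend most of their time waiting on network responses.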

Rate Limiting

# DuckDuckGo: 2-second delay between searches
# OpenAI: Batched requests, monitoring quota
# HTTP: 10-second timeout, connection pooling
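
The delay between searches can be enforced with a minimum-interval limiter, sketched here under the assumption that searches are issued from a single thread. `MinIntervalLimiter` is a hypothetical helper; the clock and sleep functions are injectable for testing.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Ensure at least `interval` seconds pass between consecutive calls."""

    def __init__(self, interval: float = 2.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self._interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        delay = 0.0
        if self._last_call is not None:
            elapsed = self._clock() - self._last_call
            delay = max(0.0, self._interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last_call = self._clock()
        return delay
```

Calling `limiter.wait()` immediately before each search keeps requests spaced out without delaying the first one.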

Security Considerations

API Key Handling

# Keys accepted via:
✓ UI input field (temporary in memory)
✗ NOT stored in files
✗ NOT logged in output
✗ NOT persisted in database

# Environment variables optional:
Optional: Load from .env via python-dotenv
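
A sketch of how key resolution and safe logging might fit together is shown below. The `OPENAI_API_KEY` variable name follows the OpenAI SDK convention but is an assumption here, as are the helper names; loading a .env file with python-dotenv would happen before this code runs.

```python
import os
from typing import Optional


def resolve_api_key(ui_value: str) -> Optional[str]:
    """Prefer the key typed into the UI; fall back to the environment variable."""
    key = ui_value.strip() or os.environ.get("OPENAI_API_KEY", "").strip()
    return key or None


def mask_key(key: str) -> str:
    """Render a key for log messages without exposing its secret part."""
    return key[:3] + "..." if len(key) > 3 else "..."
```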

Data Privacy

# Web search results: Kept only in the 5-minute in-memory cache, then discarded
# Company data: Not written to disk or a database
# User queries: Not logged or tracked
# Report generation: All local processing

Web Scraping Safety

# User-Agent header provided (browser-style identification)
# robots.txt respected (must be checked explicitly; BeautifulSoup only parses HTML)
# Timeout protection (10 seconds)
# Error handling for blocked requests
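
A robots.txt check can be done with the standard library's `urllib.robotparser`. The sketch below parses rules that have already been downloaded, so it runs offline; the user-agent string and the `is_allowed` helper are hypothetical.

```python
from typing import List
from urllib.robotparser import RobotFileParser

USER_AGENT = "CompetitiveAnalysisBot"  # hypothetical user-agent name


def is_allowed(robots_lines: List[str], url: str, user_agent: str = USER_AGENT) -> bool:
    """Check a URL against robots.txt rules given as a list of lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)
```

browse_page() could call such a check before issuing its HTTP request and return a clear message when fetching is disallowed.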

Extension Points

Adding New Tools

@mcp.tool()
def custom_tool(param1: str, param2: int) -> str:
    """
    Your custom tool description.
    Args:
        param1: Parameter 1 description
        param2: Parameter 2 description
    Returns:
        str: Result description
    """
    try:
        # Implementation
        result = some_operation(param1, param2)
        return result
    except Exception as e:
        return f"Error: {str(e)}"

Modifying Agent Behavior

# In mcp_client.py, edit system_prompt:
system_prompt = """
Updated instructions for agent behavior
"""

# Or add initial human message:
messages.append({
    "role": "user",
    "content": "Additional analysis request..."
})

Customizing Report Generation

# In mcp_server.py, edit generate_report() template:
report = f"""
# Custom Report Format

Your custom structure here...
"""

Testing

Manual Testing

# Test MCP Server
python mcp_server.py

# Test MCP Client functions
python -c "from mcp_client import analyze_competitor_landscape; print(analyze_competitor_landscape('Microsoft', 'sk-...'))"

# Test Gradio UI
python app.py
# Navigate to http://localhost:7860

Validation Tests

# Test validate_company()
assert "VALID" in validate_company("Google")
assert "NOT" in validate_company("FakeCompanyXYZ123")

# Test identify_sector()
assert "Technology" in identify_sector("Microsoft")
assert "Finance" in identify_sector("JPMorgan")

# Test competitor discovery
competitors = identify_competitors("Technology", "Google")
# The tool returns a comma-separated string, so count names rather than characters
assert len(competitors.split(",")) <= 3

Future Enhancements

  1. Real-time Market Data: Integrate financial APIs (Alpha Vantage, etc.)
  2. Sentiment Analysis: Analyze news sentiment about companies
  3. Patent Analysis: Include R&D insights from patents
  4. Social Media: Monitor competitor social media activity
  5. Pricing Intelligence: Track price changes over time
  6. SWOT Matrix: Generate structured SWOT analysis
  7. Visualization: Create charts and graphs
  8. PDF Export: Generate PDF reports
  9. Multi-company Batch: Analyze multiple companies
  10. Integration APIs: Connect to Slack, Salesforce, etc.

Conclusion

The MCP architecture provides:

  • ✅ Modularity and extensibility
  • ✅ Clear separation of concerns
  • ✅ Robust error handling
  • ✅ Scalability for future enhancements
  • ✅ Production-ready design
  • ✅ Easy tool management

This design enables rapid development, maintenance, and deployment of AI-powered competitive analysis systems.


Document Version: 1.0
Last Updated: March 2026