Qdonnars Cursor committed on
Commit bad6218 · 1 Parent(s): b90d5d5

feat: Implement MCP Server for Indicateurs Territoriaux API


Add complete MCP server exposing 4 tools for querying French territorial
ecological indicators via the CGDD/Ministry Cube.js API.

Tools implemented:
- list_indicators: List all indicators with thematique/maille filters
- get_indicator_details: Get full metadata and sources for an indicator
- query_indicator_data: Query data values by geographic level and code
- search_indicators: Full-text search in indicator names/descriptions

Architecture:
- Gradio SSE endpoint at /gradio_api/mcp/ for Claude.ai integration
- CubeResolver for mapping indicator_id to data cubes via /meta parsing
- Metadata cache with periodic refresh
- Async httpx client with proper error handling

Cube naming convention discovered:
- Data cubes: {thematique}_{maille} (e.g., conso_enaf_com)
- Measures: {cube}.id_{indicator_id} (e.g., conso_enaf_com.id_611)
- Geo dimensions: geocode_*/libelle_* for all levels
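
The naming convention above can be sketched as a small helper. This is an illustrative snippet, not part of the committed code; the suffix mapping (`_com`/`_epci`/`_dpt`/`_reg`) follows the table documented in README.md.

```python
# Hypothetical helper illustrating the cube/measure naming convention above.
# Not part of the committed code; suffixes per the README table.
MAILLE_SUFFIX = {"commune": "com", "epci": "epci", "departement": "dpt", "region": "reg"}

def measure_name(thematique: str, maille: str, indicator_id: int) -> str:
    """Build the Cube.js measure name {thematique}_{maille}.id_{indicator_id}."""
    cube = f"{thematique}_{MAILLE_SUFFIX[maille]}"
    return f"{cube}.id_{indicator_id}"

print(measure_name("conso_enaf", "commune", 611))  # conso_enaf_com.id_611
```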

Co-authored-by: Cursor <cursoragent@cursor.com>

Files changed (11)
  1. .env.example +9 -0
  2. .gitignore +21 -0
  3. README.md +290 -2
  4. app.py +186 -0
  5. requirements.txt +11 -0
  6. src/__init__.py +3 -0
  7. src/api_client.py +317 -0
  8. src/cache.py +299 -0
  9. src/cube_resolver.py +286 -0
  10. src/models.py +239 -0
  11. src/tools.py +354 -0
.env.example ADDED
@@ -0,0 +1,9 @@
+ # API Token for the Indicateurs Territoriaux API
+ # Get your token from the API provider
+ INDICATEURS_TE_TOKEN=your_jwt_token_here
+
+ # Base URL of the API (default: production)
+ INDICATEURS_TE_BASE_URL=https://api.indicateurs.ecologie.gouv.fr
+
+ # Cache refresh interval in seconds (default: 1 hour)
+ CACHE_REFRESH_SECONDS=3600
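
A minimal sketch of how these variables are consumed, mirroring the defaults in `src/api_client.py` and `src/cache.py` (the variable names below are local to the sketch):

```python
# Sketch: reading the .env.example configuration with the same fallbacks
# used by src/api_client.py and src/cache.py.
import os

token = os.getenv("INDICATEURS_TE_TOKEN")  # required; API calls fail without it
base_url = os.getenv("INDICATEURS_TE_BASE_URL", "https://api.indicateurs.ecologie.gouv.fr")
refresh_seconds = int(os.getenv("CACHE_REFRESH_SECONDS", "3600"))
print(base_url, refresh_seconds)
```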
.gitignore ADDED
@@ -0,0 +1,21 @@
+ # Environment
+ .env
+ venv/
+ __pycache__/
+ *.pyc
+
+ # IDE
+ .vscode/
+ .idea/
+
+ # Logs
+ *.log
+ server.log
+
+ # Test/debug files
+ cubes_list.json
+ cubes_structure.json
+
+ # OS
+ .DS_Store
+ Thumbs.db
README.md CHANGED
@@ -4,9 +4,297 @@ emoji: 📉
  colorFrom: blue
  colorTo: pink
  sdk: gradio
- sdk_version: 6.5.1
+ sdk_version: 5.0.0
  app_file: app.py
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # MCP Server - Indicateurs Territoriaux de Transition Écologique
+
+ MCP (Model Context Protocol) server exposing the API of the **Hub d'Indicateurs Territoriaux de Transition Écologique** (CGDD / French Ministry of Ecological Transition).
+
+ This server lets LLMs (Claude, GPT, etc.) query French territorial environmental data.
+
+ ## Available MCP tools
+
+ ### 1. `list_indicators`
+ Lists all indicators, with optional filters.
+
+ **Parameters:**
+ - `thematique` (optional): filter by FNV theme ("mieux se déplacer", "mieux se loger"...)
+ - `maille` (optional): filter by geographic level ("region", "departement", "epci", "commune")
+
+ ### 2. `get_indicator_details`
+ Returns the full details of an indicator (description, calculation method, sources).
+
+ **Parameters:**
+ - `indicator_id`: numeric ID of the indicator
+
+ ### 3. `query_indicator_data`
+ Queries an indicator's data for a territory.
+
+ **Parameters:**
+ - `indicator_id`: indicator ID
+ - `geographic_level`: "region" | "departement" | "epci" | "commune"
+ - `geographic_code` (optional): INSEE code of the territory
+ - `year` (optional): year of the data
+
+ ### 4. `search_indicators`
+ Searches indicators by keyword.
+
+ **Parameters:**
+ - `query`: search terms (matched against name and description)
+
+ ## Installation
+
+ ### Prerequisites
+
+ - Python >= 3.10
+ - An authentication token for the Indicateurs API
+
+ ### Local installation
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/your-repo/mcp-indicateurs-te.git
+ cd mcp-indicateurs-te
+
+ # Create a virtual environment
+ python -m venv venv
+ source venv/bin/activate  # Linux/Mac
+ # or: venv\Scripts\activate  # Windows
+
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Configure the token
+ cp .env.example .env
+ # Edit .env and add your INDICATEURS_TE_TOKEN
+
+ # Start the server
+ python app.py
+ ```
+
+ The server will be available at `http://localhost:7860`.
+
+ ### Deploying to HuggingFace Spaces
+
+ 1. Create a new Space with the Gradio SDK
+ 2. Push the code to the Space
+ 3. Set the `INDICATEURS_TE_TOKEN` secret in the Space settings
+
+ ## MCP client configuration
+
+ ### Claude Desktop
+
+ Add to `claude_desktop_config.json`:
+
+ ```json
+ {
+   "mcpServers": {
+     "indicateurs-te": {
+       "url": "https://YOUR-SPACE.hf.space/gradio_api/mcp/"
+     }
+   }
+ }
+ ```
+
+ ### Cursor
+
+ Add to Cursor's MCP settings:
+
+ ```json
+ {
+   "mcpServers": {
+     "indicateurs-te": {
+       "url": "http://localhost:7860/gradio_api/mcp/"
+     }
+   }
+ }
+ ```
+
+ ### With mcp-remote (for clients without HTTP support)
+
+ ```json
+ {
+   "mcpServers": {
+     "indicateurs-te": {
+       "command": "npx",
+       "args": [
+         "mcp-remote",
+         "http://localhost:7860/gradio_api/mcp/"
+       ]
+     }
+   }
+ }
+ ```
+
+ ## API architecture (validated by tests)
+
+ ### Cube naming convention
+
+ **Data** cubes follow the format `{thematique}_{maille}`:
+
+ | Suffix | Maille |
+ |---------|--------|
+ | `_com` | Commune |
+ | `_epci` | EPCI |
+ | `_dpt` | Département |
+ | `_reg` | Région |
+
+ Examples:
+ - `conso_enaf_com` → ENAF consumption, commune level
+ - `surface_bio_dpt` → Organic farming area, département level
+
+ ### Measures embed the indicator ID
+
+ Format: `{cube_name}.id_{indicator_id}`
+
+ Examples:
+ - `conso_enaf_com.id_611` → indicator 611 in the conso_enaf_com cube
+ - `surface_bio_dpt.id_606` → indicator 606 in the surface_bio_dpt cube
+
+ ### Geographic dimensions (standardized)
+
+ | Dimension | Description |
+ |-----------|-------------|
+ | `geocode_commune` | INSEE commune code (5 digits) |
+ | `libelle_commune` | Commune name |
+ | `geocode_epci` | EPCI SIREN code (9 digits) |
+ | `libelle_epci` | EPCI name |
+ | `geocode_departement` | Département code (2-3 chars) |
+ | `libelle_departement` | Département name |
+ | `geocode_region` | Région code (2 digits) |
+ | `libelle_region` | Région name |
+
+ ### Temporal dimensions
+
+ | Dimension | Description |
+ |-----------|-------------|
+ | `annee` | Year (string: "2020") |
+
+ ## Usage examples
+
+ ### Via an LLM
+
+ ```
+ User: Which indicators cover land consumption?
+
+ LLM: [calls search_indicators("consommation espace")]
+ Available indicators:
+ - ID 611: Consommation d'espaces naturels, agricoles et forestiers
+
+ User: Details on indicator 611
+
+ LLM: [calls get_indicator_details("611")]
+ Indicator 611 measures ENAF consumption...
+ Available levels: commune, epci, departement, region
+
+ User: Values for the PACA region in 2020
+
+ LLM: [calls query_indicator_data("611", "region", "93", "2020")]
+ For PACA (code 93) in 2020: 1737.29 ha
+ ```
+
+ ### Validated Cube.js query example
+
+ ```json
+ {
+   "query": {
+     "measures": ["conso_enaf_com.id_611"],
+     "dimensions": [
+       "conso_enaf_com.libelle_region",
+       "conso_enaf_com.annee"
+     ],
+     "filters": [
+       {
+         "member": "conso_enaf_com.geocode_region",
+         "operator": "equals",
+         "values": ["93"]
+       }
+     ],
+     "limit": 100
+   }
+ }
+ ```
+
+ ## INSEE geographic codes
+
+ | Level | Format | Examples |
+ |--------|--------|----------|
+ | Région | 2 digits | 93 (PACA), 11 (Île-de-France), 75 (Nouvelle-Aquitaine), 84 (Auvergne-Rhône-Alpes) |
+ | Département | 2-3 chars | 13, 2A, 974 |
+ | EPCI | 9 digits (SIREN) | 200054807 |
+ | Commune | 5 digits | 75056 (Paris), 13055 (Marseille) |
+
+ ## Environment variables
+
+ | Variable | Description | Default |
+ |----------|-------------|--------|
+ | `INDICATEURS_TE_TOKEN` | JWT token for the API | (required) |
+ | `INDICATEURS_TE_BASE_URL` | Base URL of the API | `https://api.indicateurs.ecologie.gouv.fr` |
+ | `CACHE_REFRESH_SECONDS` | Cache refresh interval | 3600 |
+
+ ## Technical architecture
+
+ ```
+ ┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
+ │   MCP Client    │────▶│   Gradio App     │────▶│   Cube.js API   │
+ │ (Claude, etc.)  │     │  (MCP Server)    │     │ (Indicateurs)   │
+ └─────────────────┘     └──────────────────┘     └─────────────────┘
+                                 │
+                          ┌──────┴──────┐
+                          │ CubeResolver │
+                          │   + Cache    │
+                          └─────────────┘
+ ```
+
+ - **Gradio**: web UI + MCP SSE endpoint
+ - **CubeResolver**: maps indicator_id → cube_name by parsing /meta
+ - **Cache**: indicator metadata loaded at startup
+
+ ## Project structure
+
+ ```
+ ├── src/
+ │   ├── __init__.py        # Package init
+ │   ├── api_client.py      # Async Cube.js HTTP client
+ │   ├── cube_resolver.py   # find_cube_for_indicator logic
+ │   ├── cache.py           # Metadata cache
+ │   ├── models.py          # Pydantic models
+ │   └── tools.py           # Implementation of the 4 MCP tools
+ ├── app.py                 # Gradio entry point
+ ├── requirements.txt       # Dependencies
+ └── .env.example           # Configuration template
+ ```
+
+ ## Development
+
+ ### Testing with MCP Inspector
+
+ ```bash
+ # Start the server
+ python app.py
+
+ # In another terminal
+ npx @modelcontextprotocol/inspector
+ # Connect to http://localhost:7860/gradio_api/mcp/
+ ```
+
+ ### Caveats
+
+ 1. **/meta cache**: ~100+ cubes, loaded once at startup
+ 2. **indicator_id → cube mapping**: walks the measures of each cube
+ 3. **Non-uniform levels**: check `mailles_disponibles` before querying
+ 4. **String values**: Cube.js filters expect strings (`"93"`, not `93`)
+
+ ## Resources
+
+ - [MCP documentation](https://modelcontextprotocol.io/)
+ - [Gradio MCP Guide](https://gradio.app/guides/building-mcp-server-with-gradio)
+ - [Cube.js API](https://cube.dev/docs/rest-api)
+ - [Indicators portal](https://ecologie.data.gouv.fr/indicators)
+
+ ## License
+
+ MIT
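
The validated Cube.js query shape shown in the README can be assembled programmatically. The helper below is illustrative only (the function name and signature are not part of the committed code); note that filter values must be strings, per the caveats above.

```python
# Illustrative builder for the validated query shape from the README.
# Not part of the committed code; dimensions follow the README example.
def build_query(measure: str, geocode_dim: str, geocode, limit: int = 100) -> dict:
    cube = measure.split(".")[0]
    return {
        "measures": [measure],
        "dimensions": [f"{cube}.libelle_region", f"{cube}.annee"],
        "filters": [
            # Cube.js filter values must be strings ("93", not 93)
            {"member": geocode_dim, "operator": "equals", "values": [str(geocode)]}
        ],
        "limit": limit,
    }

q = build_query("conso_enaf_com.id_611", "conso_enaf_com.geocode_region", 93)
print(q["filters"][0]["values"])  # ['93']
```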
app.py ADDED
@@ -0,0 +1,186 @@
+ """Gradio MCP Server for Indicateurs Territoriaux de Transition Écologique.
+
+ This application exposes 4 MCP tools for querying French territorial
+ ecological indicators via the Cube.js API.
+
+ Tools:
+ - list_indicators: List all indicators with optional filters
+ - get_indicator_details: Get detailed info about a specific indicator
+ - query_indicator_data: Query data values for a territory
+ - search_indicators: Search indicators by keywords
+
+ Usage:
+     Run locally:
+         python app.py
+
+     Deploy on HuggingFace Spaces:
+         Push to a Space with Gradio SDK configured.
+
+     Connect as MCP Server:
+         URL: http://your-server:7860/gradio_api/mcp/
+ """
+
+ import os
+ import gradio as gr
+ from dotenv import load_dotenv
+
+ # Load environment variables
+ load_dotenv()
+
+ # Import tools
+ from src.tools import (
+     list_indicators,
+     get_indicator_details,
+     query_indicator_data,
+     search_indicators,
+ )
+ from src.models import GEOGRAPHIC_LEVELS
+
+ # Check if token is configured
+ if not os.getenv("INDICATEURS_TE_TOKEN"):
+     print("WARNING: INDICATEURS_TE_TOKEN not set. API calls will fail.")
+     print("Set the token in .env file or as environment variable.")
+
+
+ # Create individual interfaces for each tool
+ list_interface = gr.Interface(
+     fn=list_indicators,
+     inputs=[
+         gr.Textbox(
+             label="Thématique FNV",
+             placeholder="Ex: mieux se déplacer, mieux se loger...",
+             info="Filtre par thématique France Nation Verte (recherche partielle)",
+         ),
+         gr.Dropdown(
+             choices=[""] + GEOGRAPHIC_LEVELS,
+             label="Maille géographique",
+             info="Filtre par niveau géographique disponible",
+         ),
+     ],
+     outputs=gr.JSON(label="Indicateurs"),
+     title="Lister les indicateurs",
+     description="Liste tous les indicateurs disponibles avec filtres optionnels.",
+     api_name="list_indicators",
+ )
+
+ details_interface = gr.Interface(
+     fn=get_indicator_details,
+     inputs=[
+         gr.Textbox(
+             label="ID de l'indicateur",
+             placeholder="Ex: 611",
+             info="Identifiant numérique de l'indicateur",
+         ),
+     ],
+     outputs=gr.JSON(label="Détails"),
+     title="Détails d'un indicateur",
+     description="Retourne les métadonnées complètes et les sources d'un indicateur.",
+     api_name="get_indicator_details",
+ )
+
+ query_interface = gr.Interface(
+     fn=query_indicator_data,
+     inputs=[
+         gr.Textbox(
+             label="ID de l'indicateur",
+             placeholder="Ex: 611",
+             info="Identifiant numérique de l'indicateur",
+         ),
+         gr.Dropdown(
+             choices=GEOGRAPHIC_LEVELS,
+             label="Niveau géographique",
+             value="region",
+             info="Maille territoriale à interroger",
+         ),
+         gr.Textbox(
+             label="Code INSEE",
+             placeholder="Ex: 93 (PACA), 13 (Bouches-du-Rhône)...",
+             info="Code du territoire (optionnel)",
+         ),
+         gr.Textbox(
+             label="Année",
+             placeholder="Ex: 2020",
+             info="Année des données (optionnel)",
+         ),
+     ],
+     outputs=gr.JSON(label="Données"),
+     title="Interroger les données",
+     description="Récupère les valeurs d'un indicateur pour un territoire donné.",
+     api_name="query_indicator_data",
+ )
+
+ search_interface = gr.Interface(
+     fn=search_indicators,
+     inputs=[
+         gr.Textbox(
+             label="Recherche",
+             placeholder="Ex: consommation espace, surface bio, émissions CO2...",
+             info="Mots-clés à rechercher dans le nom et la description",
+         ),
+     ],
+     outputs=gr.JSON(label="Résultats"),
+     title="Rechercher des indicateurs",
+     description="Recherche des indicateurs par mots-clés.",
+     api_name="search_indicators",
+ )
+
+ # Combine all interfaces into a tabbed interface
+ demo = gr.TabbedInterface(
+     interface_list=[
+         list_interface,
+         search_interface,
+         details_interface,
+         query_interface,
+     ],
+     tab_names=[
+         "Lister",
+         "Rechercher",
+         "Détails",
+         "Données",
+     ],
+     title="MCP Server - Indicateurs Territoriaux de Transition Écologique",
+ )
+
+ # Add a description block
+ with demo:
+     gr.Markdown(
+         """
+ ---
+ ### Connexion MCP
+
+ Pour utiliser ce serveur comme outil MCP dans Claude Desktop, Cursor ou autre client MCP :
+
+ ```json
+ {
+   "mcpServers": {
+     "indicateurs-te": {
+       "url": "https://YOUR-SPACE.hf.space/gradio_api/mcp/"
+     }
+   }
+ }
+ ```
+
+ ### Structure des données
+
+ Les cubes de données suivent le format `{thematique}_{maille}` :
+ - `conso_enaf_com` → Consommation ENAF, maille commune
+ - `surface_bio_dpt` → Surface bio, maille département
+
+ Les measures contiennent l'ID de l'indicateur : `{cube}.id_{indicator_id}`
+
+ ### API Cube.js
+
+ Ce serveur interroge l'API du Hub d'Indicateurs Territoriaux du Ministère de la Transition Écologique.
+
+ - Documentation : [ecologie.data.gouv.fr/indicators](https://ecologie.data.gouv.fr/indicators)
+ - API : `https://api.indicateurs.ecologie.gouv.fr`
+ """
+     )
+
+
+ if __name__ == "__main__":
+     demo.launch(
+         mcp_server=True,
+         server_name="0.0.0.0",
+         server_port=7860,
+     )
requirements.txt ADDED
@@ -0,0 +1,11 @@
+ # MCP Server for Indicateurs Territoriaux de Transition Écologique
+ # Python >= 3.10 required
+
+ # Core dependencies
+ gradio[mcp]>=5.0.0
+ httpx>=0.27.0
+ pydantic>=2.0.0
+ python-dotenv>=1.0.0
+
+ # For async support
+ anyio>=4.0.0
src/__init__.py ADDED
@@ -0,0 +1,3 @@
+ """MCP Server for French Territorial Ecological Indicators."""
+
+ __version__ = "0.1.0"
src/api_client.py ADDED
@@ -0,0 +1,317 @@
+ """HTTP client for the Cube.js API of Indicateurs Territoriaux."""
+
+ import os
+ from typing import Any
+
+ import httpx
+ from dotenv import load_dotenv
+
+ load_dotenv()
+
+
+ class CubeJsClientError(Exception):
+     """Base exception for Cube.js client errors."""
+
+     pass
+
+
+ class AuthenticationError(CubeJsClientError):
+     """Raised when authentication fails (401)."""
+
+     pass
+
+
+ class BadRequestError(CubeJsClientError):
+     """Raised when the request is malformed (400)."""
+
+     pass
+
+
+ class CubeJsClient:
+     """HTTP client for the Cube.js REST API.
+
+     This client handles authentication and provides methods to interact
+     with the Indicateurs Territoriaux API endpoints.
+     """
+
+     def __init__(
+         self,
+         base_url: str | None = None,
+         token: str | None = None,
+         timeout: float = 30.0,
+     ):
+         """Initialize the Cube.js client.
+
+         Args:
+             base_url: Base URL of the API. Defaults to env var INDICATEURS_TE_BASE_URL.
+             token: JWT authentication token. Defaults to env var INDICATEURS_TE_TOKEN.
+             timeout: Request timeout in seconds.
+         """
+         self.base_url = (
+             base_url
+             or os.getenv("INDICATEURS_TE_BASE_URL")
+             or "https://api.indicateurs.ecologie.gouv.fr"
+         )
+         self.token = token or os.getenv("INDICATEURS_TE_TOKEN")
+
+         if not self.token:
+             raise ValueError(
+                 "No API token provided. Set INDICATEURS_TE_TOKEN environment variable "
+                 "or pass token parameter."
+             )
+
+         self.timeout = timeout
+         self._client: httpx.AsyncClient | None = None
+
+     @property
+     def headers(self) -> dict[str, str]:
+         """HTTP headers for API requests."""
+         return {
+             "Authorization": f"Bearer {self.token}",
+             "Content-Type": "application/json",
+         }
+
+     async def _get_client(self) -> httpx.AsyncClient:
+         """Get or create the async HTTP client."""
+         if self._client is None or self._client.is_closed:
+             self._client = httpx.AsyncClient(
+                 base_url=self.base_url,
+                 headers=self.headers,
+                 timeout=self.timeout,
+             )
+         return self._client
+
+     async def close(self) -> None:
+         """Close the HTTP client."""
+         if self._client is not None and not self._client.is_closed:
+             await self._client.aclose()
+             self._client = None
+
+     async def _handle_response(self, response: httpx.Response) -> dict[str, Any]:
+         """Handle API response and raise appropriate errors.
+
+         Args:
+             response: The HTTP response object.
+
+         Returns:
+             Parsed JSON response.
+
+         Raises:
+             AuthenticationError: If the token is invalid or expired (401).
+             BadRequestError: If the request is malformed (400).
+             CubeJsClientError: For other HTTP errors.
+         """
+         if response.status_code == 401:
+             raise AuthenticationError(
+                 "Authentication failed. Your API token may be invalid or expired. "
+                 "Please check your INDICATEURS_TE_TOKEN environment variable."
+             )
+
+         if response.status_code == 400:
+             try:
+                 error_detail = response.json()
+             except Exception:
+                 error_detail = response.text
+             raise BadRequestError(
+                 f"Bad request to API. Details: {error_detail}"
+             )
+
+         if response.status_code >= 400:
+             raise CubeJsClientError(
+                 f"API request failed with status {response.status_code}: {response.text}"
+             )
+
+         return response.json()
+
+     async def get_meta(self) -> dict[str, Any]:
+         """Fetch the API schema metadata.
+
+         Returns the complete schema including all cubes, their measures,
+         dimensions, and available filters.
+
+         Returns:
+             Dict containing the API metadata with 'cubes' key.
+
+         Raises:
+             AuthenticationError: If authentication fails.
+             CubeJsClientError: For other API errors.
+         """
+         client = await self._get_client()
+         response = await client.get("/cubejs-api/v1/meta")
+         return await self._handle_response(response)
+
+     async def load(self, query: dict[str, Any]) -> dict[str, Any]:
+         """Execute a data query against the Cube.js API.
+
+         Args:
+             query: The Cube.js query object containing measures, dimensions,
+                 filters, and other query parameters.
+
+         Returns:
+             Dict containing the query results with 'data' key.
+
+         Raises:
+             AuthenticationError: If authentication fails.
+             BadRequestError: If the query is malformed.
+             CubeJsClientError: For other API errors.
+
+         Example:
+             >>> query = {
+             ...     "measures": ["indicateur_metadata.count"],
+             ...     "dimensions": ["indicateur_metadata.id", "indicateur_metadata.libelle"],
+             ...     "limit": 10
+             ... }
+             >>> result = await client.load(query)
+         """
+         client = await self._get_client()
+         response = await client.post(
+             "/cubejs-api/v1/load",
+             json={"query": query},
+         )
+         return await self._handle_response(response)
+
+     async def load_indicators_metadata(
+         self,
+         dimensions: list[str] | None = None,
+         filters: list[dict[str, Any]] | None = None,
+         limit: int = 500,
+     ) -> list[dict[str, Any]]:
+         """Load indicator metadata from the indicateur_metadata cube.
+
+         Convenience method for querying the indicator metadata cube.
+
+         Args:
+             dimensions: List of dimensions to fetch. Defaults to basic info.
+             filters: Optional list of filters to apply.
+             limit: Maximum number of results.
+
+         Returns:
+             List of indicator metadata records.
+         """
+         if dimensions is None:
+             dimensions = [
+                 "indicateur_metadata.id",
+                 "indicateur_metadata.libelle",
+                 "indicateur_metadata.unite",
+                 "indicateur_metadata.description",
+                 "indicateur_metadata.mailles_disponibles",
+                 "indicateur_metadata.thematique_fnv",
+                 "indicateur_metadata.annees_disponibles",
+             ]
+
+         query: dict[str, Any] = {
+             "dimensions": dimensions,
+             "limit": limit,
+         }
+
+         if filters:
+             query["filters"] = filters
+
+         result = await self.load(query)
+         return result.get("data", [])
+
+     async def load_sources_metadata(
+         self,
+         indicator_id: int | None = None,
+         limit: int = 100,
+     ) -> list[dict[str, Any]]:
+         """Load source metadata from the indicateur_x_source_metadata cube.
+
+         Args:
+             indicator_id: Optional indicator ID to filter sources.
+             limit: Maximum number of results.
+
+         Returns:
+             List of source metadata records.
+         """
+         dimensions = [
+             "indicateur_x_source_metadata.id_indicateur",
+             "indicateur_x_source_metadata.nom_source",
+             "indicateur_x_source_metadata.libelle",
+             "indicateur_x_source_metadata.description",
+             "indicateur_x_source_metadata.producteur_source",
+             "indicateur_x_source_metadata.distributeur_source",
+             "indicateur_x_source_metadata.license_source",
+             "indicateur_x_source_metadata.lien_page",
+             "indicateur_x_source_metadata.date_derniere_extraction",
+         ]
+
+         query: dict[str, Any] = {
+             "dimensions": dimensions,
+             "limit": limit,
+         }
+
+         if indicator_id is not None:
+             query["filters"] = [
+                 {
+                     "member": "indicateur_x_source_metadata.id_indicateur",
+                     "operator": "equals",
+                     "values": [str(indicator_id)],
+                 }
+             ]
+
+         result = await self.load(query)
+         return result.get("data", [])
+
+     async def search_indicators_by_libelle(
+         self,
+         search_term: str,
+         limit: int = 50,
+     ) -> list[dict[str, Any]]:
+         """Search indicators by keyword in libelle using a contains filter.
+
+         This uses the Cube.js contains operator for server-side filtering.
+         Note: limited to a single term; for multi-term search, filter client-side.
+
+         Args:
+             search_term: Term to search for in indicator libelle.
+             limit: Maximum number of results.
+
+         Returns:
+             List of matching indicator metadata records.
+         """
+         query: dict[str, Any] = {
+             "dimensions": [
+                 "indicateur_metadata.id",
+                 "indicateur_metadata.libelle",
+                 "indicateur_metadata.description",
+                 "indicateur_metadata.unite",
+                 "indicateur_metadata.mailles_disponibles",
+                 "indicateur_metadata.thematique_fnv",
+             ],
+             "filters": [
+                 {
+                     "member": "indicateur_metadata.libelle",
+                     "operator": "contains",
+                     "values": [search_term],
+                 }
+             ],
+             "limit": limit,
+         }
+
+         result = await self.load(query)
+         return result.get("data", [])
+
+
+ # Singleton instance for the application
+ _client_instance: CubeJsClient | None = None
+
+
+ def get_client() -> CubeJsClient:
+     """Get or create the singleton CubeJsClient instance.
+
+     Returns:
+         The shared CubeJsClient instance.
+     """
+     global _client_instance
+     if _client_instance is None:
+         _client_instance = CubeJsClient()
+     return _client_instance
+
+
+ async def close_client() -> None:
+     """Close the singleton client instance."""
+     global _client_instance
+     if _client_instance is not None:
+         await _client_instance.close()
+         _client_instance = None
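
The commit message describes a CubeResolver that maps indicator_id to a data cube by parsing /meta. The committed `src/cube_resolver.py` is the authoritative implementation; the self-contained sketch below only illustrates the idea, assuming the `cubes → measures → name` shape of the Cube.js /meta payload and the `{cube}.id_{indicator_id}` measure convention noted above.

```python
# Hedged sketch of the indicator_id -> cube mapping described in the commit
# message; src/cube_resolver.py is the authoritative implementation.
import re

def build_indicator_index(meta: dict) -> dict:
    """Map each id_{indicator_id} measure found in /meta to its cube name(s)."""
    index: dict = {}
    for cube in meta.get("cubes", []):
        for measure in cube.get("measures", []):
            # Measure names follow the convention "{cube}.id_{indicator_id}"
            m = re.fullmatch(r"(?P<cube>\w+)\.id_(?P<id>\d+)", measure["name"])
            if m:
                index.setdefault(int(m.group("id")), []).append(m.group("cube"))
    return index

# An indicator can exist at several mailles, hence the list of cubes per ID.
meta = {"cubes": [
    {"name": "conso_enaf_com", "measures": [{"name": "conso_enaf_com.id_611"}]},
    {"name": "conso_enaf_dpt", "measures": [{"name": "conso_enaf_dpt.id_611"}]},
]}
print(build_indicator_index(meta))  # {611: ['conso_enaf_com', 'conso_enaf_dpt']}
```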
src/cache.py ADDED
@@ -0,0 +1,299 @@
+ """Metadata cache for indicators and cube mappings."""
+
+ import asyncio
+ import os
+ from datetime import datetime, timedelta
+ from typing import Any
+
+ from .api_client import CubeJsClient, get_client
+ from .cube_resolver import CubeResolver, get_resolver
+ from .models import IndicatorMetadata, IndicatorListItem
+
+
+ class IndicatorCache:
+     """Cache for indicator metadata and cube resolution.
+
+     This cache stores indicator metadata loaded at startup and periodically
+     refreshes to pick up new indicators. It also initializes the CubeResolver
+     for mapping indicator IDs to data cubes.
+     """
+
+     def __init__(
+         self,
+         refresh_interval_seconds: int | None = None,
+     ):
+         """Initialize the cache.
+
+         Args:
+             refresh_interval_seconds: How often to refresh the cache.
+                 Defaults to CACHE_REFRESH_SECONDS env var or 3600 (1 hour).
+         """
+         self.refresh_interval = timedelta(
+             seconds=refresh_interval_seconds
+             or int(os.getenv("CACHE_REFRESH_SECONDS", "3600"))
+         )
+
+         # Indicator metadata by ID
+         self._indicators: dict[int, IndicatorMetadata] = {}
+
+         # Reference to the cube resolver
+         self._resolver: CubeResolver = get_resolver()
+
+         # Last refresh timestamp
+         self._last_refresh: datetime | None = None
+
+         # Lock for thread-safe refresh
+         self._refresh_lock = asyncio.Lock()
+
+         # Flag to indicate if initial load is complete
+         self._initialized = False
+
+     @property
+     def is_initialized(self) -> bool:
+         """Check if the cache has been initialized."""
+         return self._initialized
+
+     @property
+     def needs_refresh(self) -> bool:
+         """Check if the cache needs to be refreshed."""
+         if not self._initialized or self._last_refresh is None:
+             return True
+         return datetime.now() - self._last_refresh > self.refresh_interval
+
+     @property
+     def indicators(self) -> dict[int, IndicatorMetadata]:
+         """Get all cached indicators."""
+         return self._indicators.copy()
+
+     @property
+     def resolver(self) -> CubeResolver:
+         """Get the cube resolver instance."""
+         return self._resolver
+
+     async def initialize(self, client: CubeJsClient | None = None) -> None:
+         """Initialize the cache with data from the API.
+
+         This should be called at application startup.
+
+         Args:
+             client: Optional CubeJsClient instance. If not provided,
+                 uses the singleton instance.
+         """
+         if client is None:
+             client = get_client()
+
+         await self.refresh(client)
+
+     async def refresh(self, client: CubeJsClient | None = None) -> None:
+         """Refresh the cache from the API.
+
+         Args:
+             client: Optional CubeJsClient instance.
+         """
+         async with self._refresh_lock:
+             if client is None:
+                 client = get_client()
+
+             # Load indicator metadata
+             await self._load_indicators(client)
+
+             # Load and parse /meta for cube resolution
+             await self._load_cube_metadata(client)
+
+             self._last_refresh = datetime.now()
+             self._initialized = True
+
+     async def _load_indicators(self, client: CubeJsClient) -> None:
+         """Load all indicator metadata from the API."""
+         # Note: Some dimensions listed in /meta may not exist in actual data.
+         # Only include dimensions that have been validated to work.
+         dimensions = [
+             "indicateur_metadata.id",
+             "indicateur_metadata.libelle",
+             "indicateur_metadata.unite",
+             "indicateur_metadata.description",
+             "indicateur_metadata.methode_calcul",
+             "indicateur_metadata.annees_disponibles",
+             "indicateur_metadata.mailles_disponibles",
+             "indicateur_metadata.maille_mini_disponible",
+             "indicateur_metadata.couverture_geographique",
+             "indicateur_metadata.completion_region",
+             "indicateur_metadata.completion_departement",
+             "indicateur_metadata.completion_epci",
+             "indicateur_metadata.completion_commune",
+             "indicateur_metadata.thematique_fnv",
+             # Note: secteur_fnv, enjeux_fnv, levier_fnv cause errors despite being in schema
+         ]
+
+         data = await client.load_indicators_metadata(
+             dimensions=dimensions,
+             limit=1000,  # Should be enough for all indicators
+         )
+
+         self._indicators.clear()
+         for row in data:
+             try:
+                 indicator = IndicatorMetadata.from_api_response(row)
+                 self._indicators[indicator.id] = indicator
+             except Exception as e:
+                 # Log but don't fail on individual indicator parsing errors
+                 print(f"Warning: Failed to parse indicator: {e}")
+
+     async def _load_cube_metadata(self, client: CubeJsClient) -> None:
+         """Load cube metadata from /meta and initialize the resolver."""
+         meta = await client.get_meta()
+         self._resolver.load_from_meta(meta)
+
+     def get_indicator(self, indicator_id: int) -> IndicatorMetadata | None:
+         """Get indicator metadata by ID.
+
+         Args:
+             indicator_id: The indicator ID.
+
+         Returns:
+             The indicator metadata, or None if not found.
155
+ """
156
+ return self._indicators.get(indicator_id)
157
+
158
+ def get_cube_name(self, indicator_id: int, maille: str) -> str | None:
159
+ """Get the data cube name for an indicator at a specific maille.
160
+
161
+ Args:
162
+ indicator_id: The indicator ID.
163
+ maille: The geographic level.
164
+
165
+ Returns:
166
+ The cube name, or None if not found.
167
+ """
168
+ return self._resolver.find_cube_for_indicator(indicator_id, maille)
169
+
170
+ def list_indicators(
171
+ self,
172
+ thematique: str | None = None,
173
+ maille: str | None = None,
174
+ ) -> list[IndicatorListItem]:
175
+ """List indicators with optional filtering.
176
+
177
+ Args:
178
+ thematique: Filter by thematique_fnv (case-insensitive partial match).
179
+ maille: Filter by available geographic level.
180
+
181
+ Returns:
182
+ List of matching indicators.
183
+ """
184
+ results = []
185
+
186
+ for indicator in self._indicators.values():
187
+ # Apply thematique filter
188
+ if thematique:
189
+ if not indicator.thematique_fnv:
190
+ continue
191
+ if thematique.lower() not in indicator.thematique_fnv.lower():
192
+ continue
193
+
194
+ # Apply maille filter
195
+ if maille:
196
+ if not indicator.has_geographic_level(maille):
197
+ continue
198
+
199
+ results.append(
200
+ IndicatorListItem(
201
+ id=indicator.id,
202
+ libelle=indicator.libelle,
203
+ unite=indicator.unite,
204
+ mailles_disponibles=indicator.mailles_disponibles,
205
+ thematique_fnv=indicator.thematique_fnv,
206
+ )
207
+ )
208
+
209
+ # Sort by ID for consistent ordering
210
+ results.sort(key=lambda x: x.id)
211
+ return results
212
+
213
+ def search_indicators(self, query: str) -> list[IndicatorListItem]:
214
+ """Search indicators by keyword.
215
+
216
+ Searches in libelle and description fields (case-insensitive).
217
+
218
+ Args:
219
+ query: Search query string.
220
+
221
+ Returns:
222
+ List of matching indicators.
223
+ """
224
+ if not query or not query.strip():
225
+ return self.list_indicators()
226
+
227
+ query_lower = query.lower().strip()
228
+ query_words = query_lower.split()
229
+ results = []
230
+
231
+ for indicator in self._indicators.values():
232
+ # Search in libelle and description
233
+ searchable = " ".join(
234
+ filter(None, [indicator.libelle, indicator.description])
235
+ ).lower()
236
+
237
+ # Check if all query words are present
238
+ if all(word in searchable for word in query_words):
239
+ results.append(
240
+ IndicatorListItem(
241
+ id=indicator.id,
242
+ libelle=indicator.libelle,
243
+ unite=indicator.unite,
244
+ mailles_disponibles=indicator.mailles_disponibles,
245
+ thematique_fnv=indicator.thematique_fnv,
246
+ )
247
+ )
248
+
249
+ # Sort by relevance (exact match in libelle first, then by ID)
250
+ def sort_key(item: IndicatorListItem) -> tuple[int, int]:
251
+ exact_match = 0 if query_lower in item.libelle.lower() else 1
252
+ return (exact_match, item.id)
253
+
254
+ results.sort(key=sort_key)
255
+ return results
256
+
257
+
258
+ # Singleton cache instance
259
+ _cache_instance: IndicatorCache | None = None
260
+
261
+
262
+ def get_cache() -> IndicatorCache:
263
+ """Get or create the singleton IndicatorCache instance.
264
+
265
+ Returns:
266
+ The shared IndicatorCache instance.
267
+ """
268
+ global _cache_instance
269
+ if _cache_instance is None:
270
+ _cache_instance = IndicatorCache()
271
+ return _cache_instance
272
+
273
+
274
+ async def initialize_cache(client: CubeJsClient | None = None) -> IndicatorCache:
275
+ """Initialize the singleton cache.
276
+
277
+ This should be called at application startup.
278
+
279
+ Args:
280
+ client: Optional CubeJsClient instance.
281
+
282
+ Returns:
283
+ The initialized cache.
284
+ """
285
+ cache = get_cache()
286
+ if not cache.is_initialized:
287
+ await cache.initialize(client)
288
+ return cache
289
+
290
+
291
+ async def refresh_cache_if_needed(client: CubeJsClient | None = None) -> None:
292
+ """Refresh the cache if it's stale.
293
+
294
+ Args:
295
+ client: Optional CubeJsClient instance.
296
+ """
297
+ cache = get_cache()
298
+ if cache.needs_refresh:
299
+ await cache.refresh(client)
src/cube_resolver.py ADDED
@@ -0,0 +1,286 @@
1
+ """Cube resolution logic for mapping indicator IDs to data cubes.
2
+
3
+ The API uses a specific naming convention:
4
+ - Data cubes: {thematique}_{maille} (e.g., conso_enaf_com, surface_bio_dpt)
5
+ - Measures: {cube_name}.id_{indicator_id} (e.g., conso_enaf_com.id_611)
6
+ - Geographic dimensions: geocode_{maille}, libelle_{maille}
7
+
8
+ This module provides logic to find the correct cube for a given indicator
9
+ and geographic level by parsing the /meta endpoint.
10
+ """
11
+
12
+ from typing import Any
13
+
14
+ from .models import GEO_DIMENSION_PATTERNS, CubeInfo
15
+
16
+
17
+ class CubeResolver:
18
+ """Resolves indicator IDs to their corresponding data cubes.
19
+
20
+ The resolver caches the /meta response and provides efficient lookup
21
+ of cubes by indicator ID and geographic level.
22
+ """
23
+
24
+ def __init__(self):
25
+ """Initialize the resolver."""
26
+ # Cache of cube metadata from /meta
27
+ self._cubes_meta: list[dict[str, Any]] = []
28
+
29
+ # Mapping: indicator_id -> {maille -> cube_name}
30
+ self._indicator_cube_map: dict[int, dict[str, str]] = {}
31
+
32
+ # Mapping: cube_name -> CubeInfo
33
+ self._cube_info: dict[str, CubeInfo] = {}
34
+
35
+ # Set of all indicator IDs found in cubes
36
+ self._known_indicator_ids: set[int] = set()
37
+
38
+ self._initialized = False
39
+
40
+ @property
41
+ def is_initialized(self) -> bool:
42
+ """Check if the resolver has been initialized."""
43
+ return self._initialized
44
+
45
+ def load_from_meta(self, meta_response: dict[str, Any]) -> None:
46
+ """Load and parse cube metadata from /meta response.
47
+
48
+ Args:
49
+ meta_response: The response from /cubejs-api/v1/meta
50
+ """
51
+ self._cubes_meta = meta_response.get("cubes", [])
52
+ self._build_mappings()
53
+ self._initialized = True
54
+
55
+ def _build_mappings(self) -> None:
56
+ """Build the internal mappings from cube metadata."""
57
+ self._indicator_cube_map.clear()
58
+ self._cube_info.clear()
59
+ self._known_indicator_ids.clear()
60
+
61
+ for cube in self._cubes_meta:
62
+ cube_name = cube.get("name", "")
63
+
64
+ # Skip metadata cubes
65
+ if cube_name in ("indicateur_metadata", "indicateur_x_source_metadata"):
66
+ continue
67
+
68
+ # Determine ALL available mailles from cube dimensions
69
+ available_mailles = self._detect_all_mailles(cube)
70
+ if not available_mailles:
71
+ continue
72
+
73
+ # Extract indicator IDs from measures
74
+ indicator_ids = self._extract_indicator_ids(cube)
75
+
76
+ if indicator_ids:
77
+ # Store cube info (use finest maille as primary)
78
+ finest_maille = available_mailles[0] # Already sorted finest-first
79
+ self._cube_info[cube_name] = CubeInfo(
80
+ name=cube_name,
81
+ maille=finest_maille,
82
+ indicator_ids=indicator_ids,
83
+ )
84
+
85
+ # Build reverse mapping: indicator_id -> {maille -> cube_name}
86
+ # Register cube for ALL available mailles
87
+ for ind_id in indicator_ids:
88
+ self._known_indicator_ids.add(ind_id)
89
+ if ind_id not in self._indicator_cube_map:
90
+ self._indicator_cube_map[ind_id] = {}
91
+ for maille in available_mailles:
92
+ # Only register if not already mapped (prefer finest cube)
93
+ if maille not in self._indicator_cube_map[ind_id]:
94
+ self._indicator_cube_map[ind_id][maille] = cube_name
95
+
96
+ def _detect_all_mailles(self, cube: dict[str, Any]) -> list[str]:
97
+ """Detect ALL available geographic levels (mailles) in a cube.
98
+
99
+ Cubes like conso_enaf_com contain dimensions for all levels
100
+ (commune, epci, departement, region) allowing queries at any level.
101
+
102
+ Args:
103
+ cube: Cube metadata from /meta
104
+
105
+ Returns:
106
+ List of available mailles, sorted from finest to coarsest
107
+ (commune, epci, departement, region)
108
+ """
109
+ dimensions = cube.get("dimensions", [])
110
+ dim_names = [d.get("name", "") for d in dimensions]
111
+
112
+ # Order of mailles from finest to coarsest
113
+ maille_order = ["commune", "epci", "departement", "region"]
114
+ available = []
115
+
116
+ for maille in maille_order:
117
+ patterns = GEO_DIMENSION_PATTERNS.get(maille, {})
118
+ geocode_dim = patterns.get("geocode", "")
119
+ # Dimension names are prefixed with cube name
120
+ if geocode_dim and any(geocode_dim in dim_name for dim_name in dim_names):
121
+ available.append(maille)
122
+
123
+ return available
124
+
125
+ def _detect_maille(self, cube: dict[str, Any]) -> str | None:
126
+ """Detect the finest geographic level (maille) of a cube.
127
+
128
+ Args:
129
+ cube: Cube metadata from /meta
130
+
131
+ Returns:
132
+ The finest maille name or None
133
+ """
134
+ mailles = self._detect_all_mailles(cube)
135
+ return mailles[0] if mailles else None
136
+
137
+ def _extract_indicator_ids(self, cube: dict[str, Any]) -> list[int]:
138
+ """Extract indicator IDs from cube measures.
139
+
140
+ Measures follow the pattern: {cube_name}.id_{indicator_id}
141
+
142
+ Args:
143
+ cube: Cube metadata from /meta
144
+
145
+ Returns:
146
+ List of indicator IDs found in the cube's measures
147
+ """
148
+ measures = cube.get("measures", [])
149
+ indicator_ids = []
150
+
151
+ for measure in measures:
152
+ measure_name = measure.get("name", "")
153
+ # Look for .id_{number} pattern
154
+ if ".id_" in measure_name:
155
+ try:
156
+ # Extract the ID after "id_"
157
+ id_part = measure_name.split(".id_")[-1]
158
+ # Handle potential additional suffixes
159
+ id_str = id_part.split("_")[0].split(".")[0]
160
+ indicator_id = int(id_str)
161
+ indicator_ids.append(indicator_id)
162
+ except (ValueError, IndexError):
163
+ continue
164
+
165
+ return indicator_ids
166
+
167
+ def find_cube_for_indicator(
168
+ self,
169
+ indicator_id: int,
170
+ maille: str,
171
+ ) -> str | None:
172
+ """Find the data cube for a given indicator and geographic level.
173
+
174
+ Args:
175
+ indicator_id: The indicator ID to look up
176
+ maille: The geographic level ('commune', 'epci', 'departement', 'region')
177
+
178
+ Returns:
179
+ The cube name if found, None otherwise
180
+ """
181
+ if not self._initialized:
182
+ return None
183
+
184
+ maille_lower = maille.lower()
185
+
186
+ # Check direct mapping
187
+ if indicator_id in self._indicator_cube_map:
188
+ cube_map = self._indicator_cube_map[indicator_id]
189
+ if maille_lower in cube_map:
190
+ return cube_map[maille_lower]
191
+
192
+ return None
193
+
194
+ def get_measure_name(self, cube_name: str, indicator_id: int) -> str:
195
+ """Get the full measure name for an indicator in a cube.
196
+
197
+ Args:
198
+ cube_name: The cube name
199
+ indicator_id: The indicator ID
200
+
201
+ Returns:
202
+ The full measure name (e.g., 'conso_enaf_com.id_611')
203
+ """
204
+ return f"{cube_name}.id_{indicator_id}"
205
+
206
+ def get_dimension_name(self, cube_name: str, dimension: str) -> str:
207
+ """Get the full dimension name for a cube.
208
+
209
+ Args:
210
+ cube_name: The cube name
211
+ dimension: The dimension name (e.g., 'geocode_region')
212
+
213
+ Returns:
214
+ The full dimension name (e.g., 'conso_enaf_com.geocode_region')
215
+ """
216
+ return f"{cube_name}.{dimension}"
217
+
218
+ def get_available_mailles(self, indicator_id: int) -> list[str]:
219
+ """Get the available geographic levels for an indicator.
220
+
221
+ Args:
222
+ indicator_id: The indicator ID
223
+
224
+ Returns:
225
+ List of available mailles
226
+ """
227
+ if indicator_id not in self._indicator_cube_map:
228
+ return []
229
+ return list(self._indicator_cube_map[indicator_id].keys())
230
+
231
+ def get_cube_info(self, cube_name: str) -> CubeInfo | None:
232
+ """Get information about a cube.
233
+
234
+ Args:
235
+ cube_name: The cube name
236
+
237
+ Returns:
238
+ CubeInfo if found, None otherwise
239
+ """
240
+ return self._cube_info.get(cube_name)
241
+
242
+ def is_indicator_known(self, indicator_id: int) -> bool:
243
+ """Check if an indicator ID exists in any cube.
244
+
245
+ Args:
246
+ indicator_id: The indicator ID to check
247
+
248
+ Returns:
249
+ True if the indicator exists in at least one cube
250
+ """
251
+ return indicator_id in self._known_indicator_ids
252
+
253
+ def list_all_cubes(self) -> list[CubeInfo]:
254
+ """List all data cubes with their metadata.
255
+
256
+ Returns:
257
+ List of CubeInfo objects
258
+ """
259
+ return list(self._cube_info.values())
260
+
261
+ def get_cubes_for_indicator(self, indicator_id: int) -> dict[str, str]:
262
+ """Get all cubes containing a given indicator.
263
+
264
+ Args:
265
+ indicator_id: The indicator ID
266
+
267
+ Returns:
268
+ Dict mapping maille to cube_name
269
+ """
270
+ return self._indicator_cube_map.get(indicator_id, {}).copy()
271
+
272
+
273
+ # Singleton instance
274
+ _resolver_instance: CubeResolver | None = None
275
+
276
+
277
+ def get_resolver() -> CubeResolver:
278
+ """Get or create the singleton CubeResolver instance.
279
+
280
+ Returns:
281
+ The shared CubeResolver instance
282
+ """
283
+ global _resolver_instance
284
+ if _resolver_instance is None:
285
+ _resolver_instance = CubeResolver()
286
+ return _resolver_instance
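As a standalone illustration of the measure-name convention that `_extract_indicator_ids` parses, the same `{cube_name}.id_{indicator_id}` pattern can be exercised outside the class. The sample measure names below are illustrative:

```python
def extract_indicator_ids(measure_names: list[str]) -> list[int]:
    """Pull indicator IDs out of Cube.js measure names.

    Measures follow the pattern {cube_name}.id_{indicator_id},
    e.g. 'conso_enaf_com.id_611' -> 611.
    """
    ids: list[int] = []
    for name in measure_names:
        if ".id_" not in name:
            continue
        id_part = name.split(".id_")[-1]
        # Tolerate extra suffixes such as 'id_42_pct'
        id_str = id_part.split("_")[0].split(".")[0]
        try:
            ids.append(int(id_str))
        except ValueError:
            continue
    return ids


print(extract_indicator_ids([
    "conso_enaf_com.id_611",
    "conso_enaf_com.count",       # no indicator ID -> skipped
    "surface_bio_dpt.id_42_pct",  # extra suffix tolerated -> 42
]))  # [611, 42]
```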
src/models.py ADDED
@@ -0,0 +1,239 @@
1
+ """Pydantic models for the Indicateurs Territoriaux API responses."""
2
+
3
+ from typing import Any
4
+
5
+ from pydantic import BaseModel, Field
6
+
7
+
8
+ class IndicatorMetadata(BaseModel):
9
+ """Metadata for a territorial indicator."""
10
+
11
+ id: int = Field(..., description="Unique identifier of the indicator")
12
+ libelle: str = Field(..., description="Human-readable name of the indicator")
13
+ unite: str | None = Field(None, description="Unit of measurement")
14
+ description: str | None = Field(None, description="Detailed description")
15
+ methode_calcul: str | None = Field(None, description="Calculation method")
16
+ fonction_calcul: str | None = Field(None, description="Calculation function")
17
+ date_debut: int | None = Field(None, description="First available year")
18
+ date_fin: int | None = Field(None, description="Last available year")
19
+ annees_disponibles: str | None = Field(
20
+ None, description="Available years (comma-separated)"
21
+ )
22
+ annees_manquantes: str | None = Field(
23
+ None, description="Missing years (comma-separated)"
24
+ )
25
+ mailles_disponibles: str | None = Field(
26
+ None, description="Available geographic levels (e.g., 'region,departement,epci')"
27
+ )
28
+ maille_mini_disponible: str | None = Field(
29
+ None, description="Finest available geographic level"
30
+ )
31
+ couverture_geographique: str | None = Field(
32
+ None, description="Geographic coverage (France métro, DOM, etc.)"
33
+ )
34
+ liste_drom: str | None = Field(None, description="Covered DROM territories")
35
+ completion_region: float | None = Field(
36
+ None, description="Completion percentage at region level"
37
+ )
38
+ completion_departement: float | None = Field(
39
+ None, description="Completion percentage at department level"
40
+ )
41
+ completion_epci: float | None = Field(
42
+ None, description="Completion percentage at EPCI level"
43
+ )
44
+ completion_commune: float | None = Field(
45
+ None, description="Completion percentage at commune level"
46
+ )
47
+ compte_region: int | None = Field(
48
+ None, description="Number of regions with data"
49
+ )
50
+ compte_departement: int | None = Field(
51
+ None, description="Number of departments with data"
52
+ )
53
+ compte_epci: int | None = Field(None, description="Number of EPCIs with data")
54
+ compte_commune: int | None = Field(
55
+ None, description="Number of communes with data"
56
+ )
57
+ thematique_fnv: str | None = Field(
58
+ None, description="France Nation Verte thematic"
59
+ )
60
+ secteur_fnv: str | None = Field(None, description="FNV sector")
61
+ enjeux_fnv: str | None = Field(None, description="FNV challenges")
62
+ levier_fnv: str | None = Field(None, description="FNV lever")
63
+ projets_associes: str | None = Field(None, description="Associated projects")
64
+ valeur_axes: str | None = Field(
65
+ None, description="Breakdown axes (JSON stringified)"
66
+ )
67
+
68
+ @classmethod
69
+ def from_api_response(cls, data: dict[str, Any]) -> "IndicatorMetadata":
70
+ """Create an IndicatorMetadata from a Cube.js API response row.
71
+
72
+ The API returns dimension names prefixed with the cube name.
73
+ This method strips the prefix.
74
+ """
75
+ # Strip the cube name prefix from keys
76
+ prefix = "indicateur_metadata."
77
+ cleaned = {}
78
+ for key, value in data.items():
79
+ clean_key = key.replace(prefix, "")
80
+ cleaned[clean_key] = value
81
+ return cls(**cleaned)
82
+
83
+ def has_geographic_level(self, level: str) -> bool:
84
+ """Check if the indicator has data at the specified geographic level."""
85
+ if not self.mailles_disponibles:
86
+ return False
87
+ return level.lower() in self.mailles_disponibles.lower()
88
+
89
+ def get_completion_for_level(self, level: str) -> float | None:
90
+ """Get the completion percentage for a geographic level."""
91
+ level_map = {
92
+ "region": self.completion_region,
93
+ "departement": self.completion_departement,
94
+ "epci": self.completion_epci,
95
+ "commune": self.completion_commune,
96
+ }
97
+ return level_map.get(level.lower())
98
+
99
+
100
+ class SourceMetadata(BaseModel):
101
+ """Metadata for a data source associated with an indicator."""
102
+
103
+ id_indicateur: int = Field(..., description="ID of the related indicator")
104
+ nom_source: str | None = Field(None, description="Source identifier")
105
+ libelle: str | None = Field(None, description="Human-readable source name")
106
+ description: str | None = Field(None, description="Source description")
107
+ producteur_source: str | None = Field(None, description="Data producer")
108
+ distributeur_source: str | None = Field(None, description="Data distributor")
109
+ license_source: str | None = Field(None, description="Data license")
110
+ lien_page: str | None = Field(None, description="Source URL")
111
+ annees_disponibles_source: str | None = Field(
112
+ None, description="Available years from this source"
113
+ )
114
+ annees_manquantes_source: str | None = Field(
115
+ None, description="Missing years from this source"
116
+ )
117
+ maille_mini_disponible: str | None = Field(
118
+ None, description="Finest geographic level"
119
+ )
120
+ couverture_geographique: str | None = Field(
121
+ None, description="Geographic coverage"
122
+ )
123
+ date_derniere_extraction: str | None = Field(
124
+ None, description="Last extraction date"
125
+ )
126
+
127
+ @classmethod
128
+ def from_api_response(cls, data: dict[str, Any]) -> "SourceMetadata":
129
+ """Create a SourceMetadata from a Cube.js API response row."""
130
+ prefix = "indicateur_x_source_metadata."
131
+ cleaned = {}
132
+ for key, value in data.items():
133
+ clean_key = key.replace(prefix, "")
134
+ cleaned[clean_key] = value
135
+ return cls(**cleaned)
136
+
137
+
138
+ class IndicatorListItem(BaseModel):
139
+ """Simplified indicator info for list responses."""
140
+
141
+ id: int
142
+ libelle: str
143
+ unite: str | None = None
144
+ mailles_disponibles: str | None = None
145
+ thematique_fnv: str | None = None
146
+
147
+
148
+ class IndicatorDetails(BaseModel):
149
+ """Complete indicator details with sources."""
150
+
151
+ metadata: IndicatorMetadata
152
+ sources: list[SourceMetadata] = Field(default_factory=list)
153
+
154
+
155
+ class GeographicDataPoint(BaseModel):
156
+ """A single data point with geographic information."""
157
+
158
+ geocode: str = Field(..., description="INSEE code of the territory")
159
+ libelle: str | None = Field(None, description="Name of the territory")
160
+ valeur: float | str | None = Field(None, description="Indicator value")
161
+ annee: str | None = Field(None, description="Year of the data")
162
+ unite: str | None = Field(None, description="Unit of measurement")
163
+
164
+
165
+ class QueryResult(BaseModel):
166
+ """Result of a data query."""
167
+
168
+ indicator_id: int
169
+ indicator_name: str
170
+ geographic_level: str
171
+ data: list[GeographicDataPoint]
172
+ total_count: int = 0
173
+ query_info: dict[str, Any] = Field(default_factory=dict)
174
+
175
+
176
+ class SearchResult(BaseModel):
177
+ """Result of an indicator search."""
178
+
179
+ indicators: list[IndicatorListItem]
180
+ query: str
181
+ total_count: int
182
+
183
+
184
+ class CubeInfo(BaseModel):
185
+ """Information about a data cube."""
186
+
187
+ name: str = Field(..., description="Cube name (e.g., 'conso_enaf_com')")
188
+ maille: str = Field(..., description="Geographic level (commune, epci, departement, region)")
189
+ indicator_ids: list[int] = Field(default_factory=list, description="Indicator IDs in this cube")
190
+
191
+
192
+ # Geographic level constants
193
+ GEOGRAPHIC_LEVELS = ["region", "departement", "epci", "commune"]
194
+
195
+ # Maille suffix mapping for cube names
196
+ MAILLE_SUFFIX_MAP = {
197
+ "commune": "_com",
198
+ "epci": "_epci",
199
+ "departement": "_dpt",
200
+ "region": "_reg",
201
+ }
202
+
203
+ # Dimension patterns for each geographic level (validated by API tests)
204
+ # Format: geocode_{maille} and libelle_{maille}
205
+ GEO_DIMENSION_PATTERNS = {
206
+ "region": {
207
+ "geocode": "geocode_region",
208
+ "libelle": "libelle_region",
209
+ },
210
+ "departement": {
211
+ "geocode": "geocode_departement",
212
+ "libelle": "libelle_departement",
213
+ },
214
+ "epci": {
215
+ "geocode": "geocode_epci",
216
+ "libelle": "libelle_epci",
217
+ },
218
+ "commune": {
219
+ "geocode": "geocode_commune",
220
+ "libelle": "libelle_commune",
221
+ },
222
+ }
223
+
224
+ # Region code reference
225
+ REGION_CODES = {
226
+ "11": "Île-de-France",
227
+ "24": "Centre-Val de Loire",
228
+ "27": "Bourgogne-Franche-Comté",
229
+ "28": "Normandie",
230
+ "32": "Hauts-de-France",
231
+ "44": "Grand Est",
232
+ "52": "Pays de la Loire",
233
+ "53": "Bretagne",
234
+ "75": "Nouvelle-Aquitaine",
235
+ "76": "Occitanie",
236
+ "84": "Auvergne-Rhône-Alpes",
237
+ "93": "Provence-Alpes-Côte d'Azur",
238
+ "94": "Corse",
239
+ }
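The `from_api_response` helpers above both reduce to one operation: stripping the cube-name prefix from every key of a Cube.js result row. A minimal sketch, with an illustrative sample row:

```python
def strip_cube_prefix(row: dict, prefix: str) -> dict:
    """Strip a 'cube_name.' prefix from each key of a Cube.js result row.

    Cube.js returns keys like 'indicateur_metadata.libelle'; the Pydantic
    models expect bare field names like 'libelle'.
    """
    return {key.removeprefix(prefix): value for key, value in row.items()}


row = {
    "indicateur_metadata.id": 611,
    "indicateur_metadata.libelle": "Consommation d'espaces NAF",
    "indicateur_metadata.unite": "ha",
}
print(strip_cube_prefix(row, "indicateur_metadata."))
# {'id': 611, 'libelle': "Consommation d'espaces NAF", 'unite': 'ha'}
```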
src/tools.py ADDED
@@ -0,0 +1,354 @@
1
+ """MCP tools for querying territorial ecological indicators."""
2
+
3
+ import json
4
+ from typing import Any
5
+
6
+ from .api_client import get_client, CubeJsClient, CubeJsClientError
7
+ from .cache import get_cache, initialize_cache, refresh_cache_if_needed
8
+ from .cube_resolver import get_resolver
9
+ from .models import (
10
+ IndicatorMetadata,
11
+ SourceMetadata,
12
+ IndicatorListItem,
13
+ GEOGRAPHIC_LEVELS,
14
+ GEO_DIMENSION_PATTERNS,
15
+ )
16
+
17
+
18
+ async def _ensure_cache_initialized() -> None:
19
+ """Ensure the cache is initialized before tool execution."""
20
+ cache = get_cache()
21
+ if not cache.is_initialized:
22
+ await initialize_cache()
23
+ else:
24
+ await refresh_cache_if_needed()
25
+
26
+
27
+ async def list_indicators(
28
+ thematique: str = "",
29
+ maille: str = "",
30
+ ) -> str:
31
+ """List all available territorial ecological indicators.
32
+
33
+ Returns a list of indicators with their main characteristics. You can filter
34
+ by thematic (France Nation Verte themes like "mieux se déplacer", "mieux se loger")
35
+ or by geographic level (region, departement, epci, commune).
36
+
37
+ Args:
38
+ thematique: Optional filter by FNV thematic. Use partial match, e.g., "déplacer"
39
+ for mobility indicators, "loger" for housing, "produire" for production.
40
+ maille: Optional filter by available geographic level. Valid values:
41
+ "region", "departement", "epci", "commune".
42
+
43
+ Returns:
44
+ JSON string containing a list of indicators with id, libelle, unite,
45
+ mailles_disponibles, and thematique_fnv.
46
+
47
+ Example:
48
+ To find mobility indicators available at department level:
49
+ list_indicators(thematique="déplacer", maille="departement")
50
+ """
51
+ await _ensure_cache_initialized()
52
+ cache = get_cache()
53
+
54
+ # Normalize empty strings to None
55
+ theme_filter = thematique.strip() if thematique else None
56
+ maille_filter = maille.strip().lower() if maille else None
57
+
58
+ # Validate maille if provided
59
+ if maille_filter and maille_filter not in GEOGRAPHIC_LEVELS:
60
+ return json.dumps({
61
+ "error": f"Invalid geographic level: {maille}",
62
+ "valid_levels": GEOGRAPHIC_LEVELS,
63
+ }, ensure_ascii=False)
64
+
65
+ indicators = cache.list_indicators(
66
+ thematique=theme_filter,
67
+ maille=maille_filter,
68
+ )
69
+
70
+ return json.dumps({
71
+ "indicators": [ind.model_dump() for ind in indicators],
72
+ "count": len(indicators),
73
+ "filters_applied": {
74
+ "thematique": theme_filter,
75
+ "maille": maille_filter,
76
+ },
77
+ }, ensure_ascii=False, indent=2)
78
+
79
+
80
+ async def get_indicator_details(indicator_id: str) -> str:
81
+ """Get detailed information about a specific indicator.
82
+
83
+ Returns comprehensive metadata including description, calculation method,
84
+ data coverage, and data sources for a given indicator ID.
85
+
86
+ Args:
87
+ indicator_id: The numeric ID of the indicator (e.g., "42", "94", "611").
88
+
89
+ Returns:
90
+ JSON string containing:
91
+ - metadata: Full indicator metadata (description, methode_calcul,
92
+ annees_disponibles, completion rates by geographic level, etc.)
93
+ - sources: List of data sources with producer, license, and links.
94
+ - available_cubes: Dict mapping maille to cube name for data queries.
95
+
96
+ Example:
97
+ get_indicator_details("611") returns details about indicator 611
98
+ (Consommation d'espaces naturels, agricoles et forestiers).
99
+ """
100
+ await _ensure_cache_initialized()
101
+
102
+ # Parse indicator ID
103
+ try:
104
+ ind_id = int(indicator_id)
105
+ except ValueError:
106
+ return json.dumps({
107
+ "error": f"Invalid indicator ID: {indicator_id}. Must be a number.",
108
+ }, ensure_ascii=False)
109
+
110
+ cache = get_cache()
111
+ indicator = cache.get_indicator(ind_id)
112
+
113
+ if indicator is None:
114
+ return json.dumps({
115
+ "error": f"Indicator {ind_id} not found in metadata.",
116
+ "hint": "Use list_indicators() to see available indicators.",
117
+ }, ensure_ascii=False)
118
+
119
+ # Get available cubes from resolver
120
+ resolver = get_resolver()
121
+ available_cubes = resolver.get_cubes_for_indicator(ind_id)
122
+
123
+ # Fetch sources from API
124
+ client = get_client()
125
+ try:
126
+ sources_data = await client.load_sources_metadata(indicator_id=ind_id)
127
+ sources = [
128
+ SourceMetadata.from_api_response(row).model_dump()
129
+ for row in sources_data
130
+ ]
131
+ except CubeJsClientError as e:
132
+ sources = []
133
+ sources_error = str(e)
134
+ else:
135
+ sources_error = None
136
+
137
+ result = {
138
+ "metadata": indicator.model_dump(),
139
+ "sources": sources,
140
+ "available_cubes": available_cubes,
141
+ }
142
+
143
+ if sources_error:
144
+ result["sources_warning"] = f"Could not fetch sources: {sources_error}"
145
+
146
+ return json.dumps(result, ensure_ascii=False, indent=2)
147
+
148
+
149
+ async def query_indicator_data(
+     indicator_id: str,
+     geographic_level: str,
+     geographic_code: str = "",
+     year: str = "",
+ ) -> str:
+     """Query data values for a specific indicator and territory.
+
+     Retrieves actual data values for an indicator at the specified geographic level.
+     You can filter by a specific territory code and/or year.
+
+     Args:
+         indicator_id: The numeric ID of the indicator (e.g., "611").
+         geographic_level: The geographic level to query. Valid values:
+             "region", "departement", "epci", "commune".
+         geographic_code: Optional INSEE code to filter by territory:
+             - Region: 2 digits (e.g., "93" for PACA, "11" for Île-de-France)
+             - Departement: 2-3 characters (e.g., "13", "2A", "974")
+             - EPCI: 9 digits (SIREN code)
+             - Commune: 5 digits (e.g., "75056" for Paris)
+         year: Optional year to filter data (e.g., "2020").
+
+     Returns:
+         JSON string containing:
+         - indicator_id: The queried indicator ID
+         - indicator_name: Human-readable name
+         - geographic_level: The queried level
+         - data: List of data points with libelle, annee, valeur, unite
+         - total_count: Number of results
+
+     Example:
+         Query indicator 611 (ENAF consumption) for PACA region:
+         query_indicator_data("611", "region", "93")
+
+         Query all departments for 2020:
+         query_indicator_data("611", "departement", year="2020")
+     """
+     await _ensure_cache_initialized()
+
+     # Parse indicator ID
+     try:
+         ind_id = int(indicator_id)
+     except ValueError:
+         return json.dumps({
+             "error": f"Invalid indicator ID: {indicator_id}. Must be a number.",
+         }, ensure_ascii=False)
+
+     # Validate geographic level
+     geo_level = geographic_level.strip().lower()
+     if geo_level not in GEOGRAPHIC_LEVELS:
+         return json.dumps({
+             "error": f"Invalid geographic level: {geographic_level}",
+             "valid_levels": GEOGRAPHIC_LEVELS,
+         }, ensure_ascii=False)
+
+     cache = get_cache()
+     resolver = get_resolver()
+
+     indicator = cache.get_indicator(ind_id)
+     indicator_name = indicator.libelle if indicator else f"Indicator {ind_id}"
+     indicator_unite = indicator.unite if indicator else None
+
+     # Find the cube for this indicator and maille
+     cube_name = resolver.find_cube_for_indicator(ind_id, geo_level)
+
+     if cube_name is None:
+         # Check if indicator exists at all
+         if not resolver.is_indicator_known(ind_id):
+             return json.dumps({
+                 "error": f"Indicator {ind_id} not found in any data cube.",
+                 "hint": "Use get_indicator_details() to check available mailles.",
+             }, ensure_ascii=False)
+
+         # Indicator exists but not at this maille
+         available = resolver.get_available_mailles(ind_id)
+         return json.dumps({
+             "error": f"Indicator {ind_id} is not available at {geo_level} level.",
+             "available_levels": available,
+             "hint": f"Try one of: {', '.join(available)}",
+         }, ensure_ascii=False)
+
+     # Build the query
+     geo_patterns = GEO_DIMENSION_PATTERNS[geo_level]
+
+     # Measure and dimensions with full cube prefix
+     measure = resolver.get_measure_name(cube_name, ind_id)
+     geocode_dim = resolver.get_dimension_name(cube_name, geo_patterns["geocode"])
+     libelle_dim = resolver.get_dimension_name(cube_name, geo_patterns["libelle"])
+     annee_dim = resolver.get_dimension_name(cube_name, "annee")
+
+     query: dict[str, Any] = {
+         "measures": [measure],
+         "dimensions": [libelle_dim, annee_dim],
+         "limit": 500,
+     }
+
+     # Add filters
+     filters = []
+
+     geo_code = geographic_code.strip() if geographic_code else None
+     if geo_code:
+         filters.append({
+             "member": geocode_dim,
+             "operator": "equals",
+             "values": [geo_code],
+         })
+
+     year_filter = year.strip() if year else None
+     if year_filter:
+         filters.append({
+             "member": annee_dim,
+             "operator": "equals",
+             "values": [year_filter],
+         })
+
+     if filters:
+         query["filters"] = filters
+
+     # Execute query
+     client = get_client()
+     try:
+         result = await client.load(query)
+         data_rows = result.get("data", [])
+     except CubeJsClientError as e:
+         return json.dumps({
+             "error": f"Query failed: {str(e)}",
+             "cube": cube_name,
+             "query": query,
+         }, ensure_ascii=False, indent=2)
+
+     # Parse results
+     data_points = []
+     for row in data_rows:
+         data_points.append({
+             "libelle": row.get(libelle_dim),
+             "annee": row.get(annee_dim),
+             "valeur": row.get(measure),
+             "unite": indicator_unite,
+         })
+
+     # Sort by year, then by libelle (coerce annee to str so mixed
+     # int/None values from the API cannot break the comparison)
+     data_points.sort(key=lambda x: (str(x.get("annee") or ""), x.get("libelle") or ""))
+
+     return json.dumps({
+         "indicator_id": ind_id,
+         "indicator_name": indicator_name,
+         "geographic_level": geo_level,
+         "data": data_points,
+         "total_count": len(data_points),
+         "query_info": {
+             "cube": cube_name,
+             "measure": measure,
+             "geographic_code_filter": geo_code,
+             "year_filter": year_filter,
+         },
+     }, ensure_ascii=False, indent=2)
+
+
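As a concrete sketch of what the function above builds: assuming the resolver maps indicator 611 at the region maille to a cube named `conso_enaf_reg` (actual cube and dimension names are resolved from `/meta` at runtime, so these are assumptions), the Cube.js `/load` payload for `query_indicator_data("611", "region", "93", year="2020")` would look like:

```python
# Illustrative Cube.js query payload; cube/dimension names are assumed,
# following the {thematique}_{maille} and id_{indicator_id} conventions.
example_query = {
    "measures": ["conso_enaf_reg.id_611"],
    "dimensions": ["conso_enaf_reg.libelle_region", "conso_enaf_reg.annee"],
    "limit": 500,
    "filters": [
        {"member": "conso_enaf_reg.geocode_region", "operator": "equals", "values": ["93"]},
        {"member": "conso_enaf_reg.annee", "operator": "equals", "values": ["2020"]},
    ],
}
print(example_query["measures"][0])
```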
+ async def search_indicators(query: str) -> str:
+     """Search indicators by keywords in their name or description.
+
+     Performs a full-text search across indicator names (libelle) and descriptions.
+     All search terms must be present for an indicator to match (AND logic).
+
+     Args:
+         query: Search terms separated by spaces. Examples:
+             - "consommation espace" finds indicators about land consumption
+             - "émissions CO2" finds indicators about CO2 emissions
+             - "surface bio" finds organic surface indicators
+
+     Returns:
+         JSON string containing:
+         - indicators: List of matching indicators with id, libelle, unite,
+           mailles_disponibles, thematique_fnv
+         - query: The original search query
+         - total_count: Number of results
+
+     Example:
+         search_indicators("consommation espace") returns indicators mentioning
+         both "consommation" and "espace" in their name or description.
+     """
+     await _ensure_cache_initialized()
+     cache = get_cache()
+
+     search_query = query.strip() if query else ""
+
+     if not search_query:
+         # Return all indicators if no query
+         indicators = cache.list_indicators()
+     else:
+         indicators = cache.search_indicators(search_query)
+
+     return json.dumps({
+         "indicators": [ind.model_dump() for ind in indicators],
+         "query": search_query,
+         "total_count": len(indicators),
+     }, ensure_ascii=False, indent=2)
+
+
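The AND matching described in the docstring can be sketched as follows. This illustrates the contract only, not the actual `cache.search_indicators` implementation, and `matches_all_terms` is a hypothetical helper name:

```python
def matches_all_terms(text: str, query: str) -> bool:
    """True when every whitespace-separated search term occurs
    (case-insensitively) as a substring of text."""
    haystack = text.lower()
    return all(term in haystack for term in query.lower().split())

# "consommation espace" matches a libelle containing both terms
assert matches_all_terms("Consommation d'espaces NAF", "consommation espace")
assert not matches_all_terms("Émissions de CO2", "consommation espace")
```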
+ # Export all tools
+ __all__ = [
+     "list_indicators",
+     "get_indicator_details",
+     "query_indicator_data",
+     "search_indicators",
+ ]