📋 TODO List: Pneumonia Consolidation Segmentation Project

✅ Completed

  • Analyze project structure and patient data format
  • Create preprocessing script for consolidation enhancement
  • Build Streamlit app for Dice score calculation
  • Implement SAM integration for automatic segmentation
  • Create requirements.txt and documentation
  • Setup folder structure for annotations and results

🚀 Next Steps (In Order)

Phase 1: Setup & Data Preparation (Week 1)

  1. Install Dependencies

    • Run pip install -r requirements.txt
    • Test Streamlit app: streamlit run dice_calculator_app.py
    • (Optional) Download SAM checkpoint for automatic segmentation
  2. Preprocess Patient Images

    • Enhance all chest X-rays in data/Pacientes/ folder
    • Save enhanced images to dice/enhanced_images/
    • Review enhanced images for quality
    • Document any images with poor quality
  3. Setup Annotation Tool

    • Install CVAT (recommended) or Label Studio
    • Import enhanced images into annotation tool
    • Create annotation classes: "consolidation", "ground_glass", "air_bronchogram"
    • Setup annotation guidelines document for team

Phase 2: Annotation (Weeks 2-4)

  1. Create Ground Truth Annotations

    • Have 2-3 radiologists independently annotate same 20 images (pilot)
    • Calculate inter-rater agreement using Dice scores
    • Resolve disagreements through consensus meeting
    • Annotate remaining images (aim for 100+ cases)
    • Save masks to dice/annotations/ground_truth/
  2. Quality Control

    • Use Dice calculator app to validate annotation consistency
    • Flag cases with unclear consolidation boundaries
    • Re-annotate cases with Dice < 0.70 between annotators
    • Document difficult cases and edge cases
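
The inter-rater check in this phase can be sketched as below. This is a minimal illustration, not the code in dice_calculator_app.py: the function names are hypothetical, masks are assumed to be same-shape binary NumPy arrays, and treating two empty masks as perfect agreement is an assumption the team may want to revisit.

```python
import numpy as np


def dice_score(mask_a, mask_b) -> float:
    """Dice coefficient between two binary masks of the same shape."""
    a = np.asarray(mask_a).astype(bool)
    b = np.asarray(mask_b).astype(bool)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    total = a.sum() + b.sum()
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)


def needs_reannotation(mask_a, mask_b, threshold: float = 0.70) -> bool:
    """True when inter-annotator Dice falls below the re-annotation threshold."""
    return dice_score(mask_a, mask_b) < threshold


# Toy example: each rater marks 4 pixels, 2 of which are shared.
rater_1 = np.zeros((4, 4), dtype=np.uint8)
rater_2 = np.zeros((4, 4), dtype=np.uint8)
rater_1[1:3, 1:3] = 1
rater_2[1:3, 2:4] = 1
pair_dice = dice_score(rater_1, rater_2)  # 2 * 2 / (4 + 4)
```

In the toy example the pair scores 0.5, so it would be flagged for re-annotation under the 0.70 rule.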

Phase 3: SAM Integration (Week 5)

  1. Test SAM for Automatic Segmentation

    • Download SAM checkpoint (ViT-H recommended)
    • Test SAM on 10 sample images
    • Compare SAM predictions vs ground truth
    • Adjust SAM parameters for best results
    • Document SAM performance metrics
  2. Generate Initial Predictions

    • Use SAM to generate masks for all images
    • Save to dice/annotations/predictions/
    • Calculate Dice scores against ground truth
    • Identify patterns in SAM failures

Phase 4: Analysis & Validation (Week 6)

  1. Calculate Comprehensive Metrics

    • Run batch Dice calculation on all mask pairs
    • Generate statistical reports (mean, std, distribution)
    • Create visualizations (overlays, comparison grids)
    • Save results to dice/results/
  2. Quality Assessment

    • Categorize segmentations: Excellent (>0.85), Good (0.70-0.85), Needs Review (<0.70)
    • Calculate additional metrics: IoU, Precision, Recall, Hausdorff distance
    • Generate quality control report
    • Document failure modes and edge cases
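
The quality bands and summary statistics for this phase could be scripted roughly as follows. This is a sketch using only the standard library; the function names and example scores are hypothetical. It assumes a score of exactly 0.85 falls in Good, since Excellent is listed as strictly greater than 0.85.

```python
import statistics
from collections import Counter


def categorize(dice: float) -> str:
    """Map a Dice score to the quality bands used in this phase."""
    if dice > 0.85:
        return "Excellent"
    if dice >= 0.70:
        return "Good"
    return "Needs Review"


def summarize(scores: dict) -> dict:
    """Summarize per-case Dice scores: count, mean, sample std, band counts."""
    values = list(scores.values())
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        # Sample standard deviation needs at least two values.
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "bands": dict(Counter(categorize(v) for v in values)),
    }


# Made-up scores, for illustration only.
example_scores = {
    "case_001": 0.91,
    "case_002": 0.78,
    "case_003": 0.64,
    "case_004": 0.85,
}
report = summarize(example_scores)
```

The resulting dictionary can be written to dice/results/ as JSON or fed into the quality control report.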

Phase 5: ML Model Development (Weeks 7-10)

  1. Train Segmentation Model

    • Split data: 70% train, 15% validation, 15% test
    • Choose architecture: U-Net, Attention U-Net, or nnU-Net
    • Implement data augmentation pipeline
    • Train model on ground truth annotations
    • Monitor validation Dice during training
  2. Model Evaluation

    • Test on held-out test set
    • Calculate Dice, IoU, and clinical metrics
    • Compare to SAM baseline
    • Generate prediction visualizations
    • Save model checkpoints
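
One way to make the 70/15/15 split reproducible is a seeded shuffle over case IDs. The sketch below assumes each ID identifies one patient, so all images from a patient stay in a single split. The function name `split_cases` and the `patient_NNN` IDs are illustrative and not part of the project.

```python
import random


def split_cases(case_ids, train=0.70, val=0.15, seed=42):
    """Shuffle case IDs reproducibly and split them into train/val/test."""
    ids = sorted(set(case_ids))       # de-duplicate and fix the starting order
    random.Random(seed).shuffle(ids)  # seeded, so reruns give the same split
    n_train = round(len(ids) * train)
    n_val = round(len(ids) * val)
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to the test set
    }


splits = split_cases([f"patient_{i:03d}" for i in range(100)])
```

Storing the resulting ID lists alongside the data keeps the held-out test set fixed across training runs.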

Phase 6: Clinical Validation (Weeks 11-12)

  1. Expert Review

    • Have radiologists review model predictions
    • Collect feedback on clinically acceptable performance
    • Test on external validation set (if available)
    • Document cases where model fails
  2. Final Report

    • Compile all metrics and visualizations
    • Write methods section describing workflow
    • Create supplemental figures
    • Prepare manuscript or technical report

🔧 Technical Debt & Improvements

High Priority

  • Add DICOM file support (many medical images are DICOM)
  • Implement multi-class segmentation (consolidation types)
  • Add data versioning (DVC or similar)
  • Create automated testing suite

Medium Priority

  • Add boundary-based metrics (Surface Dice, Normalized Surface Distance)
  • Implement active learning workflow
  • Add export to COCO format for model training
  • Create Docker container for reproducibility

Low Priority

  • Add 3D visualization support
  • Implement web-based annotation tool
  • Add integration with PACS systems
  • Create mobile app for review

📊 Success Metrics

Annotation Phase

  • Target: 100+ annotated cases
  • Quality: Mean inter-rater Dice > 0.80
  • Efficiency: < 5 minutes per case

ML Model Phase

  • Performance: Mean Dice > 0.75 on test set
  • Comparison: Better than SAM baseline
  • Clinical: 90% of predictions acceptable to radiologists

Publication

  • Timeline: Submit manuscript within 6 months
  • Target: Radiology, European Radiology, or similar
  • Impact: Tool shared publicly for research use

🐛 Known Issues

  • Large images (>2048x2048) may cause memory issues in Streamlit app
  • SAM requires significant GPU memory (12GB+ recommended)
  • Batch processing doesn't support progress resumption
  • Hausdorff distance calculation is slow for large masks
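
For the progress-resumption issue, one possible approach is to checkpoint results to a JSON file after every case and skip cases that are already recorded. This is only a sketch: `run_batch`, `process_case`, and the checkpoint path are hypothetical names, and it assumes per-case results are JSON-serializable (for example, a Dice score).

```python
import json
from pathlib import Path


def run_batch(case_ids, process_case, checkpoint_path):
    """Process cases one by one, saving after each so a rerun can resume."""
    checkpoint = Path(checkpoint_path)
    # Load earlier results if a previous run left a checkpoint behind.
    results = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    for case_id in case_ids:
        if case_id in results:
            continue  # already processed in an earlier run
        results[case_id] = process_case(case_id)
        # Write to a temporary file, then rename it over the checkpoint,
        # so a crash cannot leave a half-written checkpoint file.
        tmp = checkpoint.with_suffix(".tmp")
        tmp.write_text(json.dumps(results, indent=2))
        tmp.replace(checkpoint)
    return results
```

Rerunning with the same checkpoint path after an interruption picks up at the first unprocessed case.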

📚 Learning Resources Needed

  • CVAT tutorial videos for team
  • Radiologic signs of pneumonia refresher
  • SAM usage best practices
  • Medical image segmentation literature review
  • Dice coefficient vs IoU interpretation
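
As a starting point for the Dice vs IoU item: for a single pair of binary masks the two metrics are linked by Dice = 2·IoU / (1 + IoU), so either one can be derived from the other. The helper names below are illustrative. Because the relation is nonlinear, it holds per mask pair and not for means taken over many cases.

```python
def dice_from_iou(iou: float) -> float:
    """Convert IoU (Jaccard index) to the Dice coefficient for one mask pair."""
    return 2.0 * iou / (1.0 + iou)


def iou_from_dice(dice: float) -> float:
    """Convert the Dice coefficient to IoU (Jaccard index) for one mask pair."""
    return dice / (2.0 - dice)


# Thresholds used elsewhere in this plan, expressed as IoU.
for dice in (0.70, 0.75, 0.85):
    print(f"Dice {dice:.2f} -> IoU {iou_from_dice(dice):.3f}")
```

A Dice of 0.75 corresponds to an IoU of 0.60, which shows that IoU is the stricter number for the same overlap.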

🀝 Team Assignments

  • Radiologist 1: Lead annotator, quality control
  • Radiologist 2: Second annotator, validation
  • ML Engineer: Preprocessing, model development
  • Data Manager: File organization, data versioning
  • Project Lead: Coordination, reporting

📅 Timeline Summary

  • Week 1: Setup and preprocessing
  • Weeks 2-4: Ground truth annotation
  • Week 5: SAM integration and testing
  • Week 6: Metrics and analysis
  • Weeks 7-10: ML model development
  • Weeks 11-12: Clinical validation
  • Months 4-6: Manuscript preparation

Last Updated: February 6, 2026

Project Status: Phase 1 - Setup Complete

Next Action: Install dependencies and test Streamlit app