sohiyiy committed on
Commit 15906fa · verified · 1 Parent(s): eba7350

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +21 -44
  2. requirements.txt +6 -7
README.md CHANGED
@@ -4,10 +4,11 @@ emoji: 🐦
  colorFrom: green
  colorTo: green
  sdk: gradio
- sdk_version: 4.44.0
+ sdk_version: 5.9.1
  app_file: app.py
  pinned: true
  license: mit
+ python_version: "3.11"
  tags:
  - bird-identification
  - bioacoustics
@@ -23,15 +24,15 @@ tags:

  **Multi-Modal Bird Identification for India**

- Competitive with BirdNET | 30 Indian Species | 100% Open Source
+ META SAM-Audio Preprocessing | Ollama LLM | 10,000+ Species | CSCR Initiative

  ## 🎯 Features

- ### 🎤 Audio Identification
- - **BirdNET-style analysis** with comprehensive spectral features
- - Mel spectrogram processing
- - Syllable detection and pattern analysis
- - 48kHz processing for high-fidelity analysis
+ ### 🎤 Audio Identification (with META SAM-Audio)
+ - **SAM-Audio style preprocessing** isolates bird calls from noise
+ - Text prompts: "bird call", "bird song" for source separation
+ - Frequency isolation: 500-10000 Hz bird vocalization range
+ - Multi-bird detection via frequency band analysis

  ### 📷 Image Identification
  - Color-based bird matching
@@ -41,50 +42,26 @@ Competitive with BirdNET | 30 Indian Species | 100% Open Source
  ### 📝 Description-Based
  - Natural language bird identification
  - Describe colors, calls, behavior
- - Semantic matching to species database
-
- ## 📊 Species Database
-
- 30 Indian bird species with comprehensive data:
-
- | Species | Frequency Range | Call Pattern |
- |---------|-----------------|--------------|
- | Asian Koel | 1200-3500 Hz | Ascending whistle |
- | Indian Cuckoo | 600-1500 Hz | Four-note call |
- | Jungle Babbler | 800-4000 Hz | Chattering chorus |
- | Coppersmith Barbet | 1500-3000 Hz | Monotonous tapping |
- | Oriental Magpie-Robin | 2000-7000 Hz | Melodious song |
- | ... and 25 more | | |
+ - Semantic matching via LLM

  ## 🔬 Technical Details

- ### Audio Processing Pipeline
- 1. **Preprocessing**: 48kHz resampling, bandpass filter (150-15000 Hz)
- 2. **Feature Extraction**:
- - Spectral: Peak frequency, centroid, bandwidth, flatness, rolloff
- - Temporal: Syllable detection, onset strength
- - Pattern: Melodic/repetitive analysis
- 3. **Matching**: Multi-feature scoring with quality weighting
+ ### META SAM-Audio Integration
+ ```
+ Raw Audio → SAM-Audio Preprocessing → Feature Extraction → LLM Identification
+ ↓
+ Text Prompt: "bird call, bird song"
+ Frequency: 500-10000 Hz
+ Noise Reduction: Spectral gating
+ ```

- ### Identification Accuracy
- - Based on BirdNET methodology
- - Frequency matching (40% weight)
- - Syllable rate matching (20% weight)
- - Pattern matching (20% weight)
- - Duration/amplitude (20% weight)
+ ### LLM Backend
+ - **Local**: Ollama with qwen2.5:3b
+ - **Cloud**: HuggingFace Inference API (fallback)

  ## 🇮🇳 CSCR Initiative

- This project is part of the **Citizen Science for Conservation Research** initiative to develop open-source tools for bird researchers in India.
-
- ## 🔗 Links
-
- - [GitHub Repository](https://github.com/sohamzycus/eagv2/tree/master/birdsense)
- - [Documentation](https://github.com/sohamzycus/eagv2/blob/master/birdsense/README.md)
-
- ## 📝 License
-
- MIT License - Free for research and educational use.
+ Part of the **Citizen Science for Conservation Research** initiative for open-source bird identification tools.

  ---

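The updated README.md describes the audio stage only at a high level: keep the 500-10000 Hz band, then reduce noise with spectral gating. The sketch below shows one way that stage might look with SciPy. It is not the code in app.py and it is not Meta's SAM-Audio model. The function names, the 48 kHz sample rate (borrowed from the previous README), the filter order, and the gating threshold are all assumptions made for illustration.

```python
import numpy as np
from scipy import signal


def isolate_bird_band(audio, sample_rate, low_hz=500.0, high_hz=10000.0, order=4):
    """Keep only the 500-10000 Hz band named in the README (hypothetical helper)."""
    # Second-order sections are numerically safer than (b, a) coefficients.
    sos = signal.butter(order, [low_hz, high_hz], btype="bandpass",
                        fs=sample_rate, output="sos")
    # Forward-backward filtering gives zero phase distortion.
    return signal.sosfiltfilt(sos, audio)


def spectral_gate(audio, sample_rate, nperseg=1024, threshold_scale=1.5):
    """Very simple spectral gate (assumed design, not the Space's implementation).

    The per-frequency median magnitude over time is used as a noise-floor
    estimate; time-frequency bins below threshold_scale * floor are zeroed.
    """
    _, _, spec = signal.stft(audio, fs=sample_rate, nperseg=nperseg)
    magnitude = np.abs(spec)
    noise_floor = np.median(magnitude, axis=1, keepdims=True)
    mask = magnitude > threshold_scale * noise_floor
    _, cleaned = signal.istft(spec * mask, fs=sample_rate, nperseg=nperseg)
    # istft may return a few extra padded samples; trim to the input length.
    return cleaned[: len(audio)]


def preprocess(audio, sample_rate=48000):
    """Band isolation followed by gating, mirroring the README's pipeline order."""
    return spectral_gate(isolate_bird_band(audio, sample_rate), sample_rate)
```

Note that a median-based gate treats any sound that is steady for the whole clip as noise, so a bird that calls continuously at one pitch would be suppressed too. A production gate would more likely estimate the floor from a known quiet segment.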
 
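The "LLM Backend" section added in this commit lists a local Ollama model (qwen2.5:3b) with a cloud fallback. A minimal sketch of that fallback order follows. It assumes Ollama is serving its REST API on the default local port and uses only the standard library so it runs without extra packages. `ask_ollama` and `identify_with_fallback` are hypothetical names, and the cloud backend is left as a caller-supplied function because the README does not say which hosted model or endpoint the Space calls.

```python
import json
import urllib.request
from typing import Callable, Sequence

# Ollama's default local endpoint (assumes a stock local install).
OLLAMA_URL = "http://localhost:11434/api/generate"


def ask_ollama(prompt: str, model: str = "qwen2.5:3b", timeout: float = 30.0) -> str:
    """Send one non-streaming generation request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


def identify_with_fallback(prompt: str,
                           backends: Sequence[Callable[[str], str]]) -> str:
    """Try each backend in order and return the first successful answer."""
    errors = []
    for backend in backends:
        try:
            return backend(prompt)
        except (OSError, KeyError, ValueError) as exc:
            # OSError covers connection failures and timeouts; KeyError and
            # ValueError cover malformed or unexpected JSON replies.
            name = getattr(backend, "__name__", repr(backend))
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all LLM backends failed: " + "; ".join(errors))
```

A caller would pass `[ask_ollama, ask_cloud]`, where `ask_cloud` is whatever function wraps the hosted fallback; the local model is then tried first and the cloud is only contacted when the local call fails.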
requirements.txt CHANGED
@@ -1,8 +1,7 @@
- # BirdSense Pro - HuggingFace Space
- # Use latest stable Gradio
+ # BirdSense Pro Requirements
+ # Using Gradio 5.x for stable deployment

- gradio>=4.0
- numpy
- scipy
- requests
- Pillow
+ numpy>=1.21.0
+ scipy>=1.7.0
+ requests>=2.28.0
+ Pillow>=9.0.0
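
After this change, requirements.txt sets minimum versions for four packages and no longer lists gradio, while the README front matter now sets `sdk_version: 5.9.1`. As a rough illustration of what the new lower bounds mean at runtime, the sketch below checks an environment against them. The helper names are hypothetical, and the comparison only looks at the leading numeric part of a version string, so it is a simplification of full version-specifier rules.

```python
import re
from importlib import metadata

# Minimums copied from the new requirements.txt.
MINIMUMS = {
    "numpy": "1.21.0",
    "scipy": "1.7.0",
    "requests": "2.28.0",
    "Pillow": "9.0.0",
}


def version_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '1.26.4' -> (1, 26, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numerically, padding with zeros so '2.0' equals '2.0.0'."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)) >= b + (0,) * (width - len(b))


def unmet_requirements(minimums=MINIMUMS) -> dict[str, str]:
    """Map each package that is missing or too old to a short reason."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.7.0" sorts after "1.21.0", which would wrongly accept an old release.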