# EAA Fusion Head for Gemma (LoRA) + w2v-bert-2.0 + emotion2vec

This repo hosts the **fusion head** weights and code for the Emotion-Aware Audio LLM.

- LoRA adapter lives at: **marccgrau/eaa-gemma3-270m-adapter**
- Upstream encoders: `facebook/w2v-bert-2.0` (semantic) and `iic/emotion2vec_base` (acoustic, via FunASR)
- LLM: `google/gemma-3-270m`
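The fusion head consumes frame-level features from both encoders. A minimal extraction sketch, assuming 16 kHz mono audio and the encoders' default frame-level outputs; `audio.wav`, the resampling step, and the FunASR `granularity`/`extract_embedding` settings are assumptions, not taken from this repo's training pipeline:

```python
import torch
import torchaudio
from transformers import AutoFeatureExtractor, Wav2Vec2BertModel
from funasr import AutoModel as FunASRAutoModel

# Load and resample to 16 kHz mono (assumed input format)
wav, sr = torchaudio.load("audio.wav")  # illustrative path
wav = torchaudio.functional.resample(wav, sr, 16000).mean(0)

# Semantic encoder: w2v-bert-2.0 frame-level hidden states (dim 1024)
fe = AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")
w2v = Wav2Vec2BertModel.from_pretrained("facebook/w2v-bert-2.0").eval()
inputs = fe(wav.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    sem_feats = w2v(**inputs).last_hidden_state  # (1, T_sem, 1024)

# Acoustic encoder: emotion2vec via FunASR, frame-level embeddings (dim 768)
e2v = FunASRAutoModel(model="iic/emotion2vec_base")
res = e2v.generate("audio.wav", granularity="frame", extract_embedding=True)
ac_feats = torch.tensor(res[0]["feats"]).unsqueeze(0)  # (1, T_ac, 768)
```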
## Files

- `fusion_head.pt`: PyTorch state_dict of the fusion/regression head
- `eaa_config.json`: minimal config (model IDs, feature dims, hyperparameters)
- `modeling_eaa.py`: the fusion architecture (dual cross-attention + pooling + [REG] head)
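For orientation, `eaa_config.json` carries only what the quickload below reads. A hypothetical example; the field names match the loading code, while the numeric values are illustrative (with `d_sem`/`d_ac` matching the encoders' published hidden sizes) rather than read from this repo:

```json
{
  "gemma_id": "google/gemma-3-270m",
  "adapter_repo": "marccgrau/eaa-gemma3-270m-adapter",
  "d_sem": 1024,
  "d_ac": 768,
  "llm_hidden": 640,
  "fusion_dim": 512,
  "num_audio_tokens": 8
}
```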
## Quickload (Python)

```python
import json, os, sys
import torch
from huggingface_hub import hf_hub_download

REPO = "marccgrau/eaa-gemma3-270m-w2vbert-emotion2vec"

# Download artifacts: config, fusion architecture, head weights
cfg_path = hf_hub_download(repo_id=REPO, filename="eaa_config.json")
mod_path = hf_hub_download(repo_id=REPO, filename="modeling_eaa.py")
sd_path = hf_hub_download(repo_id=REPO, filename="fusion_head.pt")

with open(cfg_path) as f:
    cfg = json.load(f)

# Make the downloaded module importable before using it
sys.path.insert(0, os.path.dirname(mod_path))
from modeling_eaa import EAAEmotionRegressor

# Recreate Gemma and load the LoRA adapter on top
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tok = AutoTokenizer.from_pretrained(cfg["gemma_id"], trust_remote_code=True)
llm_base = AutoModelForCausalLM.from_pretrained(
    cfg["gemma_id"], trust_remote_code=True, torch_dtype=torch.float16
).cuda()
llm = PeftModel.from_pretrained(llm_base, cfg["adapter_repo"]).eval()

# Build the fusion head and load its weights
head = EAAEmotionRegressor(
    d_sem=cfg["d_sem"], d_ac=cfg["d_ac"], llm_hidden=cfg["llm_hidden"],
    fusion_dim=cfg["fusion_dim"], num_audio_tokens=cfg["num_audio_tokens"],
).cuda().eval()
head.load_state_dict(torch.load(sd_path, map_location="cpu"))

# Now pass (sem_feats, ac_feats) and input_ids to head.forward(..., llm=llm)
```
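Putting it together, a hedged inference sketch; the exact `forward` signature and the prompt wording are assumptions (check `modeling_eaa.py` for the real argument names), and the shape of the output depends on the regression targets the head was trained on:

```python
# sem_feats / ac_feats as produced by the encoder sketch above
prompt_ids = tok("Describe the speaker's emotion.",  # illustrative prompt, not from this repo
                 return_tensors="pt").input_ids.cuda()

with torch.no_grad():
    preds = head(
        sem_feats.cuda(),   # semantic features from w2v-bert-2.0
        ac_feats.cuda(),    # acoustic features from emotion2vec
        input_ids=prompt_ids,
        llm=llm,            # Gemma base + LoRA adapter
    )
print(preds)  # regression outputs from the [REG] head
```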