Long Grounded Thoughts: Distilling Compositional Visual Reasoning Chains at Scale • Paper • 2511.05705 • Published 8 days ago • 6
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM • Paper • 2510.15870 • Published 29 days ago • 86
NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts • Paper • 2411.05945 • Published Nov 8, 2024 • 4
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models • Paper • 2309.15701 • Published Sep 27, 2023 • 2
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition • Paper • 2310.06434 • Published Oct 10, 2023 • 4
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition • Paper • 2309.15223 • Published Sep 26, 2023 • 22