Vendor ABC

company

AI & ML interests

None defined yet.

Rajiv Shah, Taylor Linton

rajistics 
posted an update 10 months ago
Having some fun with long context benchmarks (watch the video!!)

NoLiMa: Long-Context Evaluation Beyond Literal Matching (2502.05167)
Fiction LiveBench: https://fiction.live/stories/Fiction-liveBench-Mar-25-2025/oQdzQvKHw8JyXbN87
Michelangelo: https://deepmind.google/research/publications/117639/
LongGenBench: Spinning the Golden Thread: Benchmarking Long-Form Generation in Language Models (2409.02076)
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? (2407.11963)
RULER: What's the Real Context Size of Your Long-Context Language Models? (2404.06654)

For more: https://www.reddit.com/r/rajistics/comments/1jxwk29/long_context_llm_benchmarks_video/

Let me know if you like these posts.
rajistics 
updated 3 models over 3 years ago

vendorabc/tabular-playground

Tabular Classification • Updated Aug 30, 2022

vendorabc/modeltest

Tabular Classification • Updated Aug 30, 2022

vendorabc/modelhubexample

Tabular Classification • Updated Aug 30, 2022