Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

liked a model 3 days ago

Alibaba-NLP/GVE-3B

upvoted a collection 3 days ago

GVE

Towards General Video Embeddings: Models and Benchmarks • 4 items • Updated 6 days ago • 16

upvoted a paper 4 days ago

Trove: A Flexible Toolkit for Dense Retrieval

Paper • 2511.01857 • Published 5 days ago • 10

upvoted 2 papers 7 days ago

The Principles of Diffusion Models

Paper • 2510.21890 • Published 16 days ago • 51

Reasoning Language Model Inference Serving Unveiled: An Empirical Study

Paper • 2510.18672 • Published 18 days ago • 7

upvoted a collection 7 days ago

LightOnOCR

The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR • 6 items • Updated 13 days ago • 13

upvoted a paper 8 days ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 9 days ago • 99

upvoted 2 papers 12 days ago

WorldGrow: Generating Infinite 3D World

Paper • 2510.21682 • Published 15 days ago • 40

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published 23 days ago • 45

upvoted 2 papers 15 days ago

Attention Sinks in Diffusion Language Models

Paper • 2510.15731 • Published 22 days ago • 47

olmOCR 2: Unit Test Rewards for Document OCR

Paper • 2510.19817 • Published 17 days ago • 13

upvoted 2 papers 17 days ago

AION-1: Omnimodal Foundation Model for Astronomical Sciences

Paper • 2510.17960 • Published 19 days ago • 27

Chem-R: Learning to Reason as a Chemist

Paper • 2510.16880 • Published 20 days ago • 52

upvoted a paper 18 days ago

When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling

Paper • 2510.15346 • Published 23 days ago • 32

upvoted 2 papers 19 days ago

Robust Layerwise Scaling Rules by Proper Weight Decay Tuning

Paper • 2510.15262 • Published 23 days ago • 5

Language Models Model Language

Paper • 2510.12766 • Published 25 days ago • 23

upvoted a paper 22 days ago

Large Language Models Do NOT Really Know What They Don't Know

Paper • 2510.09033 • Published 30 days ago • 16

upvoted a paper 24 days ago

Deconstructing Attention: Investigating Design Principles for Effective Language Modeling

Paper • 2510.11602 • Published 26 days ago • 14

upvoted a paper 25 days ago

A Survey of Vibe Coding with Large Language Models

Paper • 2510.12399 • Published 26 days ago • 47

upvoted a collection 25 days ago

Qwen3-VL

37 items • Updated 7 days ago • 375

upvoted a collection 26 days ago

Nanonets-OCR2

2 items • Updated 26 days ago • 24