Victor Jotham Ashioya

ashioyajotham

https://ashioyajotham.github.io/

AI & ML interests

Hallucination in LLMs, AI Safety: alignment, red-teaming

Organizations

None yet

upvoted 2 papers 2 months ago

Lost in Embeddings: Information Loss in Vision-Language Models

Paper • 2509.11986 • Published Sep 15 • 27

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 188

upvoted 4 papers 3 months ago

Neither Valid nor Reliable? Investigating the Use of LLMs as Judges

Paper • 2508.18076 • Published Aug 25 • 6

AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions

Paper • 2508.16402 • Published Aug 22 • 14

Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning

Paper • 2508.03501 • Published Aug 5 • 57

Tool-integrated Reinforcement Learning for Repo Deep Search

Paper • 2508.03012 • Published Aug 5 • 20

upvoted a paper 4 months ago

UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities

Paper • 2507.19766 • Published Jul 26 • 14

upvoted 2 papers 7 months ago

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models

Paper • 2505.02735 • Published May 5 • 34

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 89

upvoted 2 papers 10 months ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30

upvoted a paper about 1 year ago

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published Aug 22, 2024 • 94

upvoted 8 papers over 1 year ago

Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach

Paper • 2405.15613 • Published May 24, 2024 • 17

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 90

Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory

Paper • 2405.08707 • Published May 14, 2024 • 34

LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25, 2024 • 72

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14, 2024 • 129

Algorithmic progress in language models

Paper • 2403.05812 • Published Mar 9, 2024 • 20

Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11, 2024 • 91

Common 7B Language Models Already Possess Strong Math Capabilities

Paper • 2403.04706 • Published Mar 7, 2024 • 20