- LionGuard: Building a Contextualized Moderation Classifier to Tackle Localized Unsafe Content
  Paper • 2407.10995 • Published • 2
- A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection
  Paper • 2411.12946 • Published • 22
- Safe at the Margins: A General Approach to Safety Alignment in Low-Resource English Languages -- A Singlish Case Study
  Paper • 2502.12485 • Published • 2
- MinorBench: A hand-built benchmark for content-based risks for children
  Paper • 2503.10242 • Published • 5
Collections including paper arxiv:2411.12946
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 14
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 60
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 48

- Video Creation by Demonstration
  Paper • 2412.09551 • Published • 9
- DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
  Paper • 2412.07589 • Published • 48
- Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
  Paper • 2412.06531 • Published • 72
- APOLLO: SGD-like Memory, AdamW-level Performance
  Paper • 2412.05270 • Published • 38

- Human-like Episodic Memory for Infinite Context LLMs
  Paper • 2407.09450 • Published • 62
- MUSCLE: A Model Update Strategy for Compatible LLM Evolution
  Paper • 2407.09435 • Published • 23
- Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
  Paper • 2407.09121 • Published • 6
- ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
  Paper • 2407.14482 • Published • 26