System 2 Attention (is something you might need too) • Paper • 2311.11829 • Published Nov 20, 2023 • 44
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers • Paper • 2311.10642 • Published Nov 17, 2023 • 26
Orca 2: Teaching Small Language Models How to Reason • Paper • 2311.11045 • Published Nov 18, 2023 • 77