codelion committed
Commit 47ee41f · verified · 1 Parent(s): 7d1c75a

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -120,7 +120,7 @@ Dhara-70M is a novel diffusion language model that achieves:
  | **FF Dimension** | 1024 |
  | **Attention Heads** | 8 |
  | **KV Heads** | 4 (GQA) |
- | **Context Length** | 2048 tokens |
+ | **Context Length** | 1024 tokens |
  | **Position Encoding** | RoPE |
  | **Normalization** | RMSNorm |
  | **Special Layers** | Canon (depthwise causal convolutions) |
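
As a rough illustration of what the corrected table implies for callers, the sketch below copies the values from the README rows in this diff. It assumes those values are accurate. The helper names (`queries_per_kv_head`, `truncate_to_context`) are hypothetical and are not part of the Dhara-70M repository or any library.

```python
# Values copied from the README table in this diff.
CONTEXT_LENGTH = 1024   # tokens (this commit corrects the README from 2048)
ATTENTION_HEADS = 8
KV_HEADS = 4            # grouped-query attention (GQA)


def queries_per_kv_head(n_heads: int, n_kv_heads: int) -> int:
    """Number of query heads that share each key/value head under GQA."""
    if n_heads % n_kv_heads != 0:
        raise ValueError("attention heads must be divisible by KV heads")
    return n_heads // n_kv_heads


def truncate_to_context(token_ids, max_len: int = CONTEXT_LENGTH):
    """Hypothetical helper: keep only the first max_len token ids."""
    return list(token_ids)[:max_len]


if __name__ == "__main__":
    print(queries_per_kv_head(ATTENTION_HEADS, KV_HEADS))
    print(len(truncate_to_context(range(3000))))
```

A prompt sized against the old 2048-token figure could be twice as long as the documented limit, so input-length checks written against the earlier README may need revisiting.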