Fix modeling_nemotron_h.py
#28 opened about 1 month ago
by
rrs1616
NVIDIA-Nemotron-Nano-9B-v2 with Docker
1
#27 opened about 2 months ago
by
MOHASOFT
Add Streaming Tool Calling support
#26 opened about 2 months ago
by
crisafullifr
What inference setting for coding?
1
#25 opened 2 months ago
by
akierum
Can we have more detailed instructions on installing dependencies?
➕
1
1
#24 opened 2 months ago
by
steveheh
Update README.md
#23 opened 3 months ago
by
sudoping01
Any plans to release the training recipe?
5
2
#21 opened 3 months ago
by
nskwal
Request: DOI
#19 opened 3 months ago
by
itsAmmar
feat: Add CPU support
#18 opened 3 months ago
by
gabegoodhart
I think yall can afford to benchmark Qwen 3 8B
1
1
#17 opened 3 months ago
by
owenqwenllmwine
Slower than Qwen3-8B despite claimed 3x inference speedup
9
#16 opened 3 months ago
by
coszeros
sad! no tool calls in streaming mode.
#15 opened 3 months ago
by
j4ys0n
HybridMambaAttentionDynamicCache is not valid?
➕
2
2
#14 opened 3 months ago
by
GentleLiu
Any plans for MLX support?
1
#12 opened 3 months ago
by
Alealejandrooo
some problem when I asked the model: 你是谁？ ("Who are you?")
🤯
2
3
#8 opened 3 months ago
by
wenzel94
OOM with vllm==0.10.1 on GPU L40S
2
#7 opened 3 months ago
by
qingfu
GGUF support
❤️
4
18
#4 opened 3 months ago
by
RedEyed
This just trades general performance for domain specific gains.
🔥
16
11
#3 opened 3 months ago
by
phil111