	---
	language:
	- en
	- ko
	license: other
	license_name: solar-apache-2.0
	tags:
	- upstage
	- solar
	- moe
	- 100b
	- llm
	---

	<p align="center">
	<img src="./Solar-Open-100B.png" alt="Solar Open Model" width="100%">
	</p>

	# Solar Open

	Solar Open is Upstage's flagship 102B-parameter large language model, trained entirely from scratch and released under the Solar-Apache License 2.0 (see [LICENSE](#license) for details). Built on a Mixture-of-Experts (MoE) architecture, it is designed for enterprise-grade performance in reasoning, instruction following, and agentic tasks, while prioritizing transparency and customization for the open-source community.

	## Highlights

	* MoE Architecture (102B / 12B): Built on a Mixture-of-Experts architecture with 102B total / 12B active parameters. This design delivers the knowledge depth of a massive model with the inference speed and cost-efficiency of a much smaller model.
	* Massive Training Scale: Pre-trained on 19.7 trillion tokens, ensuring broad knowledge coverage and robust reasoning capabilities across various domains.
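As a rough check on the headline figures, the following sketch computes the share of parameters active per token and the ratio of pre-training tokens to total parameters. The estimate of about 2 FLOPs per active parameter per token is a common rule of thumb and an assumption here, not a figure published for this model.

```python
# Back-of-the-envelope arithmetic from the headline figures:
# 102B total parameters, 12B active per token, 19.7T pre-training tokens.
TOTAL_PARAMS = 102e9
ACTIVE_PARAMS = 12e9
PRETRAIN_TOKENS = 19.7e12

# Share of the model that participates in each token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Pre-training tokens seen per total parameter.
tokens_per_param = PRETRAIN_TOKENS / TOTAL_PARAMS

# Assumption: roughly 2 FLOPs per active parameter per token (rule of thumb).
approx_flops_per_token = 2 * ACTIVE_PARAMS

print(f"active fraction: {active_fraction:.1%}")
print(f"tokens per parameter: {tokens_per_param:.0f}")
print(f"approx forward FLOPs per token: {approx_flops_per_token:.1e}")
```

Per-token compute scales with the active parameters, which is where the speed and cost advantage of the MoE design comes from.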

	## Model Overview

	* Model Name: Solar Open 100B
	* Hugging Face ID: upstage/Solar-Open-100B
	* Architecture: Mixture-of-Experts (MoE)
	* Total Parameters: 102.6B
	* Active Parameters: 12B (per token)
	* Experts: 129 Experts (top 8 among 128 Routed + 1 Shared)
	* Pre-training Tokens: 19.7 Trillion
	* Context Length: 128k
	* Training Hardware: NVIDIA B200 GPUs
	* License: Solar-Apache License 2.0 (See [LICENSE](./LICENSE))
	* Hardware Requirements:
	  * Minimum: 4x NVIDIA A100 (80GB)
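To illustrate the expert layout listed above (top 8 of 128 routed experts plus 1 shared expert), here is a toy routing sketch in plain Python. The gating rule, a softmax over the selected logits only, is an assumption made for illustration; this README does not describe Solar Open's actual router.

```python
import math
import random

NUM_ROUTED = 128  # routed experts, from the overview
TOP_K = 8         # routed experts selected per token
NUM_SHARED = 1    # shared expert, assumed to run for every token


def route(router_logits):
    """Return (expert_index, weight) pairs for the top-k routed experts.

    Assumption: weights are a softmax over the selected logits only.
    """
    if len(router_logits) != NUM_ROUTED:
        raise ValueError("expected one logit per routed expert")
    # Indices of the TOP_K largest logits, highest first.
    top = sorted(range(NUM_ROUTED), key=lambda i: router_logits[i], reverse=True)[:TOP_K]
    peak = router_logits[top[0]]
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Random logits stand in for a learned router's output for one token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_ROUTED)]
selected = route(logits)

experts_run = len(selected) + NUM_SHARED
print(f"experts run for this token: {experts_run} of {NUM_ROUTED + NUM_SHARED}")
```

Only 9 of the 129 experts run for any given token, which is why the active parameter count is far below the total.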

	## License
	This repository contains both model weights and code,
	which are licensed under different terms:

	1. MODEL WEIGHTS (`*.safetensors`)
	   Licensed under Solar-Apache License 2.0
	   See: https://huggingface.co/upstage/Solar-Open-100B/blob/main/LICENSE

	2. CODE (`*.py`, `*.json`, `*.jinja` files)
	   Licensed under Apache License 2.0
	   See: https://www.apache.org/licenses/LICENSE-2.0

	## Performance

	TBA

	## Inference Quickstart

	We recommend using the following generation parameters:

	```
	temperature=0.8
	top_p=0.95
	top_k=50
	```
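For readers unfamiliar with these settings, the sketch below shows how the three parameters narrow a next-token distribution. It applies temperature, then top-k, then top-p. That ordering and the toy logits are assumptions for illustration; real inference engines differ in details such as tie handling.

```python
import math


def filtered_distribution(logits, temperature=0.8, top_k=50, top_p=0.95):
    """Return {token_id: probability} after temperature, top-k, and top-p filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring tokens, highest first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the surviving tokens (shifted by the max for stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token_id, p in zip(ranked, probs):
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {token_id: p / norm for token_id, p in kept}


toy_logits = [4.0, 3.0, 2.0, 1.0, 0.0, -1.0]  # hypothetical six-token vocabulary
dist = filtered_distribution(toy_logits)
print(dist)
```

With these toy logits, only the three most likely tokens survive the top-p cutoff; the sampler then draws from that reduced set.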

	### Transformers

	Install the required dependencies:

	```bash
	pip install -U transformers kernels torch accelerate
	```

	Run inference with the following code:

	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	MODEL_ID = "upstage/Solar-Open-100B"

	# Load model and tokenizer
	tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

	model = AutoModelForCausalLM.from_pretrained(
	    pretrained_model_name_or_path=MODEL_ID,
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	    trust_remote_code=True,
	)

	# Prepare input
	messages = [{"role": "user", "content": "who are you?"}]
	inputs = tokenizer.apply_chat_template(
	    messages,
	    tokenize=True,
	    add_generation_prompt=True,
	    return_dict=True,
	    return_tensors="pt",
	)
	inputs = inputs.to(model.device)

	# Generate response
	generated_ids = model.generate(
	    **inputs,
	    max_new_tokens=4096,
	    temperature=0.8,
	    top_p=0.95,
	    top_k=50,
	    do_sample=True,
	)
	generated_text = tokenizer.decode(generated_ids[0][inputs.input_ids.shape[1] :])
	print(generated_text)
	```
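Because the snippet above loads the weights in `bfloat16` and lets `device_map="auto"` spread them across GPUs, it can help to estimate the weight footprint first. The sketch below is a lower bound under stated assumptions: it counts weights only (no KV cache, activations, or framework overhead) and uses decimal gigabytes.

```python
# Rough weight-memory estimate for bfloat16 inference.
TOTAL_PARAMS = 102.6e9     # total parameters, from the Model Overview
BYTES_PER_PARAM_BF16 = 2   # bfloat16 stores each parameter in 16 bits
GPU_MEMORY_GB = 80         # one NVIDIA A100 (80GB)
NUM_GPUS = 4               # the stated minimum configuration

# All experts must be resident in memory, even though only ~12B parameters
# are active for any single token.
weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM_BF16 / 1e9
cluster_gb = GPU_MEMORY_GB * NUM_GPUS
headroom_gb = cluster_gb - weights_gb

print(f"bf16 weights: ~{weights_gb:.1f} GB")
print(f"4x A100 80GB: {cluster_gb} GB total")
print(f"left for KV cache and activations: ~{headroom_gb:.1f} GB")
```

The remaining headroom has to cover the KV cache, so long contexts may need more than the minimum configuration.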

	### vLLM

	#### Option 1: Using Docker (Highly Recommended)
	Docker is the recommended deployment method for running `Solar-Open-100B`.

	```bash
	# For 8 GPUs
	docker run --gpus all \
	    --ipc=host \
	    -p 8000:8000 \
	    upstage/vllm-solar-open:latest \
	    upstage/Solar-Open-100B \
	    --trust-remote-code \
	    --enable-auto-tool-choice \
	    --tool-call-parser solar_open \
	    --reasoning-parser solar_open \
	    --logits-processors vllm.model_executor.models.parallel_tool_call_logits_processor:ParallelToolCallLogitsProcessor \
	    --logits-processors vllm.model_executor.models.solar_open_logits_processor:SolarOpenTemplateLogitsProcessor \
	    --tensor-parallel-size 8
	```
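Once the container is running, the server can be queried over HTTP. The sketch below assumes the image starts vLLM's OpenAI-compatible server on the mapped port 8000 and that it accepts `top_k` as an extra sampling field in the request body. Neither is confirmed by this README, and the `max_tokens` value is arbitrary.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumes the -p 8000:8000 mapping above
MODEL = "upstage/Solar-Open-100B"


def build_chat_request(prompt):
    """Build a POST request for the chat completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.8,
        "top_p": 0.95,
        "top_k": 50,        # assumption: accepted as a vLLM-specific extra field
        "max_tokens": 512,  # arbitrary cap for this example
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=600):
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Building the request needs no network access; chat() needs a running server.
request = build_chat_request("who are you?")
print(request.full_url)
```

With the server up, `print(chat("who are you?"))` sends the request and prints the reply.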

	#### Option 2: Installing from Source
	For development, debugging, custom modifications or offline inference, Solar Open can also be run
	using a source installation of vLLM. We recommend using [uv](https://docs.astral.sh/uv/) for environment
	management and dependency resolution.

	Create and activate a Python virtual environment:
	```bash
	uv venv --python 3.12 --seed
	source .venv/bin/activate
	```

	Install Solar Open's optimized vLLM:
	```bash
	VLLM_PRECOMPILED_WHEEL_LOCATION="https://github.com/vllm-project/vllm/releases/download/v0.12.0/vllm-0.12.0-cp38-abi3-manylinux_2_31_x86_64.whl" \
	VLLM_USE_PRECOMPILED=1 \
	uv pip install git+https://github.com/UpstageAI/vllm.git@v0.12.0-solar-open
	```

	Start the vLLM server (for 8 GPUs):
	```bash
	vllm serve upstage/Solar-Open-100B \
	    --trust-remote-code \
	    --enable-auto-tool-choice \
	    --tool-call-parser solar_open \
	    --reasoning-parser solar_open \
	    --logits-processors vllm.model_executor.models.parallel_tool_call_logits_processor:ParallelToolCallLogitsProcessor \
	    --logits-processors vllm.model_executor.models.solar_open_logits_processor:SolarOpenTemplateLogitsProcessor \
	    --tensor-parallel-size 8
	```
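The `--enable-auto-tool-choice` and `--tool-call-parser` flags let the server return structured tool calls. The sketch below builds a request body in the OpenAI chat-completions tool format and parses tool calls out of a response. The `get_weather` tool is hypothetical, and `mock_response` is a hand-written example of the response shape, not real model output.

```python
import json


def build_tool_payload(prompt, model="upstage/Solar-Open-100B"):
    """Build a chat-completions body that offers one tool to the model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool for illustration
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }


def extract_tool_calls(response):
    """Return (name, arguments) pairs from a chat-completions response dict."""
    message = response["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    # Arguments arrive as a JSON-encoded string and need decoding.
    return [(c["function"]["name"], json.loads(c["function"]["arguments"])) for c in calls]


# Hand-written response in the OpenAI shape; not produced by the model.
mock_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_0",
                        "type": "function",
                        "function": {"name": "get_weather", "arguments": '{"city": "Seoul"}'},
                    }
                ],
            }
        }
    ]
}

payload = build_tool_payload("What's the weather in Seoul?")
print(extract_tool_calls(mock_response))
```

The payload would be POSTed to the server's `/v1/chat/completions` endpoint. After running the tool, the application sends the result back as a `tool` role message for the model to continue.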

	## Public API Access

	The official API service for Solar Open is scheduled to launch publicly in January.

	* Access: Upstage Console (TBA)
	* Documentation: Upstage Console (TBA)

	## Citation

	If you use Solar Open in your research, please cite:

	```bibtex
	@misc{solar-open-2025,
	title={Solar Open: Scaling Upstage's LLM Capabilities with MoE},
	author={Upstage AI},
	year={2025},
	url={https://huggingface.co/upstage/Solar-Open-100B}
	}
	```