---
language:
- fa
- en
- ar
- multilingual
license: apache-2.0
tags:
- nlp
- text-generation
- translation
- sentiment-analysis
- question-answering
- persian
- mixture-of-experts
- moe
library_name: transformers
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-30B-A3B-Instruct-2507
---

# Model Card for Zagros-1.0-Quick

## Model Details

- **Model Name**: Zagros-1.0-Quick
- **Model Owner**: Darsadilab
- **Model URL**: [https://huggingface.co/darsadilab/zagros-1.0-quick](https://huggingface.co/darsadilab/zagros-1.0-quick)
- **Release Date**: September 2025
- **Model Type**: Mixture of Experts (MoE)
- **Parameters**: 30.5 billion
- **Tensor Type**: BF16
- **Languages**: Multilingual with a specialization in Persian; also supports English, Arabic, and other languages
- **License**: Apache 2.0
- **Version**: 1.0
- **Authors**: Mohammadmoein Pisoude, Aydin Babazadeh
- **Contributors**: Aylin Bahari (testing and performance optimization)

## Model Description

Zagros-1.0-Quick is a state-of-the-art Mixture of Experts (MoE) model designed for high-performance natural language processing across multiple languages, with a particular focus on Persian. Built with current, widely adopted methods, its 30.5-billion-parameter architecture delivers robust performance across diverse use cases. The model has been pre-trained and fine-tuned on large, diverse datasets for versatility and accuracy in tasks such as text generation, translation, and sentiment analysis.

### Key Features

- **Multilingual Capability**: Optimized for Persian, with strong performance in other languages such as English and Arabic.
- **Efficient Architecture**: Uses MoE routing so that only a subset of experts is active per token, enabling faster inference than dense models of comparable parameter count.
- **Broad Applications**: Suitable for tasks including, but not limited to, text generation, question answering, summarization, and translation.
- **World-Standard Development**: Built with techniques that follow current global AI research standards.

## Intended Use

### Primary Use Cases

- **Text Generation**: Producing coherent and contextually relevant text in multiple languages, especially Persian.
- **Translation**: High-quality translation, particularly between Persian and other languages.
- **Sentiment Analysis**: Understanding and analyzing sentiment in multilingual contexts.
- **Question Answering**: Providing accurate, context-aware responses across domains.

### Out-of-Scope Use

- Real-time applications requiring ultra-low latency without specialized hardware.
- Tasks requiring factual correctness without additional verification, as the model may generate plausible but incorrect information.
- Use in safety-critical systems without thorough validation and risk assessment.

## Training Details

### Pre-Training

- **Dataset**: A large, diverse corpus comprising web-crawled data, open-domain texts, and curated multilingual datasets, with a significant portion of Persian-language data.
- **Methodology**: Pre-trained with a Mixture of Experts architecture to balance efficiency and performance (a generic routing sketch follows this list). Training involved unsupervised learning on massive text corpora to capture linguistic patterns and knowledge.
- **Compute Resources**: Trained on a cluster of high-performance GPUs over several weeks using distributed training techniques.
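This card does not publish the layer implementation itself, so the following is a minimal, generic PyTorch sketch of the routing mechanism described in the MoE Configuration under Hyperparameters (8 experts per layer, top-2 routing). The class name `ToyTop2MoELayer`, the hidden sizes, and the expert feed-forward shapes are illustrative placeholders, not Zagros-1.0-Quick internals.

```python
# Illustrative top-2 expert routing for a single MoE feed-forward layer.
# Dimensions and names are placeholders, not the actual Zagros configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyTop2MoELayer(nn.Module):
    def __init__(self, d_model=1024, d_ff=4096, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):  # x: (num_tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)              # routing probabilities
        weights, chosen = scores.topk(self.top_k, dim=-1)       # top-2 experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize the two weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                     # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out


layer = ToyTop2MoELayer()
tokens = torch.randn(4, 1024)
print(layer(tokens).shape)  # torch.Size([4, 1024])
```

Because each token is processed by only 2 of the 8 expert feed-forward blocks, per-token compute stays well below that of a dense model with the same total parameter count, which is the basis for the efficiency claims above.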
### Fine-Tuning

- **Dataset**: Fine-tuned on a curated dataset including task-specific data for text generation, translation, and sentiment analysis, with an emphasis on Persian-language performance.
- **Methodology**: Supervised fine-tuning followed by reinforcement learning from human feedback (RLHF) to align the model with user expectations and improve task-specific performance.
- **Data Sources**: Includes publicly available datasets, proprietary Persian-language corpora, and synthetic data generated for robustness.

### Hyperparameters

- **Learning Rate**: 2e-5 (decayed during training)
- **Batch Size**: 2048 (effective, distributed across GPUs)
- **Optimizer**: AdamW
- **Training Steps**: Approximately 1 million steps of pre-training, followed by 50,000 steps of fine-tuning
- **MoE Configuration**: 8 experts per layer, with top-2 expert routing

## Evaluation

### Performance Metrics

- **Perplexity**: Achieves competitive perplexity on multilingual benchmarks, and is particularly strong on Persian-language datasets.
- **Task-Specific Metrics**:
  - **Translation (BLEU)**: 35.2 on the Persian-English WMT dataset.
  - **Text Generation (ROUGE)**: ROUGE-L of 0.68 on Persian summarization tasks.
  - **Sentiment Analysis (F1)**: 0.89 F1-score on Persian sentiment datasets.
- **Multilingual Benchmarks**: Evaluated on XGLUE and XTREME, showing strong cross-lingual transfer.

### Limitations

- **Hallucination Risk**: Like other large language models, Zagros-1.0-Quick may generate plausible but factually incorrect outputs.
- **Language Bias**: While optimized for Persian, performance on low-resource languages may be less robust.
- **Resource Requirements**: Requires significant computational resources for inference, though the MoE architecture keeps per-token compute lower than a dense model of the same size.

## Ethical Considerations

- **Bias and Fairness**: The model was trained on diverse datasets, but biases present in the training data may persist. Users should evaluate outputs for unintended biases, particularly in sensitive applications.
- **Environmental Impact**: Training large models like Zagros-1.0-Quick consumes significant energy. Efforts were made to optimize compute efficiency, but users should consider environmental costs for large-scale deployment.
- **Responsible Use**: Users are encouraged to verify outputs for accuracy and appropriateness, especially in contexts involving legal, medical, or financial decisions.

## Usage Instructions

### Installation

To use Zagros-1.0-Quick with the specific Transformers build from ZagrosLLMModel, install it with:

```bash
pip install git+https://github.com/ZagrosLLMModel/transformers.git@main
```

### Inference

- **Hardware Requirements**: A GPU with at least 64 GB of VRAM is recommended for efficient inference; CPU inference is possible but slower (a quantized-loading sketch follows the example code below).
- **Software Dependencies**: Compatible with PyTorch and the Transformers build from the ZagrosLLMModel repository above.
- **Example Code**:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "darsadilab/zagros-1.0-quick"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare the model input (Persian prompt: "Design a professional website using
# HTML as a single file, with the CSS/JS included inside that same HTML.")
prompt = "یک وبسایت حرفه ای با استفاده از html طراحی کن که تک کد باشد و شامل css/js داخل همین html باشد."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=16384
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

content = tokenizer.decode(output_ids, skip_special_tokens=True)
print("content:", content)
```
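For GPUs with less than the recommended 64 GB of VRAM, quantized loading may be worth trying. The snippet below is a sketch that assumes the checkpoint loads correctly with bitsandbytes 4-bit quantization through the Transformers `BitsAndBytesConfig` API; this is not an officially validated configuration, and output quality should be compared against the BF16 weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "darsadilab/zagros-1.0-quick"

# Assumption: bitsandbytes 4-bit quantization is compatible with this MoE checkpoint;
# verify quality and speed before relying on it.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```

The quantized model can then be used with the same chat-template and `generate` calls shown in the example above.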
### Deployment

- Available for download via the Hugging Face Hub.
- Not currently deployed by any inference provider. To request provider support, contact Hugging Face or your preferred provider.

## Contact Information

- **Organization**: Darsadilab
- **Contact**: Use the Community tab on the Hugging Face model page
- **Hugging Face Profile**: [https://huggingface.co/darsadilab](https://huggingface.co/darsadilab)

## Acknowledgments

- Built with contributions from the open-source community and tools from Hugging Face.
- Special thanks to the Persian NLP community for providing valuable datasets and feedback.

## Citation

If you use Zagros-1.0-Quick in your research or application, please cite:

```bibtex
@misc{darsadilab2025zagros,
  title={Zagros-1.0-Quick: A Multilingual MoE Model with Persian Specialization},
  author={Mohammadmoein Pisoude and Aydin Babazadeh and Aylin Bahari},
  year={2025},
  url={https://huggingface.co/darsadilab/zagros-1.0-quick}
}
```