---
library_name: transformers
license: mit
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
tags:
- generated_from_trainer
- conversational
- instruction-tuned
- smoltalk
datasets:
- HuggingFaceTB/smoltalk
metrics:
- MMLU
language:
- en
model-index:
- name: DeepSeek-R1-Distill-Qwen-1.5B-finetuned-smoltalk-everyday-conversations
  results:
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: HuggingFaceTB/smoltalk
      type: HuggingFaceTB/smoltalk
    metrics:
    - name: MMLU-PEM (0-shot)
      type: MMLU-PEM (0-shot)
      value: 0.2749
---

# Model Card for DeepSeek-R1-SmolTalk

This model is a fine-tuned version of [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) on the [SmolTalk dataset](https://huggingface.co/datasets/HuggingFaceTB/smoltalk). It is optimized for small-scale, friendly, and engaging instruction-following dialogue.

## Model Details

### Model Description

This model builds on DeepSeek's distilled Qwen-1.5B architecture and is trained for conversational tasks using the SmolTalk dataset. The goal is a lightweight, instruction-following model suitable for chatbots or assistants running on limited hardware.

- **Model type:** Instruction-tuned causal decoder (chat)
- **Language(s):** English
- **License:** MIT
- **Finetuned from model:** deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

## Uses

### Direct Use

This model can be used as a lightweight assistant or chatbot in applications such as:

- Embedded conversational interfaces
- Educational or toy assistants
- Small devices or local applications

### Downstream Use

The model can be further fine-tuned or integrated into larger conversational systems, especially where resource efficiency is crucial.

### Out-of-Scope Use

- Not suitable for tasks requiring deep factual accuracy or reasoning
- Should not be used for sensitive or high-stakes decision making
- Not designed for multilingual use

## Bias, Risks, and Limitations

Due to the small model size and dataset limitations, the model:

- May produce generic or incorrect outputs
- Can reflect biases present in the training dataset
- Is not guaranteed to be safe for all user demographics or use cases

### Recommendations

- Use in controlled or sandboxed environments
- Consider integrating content moderation or rule-based filtering
- Do not deploy in contexts requiring factual correctness or ethical judgment

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("avanishd/DeepSeek-R1-Distill-Qwen-1.5B-finetuned-smoltalk-everyday-conversations")
tokenizer = AutoTokenizer.from_pretrained("avanishd/DeepSeek-R1-Distill-Qwen-1.5B-finetuned-smoltalk-everyday-conversations")

# Tokenize a prompt and generate a short reply
input_text = "Hi there! What can you do?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

Trained on the [SmolTalk dataset](https://huggingface.co/datasets/HuggingFaceTB/smoltalk), a collection of lightweight, instruction-style conversations designed to help models learn concise, friendly, and helpful interactions. A sketch of how such data can be prepared is shown below.
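The exact data-preparation script is not published with this card. The following is a minimal, hypothetical sketch of how the conversations could be rendered into training text with the base model's chat template; the `everyday-conversations` config name is inferred from the model name, and the use of the `datasets` library is an assumption.

```python
# Hypothetical sketch (not the exact training code): load a SmolTalk subset and
# render each conversation with the tokenizer's chat template.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")

# "everyday-conversations" is assumed from the model name; SmolTalk also ships other configs.
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

def to_text(example):
    # Each example stores a list of {"role": ..., "content": ...} turns under "messages".
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}

dataset = dataset.map(to_text)
print(dataset[0]["text"][:300])
```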
### Training Procedure

#### Preprocessing [optional]

Used the DeepSeek tokenizer.

#### LoRA Configuration

- rank: 6
- alpha: 12
- dropout: 0.05
- bias: none
- target: linear

A hedged code sketch of this configuration is provided in the appendix at the end of this card.

#### Training Hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-04
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- gradient_clipping: 0.3
- total_train_batch_size: 128
- optimizer: adamw_torch_fused
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1
- mixed_precision_training: bf16

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]
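## Appendix: LoRA Fine-Tuning Sketch

For readers who want to reproduce the LoRA setup described under Training Procedure, the following is a hedged sketch of how those hyperparameters might map onto `peft` and `trl`. It is an illustration under assumptions, not the author's original training script: argument names follow recent `peft`/`trl` releases, `target_modules="all-linear"` is an interpretation of the "linear" target, and the `everyday-conversations` config and output directory name are hypothetical.

```python
# Hypothetical reconstruction of the described LoRA fine-tuning setup (not the original script).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# LoRA settings listed in this card: rank 6, alpha 12, dropout 0.05, no bias, linear targets.
peft_config = LoraConfig(
    r=6,
    lora_alpha=12,
    lora_dropout=0.05,
    bias="none",
    target_modules="all-linear",  # assumption: "target: linear" taken to mean all linear layers
    task_type="CAUSAL_LM",
)

# Hyperparameters listed in this card (batch sizes are per device; gradient clipping
# maps onto max_grad_norm).
training_args = SFTConfig(
    output_dir="deepseek-r1-smoltalk-lora",  # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,
    max_grad_norm=0.3,
    optim="adamw_torch_fused",
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=1,
    bf16=True,
    seed=42,
)

# Config name assumed from the model name; recent TRL versions apply the chat template
# to "messages"-style conversational datasets automatically.
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    processing_class=tokenizer,  # older TRL versions use tokenizer= instead
)
trainer.train()
```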