---
license: apache-2.0
language:
- id
- en
tags:
- text-generation
- pytorch
- causal-lm
- transformer
- untrained
- gqa
- rope
- swiglu
- rmsnorm
- flash-attention
- indonesian
library_name: transformers
pipeline_tag: text-generation
widget:
- text: "Jakarta adalah ibu kota"
example_title: "🇮🇩 Text Completion (ID)"
- text: |
Pertanyaan: Apa itu kecerdasan buatan?
Jawaban:
example_title: "🇮🇩 Question Answering (ID)"
- text: |
Tulis cerita pendek tentang robot yang belajar mencintai.
example_title: "🇮🇩 Creative Writing (ID)"
- text: "The capital of Indonesia is"
example_title: "🇬🇧 Text Completion (EN)"
- text: |
Question: What is artificial intelligence?
Answer:
example_title: "🇬🇧 Question Answering (EN)"
- text: |
def fibonacci(n):
"""Hitung bilangan fibonacci ke-n"""
example_title: "💻 Code Completion"
- text: |
def reverse_string(s):
example_title: "💻 Code Generation"
- text: |
User: Halo! Siapa kamu?
Assistant:
example_title: "💬 Chat Format (ID)"
- text: |
User: Jelaskan tentang machine learning dalam 2 kalimat.
Assistant:
example_title: "💬 Conversational (ID)"
inference:
parameters:
max_new_tokens: 100
temperature: 0.7
top_p: 0.9
top_k: 50
do_sample: true
repetition_penalty: 1.1
num_beams: 1
datasets: []
metrics:
- perplexity
model-index:
- name: caca-100M
results: []
---

# 🚀 CACA-100M
### A Modern Transformer Model with an Advanced Architecture
[Apache 2.0 License](https://opensource.org/licenses/Apache-2.0) • [Python](https://www.python.org/downloads/) • [PyTorch](https://pytorch.org/) • [🤗 Transformers](https://github.com/huggingface/transformers)
**222,201,088** parameters • **222.2M** • **20 layers**
[📖 Documentation](#documentation) • [🚀 Quick Start](#quick-start) • [💡 Features](#key-features) • [🔧 Training](#training-guide) • [📊 Specifications](#technical-specifications)
---
## ⚠️ IMPORTANT: This Model Is Untrained
> **WARNING**: This model has **not been trained**. Its weights are still randomly initialized, so any generated output will be **meaningless and random**.
**Model status:**
- 🔴 **Untrained** - weights are still random
- 🟡 **Research only** - for architecture & training experiments
- 🟢 **Ready to train** - the architecture is in place and tested
The widgets above only illustrate the **expected input formats**. Once the model has been trained on a suitable dataset, the same formats should produce meaningful output.
---
## 📋 Description
**Caca** is a modern Large Language Model (LLM) architecture that combines several state-of-the-art deep learning techniques. It is designed with a focus on **efficiency**, **scalability**, and **strong performance**.
### 🎯 Highlights
- **🇮🇩 Bilingual Support**: Optimized for Indonesian & English
- **⚡ Fast**: Flash Attention 2 for up to ~3x faster attention
- **💾 Memory Efficient**: Grouped Query Attention shrinks the KV cache to roughly one third of the multi-head baseline
- **🎯 Long Context**: Supports up to 4,096 tokens
- **🔧 Modular**: Flexible architecture with many configuration options
---
## ✨ Key Features
### 🎯 Core Features
- ✅ **Grouped Query Attention (GQA)** - better memory and compute efficiency (see the sketch after this list)
  - Query heads: 12
  - KV heads: 4
  - Ratio: 3:1 (the KV cache is roughly one third the size of full multi-head attention)
- ✅ **Rotary Position Embeddings (RoPE)** - better generalization to long contexts
  - Theta: 10000
  - Supports extrapolation to contexts longer than the training length
- ✅ **RMSNorm** - simpler and typically faster than LayerNorm, with stable training
  - Epsilon: 1e-06
- ✅ **SwiGLU Activation** - often yields small but consistent quality gains over ReLU/GELU
  - Intermediate size: 3,072
- ✅ **Flash Attention 2** - up to ~3x faster attention with lower memory use
  - Automatically enabled when CUDA and flash-attn are available
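As referenced in the GQA bullet above, the memory saving can be estimated directly from the head counts. A rough sketch with illustrative arithmetic only; the helper `kv_cache_bytes` is not part of the model code, and the sizes are taken from the spec table below:

```python
# Rough KV-cache size comparison for this config (illustrative arithmetic only)
num_layers, head_dim = 20, 64
num_q_heads, num_kv_heads = 12, 4
seq_len, batch, bytes_per_elem = 4096, 1, 2   # 4,096-token context, BF16

def kv_cache_bytes(n_kv_heads):
    # 2 tensors (K and V) per layer, each of shape [batch, n_kv_heads, seq_len, head_dim]
    return 2 * num_layers * batch * n_kv_heads * seq_len * head_dim * bytes_per_elem

mha = kv_cache_bytes(num_q_heads)   # full multi-head attention baseline
gqa = kv_cache_bytes(num_kv_heads)  # this model (GQA, 12Q : 4KV)
print(f"MHA KV cache: {mha / 1e6:.1f} MB, GQA KV cache: {gqa / 1e6:.1f} MB "
      f"({1 - gqa / mha:.0%} smaller)")
```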
### 🔥 Advanced Features
### 🎯 Attention Mechanisms
- ⚡ **Flash Attention v2** - 3x faster with IO-aware algorithm
- 🔑 **Grouped Query Attention (GQA)** - 12Q : 4KV heads
- 🚀 **xFormers Support** - Memory efficient attention fallback
- 🎯 **PyTorch SDPA** - Native scaled dot product attention
### 📍 Position Encodings
- 🔄 **RoPE** - Rotary embeddings (θ=10000)
### 🎓 Training Optimizations
- 💾 **Gradient Checkpointing** - Memory efficient training
- 🎯 **Mixed Precision** - BF16 & FP16 support
### 📦 Quantization Support
- 4️⃣ **4-bit Quantization** - NF4, FP4 via bitsandbytes
- 8️⃣ **8-bit Quantization** - LLM.int8() support
- 🔄 **Double Quantization** - Further compression
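For example, 4-bit NF4 loading with double quantization could look like the sketch below. This is only an illustration: it assumes `bitsandbytes` is installed and a CUDA GPU is available, and an untrained checkpoint will still produce random output.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization (requires bitsandbytes + CUDA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 data type
    bnb_4bit_use_double_quant=True,         # double quantization for extra compression
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_4bit = AutoModelForCausalLM.from_pretrained(
    "Lyon28/caca-100M-untrained",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
print(f"Memory footprint: {model_4bit.get_memory_footprint() / 1e6:.1f} MB")
```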
### 🛠️ Optimization Features
- 💾 **KV Cache** - substantially faster autoregressive generation
- 🔧 **Gradient Checkpointing** - train larger models with limited memory
- 📦 **Quantization Ready** - 4-bit & 8-bit quantization support
- 🎯 **Mixed Precision Training** - BF16 & FP16 support
---
## 📊 Technical Specifications
| Specification | Detail |
|---------------|--------|
| **💎 Total Parameters** | **222,201,088** (222.2M) |
| **📐 Hidden Size** | 768 |
| **🔢 Intermediate Size** | 3,072 |
| **🏗️ Num Layers** | 20 |
| **🎯 Attention Heads** | 12 |
| **🔑 KV Heads** | 4 (GQA) |
| **📏 Head Dimension** | 64 |
| **📚 Vocab Size** | 32,000 tokens |
| **📖 Max Context** | 4,096 tokens |
| **🏛️ Architecture** | Decoder-only Transformer |
| **🎨 Model Type** | Causal Language Model |
### 📐 Architecture Details
Full module breakdown:
```
CacaForCausalLM (222.2M)
│
├─ Embedding Layer
│ └─ Token Embeddings: 32,000 × 768
│ └─ Parameters: 24,576,000
│
├─ Transformer Layers (20x)
│ │
│ ├─ Layer {i} (repeated 20 times)
│ │ │
│ │ ├─ Input LayerNorm (RMSNorm)
│ │ │ └─ Params: 768
│ │ │
│ │ ├─ Self-Attention (Grouped Query Attention)
│ │ │ ├─ Q Projection: 768 → 768
│ │ │ ├─ K Projection: 768 → 256
│ │ │ ├─ V Projection: 768 → 256
│ │ │ ├─ O Projection: 768 → 768
│ │ │ ├─ RoPE Embeddings: θ=10000
│ │ │ └─ Flash Attention 2 (if available)
│ │ │
│ │ ├─ Post-Attention LayerNorm (RMSNorm)
│ │ │ └─ Params: 768
│ │ │
│ │ ├─ MLP (SwiGLU)
│ │ │ ├─ Gate: 768 → 3,072
│ │ │ ├─ Up: 768 → 3,072
│ │ │ ├─ Activation: SiLU (Swish)
│ │ │ └─ Down: 3,072 → 768
│ │ │
│ │ └─ Residual Connections (2x per layer)
│ │
│ └─ Total Layer Params: ~8.65M per layer
│
├─ Final LayerNorm (RMSNorm)
│ └─ Params: 768
│
└─ LM Head (Output Projection)
└─ Linear: 768 → 32,000
└─ Parameters: 24,576,000
```
**Parameter breakdown:**
- Embeddings: `32,000 × 768 = 24,576,000`
- Transformer layers: `20 layers × ~8.65M ≈ 173M`
- LM head: `768 × 32,000 = 24,576,000`
- **Total: 222,201,088 parameters**
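The totals above can be reproduced from the per-module shapes in the tree. A small sketch; the reported total includes a few thousand additional parameters beyond the modules listed here:

```python
vocab, hidden, inter, layers = 32_000, 768, 3_072, 20
kv_dim = 4 * 64                                           # 4 KV heads × head_dim 64 = 256

embed     = vocab * hidden                                # 24,576,000
attn      = 2 * hidden * hidden + 2 * hidden * kv_dim     # Q/O plus K/V projections
mlp       = 3 * hidden * inter                            # gate, up, down
norms     = 2 * hidden                                    # two RMSNorms per layer
per_layer = attn + mlp + norms                            # ≈ 8.65M
lm_head   = hidden * vocab                                # 24,576,000

total = embed + layers * per_layer + hidden + lm_head     # + final RMSNorm
print(f"{total:,}")  # ≈ 222.2M; the reported 222,201,088 includes a few small extras
```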
---
## 🚀 Quick Start
### 📦 Installation
```bash
# Core dependencies (quote the version specifiers so the shell does not treat ">" as a redirect)
pip install "torch>=2.0.0" "transformers>=4.35.0" accelerate safetensors
# Optional: for maximum performance
pip install flash-attn --no-build-isolation  # Flash Attention 2
pip install xformers                         # memory-efficient attention
pip install bitsandbytes                     # quantization support
```
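To check which of these optional backends are actually importable in your environment, a quick sketch:

```python
import importlib.util

# Report which optional acceleration backends are installed
for pkg in ("flash_attn", "xformers", "bitsandbytes"):
    status = "available" if importlib.util.find_spec(pkg) else "not installed"
    print(f"{pkg}: {status}")
```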
### 💻 Basic Usage
#### 1️⃣ Load Model
```python
from transformers import AutoModelForCausalLM, AutoConfig
import torch
# Load configuration
config = AutoConfig.from_pretrained(
"Lyon28/caca-100M-untrained",
trust_remote_code=True
)
print(f"Model: {config.model_type}")
print(f"Parameters: 222,201,088")
print(f"Hidden size: {config.hidden_size}")
print(f"Layers: {config.num_hidden_layers}")
# Load model
model = AutoModelForCausalLM.from_pretrained(
"Lyon28/caca-100M-untrained",
config=config,
torch_dtype=torch.bfloat16, # use BF16 for efficiency
device_map="auto", # automatically place weights on available devices
trust_remote_code=True
)
print(f"Model loaded! Device: {model.device}")
```
#### 2️⃣ Verify the Model
```python
# Count parameters
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total parameters: {total_params:,}")
print(f"Trainable parameters: {trainable_params:,}")
print(f"Model size: {total_params * 2 / 1e9:.2f} GB (BF16)")
# Test forward pass
batch_size, seq_len = 2, 10
input_ids = torch.randint(0, config.vocab_size, (batch_size, seq_len))
input_ids = input_ids.to(model.device)
with torch.no_grad():
outputs = model(input_ids)
print(f"Output shape: {outputs.logits.shape}")
print("✅ Model berfungsi dengan baik!")
```
#### 3️⃣ Generate Text (After Training)
```python
from transformers import AutoTokenizer
# Load a tokenizer (use one that matches your training data)
tokenizer = AutoTokenizer.from_pretrained("your-tokenizer-here")
# Prepare input
text = "Jelaskan tentang kecerdasan buatan"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Generate
outputs = model.generate(
**inputs,
max_new_tokens=100,
temperature=0.7,
top_p=0.9,
top_k=50,
do_sample=True,
repetition_penalty=1.1,
pad_token_id=tokenizer.eos_token_id
)
# Decode
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```
---
## 🔧 Training Guide
### 📚 Dataset Preparation
```python
from datasets import load_dataset
# Load a dataset (example)
dataset = load_dataset("indonesian-nlp/id-wikipedia")
# Or load from a local file
from datasets import Dataset
import pandas as pd
df = pd.read_csv("your_data.csv")
dataset = Dataset.from_pandas(df)
print(f"Dataset size: {len(dataset)}")
```
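The `Trainer` below expects tokenized inputs and `train`/`validation` splits. A minimal sketch, assuming `dataset` is a single `Dataset` with a `text` column (as in the local-file example above) and reusing the `tokenizer` loaded earlier; `tokenize_fn`, the split ratio, and `max_length` are illustrative choices, not fixed requirements:

```python
from datasets import DatasetDict

# Make sure the tokenizer can pad (needed by the data collator below)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

def tokenize_fn(examples):
    # Truncate to (at most) the model's 4,096-token context window
    return tokenizer(examples["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize_fn, batched=True, remove_columns=dataset.column_names)

# Build the train/validation splits used by the Trainer below
split = tokenized.train_test_split(test_size=0.01, seed=42)
dataset = DatasetDict({"train": split["train"], "validation": split["test"]})
print(dataset)
```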
### 🎯 Training Configuration
```python
from transformers import Trainer, TrainingArguments
from transformers import DataCollatorForLanguageModeling
# Training arguments
training_args = TrainingArguments(
# Output
output_dir="./caca-caca-100M-trained",
run_name="caca-caca-100M-v1",
# Training
num_train_epochs=3,
per_device_train_batch_size=4,
gradient_accumulation_steps=8, # Effective batch size = 32
learning_rate=2e-4,
weight_decay=0.1,
warmup_steps=2000,
# Optimization
bf16=True, # Mixed precision training
gradient_checkpointing=True, # saves memory
optim="adamw_torch_fused", # fused AdamW optimizer
max_grad_norm=1.0,
# Logging & Evaluation
logging_steps=10,
logging_first_step=True,
eval_strategy="steps",
eval_steps=500,
save_steps=1000,
save_total_limit=3,
# Hub integration
push_to_hub=True,
hub_model_id="your-username/caca-caca-100M-trained",
hub_strategy="every_save",
# Distributed training
ddp_find_unused_parameters=False,
dataloader_num_workers=4,
)
# Data collator
data_collator = DataCollatorForLanguageModeling(
tokenizer=tokenizer,
mlm=False # causal LM, not masked LM
)
# Trainer
trainer = Trainer(
model=model,
args=training_args,
train_dataset=dataset["train"],
eval_dataset=dataset["validation"],
data_collator=data_collator,
)
# Train!
print("🚀 Starting training...")
trainer.train()
# Save final model
print("💾 Saving model...")
trainer.save_model("./caca-100M-final")
trainer.push_to_hub()
print("✅ Training complete!")
```
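Perplexity, the metric listed in this card's metadata, can be read off the evaluation loss. A small sketch, reusing `trainer` from the block above:

```python
import math

# Perplexity = exp(cross-entropy loss) on the validation split
eval_metrics = trainer.evaluate()
perplexity = math.exp(eval_metrics["eval_loss"])
print(f"Eval loss: {eval_metrics['eval_loss']:.4f} | Perplexity: {perplexity:.2f}")
```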
### 📊 Resource Estimates
Estimated training cost and time:
**Hardware Requirements:**
| GPU | Memory | Batch Size | Speed | Est. Time (~2.5B tokens) |
|-----|--------|------------|-------|--------------------------|
| RTX 3090 (24GB) | 24GB | 1-2 | ~1K tok/s | ~30 days |
| A100 (40GB) | 40GB | 4-8 | ~5K tok/s | ~6 days |
| A100 (80GB) | 80GB | 8-16 | ~8K tok/s | ~4 days |
| 8×A100 (80GB) | 640GB | 64+ | ~50K tok/s | ~14 hours |
**Cloud Costs (approximate):**
- AWS p4d.24xlarge (8×A100): ~$32/hour × 24 hours = **~$768/day**
- GCP a2-ultragpu-8g: ~$30/hour × 24 hours = **~$720/day**
- Lambda Labs (8×A100): ~$15/hour × 24 hours = **~$360/day**
**Cost-saving tips** (a rough throughput sketch follows this list):
- Use spot instances (60-70% cheaper)
- Use gradient accumulation to reach larger effective batch sizes
- Use mixed precision (BF16) for up to ~2x speedup
- Use gradient checkpointing to save memory
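The time estimates in the table follow from simple throughput arithmetic. A back-of-the-envelope sketch with illustrative values taken from the table and the `TrainingArguments` shown earlier:

```python
tokens_budget = 2.5e9                          # assumed token budget from the table above
seq_len = 4096
per_device_bs, grad_accum, n_gpus = 4, 8, 1    # as in the TrainingArguments above

effective_batch = per_device_bs * grad_accum * n_gpus   # 32 sequences per step
tokens_per_step = effective_batch * seq_len             # ~131K tokens per optimizer step
print(f"Tokens per optimizer step: {tokens_per_step:,}")

for tok_per_s in (1_000, 5_000, 8_000, 50_000):         # throughputs from the table
    days = tokens_budget / tok_per_s / 86_400
    print(f"{tok_per_s:>6,} tok/s -> ~{days:.1f} days")
```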
---
## 💬 Chat Format
The model is intended to work with a standard chat format (the tokenizer must provide a chat template):
```python
# Single-turn
messages = [
{"role": "user", "content": "Halo! Siapa kamu?"},
]
# Multi-turn conversation
messages = [
{"role": "system", "content": "Kamu adalah asisten AI yang membantu."},
{"role": "user", "content": "Jelaskan tentang fotosintesis"},
{"role": "assistant", "content": "Fotosintesis adalah proses..."},
{"role": "user", "content": "Apa manfaatnya bagi manusia?"},
]
# Apply chat template
formatted = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
print(formatted)
# Example output (the exact format depends on the tokenizer's chat template):
# System: Kamu adalah asisten AI yang membantu.
#
# User: Jelaskan tentang fotosintesis
# Assistant: Fotosintesis adalah proses...
# User: Apa manfaatnya bagi manusia?
# Assistant:
```
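Note that `apply_chat_template` only works once the tokenizer defines a chat template, and this repository does not ship one. A minimal, purely illustrative Jinja template that reproduces the `System:/User:/Assistant:` format shown above could be attached like this:

```python
# Purely illustrative chat template (NOT an official template for this model)
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'system' %}System: {{ message['content'] }}\n\n"
    "{% elif message['role'] == 'user' %}User: {{ message['content'] }}\n"
    "{% else %}Assistant: {{ message['content'] }}\n"
    "{% endif %}"
    "{% endfor %}"
    "{% if add_generation_prompt %}Assistant:{% endif %}"
)

print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```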
---
## 🎯 Use Cases
### ✅ Suitable For:
- 🔬 **Research**: Experiments with modern LLM architectures
- 📚 **Education**: Learning about transformers & training
- 🎓 **Academia**: Papers, theses, course projects
- 🚀 **Base Model**: Fine-tuning for specific tasks
- 💡 **Proof of Concept**: Testing ideas before scaling up
### ❌ Not Suitable For:
- 🚫 **Production**: The model has not been trained
- 🚫 **Real-world apps**: Output is still random
- 🚫 **Safety-critical use**: No safety alignment has been done
- 🚫 **Direct deployment**: Training is required first
---
## 📖 Documentation
### 🔗 Important Links
- 📚 **Hugging Face Docs**: [huggingface.co/docs/transformers](https://huggingface.co/docs/transformers)
- 💻 **GitHub**: [Lyon-28/caca-transformers](https://github.com/Lyon-28/caca-transformers)
- 💬 **Discussions**: [Model discussions](https://huggingface.co/Lyon28/caca-100M-untrained/discussions)
- 🐛 **Issues**: [Report bugs](https://huggingface.co/Lyon28/caca-100M-untrained/discussions)
### 📝 Related Models
| Model Size | Parameters | Link |
|------------|------------|------|
| 🐣 Tiny | 1M - 50M | [caca-1M](../caca-1M-untrained) to [caca-50M](../caca-50M-untrained) |
| 🐥 Small | 75M - 500M | [caca-75M](../caca-75M-untrained) to [caca-500M](../caca-500M-untrained) |
| 🦅 Medium | 600M - 1B | [caca-600M](../caca-600M-untrained) to [caca-1B](../caca-1B-untrained) |
| 🦁 Large | 1.5B - 5B | [caca-1.5B](../caca-1.5B-untrained) to [caca-5B](../caca-5B-untrained) |
| 🐉 XL | 6B - 10B | [caca-6B](../caca-6B-untrained) to [caca-10B](../caca-10B-untrained) |
| 🦖 XXL | 12B+ | [caca-12B](../caca-12B-untrained) to [caca-70B](../caca-70B-untrained) |
---
## 🤝 Contributing
Contributions are very welcome! Some ways to contribute:
- 🐛 **Report bugs**: Found a bug? [Open a discussion](https://huggingface.co/Lyon28/caca-100M-untrained/discussions)
- 💡 **Suggest features**: Have an idea? Share it in the discussions
- 📝 **Improve docs**: PRs for documentation are welcome
- 🎓 **Share results**: Trained the model? Share your results on the model card
- ⭐ **Star & Share**: Help the project grow
---
## 📜 License & Citation
### 📄 License
This model is released under the **Apache License 2.0**:
- ✅ Free for commercial use
- ✅ Free for research use
- ✅ Modification & redistribution allowed
- ✅ Provided as-is, without warranty
### 📚 Citation
If you use this model in research or a project, please cite:
```bibtex
@misc{caca100m2025,
author = {Lyon},
title = {Caca-100M: Modern Transformer Architecture with GQA and Advanced Features},
year = {2025},
publisher = {Hugging Face},
journal = {Hugging Face Model Hub},
howpublished = {\url{https://huggingface.co/Lyon28/caca-100M-untrained}},
}
```
### 🙏 Acknowledgments
This model is inspired by and implements ideas from a range of recent research:
#### 🏗️ **Core Architecture**
- **LLaMA** (Meta AI, 2023) - Base decoder-only architecture, RMSNorm, SwiGLU
- Paper: [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
- **GPT-3** (OpenAI, 2020) - Transformer language modeling paradigm
- **PaLM** (Google, 2022) - SwiGLU activation function
#### 🎯 **Attention Mechanisms**
- **Flash Attention v2** (Tri Dao et al., 2023) - Efficient attention with IO-awareness
- Paper: [FlashAttention-2: Faster Attention with Better Parallelism](https://arxiv.org/abs/2307.08691)
- **Grouped Query Attention (GQA)** (Ainslie et al., Google, 2023) - Memory-efficient attention
- Paper: [GQA: Training Generalized Multi-Query Transformer](https://arxiv.org/abs/2305.13245)
- **Multi-Query Attention (MQA)** (Shazeer, Google, 2019) - Fast decoding
- **xFormers** (Meta AI, 2022) - Memory efficient attention implementations
- **PyTorch SDPA** (PyTorch Team, 2023) - Built-in scaled dot product attention
#### 📍 **Position Encodings**
- **RoPE** (Su et al., 2021) - Rotary Position Embeddings
- Paper: [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864)
- **ALiBi** (Press et al., 2022) - Attention with Linear Biases for extrapolation
- Paper: [Train Short, Test Long: Attention with Linear Biases](https://arxiv.org/abs/2108.12409)
- **YaRN** (Peng et al., 2023) - Yet another RoPE extensioN for long context
- Paper: [YaRN: Efficient Context Window Extension](https://arxiv.org/abs/2309.00071)
#### 🪟 **Long Context & Efficiency**
- **Sliding Window Attention** (Mistral AI, 2023) - Local attention patterns
- Paper: [Mistral 7B](https://arxiv.org/abs/2310.06825)
- **StreamingLLM / Attention Sink** (Xiao et al., MIT, 2023) - Infinite sequence lengths
- Paper: [Efficient Streaming Language Models with Attention Sinks](https://arxiv.org/abs/2309.17453)
- **Logit Softcapping** (Google Gemma, 2024) - Caps attention/output logits to improve numerical stability
- Paper: [Gemma: Open Models Based on Gemini](https://arxiv.org/abs/2403.08295)
#### 🧠 **Mixture of Experts (MoE)**
- **Mixtral 8x7B** (Mistral AI, 2024) - Sparse MoE architecture
- Paper: [Mixtral of Experts](https://arxiv.org/abs/2401.04088)
- **Switch Transformers** (Fedus et al., Google, 2021) - Scaling with expert choice
- Paper: [Switch Transformers: Scaling to Trillion Parameter Models](https://arxiv.org/abs/2101.03961)
- **GLaM** (Du et al., Google, 2021) - Generalist Language Model with MoE
- **Expert Choice Routing** (Zhou et al., Google, 2022) - Improved load balancing
#### 🎓 **Training Optimizations**
- **Layer Scale** (Touvron et al., Meta, 2021) - Training stability for deep networks
- Paper: [Going Deeper with Image Transformers (CaiT)](https://arxiv.org/abs/2103.17239)
- **Stochastic Depth** (Huang et al., 2016) - Regularization via random layer dropping
- Paper: [Deep Networks with Stochastic Depth](https://arxiv.org/abs/1603.09382)
- **Mixture of Depths (MoD)** (Raposo et al., Google DeepMind, 2024) - Dynamic compute allocation
- Paper: [Mixture-of-Depths: Dynamically allocating compute in transformer-based models](https://arxiv.org/abs/2404.02258)
- **Gradient Checkpointing** (Chen et al., 2016) - Memory-efficient training
#### 📦 **Quantization**
- **LLM.int8()** (Dettmers et al., 2022) - 8-bit matrix multiplication
- Paper: [LLM.int8(): 8-bit Matrix Multiplication for Transformers](https://arxiv.org/abs/2208.07339)
- **QLoRA** (Dettmers et al., 2023) - 4-bit quantized LoRA fine-tuning
- Paper: [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)
- **GPTQ** (Frantar et al., 2022) - Post-training quantization
- **bitsandbytes** (Dettmers) - Efficient quantization library
#### 🎨 **Multimodal Components**
- **Vision Transformer (ViT)** (Dosovitskiy et al., Google, 2020) - Image encoding
- Paper: [An Image is Worth 16x16 Words](https://arxiv.org/abs/2010.11929)
- **Perceiver Resampler** (Alayrac et al., DeepMind, 2022) - Multimodal fusion
- Paper: [Flamingo: a Visual Language Model](https://arxiv.org/abs/2204.14198)
- **Q-Former** (Li et al., Salesforce, 2023) - Query-based multimodal alignment
- Paper: [BLIP-2: Bootstrapping Language-Image Pre-training](https://arxiv.org/abs/2301.12597)
- **Whisper** (Radford et al., OpenAI, 2022) - Audio encoding inspiration
#### 🛠️ **Normalization & Activations**
- **RMSNorm** (Zhang & Sennrich, 2019) - Root Mean Square Layer Normalization
- Paper: [Root Mean Square Layer Normalization](https://arxiv.org/abs/1910.07467)
- **SwiGLU** (Shazeer, Google, 2020) - GLU activation variant
- Paper: [GLU Variants Improve Transformer](https://arxiv.org/abs/2002.05202)
#### 🔧 **Implementation & Tools**
- **Hugging Face Transformers** - Model implementation framework
- **PyTorch** - Deep learning framework
- **Safetensors** - Secure tensor serialization format
- **Accelerate** - Distributed training utilities
---
**Special Thanks to:**
- 🇮🇩 Indonesian NLP Community
- 🤗 Hugging Face Team
- 🔬 Open source AI research community
## ⚠️ Limitations & Bias
### Limitations
- 🔴 **Untrained**: The model has not been trained; output is random
- 🟡 **No tokenizer**: You need to prepare your own tokenizer
- 🟡 **No safety**: No content filtering or alignment has been done
- 🟠 **Memory intensive**: Training requires large GPUs
### Potential Biases
The model will inherit biases from whatever training data is used. Keep in mind:
- **Language**: Bias toward the majority language in the dataset
- **Culture**: Bias toward particular cultural perspectives
- **Gender & demographics**: Potential stereotypes
- **Factuality**: May generate inaccurate information
**Recommendation**: Evaluate and filter before any deployment.
---
## 📞 Support & Contact
### 💬 Community
- **Discussions**: [HF Discussions](https://huggingface.co/Lyon28/caca-100M-untrained/discussions)
### 📧 Contact
For questions or collaboration:
- Email: cacatransformers@gmail.com
- HF Profile: [@Lyon28](https://huggingface.co/Lyon28)
---
## 🌟 Star History
[Star History Chart](https://star-history.com/#Lyon-28/caca-transformers&Date)
---
### 💝 Made with ❤️ for the Indonesian AI community
**Thank you for using Caca!**
If this project is useful to you, consider:
- ⭐ Starring the repository
- 🔗 Sharing it with friends
- 💬 Joining the discussions
- 🤝 Contributing to the project
---