---
license: apache-2.0
language:
- id
- en
tags:
- text-generation
- pytorch
- causal-lm
- transformer
- untrained
- gqa
- rope
- swiglu
- rmsnorm
- flash-attention
- indonesian
library_name: transformers
pipeline_tag: text-generation
widget:
- text: "Jakarta adalah ibu kota"
  example_title: "🇮🇩 Text Completion (ID)"
- text: |
    Pertanyaan: Apa itu kecerdasan buatan?
    Jawaban:
  example_title: "🇮🇩 Question Answering (ID)"
- text: |
    Tulis cerita pendek tentang robot yang belajar mencintai.
  example_title: "🇮🇩 Creative Writing (ID)"
- text: "The capital of Indonesia is"
  example_title: "🇬🇧 Text Completion (EN)"
- text: |
    Question: What is artificial intelligence?
    Answer:
  example_title: "🇬🇧 Question Answering (EN)"
- text: |
    def fibonacci(n):
        """Hitung bilangan fibonacci ke-n"""
  example_title: "💻 Code Completion"
- text: |
    def reverse_string(s):
  example_title: "💻 Code Generation"
- text: |
    User: Halo! Siapa kamu?
    Assistant:
  example_title: "💬 Chat Format (ID)"
- text: |
    User: Jelaskan tentang machine learning dalam 2 kalimat.
    Assistant:
  example_title: "💬 Conversational (ID)"
inference:
  parameters:
    max_new_tokens: 100
    temperature: 0.7
    top_p: 0.9
    top_k: 50
    do_sample: true
    repetition_penalty: 1.1
    num_beams: 1
datasets: []
metrics:
- perplexity
model-index:
- name: caca-100M
  results: []
---
# 🚀 CACA-100M

### A Modern Transformer Model with an Advanced Architecture

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
[![Transformers](https://img.shields.io/badge/🤗%20Transformers-4.35+-yellow.svg)](https://github.com/huggingface/transformers)

**222,201,088** parameters • **222.2M** • **20 layers**

[📖 Documentation](#documentation) • [🚀 Quick Start](#quick-start) • [💡 Features](#main-features) • [🔧 Training](#training-guide) • [📊 Specifications](#technical-specifications)

---
## ⚠️ IMPORTANT: Untrained Model

> **WARNING**: This model has **not been trained**. Its weights are still randomly initialized, so any output it produces will be **meaningless and random**.

**Model status:**
- 🔴 **Untrained** - Weights are still random
- 🟡 **Research only** - For architecture & training experiments
- 🟢 **Ready to train** - The architecture itself has been tested

The widgets above only illustrate the **expected input formats**. Once the model has been trained on a suitable dataset, the same formats should produce meaningful output.

---

## 📋 Description

**Caca** is a modern Large Language Model (LLM) architecture that combines several state-of-the-art deep learning techniques. The model is designed with a focus on **efficiency**, **scalability**, and **performance**.

### 🎯 Key Advantages

- **🇮🇩 Bilingual Support**: Optimized for Indonesian & English
- **⚡ Fast Inference**: Flash Attention 2 for up to 3x faster inference
- **💾 Memory Efficient**: Grouped Query Attention cuts the KV cache by roughly two thirds
- **🎯 Long Context**: Supports up to 4,096 tokens
- **🔧 Modular**: Flexible architecture with many configuration options

---

## ✨ Main Features

### 🎯 Core Features

- ✅ **Grouped Query Attention (GQA)** - Better memory and compute efficiency
  - Query heads: 12
  - KV heads: 4
  - Ratio: 3:1 (≈67% KV-cache savings)
- ✅ **Rotary Position Embeddings (RoPE)** - Better generalization to long contexts (see the sketch at the end of this section)
  - Theta: 10000
  - Supports extrapolation beyond the training length
- ✅ **RMSNorm** - More stable and faster than LayerNorm
  - Epsilon: 1e-06
- ✅ **SwiGLU Activation** - Reported 10-15% better than ReLU/GELU
  - Intermediate size: 3,072
- ✅ **Flash Attention 2** - Up to 3x speedup with better memory efficiency
  - Enabled automatically when CUDA and flash-attn are available

### 🔥 Advanced Features

#### 🎯 Attention Mechanisms
- ⚡ **Flash Attention v2** - Faster attention with an IO-aware algorithm
- 🔑 **Grouped Query Attention (GQA)** - 12 Q : 4 KV heads
- 🚀 **xFormers Support** - Memory-efficient attention fallback
- 🎯 **PyTorch SDPA** - Native scaled dot product attention

#### 📍 Position Encodings
- 🔄 **RoPE** - Rotary embeddings (θ=10000)

#### 🎓 Training Optimizations
- 💾 **Gradient Checkpointing** - Memory-efficient training
- 🎯 **Mixed Precision** - BF16 & FP16 support

#### 📦 Quantization Support
- 4️⃣ **4-bit Quantization** - NF4, FP4 via bitsandbytes
- 8️⃣ **8-bit Quantization** - LLM.int8() support
- 🔄 **Double Quantization** - Further compression

#### 🛠️ Optimization Features
- 💾 **KV Cache** - 5-10x faster autoregressive generation
- 🔧 **Gradient Checkpointing** - Train large models with limited memory
- 📦 **Quantization Ready** - 4-bit & 8-bit quantization support
- 🎯 **Mixed Precision Training** - BF16 & FP16 support
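The rotary position embedding listed above admits a compact reference implementation. The following is an illustrative sketch only (LLaMA-style rotate-half convention, θ = 10000); it is not the actual Caca source, whose implementation may use a different but equivalent convention:

```python
import torch

def rotate_half(x):
    # Split the last dimension in half and rotate: (x1, x2) -> (-x2, x1)
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(q, k, positions, head_dim, theta=10000.0):
    # Rotate query/key pairs by position-dependent angles.
    inv_freq = 1.0 / (theta ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = positions.float()[:, None] * inv_freq[None, :]   # (seq, head_dim/2)
    cos = torch.cat((angles.cos(), angles.cos()), dim=-1)     # (seq, head_dim)
    sin = torch.cat((angles.sin(), angles.sin()), dim=-1)
    return q * cos + rotate_half(q) * sin, k * cos + rotate_half(k) * sin

# Example shapes: q, k are (batch, n_heads, seq_len, head_dim)
q = torch.randn(1, 12, 16, 64)
k = torch.randn(1, 4, 16, 64)   # GQA: fewer KV heads, same head_dim
q, k = apply_rope(q, k, torch.arange(16), head_dim=64)
```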
---

## 📊 Technical Specifications

| Specification | Detail |
|---------------|--------|
| **💎 Total Parameters** | **222,201,088** (222.2M) |
| **📐 Hidden Size** | 768 |
| **🔢 Intermediate Size** | 3,072 |
| **🏗️ Num Layers** | 20 |
| **🎯 Attention Heads** | 12 |
| **🔑 KV Heads** | 4 (GQA) |
| **📏 Head Dimension** | 64 |
| **📚 Vocab Size** | 32,000 tokens |
| **📖 Max Context** | 4,096 tokens |
| **🏛️ Architecture** | Decoder-only Transformer |
| **🎨 Model Type** | Causal Language Model |
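As a sanity check, the figures in the table combine into a rough parameter count. This is a back-of-the-envelope sketch (untied input/output embeddings assumed; a few small tensors are ignored, so the result lands slightly below the reported total):

```python
# Rough parameter count from the specification table above.
hidden, inter, layers, vocab = 768, 3072, 20, 32000
n_q_heads, n_kv_heads, head_dim = 12, 4, 64

embed = vocab * hidden                        # token embeddings
lm_head = vocab * hidden                      # output projection (untied)
attn = hidden * (n_q_heads * head_dim)        # Q projection
attn += 2 * hidden * (n_kv_heads * head_dim)  # K and V (GQA: only 4 heads each)
attn += (n_q_heads * head_dim) * hidden       # O projection
mlp = 3 * hidden * inter                      # gate, up, down (SwiGLU)
norms = 2 * hidden                            # two RMSNorms per layer
per_layer = attn + mlp + norms
total = embed + lm_head + layers * per_layer + hidden  # + final norm

print(f"{per_layer:,} per layer, {total:,} total")  # ≈ 8.65M per layer, ≈ 222.2M total
```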
### 📐 Architecture Details
**🔍 Full model structure:**

```
CacaForCausalLM (222.2M)
│
├─ Embedding Layer
│  └─ Token Embeddings: 32,000 × 768
│     └─ Parameters: 24,576,000
│
├─ Transformer Layers (20x)
│  │
│  ├─ Layer {i} (repeated 20 times)
│  │  │
│  │  ├─ Input LayerNorm (RMSNorm)
│  │  │  └─ Params: 768
│  │  │
│  │  ├─ Self-Attention (Grouped Query Attention)
│  │  │  ├─ Q Projection: 768 → 768
│  │  │  ├─ K Projection: 768 → 256
│  │  │  ├─ V Projection: 768 → 256
│  │  │  ├─ O Projection: 768 → 768
│  │  │  ├─ RoPE Embeddings: θ=10000
│  │  │  └─ Flash Attention 2 (if available)
│  │  │
│  │  ├─ Post-Attention LayerNorm (RMSNorm)
│  │  │  └─ Params: 768
│  │  │
│  │  ├─ MLP (SwiGLU)
│  │  │  ├─ Gate: 768 → 3,072
│  │  │  ├─ Up: 768 → 3,072
│  │  │  ├─ Activation: SiLU (Swish)
│  │  │  └─ Down: 3,072 → 768
│  │  │
│  │  └─ Residual Connections (2x per layer)
│  │
│  └─ Total Layer Params: ~8.7M per layer
│
├─ Final LayerNorm (RMSNorm)
│  └─ Params: 768
│
└─ LM Head (Output Projection)
   └─ Linear: 768 → 32,000
      └─ Parameters: 24,576,000
```

**Parameter breakdown:**
- Token embeddings: `32,000 × 768 = 24,576,000`
- LM head: `32,000 × 768 = 24,576,000`
- Transformer layers: `20 layers × ~8.7M ≈ 173M`
- **Total: 222,201,088 parameters**
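To make the tree above concrete, here is a heavily simplified PyTorch sketch of one decoder block with the same dimensions. It is illustrative only, not the `CacaForCausalLM` source: RoPE application, KV caching, Flash Attention dispatch, and dropout are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x):
        # Normalize by the root-mean-square of the activations (no mean subtraction).
        return self.weight * x * x.pow(2).mean(-1, keepdim=True).add(self.eps).rsqrt()

class SwiGLU(nn.Module):
    def __init__(self, dim=768, hidden=3072):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        # SiLU-gated MLP, as in LLaMA/PaLM-style blocks.
        return self.down(F.silu(self.gate(x)) * self.up(x))

class GQAttention(nn.Module):
    def __init__(self, dim=768, n_heads=12, n_kv_heads=4, head_dim=64):
        super().__init__()
        self.n_heads, self.n_kv_heads, self.head_dim = n_heads, n_kv_heads, head_dim
        self.q_proj = nn.Linear(dim, n_heads * head_dim, bias=False)     # 768 -> 768
        self.k_proj = nn.Linear(dim, n_kv_heads * head_dim, bias=False)  # 768 -> 256
        self.v_proj = nn.Linear(dim, n_kv_heads * head_dim, bias=False)  # 768 -> 256
        self.o_proj = nn.Linear(n_heads * head_dim, dim, bias=False)

    def forward(self, x):
        B, S, _ = x.shape
        q = self.q_proj(x).view(B, S, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, S, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, S, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # GQA: each KV head is shared by 3 query heads (12 Q : 4 KV).
        k = k.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        v = v.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        # RoPE would be applied to q and k here; omitted for brevity.
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(B, S, -1))

class DecoderBlock(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.attn_norm, self.attn = RMSNorm(dim), GQAttention(dim)
        self.mlp_norm, self.mlp = RMSNorm(dim), SwiGLU(dim)

    def forward(self, x):
        x = x + self.attn(self.attn_norm(x))  # pre-norm + residual
        x = x + self.mlp(self.mlp_norm(x))    # pre-norm + residual
        return x
```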
---

## 🚀 Quick Start

### 📦 Installation

```bash
# Core dependencies
pip install "torch>=2.0.0" "transformers>=4.35.0" accelerate safetensors

# Optional: for maximum performance
pip install flash-attn --no-build-isolation  # Flash Attention 2
pip install xformers                         # Memory-efficient attention
pip install bitsandbytes                     # Quantization support
```

### 💻 Basic Usage

#### 1️⃣ Load the Model

```python
from transformers import AutoModelForCausalLM, AutoConfig
import torch

# Load the configuration
config = AutoConfig.from_pretrained(
    "Lyon28/caca-100M-untrained",
    trust_remote_code=True
)

print(f"Model: {config.model_type}")
print("Parameters: 222,201,088")
print(f"Hidden size: {config.hidden_size}")
print(f"Layers: {config.num_hidden_layers}")

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    "Lyon28/caca-100M-untrained",
    config=config,
    torch_dtype=torch.bfloat16,  # Use BF16 for efficiency
    device_map="auto",           # Automatically place on available devices
    trust_remote_code=True
)

print(f"Model loaded! Device: {model.device}")
```

#### 2️⃣ Verify the Model

```python
# Count parameters
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"Total parameters: {total_params:,}")
print(f"Trainable parameters: {trainable_params:,}")
print(f"Model size: {total_params * 2 / 1e9:.2f} GB (BF16)")

# Test a forward pass
batch_size, seq_len = 2, 10
input_ids = torch.randint(0, config.vocab_size, (batch_size, seq_len))
input_ids = input_ids.to(model.device)

with torch.no_grad():
    outputs = model(input_ids)

print(f"Output shape: {outputs.logits.shape}")
print("✅ Model works as expected!")
```

#### 3️⃣ Generate Text (After Training)

```python
from transformers import AutoTokenizer

# Load a tokenizer (use one that matches your training setup)
tokenizer = AutoTokenizer.from_pretrained("your-tokenizer-here")

# Prepare input
text = "Jelaskan tentang kecerdasan buatan"
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    do_sample=True,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id
)

# Decode
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```

---

## 🔧 Training Guide

### 📚 Dataset Preparation

```python
from datasets import load_dataset

# Load a dataset (example)
dataset = load_dataset("indonesian-nlp/id-wikipedia")

# Or load from a local file
from datasets import Dataset
import pandas as pd

df = pd.read_csv("your_data.csv")
dataset = Dataset.from_pandas(df)

print(f"Dataset size: {len(dataset)}")
```

### 🎯 Training Configuration

```python
from transformers import Trainer, TrainingArguments
from transformers import DataCollatorForLanguageModeling

# Training arguments
training_args = TrainingArguments(
    # Output
    output_dir="./caca-100M-trained",
    run_name="caca-100M-v1",

    # Training
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # Effective batch size = 32
    learning_rate=2e-4,
    weight_decay=0.1,
    warmup_steps=2000,

    # Optimization
    bf16=True,                    # Mixed precision training
    gradient_checkpointing=True,  # Saves memory
    optim="adamw_torch_fused",    # Fused AdamW optimizer
    max_grad_norm=1.0,

    # Logging & evaluation
    logging_steps=10,
    logging_first_step=True,
    eval_strategy="steps",
    eval_steps=500,
    save_steps=1000,
    save_total_limit=3,

    # Hub integration
    push_to_hub=True,
    hub_model_id="your-username/caca-100M-trained",
    hub_strategy="every_save",

    # Distributed training
    ddp_find_unused_parameters=False,
    dataloader_num_workers=4,
)

# Data collator
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False  # Causal LM, not masked LM
)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=data_collator,
)

# Train!
print("🚀 Starting training...")
trainer.train()

# Save the final model
print("💾 Saving model...")
trainer.save_model("./caca-100M-final")
trainer.push_to_hub()

print("✅ Training complete!")
```
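One caveat about the snippet above: `Trainer` expects tokenized examples (an `input_ids` column), but the raw-text dataset loaded in the preparation step is never passed through the tokenizer. A minimal tokenization sketch, assuming the `DatasetDict` returned by `load_dataset` above and a `text` column (the column name and `max_length` here are illustrative):

```python
def tokenize_fn(examples):
    # Convert raw text into token ids; truncate to the model's 4,096-token context.
    return tokenizer(examples["text"], truncation=True, max_length=4096)

dataset = dataset.map(
    tokenize_fn,
    batched=True,
    remove_columns=dataset["train"].column_names,  # drop the raw text columns
)
```

`DataCollatorForLanguageModeling(mlm=False)` then builds the `labels` from `input_ids` at batch time, so no separate label column is needed.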
### 📊 Resource Estimates

**💰 Estimated training cost & time**

**Hardware requirements:**

| GPU | Memory | Batch Size | Speed | Est. Time (100B tokens) |
|-----|--------|------------|-------|-------------------------|
| RTX 3090 (24GB) | 24GB | 1-2 | ~1K tok/s | ~30 days |
| A100 (40GB) | 40GB | 4-8 | ~5K tok/s | ~6 days |
| A100 (80GB) | 80GB | 8-16 | ~8K tok/s | ~4 days |
| 8×A100 (80GB) | 640GB | 64+ | ~50K tok/s | ~14 hours |

**Cloud costs (approximate):**
- AWS p4d.24xlarge (8×A100): ~$32/hour × 24 hours = **~$768/day**
- GCP a2-ultragpu-8g: ~$30/hour × 24 hours = **~$720/day**
- Lambda Labs (8×A100): ~$15/hour × 24 hours = **~$360/day**

**Tips for cutting costs:**
- Use spot instances (60-70% cheaper)
- Use gradient accumulation for a larger effective batch size
- Use mixed precision (BF16) for roughly 2x speedup
- Use gradient checkpointing to save memory
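For inference on smaller GPUs, the 4-bit/8-bit quantization listed under Features can help once the model is trained. A hedged sketch using bitsandbytes (requires `pip install bitsandbytes` and a CUDA GPU; for a 222M-parameter model the savings are modest, but the same pattern applies to larger Caca variants):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 with double quantization, as listed under "Quantization Support".
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Lyon28/caca-100M-untrained",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```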
---

## 💬 Chat Format

The model supports a standard chat format:

```python
# Single turn
messages = [
    {"role": "user", "content": "Halo! Siapa kamu?"},
]

# Multi-turn conversation
messages = [
    {"role": "system", "content": "Kamu adalah asisten AI yang membantu."},
    {"role": "user", "content": "Jelaskan tentang fotosintesis"},
    {"role": "assistant", "content": "Fotosintesis adalah proses..."},
    {"role": "user", "content": "Apa manfaatnya bagi manusia?"},
]

# Apply the chat template
formatted = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
print(formatted)
# Output:
# System: Kamu adalah asisten AI yang membantu.
#
# User: Jelaskan tentang fotosintesis
# Assistant: Fotosintesis adalah proses...
# User: Apa manfaatnya bagi manusia?
# Assistant:
```

---

## 🎯 Use Cases

### ✅ Suitable For:
- 🔬 **Research**: Experiments with modern LLM architectures
- 📚 **Education**: Learning about transformers & training
- 🎓 **Academia**: Papers, theses, course projects
- 🚀 **Base Model**: Fine-tuning for specific tasks
- 💡 **Proof of Concept**: Testing ideas before scaling up

### ❌ Not Suitable For:
- 🚫 **Production**: The model has not been trained
- 🚫 **Real-world apps**: Output is still random
- 🚫 **Safety-critical use**: No safety alignment yet
- 🚫 **Direct deployment**: Training is required first

---

## 📖 Documentation

### 🔗 Important Links

- 📚 **Hugging Face Docs**: [huggingface.co/docs/transformers](https://huggingface.co/docs/transformers)
- 💻 **GitHub**: [Lyon-28/caca-transformers](https://github.com/Lyon-28/caca-transformers)
- 💬 **Discussions**: [Model discussions](https://huggingface.co/Lyon28/caca-100M-untrained/discussions)
- 🐛 **Issues**: [Report bugs](https://huggingface.co/Lyon28/caca-100M-untrained/discussions)

### 📝 Related Models
| Model Size | Parameters | Link |
|------------|------------|------|
| 🐣 Tiny | 1M - 50M | [caca-1M](../caca-1M-untrained) to [caca-50M](../caca-50M-untrained) |
| 🐥 Small | 75M - 500M | [caca-75M](../caca-75M-untrained) to [caca-500M](../caca-500M-untrained) |
| 🦅 Medium | 600M - 1B | [caca-600M](../caca-600M-untrained) to [caca-1B](../caca-1B-untrained) |
| 🦁 Large | 1.5B - 5B | [caca-1.5B](../caca-1.5B-untrained) to [caca-5B](../caca-5B-untrained) |
| 🐉 XL | 6B - 10B | [caca-6B](../caca-6B-untrained) to [caca-10B](../caca-10B-untrained) |
| 🦖 XXL | 12B+ | [caca-12B](../caca-12B-untrained) to [caca-70B](../caca-70B-untrained) |
---

## 🤝 Contributing

Contributions are very welcome! Some ways to contribute:

- 🐛 **Report bugs**: Found a bug? [Open an issue](https://huggingface.co/Lyon28/caca-100M-untrained/discussions)
- 💡 **Suggest features**: Have an idea? Share it in the discussions
- 📝 **Improve docs**: PRs for documentation are welcome
- 🎓 **Share results**: Trained the model? Share your results on the model card
- ⭐ **Star & Share**: Help this project grow

---

## 📜 License & Citation

### 📄 License

This model is released under the **Apache License 2.0**:
- ✅ Free for commercial use
- ✅ Free for research use
- ✅ Modification & redistribution allowed
- ✅ Provided without warranty

### 📚 Citation

If you use this model in research or a project, please cite:

```bibtex
@misc{caca100M2025,
  author = {Lyon},
  title = {Caca-100M: Modern Transformer Architecture with GQA and Advanced Features},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/Lyon28/caca-100M-untrained}},
}
```

### 🙏 Acknowledgments

This model is inspired by and implements a range of recent research:

#### 🏗️ **Core Architecture**
- **LLaMA** (Meta AI, 2023) - Base decoder-only architecture, RMSNorm, SwiGLU
  - Paper: [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
- **GPT-3** (OpenAI, 2020) - Transformer language modeling paradigm
- **PaLM** (Google, 2022) - SwiGLU activation function

#### 🎯 **Attention Mechanisms**
- **Flash Attention v2** (Tri Dao et al., 2023) - Efficient attention with IO-awareness
  - Paper: [FlashAttention-2: Faster Attention with Better Parallelism](https://arxiv.org/abs/2307.08691)
- **Grouped Query Attention (GQA)** (Ainslie et al., Google, 2023) - Memory-efficient attention
  - Paper: [GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints](https://arxiv.org/abs/2305.13245)
- **Multi-Query Attention (MQA)** (Shazeer, Google, 2019) - Fast decoding
- **xFormers** (Meta AI, 2022) - Memory-efficient attention implementations
- **PyTorch SDPA** (PyTorch Team, 2023) - Built-in scaled dot product attention

#### 📍 **Position Encodings**
- **RoPE** (Su et al., 2021) - Rotary Position Embeddings
  - Paper: [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864)
- **ALiBi** (Press et al., 2022) - Attention with Linear Biases for extrapolation
  - Paper: [Train Short, Test Long: Attention with Linear Biases](https://arxiv.org/abs/2108.12409)
- **YaRN** (Peng et al., 2023) - Yet another RoPE extensioN for long context
  - Paper: [YaRN: Efficient Context Window Extension](https://arxiv.org/abs/2309.00071)

#### 🪟 **Long Context & Efficiency**
- **Sliding Window Attention** (Mistral AI, 2023) - Local attention patterns
  - Paper: [Mistral 7B](https://arxiv.org/abs/2310.06825)
- **StreamingLLM / Attention Sink** (Xiao et al., MIT, 2023) - Infinite sequence lengths
  - Paper: [Efficient Streaming Language Models with Attention Sinks](https://arxiv.org/abs/2309.17453)
- **Logit Softcapping** (Google Gemma, 2024) - Prevents attention overflow
  - Paper: [Gemma: Open Models Based on Gemini](https://arxiv.org/abs/2403.08295)

#### 🧠 **Mixture of Experts (MoE)**
- **Mixtral 8x7B** (Mistral AI, 2024) - Sparse MoE architecture
  - Paper: [Mixtral of Experts](https://arxiv.org/abs/2401.04088)
- **Switch Transformers** (Fedus et al., Google, 2021) - Scaling with expert choice
  - Paper: [Switch Transformers: Scaling to Trillion Parameter Models](https://arxiv.org/abs/2101.03961)
- **GLaM** (Du et al., Google, 2021) - Generalist Language Model with MoE
- **Expert Choice Routing** (Zhou et al., Google, 2022) - Improved load balancing

#### 🎓 **Training Optimizations**
- **Layer Scale** (Touvron et al., Meta, 2021) - Training stability for deep networks
  - Paper: [Going Deeper with Image Transformers (CaiT)](https://arxiv.org/abs/2103.17239)
- **Stochastic Depth** (Huang et al., 2016) - Regularization via random layer dropping
  - Paper: [Deep Networks with Stochastic Depth](https://arxiv.org/abs/1603.09382)
- **Mixture of Depths (MoD)** (Raposo et al., Google DeepMind, 2024) - Dynamic compute allocation
  - Paper: [Mixture-of-Depths: Dynamically allocating compute in transformer-based models](https://arxiv.org/abs/2404.02258)
- **Gradient Checkpointing** (Chen et al., 2016) - Memory-efficient training

#### 📦 **Quantization**
- **LLM.int8()** (Dettmers et al., 2022) - 8-bit matrix multiplication
  - Paper: [LLM.int8(): 8-bit Matrix Multiplication for Transformers](https://arxiv.org/abs/2208.07339)
- **QLoRA** (Dettmers et al., 2023) - 4-bit quantized LoRA fine-tuning
  - Paper: [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)
- **GPTQ** (Frantar et al., 2022) - Post-training quantization
- **bitsandbytes** (Dettmers) - Efficient quantization library

#### 🎨 **Multimodal Components**
- **Vision Transformer (ViT)** (Dosovitskiy et al., Google, 2020) - Image encoding
  - Paper: [An Image is Worth 16x16 Words](https://arxiv.org/abs/2010.11929)
- **Perceiver Resampler** (Alayrac et al., DeepMind, 2022) - Multimodal fusion
  - Paper: [Flamingo: a Visual Language Model](https://arxiv.org/abs/2204.14198)
- **Q-Former** (Li et al., Salesforce, 2023) - Query-based multimodal alignment
  - Paper: [BLIP-2: Bootstrapping Language-Image Pre-training](https://arxiv.org/abs/2301.12597)
- **Whisper** (Radford et al., OpenAI, 2022) - Audio encoding inspiration

#### 🛠️ **Normalization & Activations**
- **RMSNorm** (Zhang & Sennrich, 2019) - Root Mean Square Layer Normalization
  - Paper: [Root Mean Square Layer Normalization](https://arxiv.org/abs/1910.07467)
- **SwiGLU** (Shazeer, Google, 2020) - GLU activation variant
  - Paper: [GLU Variants Improve Transformer](https://arxiv.org/abs/2002.05202)

#### 🔧 **Implementation & Tools**
- **Hugging Face Transformers** - Model implementation framework
- **PyTorch** - Deep learning framework
- **Safetensors** - Secure tensor serialization format
- **Accelerate** - Distributed training utilities

---

**Special thanks to:**
- 🇮🇩 Indonesian NLP Community
- 🤗 Hugging Face Team
- 🔬 The open-source AI research community

## ⚠️ Limitations & Bias

### Limitations

- 🔴 **Untrained**: The model has not been trained; output is random
- 🟡 **No tokenizer**: You need to prepare your own tokenizer
- 🟡 **No safety tuning**: No content filtering or alignment yet
- 🟠 **Memory intensive**: Training requires large GPUs

### Potential Biases

The model will inherit biases from whatever training data is used. Please keep in mind:

- **Language**: Bias toward the majority language in the dataset
- **Culture**: Bias toward particular cultural perspectives
- **Gender & demographics**: Potential stereotypes
- **Factuality**: It can generate inaccurate information

**Recommendation**: Evaluate and filter outputs before any deployment.

---

## 📞 Support & Contact

### 💬 Community

- **Discussions**: [HF Discussions](https://huggingface.co/Lyon28/caca-100M-untrained/discussions)

### 📧 Contact

For questions or collaboration:
- Email: cacatransformers@gmail.com
- HF Profile: [@Lyon28](https://huggingface.co/Lyon28)

---
## 🌟 Star History

[![Star History Chart](https://api.star-history.com/svg?repos=Lyon-28/caca-transformers&type=Date)](https://star-history.com/#Lyon-28/caca-transformers&Date)

---

### 💝 Built with ❤️ for the Indonesian AI community

**Thank you for using Caca!**

If this project is useful to you, consider:
- ⭐ Starring the repository
- 🔗 Sharing it with friends
- 💬 Joining the discussions
- 🤝 Contributing to the project

---