---
base_model:
- nbeerbower/Llama3-Asobi-70B
- Mawdistical/Draconic-Tease-70B
- allura-org/Bigger-Body-70b
- sophosympatheia/Nova-Tempus-70B-v0.1
- Steelskull/L3.3-Electra-R1-70b
- PKU-Alignment/alpaca-70b-reproduced-llama-3
- ReadyArt/The-Omega-Directive-L-70B-v1.0
- Undi95/Sushi-v1.4
- ReadyArt/Forgotten-Safeword-70B-v5.0
- KaraKaraWitch/Llama-3.3-Amakuro
- flammenai/Mahou-1.5-llama3.1-70B
- Black-Ink-Guild/Pernicious_Prophecy_70B
- LatitudeGames/Wayfarer-Large-70B-Llama-3.3
library_name: transformers
tags:
- mergekit
- merge
---

# L3.3-oiiaioiiai-B

![image/png](https://cdn-uploads.huggingface.co/production/uploads/633e85093a17ab61de8d9073/h02_ylRi3IBi_8t9glFY1.png)

...I should explain what the 2 oiiaioiiai models are, and why I'm keeping this one.

The initial idea for oiiaioiiai was a giant "final"/"farewell" model for me on L3.3: take all the best and greatest as of this month (04/25) and rock-smash them into one huge model. The server I use to merge models only has around 200GB of RAM. Initially I tried doing the whole suite of ~28-odd models... but as you can already tell, that would have crashed the VM. So instead, I split it into 2 models.

What is currently labeled A is actually the second iteration of the A part. Why? Well, if anyone has seen [KaraKaraWarehouse/L3.3-Joubutsu2000](https://huggingface.co/KaraKaraWarehouse/L3.3-Joubutsu2000)... that model is a living example of why merging in so many models is risky. Even at low temperatures, it has no concept of a "sentence". Hence the title was renamed from "A" to "Joubutsu". (If you really want to learn what that means, I highly suggest looking it up on [dic.nicovideo.jp](https://dic.nicovideo.jp/a/%E4%B8%87%E6%88%88%E3%82%A4%E3%83%A0-%E4%B8%80%E3%83%8E%E5%8D%81).)

After testing "Joubutsu" (formerly oiiai-A), I decided to change things up a bit for B. `dare_ties` became plain `ties`, since I suspected the poor model became... like that... because of the merge method. Then I tested B, and it was...
usable.

At the time, I was also looking for a model to do MTL with. I'm kinda blown away by how much I like it when it comes to doing Japanese to English translations, which I felt my previous models didn't do super well. (Perhaps it could be due to Bigger Body or Asobi or any other model, but your guess is as good as mine...) More so, this is also one of the first models where I managed to merge in AISG's SEA-LION models. (#SupportLocal, and I do genuinely mean that.) So in theory it should be able to do `Burmese, Chinese, English, Filipino, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tamil, Thai, Vietnamese` better than its counterparts.

Anyway, A2 (the iteration that is on HF and labeled as A) fared a bit better, but after some quick testing it still writes like a schizophrenic.

## Model Vibes

I'm writing vibes from my testing, trying to be a bit more transparent about some of my models, since I'm sure some people that use them on featherless would appreciate the added transparency. Plus, it may give new model mergers some pointers on what to look out for in their own merges.

For my usage: **I actually do not RP.** I love to theory-craft and write scenarios and game events (possibly due to my background, but I digress). So take it from that POV:

1. Tends to write Wikipedia-like content when asked for idea crafting. It doesn't like giving point form.
2. Japanese -> English translation (YMMV) is better than the MagicalGirl series, and is the reason why I'm keeping this model around.
3. Has a bit of a "weeb" feel to it(?). It tends to stretch out groans and moans like you might see in a visual novel.
4. This model likes to be more "direct" and pays more attention to the prompt.
5. It is able to do dialogue generation (likely due to Wayfarer playing a part in it). But I prefer CursedMagicalGirl or MagicalGirl-2 over this.
6. The model should be able to do furry elements well, but I haven't tested it.
7. The base model should make it uncensored.
But I think it might be a tad too horny.

## Prompting

ChatML seems to work, and is what I typically use. My standard sampler settings are as follows (finally, and yes, they're more or less standardized):

```
Temperature: 1.4/1.2
Min P: 0.03
```

## How 2 Model Merge

I'm considering writing an article on this, so I'm jotting my notes down in public.

### On Selection...

In general, and "fun fact": I do not have a rhyme or reason for merging. The reason why I call model merging "rock smashing" is because it's like that sometimes. You bang 2 different rocks together and you get another, different rock. You could do all the fancy science nonsense, but I'm personally not a fan of it. Though I think there are a few points to be made:

1. You should probably pre-test selected models and read their model descriptions.
2. Benchmarks can only get you so far before you realize that "there is no perfect benchmark in the world, and any and all benchmarks can be cheated."

I'll add more when I can think of them.

### On Testing Models

Due to point 2, there is also no standard way to test models. Of course, you should be able to tell if a model is behaving like a schizo.

1. Test with what you have. *Your own chat logs are a useful way to tell what is good and what is bad.*
2. Write down what you observe compared to other models. (See the section on model vibes above.) In a way, this helps inform you what your next model could do better, or what you want out of a model. (i.e. I'm currently considering merging MagicalGirl with this model to see if I can get the best out of both worlds.)

## Merge Details

### Merge Method

This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [ReadyArt/The-Omega-Directive-L-70B-v1.0](https://huggingface.co/ReadyArt/The-Omega-Directive-L-70B-v1.0) as a base.
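For the curious, the TIES paper boils down to three steps per parameter: trim low-magnitude task-vector entries, elect a majority sign, and average only the surviving entries that agree with it. Here's a toy NumPy sketch of that idea on flat vectors (my own illustrative helper, *not* mergekit's actual implementation; `ties_merge`, the weight handling, and the trimming rule are all simplifications):

```python
import numpy as np

def ties_merge(base, models, weights, topk=0.5):
    """Toy TIES-style merge on flat parameter vectors.
    Illustrative only: mergekit's real implementation differs."""
    deltas = []
    for m, w in zip(models, weights):
        d = (m - base) * w  # weighted task vector
        # Trim: keep only the top-k fraction of entries by magnitude
        k = max(1, int(len(d) * topk))
        cutoff = np.sort(np.abs(d))[-k]
        deltas.append(np.where(np.abs(d) >= cutoff, d, 0.0))
    deltas = np.stack(deltas)
    # Elect a per-parameter sign from the summed trimmed deltas
    sign = np.sign(deltas.sum(axis=0))
    # Keep only deltas agreeing with the elected sign, then average them
    agree = (np.sign(deltas) == sign) & (deltas != 0)
    merged = np.where(agree.any(axis=0),
                      (deltas * agree).sum(axis=0) / np.maximum(agree.sum(axis=0), 1),
                      0.0)
    return base + merged
```

The sign election is what keeps disagreeing models from cancelling each other out into mush, which is roughly why swapping `dare_ties` for plain `ties` can change a merge's character so much.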
### Models Merged

The following models were included in the merge:

* [nbeerbower/Llama3-Asobi-70B](https://huggingface.co/nbeerbower/Llama3-Asobi-70B)
* [aisingapore/Llama-SEA-LION-v3-70B-IT](https://huggingface.co/aisingapore/Llama-SEA-LION-v3-70B-IT)
* [Mawdistical/Draconic-Tease-70B](https://huggingface.co/Mawdistical/Draconic-Tease-70B)
* [allura-org/Bigger-Body-70b](https://huggingface.co/allura-org/Bigger-Body-70b)
* [sophosympatheia/Nova-Tempus-70B-v0.1](https://huggingface.co/sophosympatheia/Nova-Tempus-70B-v0.1)
* [Steelskull/L3.3-Electra-R1-70b](https://huggingface.co/Steelskull/L3.3-Electra-R1-70b)
* [PKU-Alignment/alpaca-70b-reproduced-llama-3](https://huggingface.co/PKU-Alignment/alpaca-70b-reproduced-llama-3)
* OpenBioLLM
* [Undi95/Sushi-v1.4](https://huggingface.co/Undi95/Sushi-v1.4)
* [ReadyArt/Forgotten-Safeword-70B-v5.0](https://huggingface.co/ReadyArt/Forgotten-Safeword-70B-v5.0)
* [KaraKaraWitch/Llama-3.3-Amakuro](https://huggingface.co/KaraKaraWitch/Llama-3.3-Amakuro)
* [flammenai/Mahou-1.5-llama3.1-70B](https://huggingface.co/flammenai/Mahou-1.5-llama3.1-70B)
* [Black-Ink-Guild/Pernicious_Prophecy_70B](https://huggingface.co/Black-Ink-Guild/Pernicious_Prophecy_70B)
* [LatitudeGames/Wayfarer-Large-70B-Llama-3.3](https://huggingface.co/LatitudeGames/Wayfarer-Large-70B-Llama-3.3)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
##############################################################################
# The benefit of L3 models is that all subversions are mergeable in some way.
# So we can create something **REALLY REALLY REALLY** stupid like this.
##############################################################################
# PLEASE DO NOT FOLLOW.
# This will probably show up on the hf repo. Hi there!
##############################################################################
# - KaraKaraWitch.
# P.S.
# 3e7aWKeGHFE (15/04/25)
##############################################################################
models:
  - model: Black-Ink-Guild/Pernicious_Prophecy_70B
    parameters:
      density: 0.8129
      weight: 0.3378
  # De-alignment
  - model: PKU-Alignment/alpaca-70b-reproduced-llama-3
    parameters:
      density: 0.7909
      weight: 0.672
  # Text Adventure
  - model: LatitudeGames/Wayfarer-Large-70B-Llama-3.3
    parameters:
      density: 0.5435
      weight: 0.7619
  - model: KaraKaraWitch/Llama-3.3-Amakuro
    parameters:
      density: 0.37
      weight: 0.359
  - model: ReadyArt/Forgotten-Safeword-70B-v5.0
    parameters:
      density: 0.37
      weight: 0.359
  - model: Undi95/Sushi-v1.4
    parameters:
      density: 0.623
      weight: 0.789
  - model: sophosympatheia/Nova-Tempus-70B-v0.1
    parameters:
      density: 0.344
      weight: 0.6382
  - model: flammenai/Mahou-1.5-llama3.1-70B
    parameters:
      density: 0.56490
      weight: 0.4597
  # Changelog: [ADDED] furries.
  - model: Mawdistical/Draconic-Tease-70B
    parameters:
      density: 0.4706
      weight: 0.3697
  # R1 causes a lot of alignment. So we avoid it.
  - model: Steelskull/L3.3-Electra-R1-70b
    parameters:
      density: 0.1692
      weight: 0.1692
  # Blue hair, blue tie... Hiding in your wiifii
  # - model: sophosympatheia/Midnight-Miqu-70B-v1.0
  #   parameters:
  #     density: 0.4706
  #     weight: 0.3697
  # OpenBioLLM does not use safetensors in the repo. Custom safetensors version.
  - model: OpenBioLLM
    parameters:
      density: 0.267
      weight: 0.1817
  - model: allura-org/Bigger-Body-70b
    parameters:
      density: 0.6751
      weight: 0.3722
  - model: nbeerbower/Llama3-Asobi-70B
    parameters:
      density: 0.7113
      weight: 0.4706
  # ...Reminds me that any time I try and merge in SEALION models,
  # they end up overpowering other models. So I'm setting this *really* low.
  - model: aisingapore/Llama-SEA-LION-v3-70B-IT
    parameters:
      density: 0.0527
      weight: 0.1193
merge_method: ties
base_model: ReadyArt/The-Omega-Directive-L-70B-v1.0
parameters:
  select_topk: 0.50
dtype: bfloat16
```
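On the sampler side, the `Min P: 0.03` setting from the Prompting section just means: drop any token whose probability is below 3% of the top token's probability, then renormalize. A minimal, library-agnostic sketch in plain Python (illustrative only; `min_p_filter` is my own name, and real inference stacks apply this over full logit vectors on-device):

```python
import math

def min_p_filter(logits, temperature=1.2, min_p=0.03):
    """Toy temperature + min-p filtering sketch.
    Not any specific inference library's implementation."""
    # Temperature-scale the logits, then softmax (max-subtracted for stability)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Drop tokens below min_p * (probability of the most likely token)
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    # Renormalize the survivors into a distribution to sample from
    total = sum(kept)
    return [p / total for p in kept]
```

Because the cutoff scales with the top token's confidence, min-p tolerates the high temperatures above (1.4/1.2) far better than a fixed top-p would: when the model is confident, junk tokens get pruned; when it's genuinely uncertain, more options survive.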