It's a DELLA! 👼
First vibe test positive, the magic is there... ✨
Glad to hear it, I haven't tested this one as much yet.
Can't wait to finetune 24B models eventually
Townsfolk were talking about some kind of repetition bug, prompting development of Goetia v1.4 -- what's that all about? 👀
Not sure, but v1.4 is probably a ways off. I'm trying to save up for a 3090 and working long hours. I have made some new datasets and found a way to improve Cthulhu/Raven/Morpheus with longer-context Q&A pairs (~2k tokens instead of 500). My initial test set for this long-form style performed quite well.
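For anyone building a similar long-form set: a quick way to check whether a Q&A pair lands in the ~2k-token range is a character-count heuristic. A minimal sketch, assuming the rough ~4-characters-per-token rule of thumb (real counts need the model's actual tokenizer, and the function names here are mine, not from any library):

```python
# Hypothetical helper: bucket Q&A pairs by approximate token count.
# The ~4 chars/token ratio is an assumption; use the real tokenizer for accuracy.

def approx_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def is_long_form(question: str, answer: str, min_tokens: int = 1500) -> bool:
    """True if the combined pair is in the ~2k-token range rather than ~500."""
    return approx_tokens(question) + approx_tokens(answer) >= min_tokens

pairs = [
    ("Short question?", "Short answer."),
    ("Long question... " * 50, "Long answer... " * 400),
]
long_pairs = [p for p in pairs if is_long_form(*p)]
```

Filtering a raw dataset through something like `is_long_form` is a cheap first pass before doing an exact token count on the survivors.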
So, a proper Cthulhu/Goetia 24B dataset is under construction, but we'll likely see the smaller versions first.
Attempts to finetune 12B on an 8GB card have failed, but I can now do MoE merges for Mistral 7B and Llama 8B. These seem promising. I have more ideas for MoE finetunes/merges coming up.
I was also working on a custom moe_karcher method, but lost the script in a BSOD
Update: The script has been recovered! Running a test of it now
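For the curious: "Karcher" here refers to the Karcher (Fréchet) mean, the point minimizing squared geodesic distance to a set of points on a manifold. A minimal pure-Python sketch of the standard log-map/exp-map iteration on unit vectors — purely illustrative of the math, not mergekit's code or the recovered script:

```python
import math

# Small vector helpers (plain lists, no numpy, for illustration only)
def _dot(a, b): return sum(x * y for x, y in zip(a, b))
def _norm(a): return math.sqrt(_dot(a, a))
def _scale(a, s): return [x * s for x in a]
def _add(a, b): return [x + y for x, y in zip(a, b)]
def _normalize(a):
    n = _norm(a)
    return [x / n for x in a]

def log_map(p, q):
    """Tangent vector at p pointing toward q on the unit sphere."""
    c = max(-1.0, min(1.0, _dot(p, q)))
    theta = math.acos(c)
    if theta < 1e-12:
        return [0.0] * len(p)
    # component of q orthogonal to p, rescaled to geodesic distance
    u = _add(q, _scale(p, -c))
    return _scale(u, theta / _norm(u))

def exp_map(p, v):
    """Move from p along tangent vector v, staying on the unit sphere."""
    n = _norm(v)
    if n < 1e-12:
        return p[:]
    return _add(_scale(p, math.cos(n)), _scale(v, math.sin(n) / n))

def karcher_mean(points, iters=50):
    """Iterate: average the log-map tangents, exp-map back, repeat."""
    mean = _normalize(points[0])
    for _ in range(iters):
        tangents = [log_map(mean, _normalize(q)) for q in points]
        avg = _scale([sum(col) for col in zip(*tangents)], 1.0 / len(points))
        if _norm(avg) < 1e-10:
            break
        mean = exp_map(mean, avg)
    return mean
```

The same fixed-point idea generalizes from unit vectors to normalized weight tensors, which is presumably what a karcher-style merge method exploits.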
What's your total VRAM? At least 16-24GB is enough to help with finetuning.
You also need a lot of system RAM (32-64GB) and a pagefile on an SSD with enough spare write cycles; I recommend at least 50-100GB.
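Those numbers are easy to sanity-check with back-of-the-envelope arithmetic: weights take roughly params × bytes-per-element. A sketch (this ignores overhead, and mergekit's lazy shard loading keeps real peak usage lower than the worst case):

```python
# Back-of-the-envelope memory estimate for a merge: weights only,
# at params * bytes-per-element. Overhead and activations ignored.

BYTES = {"float32": 4, "bfloat16": 2, "float16": 2}

def model_gb(n_params_billion: float, dtype: str = "float32") -> float:
    """Approximate size of one model's weights in GB."""
    return n_params_billion * 1e9 * BYTES[dtype] / 1e9

# A SLERP merge of two 8B models in float32 touches roughly:
two_models = 2 * model_gb(8, "float32")   # 64 GB of input weights
output = model_gb(8, "bfloat16")          # 16 GB written out
```

Two 8B inputs in float32 alone are ~64GB of weights, which is why RAM plus a generous pagefile matters even when VRAM is small.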
I don't use Mac, but here is a supposed way to install it.
If you can get it to run a simple SLERP merge let me know, and I can send you more advanced scripts to test.
Save this as config.yaml and try to run it (recreate kekthulhu). Some of these commands won't work on Mac, so you may have to ask a 'smart' LLM for help:

mergekit-yaml C:\mergekit-main\config.yaml C:\mergekit-main\merged_model_output --copy-tokenizer --allow-crimes --out-shard-size 5B --trust-remote-code --lazy-unpickle --random-seed 420 --cuda
```yaml
base_model: SicariusSicariiStuff/Assistant_Pepe_8B
architecture: MistralForCausalLM
merge_method: slerp
dtype: float32
out_dtype: bfloat16
slices:
  - sources:
      - model: Naphula/Cthulhu-8B-v1.4
        layer_range: [0, 32]
      - model: SicariusSicariiStuff/Assistant_Pepe_8B
        layer_range: [0, 32]
parameters:
  t: 0.5
tokenizer:
  source: union
#chat_template: auto
```
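As background on what `merge_method: slerp` with `t: 0.5` does: each pair of weight tensors is interpolated along the arc between them rather than along a straight line. A minimal sketch of the formula on plain vectors — illustrative only, not mergekit's implementation (a common fallback to plain lerp for near-parallel vectors is included):

```python
import math

def slerp(v0, v1, t):
    """Spherical linear interpolation between two vectors.
    Falls back to plain lerp when the vectors are nearly colinear."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)
    if omega < 1e-6:  # nearly parallel: linear interpolation is fine
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# t = 0.5, as in the config, lands halfway along the arc between the models
mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

The arc-based interpolation preserves vector magnitude better than a straight average, which is the usual argument for SLERP over linear merging.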
To run Arcee AI's MergeKit on a Macintosh, first clone the repository with `git clone https://github.com/arcee-ai/mergekit.git`, then navigate to the directory with `cd mergekit` and install the package with `pip install -e .`. After installation, you can use the main script `mergekit-yaml` to perform merges by specifying a YAML configuration file and an output path.
Setting Up MergeKit on macOS

Prerequisites
- Python: Ensure you have Python installed. You can download it from the official Python website.
- Git: Install Git to clone the MergeKit repository. You can download it from the Git website.

Installation Steps
1. Clone the repository. Open your terminal and run the following command:

```bash
git clone https://github.com/arcee-ai/mergekit.git
```

2. Navigate to the MergeKit directory:

```bash
cd mergekit
```

3. Install MergeKit using pip:

```bash
pip install -e .
```

Running MergeKit
1. Prepare your configuration: create a YAML configuration file for your merge. You can use a template or create one from scratch.
2. Run the merge command using the following syntax:

```bash
mergekit-yaml path/to/your/config.yml ./output-model-directory [--cuda] [--lazy-unpickle] [--allow-crimes] [... other options]
```

Replace path/to/your/config.yml with the path to your configuration file and ./output-model-directory with your desired output directory.
Thanks for sharing these steps, I will try them out when I'm back home. I should be able to use around 50GB as VRAM, though I don't know how well that translates to this use case. Will keep you posted once I'm at it 👍
If you manage to get MergeKit running, then you will want to replace graph.py with this version:
https://huggingface.co/spaces/Naphula/model_tools/blob/main/graph_v18.py
(you would want to edit some of the settings to recalibrate it for your VRAM ceiling)
Not sure if it works on Mac, but if so, you should be able to breeze through big merges in minutes instead of hours with 50GB of VRAM (assuming you have storage space on the SSD for them; it takes forever to transfer from HDD)
I’m still on sick leave, but beginning next week, I hope to finally be able to try it out! I'd love to be able to contribute in a meaningful way, even if I don’t really know what I’m doing 🤪