Seikaijyu/RWKV-x060-World-3B-v2-nsfw-xuexue-v1.roleplay

Chinese

Not-For-All-Audiences


How can a complete beginner use this dataset for fine-tuning and DPO?

by bdf3p4 - opened Oct 1

Discussion

bdf3p4

Oct 1

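Hi, I found a platform called Fireworks that lets you fine-tune most open-source models through a visual interface. It says the dataset has to be a file ending in .jsonl, and the contents look roughly like this: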

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}, 
    {"role": "assistant", "content": "Paris."}
  ]
}
{
  "messages": [
    {"role": "user", "content": "What is 1+1?"},
    {"role": "assistant", "content": "2", "weight": 0},
    {"role": "user", "content": "Now what is 2+2?"},
    {"role": "assistant", "content": "4"}
  ]
}

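I searched Google and Baidu and couldn't find anything that converts data into this format, so how can I use the pth dataset?

PS: I'm a complete beginner and don't understand any of the actual logic or code behind model training.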
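For reference, the format quoted above is JSON Lines: each conversation is one JSON object written on a single line of the file, even though the examples in the post are spread over several lines for readability. Below is a minimal Python sketch of producing such a file. It assumes the conversations are already available in memory as lists of (role, content) pairs. The function names, the sample data, and the output filename are hypothetical, and nothing here comes from Fireworks or from the RWKV tooling.

```python
import json


def to_jsonl_line(turns):
    """Turn one conversation, a list of (role, content) pairs, into one JSONL line."""
    record = {"messages": [{"role": role, "content": content} for role, content in turns]}
    # ensure_ascii=False keeps Chinese text readable instead of \uXXXX escapes.
    return json.dumps(record, ensure_ascii=False)


def write_chat_jsonl(conversations, path):
    """Write every conversation to `path`, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for turns in conversations:
            f.write(to_jsonl_line(turns) + "\n")


# Hypothetical sample data mirroring the examples in the post.
sample = [
    [
        ("system", "You are a helpful assistant."),
        ("user", "What is the capital of France?"),
        ("assistant", "Paris."),
    ],
    [
        ("user", "What is 1+1?"),
        ("assistant", "2"),
    ],
]

if __name__ == "__main__":
    # "train.jsonl" is an assumed output name, not one required by any platform.
    write_chat_jsonl(sample, "train.jsonl")
```

How to get the conversations out of the original data depends on how that data is actually stored, which this sketch does not cover.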

Seikaijyu

Owner Oct 13

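RWKV currently has no DPO implementation. I raised this request with the author of the fine-tuning library a long time ago, but it was not adopted.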

Seikaijyu

Owner Oct 13

Actually, that's not quite right. There is one written by a Japanese developer, but it seems to be unstable.

Seikaijyu

Owner Oct 13

edited Oct 13

If you're a beginner, just follow the official tutorial:
https://rwkv.cn/tutorials/advanced
