🧟 Two-Sentence Horror Story Generator (Mistral 7B)

This project fine-tunes Mistral 7B Instruct v0.3 to generate short-form horror microfiction, using 75,000 stories from a full dump of r/TwoSentenceHorror as well as the subreddit's top 10,000 most-upvoted stories.

Training is performed using LoRA adapters and 4-bit NF4 quantization, enabling efficient fine-tuning on limited hardware.


1. Data Source

Most of the dataset is derived from a full Pushshift subreddit dump, which provides access to historical posts beyond the Reddit API's limits. The model was then additionally trained on the top 10,000 posts of all time from r/TwoSentenceHorror.

Input Dump

  • TwoSentenceHorror_submissions.zst

Extraction Script

  • extract_top_10k.py

Output Dataset

  • dataset_10k.txt

Dataset Format

Each line in the dataset is a single, clean text sample that combines:

  • The post title
  • The post body

2. Extraction Process

The extraction script automatically:

  • Filters out deleted or removed posts
  • Sorts all submissions by upvote score
  • Selects the top 10,000 highest-rated stories
  • Normalizes punctuation and whitespace
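
A minimal sketch of what extract_top_10k.py might do, assuming the standard Pushshift dump layout (newline-delimited JSON compressed with a long zstd window); the field names and normalization rules here are illustrative, and the actual script may differ:

import io
import json
import re

import zstandard  # pip install zstandard

def read_submissions(path):
    # Stream JSON objects out of a Pushshift .zst dump.
    with open(path, "rb") as fh:
        # Pushshift dumps are compressed with a long zstd window.
        dctx = zstandard.ZstdDecompressor(max_window_size=2**31)
        reader = io.TextIOWrapper(dctx.stream_reader(fh), encoding="utf-8")
        for line in reader:
            yield json.loads(line)

def normalize(text):
    # Collapse runs of whitespace and remove space before punctuation.
    text = re.sub(r"\s+", " ", text).strip()
    return re.sub(r"\s+([.,!?])", r"\1", text)

posts = []
for post in read_submissions("TwoSentenceHorror_submissions.zst"):
    title = post.get("title", "")
    body = post.get("selftext", "")
    if not title or body in ("[deleted]", "[removed]"):
        continue  # filter out deleted/removed posts
    posts.append((post.get("score", 0), normalize(f"{title} {body}")))

# Sort by upvote score and keep the top 10,000 stories, one per line.
posts.sort(key=lambda p: p[0], reverse=True)
with open("dataset_10k.txt", "w", encoding="utf-8") as out:
    for _, story in posts[:10000]:
        out.write(story + "\n")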

3. Training Notebook Configuration

Training is performed in the llama-2sentencehorror-trainer.ipynb notebook on Kaggle.

The project originally used Llama (hence the notebook name), but training kept crashing, so Mistral was used instead.

Step 3: Load Dataset

The dataset is loaded as a plain text file using Hugging Face Datasets.

โš ๏ธ Important: The path must point to the actual .txt file inside the Kaggle dataset directory.

from datasets import load_dataset

data_file_path = "/kaggle/input/10k-most-upvoted-two-sentence-horror-2022/dataset_10k.txt"

print(f"Loading from: {data_file_path}")

# Load each line of the .txt file as one training example.
raw_dataset = load_dataset(
    "text",
    data_files={"train": data_file_path},
    split="train"
)

Step 4: Instruction Formatting

Each story is wrapped using Mistral's instruction format so the model learns to associate the prompt with the desired output style.
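
A plausible formatting step (a sketch, not the notebook's exact code; the instruction string matches the inference prompt in Section 5):

INSTRUCTION = "Write a creative and chilling two-sentence horror story."

def format_example(example):
    # Wrap each raw story in Mistral's [INST] ... [/INST] template.
    example["text"] = f"<s>[INST] {INSTRUCTION} [/INST] {example['text']}</s>"
    return example

formatted_dataset = raw_dataset.map(format_example)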

4. Training Hyperparameters

Base Model

  • mistralai/Mistral-7B-Instruct-v0.3

Quantization

  • 4-bit NF4 (bitsandbytes)

LoRA Configuration

  • Rank: 8

  • Alpha: 16

Target Modules:

  • q_proj

  • k_proj

  • v_proj

  • o_proj

Optimization

  • Learning Rate: 2e-4

  • Batch Size: 2 per device

  • Gradient Accumulation: 2 → Effective Batch Size: 4

  • Epochs: 1
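
In code, this configuration maps roughly to the following sketch (the LoRA dropout and fp16 compute dtype are assumptions not stated above; T4 GPUs do not support bfloat16):

import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # 4-bit NF4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # T4 GPUs lack bfloat16 support
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,  # assumed; not specified above
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="mistral-two-sentence-horror",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,  # effective batch size: 4
    learning_rate=2e-4,
    num_train_epochs=1,
    fp16=True,
)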

5. Inference Prompt

The model is trained to respond to the following fixed instruction:

<s>[INST] Write a creative and chilling two-sentence horror story. [/INST]

The model typically generates:

  • A grounded setup sentence

  • A disturbing or ironic twist sentence

As a warning, the model is definitely not perfect. The issue I have with most LLMs is that they are unable to come up with creative ideas. This model can actually do this to some extent, but it struggles to do so while keeping its grammar correct.
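
For reference, a minimal inference sketch (the adapter repo ID is taken from this model page; the sampling settings are illustrative assumptions):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.3"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(model, "denialguo/mistral-two-sentence-horror")

prompt = "<s>[INST] Write a creative and chilling two-sentence horror story. [/INST]"
# The prompt already includes <s>, so skip the tokenizer's automatic BOS token.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
output = model.generate(
    **inputs, max_new_tokens=80, do_sample=True, temperature=0.9, top_p=0.95
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))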

Hardware Notes

  • Training: Kaggle (2× NVIDIA T4 GPUs)
  • Inference: Consumer GPUs supported via 4-bit quantization
  • Minimum VRAM: ~6–8 GB

License

This project is released under the MIT License. Base model copyright remains with Mistral AI.
