QSP-TAT (Pangu-7B + Q4CE Prefix Injection)

This repository provides QSP-TAT training and evaluation scripts for triple classification using:

  • openPangu-Embedded-7B (LLaMA2 or Qwen2.5 also supported) as the base LLM
  • Q4CE knowledge graph embeddings (KGE) injected as prefix tokens
  • LoRA for parameter-efficient tuning

Project Structure

QSP-TAT/
├─ ckpt/                 # Training outputs (LoRA + prefix adapter)
├─ data/                 # Datasets + Q4CE/KGE embedding .pth files
├─ logs/                 # nohup logs for training/inference
├─ qsp_tat/              # Core code (train/eval/inject/kge/prompt)
β”œβ”€ templates/            # Prompt templates (alpaca, etc.)
β”œβ”€ requirements.txt      # Python dependencies
└─ README.md

Setup

1) Create / activate environment

Example (conda):

conda create -n qsp python=3.10 -y
conda activate qsp

2) Install dependencies

pip install -r requirements.txt

3) Run from repo root

All commands below assume:

cd QSP-TAT

Data Format

  • instruction: string
  • input: string
  • output: string (e.g., "True" / "False")
  • embedding_ids: list of 3 integers [h, r, t] (required)

Example:

{
  "instruction": "Given a triple..., determine True/False.",
  "input": "head | relation | tail",
  "output": "True",
  "embedding_ids": [12, 3, 45]
}
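
A minimal validation sketch, assuming each data file is a JSON array of such records (the helper name and the example path are illustrative):

import json

def check_records(path):
    """Sanity-check the QSP-TAT data format described above."""
    with open(path, "r", encoding="utf-8") as f:
        records = json.load(f)  # assumes a JSON array of objects
    for i, rec in enumerate(records):
        for key in ("instruction", "input", "output"):
            assert isinstance(rec.get(key), str), f"record {i}: '{key}' must be a string"
        ids = rec.get("embedding_ids")
        assert isinstance(ids, list) and len(ids) == 3, f"record {i}: embedding_ids must be [h, r, t]"
        assert all(isinstance(x, int) for x in ids), f"record {i}: all embedding_ids must be integers"
    print(f"{len(records)} records OK")

# check_records("data/UMLS/UMLS-train.json")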

Q4CE Embedding Checkpoint

--kge_path should point to a PyTorch .pth embedding checkpoint (Q4CE/KGE).
The loader accepts common KGE key names (e.g., ent_embeddings.weight, rel_embeddings.weight) as well as the checkpoint's native keys.

Example:

data/UMLS/UMLS-Q4CE.pth
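
A hedged sketch for inspecting such a checkpoint before training; any key names beyond the two listed above are illustrative and depend on how the Q4CE export was produced:

import torch

# Load the Q4CE/KGE checkpoint on CPU and list its tensors.
state = torch.load("data/UMLS/UMLS-Q4CE.pth", map_location="cpu")

# Some exports wrap tensors in a "state_dict" entry; unwrap if present.
if isinstance(state, dict) and "state_dict" in state:
    state = state["state_dict"]

for name, tensor in state.items():
    if hasattr(tensor, "shape"):
        print(f"{name}: {tuple(tensor.shape)}")

# Expected output includes lines such as:
#   ent_embeddings.weight: (num_entities, dim)
#   rel_embeddings.weight: (num_relations, dim)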

Training (Multistage, Sharded)

Training uses single-process sharded loading via Transformers device_map="auto".

  • ✅ Run with python (single process)
  • ❌ Do NOT use torchrun / DDP
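
Sharded here means the base model's layers are spread across all visible GPUs inside one Python process. A minimal sketch of what that loading typically looks like (the bf16 dtype and the trust_remote_code flag are assumptions, not taken from the training script):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# One process, many GPUs: Accelerate places layers across all visible devices.
model = AutoModelForCausalLM.from_pretrained(
    "models/openPangu-Embedded-7B",
    torch_dtype=torch.bfloat16,   # assumption; fall back to float16 if bf16 is unsupported
    device_map="auto",            # shard layers over CUDA_VISIBLE_DEVICES
    trust_remote_code=True,       # assumption, in case the model ships custom code
)
tokenizer = AutoTokenizer.from_pretrained("models/openPangu-Embedded-7B", trust_remote_code=True)

print(model.hf_device_map)  # shows which layer landed on which GPU

Launching the same script with torchrun would start several such processes, each trying to shard the full model over every GPU, which is why DDP is disallowed here.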

Command (example: UMLS)

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
nohup python -u -m qsp_tat.train \
  --base_model "models/openPangu-Embedded-7B" \
  --data_path "data/UMLS/UMLS-train.json" \
  --out "ckpt/pangu-7B-Q4CE-UMLS" \
  --n_prefix 1 \
  --kge_path "data/UMLS/UMLS-Q4CE.pth" \
  --lora_r 64 \
  --lora_targets "q_proj,k_proj,v_proj,o_proj" \
  --s0_ep 1 \
  --s0_pick head \
  --s0_lr 5e-4 \
  --s1_ep 1 \
  --s1_lr 5e-4 \
  --s2_ep 2 \
  --s2_lr_lora 3e-4 \
  --s2_lr_adp 3e-5 \
  --s2_drop 0.1 \
  --bs 24 \
  --mbs 12 \
  --save no \
  > "logs/train/log_pangu-7B_Q4CE_UMLS.txt" 2>&1 &

Outputs

Training produces stage folders under --out:

ckpt/pangu-7B-Q4CE-UMLS/
├─ s0/   # optional entity-aware warmup
├─ s1/   # adapter-only stage
└─ s2/   # joint LoRA + adapter stage (recommended for eval)

Each stage directory contains:

  • LoRA weights (PEFT save_pretrained)
  • Prefix adapter: emb.pth
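
A hedged sketch of loading a stage-2 directory outside the provided eval script (the contents of emb.pth are an assumption; qsp_tat.eval handles this loading for you):

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

stage_dir = "ckpt/pangu-7B-Q4CE-UMLS/s2"

base = AutoModelForCausalLM.from_pretrained(
    "models/openPangu-Embedded-7B", device_map="auto", trust_remote_code=True
)
# The LoRA weights were written with PEFT's save_pretrained, so PeftModel can reload them.
model = PeftModel.from_pretrained(base, stage_dir)

# The prefix adapter is a plain PyTorch state dict; its exact keys depend on qsp_tat.inject.
prefix_state = torch.load(f"{stage_dir}/emb.pth", map_location="cpu")
print(sorted(prefix_state.keys()))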

Evaluation / Inference

Use the stage-2 directory (s2) by default.

Command (example: UMLS valid set)

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
nohup python -u -m qsp_tat.eval \
  --base_model "models/openPangu-Embedded-7B" \
  --lora_dir "ckpt/pangu-7b-Q4CE-UMLS/s2" \
  --test_json "data/UMLS/UMLS-valid.json" \
  --max_new_tokens 128 \
  > "logs/infer/log_pangu-7B-Q4CE_infer_valid_UMLS.txt" 2>&1 &

Metrics

The eval script reports:

  • Accuracy
  • Precision
  • Recall
  • F1
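
These are the standard binary-classification metrics computed over the "True"/"False" outputs. A minimal sketch of the computation (the string matching used to turn generations into labels is an assumption about the eval script):

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def score(gold_labels, generated_texts):
    """gold_labels: list of 'True'/'False'; generated_texts: raw model generations."""
    y_true = [1 if g.strip().lower() == "true" else 0 for g in gold_labels]
    # Assumption: a generation counts as positive if it contains "true".
    y_pred = [1 if "true" in t.lower() else 0 for t in generated_texts]
    acc = accuracy_score(y_true, y_pred)
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary", zero_division=0
    )
    return {"accuracy": acc, "precision": p, "recall": r, "f1": f1}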

Logs & Monitoring

Tail logs

tail -f "logs/train/log_pangu-7B_Q4CE_UMLS.txt"
tail -f "logs/infer/log_pangu-7B-Q4CE_infer_valid_UMLS.txt"

Check running jobs

ps -ef | grep qsp_tat

Notes

  • This pipeline injects Q4CE embedding prefixes in front of the token embeddings and fine-tunes the LLM with LoRA (see the sketch after this list).
  • Multi-GPU sharding is handled by Transformers (device_map="auto"). Use single-process execution only.
  • Keep qsp_tat/ and templates/ at repo root when running via -m qsp_tat.*.
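
A minimal sketch of the prefix-injection idea (the adapter module and its names are illustrative, not the actual qsp_tat implementation): the frozen Q4CE vectors for [h, r, t] are projected to the LLM hidden size and concatenated in front of the prompt's token embeddings.

import torch
import torch.nn as nn

class PrefixAdapter(nn.Module):
    """Maps frozen KGE vectors for [h, r, t] into LLM-hidden-size prefix embeddings."""
    def __init__(self, kge_dim: int, hidden_size: int, n_prefix: int = 1):
        super().__init__()
        # 3 * kge_dim because the head, relation and tail vectors are concatenated.
        self.proj = nn.Linear(3 * kge_dim, n_prefix * hidden_size)
        self.n_prefix = n_prefix
        self.hidden_size = hidden_size

    def forward(self, hrt_vectors: torch.Tensor) -> torch.Tensor:
        # hrt_vectors: (batch, 3, kge_dim) -> (batch, n_prefix, hidden_size)
        flat = hrt_vectors.flatten(start_dim=1)
        return self.proj(flat).view(-1, self.n_prefix, self.hidden_size)

# Usage: prepend the prefix to the prompt's token embeddings before the forward pass.
# token_embeds = model.get_input_embeddings()(input_ids)            # (batch, seq, hidden)
# inputs_embeds = torch.cat([adapter(hrt_vectors), token_embeds], dim=1)
# outputs = model(inputs_embeds=inputs_embeds, attention_mask=...)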