Model Card for vuteco-dsc-fnd

vuteco-dsc-fnd is a fine-tuned DeepSeek Coder 6.7B Instruct that classifies JUnit test methods into two classes:

Witnessing if it is testing for a vulnerability.
Unknown if it is unclear whether it is testing for a vulnerability.

Model Details

Model Description

VuTeCo is a framework for finding vulnerability-witnessing test cases in Java repositories (Finding) and matching them with the right known vulnerability (Matching). More information is available in its GitHub repository.

This model (vuteco-dsc-fnd) is a fine-tuned DeepSeek Coder 6.7B Instruct with a simple classification prompt.

This model is used in VuTeCo for the "Finding" task, which classifies JUnit test methods into two classes:

Witnessing if it is testing for a vulnerability.
Unknown if it is unclear whether it is testing for a vulnerability.

The model input is the tokenized raw text of a JUnit test method, with no preprocessing.
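
As a rough illustration of this input contract, the sketch below keeps a JUnit test method as an unmodified string and maps a score to one of the two labels. It is not VuTeCo's API: `classify_test_method`, `score_fn`, the 0.5 threshold, and the sample test (including its `Unzipper` class) are all hypothetical, and `score_fn` merely stands in for the tokenizer and fine-tuned model, which should be loaded as described in VuTeCo's GitHub repository.

```python
from typing import Callable

# Hypothetical JUnit test method, kept exactly as written in the source file
# (comments, annotations, and whitespace are left untouched).
JUNIT_TEST_METHOD = '''@Test
public void testZipSlipEntryIsRejected() throws Exception {
    // An archive entry named "../evil.txt" must not escape the target directory.
    File archive = new File("src/test/resources/zip-slip.zip");
    assertThrows(IOException.class, () -> Unzipper.extract(archive, targetDir));
}'''


def classify_test_method(
    source: str,
    score_fn: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the model card
) -> str:
    """Return "Witnessing" or "Unknown" for the raw text of one test method.

    score_fn is a placeholder for the real pipeline (tokenizer + model); it is
    assumed to return the probability that the test witnesses a vulnerability.
    """
    probability = score_fn(source)  # the raw text is passed through unchanged
    return "Witnessing" if probability >= threshold else "Unknown"


if __name__ == "__main__":
    # Dummy scorer used only to make the sketch runnable without the model.
    print(classify_test_method(JUNIT_TEST_METHOD, lambda _source: 0.9))
```

Passing the scorer in as a function keeps the sketch independent of how the model is actually loaded, so the placeholder can later be swapped for the real pipeline.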

Developed by: Hamburg University of Technology
Funded by: Sec4AI4Sec (Horizon EU)
Shared by: Hugging Face
Model type: Text Classification
Language(s) (NLP): en
License: Apache-2.0
Finetuned from model: DeepSeek Coder 6.7B Instruct

Model Sources

Repository: VuTeCo's GitHub repository
Paper: MSR'26 paper

Uses

Direct Use

The model can be used right away to classify JUnit test methods as Witnessing or Unknown.

Downstream Use

The model can be further fine-tuned to classify specific types of vulnerability-witnessing tests, e.g., distinguishing the exact vulnerability type being tested.

It could also be fine-tuned for other testing frameworks (beyond JUnit) and other programming languages (e.g., Python).

Out-of-Scope Use

N/A

Bias, Risks, and Limitations

The model's predictions may be inaccurate, i.e., test methods may be misclassified. In particular, the reported performance shows that the model has limited recall: it often labels test methods as Unknown even when they witness a vulnerability.
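
To make the recall limitation concrete, the following sketch computes precision and recall for the Witnessing class from confusion-matrix counts. The counts are made-up placeholders chosen only for illustration; they are not the model's reported results, which are given in the MSR'26 paper.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision and recall for the positive (Witnessing) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts, NOT results from the paper:
    #   tp = witnessing tests labeled Witnessing
    #   fp = other tests labeled Witnessing
    #   fn = witnessing tests labeled Unknown (the misses that lower recall)
    precision, recall = precision_recall(tp=30, fp=5, fn=70)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these placeholder counts, most Witnessing predictions are correct, yet 70 of the 100 witnessing tests are labeled Unknown. This is the pattern that limited recall describes.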

Recommendations

Manually validate the predictions made by the model.

How to Get Started with the Model

Please refer to VuTeCo's GitHub repository for instructions on loading and using the model correctly.

Training Details

Training Data

This model was fine-tuned on Java repositories and vulnerabilities from Vul4J. Please refer to VuTeCo's GitHub repository for loading the dataset in the correct way.

Training Procedure

Please refer to VuTeCo's GitHub repository for customizing the model training.

Evaluation

Please refer to VuTeCo's GitHub repository for customizing the model evaluation.

Results

Please refer to the MSR'26 paper for an overview of the main evaluation results. The complete raw results can be found in the paper's online appendix on Zenodo.

Environmental Impact

N/A

Citation

If you use this model, please cite the MSR'26 paper (the publisher's reference will be available soon):

BibTeX:

@misc{iannone2026matchheavenaidrivenmatching,
    title={A Match Made in Heaven? AI-driven Matching of Vulnerabilities and Security Unit Tests}, 
    author={Emanuele Iannone and Quang-Cuong Bui and Riccardo Scandariato},
    year={2026},
    eprint={2502.03365},
    archivePrefix={arXiv},
    primaryClass={cs.SE},
    url={https://arxiv.org/abs/2502.03365}, 
}