Model Card for vuteco-dsc-fnd
vuteco-dsc-fnd is a fine-tuned DeepSeek Coder 6.7B Instruct that classifies JUnit test methods into two classes:
- Witnessing if it is testing for a vulnerability.
- Unknown if it is unclear whether it is testing for a vulnerability.
Model Details
Model Description
VuTeCo is a framework for finding vulnerability-witnessing test cases in Java repositories (Finding) and matching them with the right known vulnerability (Matching). More information is available in its GitHub repository.
This model (vuteco-dsc-fnd) is a fine-tuned DeepSeek Coder 6.7B Instruct with a simple classification prompt.
This model is used in VuTeCo for the "Finding" task, where it classifies JUnit test methods into two classes:
- Witnessing if it is testing for a vulnerability.
- Unknown if it is unclear whether it is testing for a vulnerability.
The model input is the tokenized raw text of a JUnit test method, with no preprocessing.
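As an illustration of the "no preprocessing" input format, the sketch below passes a JUnit test method to a classifier exactly as written. The `classify_raw` helper, the `predict` callable, and the sample test are hypothetical and not part of VuTeCo; `predict` stands in for the real model, which should be loaded as described in VuTeCo's GitHub repository.

```python
from typing import Callable

LABELS = ("Witnessing", "Unknown")  # the two classes named in this card

# A JUnit test method kept as raw source text (hypothetical example).
JUNIT_TEST = """@Test
public void testRejectsPathTraversal() throws Exception {
    // Comments, annotations, and whitespace are all left in place.
    File target = new File(tempDir, "../evil.txt");
    assertThrows(IOException.class, () -> extractor.extract(target));
}"""


def classify_raw(method_source: str, predict: Callable[[str], str]) -> str:
    """Pass the method text to `predict` unchanged and validate the label.

    `predict` is an assumed stand-in for the fine-tuned model.
    """
    label = predict(method_source)  # no stripping or normalization
    if label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label


def dummy_predict(text: str) -> str:
    """Placeholder only: always answers Unknown. NOT the real model."""
    return "Unknown"


print(classify_raw(JUNIT_TEST, dummy_predict))
```

The only point of the sketch is that the text reaching the model is the method source as it appears in the repository, with annotations and comments included.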
- Developed by: Hamburg University of Technology
- Funded by: Sec4AI4Sec (Horizon EU)
- Shared by: Hugging Face
- Model type: Text Classification
- Language(s) (NLP): en
- License: Apache-2.0
- Finetuned from model: DeepSeek Coder 6.7B Instruct
Model Sources
- Repository: VuTeCo's GitHub repository
- Paper: MSR'26 paper
Uses
Direct Use
The model can be used right away to classify JUnit test methods as Witnessing or Unknown.
Downstream Use
The model can be further fine-tuned to classify specific types of vulnerability-witnessing tests, e.g., distinguishing the exact vulnerability type being tested.
It could also be fine-tuned for other testing frameworks (beyond JUnit) and programming languages (e.g., Python).
Out-of-Scope Use
N/A
Bias, Risks, and Limitations
The model predictions may be inaccurate (misclassified test methods).
In particular, the reported performance shows that the model has limited recall, so it often labels vulnerability-witnessing tests as Unknown.
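To make "limited recall" concrete: recall is the share of truly vulnerability-witnessing tests that the model labels Witnessing. The sketch below computes precision and recall from confusion counts. The counts are invented for illustration and are not results from the paper.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) for the Witnessing class.

    tp: witnessing tests predicted Witnessing
    fp: non-witnessing tests predicted Witnessing
    fn: witnessing tests predicted Unknown (the misses)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, chosen only to show the low-recall pattern.
precision, recall = precision_recall(tp=30, fp=5, fn=70)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With counts like these, most Witnessing predictions are correct, yet 70 of the 100 witnessing tests are missed, so an Unknown label should not be read as "not security-related".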
Recommendations
Manually validate the predictions made by the model.
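One possible way to organize manual validation is sketched below: queue every test predicted Witnessing for review, plus a random sample of Unknown predictions to spot missed cases. The `review_queue` helper, its parameters, and the test names are hypothetical and not part of VuTeCo.

```python
import random
from typing import Dict, List


def review_queue(predictions: Dict[str, str],
                 unknown_sample: int,
                 seed: int = 0) -> List[str]:
    """Build a list of test names to inspect by hand.

    predictions maps a test name to "Witnessing" or "Unknown".
    All Witnessing predictions are included; Unknown ones are sampled.
    """
    witnessing = sorted(n for n, lab in predictions.items() if lab == "Witnessing")
    unknown = sorted(n for n, lab in predictions.items() if lab == "Unknown")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sampled = rng.sample(unknown, min(unknown_sample, len(unknown)))
    return witnessing + sorted(sampled)


# Hypothetical predictions for four test methods.
preds = {
    "FooTest.testTraversal": "Witnessing",
    "FooTest.testParse": "Unknown",
    "BarTest.testXxe": "Witnessing",
    "BarTest.testRoundTrip": "Unknown",
}
print(review_queue(preds, unknown_sample=1))
```

Sampling the Unknown predictions matters here because of the model's limited recall: some witnessing tests are expected to end up in that group.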
How to Get Started with the Model
Please refer to VuTeCo's GitHub repository for loading and using the model in the correct way.
Training Details
Training Data
This model was fine-tuned on Java repositories and vulnerabilities from Vul4J. Please refer to VuTeCo's GitHub repository for loading the dataset in the correct way.
Training Procedure
Please refer to VuTeCo's GitHub repository for customizing the model training.
Evaluation
Please refer to VuTeCo's GitHub repository for customizing the model evaluation.
Results
Please refer to the MSR'26 paper for an overview of the main evaluation results. The complete raw results can be found in the paper's online appendix on Zenodo.
Environmental Impact
N/A
Citation
If you use this model, please cite the MSR'26 paper (the publisher's reference will be available soon):
BibTeX:
@misc{iannone2026matchheavenaidrivenmatching,
title={A Match Made in Heaven? AI-driven Matching of Vulnerabilities and Security Unit Tests},
author={Emanuele Iannone and Quang-Cuong Bui and Riccardo Scandariato},
year={2026},
eprint={2502.03365},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2502.03365},
}
Model Card Authors
Model tree for emaiannone/vuteco-dsc-fnd
Base model
deepseek-ai/deepseek-coder-6.7b-instruct