EvoLlama
EvoLlama is a multimodal framework for protein understanding that connects a structure-based protein encoder and a sequence-based protein encoder to an LLM through a two-stage training process (projection tuning followed by supervised fine-tuning). For more details, please refer to our paper: EvoLlama: Enhancing LLMs' Understanding of Proteins via Multimodal Structure and Sequence Representations.
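As a rough illustration of this design, the sketch below shows how the two protein encoders could feed an LLM through linear projections. All class and attribute names here are hypothetical and do not reflect the repository's actual API.

```python
# Hypothetical sketch of an EvoLlama-style forward pass (illustrative only).
import torch
import torch.nn as nn

class MultimodalProteinLM(nn.Module):
    def __init__(self, structure_encoder, sequence_encoder, llm, llm_dim):
        super().__init__()
        self.structure_encoder = structure_encoder  # e.g., ProteinMPNN or GearNet
        self.sequence_encoder = sequence_encoder    # e.g., ESM-2 650M
        self.llm = llm                              # e.g., Llama-3-8B-Instruct
        # Linear projections map each encoder's hidden size into the LLM's
        # embedding space; in stage 1 (projection tuning) only these are trained.
        self.struct_proj = nn.Linear(structure_encoder.hidden_size, llm_dim)
        self.seq_proj = nn.Linear(sequence_encoder.hidden_size, llm_dim)

    def forward(self, structure_inputs, sequence_inputs, text_embeds):
        struct_tokens = self.struct_proj(self.structure_encoder(structure_inputs))
        seq_tokens = self.seq_proj(self.sequence_encoder(sequence_inputs))
        # Projected protein representations are concatenated with the text
        # embeddings and consumed by the LLM as soft prompt tokens.
        inputs_embeds = torch.cat([struct_tokens, seq_tokens, text_embeds], dim=1)
        return self.llm(inputs_embeds=inputs_embeds)
```

Stage 2 (supervised fine-tuning) then adapts the model to downstream protein tasks; the Model Family table below lists the stage and datasets behind each released checkpoint.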
Quickstart
For more details, please refer to our GitHub repository.
Model Family
| Models | Stages | Datasets | Structure Source | Paths |
|---|---|---|---|---|
| EvoLlama (ProteinMPNN + ESM-2) | Projection Tuning | SwissProt | AlphaFold-2 | projection_tuning/protein_mpnn_esm2_650m |
| EvoLlama (ProteinMPNN + ESM-2) | Supervised Fine-tuning | PMol + PEER | ESMFold | supervised_fine_tuning/protein_mpnn_esm2_650m |
| EvoLlama (GearNet + ESM-2) | Projection Tuning | SwissProt | AlphaFold-2 | Coming soon ... |
| EvoLlama (GearNet + ESM-2) | Supervised Fine-tuning | PMol + PEER | ESMFold | Coming soon ... |
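The Paths column above names subfolders within the released checkpoints. Below is a minimal download sketch, assuming the weights are hosted in a single Hugging Face repository; the `repo_id` is a placeholder.

```python
from huggingface_hub import snapshot_download

# Fetch only one checkpoint subfolder; the pattern mirrors the Paths column.
local_dir = snapshot_download(
    repo_id="ORG/EvoLlama",  # placeholder -- substitute this model card's repo id
    allow_patterns=["supervised_fine_tuning/protein_mpnn_esm2_650m/*"],
)
print(local_dir)
```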
Model Architecture
EvoLlama is initialized with the weights of the following models:
| Models | Links |
|---|---|
| ProteinMPNN | [Link](https://github.com/dauparas/ProteinMPNN) |
| GearNet | [Link](https://github.com/DeepGraphLearning/GearNet) |
| ESM-2 650M (facebook/esm2_t33_650M_UR50D) | [Link](https://huggingface.co/facebook/esm2_t33_650M_UR50D) |
| Llama-3 (meta-llama/Meta-Llama-3-8B-Instruct) | [Link](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) |
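The two Hugging Face-hosted components can be pulled directly with transformers; ProteinMPNN and GearNet are distributed as research code rather than transformers checkpoints, so they must be obtained from their own repositories. A minimal sketch:

```python
from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer

# Sequence encoder: ESM-2 650M.
seq_encoder = AutoModel.from_pretrained("facebook/esm2_t33_650M_UR50D")

# LLM backbone: Llama-3-8B-Instruct (gated; requires accepting the license).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
llm = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```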
Citation
```bibtex
@misc{liu2024evollama,
  title={EvoLlama: Enhancing LLMs' Understanding of Proteins via Multimodal Structure and Sequence Representations},
  author={Nuowei Liu and Changzhi Sun and Tao Ji and Junfeng Tian and Jianxin Tang and Yuanbin Wu and Man Lan},
  year={2024},
  eprint={2412.11618},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2412.11618},
}
```