EvoLlama

EvoLlama is a multimodal framework that connects a structure-based protein encoder, a sequence-based protein encoder, and an LLM for protein understanding through a two-stage training process. For more details, please refer to our paper: EvoLlama: Enhancing LLMs' Understanding of Proteins via Multimodal Structure and Sequence Representations.

Quickstart

For more details, please refer to our GitHub repository.

Model Family

| Models | Stages | Datasets | PDB | Paths |
| --- | --- | --- | --- | --- |
| EvoLlama (ProteinMPNN + ESM-2) | Projection Tuning | SwissProt | AlphaFold-2 | projection_tuning/protein_mpnn_esm2_650m |
| EvoLlama (ProteinMPNN + ESM-2) | Supervised Fine-tuning | PMol + PEER | ESMFold | supervised_fine_tuning/protein_mpnn_esm2_650m |
| EvoLlama (GearNet + ESM-2) | Projection Tuning | SwissProt | AlphaFold-2 | Coming soon ... |
| EvoLlama (GearNet + ESM-2) | Supervised Fine-tuning | PMol + PEER | ESMFold | Coming soon ... |
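
The paths listed for each checkpoint can be turned into a download filter. The sketch below is illustrative only: it assumes the paths are sub-folders of this model repository, and the names `CHECKPOINT_PATHS`, `checkpoint_patterns`, `download_checkpoint`, and the short keys such as `"protein_mpnn"` are hypothetical helpers, not part of the EvoLlama code base. The repository id is left as a parameter because it is not stated here. The only third-party call used is `huggingface_hub.snapshot_download` with its `allow_patterns` argument.

```python
from typing import Dict, List, Optional, Tuple

# Sub-folders copied from the Model Family table.
# None marks checkpoints listed as "Coming soon ...".
# The dictionary keys are hypothetical short names chosen for this sketch.
CHECKPOINT_PATHS: Dict[Tuple[str, str], Optional[str]] = {
    ("protein_mpnn", "projection_tuning"): "projection_tuning/protein_mpnn_esm2_650m",
    ("protein_mpnn", "supervised_fine_tuning"): "supervised_fine_tuning/protein_mpnn_esm2_650m",
    ("gearnet", "projection_tuning"): None,
    ("gearnet", "supervised_fine_tuning"): None,
}


def checkpoint_patterns(structure_encoder: str, stage: str) -> List[str]:
    """Return glob patterns that select one checkpoint sub-folder."""
    key = (structure_encoder, stage)
    if key not in CHECKPOINT_PATHS:
        raise KeyError(f"unknown checkpoint: {key}")
    path = CHECKPOINT_PATHS[key]
    if path is None:
        raise ValueError(f"checkpoint not released yet: {key}")
    return [f"{path}/*"]


def download_checkpoint(repo_id: str, structure_encoder: str, stage: str) -> str:
    """Download only the selected sub-folder and return the local snapshot directory.

    Assumes the sub-folder exists in the repository named by `repo_id`.
    """
    # Imported lazily so the helpers above work without the library installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=checkpoint_patterns(structure_encoder, stage),
    )
```

For example, `checkpoint_patterns("protein_mpnn", "supervised_fine_tuning")` selects the supervised fine-tuning checkpoint, and passing the same arguments to `download_checkpoint` together with the repository id would fetch only that folder. How the downloaded weights are then loaded is covered by the GitHub repository, not by this sketch.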

Model Architecture

EvoLlama is initialized with the weights of the following models:

| Models | Links |
| --- | --- |
| ProteinMPNN | Link |
| GearNet | Link |
| ESM-2 650M (facebook/esm2_t33_650M_UR50D) | Link |
| Llama-3 (meta-llama/Meta-Llama-3-8B-Instruct) | Link |

Citation

@misc{liu2024evollama,
    title={EvoLlama: Enhancing LLMs' Understanding of Proteins via Multimodal Structure and Sequence Representations}, 
    author={Nuowei Liu and Changzhi Sun and Tao Ji and Junfeng Tian and Jianxin Tang and Yuanbin Wu and Man Lan},
    year={2024},
    eprint={2412.11618},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/2412.11618}, 
}