---
license: apache-2.0
tags:
  - biology
  - genomics
  - DNA
datasets:
  - arcinstitute/opengenome2
---

# Evo 2 20B, 1M context

Evo 2 is a state-of-the-art DNA language model trained autoregressively on trillions of DNA tokens.
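As a rough illustration of what "DNA tokens" and autoregressive training mean here (this is a standalone sketch, not Evo 2 code; the byte-value token ids are a hypothetical stand-in for the model's actual vocabulary):

```python
# Minimal sketch: DNA language models operate on single-nucleotide tokens,
# so a sequence maps to one token id per base. We use UTF-8 byte values
# as hypothetical ids purely for illustration.
sequence = "ACGTACGT"
token_ids = list(sequence.encode("utf-8"))  # one id per nucleotide

# Autoregressive training pairs each prefix with its next token:
pairs = [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]

print(token_ids)  # [65, 67, 71, 84, 65, 67, 71, 84]
print(pairs[0])   # ([65], 67) -> given 'A', predict 'C'
```

At a 1M-token context length, a single input can span an entire bacterial genome's worth of such nucleotide tokens.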

For instructions, details, and examples, please refer to the Evo 2 GitHub repository and the accompanying paper.

## Model Details

  • Base Model: Evo 2 20B
  • Context Length: 1 million tokens
  • Parameters: 20B
  • Architecture: 50 layers

## Main Evo 2 Checkpoints

Evo 2 40B, 20B, and 7B checkpoints, trained up to 1 million sequence length, are available here:

| Checkpoint name | Num layers | Num parameters |
| --------------- | ---------- | -------------- |
| evo2_40b        | 50         | 40B            |
| evo2_20b        | 50         | 20B            |
| evo2_7b         | 32         | 7B             |

We also share 40B, 7B, and 1B base checkpoints trained at a context length of 8192:

| Checkpoint name | Num layers | Num parameters |
| --------------- | ---------- | -------------- |
| evo2_40b_base   | 50         | 40B            |
| evo2_7b_base    | 32         | 7B             |
| evo2_1b_base    | 25         | 1B             |

## Usage

Please refer to the Evo 2 GitHub repository for detailed usage instructions and examples.