ByteDance-Seed
/

ConfRover-interp-20M-v1.0

Other

confrover

yuningshen committed 17 days ago

Commit

0264329

verified ·

1 Parent(s): 7ca4811

Upload folder using huggingface_hub

Files changed (2)

README.md +169 -3
confrover_interp_20m_v1_0.pt +3 -0

README.md CHANGED

@@ -1,3 +1,169 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+variant: interp
+size: 20M
+version: v1.0
+model_summary: ConfRover base model trained for conformation interpolation
+model_description: '
+  ConfRover is a deep generative model for protein 3D conformation and motion dynamics.
+  It uses a diffusion probabilistic model to learn the distribution of protein 3D
+  conformations and captures the temporal dependencies between frames through
+  temporal causal transformers.
+  Models are trained on molecular dynamics (MD) trajectory data and can generate
+  protein conformation ensembles and motion trajectories conditioned on the input
+  protein amino acid sequence.
+  This variant was further trained from the base model on an additional conformation
+  interpolation task.'
+recommend: For interpolation tasks
+model_id: ConfRover-interp-20M-v1.0
+name: ConfRover
+repo: https://github.com/ByteDance-Seed/ConfRover
+paper: https://arxiv.org/abs/2505.17478
+demo: https://ByteDance-Seed.github.io/ConfRover
+get_started_code: "\n```python\nfrom confrover import ConfRover\n\nmodel = ConfRover.from_pretrained(<model_name>)\n\
+  \nmodel.to(\"cuda\")\n\nmodel.generate(\n    case_id=<case_name>,\n    seqres=<amino_acid_sequence>,\n\
+  \    output_dir=</path/to/save/output>,\n    task_mode=<\"forward\"|\"iid\"|\"interp\"\
+  >,\n    n_replicates=<int>, # number of replicated trajectories (forward and interp)\
+  \ or total number of conformation samples (iid)\n    n_frames=<int>, # number of\
+  \ frames in the trajectory, including the conditioning frames.\n    stride_in_10ps=256,\
+  \ # time interval between frames in the unit of 10 ps.\n    conditions=..., # information\
+  \ for conditioning frames for forward simulation and interp. See `ConfRover.generate`\
+  \ for more details.\n)\n```\n"
+model_specs: '
+  ConfRover consists of an encoder, a temporal module, and a diffusion decoder.
+  - The encoder maps the input amino acid sequence (through a folding model) and coordinates
+  of context frames to a latent representation.
+  - The temporal module models the temporal dependencies between frames using an interleaving
+  of causal transformers (across the temporal dimension) and pairformers (to update
+  structures).
+  - The diffusion decoder learns the probability distribution of protein conformations
+  and generates samples conditioned on the input sequence and conditioning representation.
+  '
+bias_risks_limitations: '
+  ConfRover is trained on a limited set of MD trajectories and may not generalize well
+  to out-of-distribution data.
+  The quality of generated conformations is also limited by the quality of the input
+  data and the computational resources.
+  Currently, ConfRover only supports protein conformation generation and models the
+  coordinates of heavy atoms.
+  '
+citation_bibtex: "\n```text\n@article{confrover2025,\n  title={Simultaneous Modeling\
+  \ of Protein Conformation and Dynamics via Autoregression},\n  author={Shen, Yuning\
+  \ and Wang, Lihao and Yuan, Huizhuo and Wang, Yan and Yang, Bangji and Gu, Quanquan},\n\
+  \  journal={arXiv preprint arXiv:2505.17478},\n  year={2025}\n}\n```\n"
+---
+# Model Card for `ConfRover-interp-20M-v1.0`
+<!-- Provide a quick summary of what the model is/does. -->
+ConfRover base model trained for conformation interpolation
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+ConfRover is a deep generative model for protein 3D conformation and motion dynamics.
+It uses a diffusion probabilistic model to learn the distribution of protein 3D conformations and captures the temporal dependencies between frames through temporal causal transformers.
+Models are trained on molecular dynamics (MD) trajectory data and can generate protein conformation ensembles and motion trajectories conditioned on the input protein amino acid sequence.
+This variant was further trained from the base model on an additional conformation interpolation task.
+**Basic info**
+| Model ID | ConfRover-interp-20M-v1.0 |
+|:--|:--|
+| **Variant** | interp |
+| **Size** | 20M |
+| **Version** | v1.0 |
+| **Recommend** | For interpolation tasks |
+| **License** | Apache-2.0 |
+### Model Sources
+<!-- Provide the basic links for the model. -->
+- **Repository:** https://github.com/ByteDance-Seed/ConfRover
+- **Paper:** https://arxiv.org/abs/2505.17478
+- **Website:** https://ByteDance-Seed.github.io/ConfRover
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+from confrover import ConfRover
+model = ConfRover.from_pretrained(<model_name>)
+model.to("cuda")
+model.generate(
+    case_id=<case_name>,
+    seqres=<amino_acid_sequence>,
+    output_dir=</path/to/save/output>,
+    task_mode=<"forward"|"iid"|"interp">,
+    n_replicates=<int>, # number of replicated trajectories (forward and interp) or total number of conformation samples (iid)
+    n_frames=<int>, # number of frames in the trajectory, including the conditioning frames.
+    stride_in_10ps=256, # time interval between frames in the unit of 10 ps.
+    conditions=..., # information for conditioning frames for forward simulation and interp. See `ConfRover.generate` for more details.
+)
+```
+## Technical Specifications
+ConfRover consists of an encoder, a temporal module, and a diffusion decoder.
+- The encoder maps the input amino acid sequence (through a folding model) and coordinates of context frames to a latent representation.
+- The temporal module models the temporal dependencies between frames using an interleaving of causal transformers (across the temporal dimension) and pairformers (to update structures).
+- The diffusion decoder learns the probability distribution of protein conformations and generates samples conditioned on the input sequence and conditioning representation.
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+ConfRover is trained on a limited set of MD trajectories and may not generalize well to out-of-distribution data.
+The quality of generated conformations is also limited by the quality of the input data and the computational resources.
+Currently, ConfRover only supports protein conformation generation and models the coordinates of heavy atoms.
+## Citation
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+```text
+@article{confrover2025,
+  title={Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression},
+  author={Shen, Yuning and Wang, Lihao and Yuan, Huizhuo and Wang, Yan and Yang, Bangji and Gu, Quanquan},
+  journal={arXiv preprint arXiv:2505.17478},
+  year={2025}
+}
+```
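
Note on the getting-started snippet in the README above: `stride_in_10ps` is given in units of 10 ps and `n_frames` counts the conditioning frames, so the two together fix the time span a generated trajectory covers. The sketch below works through that arithmetic. The helper names are hypothetical and are not part of the `confrover` package; it also assumes frames are evenly spaced, so a trajectory of `n_frames` frames spans `n_frames - 1` intervals.

```python
# Hypothetical helpers (not part of confrover): convert the README's
# `stride_in_10ps` and `n_frames` arguments into physical time.

PS_PER_STRIDE_UNIT = 10  # the README states the stride is in units of 10 ps


def frame_interval_ps(stride_in_10ps: int) -> int:
    """Time between two consecutive frames, in picoseconds."""
    if stride_in_10ps <= 0:
        raise ValueError("stride_in_10ps must be positive")
    return stride_in_10ps * PS_PER_STRIDE_UNIT


def trajectory_span_ps(n_frames: int, stride_in_10ps: int) -> int:
    """Time from the first to the last frame, in picoseconds.

    Assumes evenly spaced frames, so n_frames frames span n_frames - 1 intervals.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    return (n_frames - 1) * frame_interval_ps(stride_in_10ps)


if __name__ == "__main__":
    # The stride of 256 comes from the README; n_frames=10 is an illustrative value.
    print(frame_interval_ps(256))        # 2560 ps, i.e. 2.56 ns between frames
    print(trajectory_span_ps(10, 256))   # 23040 ps, i.e. 23.04 ns end to end
```

With the README's `stride_in_10ps=256`, consecutive frames are 2.56 ns apart; the 10-frame trajectory length is only an example value.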

confrover_interp_20m_v1_0.pt ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:19a016b291fbb9f8d6f6a1d1a9fbbc959e2655c8f86610dc34c6e6c2e81fe52e
+size 78548240
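
The three lines above are a Git LFS pointer, not the checkpoint itself: they record the SHA-256 digest and byte size of the real `.pt` file. A minimal sketch of checking a downloaded checkpoint against this pointer follows. It uses only the Python standard library; the function names are hypothetical, and the local file path is an assumption that depends on where the checkpoint was saved.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large checkpoints are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the diff above (without the leading '+').
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:19a016b291fbb9f8d6f6a1d1a9fbbc959e2655c8f86610dc34c6e6c2e81fe52e
size 78548240
"""
pointer = parse_lfs_pointer(POINTER_TEXT)

if __name__ == "__main__":
    # Assumed local path; adjust to wherever the checkpoint was downloaded.
    local_file = Path("confrover_interp_20m_v1_0.pt")
    if local_file.exists():
        print("checkpoint matches pointer:", matches_pointer(local_file, pointer))
```

As a rough sanity check, 78,548,240 bytes divided by 4 bytes per value is about 19.6 million values, which would be in line with the stated 20M size if the weights are stored as 32-bit floats; the storage dtype is an assumption and serialization overhead is ignored.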