PhysicsNeMo Checkpoints: Atlas
Description:
Atlas is a generative AI model for medium-range weather forecasting that autoregressively predicts ERA5 variables on a global 0.25-degree latitude-longitude grid.
The model can be sampled multiple times given a single input to produce multiple ensemble members, enabling rapid generation of skillful medium-range ensemble forecasts.
For inference see NVIDIA Earth2Studio.
This model is for research and development only.
License/Terms of Use:
Governing Terms: Use of this model is governed by the NVIDIA Open Model License.
Deployment Geography:
Global
Use Case:
Global medium-range ensemble weather forecasting
Release Date:
Hugging Face: 01/26/2026 via https://huggingface.co/nvidia/atlas-era5
Model Architecture
Architecture Type: Atlas uses a latent diffusion transformer (DiT) architecture and generates samples using stochastic interpolants.
Network Architecture: Latent diffusion Transformer (DiT) with 2.5B parameters, plus a decoder DiT with 1.8B parameters that uses 2D neighborhood attention
Input:
Input Type(s):
- Tensor (75 state variables from ERA5)
- DateTime (NumPy Array)
Input Format(s): PyTorch Tensor / NumPy array
Input Parameters:
- Five Dimensional (5D) (batch, lead time, variable, latitude, longitude)
- Input DateTime (1D)
Other Properties Related to Input:
- Input grid (latitude/longitude) is a global 721x1440 equiangular grid.
- Input lead time is of size 2: the current time step and the previous time step, 6 hours earlier.
- Input state ERA5 variables:
u10m,v10m,u100m,v100m,t2m,sp,msl,tcwv,u50,u100,u150,u200,u250,u300,u400,u500,u600,u700,u850,u925,u1000,v50,v100,v150,v200,v250,v300,v400,v500,v600,v700,v850,v925,v1000,z50,z100,z150,z200,z250,z300,z400,z500,z600,z700,z850,z925,z1000,t50,t100,t150,t200,t250,t300,t400,t500,t600,t700,t850,t925,t1000,q50,q100,q150,q200,q250,q300,q400,q500,q600,q700,q850,q925,q1000,sst,tp
For variable naming information, review the Earth2Studio lexicon.
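As a minimal sketch of the input layout described above, the following NumPy snippet builds the 75-variable list, the coordinates of the 721x1440 equiangular grid, and the resulting 5D input shape. The north-to-south latitude ordering, the 0 to 359.75 degree longitude range, and the helper names (`PRESSURE_LEVELS`, `build_variables`, `grid_coords`, `input_shape`) are assumptions made for illustration; the Earth2Studio lexicon remains the authoritative source for variable names and ordering.

```python
import numpy as np

# Pressure levels (hPa) that appear in the variable list of this card.
PRESSURE_LEVELS = [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000]


def build_variables():
    """Return the 75 variable names in the order listed in this card."""
    surface = ["u10m", "v10m", "u100m", "v100m", "t2m", "sp", "msl", "tcwv"]
    # Five pressure-level fields (u, v, z, t, q), each on 13 levels.
    upper = [f"{name}{level}" for name in "uvztq" for level in PRESSURE_LEVELS]
    return surface + upper + ["sst", "tp"]


def grid_coords():
    """Return (lat, lon) for a 0.25-degree global grid.

    Assumption: latitude runs from 90 to -90 and longitude from 0 to 359.75.
    """
    lat = np.linspace(90.0, -90.0, 721)
    lon = np.arange(1440) * 0.25
    return lat, lon


def input_shape(batch=1):
    """Shape of the input: (batch, lead time, variable, latitude, longitude)."""
    lat, lon = grid_coords()
    return (batch, 2, len(build_variables()), lat.size, lon.size)
```

With these helpers, `input_shape()` evaluates to `(1, 2, 75, 721, 1440)`. A single such input would occupy roughly 0.62 GB if stored as float32; the precision is an assumption, since the card does not state it.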
Output:
Output Type(s): Tensor (75 state variables from ERA5)
Output Format: PyTorch Tensor
Output Parameters: Five Dimensional (5D) (batch, lead time, variable, latitude, longitude)
Other Properties Related to Output:
- Output grid (latitude/longitude) is a global 721x1440 equiangular grid.
- Output lead time is of size 1, predicting 6 hours in the future.
- Output state ERA5 variables:
u10m,v10m,u100m,v100m,t2m,sp,msl,tcwv,u50,u100,u150,u200,u250,u300,u400,u500,u600,u700,u850,u925,u1000,v50,v100,v150,v200,v250,v300,v400,v500,v600,v700,v850,v925,v1000,z50,z100,z150,z200,z250,z300,z400,z500,z600,z700,z850,z925,z1000,t50,t100,t150,t200,t250,t300,t400,t500,t600,t700,t850,t925,t1000,q50,q100,q150,q200,q250,q300,q400,q500,q600,q700,q850,q925,q1000,sst,tp
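Because the model consumes two time steps and emits one, a multi-step forecast requires feeding each prediction back in as the newest input step. The sketch below illustrates that autoregressive loop, and the repeated sampling that yields ensemble members, using NumPy and a toy stand-in for the model. Everything here is hypothetical: `toy_step`, `rollout`, and `ensemble` are illustrative names, the stand-in only adds noise to the latest state, and the [previous, current] ordering along the lead-time axis is an assumption. For actual inference, use NVIDIA Earth2Studio.

```python
import numpy as np


def toy_step(state, rng):
    """Hypothetical stand-in for one stochastic model step (NOT Atlas).

    state: (batch, 2, variable, lat, lon), assumed ordered [previous, current].
    Returns a (batch, 1, variable, lat, lon) array for the next 6-hour step.
    """
    latest = state[:, -1:]
    return latest + 0.01 * rng.standard_normal(latest.shape)


def rollout(initial, n_steps, step_fn, rng):
    """Autoregressive rollout: each output becomes the newest input step."""
    state = initial
    outputs = []
    for _ in range(n_steps):
        pred = step_fn(state, rng)  # (batch, 1, variable, lat, lon)
        outputs.append(pred)
        # Drop the oldest step and append the prediction, keeping 2 input steps.
        state = np.concatenate([state[:, 1:], pred], axis=1)
    return np.concatenate(outputs, axis=1)  # (batch, n_steps, variable, lat, lon)


def ensemble(initial, n_members, n_steps, step_fn, seed=0):
    """Repeat the rollout from the same input with independent random draws."""
    members = [
        rollout(initial, n_steps, step_fn, np.random.default_rng(seed + m))
        for m in range(n_members)
    ]
    return np.stack(members, axis=0)  # (member, batch, n_steps, variable, lat, lon)


# Small grid for illustration only; the real grid is 721x1440.
initial = np.zeros((1, 2, 75, 8, 16))
forecast = ensemble(initial, n_members=3, n_steps=4, step_fn=toy_step)
```

Here four steps of 6 hours each correspond to a 24-hour forecast, and `forecast` carries a leading ensemble-member axis of size 3.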
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
Software Integration
Runtime Engine(s): Not Applicable
Supported Hardware Microarchitecture Compatibility:
- NVIDIA Ampere
- NVIDIA Blackwell
- NVIDIA Hopper
Supported Operating System(s):
- Linux
The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.
Model Version(s):
Model Version: v1
Training, Testing, and Evaluation Datasets:
Note: Initial model development used the years 1980-2016 as the training set, with 2017-2019 as validation data. The final released model was then trained on data from 1980-2019 and evaluated on the year 2020.
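The two phases in the note above assign the same years to different roles, which can be made explicit with a small lookup. This is only an illustrative sketch: the function name `split_for_year` and the phase labels `"development"` and `"final"` are hypothetical and do not come from the Atlas training code.

```python
# Year ranges (inclusive) for each phase, taken from the note in this card.
SPLITS = {
    "development": {"train": (1980, 2016), "validation": (2017, 2019)},
    "final": {"train": (1980, 2019), "evaluation": (2020, 2020)},
}


def split_for_year(year, phase="final"):
    """Return the split name that a given year belongs to, or None if unused."""
    for name, (start, end) in SPLITS[phase].items():
        if start <= year <= end:
            return name
    return None
```

For example, 2018 is validation data during development but training data for the released model, so scores on 2017-2019 are not held-out results for the final checkpoint.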
Training Dataset:
Link: ERA5
Data Collection Method by dataset:
- Automatic/Sensors
Labeling Method by dataset:
- Automatic/Sensors
Data Modality:
- Gridded geophysical time series
Data Size:
- 16 TB subset used for model training
Properties:
ERA5 data for the period January 1980 - December 2019. ERA5 provides hourly estimates of various atmospheric, land, and oceanic climate variables. The data covers the Earth on a 30 km grid and resolves the atmosphere at 137 levels.
Testing Dataset:
Link: ERA5
Data Collection Method by dataset:
- Automatic/Sensors
Labeling Method by dataset:
- Automatic/Sensors
Properties:
ERA5 data for the period January 2017 - December 2019. ERA5 provides hourly estimates of various atmospheric, land, and oceanic climate variables. The data covers the Earth on a 30 km grid and resolves the atmosphere at 137 levels.
Evaluation Dataset:
Link: ERA5
Data Collection Method by dataset:
- Automatic/Sensors
Labeling Method by dataset:
- Automatic/Sensors
Properties:
ERA5 data for the period January 2020 - December 2020. ERA5 provides hourly estimates of various atmospheric, land, and oceanic climate variables. The data covers the Earth on a 30 km grid and resolves the atmosphere at 137 levels.
Inference:
Engine: PyTorch
Test Hardware:
- A100
- H100
- L40S
Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.