Add support for transformers 4.44 through 5.0+

#16

Add support for broader set of transformers versions

This PR updates llama_bidirectional_model.py to support transformers versions 4.44 through 5.0+, replacing the previous requirement of exactly 4.47.1.

Why this change was needed

The previous implementation relied on overriding _update_causal_mask() to create bidirectional attention masks. This approach broke in several ways:

  1. transformers 4.48: The attention refactor (#35235) caused our _attn_implementation = "eager" assignment to take effect, forcing eager attention instead of SDPA
  2. transformers 4.53: The _update_causal_mask method was removed entirely, with masking logic moved to masking_utils
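The fragility of hooking a private method can be illustrated with a minimal sketch (plain Python stand-ins, no transformers import; class names here are illustrative, not from the library). Once the base class stops calling _update_causal_mask, the subclass override is silently ignored:

```python
# Stand-in for transformers <= 4.52: the model's forward still calls
# _update_causal_mask, so overriding it changes the mask.
class BaseModelOld:
    def forward(self, mask):
        return self._update_causal_mask(mask)

    def _update_causal_mask(self, mask):
        return "causal:" + mask

# Stand-in for transformers >= 4.53: _update_causal_mask was removed and
# masking moved elsewhere (masking_utils), so forward never calls it.
class BaseModelNew:
    def forward(self, mask):
        return "causal:" + mask

class Bidirectional(BaseModelOld):
    def _update_causal_mask(self, mask):
        return "bidirectional:" + mask

class BrokenBidirectional(BaseModelNew):
    def _update_causal_mask(self, mask):  # defined, but never called
        return "bidirectional:" + mask

print(Bidirectional().forward("m"))        # bidirectional:m
print(BrokenBidirectional().forward("m"))  # causal:m -- override silently ignored
```

This is why the PR moves the override up to forward(), the one entry point that exists across all supported versions.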

What changed

  • Unified forward() override instead of _update_causal_mask override
  • Introspection-based API detection using inspect.signature() rather than hardcoded version checks
  • Automatic fallback for mask creation: uses create_bidirectional_mask (5.0+) or _prepare_4d_attention_mask (older)
  • Handles API differences across versions:
    • Decoder layer return type (tuple in <4.54, tensor in ≥4.54)
    • Cache parameter name (past_key_value vs past_key_values)
    • DynamicCache constructor signature
  • Removed _attn_implementation = "eager"; users should instead pass an attention implementation via model_kwargs when loading
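The introspection-based approach can be sketched as follows (plain Python stand-ins rather than real transformers layers; the helper names and stand-in signatures are illustrative, modelled on the differences listed above, not copied from the PR):

```python
import inspect

def cache_kwarg(layer_forward):
    """Pick whichever cache keyword the decoder layer's forward accepts,
    instead of branching on a hardcoded transformers version string."""
    params = inspect.signature(layer_forward).parameters
    return "past_key_values" if "past_key_values" in params else "past_key_value"

def layer_output_to_tensor(out):
    """Normalize the decoder layer return type: tuple (<4.54) vs bare tensor (>=4.54)."""
    return out[0] if isinstance(out, tuple) else out

# Stand-in decoder layers exercising both API shapes:
def legacy_layer(hidden_states, past_key_value=None):
    return (hidden_states, None)   # tuple return, old cache keyword

def modern_layer(hidden_states, past_key_values=None):
    return hidden_states           # bare tensor return, new cache keyword

for layer in (legacy_layer, modern_layer):
    kw = cache_kwarg(layer)
    out = layer("h", **{kw: None})
    print(kw, layer_output_to_tensor(out))
```

The same inspect.signature() pattern extends to the other differences listed above (the DynamicCache constructor, and choosing between create_bidirectional_mask and _prepare_4d_attention_mask): probe what the installed version actually accepts, then call it accordingly.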

Testing

Tested with transformers versions: 4.44, 4.47.1, 4.48, 4.53, 4.54, 4.56, 4.57, 5.0.0

Embeddings were verified to be consistent across versions, with expected minor floating-point differences (~1e-4) in 5.0+ due to different mask-creation internals.
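A tolerance-based comparison along these lines captures the check described above (the embedding values below are made up for illustration; the ~1e-3 threshold comfortably absorbs the ~1e-4 drift while still catching real divergence):

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two embeddings."""
    return max(abs(x - y) for x, y in zip(a, b))

# Illustrative embedding slices from two transformers versions,
# differing only at the ~1e-4 level:
emb_447 = [0.12345, -0.67890, 0.42420]
emb_500 = [0.12351, -0.67895, 0.42414]

assert max_abs_diff(emb_447, emb_500) < 1e-3  # consistent within tolerance
```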

nvidia-oliver-holworthy changed pull request status to open
nvidia-oliver-holworthy changed pull request status to merged