Integration
The bertblocks.integration package converts HuggingFace models into BertBlocks equivalents.
Loading Models
bertblocks.integration.from_huggingface(
    pretrained_model_name_or_path: str,
    load_weights: bool = True,
    add_pooling_layer: bool = False,
    attn_implementation: Literal['flash_attention_2', 'sdpa', 'eager'] = 'sdpa',
)
Instantiate an equivalent BertBlocksModel from a pretrained HuggingFace model.
Automatically detects the model type and routes to the appropriate conversion function. Supports BERT-like encoder models available on the HuggingFace Hub.
- Parameters:
pretrained_model_name_or_path (str) – HuggingFace model identifier (e.g., “bert-base-uncased”, “modernbert-base”) or local path to a pretrained model directory.
load_weights (bool, optional) – Whether to transfer weights from the pretrained HuggingFace model. If True, copies all layer parameters. If False, only loads the configuration and initializes a fresh model with random weights. Defaults to True.
add_pooling_layer (bool, optional) – Whether to add a pooling layer that processes the [CLS] token. Useful for sequence-level classification tasks. Defaults to False.
attn_implementation (str, optional) – Attention backend. One of “flash_attention_2”, “sdpa”, or “eager”. Defaults to “sdpa”.
- Returns:
A BertBlocks model with architecture matched to the source HuggingFace model, optionally loaded with pretrained weights.
- Raises:
ValueError – If the model type is not supported or cannot be detected.
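The parameters above can be combined as in the following minimal sketch. The helper names `loader_kwargs` and `load_encoder` are hypothetical and exist only for illustration; the import of `from_huggingface` is deferred into the function body so that the sketch can be read and checked without the package installed or a model download.

```python
from typing import Any, Dict


def loader_kwargs(for_classification: bool = False, pretrained: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for from_huggingface (hypothetical helper)."""
    return {
        # False gives a randomly initialized model that shares only the configuration.
        "load_weights": pretrained,
        # The pooling layer over [CLS] is mainly useful for sequence-level tasks.
        "add_pooling_layer": for_classification,
        # "sdpa" is the documented default backend.
        "attn_implementation": "sdpa",
    }


def load_encoder(name_or_path: str, **overrides: Any):
    """Convert a HuggingFace checkpoint to a BertBlocks model (hypothetical wrapper)."""
    # Deferred import: only needed when a model is actually loaded.
    from bertblocks.integration import from_huggingface

    kwargs = {**loader_kwargs(), **overrides}
    # Raises ValueError if the model type is unsupported or cannot be detected.
    return from_huggingface(name_or_path, **kwargs)
```

A call such as `load_encoder("bert-base-uncased", add_pooling_layer=True)` would then return a converted model with a pooling layer, assuming the checkpoint is reachable.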
Model-Specific Loaders
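The routing from `from_huggingface` to the loaders in this section can be pictured as a lookup on the detected model type. The sketch below is an illustration only, not the BertBlocks implementation: the `LOADERS` registry and `resolve_loader` are hypothetical names, and the keys are the `model_type` strings that HuggingFace configurations report for BERT and ModernBERT.

```python
from typing import Dict, Optional

# Hypothetical registry: HuggingFace `model_type` -> dotted path of the loader.
LOADERS: Dict[str, str] = {
    "bert": "bertblocks.integration.load_bert.from_bert_model",
    "modernbert": "bertblocks.integration.load_modernbert.from_modernbert_model",
}


def resolve_loader(model_type: Optional[str]) -> str:
    """Return the loader path for a model type, or raise ValueError."""
    if not model_type:
        # Mirrors the documented "cannot be detected" failure case.
        raise ValueError("model type could not be detected")
    if model_type not in LOADERS:
        # Mirrors the documented "not supported" failure case.
        raise ValueError(f"unsupported model type: {model_type!r}")
    return LOADERS[model_type]
```

When the model type is already known, calling the matching loader directly avoids the detection step.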
bertblocks.integration.load_modernbert.from_modernbert_model(
    orig_model: ModernBertModel,
    add_pooling_layer: bool = False,
    attn_implementation: Literal['flash_attention_2', 'sdpa', 'eager'] = 'sdpa',
)
Instantiate an equivalent BertBlocks model from a HuggingFace ModernBERT model instance.
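- Parameters:
orig_model (ModernBertModel) – An instance of a HuggingFace ModernBertModel that has been loaded with pretrained weights.
add_pooling_layer (bool, optional) – Whether to add a pooling layer that processes the [CLS] token. Defaults to False.
attn_implementation (str, optional) – Attention backend. One of “flash_attention_2”, “sdpa”, or “eager”. Defaults to “sdpa”.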
- Returns:
A BertBlocks model with architecture matched to ModernBERT, loaded with pretrained weights.
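A model-specific loader takes an already instantiated HuggingFace model rather than a name. The following sketch shows one way to chain the two steps, assuming a `transformers` release that ships `ModernBertModel`; the wrapper name `convert_modernbert` is hypothetical, and both imports are deferred so the definition itself has no dependencies.

```python
def convert_modernbert(name_or_path: str, add_pooling_layer: bool = False):
    """Load a ModernBERT checkpoint with transformers, then convert it (hypothetical wrapper)."""
    # Deferred imports: neither package is needed until the function is called.
    from transformers import ModernBertModel
    from bertblocks.integration.load_modernbert import from_modernbert_model

    # Step 1: load the original model with its pretrained weights.
    hf_model = ModernBertModel.from_pretrained(name_or_path)

    # Step 2: build the BertBlocks equivalent and transfer the weights.
    return from_modernbert_model(
        hf_model,
        add_pooling_layer=add_pooling_layer,
        attn_implementation="sdpa",
    )
```

The same pattern should apply to `from_bert_model` with `BertModel` in place of `ModernBertModel`. Keeping a reference to the original model is convenient for comparing outputs after conversion.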
bertblocks.integration.load_bert.from_bert_model(
    orig_model: BertModel,
    add_pooling_layer: bool = False,
    attn_implementation: Literal['flash_attention_2', 'sdpa', 'eager'] = 'sdpa',
)
Instantiate an equivalent BertBlocks model from a HuggingFace BERT model instance.
Converts a HuggingFace BERT model to the BertBlocks architecture with weight transfer. The BertBlocks model uses post-normalization and a standard MLP architecture to match BERT.
- Parameters:
orig_model (BertModel) – An instance of a HuggingFace BertModel that has been loaded with pretrained weights.
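add_pooling_layer (bool, optional) – Whether to add a pooling layer that processes the [CLS] token. Defaults to False.
attn_implementation (str, optional) – Attention backend. One of “flash_attention_2”, “sdpa”, or “eager”. Defaults to “sdpa”.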
- Returns:
A BertBlocks model with architecture matched to BERT, loaded with pretrained weights.
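The post-normalization mentioned in the description of `from_bert_model` refers to where layer normalization sits relative to the residual connection. The sketch below is a dependency-free illustration on plain Python lists, not BertBlocks code: it omits the learned scale and bias of a real layer norm, and the pre-normalization variant is included only for contrast.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-12) -> Vector:
    """Normalize to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def post_norm_block(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """BERT-style ordering: add the residual first, then normalize the sum."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


def pre_norm_block(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Contrast case: normalize the sublayer input, leave the residual path untouched."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]
```

With post-normalization every block output is normalized, whereas with pre-normalization the residual stream carries unnormalized values from block to block.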
References
“BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding” (https://arxiv.org/abs/1810.04805)