langml.plm.bert

Module Contents

Classes

BERT

Functions

load_bert(config_path: str, checkpoint_path: str, seq_len: Optional[int] = None, pretraining: bool = False, with_mlm: bool = True, with_nsp: bool = True, lazy_restore: bool = False, weight_prefix: Optional[str] = None, dropout_rate: float = 0.0, **kwargs) → Union[Tuple[langml.tensor_typing.Models, Callable], Tuple[langml.tensor_typing.Models, Callable, Callable]]

Load pretrained BERT/RoBERTa

class langml.plm.bert.BERT(vocab_size: int, position_size: int = 512, seq_len: int = 512, embedding_dim: int = 768, hidden_dim: Optional[int] = None, transformer_blocks: int = 12, attention_heads: int = 12, intermediate_size: int = 3072, dropout_rate: float = 0.1, attention_activation: langml.tensor_typing.Activation = None, feed_forward_activation: langml.tensor_typing.Activation = 'gelu', initializer_range: float = 0.02, pretraining: bool = False, trainable_prefixs: Optional[List] = None, share_weights: bool = False, weight_prefix: Optional[str] = None)
get_weight_name(self, name: str) → str
build(self)
get_inputs(self) → List[langml.tensor_typing.Tensors]
get_embedding(self, inputs: List[langml.tensor_typing.Tensors]) → List[langml.tensor_typing.Tensors]
is_trainable(self, layer: tensorflow.keras.layers.Layer) → bool
__call__(self, inputs: Optional[Union[Tuple, List]] = None, return_model: bool = True, with_mlm: bool = True, with_nsp: bool = True, custom_embedding_callback: Optional[Callable] = None) → langml.tensor_typing.Models
langml.plm.bert.load_bert(config_path: str, checkpoint_path: str, seq_len: Optional[int] = None, pretraining: bool = False, with_mlm: bool = True, with_nsp: bool = True, lazy_restore: bool = False, weight_prefix: Optional[str] = None, dropout_rate: float = 0.0, **kwargs) → Union[Tuple[langml.tensor_typing.Models, Callable], Tuple[langml.tensor_typing.Models, Callable, Callable]]

Load pretrained BERT/RoBERTa.

Parameters
  • config_path – str, path of the BERT config

  • checkpoint_path – str, path of the BERT checkpoint

  • seq_len – Optional[int], fixed input sequence length, default None

  • pretraining – bool, pretraining mode, default False

  • with_mlm – bool, whether to use the MLM task in pretraining, default True

  • with_nsp – bool, whether to use the NSP task in pretraining, default True

  • lazy_restore – bool, whether to restore pretrained weights lazily, default False. Set it to True for distributed training.

  • weight_prefix – Optional[str], prefix name of weights, default None. You can set a prefix name in unshared siamese networks.

  • dropout_rate – float, dropout rate, default 0.0

Returns
  • model – keras model

  • bert – bert instance

  • restore – returned only when lazy_restore=True

Return type
  Union[Tuple[langml.tensor_typing.Models, Callable], Tuple[langml.tensor_typing.Models, Callable, Callable]]
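
Because load_bert returns either two or three values depending on lazy_restore, calling code may find it convenient to normalize the result. The following is a minimal sketch based only on the signature and return description above. The helper names unpack_load_bert_result and load_encoder are hypothetical and not part of langml. How the restore callable should be invoked is not documented on this page, so the sketch hands it back untouched.

```python
from typing import Any, Callable, Optional, Tuple


def unpack_load_bert_result(
    result: Tuple[Any, ...],
) -> Tuple[Any, Any, Optional[Callable]]:
    """Normalize load_bert's 2- or 3-tuple into (model, bert, restore).

    Hypothetical helper, not part of langml. `restore` is None when the
    function returned only (model, bert), i.e. lazy_restore=False.
    """
    if len(result) == 2:
        model, bert = result
        return model, bert, None
    if len(result) == 3:
        model, bert, restore = result
        return model, bert, restore
    raise ValueError(f"expected 2 or 3 values from load_bert, got {len(result)}")


def load_encoder(config_path: str, checkpoint_path: str, lazy_restore: bool = False):
    """Hypothetical wrapper; needs langml and TensorFlow installed to run."""
    # Imported lazily so the helper above stays usable without langml.
    from langml.plm.bert import load_bert

    result = load_bert(config_path, checkpoint_path, lazy_restore=lazy_restore)
    return unpack_load_bert_result(result)
```

With this shape, a caller can always write `model, bert, restore = load_encoder(config_path, checkpoint_path, lazy_restore=True)` and check `restore is not None` before deferring weight restoration, for example in a distributed training setup.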