langml.baselines.matching.sbert

Package Contents

Classes

DataLoader

TFDataLoader

SentenceBert

class langml.baselines.matching.sbert.DataLoader(data: List, tokenizer: object, batch_size: int = 32)

Bases: langml.baselines.BaseDataLoader

__len__(self) -> int
static load_data(fpath: str, build_vocab: bool = False, label2idx: Optional[Dict] = None) -> Union[List[Tuple[str, str, int]], Tuple[List[Tuple[str, str, int]], Dict]]
Parameters
  • fpath – str, path of data

  • build_vocab – bool, whether to build vocabulary

  • label2idx – Optional[Dict], label to index dict
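
The return annotation of load_data suggests two shapes: a list of (text, text, label index) triples, or that list paired with a label dict. The sketch below illustrates that contract on in-memory rows only. The helper name encode_pairs, the Row alias, and the rule that the dict is returned only when build_vocab is True are assumptions made for illustration; this is not langml code and does not reproduce its file parsing.

```python
from typing import Dict, List, Optional, Tuple, Union

# Hypothetical row type: (sentence_a, sentence_b, raw_label). Not part of langml.
Row = Tuple[str, str, str]
Encoded = List[Tuple[str, str, int]]


def encode_pairs(
    rows: List[Row],
    build_vocab: bool = False,
    label2idx: Optional[Dict[str, int]] = None,
) -> Union[Encoded, Tuple[Encoded, Dict[str, int]]]:
    """Illustrative stand-in for the load_data contract (assumed behavior)."""
    mapping = dict(label2idx or {})
    if build_vocab:
        # Assign the next free index to each label not seen before.
        for _, _, label in rows:
            mapping.setdefault(label, len(mapping))
    data = [(a, b, mapping[label]) for a, b, label in rows]
    # Assumption: the mapping is returned only when it was built here.
    return (data, mapping) if build_vocab else data


rows = [("a cat", "a kitten", "similar"), ("a cat", "a car", "different")]
train, mapping = encode_pairs(rows, build_vocab=True)
dev = encode_pairs(rows, label2idx=mapping)
```

Under these assumptions, a training split would build the mapping once and later splits would reuse it through label2idx, so that label indices stay consistent across splits.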

make_iter(self, random: bool = False)
class langml.baselines.matching.sbert.TFDataLoader(data: List, tokenizer: object, batch_size: int = 32)

Bases: DataLoader

make_iter(self, random: bool = False)
__call__(self, random: bool = False)
class langml.baselines.matching.sbert.SentenceBert(config_path: str, ckpt_path: str, params: langml.baselines.Parameters, backbone: str = 'roberta')

Bases: langml.baselines.BaselineModel

get_pooling_output(self, model: langml.tensor_typing.Models, output_index: int, pooling_strategy: str = 'cls') -> langml.tensor_typing.Tensors

Get the pooling output.

Parameters
  • model – keras.Model, BERT model

  • output_index – int, output index of the feedforward layer

  • pooling_strategy – str, pooling strategy, one of [‘cls’, ‘first-last-avg’, ‘last-avg’]; default ‘cls’
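
The three strategy names can be illustrated with a small NumPy sketch. It follows the usual reading of these names in sentence-embedding work: 'cls' takes the first token of the last layer, 'last-avg' averages the tokens of the last layer, and 'first-last-avg' averages the token means of the first and last layers. The function pool and its list-of-layers input are assumptions for illustration; langml's Keras implementation may differ, for example in how it selects layers through output_index or handles padding, which this sketch ignores.

```python
import numpy as np


def pool(hidden_states, strategy="cls"):
    """Pool per-layer token states into one vector per sentence.

    hidden_states: list of arrays shaped (batch, seq_len, hidden),
    one per encoder layer, ordered first to last (assumed layout).
    """
    first, last = hidden_states[0], hidden_states[-1]
    if strategy == "cls":
        # Vector at position 0 of the last layer.
        return last[:, 0, :]
    if strategy == "last-avg":
        # Mean over the token axis of the last layer (no padding mask).
        return last.mean(axis=1)
    if strategy == "first-last-avg":
        # Average of the token means of the first and last layers.
        return (first.mean(axis=1) + last.mean(axis=1)) / 2.0
    raise ValueError(f"unknown pooling strategy: {strategy!r}")


# Two layers, batch of 2 sentences, 3 tokens each, hidden size 4.
layers = [np.zeros((2, 3, 4)), np.ones((2, 3, 4))]
sentence_vectors = pool(layers, "first-last-avg")  # shape (2, 4)
```

Every strategy reduces a (batch, seq_len, hidden) tensor to (batch, hidden), so the choice changes which tokens and layers contribute to the sentence vector but not its shape.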

build_model(self, task: str = 'regression', pooling_strategy: str = 'cls', lazy_restore: bool = False) -> langml.tensor_typing.Models