langml.baselines.contrastive.simcse.dataloder
Module Contents
Classes
- class langml.baselines.contrastive.simcse.dataloder.DataLoader(data: List, tokenizer: object, batch_size: int = 32)
Bases: langml.baselines.BaseDataLoader
- static load_data(fpath: str, apply_aeda: bool = True, aeda_tokenize: Callable = whitespace_tokenize, aeda_language: str = 'EN') → Tuple[List[Tuple[str, str]], List[Tuple[str, str, int]]]
- Parameters
fpath – str, path to the data file
apply_aeda – bool, whether to augment the data with the AEDA technique; defaults to True
aeda_tokenize – Callable, tokenization function used by AEDA; takes effect only when apply_aeda=True; defaults to whitespace_tokenize
aeda_language – str, language of the data; takes effect only when apply_aeda=True; defaults to 'EN'
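
For readers unfamiliar with AEDA, the following is a minimal, self-contained sketch of the general technique: punctuation marks are inserted at random positions in a tokenized sentence. It is not langml's implementation. The names aeda_augment and AEDA_PUNCTUATIONS are hypothetical, and whitespace_tokenize is redefined here as a stand-in for whatever callable is passed as aeda_tokenize.

```python
import random
from typing import Callable, List, Optional

# Punctuation marks commonly used for AEDA-style augmentation.
AEDA_PUNCTUATIONS = [".", ";", "?", ":", "!", ","]


def whitespace_tokenize(text: str) -> List[str]:
    """Split text on whitespace (stand-in for a tokenizer callable)."""
    return text.split()


def aeda_augment(
    text: str,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
    rng: Optional[random.Random] = None,
) -> str:
    """Return a copy of `text` with random punctuation marks inserted.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    tokens = tokenize(text)
    if not tokens:
        return text
    # Insert between 1 and one third of the token count (at least 1) marks.
    max_inserts = max(1, len(tokens) // 3)
    n_inserts = rng.randint(1, max_inserts)
    for _ in range(n_inserts):
        # Any slot is allowed, including before the first and after the last token.
        position = rng.randint(0, len(tokens))
        tokens.insert(position, rng.choice(AEDA_PUNCTUATIONS))
    return " ".join(tokens)


if __name__ == "__main__":
    print(aeda_augment("the quick brown fox jumps over the lazy dog"))
```

The original words keep their order; only punctuation tokens are added, so the augmented sentence can be paired with the original as a positive example in contrastive training.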