MDLRetriever
- class openicl.icl_retriever.icl_mdl_retriever.MDLRetriever(dataset_reader: DatasetReader, ice_separator: str | None = '\n', ice_eos_token: str | None = '\n', prompt_eos_token: str | None = '', sentence_transformers_model_name: str | None = 'all-mpnet-base-v2', ice_num: int | None = 1, candidate_num: int | None = 1, index_split: str | None = 'train', test_split: str | None = 'test', tokenizer_name: str | None = 'gpt2-xl', ce_model_name: str | None = 'gpt2-xl', batch_size: int | None = 1, select_time: int | None = 5, accelerator: Accelerator | None = None, ice_template: PromptTemplate | None = None, prompt_template: PromptTemplate | None = None, labels: List | None = None, seed: int | None = 1)
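- MDL (Minimum Description Length) in-context learning retriever class.
Retrieves candidates in a TopK stage, then chooses in-context examples from them in an MDL stage.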
- dataset_reader
An instance of the DatasetReader class.
- Type:
DatasetReader
- ice_separator
A string that separates each in-context example.
- Type:
str, optional
- ice_eos_token
A string that is added to the end of in-context examples.
- Type:
str, optional
- prompt_eos_token
A string that is added to the end of the prompt.
- Type:
str, optional
- ice_num
The number of in-context examples to include.
- Type:
int, optional
- candidate_num
The number of candidates selected in the TopK stage.
- Type:
int, optional
- index_split
The name of the index dataset split. The index dataset is used to select in-context examples. Defaults to train.
- Type:
str, optional
- test_split
The name of the test dataset split. The test dataset is used to generate a prompt for each test item. Defaults to test.
- Type:
str, optional
- index_ds
The index dataset. Used to select in-context examples.
- Type:
Dataset
- test_ds
The test dataset. Used to generate a prompt for each test item.
- Type:
Dataset
- accelerator
An instance of the Accelerator class, used for multiprocessing.
- Type:
Accelerator, optional
- batch_size
Batch size for the DataLoader.
- Type:
int, optional
- model
An instance of the SentenceTransformer class, used to calculate embeddings.
- Type:
SentenceTransformer
- index
Index generated with FAISS.
- Type:
IndexIDMap
- select_time
Number of random selections in the MDL stage.
- Type:
int, optional
- labels
A list of labels for all classes, used to generate prompts when calculating MDL.
- Type:
List, optional
- seed
Seed for the random number generator.
- Type:
int, optional
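
The way candidate_num, ice_num, select_time, and seed interact can be illustrated with a small sketch. This is not the library's implementation. It assumes the TopK stage has already produced a ranked list of index-dataset positions, that the MDL stage draws select_time random subsets of size ice_num from the top candidate_num candidates, and that the subset with the lowest score wins. The function name select_in_context_examples and the score_fn callable are hypothetical; in the real retriever the score would come from the model named by ce_model_name.

```python
import random
from typing import Callable, List, Sequence


def select_in_context_examples(
    ranked_candidates: Sequence[int],
    ice_num: int,
    candidate_num: int,
    select_time: int,
    score_fn: Callable[[List[int]], float],
    seed: int = 1,
) -> List[int]:
    """Hypothetical sketch of a two-stage selection.

    Keeps the top `candidate_num` entries of `ranked_candidates`, draws
    `select_time` random subsets of size `ice_num` from them, and returns
    the subset for which `score_fn` is lowest.
    """
    if select_time < 1:
        raise ValueError("select_time must be at least 1")

    # Stage 1 (assumed done upstream): candidates arrive ranked by similarity.
    candidates = list(ranked_candidates[:candidate_num])
    if ice_num > len(candidates):
        raise ValueError("ice_num cannot exceed the number of candidates")

    # Stage 2: seeded random subsets, keeping the lowest-scoring one.
    rng = random.Random(seed)
    best_subset: List[int] = []
    best_score = float("inf")
    for _ in range(select_time):
        subset = rng.sample(candidates, ice_num)
        score = score_fn(subset)
        if score < best_score:
            best_subset, best_score = subset, score
    return best_subset


# Usage with a placeholder score (sum of indices) standing in for MDL.
chosen = select_in_context_examples(
    ranked_candidates=[5, 3, 9, 1, 7],
    ice_num=2,
    candidate_num=4,
    select_time=5,
    score_fn=lambda subset: float(sum(subset)),
    seed=1,
)
```

Because the random generator is seeded, repeated calls with the same arguments return the same subset. Raising select_time explores more subsets at the cost of more scoring calls.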