DPPRetriever¶
- class openicl.icl_retriever.icl_dpp_retriever.DPPRetriever(dataset_reader: DatasetReader, ice_separator: str | None = '\n', ice_eos_token: str | None = '\n', prompt_eos_token: str | None = '', sentence_transformers_model_name: str | None = 'all-mpnet-base-v2', ice_num: int | None = 1, candidate_num: int | None = 1, index_split: str | None = 'train', test_split: str | None = 'test', tokenizer_name: str | None = 'gpt2-xl', batch_size: int | None = 1, accelerator: Accelerator | None = None, seed: int | None = 1, scale_factor: float | None = 0.1)¶
- DPP In-context Learning Retriever Class
A retriever based on determinantal point processes (DPP). Selection runs in two stages: the first stage retrieves the TopK results to reduce the candidate set, and DPP selection is then applied to those candidates. See https://arxiv.org/abs/2302.05698 for details.
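The two-stage idea can be illustrated with a small, dependency-free sketch. This is not the library's implementation: the function names (`cosine`, `det`, `two_stage_select`) are hypothetical, the kernel weighting by `(1 + cosine) / 2` is an assumption, and greedy determinant maximization is used here in place of exact k-DPP sampling to keep the example deterministic.

```python
import math


def cosine(u, v):
    """Cosine similarity between two plain Python vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def two_stage_select(query, index_embeddings, candidate_num, ice_num):
    """Hypothetical helper: TopK filtering, then greedy DPP-style selection."""
    # Stage 1: keep the candidate_num items most similar to the query.
    relevance = {
        i: (1.0 + cosine(query, emb)) / 2.0  # assumption: map cosine to [0, 1]
        for i, emb in enumerate(index_embeddings)
    }
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    candidates = ranked[:candidate_num]

    # Stage 2: greedily add the candidate that maximizes the determinant of
    # the relevance-weighted similarity kernel over the selected subset.
    selected = []
    while len(selected) < min(ice_num, len(candidates)):
        best, best_det = None, -1.0
        for c in candidates:
            if c in selected:
                continue
            subset = selected + [c]
            kernel = [
                [
                    relevance[i]
                    * cosine(index_embeddings[i], index_embeddings[j])
                    * relevance[j]
                    for j in subset
                ]
                for i in subset
            ]
            d = det(kernel)
            if d > best_det:
                best, best_det = c, d
        selected.append(best)
    return selected


index_embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.2]
picked = two_stage_select(query, index_embeddings, candidate_num=3, ice_num=2)
```

In this toy data, items 0 and 1 are near-duplicates. Plain TopK would return both, while the determinant-based second stage prefers one of them plus the less similar item 2, which is the diversity effect DPP selection is meant to provide.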
- dataset_reader¶
An instance of the DatasetReader class.
- Type:
DatasetReader
- ice_separator¶
A string that separates each in-context example.
- Type:
str, optional
- ice_eos_token¶
A string that is added to the end of in-context examples.
- Type:
str, optional
- prompt_eos_token¶
A string that is added to the end of the prompt.
- Type:
str, optional
- ice_num¶
The number of in-context examples to retrieve.
- Type:
int, optional
- index_split¶
The name of the index split. The index dataset is used to select data for in-context examples. Defaults to train.
- Type:
str, optional
- test_split¶
The name of the test split. The test dataset is used to generate a prompt for each data item. Defaults to test.
- Type:
str, optional
- index_ds¶
The index dataset. Used to select data for in-context examples.
- Type:
Dataset
- test_ds¶
The test dataset. Used to generate a prompt for each data item.
- Type:
Dataset
- accelerator¶
An instance of the Accelerator class, used for multiprocessing.
- Type:
Accelerator, optional
- batch_size¶
Batch size for the DataLoader.
- Type:
int, optional
- model¶
An instance of the SentenceTransformer class, used to calculate embeddings.
- Type:
SentenceTransformer
- index¶
Index generated with FAISS.
- Type:
IndexIDMap
- seed¶
Seed for the random number generator (random_state in the sample_exact_k_dpp method).
- Type:
int, optional
- scale_factor¶
A scaling factor applied when computing the DPP kernel.
- Type:
float, optional
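To make the role of scale_factor more concrete, here is a minimal sketch of one way a relevance-weighted DPP kernel can be built. It assumes that scale_factor acts as a temperature on the relevance scores, so each kernel entry is `q_i * S_ij * q_j` with `q_i = exp(r_i / (2 * scale_factor))`. The function name `build_kernel` and the max-shift for numerical stability are assumptions made for illustration; the library's actual computation may differ.

```python
import math


def build_kernel(rel_scores, sim_matrix, scale_factor=0.1):
    """Hypothetical helper: weight a similarity matrix by per-item relevance."""
    # Shift by the maximum so the largest weight is exactly 1.0 (assumption:
    # done only for numerical stability; it rescales the kernel uniformly).
    top = max(rel_scores)
    quality = [math.exp((r - top) / (2.0 * scale_factor)) for r in rel_scores]
    n = len(rel_scores)
    return [
        [quality[i] * sim_matrix[i][j] * quality[j] for j in range(n)]
        for i in range(n)
    ]


rel_scores = [1.0, 0.8]
sim_matrix = [[1.0, 0.5], [0.5, 1.0]]

sharp = build_kernel(rel_scores, sim_matrix, scale_factor=0.1)
flat = build_kernel(rel_scores, sim_matrix, scale_factor=1.0)
```

Under this assumption, a smaller scale_factor makes the kernel weight relevance more heavily (the less relevant item's diagonal entry shrinks faster), while a larger value flattens the weights so that diversity among candidates matters more.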