metallic.data.datasets

Omniglot

class metallic.data.datasets.Omniglot(root: str, n_way: int, meta_split: str = 'train', use_vinyals_split: bool = True, k_shot_support: Optional[int] = None, k_shot_query: Optional[int] = None, shuffle: bool = True, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None, download: bool = False)

The Omniglot dataset introduced in [1]. It contains 1623 character classes from 50 different alphabets, and each class contains 20 samples. The original dataset is split into background (train) and evaluation (test) sets.

The splits from [2] can be used instead of the original ones.

The dataset is downloaded from here, and the splits are taken from here.

Parameters

root (str) – Root directory of dataset
n_way (int) – Number of classes per task
meta_split (str, optional, default='train') – Name of the split to be used: ‘train’ / ‘val’ / ‘test’
use_vinyals_split (bool, optional, default=True) – If True, use the splits defined in [2]; otherwise use images_background for the train split and images_evaluation for the test split.
k_shot_support (int, optional) – Number of samples per class in the support set
k_shot_query (int, optional) – Number of samples per class in the query set
shuffle (bool, optional, default=True) – If True, the samples in a class are shuffled before being split into support and query sets
transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version
target_transform (Callable, optional) – A function/transform that takes in the target and transforms it
augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes
download (bool, optional, default=False) – If True, downloads the dataset zip files from the internet and puts them in the root directory. Zip files that are already present are not downloaded again.
Note

The val split is not available when use_vinyals_split is set to False.

References

[1] “Human-level Concept Learning through Probabilistic Program Induction.” Brenden M. Lake, et al. Science 2015.
[2] “Matching Networks for One Shot Learning.” Oriol Vinyals, et al. NIPS 2016.
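
The interaction between meta_split and use_vinyals_split can be sketched as a small validation step. This is an illustration only, not metallic's implementation: the function name check_split and the two constants are hypothetical, and only the folder names images_background and images_evaluation come from the parameter description above.

```python
from typing import Optional

# Folder names are taken from the parameter description; the mapping and
# the function below are hypothetical and not part of metallic.
ORIGINAL_SPLIT_FOLDERS = {
    "train": "images_background",
    "test": "images_evaluation",
}
VINYALS_SPLITS = ("train", "val", "test")


def check_split(meta_split: str, use_vinyals_split: bool = True) -> Optional[str]:
    """Validate a split name; return the original folder name if one applies."""
    if use_vinyals_split:
        if meta_split not in VINYALS_SPLITS:
            raise ValueError(f"unknown split: {meta_split!r}")
        return None  # classes would come from the split files of [2]
    if meta_split not in ORIGINAL_SPLIT_FOLDERS:
        raise ValueError(
            f"split {meta_split!r} is not available when use_vinyals_split=False"
        )
    return ORIGINAL_SPLIT_FOLDERS[meta_split]
```

With these assumptions, check_split("val", use_vinyals_split=False) raises a ValueError, which mirrors the note that the val split only exists in the splits from [2].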

class metallic.data.datasets.OmniglotClassDataset(root: str, meta_split: str = 'train', use_vinyals_split: bool = True, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None, download: bool = False)

A dataset composed of classes from Omniglot.

Parameters

root (str) – Root directory of dataset
meta_split (str, optional, default='train') – Name of the split to be used: ‘train’ / ‘val’ / ‘test’
use_vinyals_split (bool, optional, default=True) – If True, use the splits defined in [2]; otherwise use images_background for the train split and images_evaluation for the test split.
transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version
target_transform (Callable, optional) – A function/transform that takes in the target and transforms it
augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes
download (bool, optional, default=False) – If True, downloads the dataset zip files from the internet and puts them in the root directory. Zip files that are already present are not downloaded again.

create_cache() → None: Iterates over the entire dataset and builds, from scratch, a map from each target to its samples and a list of labels.
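
The kind of structure that create_cache describes (a map from target to samples plus a list of labels) can be sketched in plain Python. The function build_cache and the toy file names are hypothetical; the actual on-disk layout and cache format used by metallic are not documented here.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def build_cache(
    pairs: Iterable[Tuple[Any, str]]
) -> Tuple[Dict[str, List[Any]], List[str]]:
    """Group samples by target and collect the labels in first-seen order."""
    target_to_samples: Dict[str, List[Any]] = defaultdict(list)
    for sample, target in pairs:
        target_to_samples[target].append(sample)
    labels = list(target_to_samples)  # dicts preserve insertion order
    return dict(target_to_samples), labels


# Toy (sample, target) pairs standing in for image files and character classes.
pairs = [("img0.png", "alpha"), ("img1.png", "beta"), ("img2.png", "alpha")]
cache, labels = build_cache(pairs)
```

Here cache maps "alpha" to two samples and "beta" to one, and labels lists each class once.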

mini-ImageNet

class metallic.data.datasets.MiniImageNet(root: str, n_way: int, meta_split: str = 'train', k_shot_support: Optional[int] = None, k_shot_query: Optional[int] = None, shuffle: bool = True, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None, download: bool = False)

The mini-ImageNet dataset introduced in [1]. It samples 100 classes from ImageNet (ILSVRC-2012): 64 for training, 16 for validation, and 20 for testing. Each class contains 600 samples.

The dataset is downloaded from here.

Note

The authors of [1] did not release their splits at first, so the authors of [2] created their own. Here we use the splits from [2].

Parameters

root (str) – Root directory of dataset
n_way (int) – Number of classes per task
meta_split (str, optional, default='train') – Name of the split to be used: ‘train’ / ‘val’ / ‘test’
k_shot_support (int, optional) – Number of samples per class in the support set
k_shot_query (int, optional) – Number of samples per class in the query set
shuffle (bool, optional, default=True) – If True, the samples in a class are shuffled before being split into support and query sets
transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version
target_transform (Callable, optional) – A function/transform that takes in the target and transforms it
augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes
download (bool, optional, default=False) – If True, downloads the dataset zip files from the internet and puts them in the root directory. Zip files that are already present are not downloaded again.

References

[1] “Matching Networks for One Shot Learning.” Oriol Vinyals, et al. NIPS 2016.
[2] “Optimization as a Model for Few-Shot Learning.” Sachin Ravi, et al. ICLR 2017.
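
The figures in the description translate into simple arithmetic for sizing tasks. The sketch below uses only the numbers stated above (64 / 16 / 20 classes, 600 samples per class). The helper task_size is hypothetical, and it assumes that the support and query samples of a class are disjoint.

```python
# Figures from the dataset description.
CLASSES_PER_SPLIT = {"train": 64, "val": 16, "test": 20}
SAMPLES_PER_CLASS = 600


def task_size(n_way: int, k_shot_support: int, k_shot_query: int) -> int:
    """Number of samples drawn for one task (support plus query)."""
    if k_shot_support + k_shot_query > SAMPLES_PER_CLASS:
        raise ValueError("a class has only 600 samples")
    return n_way * (k_shot_support + k_shot_query)


total_classes = sum(CLASSES_PER_SPLIT.values())     # 100 classes overall
total_samples = total_classes * SAMPLES_PER_CLASS   # 60000 images overall
five_way_one_shot = task_size(5, 1, 15)             # 5 * (1 + 15) = 80
```

For example, a 5-way task with 1 support sample and 15 query samples per class draws 80 images in total.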

class metallic.data.datasets.MiniImageNetClassDataset(root: str, meta_split: str = 'train', transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None, download: bool = False)

A dataset composed of classes from mini-ImageNet.

Parameters

root (str) – Root directory of dataset
meta_split (str, optional, default='train') – Name of the split to be used: ‘train’ / ‘val’ / ‘test’
transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version
target_transform (Callable, optional) – A function/transform that takes in the target and transforms it
augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes
download (bool, optional, default=False) – If True, downloads the dataset zip files from the internet and puts them in the root directory. Zip files that are already present are not downloaded again.

create_cache() → None: Iterates over the entire dataset and builds, from scratch, a map from each target to its samples and a list of labels.

download() → None: Downloads the dataset file from Google Drive.
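
The augmentations parameter is described as adding new classes to the dataset. One common way to index such augmented classes is sketched below. It is an assumption about the scheme, not a description of metallic's code: the names n_total_classes and resolve_class are hypothetical, and slot 0 is taken to mean the unaugmented class.

```python
from typing import Tuple


def n_total_classes(n_base_classes: int, n_augmentations: int) -> int:
    """Each augmentation is assumed to add one new class per base class."""
    return n_base_classes * (1 + n_augmentations)


def resolve_class(index: int, n_base_classes: int) -> Tuple[int, int]:
    """Map a class index to (augmentation slot, base class index).

    Slot 0 is the original class; slot k > 0 is the k-th augmentation.
    """
    if index < 0:
        raise IndexError(index)
    return divmod(index, n_base_classes)


# With the 64 training classes and, say, three augmentation functions:
total = n_total_classes(64, 3)   # 256 classes
slot, base = resolve_class(70, 64)  # second block of 64, base class 6
```

Under this assumed scheme, class index 70 refers to base class 6 transformed by the first augmentation.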

Base

The following base classes can help if you want to create your own dataset for meta-learning.

class metallic.data.datasets.Dataset(index: int, data: list, class_label: int, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None)

Bases: torch.utils.data.dataset.Dataset

A dataset containing all of the samples from a given class:

Dataset (a class)
├─────────┬─────────┐
│         │         │
sample1   sample2   ...

Parameters

index (int) – Index of the class
data (list) – A list of samples in the class
class_label (int) – Label of the class
transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version
target_transform (Callable, optional) – A function/transform that takes in the target and transforms it
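
A per-class dataset of this shape can be sketched without PyTorch. SingleClassDataset below is a hypothetical stand-in that reuses the constructor parameters listed above. That indexing returns a (sample, target) pair is an assumption, not something the documentation states.

```python
from typing import Any, Callable, List, Optional, Tuple


class SingleClassDataset:
    """Hypothetical stand-in for a dataset holding one class's samples."""

    def __init__(
        self,
        index: int,
        data: List[Any],
        class_label: int,
        transform: Optional[Callable] = None,
        target_transform: Optional[Callable] = None,
    ) -> None:
        self.index = index
        self.data = data
        self.class_label = class_label
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, i: int) -> Tuple[Any, Any]:
        # Every sample shares the same label; transforms are applied lazily.
        sample, target = self.data[i], self.class_label
        if self.transform is not None:
            sample = self.transform(sample)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return sample, target


# Integers stand in for images so the example needs no image library.
ds = SingleClassDataset(0, [1, 2, 3], class_label=7,
                        transform=lambda x: x * 10, target_transform=str)
```

In this toy setup ds[1] yields the transformed sample 20 together with the transformed label "7".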

class metallic.data.datasets.ClassDataset(root: str, meta_split: str, cache_path: str, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None)

Bases: abc.ABC

Base class for a dataset composed of classes. Each item from a ClassDataset is a Dataset containing samples from the given class:

ClassDataset
├───────────────┬──────────────┐
│               │              │
class1          class2         ... (`Dataset`)
├─────────┬─────────┐
│         │         │
sample1   sample2   ...

Parameters

root (str) – Root directory of the dataset
n_way (int) – Number of classes per task
meta_split (str) – Name of the split to be used: ‘train’ / ‘val’ / ‘test’
cache_path (str) – Path to store the cache file
transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version
target_transform (Callable, optional) – A function/transform that takes in the target and transforms it
augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes

abstract create_cache() → None: Iterates over the entire dataset and builds, from scratch, a map from each target to its samples and a list of labels.

load_cache() → None: Loads the map from targets to samples from the cache.

preprocess() → None

save_cache() → None
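
The four methods suggest a build-once, reuse-later cache pattern. The sketch below shows one plausible arrangement, under loud assumptions: the behaviour of preprocess is undocumented above, so the flow shown (load the cache file if it exists, otherwise create and save it) is a guess, and the classes CachedClasses and ToyClasses as well as the JSON file format are hypothetical.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Dict, List


class CachedClasses(ABC):
    """Hypothetical sketch of a cache-backed collection of classes."""

    def __init__(self, cache_path: str) -> None:
        self.cache_path = cache_path
        self.cache: Dict[str, List[str]] = {}

    @abstractmethod
    def create_cache(self) -> None:
        """Build self.cache (target -> samples) from scratch."""

    def save_cache(self) -> None:
        with open(self.cache_path, "w") as f:
            json.dump(self.cache, f)

    def load_cache(self) -> None:
        with open(self.cache_path) as f:
            self.cache = json.load(f)

    def preprocess(self) -> None:
        # Assumed flow: reuse the cache file if present, else build and save.
        if os.path.exists(self.cache_path):
            self.load_cache()
        else:
            self.create_cache()
            self.save_cache()


class ToyClasses(CachedClasses):
    """Concrete subclass that counts how often the cache is rebuilt."""

    def __init__(self, cache_path: str) -> None:
        super().__init__(cache_path)
        self.builds = 0

    def create_cache(self) -> None:
        self.builds += 1
        self.cache = {"a": ["a0", "a1"], "b": ["b0"]}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache.json")
    first = ToyClasses(path)
    first.preprocess()   # no file yet: builds the cache and saves it
    second = ToyClasses(path)
    second.preprocess()  # file exists: loads it without rebuilding
```

A subclass only has to implement create_cache; the second instance never scans the data because it finds the file written by the first.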

class metallic.data.datasets.TaskDataset(datasets: List[metallic.data.datasets.base.Dataset], n_classes: int)

Bases: torch.utils.data.dataset.ConcatDataset

A dataset that concatenates the samples of several classes into one flat sequence:

TaskDataset
├────────┬────────┬────────┬────────┐
│        │        │        │        │
c1_s1    c1_s2    ...      c2_s1    ...

Parameters

datasets (List[Dataset]) – A list of Dataset instances to be concatenated
n_classes (int) – Number of classes in the list
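
Concatenation means that a single flat index has to be mapped back to a class and a position inside that class. The sketch below shows the usual cumulative-size lookup in plain Python. The function locate is hypothetical and is not part of metallic or PyTorch.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Sequence, Tuple


def locate(cumulative_sizes: Sequence[int], idx: int) -> Tuple[int, int]:
    """Map a flat index to (class position, sample position within class)."""
    if not 0 <= idx < cumulative_sizes[-1]:
        raise IndexError(idx)
    # First class whose cumulative size exceeds idx.
    class_pos = bisect_right(cumulative_sizes, idx)
    start = cumulative_sizes[class_pos - 1] if class_pos > 0 else 0
    return class_pos, idx - start


class_sizes = [3, 2]                          # c1 has 3 samples, c2 has 2
cumulative = list(accumulate(class_sizes))    # running totals: [3, 5]
first_of_c2 = locate(cumulative, 3)           # flat index 3 is c2's first sample
```

Flat indices 0 to 2 fall in the first class and indices 3 to 4 in the second, matching the c1_s1, c1_s2, ..., c2_s1 layout in the diagram.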

class metallic.data.datasets.MetaDataset(dataset: metallic.data.datasets.base.ClassDataset, n_way: int, k_shot_support: Optional[int] = None, k_shot_query: Optional[int] = None, shuffle: bool = True)

Bases: torch.utils.data.dataset.Dataset

A dataset for fast indexing of samples within classes.

Parameters

dataset (ClassDataset) – An instance of ClassDataset class
n_way (int) – Number of classes per task
k_shot_support (int, optional) – Number of samples per class in the support set
k_shot_query (int, optional) – Number of samples per class in the query set
shuffle (bool, optional, default=True) – If True, the samples in a class are shuffled before being split into support and query sets

split_task(task: metallic.data.datasets.base.TaskDataset) → collections.OrderedDict: Splits a TaskDataset into a support set and a query set, which contain k_shot_support and k_shot_query samples per class, respectively.
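
The support / query split can be sketched on plain Python data. This is not the library's split_task: the function split_support_query, its seed argument, and the "support" / "query" keys of the returned OrderedDict are assumptions made for illustration.

```python
import random
from collections import OrderedDict
from typing import Any, Dict, List, Optional


def split_support_query(
    samples_by_class: Dict[Any, List[Any]],
    k_shot_support: int,
    k_shot_query: int,
    shuffle: bool = True,
    seed: Optional[int] = None,
) -> OrderedDict:
    """Take k_shot_support and k_shot_query disjoint samples from each class."""
    rng = random.Random(seed)
    support, query = [], []
    for label, samples in samples_by_class.items():
        needed = k_shot_support + k_shot_query
        if len(samples) < needed:
            raise ValueError(f"class {label!r} has fewer than {needed} samples")
        samples = list(samples)  # copy so the caller's list is not reordered
        if shuffle:
            rng.shuffle(samples)
        support += [(s, label) for s in samples[:k_shot_support]]
        query += [(s, label) for s in samples[k_shot_support:needed]]
    return OrderedDict([("support", support), ("query", query)])


# A 2-way task with five samples per class, split 2 / 3 without shuffling.
task = {"c1": [0, 1, 2, 3, 4], "c2": [10, 11, 12, 13, 14]}
parts = split_support_query(task, k_shot_support=2, k_shot_query=3, shuffle=False)
```

With shuffle=False the first two samples of each class go to the support set and the next three to the query set; with shuffle=True the same counts are drawn after a per-class shuffle.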