metallic.data.datasets

Omniglot

class metallic.data.datasets.Omniglot(root: str, n_way: int, meta_split: str = 'train', use_vinyals_split: bool = True, k_shot_support: Optional[int] = None, k_shot_query: Optional[int] = None, shuffle: bool = True, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None, download: bool = False)[source]

The Omniglot dataset introduced in [1]. It contains 1623 character classes from 50 different alphabets, and each class contains 20 samples. The original dataset is split into background (train) and evaluation (test) sets.

We also provide a choice to use the splits from [2].

The dataset is downloaded from here, and the splits are taken from here.

Parameters
  • root (str) – Root directory of dataset

  • n_way (int) – Number of classes per task

  • meta_split (str, optional, default='train') – Name of the split to be used: ‘train’ / ‘val’ / ‘test’

  • use_vinyals_split (bool, optional, default=True) – If True, use the splits defined in [2]; otherwise, use images_background for the train split and images_evaluation for the test split.

  • k_shot_support (int, optional) – Number of samples per class in support set

  • k_shot_query (int, optional) – Number of samples per class in query set

  • shuffle (bool, optional, default=True) – If True, the samples in a class are shuffled before being split into support and query sets

  • transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version

  • target_transform (Callable, optional) – A function/transform that takes in the target and transforms it

  • augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes

  • download (bool, optional, default=False) – If True, downloads the dataset zip files from the internet and puts them in the root directory. If the zip files are already downloaded, they are not downloaded again.

Note

The val split is not available when use_vinyals_split is set to False.

References

  1. “Human-level Concept Learning through Probabilistic Program Induction.” Brenden M. Lake, et al. Science 2015.

  2. “Matching Networks for One Shot Learning.” Oriol Vinyals, et al. NIPS 2016.
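As a rough illustration of how n_way, k_shot_support, and k_shot_query shape a task, the sketch below picks n_way distinct classes and computes the resulting support and query set sizes. The helpers sample_task_classes and task_size and the class names are hypothetical and are not part of metallic; this only mirrors the parameter descriptions above.

```python
import random
from typing import Dict, List, Sequence


def sample_task_classes(class_names: Sequence[str], n_way: int,
                        rng: random.Random) -> List[str]:
    """Pick n_way distinct classes for one task (hypothetical helper)."""
    if n_way > len(class_names):
        raise ValueError("n_way cannot exceed the number of available classes")
    return rng.sample(list(class_names), n_way)


def task_size(n_way: int, k_shot_support: int, k_shot_query: int) -> Dict[str, int]:
    """Number of samples in the support and query sets of one task."""
    return {"support": n_way * k_shot_support, "query": n_way * k_shot_query}


rng = random.Random(0)
# Placeholder names standing in for the 1623 Omniglot character classes.
classes = [f"char_{i:04d}" for i in range(1623)]

chosen = sample_task_classes(classes, n_way=5, rng=rng)
# 1 support + 15 query samples per class fits within the 20 samples per class.
sizes = task_size(n_way=5, k_shot_support=1, k_shot_query=15)
```

With these values, a 5-way 1-shot task holds 5 support samples and 75 query samples.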

class metallic.data.datasets.OmniglotClassDataset(root: str, meta_split: str = 'train', use_vinyals_split: bool = True, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None, download: bool = False)[source]

A dataset composed of classes from Omniglot.

Parameters
  • root (str) – Root directory of dataset

  • meta_split (str, optional, default='train') – Name of the split to be used: ‘train’ / ‘val’ / ‘test’

  • use_vinyals_split (bool, optional, default=True) – If True, use the splits defined in [2]; otherwise, use images_background for the train split and images_evaluation for the test split.

  • transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it

  • augmentations (list of callable, optional) – A list of functions that augment the dataset with new classes.

  • download (bool, optional, default=False) – If True, downloads the dataset zip files from the internet and puts them in the root directory. If the zip files are already downloaded, they are not downloaded again.
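The interaction between meta_split and use_vinyals_split can be sketched as a small validation helper. This is only an illustration of the documented behaviour: images_background serves as the train split and images_evaluation as the test split when use_vinyals_split is False, and no val split exists in that case. The function resolve_omniglot_split and its return convention are assumptions, not metallic's actual code.

```python
from typing import Optional

# Directory names of the original Omniglot split, as named in the docs above.
ORIGINAL_SPLIT_DIRS = {"train": "images_background", "test": "images_evaluation"}


def resolve_omniglot_split(meta_split: str, use_vinyals_split: bool) -> Optional[str]:
    """Return the original-split directory, or None when the splits of [2] are used.

    Hypothetical helper: the return convention is an assumption for illustration.
    """
    if meta_split not in ("train", "val", "test"):
        raise ValueError(f"unknown meta_split: {meta_split!r}")
    if use_vinyals_split:
        # The classes come from the split lists of [2]; no single directory applies.
        return None
    if meta_split == "val":
        raise ValueError("the 'val' split is not available when use_vinyals_split=False")
    return ORIGINAL_SPLIT_DIRS[meta_split]


train_dir = resolve_omniglot_split("train", use_vinyals_split=False)
test_dir = resolve_omniglot_split("test", use_vinyals_split=False)
```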

create_cache() → None[source]

Iterates over the entire dataset and creates a map of target to samples and list of labels from scratch.
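A minimal sketch of what "a map of target to samples and list of labels" could look like, assuming the dataset can be iterated as (sample, target) pairs. The function build_class_cache and the file names are hypothetical; the structure metallic actually stores may differ.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def build_class_cache(
    pairs: Iterable[Tuple[str, Hashable]]
) -> Tuple[Dict[Hashable, List[str]], List[Hashable]]:
    """Group samples by target and collect the sorted list of labels."""
    class_to_samples: Dict[Hashable, List[str]] = defaultdict(list)
    for sample, target in pairs:
        class_to_samples[target].append(sample)
    labels = sorted(class_to_samples)
    return dict(class_to_samples), labels


# Toy (sample, target) pairs; the file names are placeholders.
pairs = [("a_01.png", "alpha"), ("b_01.png", "beta"), ("a_02.png", "alpha")]
class_to_samples, labels = build_class_cache(pairs)
```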

mini-ImageNet

class metallic.data.datasets.MiniImageNet(root: str, n_way: int, meta_split: str = 'train', k_shot_support: Optional[int] = None, k_shot_query: Optional[int] = None, shuffle: bool = True, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None, download: bool = False)[source]

The mini-ImageNet dataset introduced in [1]. It samples 100 classes from ImageNet (ILSVRC-2012), of which 64 are used for training, 16 for validation, and 20 for testing. Each class contains 600 samples.
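The split sizes implied by these numbers can be checked with a few lines of arithmetic. The variable names are illustrative only.

```python
SAMPLES_PER_CLASS = 600
CLASSES_PER_SPLIT = {"train": 64, "val": 16, "test": 20}

# Number of images in each split, and totals over the whole dataset.
samples_per_split = {
    split: n_classes * SAMPLES_PER_CLASS
    for split, n_classes in CLASSES_PER_SPLIT.items()
}
total_classes = sum(CLASSES_PER_SPLIT.values())
total_samples = sum(samples_per_split.values())
```

This gives 38,400 training, 9,600 validation, and 12,000 test images, for 60,000 images over 100 classes.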

The dataset is downloaded from here.

Note

[1] did not release their splits at first, so [2] created their own. Here we use the splits from [2].

Parameters
  • root (str) – Root directory of dataset

  • n_way (int) – Number of classes per task

  • meta_split (str, optional, default='train') – Name of the split to be used: ‘train’ / ‘val’ / ‘test’

  • k_shot_support (int, optional) – Number of samples per class in support set

  • k_shot_query (int, optional) – Number of samples per class in query set

  • shuffle (bool, optional, default=True) – If True, the samples in a class are shuffled before being split into support and query sets

  • transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version

  • target_transform (Callable, optional) – A function/transform that takes in the target and transforms it

  • augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes

  • download (bool, optional, default=False) – If True, downloads the dataset zip files from the internet and puts them in the root directory. If the zip files are already downloaded, they are not downloaded again.

References

  1. “Matching Networks for One Shot Learning.” Oriol Vinyals, et al. NIPS 2016.

  2. “Optimization as a Model for Few-Shot Learning.” Sachin Ravi, et al. ICLR 2017.

class metallic.data.datasets.MiniImageNetClassDataset(root: str, meta_split: str = 'train', transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None, download: bool = False)[source]

A dataset composed of classes from mini-ImageNet.

Parameters
  • root (str) – Root directory of dataset

  • meta_split (str, optional, default='train') – Name of the split to be used: ‘train’ / ‘val’ / ‘test’

  • transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version

  • target_transform (Callable, optional) – A function/transform that takes in the target and transforms it

  • augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes.

  • download (bool, optional, default=False) – If True, downloads the dataset zip files from the internet and puts them in the root directory. If the zip files are already downloaded, they are not downloaded again.

create_cache() → None[source]

Iterates over the entire dataset and creates a map of target to samples and list of labels from scratch.

download() → None[source]

Downloads the dataset file from Google Drive.

Base

If you want to create your own dataset for meta-learning, the following base classes may be helpful.

class metallic.data.datasets.Dataset(index: int, data: list, class_label: int, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None)[source]

Bases: torch.utils.data.dataset.Dataset

A dataset containing all of the samples from a given class:

Dataset (a class)
├─────────┬─────────┐
│         │         │
sample1   sample2   ...
Parameters
  • index (int) – Index of the class

  • data (list) – A list of samples in the class

  • class_label (int) – Label of the class

  • transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version

  • target_transform (Callable, optional) – A function/transform that takes in the target and transforms it
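To make the role of these parameters concrete, here is a dependency-free sketch of a per-class dataset that follows the map-style protocol (__len__ and __getitem__). The class SingleClassDataset is hypothetical and does not subclass torch.utils.data.Dataset. Returning (sample, target) tuples and applying the transforms at access time are assumptions about the real class.

```python
from typing import Any, Callable, List, Optional, Tuple


class SingleClassDataset:
    """All samples of one class; every item carries the same class label."""

    def __init__(self, index: int, data: List[Any], class_label: int,
                 transform: Optional[Callable] = None,
                 target_transform: Optional[Callable] = None) -> None:
        self.index = index
        self.data = data
        self.class_label = class_label
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, i: int) -> Tuple[Any, Any]:
        sample, target = self.data[i], self.class_label
        if self.transform is not None:
            sample = self.transform(sample)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return sample, target


# Integers stand in for images so the example runs without PIL.
ds = SingleClassDataset(index=0, data=[1, 2, 3], class_label=7,
                        transform=lambda x: x * 10,
                        target_transform=lambda y: y + 1)
item = ds[1]
```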

class metallic.data.datasets.ClassDataset(root: str, meta_split: str, cache_path: str, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, augmentations: Optional[List[Callable]] = None)[source]

Bases: abc.ABC

Base class for a dataset composed of classes. Each item from a ClassDataset is a Dataset containing samples from the given class:

ClassDataset
├───────────────┬──────────────┐
│               │              │
class1          class2         ... (`Dataset`)
├─────────┬─────────┐
│         │         │
sample1   sample2   ...
Parameters
  • root (str) – Root directory of dataset

  • n_way (int) – Number of the classes per task

  • meta_split (str) – Name of the split to be used: ‘train’ / ‘val’ / ‘test’

  • cache_path (str) – Path to store the cache file

  • transform (Callable, optional) – A function/transform that takes in a PIL image and returns a transformed version

  • target_transform (Callable, optional) – A function/transform that takes in the target and transforms it

  • augmentations (List[Callable], optional) – A list of functions that augment the dataset with new classes.

abstract create_cache() → None[source]

Iterates over the entire dataset and creates a map of target to samples and list of labels from scratch.

load_cache() → None[source]

Loads the map of target to samples from the cache.

preprocess() → None[source]

save_cache() → None[source]

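One plausible way to combine cache creation, saving, and loading is "load the cache if the file exists, otherwise build it and save it". The sketch below shows that flow with pickle. The on-disk format, the helper names, and the control flow are all assumptions; the documentation above does not specify how metallic stores its cache or what preprocess() does.

```python
import os
import pickle
import tempfile
from typing import Callable, Dict, List


def save_cache_file(cache_path: str, class_to_samples: Dict[int, List[str]]) -> None:
    """Write the target-to-samples map to disk (pickle format is an assumption)."""
    with open(cache_path, "wb") as f:
        pickle.dump(class_to_samples, f)


def load_cache_file(cache_path: str) -> Dict[int, List[str]]:
    """Read the target-to-samples map back from disk."""
    with open(cache_path, "rb") as f:
        return pickle.load(f)


def load_or_create_cache(cache_path: str,
                         create_fn: Callable[[], Dict[int, List[str]]]):
    """Reuse an existing cache file; otherwise build the cache and save it."""
    if os.path.isfile(cache_path):
        return load_cache_file(cache_path)
    cache = create_fn()
    save_cache_file(cache_path, cache)
    return cache


calls = []


def create_cache_from_scratch() -> Dict[int, List[str]]:
    calls.append("scan")  # records how often the expensive scan runs
    return {0: ["a.png", "b.png"], 1: ["c.png"]}


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "cache.pkl")
    first = load_or_create_cache(cache_path, create_cache_from_scratch)
    second = load_or_create_cache(cache_path, create_cache_from_scratch)
```

In this sketch the second call reads the file written by the first, so the dataset is scanned only once.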
class metallic.data.datasets.TaskDataset(datasets: List[metallic.data.datasets.base.Dataset], n_classes: int)[source]

Bases: torch.utils.data.dataset.ConcatDataset

A dataset that concatenates the samples of the given classes:

TaskDataset
├────────┬────────┬────────┬────────┐
│        │        │        │        │
c1_s1    c1_s2    ...      c2_s1    ...
Parameters
  • datasets (List[Dataset]) – A list of Dataset instances to be concatenated

  • n_classes (int) – Number of the given classes
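The flat layout in the diagram above implies that a global sample index has to be mapped back to a class and a position within that class. The sketch below does this with cumulative sizes and a binary search, which is similar in spirit to how torch.utils.data.ConcatDataset resolves indices. The function locate and the class sizes are illustrative only.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def locate(cumulative_sizes: List[int], idx: int) -> Tuple[int, int]:
    """Map a flat index to (class position, index within that class)."""
    if not 0 <= idx < cumulative_sizes[-1]:
        raise IndexError("index out of range")
    dataset_idx = bisect_right(cumulative_sizes, idx)
    start = cumulative_sizes[dataset_idx - 1] if dataset_idx > 0 else 0
    return dataset_idx, idx - start


# Three classes holding 3, 2, and 4 samples respectively.
class_sizes = [3, 2, 4]
cumulative = list(accumulate(class_sizes))

first_of_second_class = locate(cumulative, 3)
last_sample = locate(cumulative, 8)
```

Here flat index 3 resolves to the first sample of the second class, and flat index 8 to the last sample of the third class.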

class metallic.data.datasets.MetaDataset(dataset: metallic.data.datasets.base.ClassDataset, n_way: int, k_shot_support: Optional[int] = None, k_shot_query: Optional[int] = None, shuffle: bool = True)[source]

Bases: torch.utils.data.dataset.Dataset

A dataset for fast indexing of samples within classes.

Parameters
  • dataset (ClassDataset) – An instance of ClassDataset class

  • n_way (int) – Number of classes per task

  • k_shot_support (int, optional) – Number of samples per class in support set

  • k_shot_query (int, optional) – Number of samples per class in query set

  • shuffle (bool, optional, default=True) – If True, the samples in a class are shuffled before being split into support and query sets

split_task(task: metallic.data.datasets.base.TaskDataset) → collections.OrderedDict[source]

Splits a TaskDataset into support and query sets, which contain k_shot_support and k_shot_query samples per class, respectively.
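The support/query split can be sketched on plain Python data as follows. The function split_task_sketch, the 'support' and 'query' keys, and the (sample, label) item layout are assumptions made for illustration; they are not taken from metallic's implementation.

```python
import random
from collections import OrderedDict
from typing import Any, Dict, List, Optional


def split_task_sketch(samples_by_class: Dict[int, List[Any]],
                      k_shot_support: int, k_shot_query: int,
                      shuffle: bool = True,
                      rng: Optional[random.Random] = None) -> "OrderedDict":
    """Split each class into disjoint support and query samples."""
    rng = rng or random.Random()
    support, query = [], []
    for label, samples in samples_by_class.items():
        if k_shot_support + k_shot_query > len(samples):
            raise ValueError(f"class {label} has too few samples")
        order = list(samples)
        if shuffle:
            rng.shuffle(order)  # shuffle within the class before splitting
        support += [(s, label) for s in order[:k_shot_support]]
        query += [(s, label)
                  for s in order[k_shot_support:k_shot_support + k_shot_query]]
    return OrderedDict([("support", support), ("query", query)])


task = {0: ["a0", "a1", "a2"], 1: ["b0", "b1", "b2"]}
split = split_task_sketch(task, k_shot_support=1, k_shot_query=2, shuffle=False)
```

With shuffle=False the first k_shot_support samples of each class go to the support set and the next k_shot_query samples to the query set, so no sample appears in both.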