torchvideo.datasets
Datasets
VideoDataset
-
class torchvideo.datasets.VideoDataset(root_path, label_set=None, sampler=FullVideoSampler(), transform=None)
Bases: torch.utils.data.dataset.Dataset
Abstract base class that all VideoDatasets inherit from. If you are implementing your own VideoDataset, you should inherit from this class.
- Parameters
-
__getitem__(index)
Load an example by index.
- Parameters
index (int) – Index of the example within the dataset.
- Return type
- Returns
Example transformed by transform if one was passed during instantiation; otherwise the example is converted to a tensor without any transformations applied to it. Additionally, if a label set is present, the method returns a tuple: (video_tensor, label).
-
labels = None
The labels corresponding to the examples in the dataset. To get the label for the example at index i, simply call dataset.labels[i], although the label is also returned by __getitem__ if this field is not None.
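The return contract described above can be illustrated with a small stand-in class. This is only a sketch and not torchvideo's implementation: `ToyVideoDataset` and its constructor arguments are hypothetical, and plain lists stand in for video tensors.

```python
from typing import Any, List, Optional, Sequence


class ToyVideoDataset:
    """Hypothetical stand-in that mimics the documented return contract."""

    def __init__(self, videos: Sequence[Any], labels: Optional[List[Any]] = None):
        self.videos = list(videos)
        # Mirrors the documented `labels` field: None when no labels are available.
        self.labels = labels

    def __len__(self) -> int:
        return len(self.videos)

    def __getitem__(self, index: int):
        video = self.videos[index]
        if self.labels is not None:
            # With labels present, each example is a (video, label) tuple.
            return video, self.labels[index]
        # Without labels, only the video itself is returned.
        return video


unlabelled = ToyVideoDataset(videos=[[0, 1], [2, 3]])
labelled = ToyVideoDataset(videos=[[0, 1], [2, 3]], labels=["run", "jump"])
```

Here `unlabelled[0]` yields just the video, while `labelled[1]` yields a `(video, label)` pair, and `labelled.labels[1]` gives the label without loading the video.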
ImageFolderVideoDataset
-
class torchvideo.datasets.ImageFolderVideoDataset(root_path, filename_template, filter=None, label_set=None, sampler=FullVideoSampler(), transform=None, frame_counter=None)
Bases: torchvideo.datasets.video_dataset.VideoDataset
Dataset stored as a folder containing folders of images, where each folder represents a video.
The folder hierarchy should look something like this:
root/video1/frame_000001.jpg
root/video1/frame_000002.jpg
root/video1/frame_000003.jpg
...
root/video2/frame_000001.jpg
root/video2/frame_000002.jpg
root/video2/frame_000003.jpg
root/video2/frame_000004.jpg
...
- Parameters
root_path (Union[str, Path]) – Path to dataset on disk. Contents of this folder should be example folders, each with frames named according to the filename_template argument.
filename_template (str) – Python 3 style formatting string describing frame filenames, e.g. "frame_{:06d}.jpg" for the example dataset in the class docstring.
filter (Optional[Callable[[Path], bool]]) – Optional filter callable that decides whether a given example folder is to be included in the dataset or not.
label_set (Optional[LabelSet]) – Optional label set for labelling examples.
sampler (FrameSampler) – Optional sampler for drawing frames from each video.
transform (Optional[Callable[[Iterator[Image]], Tensor]]) – Optional transform performed over the loaded clip.
frame_counter (Optional[Callable[[Path], int]]) – Optional callable used to determine the number of frames each video contains. The callable will be passed the path to a video folder and should return a positive integer representing the number of frames. This tends to be useful if you’ve precomputed the number of frames in a dataset.
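As a rough illustration of the filename_template and frame_counter arguments, the sketch below expands the template from the class docstring and looks up frame counts in a precomputed table. The helper `frame_path` and the `PRECOMPUTED_COUNTS` table are hypothetical names introduced for this example. The frame numbering follows the 1-based names in the listing above, which is an assumption about how the library indexes frames.

```python
from pathlib import Path
from typing import Dict

# Template from the class docstring: a zero-padded, six-digit frame number.
filename_template = "frame_{:06d}.jpg"


def frame_path(video_dir: Path, frame_number: int) -> Path:
    """Hypothetical helper: build the path of one frame inside a video folder."""
    return video_dir / filename_template.format(frame_number)


# Hypothetical precomputed frame counts, keyed by video folder name.
PRECOMPUTED_COUNTS: Dict[str, int] = {"video1": 3, "video2": 4}


def frame_counter(video_dir: Path) -> int:
    """Return the stored frame count instead of listing the folder on disk."""
    return PRECOMPUTED_COUNTS[video_dir.name]


first_frame = frame_path(Path("root/video1"), 1)
```

A callable with the shape of `frame_counter` (path in, positive integer out) is what the frame_counter parameter expects.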
-
__getitem__(index)
Load an example by index.
- Parameters
index (int) – Index of the example within the dataset.
- Return type
- Returns
Example transformed by transform if one was passed during instantiation; otherwise the example is converted to a tensor without any transformations applied to it. Additionally, if a label set is present, the method returns a tuple: (video_tensor, label).
VideoFolderDataset
-
class torchvideo.datasets.VideoFolderDataset(root_path, filter=None, label_set=None, sampler=FullVideoSampler(), transform=None, frame_counter=None)
Bases: torchvideo.datasets.video_dataset.VideoDataset
Dataset stored as a folder of videos, where each video is a single example in the dataset.
The folder hierarchy should look something like this:
root/video1.mp4
root/video2.mp4
...
- Parameters
root_path (Union[str, Path]) – Path to dataset folder on disk. The contents of this folder should be video files.
filter (Optional[Callable[[Path], bool]]) – Optional filter callable that decides whether a given example video is to be included in the dataset or not.
label_set (Optional[LabelSet]) – Optional label set for labelling examples.
sampler (FrameSampler) – Optional sampler for drawing frames from each video.
transform (Optional[Callable[[Iterator[Image]], Tensor]]) – Optional transform over the list of frames.
frame_counter (Optional[Callable[[Path], int]]) – Optional callable used to determine the number of frames each video contains. The callable will be passed the path to a video and should return a positive integer representing the number of frames. This tends to be useful if you’ve precomputed the number of frames in a dataset.
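A filter callable for this dataset takes a Path and returns a bool. The following is a minimal sketch of one that keeps only files with a video extension. The name `only_videos` and the set of suffixes are illustrative choices, not part of the library.

```python
from pathlib import Path

# Illustrative set of extensions to accept; adjust to your own data.
VIDEO_SUFFIXES = {".mp4", ".avi"}


def only_videos(path: Path) -> bool:
    """Return True for paths whose extension looks like a video file."""
    return path.suffix.lower() in VIDEO_SUFFIXES


candidates = [
    Path("root/video1.mp4"),
    Path("root/notes.txt"),
    Path("root/video2.MP4"),
]
# Emulate how a filter would narrow down the folder contents.
kept = [p for p in candidates if only_videos(p)]
```

A function like `only_videos` could then be passed as the filter argument so that stray non-video files in the folder are skipped.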
-
__getitem__(index)
Load an example by index.
- Parameters
index (int) – Index of the example within the dataset.
- Return type
- Returns
Example transformed by transform if one was passed during instantiation; otherwise the example is converted to a tensor without any transformations applied to it. Additionally, if a label set is present, the method returns a tuple: (video_tensor, label).
GulpVideoDataset
-
class torchvideo.datasets.GulpVideoDataset(root_path, *, gulp_directory=None, filter=None, label_field=None, label_set=None, sampler=FullVideoSampler(), transform=None)
Bases: torchvideo.datasets.video_dataset.VideoDataset
GulpIO Video dataset.
The folder hierarchy should look something like this:
root/data_0.gulp
root/data_1.gulp
...
root/meta_0.gmeta
root/meta_1.gmeta
...
- Parameters
root_path (Union[str, Path]) – Path to GulpIO dataset folder on disk. The .gulp and .gmeta files are direct children of this directory.
filter (Optional[Callable[[str], bool]]) – Filter function that determines whether a video is included in the dataset. The filter is called on each video id, and should return True to include the video and False to exclude it.
label_field (Optional[str]) – Metadata field name that stores the label of an example. This is used to construct a GulpLabelSet that performs the example labelling. Defaults to 'label'.
label_set (Optional[LabelSet]) – Optional label set for labelling examples. This is mutually exclusive with label_field.
sampler (FrameSampler) – Optional sampler for drawing frames from each video.
transform (Optional[Callable[[ndarray], Tensor]]) – Optional transform over the ndarray with layout THWC. Note you’ll probably want to remap the channels to CTHW at the end of this transform.
gulp_directory (Optional[GulpDirectory]) – Optional gulp directory residing at root_path. Useful if you wish to create a custom label_set using the gulp_directory, which you can then pass in with the gulp_directory itself to avoid reading the gulp metadata twice.
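The THWC to CTHW remapping mentioned for transform comes down to an axis permutation. The sketch below derives that permutation from the two layout strings and applies it to a shape tuple, using only the standard library. The helper names and the example clip shape are hypothetical. The resulting tuple is the kind of axes argument you would hand to an array library's transpose or permute call.

```python
from typing import Sequence, Tuple


def layout_permutation(src: str, dst: str) -> Tuple[int, ...]:
    """For each axis in dst, find its position in src."""
    if sorted(src) != sorted(dst):
        raise ValueError(f"layouts {src!r} and {dst!r} name different axes")
    return tuple(src.index(axis) for axis in dst)


def permute_shape(shape: Sequence[int], perm: Sequence[int]) -> Tuple[int, ...]:
    """Apply an axis permutation to a shape: output axis i is input axis perm[i]."""
    return tuple(shape[i] for i in perm)


perm = layout_permutation("THWC", "CTHW")

# Hypothetical clip: 8 frames of 224x224 RGB in THWC layout.
thwc_shape = (8, 224, 224, 3)
cthw_shape = permute_shape(thwc_shape, perm)
```

With these inputs the permutation is `(3, 0, 1, 2)`, which moves the channel axis to the front and keeps time, height and width in order.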
-
__getitem__(index)
Load an example by index.
- Parameters
index – Index of the example within the dataset.
- Return type
- Returns
Example transformed by transform if one was passed during instantiation; otherwise the example is converted to a tensor without any transformations applied to it. Additionally, if a label set is present, the method returns a tuple: (video_tensor, label).
Label Sets
Label sets are an abstraction over how your video data is labelled. This provides
flexibility in swapping out different storage methods and labelling methods. All
datasets optionally take a LabelSet that performs the mapping between
example and label.
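To make the idea concrete, here is a minimal sketch of a label set backed by a plain dict. `DictLabelSet` is a hypothetical class written for this example and does not inherit from the library's LabelSet, whose required methods are not listed on this page. The lookup by video name via `__getitem__` is an assumption modelled on the CsvLabelSet usage shown later in this section.

```python
from typing import Dict


class DictLabelSet:
    """Hypothetical label set: maps a video name to its label via a dict."""

    def __init__(self, labels: Dict[str, int]):
        self._labels = dict(labels)

    def __getitem__(self, video_name: str) -> int:
        # Look up the label for one example by its name.
        return self._labels[video_name]


label_set = DictLabelSet({"video1": 1, "video2": 2})
video1_label = label_set["video1"]
```

Because the dataset only asks the label set for a label per example, the storage behind it (a dict, a CSV file, GulpIO metadata) can be swapped without touching the dataset code.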
LabelSet
DummyLabelSet
GulpLabelSet
-
class torchvideo.datasets.GulpLabelSet(merged_meta_dict, label_field='label')
Bases: torchvideo.datasets.label_sets.label_set.LabelSet
LabelSet for GulpIO datasets where the label is contained within the metadata of the gulp directory. Assuming you’ve written the label of each video to a field called 'label' in the metadata, you can create a LabelSet like:
GulpLabelSet(gulp_dir.merged_meta_dict, label_field='label')
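The kind of lookup such a label set performs can be sketched with a plain dict. The nesting used here (video id, then a "meta_data" list whose first entry holds the fields) is an assumption about the shape of the merged metadata, and `lookup_label` is a hypothetical helper, not the library's code.

```python
from typing import Any, Dict

# Assumed shape of the merged metadata: one entry per video id, with the
# user-written fields stored in the first element of a "meta_data" list.
merged_meta_dict: Dict[str, Dict[str, Any]] = {
    "video1": {"meta_data": [{"label": "run"}]},
    "video2": {"meta_data": [{"label": "jump"}]},
}


def lookup_label(
    meta: Dict[str, Dict[str, Any]], video_id: str, label_field: str = "label"
) -> Any:
    """Hypothetical helper: read one metadata field for a given video id."""
    return meta[video_id]["meta_data"][0][label_field]


video2_label = lookup_label(merged_meta_dict, "video2")
```

Changing `label_field` selects a different metadata field as the label, which is the role the label_field argument plays in the constructor above.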
CsvLabelSet
-
class torchvideo.datasets.CsvLabelSet(df, col='label')
Bases: torchvideo.datasets.label_sets.label_set.LabelSet
LabelSet for a pandas DataFrame or Series. The index of the DataFrame/Series is assumed to be the set of video names, and the values in a Series are the labels. For a DataFrame, the col kwarg specifies which column to use as the label.
Examples
>>> import pandas as pd
>>> from torchvideo.datasets import CsvLabelSet
>>> df = pd.DataFrame({'video': ['video1', 'video2'],
...                    'label': [1, 2]}).set_index('video')
>>> label_set = CsvLabelSet(df, col='label')
>>> label_set['video1']
1
- Parameters