FVD Object

The main FVD functionality is implemented in cdfvd.fvd.cdfvd, which initializes the model used to compute FVD features, provides helper functions to load videos in different formats, and performs the calculation.

class cdfvd.fvd.cdfvd

This class loads a pretrained model (I3D or VideoMAE) and contains functions to compute the FVD score between real and fake videos.

Parameters:
  • model – Name of the model to use, either videomae or i3d.

  • n_real – Number of real videos to use for computing the FVD score. If 'full', all videos in the dataset are used.

  • n_fake – Number of fake videos to use for computing the FVD score.

  • ckpt_path – Path to save the model checkpoint.

  • seed – Random seed.

  • compute_feats – Whether to compute all features or just mean and covariance.

  • device – Device to use for computing the features.

  • half_precision – Whether to use half precision for the model.
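As a minimal sketch of constructing the evaluator, the helper below validates the model name before instantiating the class. The helper name build_evaluator is hypothetical, the import path is inferred from the class path cdfvd.fvd.cdfvd, and the keyword names come from the parameter list above. Default values are not documented here, so none are assumed.

```python
def build_evaluator(model="videomae", **kwargs):
    """Hypothetical helper: check the model name, then build the evaluator."""
    supported = ("videomae", "i3d")
    if model not in supported:
        raise ValueError(f"model must be one of {supported}, got {model!r}")
    # Imported lazily so the name check works even without the package installed.
    # Assumption: the class is importable as cdfvd.fvd.cdfvd, per the class path above.
    from cdfvd import fvd
    return fvd.cdfvd(model=model, **kwargs)
```

For example, build_evaluator("i3d", n_real=2048, n_fake=2048, seed=0) would forward those keywords to the constructor; the numbers are illustrative only.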

compute_fvd_from_stats(fake_stats: FeatureStats | None = None, real_stats: FeatureStats | None = None) → float

This function computes the FVD score between real and fake videos using precomputed features. If the stats are not provided, it uses the stats stored in the object.

Parameters:
  • fake_stats (FeatureStats | None) – FeatureStats object containing the features of the fake videos.

  • real_stats (FeatureStats | None) – FeatureStats object containing the features of the real videos.

Returns:

FVD score between the real and fake videos.

Return type:

float
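FVD is a Fréchet distance between Gaussians fitted to the real and fake feature sets, so the score depends only on each set's mean and covariance. The sketch below shows the standard formula with NumPy. It is an illustration, not the library's implementation, and the function name frechet_distance and its arguments are hypothetical.

```python
import numpy as np

def frechet_distance(mu_real, sigma_real, mu_fake, sigma_fake):
    """Squared Frechet distance between N(mu_real, sigma_real) and N(mu_fake, sigma_fake)."""
    mu_real = np.asarray(mu_real, dtype=np.float64)
    mu_fake = np.asarray(mu_fake, dtype=np.float64)
    sigma_real = np.asarray(sigma_real, dtype=np.float64)
    sigma_fake = np.asarray(sigma_fake, dtype=np.float64)

    # Squared Euclidean distance between the two means.
    mean_term = np.sum((mu_real - mu_fake) ** 2)

    # Tr((sigma_real @ sigma_fake)^(1/2)) equals the sum of the square roots of
    # the eigenvalues of the product, which are real and non-negative for
    # positive semi-definite inputs. Clip tiny negatives caused by round-off.
    eigvals = np.linalg.eigvals(sigma_real @ sigma_fake)
    sqrt_trace = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))

    return float(mean_term + np.trace(sigma_real) + np.trace(sigma_fake) - 2.0 * sqrt_trace)
```

Identical statistics give a distance of zero, and the value grows as the means or covariances drift apart.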

compute_fvd(real_videos: ndarray[Any, dtype[uint8]], fake_videos: ndarray[Any, dtype[uint8]]) → float

This function computes the FVD score between real and fake videos in the form of numpy arrays.

Parameters:
  • real_videos (ndarray[Any, dtype[uint8]]) – A numpy array of videos with shape (B, T, H, W, C), values in the range [0, 255].

  • fake_videos (ndarray[Any, dtype[uint8]]) – A numpy array of videos with shape (B, T, H, W, C), values in the range [0, 255].

Returns:

FVD score between the real and fake videos.

Return type:

float
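Because compute_fvd expects uint8 arrays of shape (B, T, H, W, C), generated videos often need converting first. A minimal sketch, assuming floating-point inputs are scaled to [0, 1]; the helper name to_uint8_videos is hypothetical and not part of the library.

```python
import numpy as np

def to_uint8_videos(videos):
    """Return a (B, T, H, W, C) uint8 array with values in [0, 255]."""
    videos = np.asarray(videos)
    if videos.ndim != 5:
        raise ValueError(f"expected shape (B, T, H, W, C), got {videos.shape}")
    if videos.dtype == np.uint8:
        return videos  # already in the expected format
    if np.issubdtype(videos.dtype, np.floating):
        # Assumption: float inputs lie in [0, 1]; scale, round, and clip.
        return np.clip(np.rint(videos * 255.0), 0, 255).astype(np.uint8)
    raise TypeError(f"unsupported dtype {videos.dtype}")
```

The converted arrays can then be passed as real_videos and fake_videos.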

compute_real_stats(loader: DataLoader | List | None = None) → FeatureStats

This function computes the real features from a dataset.

Parameters:

loader (DataLoader | List | None) – Real videos, given either as a dataloader or as a list of numpy arrays.

Returns:

FeatureStats object containing the features of the real videos.

Return type:

FeatureStats

compute_fake_stats(loader: DataLoader | List | None = None) → FeatureStats

This function computes the fake features from a dataset.

Parameters:

loader (DataLoader | List | None) – Fake videos, given either as a dataloader or as a list of numpy arrays.

Returns:

FeatureStats object containing the features of the fake videos.

Return type:

FeatureStats

add_real_stats(real_videos: ndarray[Any, dtype[uint8]])

This function adds features of real videos to the real_stats object.

Parameters:

real_videos (ndarray[Any, dtype[uint8]]) – A numpy array of videos with shape (B, T, H, W, C), values in the range [0, 255].

add_fake_stats(fake_videos: ndarray[Any, dtype[uint8]])

This function adds features of fake videos to the fake_stats object.

Parameters:

fake_videos (ndarray[Any, dtype[uint8]]) – A numpy array of videos with shape (B, T, H, W, C), values in the range [0, 255].

empty_real_stats()

This function empties the real_stats object.

empty_fake_stats()

This function empties the fake_stats object.
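When the videos do not fit in memory at once, the add and empty methods above can be combined to accumulate statistics batch by batch. A minimal sketch, assuming evaluator is an already constructed cdfvd object and each batch is a uint8 array of shape (B, T, H, W, C); the helper name accumulate_and_score is hypothetical.

```python
def accumulate_and_score(evaluator, real_batches, fake_batches):
    """Feed batches into the stored stats, then score them."""
    # Start from a clean state so earlier runs do not leak into this score.
    evaluator.empty_real_stats()
    evaluator.empty_fake_stats()

    for batch in real_batches:
        evaluator.add_real_stats(batch)
    for batch in fake_batches:
        evaluator.add_fake_stats(batch)

    # With no arguments, the stats stored in the object are used.
    return evaluator.compute_fvd_from_stats()
```

Both batch arguments can be any iterable, including a generator that yields one batch at a time.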

save_real_stats(path: str)

This function saves the real_stats object to a file.

Parameters:

path (str) – Path to save the real_stats object.

load_real_stats(path: str)

This function loads the real_stats object from a file.

Parameters:

path (str) – Path to load the real_stats object.
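Real-video statistics are usually the expensive part of an evaluation and rarely change, so caching them on disk is a natural use of the save and load methods. A minimal sketch, assuming evaluator is an already constructed cdfvd object and that compute_real_stats also stores its result in the object (suggested by the fallback behaviour of compute_fvd_from_stats, but an assumption here). The helper name real_stats_with_cache is hypothetical.

```python
import os

def real_stats_with_cache(evaluator, cache_path, loader):
    """Load real stats from cache_path if present, else compute and save them.

    Returns True on a cache hit and False when the stats were recomputed.
    """
    if os.path.exists(cache_path):
        evaluator.load_real_stats(cache_path)
        return True
    # Assumption: this call leaves the computed stats stored in the object,
    # so that save_real_stats has something to write.
    evaluator.compute_real_stats(loader)
    evaluator.save_real_stats(cache_path)
    return False
```

Delete the cache file whenever the real dataset, the model, or the clip settings change, since stale statistics would silently skew the score.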

load_videos(video_info: str, resolution: int = 256, sequence_length: int = 16, sample_every_n_frames: int = 1, data_type: str = 'video_numpy', num_workers: int = 4, batch_size: int = 16) → DataLoader | List | None

This function loads videos in the way specified by data_type. video_numpy loads videos from a file containing a numpy array with the shape (B, T, H, W, C). video_folder loads videos from a folder containing video files. image_folder loads videos from a folder containing image files. stats_pkl indicates that video_info is the name of a dataset with pre-computed features; the currently supported names are ucf101, kinetics, sky, ffs, and taichi.

Parameters:
  • video_info (str) – Path to the video file or folder, or a dataset name when data_type is stats_pkl.

  • resolution (int) – Resolution of the video.

  • sequence_length (int) – Length of the video sequence.

  • sample_every_n_frames (int) – Number of frames to skip.

  • data_type (str) – Type of the video data, either video_numpy, video_folder, image_folder, or stats_pkl.

  • num_workers (int) – Number of workers for the dataloader.

  • batch_size (int) – Batch size for the dataloader.

Returns:

Dataloader or list of numpy arrays containing the videos.

Return type:

DataLoader | List | None
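To make the interaction between sequence_length and sample_every_n_frames concrete, the sketch below lists which frame indices a clip would cover if sample_every_n_frames is treated as a stride and the clip starts at a given frame. Both of those are assumptions for illustration. The library's actual sampling logic is not documented here, and the helper name clip_frame_indices is hypothetical.

```python
def clip_frame_indices(num_frames, sequence_length=16, sample_every_n_frames=1, start=0):
    """Frame indices of one clip, assuming a fixed stride between sampled frames."""
    if sequence_length < 1 or sample_every_n_frames < 1 or start < 0:
        raise ValueError("sequence_length and stride must be >= 1, start must be >= 0")
    # Index of the final sampled frame in the clip.
    last = start + (sequence_length - 1) * sample_every_n_frames
    if last >= num_frames:
        raise ValueError("video is too short for the requested clip")
    return list(range(start, last + 1, sample_every_n_frames))
```

Under these assumptions, a 64-frame video with a sequence length of 16 and a stride of 2 yields the frames 0, 2, ..., 30, so larger strides need proportionally longer source videos.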

offload_model_to_cpu()

This function offloads the model to the CPU to release the memory.
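Putting the methods on this page together, a full evaluation can be sketched as follows. This assumes evaluator is an already constructed cdfvd object and that both paths hold data in the same data_type; the helper name fvd_between_paths is hypothetical. The statistics are passed explicitly to compute_fvd_from_stats so the sketch does not rely on what is stored in the object.

```python
def fvd_between_paths(evaluator, real_path, fake_path, data_type="video_folder", **load_kwargs):
    """Load both video sets, compute their stats, and return the FVD score."""
    real_loader = evaluator.load_videos(real_path, data_type=data_type, **load_kwargs)
    real_stats = evaluator.compute_real_stats(real_loader)

    fake_loader = evaluator.load_videos(fake_path, data_type=data_type, **load_kwargs)
    fake_stats = evaluator.compute_fake_stats(fake_loader)

    score = evaluator.compute_fvd_from_stats(fake_stats=fake_stats, real_stats=real_stats)

    # Release the model's memory once feature extraction is finished.
    evaluator.offload_model_to_cpu()
    return score
```

Extra keyword arguments such as resolution, sequence_length, or batch_size are forwarded unchanged to load_videos.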