cutcutcodec.core.analysis.video.metric

Image metrics.

Functions

compare(ref, dis, **kwargs)

Compare 2 video files with different metrics.

lpips(ref, dis, *args, **kwargs)

Compute the Learned Perceptual Image Patch Similarity.

psnr(ref, dis, *args, **kwargs)

Compute the peak signal to noise ratio of 2 images.

ssim(ref, dis, *args[, stride])

Compute the Structural similarity index measure of 2 images.

uvq(dis[, _model])

Compute the Perceptual Video Quality.

Details

cutcutcodec.core.analysis.video.metric.compare(ref: Path | str | bytes, dis: Path | str | bytes, **kwargs) → dict[str, list[float]]

Compare 2 video files with different metrics.

Parameters

ref : pathlike

The reference video file.

dis : pathlike

The distorted video.

lpips_alex : boolean, default=False

If True, compute the lpips with alex (medium).

lpips_vgg : boolean, default=False

If True, compute the lpips with vgg (slow).

psnr : boolean, default=False

If True, compute the psnr (very fast).

ssim : boolean, default=False

If True, compute the ssim (slow).

uvq : boolean, default=False

If True, compute the uvq on the dis video (very slow). It returns only one value per second.

vmaf : boolean, default=False

If True, compute the vmaf (medium).

Returns

metrics : dict[str, list[float]]

Each metric name is associated with the list of scalar values, one per frame. All numbers are rounded to 4 decimal places.

Notes

Frames are converted to yuv if they are not already, then the distorted video is converted to the color space of the reference video.
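
Since uvq yields one value per second while the other metrics yield one value per frame, the returned lists can differ in length, so it is usually simpler to reduce each metric on its own. A minimal sketch, assuming only the return layout described above; `summarize` is a hypothetical helper, not part of cutcutcodec, and the input values are copied from the uvq example output on this page:

```python
from statistics import fmean


def summarize(metrics: dict[str, list[float]]) -> dict[str, dict[str, float]]:
    """Reduce each per-frame (or per-second) list to its min, mean and max."""
    return {
        name: {
            "min": min(values),
            "mean": round(fmean(values), 4),
            "max": max(values),
        }
        for name, values in metrics.items()
        if values  # skip metrics that produced no value
    }


# Same layout as the dictionary returned by compare(..., uvq=True).
res = {"uvq": [2.9175, 2.8559, 2.7784, 3.1725, 3.5817, 3.7688, 3.0215, 2.944, 2.697, 3.4718]}
summary = summarize(res)
```
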

Examples

>>> import pprint
>>> from cutcutcodec.core.analysis.video.metric import compare
>>> res = compare(
...     "media/video/intro.webm", "media/video/intro.webm",
...     lpips_alex=True, psnr=True, ssim=True
... )
>>> pprint.pprint(res)  
{'lpips_alex': [0.0,
                0.0,
                ...,
                0.0,
                0.0],
 'psnr': [100.0,
          100.0,
          ...,
          100.0,
          100.0],
 'ssim': [1.0,
          1.0,
          ...,
          1.0,
          1.0]}
>>> compare(None, "media/video/intro.webm", uvq=True)
{'uvq': [2.9175, 2.8559, 2.7784, 3.1725, 3.5817, 3.7688, 3.0215, 2.944, 2.697, 3.4718]}
>>>
cutcutcodec.core.analysis.video.metric.lpips(ref: Tensor, dis: Tensor, *args, **kwargs) → Tensor

Compute the Learned Perceptual Image Patch Similarity.

It uses the lpips module (pip install lpips) as backend, which is based on torch.

Parameters

ref, dis : arraylike

The 2 images to be compared, of shape ([*batch], height, width, channels). The frames are assumed to be in RGB in range [0, 1]. Gamut and EOTF must be standard rgb.

net : str, default="alex"

The neural network used, "alex" or "vgg".

threads : int, optional

Defines the number of threads. The value -1 means that the function uses as many calculation threads as there are cores. The default value (0) allows the same behavior as (-1) if the function is called in the main thread, otherwise (1) to avoid nested threads. Any other positive value corresponds to the number of threads used.
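
The threads convention described above can be sketched as follows. This is an illustration of the documented behavior, not the library's code: `resolve_threads` is a hypothetical name, and rejecting values below -1 is an assumption.

```python
import os
import threading


def resolve_threads(threads: int = 0) -> int:
    """Map the documented `threads` convention to an actual thread count."""
    if threads < -1:
        # Assumption: values below -1 are not documented, so they are rejected here.
        raise ValueError(f"threads must be >= -1, got {threads}")
    n_cores = os.cpu_count() or 1  # os.cpu_count() may return None
    if threads == -1:
        return n_cores  # as many threads as there are cores
    if threads == 0:
        # Default: all cores from the main thread, 1 elsewhere to avoid nested threads.
        in_main = threading.current_thread() is threading.main_thread()
        return n_cores if in_main else 1
    return threads  # any other positive value is used as is
```
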

Returns

lpips : arraylike

The learned perceptual image patch similarity of each layer.

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.metric import lpips
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> lpips(ref, dis).round(1)
np.float64(0.0)
>>>
cutcutcodec.core.analysis.video.metric.psnr(ref: Tensor, dis: Tensor, *args, **kwargs) → Tensor

Compute the peak signal to noise ratio of 2 images.

Parameters

ref, dis : arraylike

The 2 images to be compared, of shape ([*batch], height, width, channels). Supported types are float32 and float64.

weights : iterable[float], optional

The relative weight of each channel. By default, all channels have the same weight.

threads : int, optional

Defines the number of threads. The value -1 means that the function uses as many calculation threads as there are cores. The default value (0) allows the same behavior as (-1) if the function is called in the main thread, otherwise (1) to avoid nested threads. Any other positive value corresponds to the number of threads used.

Returns

psnr : arraylike

The global peak signal to noise ratio, computed from a weighted average of the mean squared error of each channel. It is batched and clamped to [0, 100] dB.

Notes

  • It is optimized for C contiguous tensors.

  • If the device is cpu and no gradient is required, fast C code is used instead of the torch code.
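
The computation described in Returns can be sketched in pure Python. This is a simplified, non-batched illustration, not the library's implementation: `psnr_sketch` is a hypothetical name, and it assumes samples in [0, 1], hence a peak value of 1.

```python
import math


def psnr_sketch(ref, dis, weights=None):
    """PSNR of two images given as lists of pixels, each pixel a sequence of channels.

    Assumes samples in [0, 1], so the peak value is 1.
    """
    n_channels = len(ref[0])
    weights = list(weights) if weights is not None else [1.0] * n_channels
    total = sum(weights)
    # Mean squared error of each channel, taken over all pixels.
    mse = [
        sum((r[c] - d[c]) ** 2 for r, d in zip(ref, dis)) / len(ref)
        for c in range(n_channels)
    ]
    # Weighted average of the per-channel errors.
    weighted_mse = sum(w * m for w, m in zip(weights, mse)) / total
    if weighted_mse == 0.0:
        return 100.0  # identical images hit the upper clamp
    return min(100.0, max(0.0, -10.0 * math.log10(weighted_mse)))
```

With equal weights, an error confined to one of three channels is diluted by the two clean channels, while giving that channel all the weight reports its error alone.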

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.metric import psnr
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> psnr(ref, dis).round(1)
np.float64(21.8)
>>>
cutcutcodec.core.analysis.video.metric.ssim(ref: Tensor, dis: Tensor, *args, stride: int = 1, **kwargs) → Tensor

Compute the Structural similarity index measure of 2 images.

Parameters

ref, dis : arraylike

The 2 images to be compared, of shape ([*batch], height, width, channels). Supported types are float32 and float64.

data_range : float, default=1.0

The data range of the input image (difference between maximum and minimum possible values).

weights : iterable[float], optional

The relative weight of each channel. By default, all channels have the same weight.

sigma : float, default=1.5

The standard deviation of the Gaussian. It has to be strictly positive.

stride : int, default=1

The stride of the convolving kernel.

threads : int, optional

Defines the number of threads. The value -1 means that the function uses as many calculation threads as there are cores. The default value (0) allows the same behavior as (-1) if the function is called in the main thread, otherwise (1) to avoid nested threads. Any other positive value corresponds to the number of threads used.

Returns

ssim : arraylike

The weighted structural similarity index measure of each layer.

Notes

  • It is optimized for C contiguous tensors.

  • If the device is cpu, no gradient is required and stride != 1, fast C code is used.
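
To illustrate the role of data_range, here is a deliberately simplified sketch of the SSIM formula on one channel, using a single global window. It ignores the Gaussian window (sigma), the stride and the channel weights, so it is not the library's implementation; `ssim_global` is a hypothetical name, and the constants 0.01 and 0.03 are the usual values from the original SSIM formulation, assumed here.

```python
from statistics import fmean


def ssim_global(x, y, data_range=1.0):
    """SSIM of two equally sized, flattened single-channel images, using one global window."""
    c1 = (0.01 * data_range) ** 2  # stabilizers, scaled by the data range
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean((a - mu_x) ** 2 for a in x)
    var_y = fmean((b - mu_y) ** 2 for b in y)
    cov = fmean((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
    )
```
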

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.metric import ssim
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> ssim(ref, dis).round(2)
np.float64(0.95)
>>>
cutcutcodec.core.analysis.video.metric.uvq(dis: Tensor, _model=None) → Tensor

Compute the Perceptual Video Quality.

Parameters

dis : arraylike

The frames to be evaluated, of shape ([*batch], fps=5, height, width, channels=3). The framerate is assumed to be 5 Hz. The frames are assumed to be in RGB in range [0, 1]. Gamut and EOTF must be standard rgb.

Returns

uvq : arraylike


The perceptual video quality measure for each group of 5 images.
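
Since each score covers a group of 5 frames, a clip already sampled at 5 Hz has to be cut into one-second groups before it matches the expected shape. A minimal sketch of that grouping on a plain sequence; `group_frames` is a hypothetical helper, and dropping a trailing incomplete group is an assumption, not documented behavior:

```python
def group_frames(frames, group_size=5):
    """Split a frame sequence into consecutive groups of `group_size` frames.

    Assumption: a trailing incomplete group is dropped.
    """
    n_groups = len(frames) // group_size
    return [frames[i * group_size:(i + 1) * group_size] for i in range(n_groups)]


# 12 frames sampled at 5 Hz give 2 complete one-second groups.
groups = group_frames(list(range(12)))
```
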

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.metric import uvq
>>> np.random.seed(0)
>>> dis = np.random.random((5, 720, 1080, 3))  # It could also be a torch array list...
>>> uvq(dis).round(1)
np.float32(3.3)
>>>

Modules

lpips_torch

Compute a differentiable batched torch lpips.

metric

This module, implemented in C, offers functions for image metric calculation.

psnr_torch

Compute a differentiable batched torch psnr.

ssim_torch

Compute a differentiable batched torch ssim.

utils

Helper for metrics.

uvq_google

Universal Video Quality Model.

vmaf(ref, dis[, threads])

Call the Netflix vmaf metric on the frames.