cutcutcodec.core.analysis.video.quality

Video quality metrics.

Functions

lpips(dis, ref, *args, **kwargs)

Compute the Learned Perceptual Image Patch Similarity.

psnr(dis, ref, *args, **kwargs)

Compute the peak signal to noise ratio of 2 images.

ssim(dis, ref, *args[, stride])

Compute the structural similarity index measure of 2 images.

uvq(dis, *[, _model])

Compute the Perceptual Video Quality.

vif(dis, ref)

Compute the visual information fidelity of 2 images.

vmaf(dis, ref, *[, _model])

Compute the Video Multi-Method Assessment Fusion of 2 images.

Details

cutcutcodec.core.analysis.video.quality.lpips(dis: Tensor, ref: Tensor, *args, **kwargs) → Tensor

Compute the Learned Perceptual Image Patch Similarity.

It relies on the lpips package (pip install lpips) as a backend, which is based on torch.

Parameters

dis, ref : arraylike

The 2 images to be compared, of shape ([*batch], height, width, channels=3). The frames are assumed to be in RGB (r’g’b’) in range [0, 1]. Gamut and EOTF must be standard rgb.

net : str, default="alex"

The neural network used, "alex" or "vgg".

threads : int, optional

The number of threads to use. The value -1 uses as many computation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.
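The threads convention described above can be sketched in a few lines. This is only an illustration of the documented rule, not the library's code: the helper name resolve_threads is hypothetical, and using os.cpu_count() as the number of cores is an assumption.

```python
import os
import threading


def resolve_threads(threads: int = 0) -> int:
    """Hypothetical helper: map the documented ``threads`` value to a thread count."""
    if threads < -1:
        raise ValueError("threads must be -1, 0 or a positive integer")
    n_cores = os.cpu_count() or 1  # assumption: one computation thread per core
    if threads == -1:
        return n_cores
    if threads == 0:
        # Same as -1 in the main thread, 1 elsewhere to avoid nested threads.
        in_main = threading.current_thread() is threading.main_thread()
        return n_cores if in_main else 1
    return threads
```

Called from a worker thread with the default value, this sketch returns 1, which is the "no nested threads" case of the description.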

Returns

lpips : arraylike

The learned perceptual image patch similarity of each image.
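Frames often come as uint8 data in channel-first layout, whereas the shape documented above is channel-last with values in [0, 1]. A minimal sketch of that preparation step, assuming 8-bit full-range RGB input; the helper name to_unit_rgb_channel_last is hypothetical and not part of the module.

```python
import numpy as np


def to_unit_rgb_channel_last(frames_u8: np.ndarray) -> np.ndarray:
    """Hypothetical helper: ([*batch], 3, height, width) uint8 -> ([*batch], height, width, 3) float32."""
    if frames_u8.dtype != np.uint8:
        raise TypeError("expected uint8 frames")
    if frames_u8.ndim < 3 or frames_u8.shape[-3] != 3:
        raise ValueError("expected 3 channels on axis -3")
    channel_last = np.moveaxis(frames_u8, -3, -1)  # move the channel axis to the end
    return channel_last.astype(np.float32) / 255.0  # rescale [0, 255] to [0, 1]
```

The result can then be passed as dis or ref, provided the pixel values really are standard RGB.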

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import lpips
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> lpips(dis, ref).round(1)
np.float64(0.0)

cutcutcodec.core.analysis.video.quality.psnr(dis: Tensor, ref: Tensor, *args, **kwargs) → Tensor

Compute the peak signal to noise ratio of 2 images.

Parameters

dis, ref : arraylike

The 2 images to be compared, of shape ([*batch], height, width, channels). Supported types are float32 and float64.

weights : iterable[float], optional

The relative weight of each channel. By default, all channels have the same weight.

threads : int, optional

The number of threads to use. The value -1 uses as many computation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.

Returns

psnr : arraylike

The global peak signal to noise ratio, computed from the weighted mean square error of the channels. It is batched and clamped to [0, 100] dB.
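The returned value can be illustrated with a plain numpy sketch of a weighted, clamped PSNR. It assumes a peak value of 1.0 (images in [0, 1]) and weights normalized to sum to 1; both are assumptions of this example, and psnr_sketch is a hypothetical name, not the library's implementation.

```python
import numpy as np


def psnr_sketch(dis, ref, weights=None, peak=1.0):
    """Hypothetical reference: weighted PSNR of ([*batch], height, width, channels) images."""
    dis = np.asarray(dis, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    n_channels = dis.shape[-1]
    if weights is None:
        w = np.ones(n_channels)  # all channels have the same weight
    else:
        w = np.asarray(list(weights), dtype=np.float64)
    w = w / w.sum()
    # Mean square error of each channel, averaged over height and width.
    mse_per_channel = ((dis - ref) ** 2).mean(axis=(-3, -2))
    mse = (mse_per_channel * w).sum(axis=-1)  # one value per batch item
    with np.errstate(divide="ignore"):  # identical images give an infinite PSNR
        psnr_db = 10.0 * np.log10(peak**2 / mse)
    return np.clip(psnr_db, 0.0, 100.0)  # clamp to [0, 100] dB
```

For dis = 0.8 * ref + 0.2 * noise with independent uniform ref and noise, the expected mean square error is 0.04 / 6, so this sketch gives about 10 * log10(150) ≈ 21.8 dB.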

Notes

  • It is optimized for C contiguous tensors.

  • If the device is cpu and no gradient is required, fast C code is used instead of the torch code.
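Since the notes mention C contiguous tensors, here is a small numpy sketch showing how a transposed view loses contiguity and how to repack it before calling a metric. The helper name ensure_c_contiguous is hypothetical; for torch tensors the analogous call is tensor.contiguous().

```python
import numpy as np


def ensure_c_contiguous(frames: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return frames in C (row-major) contiguous memory."""
    # np.ascontiguousarray copies only when the layout is not already C contiguous.
    return np.ascontiguousarray(frames)


channel_first = np.zeros((3, 72, 108), dtype=np.float32)
view = np.moveaxis(channel_first, 0, -1)  # (72, 108, 3) strided view, no copy
packed = ensure_c_contiguous(view)  # (72, 108, 3) repacked in row-major order
```

Whether the copy pays off depends on how many times the same frames are evaluated.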

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import psnr
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> psnr(dis, ref).round(1)
np.float64(21.8)

cutcutcodec.core.analysis.video.quality.ssim(dis: Tensor, ref: Tensor, *args, stride: int = 1, **kwargs) → Tensor

Compute the structural similarity index measure of 2 images.

Parameters

dis, ref : arraylike

The 2 images to be compared, of shape ([*batch], height, width, channels). Supported types are float32 and float64.

data_range : float, default=1.0

The data range of the input image (difference between maximum and minimum possible values).

weights : iterable[float], optional

The relative weight of each channel. By default, all channels have the same weight.

sigma : float, default=1.5

The standard deviation of the gaussian. It has to be strictly positive.

stride : int, default=1

The stride of the convolving kernel.

threads : int, optional

The number of threads to use. The value -1 uses as many computation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.

Returns

ssim : arraylike

The weighted structural similarity index measure of each layer.

Notes

  • It is optimized for C contiguous tensors.

  • If the device is cpu, no gradient is required and stride != 1, fast C code is used.
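To make the roles of data_range, weights, sigma and stride concrete, here is a compact numpy sketch of a gaussian-windowed SSIM for a single (height, width, channels) image. It uses the classic constants (0.01 * data_range)² and (0.03 * data_range)², a window radius of ceil(3.5 * sigma) and "valid" borders; all of these are assumptions of the sketch, and ssim_sketch is a hypothetical name, so results may differ from the library.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Normalized 1D gaussian window; the radius choice is an assumption."""
    if sigma <= 0:
        raise ValueError("sigma must be strictly positive")
    radius = int(np.ceil(3.5 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def blur(plane: np.ndarray, kernel: np.ndarray, stride: int) -> np.ndarray:
    """Separable 'valid' gaussian filtering of a (height, width) plane, then striding."""
    def conv(vec):
        return np.convolve(vec, kernel, mode="valid")
    out = np.apply_along_axis(conv, 0, plane)
    out = np.apply_along_axis(conv, 1, out)
    return out[::stride, ::stride]


def ssim_sketch(dis, ref, data_range=1.0, weights=None, sigma=1.5, stride=1):
    """Hypothetical reference: weighted mean SSIM over the channels of one image."""
    dis = np.asarray(dis, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    kernel = gaussian_kernel(sigma)
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    scores = []
    for c in range(dis.shape[-1]):
        x, y = dis[..., c], ref[..., c]
        mu_x, mu_y = blur(x, kernel, stride), blur(y, kernel, stride)
        var_x = blur(x * x, kernel, stride) - mu_x**2  # local variances
        var_y = blur(y * y, kernel, stride) - mu_y**2
        cov = blur(x * y, kernel, stride) - mu_x * mu_y  # local covariance
        num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
        den = (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
        scores.append((num / den).mean())
    w = np.ones(len(scores)) if weights is None else np.asarray(list(weights), dtype=np.float64)
    return float(np.dot(w / w.sum(), scores))
```

A larger stride evaluates fewer windows, which trades some precision for speed.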

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import ssim
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> ssim(dis, ref).round(2)
np.float64(0.95)

cutcutcodec.core.analysis.video.quality.uvq(dis: Tensor, *, _model=None) → Tensor

Compute the Perceptual Video Quality.

Parameters

dis : arraylike

The frames to be evaluated, of shape ([*batch], fps=5, height, width, channels=3). The framerate is assumed to be 5 Hz. The frames are assumed to be in RGB in range [0, 1]. Gamut and EOTF must be standard rgb.

Returns

uvq : arraylike

The perceptual video quality measure for each group of 5 images.
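Because the input is expected at 5 Hz in groups of 5 frames, a decoded clip usually has to be subsampled and reshaped first. A minimal sketch, assuming an integer frame rate that is a multiple of 5 and frames stacked as (n, height, width, 3); the helper name group_for_uvq is hypothetical.

```python
import numpy as np


def group_for_uvq(frames: np.ndarray, fps: int) -> np.ndarray:
    """Hypothetical helper: (n, height, width, 3) at ``fps`` -> (groups, 5, height, width, 3) at 5 Hz."""
    if fps <= 0 or fps % 5 != 0:
        raise ValueError("this sketch only handles frame rates that are multiples of 5")
    sampled = frames[:: fps // 5]  # keep 5 frames per second
    n_groups = len(sampled) // 5  # drop the incomplete last second
    if n_groups == 0:
        raise ValueError("at least one full second of video is required")
    return sampled[: 5 * n_groups].reshape(n_groups, 5, *frames.shape[1:])
```

Each item of the resulting batch then covers one second of video.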

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import uvq
>>> np.random.seed(0)
>>> dis = np.random.random((5, 720, 1080, 3))  # It could also be a torch array list...
>>> uvq(dis).round(1)
np.float32(3.3)

cutcutcodec.core.analysis.video.quality.vif(dis: Tensor, ref: Tensor) → Tensor

Compute the visual information fidelity of 2 images.

Parameters

dis, ref : arraylike

The 2 images to be compared, of shape ([*batch], height, width, channels=[1, 3]). The frames are assumed to be in Y or YUV (y’pbpr) in range [0, 1]. Only the y’ component is used.

Returns

vif : arraylike

The visual information fidelity of each image.

Notes

This metric isn’t symmetric, so make sure to place arguments in correct order.

cutcutcodec.core.analysis.video.quality.vmaf(dis: Tensor, ref: Tensor, *, _model=None, **kwargs) → Tensor

Compute the Video Multi-Method Assessment Fusion of 2 images.

Parameters

dis, ref : arraylike

The 2 images to be compared, of shape ([*batch], height, width, channels=3). The frames are assumed to be in YUV (y’pbpr) in range [0, 1]. Gamut and EOTF must be standard rgb.

threads : int, optional

The number of threads to use. The value -1 uses as many computation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.

Returns

vmaf : arraylike

The Video Multi-Method Assessment Fusion score of each image.

Notes

This static function does not require the installation of vmaf.
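The y’pbpr input layout (y’ in [0, 1], pb and pr in [-0.5, 0.5]) can be obtained from gamma-encoded RGB with a linear transform. The sketch below assumes the BT.709 luma coefficients, which match the standard RGB primaries; whether the library uses exactly this matrix is an assumption, and rgb_to_ypbpr is a hypothetical name.

```python
import numpy as np

# BT.709 luma coefficients (assumed here for standard RGB primaries).
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def rgb_to_ypbpr(rgb) -> np.ndarray:
    """Hypothetical helper: ([*batch], height, width, 3) r'g'b' in [0, 1] -> y'pbpr."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = KR * r + KG * g + KB * b  # luma in [0, 1]
    pb = 0.5 * (b - y) / (1.0 - KB)  # blue difference in [-0.5, 0.5]
    pr = 0.5 * (r - y) / (1.0 - KR)  # red difference in [-0.5, 0.5]
    return np.stack([y, pb, pr], axis=-1)
```

With this convention, neutral grays map to pb = pr = 0, and pure blue and pure red reach pb = 0.5 and pr = 0.5 respectively.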

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import vmaf
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> ref[..., 1:3] -= 0.5  # because pbpr in [-0.5, 0.5]
>>> dis = 0.8 * ref + 0.2 * np.random.randn(720, 1080, 3)  # randn takes the dimensions as separate ints
>>> vmaf(dis, ref).round(1)

Modules

lpips_torch

Compute a differentiable batched torch lpips.

metric

This module, implemented in C, offers functions for image metric calculation.

psnr_torch

Compute a differentiable batched torch psnr.

ssim_torch

Compute a differentiable batched torch ssim.

utils

Helper for metrics.

uvq_google

Universal Video Quality Model.

vif_torch

Compute a differentiable batched torch VIF (Visual Information Fidelity).

vmaf_official

Parse the ffmpeg vmaf metric.

vmaf_torch

Torch version of the Video Multi-Method Assessment Fusion.