cutcutcodec.core.analysis.video.quality

Video quality metrics.

Functions

`lpips`(dis, ref, *args, **kwargs)	Compute the Learned Perceptual Image Patch Similarity.
`psnr`(dis, ref, *args, **kwargs)	Compute the peak signal to noise ratio of 2 images.
`ssim`(dis, ref, *args[, stride])	Compute the structural similarity index measure of 2 images.
`uvq`(dis, *[, _model])	Compute the Perceptual Video Quality.
`vif`(dis, ref)	Compute the visual information fidelity of 2 images.
`vmaf`(dis, ref, *[, _model])	Compute the Video Multi-Method Assessment Fusion of 2 images.

Details

cutcutcodec.core.analysis.video.quality.lpips(dis: Tensor, ref: Tensor, *args, **kwargs) → Tensor

Compute the Learned Perceptual Image Patch Similarity.

It relies on the `lpips` package (`pip install lpips`) as a backend, which is itself based on torch.

Parameters

dis, ref (arraylike): The 2 images to be compared, of shape ([*batch], height, width, channels=3). The frames are assumed to be in RGB (r'g'b') in range [0, 1]. The gamut and EOTF must be those of standard RGB.
net (str, default="alex"): The neural network used, "alex" or "vgg".
threads (int, optional): The number of threads. The value -1 uses as many computation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.

Returns

lpips (arraylike): The learned perceptual image patch similarity of each image.
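
The `lpips` package works on channel-first tensors scaled to [-1, 1], whereas this function takes channel-last frames in [0, 1]. The sketch below shows what such a conversion could look like in NumPy. The helper name `to_lpips_layout` is hypothetical, and whether the wrapper performs exactly these steps internally is an assumption.

```python
import numpy as np


def to_lpips_layout(frames: np.ndarray) -> np.ndarray:
    """Convert ([*batch], H, W, 3) frames in [0, 1] to (N, 3, H, W) in [-1, 1].

    Hypothetical helper, not part of cutcutcodec.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim < 3 or frames.shape[-1] != 3:
        raise ValueError("expected shape ([*batch], height, width, 3)")
    # Collapse all leading batch dimensions into one (a single image gives N=1).
    frames = frames.reshape(-1, *frames.shape[-3:])
    # Channel-last -> channel-first.
    frames = np.moveaxis(frames, -1, 1)
    # Rescale [0, 1] -> [-1, 1].
    return 2.0 * frames - 1.0


rng = np.random.default_rng(0)
batch = to_lpips_layout(rng.random((2, 4, 8, 6, 3)))
print(batch.shape)  # (8, 3, 8, 6)
```

Flattening the batch dimensions keeps the sketch independent of how many leading axes the caller provides; the result can be reshaped back to ([*batch]) after scoring.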

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import lpips
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> lpips(dis, ref).round(1)
np.float64(0.0)

cutcutcodec.core.analysis.video.quality.psnr(dis: Tensor, ref: Tensor, *args, **kwargs) → Tensor

Compute the peak signal to noise ratio of 2 images.

Parameters

dis, ref (arraylike): The 2 images to be compared, of shape ([*batch], height, width, channels). Supported types are float32 and float64.
weights (iterable[float], optional): The relative weight of each channel. By default, all channels have the same weight.
threads (int, optional): The number of threads. The value -1 uses as many computation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.

Returns

psnr (arraylike): The global peak signal to noise ratio, computed from the weighted average of the mean square error of each channel. It is batched and clamped to [0, 100] dB.
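
As an illustration of the return value described above, here is a minimal NumPy sketch of a weighted, clamped PSNR. The function name `psnr_reference` is hypothetical; the peak value of 1.0 and the normalisation of the weights to sum to 1 are assumptions, not a description of the library's C or torch code.

```python
import numpy as np


def psnr_reference(dis, ref, weights=None):
    """Weighted PSNR in dB for arrays of shape ([*batch], H, W, C) in [0, 1]."""
    dis = np.asarray(dis, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    if weights is None:
        weights = np.ones(ref.shape[-1])
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()  # assumption: weights are relative
    # Mean square error per channel, averaged over height and width.
    mse = ((dis - ref) ** 2).mean(axis=(-3, -2))  # shape ([*batch], C)
    mse = (mse * weights).sum(axis=-1)  # shape ([*batch])
    with np.errstate(divide="ignore"):
        score = -10.0 * np.log10(mse)  # assumption: the peak value is 1.0
    # Clamp to [0, 100] dB, so identical images give 100 instead of infinity.
    return np.clip(score, 0.0, 100.0)


rng = np.random.default_rng(0)
ref = rng.random((720, 1080, 3))
dis = 0.8 * ref + 0.2 * rng.random((720, 1080, 3))
print(psnr_reference(dis, ref).round(1))
print(psnr_reference(ref, ref))  # identical images hit the 100 dB clamp
```

For this kind of input the expected mean square error is 0.04 / 6, the variance of 0.2 times the difference of two independent uniform variables, which corresponds to about 21.8 dB.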

Notes

It is optimized for C-contiguous tensors.
If the device is cpu and no gradient is required, fast C code is used instead of the torch code.
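
The rule documented for the `threads` parameter can be sketched in a few lines of standard-library Python. The helper `resolve_threads` is hypothetical and only mirrors the documented behaviour; the library's actual helper and its error handling may differ.

```python
import os
import threading


def resolve_threads(threads: int = 0) -> int:
    """Return the effective number of threads for a `threads` argument."""
    n_cores = os.cpu_count() or 1
    if threads == -1:
        return n_cores
    if threads == 0:
        # Parallelise only from the main thread, to avoid nested threads.
        in_main = threading.current_thread() is threading.main_thread()
        return n_cores if in_main else 1
    if threads > 0:
        return threads
    raise ValueError(f"invalid number of threads: {threads}")


# Called from a worker thread, the default falls back to a single thread.
result = []
worker = threading.Thread(target=lambda: result.append(resolve_threads(0)))
worker.start()
worker.join()
print(result)  # [1]
```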

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import psnr
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> psnr(dis, ref).round(1)
np.float64(21.8)

cutcutcodec.core.analysis.video.quality.ssim(dis: Tensor, ref: Tensor, *args, stride: int = 1, **kwargs) → Tensor

Compute the structural similarity index measure of 2 images.

Parameters

dis, ref (arraylike): The 2 images to be compared, of shape ([*batch], height, width, channels). Supported types are float32 and float64.
data_range (float, default=1.0): The data range of the input image (difference between maximum and minimum possible values).
weights (iterable[float], optional): The relative weight of each channel. By default, all channels have the same weight.
sigma (float, default=1.5): The standard deviation of the gaussian. It has to be strictly positive.
stride (int, default=1): The stride of the convolving kernel.
threads (int, optional): The number of threads. The value -1 uses as many computation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.

Returns

ssim (arraylike): The weighted structural similarity index measure of each layer.

Notes

It is optimized for C-contiguous tensors.
If the device is cpu, no gradient is required and stride != 1, fast C code is used.
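
To make the roles of `data_range`, `weights`, `sigma` and `stride` concrete, the sketch below computes a Gaussian-windowed SSIM on a single (height, width, channels) image in NumPy. All function names are hypothetical. The constants 0.01 and 0.03, the kernel radius of 3.5 sigma and the "valid" border handling are assumptions taken from the usual SSIM formulation, so the values need not match the library exactly.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Return a normalised 1-D Gaussian kernel (radius 3.5 * sigma, assumed)."""
    if sigma <= 0:
        raise ValueError("sigma must be strictly positive")
    radius = int(3.5 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()


def blur(plane: np.ndarray, kernel: np.ndarray, stride: int) -> np.ndarray:
    """Separable 'valid' Gaussian filtering of an (H, W) plane, with a stride."""
    rows = sliding_window_view(plane, len(kernel), axis=0)[::stride] @ kernel
    return sliding_window_view(rows, len(kernel), axis=1)[:, ::stride] @ kernel


def ssim_reference(dis, ref, data_range=1.0, weights=None, sigma=1.5, stride=1):
    """Weighted mean SSIM of two (H, W, C) images."""
    dis = np.asarray(dis, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    kernel = gaussian_kernel(sigma)
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2  # assumed constants
    n_channels = ref.shape[-1]
    weights = np.ones(n_channels) if weights is None else np.asarray(weights, float)
    scores = []
    for c in range(n_channels):
        x, y = dis[..., c], ref[..., c]
        mu_x, mu_y = blur(x, kernel, stride), blur(y, kernel, stride)
        var_x = blur(x * x, kernel, stride) - mu_x**2
        var_y = blur(y * y, kernel, stride) - mu_y**2
        cov = blur(x * y, kernel, stride) - mu_x * mu_y
        ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
            (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
        )
        scores.append(ssim_map.mean())
    # Weighted average over the channels.
    return float(np.dot(scores, weights) / weights.sum())


rng = np.random.default_rng(0)
ref = rng.random((64, 64, 3))
dis = 0.8 * ref + 0.2 * rng.random((64, 64, 3))
print(round(ssim_reference(dis, ref), 2))
print(round(ssim_reference(dis, ref, stride=2), 2))
```

A stride larger than 1 evaluates the local statistics on a coarser grid, which reduces the amount of work roughly by the square of the stride at the cost of a slightly noisier average.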

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import ssim
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> dis = 0.8 * ref + 0.2 * np.random.random((720, 1080, 3))
>>> ssim(dis, ref).round(2)
np.float64(0.95)

cutcutcodec.core.analysis.video.quality.uvq(dis: Tensor, *, _model=None) → Tensor

Compute the Perceptual Video Quality.

Parameters

dis (arraylike): The frames to be evaluated, of shape ([*batch], fps=5, height, width, channels=3). The framerate is assumed to be 5 Hz. The frames are assumed to be in RGB in range [0, 1]. The gamut and EOTF must be those of standard RGB.

Returns

uvq (arraylike): The perceptual video quality measure for each group of 5 images.
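
Because the input is expected as groups of 5 frames sampled at 5 Hz, a video at another framerate has to be resampled and regrouped first. The following sketch shows one possible way to do it. The helper `to_uvq_batches` is hypothetical, and picking the frame displayed at each sampling instant (rather than interpolating) is an assumption.

```python
import numpy as np


def to_uvq_batches(frames: np.ndarray, fps: float) -> np.ndarray:
    """Resample (n, H, W, 3) frames to 5 Hz and group them into chunks of 5."""
    if fps < 5:
        raise ValueError("the source framerate must be at least 5 Hz")
    # Index of the frame displayed at each multiple of 1/5 s.
    # The small epsilon guards against floating point rounding just below an integer.
    indices = np.floor(np.arange(0, len(frames), fps / 5.0) + 1e-9).astype(int)
    sampled = frames[indices]
    # Drop the incomplete trailing second, then split into groups of 5 frames.
    n_groups = len(sampled) // 5
    return sampled[: 5 * n_groups].reshape(n_groups, 5, *frames.shape[1:])


video = np.zeros((75, 72, 108, 3), dtype=np.float32)  # 3 s of video at 25 fps
print(to_uvq_batches(video, fps=25).shape)  # (3, 5, 72, 108, 3)
```

Each item along the first axis then covers one second of video, which matches the "one score per group of 5 images" convention of the return value.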

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import uvq
>>> np.random.seed(0)
>>> dis = np.random.random((5, 720, 1080, 3))  # It could also be a torch array list...
>>> uvq(dis).round(1)
np.float32(3.3)

cutcutcodec.core.analysis.video.quality.vif(dis: Tensor, ref: Tensor) → Tensor

Compute the visual information fidelity of 2 images.

Parameters

dis, ref (arraylike): The 2 images to be compared, of shape ([*batch], height, width, channels=1 or 3). The frames are assumed to be in Y or YUV (y'pbpr) in range [0, 1]. Only the y' component is used.

Returns

vif (arraylike): The visual information fidelity of each image.

Notes

This metric isn't symmetric, so make sure to pass the arguments in the correct order.
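
Since the inputs are expected in Y or y'pbpr, frames stored as r'g'b' need a conversion first. The sketch below uses the BT.709 luma coefficients, which is an assumption (they match the standard RGB primaries); the helper `rgb_to_ypbpr` is hypothetical and not part of the module.

```python
import numpy as np

# BT.709 luma coefficients (assumed here).
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def rgb_to_ypbpr(rgb: np.ndarray) -> np.ndarray:
    """Convert gamma-encoded r'g'b' in [0, 1] to y'pbpr, shape ([*batch], H, W, 3)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    red, green, blue = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    luma = KR * red + KG * green + KB * blue  # y' in [0, 1]
    pb = 0.5 * (blue - luma) / (1.0 - KB)  # pb in [-0.5, 0.5]
    pr = 0.5 * (red - luma) / (1.0 - KR)  # pr in [-0.5, 0.5]
    return np.stack([luma, pb, pr], axis=-1)


rng = np.random.default_rng(0)
ref_rgb = rng.random((72, 108, 3))
ref_yuv = rgb_to_ypbpr(ref_rgb)
ref_y = ref_yuv[..., :1]  # luma only, kept as a 1-channel image
print(ref_yuv.shape, ref_y.shape)  # (72, 108, 3) (72, 108, 1)
```

Either `ref_yuv` or `ref_y` has a shape accepted by the parameter description above, since only the y' component is used.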

cutcutcodec.core.analysis.video.quality.vmaf(dis: Tensor, ref: Tensor, *, _model=None, **kwargs) → Tensor

Compute the Video Multi-Method Assessment Fusion of 2 images.

Parameters

dis, ref (arraylike): The 2 images to be compared, of shape ([*batch], height, width, channels=3). The frames are assumed to be in YUV (y'pbpr) in range [0, 1]. The gamut and EOTF must be those of standard RGB.
threads (int, optional): The number of threads. The value -1 uses as many computation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.

Returns

vmaf (arraylike): The Video Multi-Method Assessment Fusion score of each image.

Notes

This static function does not require the installation of vmaf.

Examples

>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.quality import vmaf
>>> np.random.seed(0)
>>> ref = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> ref[..., 1:3] -= 0.5  # because pbpr in [-0.5, 0.5]
>>> dis = 0.8 * ref + 0.2 * np.random.randn(720, 1080, 3)
>>> vmaf(dis, ref).round(1)
np.float32(15.4)

Modules

`lpips_torch`	Compute a differentiable batched torch lpips.
`metric`	This module, implemented in C, offers functions for image metric calculation.
`psnr_torch`	Compute a differentiable batched torch psnr.
`ssim_torch`	Compute a differentiable batched torch ssim.
`utils`	Helper for metrics.
`uvq_google`	Universal Video Quality Model.
`vif_torch`	Compute a differentiable batched torch VIF (Visual Information Fidelity).
`vmaf_official`	Parse the ffmpeg vmaf metric.
`vmaf_torch`	Torch version of the Video Multi-Method Assessment Fusion.