cutcutcodec.core.analysis.video.complexity.dct.spatial_dct
- cutcutcodec.core.analysis.video.complexity.dct.spatial_dct(img: Tensor, threads: int = 0, patch: Integral = 32) -> Tensor
Compute the spatial DCT complexity of the image.
The spatial DCT complexity \(C_{\text{dct}} \in \mathbb{R}^+\) is defined as follows:
\[\begin{split}\begin{cases} C_{\text{dct}} = \frac{1}{n_{\text{blocs}}} \sum\limits_{m=1}^{n_{\text{blocs}}} H_m \\ H_m = \frac{1}{s^2} \sum\limits_{i=1}^s \sum\limits_{j=1}^s e^{\left(\frac{ij}{s^2}\right)^2-1} \left|\mathscr{D}_m(i,j)\right| \\ \mathscr{D}_m(i,j) = \begin{cases} 0 & \text{if } i + j = 2 \\ \mathscr{F}_m(i,j) & \text{otherwise} \\ \end{cases} \\ \end{cases}\end{split}\]
where \(\mathscr{F}_m(i,j)\) is the DCT-II of patch \(m\) of the image, computed by the function compute_dct(), \(n_{\text{blocs}}\) is the number of patches and \(s\) is the patch size. The patches cover the full image and do not overlap.
Parameters
- img : arraylike
The Y[UV] images, of shape ([*batch], [1], height, width, [channels]). Only the Y component is used. Values must be in the range [0, 1]. The image is sliced into non-overlapping squares of size \(s \times s\). If the height or width of the image is not a multiple of \(s\), the edges are cropped.
- threads : int, optional
The number of threads. The value -1 means that the function uses as many calculation threads as there are cores. The default value (0) behaves like -1 when the function is called from the main thread, and like 1 otherwise, to avoid nested threads. Any other positive value is the number of threads used.
- patch : int, default = 32
The patch size \(s\). It must be >= 1. The default value of 32 is the one proposed in the VCA paper.
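The rule described for the threads parameter can be illustrated with a small sketch. The helper name resolve_threads is hypothetical and is not part of cutcutcodec; it also assumes that "cores" means the logical CPU count reported by os.cpu_count(). It only mirrors the documented behavior and is not the library's actual implementation.

```python
import os
import threading


def resolve_threads(threads: int = 0) -> int:
    """Hypothetical helper: map the documented `threads` values to a thread count."""
    # Assumption: "as many threads as there are cores" means os.cpu_count().
    n_cores = os.cpu_count() or 1
    if threads == -1:
        return n_cores
    if threads == 0:
        # In the main thread, behave like -1; elsewhere use a single
        # thread to avoid nested threading.
        if threading.current_thread() is threading.main_thread():
            return n_cores
        return 1
    if threads >= 1:
        return threads
    raise ValueError(f"threads must be -1, 0 or positive, got {threads}")
```

Called from a worker thread, resolve_threads(0) returns 1, while an explicit positive value such as resolve_threads(4) is returned unchanged.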
Returns
- spatial_dct : arraylike
The \(C_{\text{dct}}\) scalar for each image (of shape batch).
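To make the definition of \(C_{\text{dct}}\) concrete, here is a minimal NumPy sketch for a single 2D luma plane. It is not the cutcutcodec implementation: the names dct2_matrix and spatial_dct_sketch are hypothetical, and the orthonormal DCT-II scaling is an assumption, since the normalization used by compute_dct() is not stated here. Values may therefore differ from those returned by spatial_dct by a scale factor.

```python
import numpy as np


def dct2_matrix(s: int) -> np.ndarray:
    """Return the s x s DCT-II matrix (orthonormal scaling is an assumption)."""
    k = np.arange(s)[:, None]  # frequency index (rows)
    n = np.arange(s)[None, :]  # sample index (columns)
    mat = np.cos(np.pi * (2 * n + 1) * k / (2 * s)) * np.sqrt(2.0 / s)
    mat[0] /= np.sqrt(2.0)  # rescale the constant row for orthonormality
    return mat


def spatial_dct_sketch(luma: np.ndarray, patch: int = 32) -> float:
    """Hypothetical reference sketch of C_dct for one (height, width) luma plane."""
    s = patch
    height, width = luma.shape
    rows, cols = height // s, width // s
    if rows == 0 or cols == 0:
        raise ValueError("the image is smaller than one patch")
    # Crop the edges, then split into non-overlapping s x s patches:
    # the result has shape (rows, cols, s, s).
    blocks = luma[: rows * s, : cols * s].reshape(rows, s, cols, s)
    blocks = blocks.transpose(0, 2, 1, 3)
    mat = dct2_matrix(s)
    coeffs = mat @ blocks @ mat.T  # separable 2D DCT-II of every patch
    coeffs[..., 0, 0] = 0.0  # D_m(1, 1) = 0, the only case where i + j = 2
    idx = np.arange(1, s + 1)  # 1-based indices i and j, as in the formula
    weights = np.exp((np.outer(idx, idx) / s**2) ** 2 - 1.0)
    h_m = (weights * np.abs(coeffs)).sum(axis=(-2, -1)) / s**2
    return float(h_m.mean())  # average of H_m over all patches
```

Because the DC coefficient is zeroed, a constant image yields a complexity of (numerically) zero, and any rows or columns beyond a multiple of the patch size do not influence the result.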
Notes
This metric comes from the paper "A NEW ENERGY FUNCTION FOR SEGMENTATION AND COMPRESSION". The VCA tool offers an optimized version of this metric. The result is close to the E column of the .csv file generated with:
ffmpeg -i video.mp4 -f yuv4mpegpipe - | vca --y4m --input stdin --no-lowpass --complexity-csv result.csv
This function can be called from the command line with:
cutcutcodec metric video.mp4 --spatial-dct -o result.json
Examples
>>> import numpy as np
>>> from cutcutcodec.core.analysis.video.complexity import spatial_dct
>>> np.random.seed(0)
>>> img = np.random.random((720, 1080, 3))  # It could also be a torch array list...
>>> spatial_dct(img).round(2)
array([1.59])