cutcutcodec.core.nn.model.compression.img_cgavaenn.VariationalEncoder

class cutcutcodec.core.nn.model.compression.img_cgavaenn.VariationalEncoder

Projects images into a more compact space.

Each 192x192-pixel patch, taken with a stride of 32 pixels, is projected into a space of dimension 256.

Initialize internal Module state, shared by both nn.Module and ScriptModule.
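The patch geometry described above determines the size of the latent grid. The following is a minimal sketch of that arithmetic, not part of cutcutcodec: the helper name `latent_shape` is hypothetical, and it assumes that image sizes must be exactly of the form 192 + k*32.

```python
PATCH = 192       # patch side in pixels, from the class description
STRIDE = 32       # stride in pixels, from the class description
LATENT_DIM = 256  # dimension of the space each patch is projected into


def latent_shape(n: int, h: int, w: int) -> tuple[int, int, int, int]:
    """Predict the latent shape for a batch of shape (n, 3, h, w).

    Hypothetical helper for illustration only, it does not call the encoder.
    """
    for name, size in (("h", h), ("w", w)):
        # Assumption: a size is valid only if it equals 192 + k*32 with k >= 0.
        if size < PATCH or (size - PATCH) % STRIDE:
            raise ValueError(f"{name}={size} is not of the form {PATCH} + k*{STRIDE}")
    # Number of patch positions along each axis when sliding with the stride.
    rows = (h - PATCH) // STRIDE + 1
    cols = (w - PATCH) // STRIDE + 1
    return (n, LATENT_DIM, rows, cols)


shape = latent_shape(10, 192, 192 + 2 * 32)
print(shape)  # (10, 256, 1, 3)
```

Counting patch positions as (h - 192)/32 + 1 is the same quantity as (h - 160)/32, which is the form used in the documented output shape of forward.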

static add_quantization_noise(lat: Tensor) → Tensor

Add uniform noise to simulate quantization to uint8.

Parameters

lat (torch.Tensor): The float latent space of shape (n, 256, a, b), with values in the open interval ]0, 1[.

Returns

noised_lat (torch.Tensor): The input tensor with additive uniform noise U(-.5/255, .5/255). The final values are clamped to stay in the range [0, 1].

Examples

>>> import torch
>>> from cutcutcodec.core.nn.model.compression.img_cgavaenn import VariationalEncoder
>>> lat = torch.rand((10, 256, 1, 3))
>>> q_lat = VariationalEncoder.add_quantization_noise(lat)
>>> torch.all(abs(q_lat - lat) <= 0.5/255)
tensor(True)
>>> abs((q_lat - lat).mean().round(decimals=4))
tensor(0.)
>>>
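The choice of U(-.5/255, .5/255) matches the rounding error of a real uint8 quantization. The sketch below compares the two. It is an illustration under assumptions, not the library's implementation: `simulate_uint8_noise`, `quantize_uint8` and `dequantize_uint8` are hypothetical names, and the scaling by 255 is assumed from the noise bounds.

```python
import torch


def simulate_uint8_noise(lat: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the documented behaviour (not the library code)."""
    # rand_like is uniform in [0, 1), so this noise is uniform in [-0.5/255, 0.5/255).
    noise = (torch.rand_like(lat) - 0.5) / 255
    return torch.clamp(lat + noise, 0.0, 1.0)


def quantize_uint8(lat: torch.Tensor) -> torch.Tensor:
    """Assumed real quantization: scale [0, 1] to [0, 255] and round."""
    return torch.round(lat * 255).to(torch.uint8)


def dequantize_uint8(q_lat: torch.Tensor) -> torch.Tensor:
    """Map the uint8 codes back to floats in [0, 1]."""
    return q_lat.to(torch.float32) / 255


torch.manual_seed(0)
lat = torch.rand((10, 256, 1, 3))
noised = simulate_uint8_noise(lat)
restored = dequantize_uint8(quantize_uint8(lat))

# Both errors are bounded by half a quantization step, 0.5/255.
max_noise = (noised - lat).abs().max().item()
max_quant = (restored - lat).abs().max().item()
print(max_noise <= 0.5 / 255 + 1e-6, max_quant <= 0.5 / 255 + 1e-6)
```

Unlike rounding, the additive noise is differentiable with respect to the input, which is the usual reason for simulating quantization this way during training.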

forward(img: Tensor) → Tensor

Project the images into the latent space.

Parameters

img (torch.Tensor): The float image batch of shape (n, 3, h, w), where h and w are each of the form 192 + k*32, with k a non-negative integer.

Returns

lat (torch.Tensor): The projection of the image in the latent space. The new shape is (n, 256, (h-160)/32, (w-160)/32), with values in [0, 1].

Examples

>>> import torch
>>> from cutcutcodec.core.nn.model.compression.img_cgavaenn import VariationalEncoder
>>> encoder = VariationalEncoder()
>>> encoder(torch.rand((10, 3, 192, 192+2*32))).shape
torch.Size([10, 256, 1, 3])
>>>
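Arbitrary image sizes rarely satisfy the constraint on h and w. One possible way to handle this, sketched below, is to center-crop to the largest valid size before encoding. The helpers `largest_valid_size` and `center_crop_box` are hypothetical and not part of cutcutcodec, and the sketch assumes that sizes must be exactly of the form 192 + k*32.

```python
def largest_valid_size(h: int, w: int, patch: int = 192, stride: int = 32) -> tuple[int, int]:
    """Return the largest (h', w') <= (h, w) of the form patch + k*stride."""
    if h < patch or w < patch:
        raise ValueError(f"image of size {h}x{w} is smaller than one {patch}x{patch} patch")
    new_h = patch + stride * ((h - patch) // stride)
    new_w = patch + stride * ((w - patch) // stride)
    return new_h, new_w


def center_crop_box(h: int, w: int) -> tuple[int, int, int, int]:
    """Return (top, left, new_h, new_w) of a centered crop with a valid size."""
    new_h, new_w = largest_valid_size(h, w)
    return (h - new_h) // 2, (w - new_w) // 2, new_h, new_w


top, left, new_h, new_w = center_crop_box(1080, 1920)
print(top, left, new_h, new_w)  # 12 0 1056 1920
# A batch could then be cropped with img[..., top:top + new_h, left:left + new_w].
```

For a 1080x1920 frame this keeps 1056 rows and all 1920 columns, which by the documented output shape corresponds to a latent grid of (1056-160)/32 = 28 by (1920-160)/32 = 55.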