cutcutcodec.core.io.read_ffmpeg

Decode the streams of a multimedia file based on ffmpeg.

Classes

ContainerInputFFMPEG(filename, **av_kwargs)

Decode a multimedia file with ffmpeg.

Functions

frame_dates(frame)

Return the accurate time interval of the given frame.

Details

class cutcutcodec.core.io.read_ffmpeg.ContainerInputFFMPEG(filename: Path | str | bytes, **av_kwargs)

Decode a multimedia file with ffmpeg.

Attributes

av_kwargs (dict[str]): The parameters passed to av.open.
filename (pathlib.Path): The path to the physical file that contains the extracted video stream (read-only).

Notes

A separate instance of av.container.InputContainer is created for each stream. This avoids the following error, which can occur when a multi-stream file is read sparsely:

av.error.InvalidDataError: [Errno 1094995529] Invalid data found when processing input; last error log: [libdav1d] Error parsing OBU data

Examples

>>> import torch
>>> from cutcutcodec.core.io.read_ffmpeg import ContainerInputFFMPEG
>>> from cutcutcodec.utils import get_project_root
>>> with ContainerInputFFMPEG(get_project_root() / "examples" / "intro.webm") as container:
...     for stream in container.out_streams:
...         if stream.type == "video":
...             stream.snapshot(0, (stream.height, stream.width)).shape
...         elif stream.type == "audio":
...             torch.round(stream.snapshot(0, rate=2, samples=3), decimals=5)
...
(720, 1280, 3)
(360, 640, 3)
FrameAudio(0, 2, 'stereo', [[     nan,  0.1804 , -0.34765],
                            [     nan, -0.07236,  0.07893]])
FrameAudio(0, 2, 'mono', [[     nan,  0.06998, -0.24758]])
>>>

Initialise and create the class.

Parameters

filename (pathlike)

Path to the file to be decoded.

**av_kwargs (dict)

Passed directly to av.open.

"format" (str): Specific format to use. Defaults to autodetect.
"options" (dict): Options to pass to the container and all streams.
"container_options" (dict): Options to pass to the container.
"stream_options" (list): Options to pass to each stream.
"metadata_encoding" (str): Encoding to use when reading or writing file metadata.
Defaults to “utf-8”.
"metadata_errors" (str): Specifies how to handle encoding errors;
behaves like str.encode parameter. Defaults to “strict”.
"buffer_size" (int): Size of buffer for Python input/output operations in bytes.
Honored only when file is a file-like object. Defaults to 32768 (32k).
"timeout" (float or tuple): How many seconds to wait for data before giving up,
as a float, or a (open timeout, read timeout) tuple.
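
As an illustration of how these options might be forwarded, here is a minimal sketch. It only uses keys from the list above; the values and the helper name open_container are hypothetical and chosen for the example, not taken from the library.

```python
# Keyword arguments forwarded to av.open. Every key is one of the
# options documented above; the values are illustrative assumptions.
AV_KWARGS = {
    "metadata_encoding": "utf-8",   # the documented default
    "metadata_errors": "replace",   # same meaning as the str.encode parameter
    "timeout": (5.0, 10.0),         # (open timeout, read timeout) in seconds
}


def open_container(filename):
    """Hypothetical helper: open `filename` with the options above."""
    # Imported lazily so that this sketch can be loaded without cutcutcodec.
    from cutcutcodec.core.io.read_ffmpeg import ContainerInputFFMPEG

    return ContainerInputFFMPEG(filename, **AV_KWARGS)
```

According to the attribute list above, the options given this way should then be available on the instance as av_kwargs.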

Raises

cutcutcodec.core.exceptions.DecodeError: If it fails to extract any multimedia stream from the provided file.
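
A caller may want to handle this exception explicitly. The sketch below assumes the error is raised when the container is created, as the text above suggests; the helper name describe_streams is hypothetical.

```python
def describe_streams(filename):
    """Hypothetical helper: list the stream types, or None if undecodable."""
    # Imported lazily so that this sketch can be loaded without cutcutcodec.
    from cutcutcodec.core.exceptions import DecodeError
    from cutcutcodec.core.io.read_ffmpeg import ContainerInputFFMPEG

    try:
        with ContainerInputFFMPEG(filename) as container:
            # Each stream exposes a `type`, as in the example above.
            return [stream.type for stream in container.out_streams]
    except DecodeError:
        # No multimedia stream could be extracted from the file.
        return None
```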

cutcutcodec.core.io.read_ffmpeg.frame_dates(frame: Frame) → tuple[Fraction, None | Fraction]

Return the accurate time interval of the given frame.

Parameters

frame (av.frame.Frame): The audio or video frame from which the timing information is extracted.

Returns

t_start (Fraction): The display time of the frame. For an audio frame, it corresponds to the time of the first sample.
t_end (Fraction or None): For an audio frame only, the time at which the last sample ends. None for a video frame.

Examples

>>> import av
>>> from cutcutcodec.core.io.read_ffmpeg import frame_dates
>>> with av.open("cutcutcodec/examples/video.mp4") as av_container:
...     frame_dates(next(av_container.decode(av_container.streams.video[0])))
...     frame_dates(next(av_container.decode(av_container.streams.video[0])))
...
(Fraction(0, 1), None)
(Fraction(1, 25), None)
>>> with av.open("cutcutcodec/examples/audio_5.1_narration.oga") as av_container:
...     frame_dates(next(av_container.decode(av_container.streams.audio[0])))
...     frame_dates(next(av_container.decode(av_container.streams.audio[0])))
...
(Fraction(0, 1), Fraction(4, 125))
(Fraction(4, 125), Fraction(8, 125))
>>>

Notes

For an audio frame, the interval includes the duration of the last sample. For a video frame, the duration of the frame is unknown.
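
The arithmetic behind such an interval can be sketched in pure Python. This is not the library's implementation: the helper interval_from_timing is hypothetical, and it assumes that the start time is the presentation timestamp multiplied by the time base and that an audio frame lasts samples / rate seconds. The numeric values (a 48000 Hz stream with 1536 samples per frame) are assumptions chosen to reproduce intervals like those in the examples above.

```python
from fractions import Fraction


def interval_from_timing(pts, time_base, samples=None, rate=None):
    """Hypothetical helper: exact (t_start, t_end) from raw timing fields."""
    # Start time in seconds, kept exact with rational arithmetic.
    t_start = Fraction(pts) * Fraction(time_base)
    if samples is None or rate is None:
        # Video frame: the duration is unknown, so there is no end time.
        return t_start, None
    # Audio frame: the end time includes the duration of the last sample.
    return t_start, t_start + Fraction(samples, rate)


# Second video frame of an assumed 25 fps stream.
print(interval_from_timing(1, Fraction(1, 25)))
# Second audio frame of an assumed 48000 Hz stream, 1536 samples per frame.
print(interval_from_timing(1536, Fraction(1, 48000), samples=1536, rate=48000))
```

Using Fraction rather than float keeps consecutive audio intervals exactly adjacent, with no rounding drift between the end of one frame and the start of the next.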