FFmpeg: convert RGB(A) to YUV

I recently needed to convert a series of rendered images generated in an application to a movie file. The images are rasters of raw 32-bit RGBA values. A typical solution to this problem would be to dump the images to disk and then use a command-line program, such as ffmpeg, to convert them to the desired movie format (in this case, MPEG-2).

In my particular usage scenario, this simple solution was not an option for various reasons (nearly unbounded disk usage, user-interface issues, etc.). An alternative is to use the FFmpeg API to encode each frame's raw RGBA data and write it to a movie file directly. Unfortunately, I was unable to find a codec that would convert directly from the raw data to the desired movie format.

A web search turned up a potential solution: http://stackoverflow.com/questions/16667687/how-to-convert-rgb-from-yuv420p-for-ffmpeg-encoder

It turns out you can convert RGB or RGBA data to YUV using FFmpeg itself (its libswscale library), and the converted frames can then be encoded and written to a movie file. The basics are just a few lines: first, create an SwsContext that specifies the image size and the source and destination pixel formats:

AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_MPEG2VIDEO);
AVCodecContext *c = avcodec_alloc_context3(codec);
// ...set up c's params
AVFrame *frame = av_frame_alloc();
// ...set up frame's params and allocate its image buffer
struct SwsContext *ctx = sws_getContext(c->width, c->height,
                                        AV_PIX_FMT_RGBA,      // source format
                                        c->width, c->height,
                                        AV_PIX_FMT_YUV420P,   // destination format
                                        0, NULL, NULL, NULL); // flags, filters, params
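For reference, the setup elided above looks roughly like the following, patterned after the FFmpeg example code linked later in this post; the bit rate, frame size, and frame rate here are placeholder values, and error handling is omitted:

// Codec parameters (the values shown are illustrative placeholders)
c->bit_rate  = 400000;
c->width     = 640;
c->height    = 480;
c->time_base = (AVRational){ 1, 25 };   // 25 frames per second
c->gop_size  = 10;
c->pix_fmt   = AV_PIX_FMT_YUV420P;
avcodec_open2(c, codec, NULL);

// Frame parameters and image buffer (av_image_alloc is in <libavutil/imgutils.h>)
frame->format = c->pix_fmt;
frame->width  = c->width;
frame->height = c->height;
av_image_alloc(frame->data, frame->linesize,
               c->width, c->height, c->pix_fmt, 32);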

And then apply the conversion to each RGBA frame (the rgba32Data pointer) as it’s generated:

uint8_t *inData[1]     = { rgba32Data };   // one plane of packed RGBA pixels
int      inLinesize[1] = { 4 * c->width }; // RGBA stride: 4 bytes per pixel
sws_scale(ctx, inData, inLinesize, 0, c->height,
          frame->data, frame->linesize);

One important point to note: if your input data has padding at the end of each row, be sure to set inLinesize to the actual number of bytes per row (the stride), not simply 4 times the width of the image.
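For instance, if whatever produced your buffer pads each row out to a 64-byte boundary (the alignment value here is purely hypothetical), the stride would be computed like this:

// Hypothetical example: rows padded to a multiple of 64 bytes
int bytesPerRow   = 4 * c->width;             // unpadded RGBA row length
int stride        = (bytesPerRow + 63) & ~63; // round up to the 64-byte boundary
int inLinesize[1] = { stride };               // pass the padded stride to sws_scale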

If you’re familiar with the FFmpeg API, this info should be sufficient to get you going. The FFmpeg API is quite extensive and a bit arcane, and even something as functionally simple as dumping animation frames to a movie file is not completely trivial. Fortunately, the FFmpeg folks have provided some nice example files, including one that demonstrates some basic audio and video encoding and decoding: https://www.ffmpeg.org/doxygen/2.1/decoding__encoding_8c.html

I took the source for the video encoding function and hacked it up to incorporate the required RGBA to YUV conversion. The code performs all the steps needed to set up and use the FFmpeg API, start to finish, to convert a sequence of raw RGBA data to a movie file. As with the original version of the code, it synthesizes each frame's data (an animated ramp image) and dumps it to a file. It should be easy to change the code to use real image data generated in your application. I've made this available on GitHub at:

https://github.com/codefromabove/FFmpegRGBAToYUV

For Mac programmers, I’ve included an Xcode 6 project that creates a single-button Cocoa app. The non-app code is separated out cleanly, so it should be easy for Linux or Windows users to make use of it.
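
For orientation, the per-frame portion of that code boils down to something like the sketch below (FFmpeg 2.x-era API, matching the linked example; numFrames, rgba32Data, and the output FILE *f are assumed to exist, and error handling is omitted):

AVPacket pkt;
int got_output;

for (int i = 0; i < numFrames; i++) {
    av_init_packet(&pkt);
    pkt.data = NULL;   // the encoder will allocate the packet data
    pkt.size = 0;

    // ...generate or fetch this frame's RGBA pixels into rgba32Data...

    // Convert RGBA -> YUV420P into the encoder's frame
    uint8_t *inData[1]     = { rgba32Data };
    int      inLinesize[1] = { 4 * c->width };
    sws_scale(ctx, inData, inLinesize, 0, c->height,
              frame->data, frame->linesize);

    frame->pts = i;

    // Encode the YUV frame and write any completed packet to the file
    if (avcodec_encode_video2(c, &pkt, frame, &got_output) == 0 && got_output) {
        fwrite(pkt.data, 1, pkt.size, f);
        av_free_packet(&pkt);
    }
}
// A real program would also flush the encoder and free the contexts when done.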

Other input formats

The third argument to sws_getContext describes the format and packing of your source data. A huge number of pixel formats are defined in FFmpeg (see pixfmt.h), so if your raw data is not RGBA, you most likely won't have to change how your images are generated; just pass the matching format constant. Be sure to compute the correct line width (inLinesize in the code snippets) when you change the input format specification. I don't know which input formats are supported by sws_scale (all, most, just a few?), so it would be wise to do a little experimentation; a runtime check is sketched below.

For example, if your data is packed 24-bit RGB rather than 32-bit RGBA, the code would look like this:

struct SwsContext *ctx = sws_getContext(c->width, c->height,
                                        AV_PIX_FMT_RGB24,     // source format
                                        c->width, c->height,
                                        AV_PIX_FMT_YUV420P,   // destination format
                                        0, NULL, NULL, NULL);
uint8_t *inData[1]     = { rgb24Data };    // one plane of packed RGB pixels
int      inLinesize[1] = { 3 * c->width }; // RGB24 stride: 3 bytes per pixel
sws_scale(ctx, inData, inLinesize, 0, c->height,
          frame->data, frame->linesize);
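
On the question of which input formats sws_scale accepts, libswscale can at least report whether it supports a given pixel format at runtime via sws_isSupportedInput(), so a quick sanity check might look like this:

// Check at runtime whether libswscale can read the chosen input format
if (!sws_isSupportedInput(AV_PIX_FMT_RGB24)) {
    fprintf(stderr, "libswscale does not support this input pixel format\n");
    // fall back to a different format, or report an error
}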
