AV Foundation: Saving a Sequence of Raw RGB Frames to a Movie


An application may generate a sequence of images that are intended to be viewed as a movie, outside of that application. These images may be created by, say, a software 3D renderer, a procedural texture generator, etc. In a typical OS X application, these images may be in the form of a CGImage or NSImage. In such cases, there are a variety of approaches for dumping such objects to a movie. However, in some cases the image is stored simply as an array of RGB (or ARGB) values. This post discusses how to create a movie from a sequence of such “raw” (A)RGB data.

Credit is due to the very helpful respondents on this Apple Developer Forum thread. I’ve posted an Xcode project with the complete test app on GitHub.

Creating a movie from raw RGB data in QuickTime (i.e., with QTKit) is relatively simple. However, QTKit has been deprecated in favor of AV Foundation. AV Foundation is quite sophisticated, and any thorough treatment is far beyond the scope of this post. So, I’ll focus on the nuts and bolts of this particular problem, assuming the reader is already familiar with the classes we’ll be using, or will go elsewhere for more information.


To write data to a movie file, we’ll need three AV Foundation objects. We’ll assume our per-frame data is described by a data pointer, a width, a height, and a count of the bytes per row.

Setting Up

We’ll set up our code to create an MPEG-4 container with a JPEG codec, and get ready to start dumping each frame to the movie:

This is boilerplate asset-writing code: we first create a (minimal) dictionary for codec type and image dimensions. We then create the three AV Foundation objects necessary for writing: a writer input, an input adaptor, and the asset writer itself. The final three lines hook the writer input up to the writer, and get it ready to start writing the (A)RGB data to the movie file.

Exporting Each Frame from Raw Data

In an application you might have an array of raw frame data already available, or instead, the data might be retrieved or generated on the fly. In the former case, you might have a loop that runs over the raw data frames; in the latter case, you might have a function that’s called to dump each frame as it’s generated or otherwise made available. In this simple example, we’ll just do a loop, each time synthesizing some “fake” frame data. Note that we’re allocating our data on the heap, to avoid any possibility of stack allocation limitations:
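As a rough illustration of that loop, here is a minimal sketch in plain, portable C. It is not the code from the test app: the RawFrame struct, the function names, and the test pattern are all hypothetical, chosen only to show a heap-allocated ARGB raster described by a pointer, width, height, and bytes per row.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical description of one raw frame: the pointer, width, height and
   bytes-per-row that the rest of the post assumes. */
typedef struct {
    uint8_t *data;        /* heap-allocated pixels, 4 bytes each: A, R, G, B */
    size_t   width;
    size_t   height;
    size_t   bytesPerRow;
} RawFrame;

/* Allocate one frame on the heap and fill it with a test pattern that
   changes with frameIndex. Returns 0 on success, -1 if allocation fails. */
int synthesize_frame(RawFrame *frame, size_t width, size_t height,
                     unsigned frameIndex)
{
    frame->width = width;
    frame->height = height;
    frame->bytesPerRow = width * 4;
    frame->data = malloc(frame->bytesPerRow * height);
    if (frame->data == NULL) return -1;

    for (size_t y = 0; y < height; y++) {
        uint8_t *row = frame->data + y * frame->bytesPerRow;
        for (size_t x = 0; x < width; x++) {
            uint8_t *px = row + x * 4;
            px[0] = 255;                        /* A: fully opaque */
            px[1] = (uint8_t)(x + frameIndex);  /* R: ramp that scrolls per frame */
            px[2] = (uint8_t)y;                 /* G: vertical ramp */
            px[3] = (uint8_t)(frameIndex * 8);  /* B: brightens frame by frame */
        }
    }
    return 0;
}

/* Loop over the frames. write_frame is a placeholder for the "write this
   frame out" step; freeing right after it returns is only safe if that step
   has copied the pixels or is completely finished with them. */
int synthesize_all(size_t width, size_t height, unsigned frameCount,
                   void (*write_frame)(const RawFrame *, unsigned))
{
    for (unsigned i = 0; i < frameCount; i++) {
        RawFrame frame;
        if (synthesize_frame(&frame, width, height, i) != 0) return -1;
        if (write_frame) write_frame(&frame, i);
        free(frame.data);
    }
    return 0;
}
```

The early free in synthesize_all is deliberate: it is exactly the situation that makes the lifetime of the frame data matter once the writing step becomes asynchronous.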

In a real-world app, the data variable would instead be retrieved from, say, an array, or read from a file on disk, or be passed as a pointer to a function that saves out one frame. The heart of the solution is filling in the “write this frame out” functionality.

Before addressing that issue, we’ll complete our movie-file-writing by noting that once the last frame is written, we can simply call:

In any case, we’ll be writing the data from a CVPixelBuffer object, so we need to use a function that allows us to create such an object with our raw data raster. Fortunately, we can use CVPixelBufferCreateWithBytes to do this rather directly:

The CVPixelBufferCreateWithBytes function simply takes the pixel format, image dimension info, and image data, and makes us a CVPixelBuffer. We then hand it off to our input adaptor, and we’re done with this frame. Note that we’re using k32ARGBPixelFormat rather than RGBA; the reason is that Core Video does not support RGBA order.
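If your own frames happen to be stored in RGBA order, they would need reordering before being wrapped as ARGB. A minimal sketch in plain C follows, assuming 8 bits per channel and 4 bytes per pixel; the function name is hypothetical and this is not part of the test app.

```c
#include <stddef.h>
#include <stdint.h>

/* Reorder one frame in place from R,G,B,A byte order to A,R,G,B byte order.
   Any padding bytes at the end of each row are left untouched. */
void rgba_to_argb_in_place(uint8_t *pixels, size_t width, size_t height,
                           size_t bytesPerRow)
{
    for (size_t y = 0; y < height; y++) {
        uint8_t *px = pixels + y * bytesPerRow;
        for (size_t x = 0; x < width; x++, px += 4) {
            uint8_t a = px[3];
            px[3] = px[2];  /* B moves to the last byte */
            px[2] = px[1];  /* G */
            px[1] = px[0];  /* R */
            px[0] = a;      /* A moves to the first byte */
        }
    }
}
```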

Handling Asynchrony

Ignoring our synthesizing of the “input data” as the pedagogical crutch it is, you might object that I’ve stuck in a seemingly “extra” step of first creating a CFData object, which now has a copy of the frame data. Why copy the input data? The reason is that -appendPixelBuffer:withPresentationTime: kicks off an asynchronous activity: it does not simply do its work in the current thread and then return when it’s done.

We need to be sure that the data pointer remains valid until that activity is completed. Of course, we’ll need some way to free up that (copied) data once the activity’s done. Indeed, CVPixelBufferCreateWithBytes provides us with a mechanism for just that: the seventh parameter, which in this example is the callback function ReleaseCVPixelBufferForCVPixelBufferCreateWithBytes. The parameter immediately following it allows us to specify what it is we need to delete (the CFData object, in this case). The callback function is defined like this:
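To make the ownership hand-off concrete, here is a portable C model of the same idea. It is only an analogy, not the post's Objective-C code: the callback has the same two-argument shape as Core Video's CVPixelBufferReleaseBytesCallback (a release reference, then the base address), but BorrowedBuffer, wrap_frame_copy, and the malloc'd copy standing in for the CFData object are all hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Same shape as Core Video's CVPixelBufferReleaseBytesCallback:
   (releaseRefCon, baseAddress). */
typedef void (*ReleaseBytesCallback)(void *releaseRefCon,
                                     const void *baseAddress);

/* Hypothetical stand-in for a pixel buffer that borrows someone else's bytes. */
typedef struct {
    const void          *baseAddress;
    ReleaseBytesCallback release;
    void                *releaseRefCon;
} BorrowedBuffer;

int g_release_count = 0;  /* counts callback invocations, for demonstration */

/* Release callback: the refcon identifies the heap copy that must be freed. */
void release_frame_copy(void *releaseRefCon, const void *baseAddress)
{
    (void)baseAddress;    /* the refcon already tells us what to free */
    free(releaseRefCon);
    g_release_count++;
}

/* Copy the caller's pixels so the caller may reuse or free its own buffer
   right away; the copy lives until the consumer invokes the callback. */
int wrap_frame_copy(BorrowedBuffer *out, const void *pixels, size_t byteCount)
{
    void *copy = malloc(byteCount);
    if (copy == NULL) return -1;
    memcpy(copy, pixels, byteCount);
    out->baseAddress = copy;
    out->release = release_frame_copy;
    out->releaseRefCon = copy;
    return 0;
}

/* Called by the consumer once it no longer needs the bytes. */
void borrowed_buffer_finish(BorrowedBuffer *buf)
{
    if (buf->release) buf->release(buf->releaseRefCon, buf->baseAddress);
    buf->baseAddress = NULL;
    buf->release = NULL;
    buf->releaseRefCon = NULL;
}
```

The point of the design is that the producer never frees the copy itself; only the consumer knows when the bytes are no longer in use, so it is the consumer that triggers the release.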

Now, if our per-frame data pointer is guaranteed to be valid throughout the entire duration of the movie-creation process, we could simply provide the raw data pointer to CVPixelBufferCreateWithBytes, and provide no deletion callback at all. This would be the case if, for example, we had all the frames in an array from the start.

Because writing a frame with -appendPixelBuffer:withPresentationTime: is asynchronous, there’s a possibility that the input adaptor won’t be finished with its activity when we’re ready to send the next frame down. Readiness is tracked by the “readyForMoreMediaData” property of the writer input (which the adaptor exposes as its assetWriterInput), so we could simply spin in the current thread until that property became true:
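The shape of such a polling loop can be modelled in portable C. This is a sketch under assumptions, not the AV Foundation code: an atomic flag stands in for the readiness property, and the function name, nap length, and nap limit are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep */
#endif
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Poll a readiness flag, napping between checks. Returns the number of naps
   taken before the flag was seen true, or -1 if it was still false after
   maxNaps naps. napNanoseconds must be below one second (1000000000). */
int spin_until_ready(atomic_bool *ready, long napNanoseconds, int maxNaps)
{
    struct timespec nap = { 0, napNanoseconds };
    for (int naps = 0; ; naps++) {
        if (atomic_load(ready)) return naps;
        if (naps >= maxNaps) return -1;
        nanosleep(&nap, NULL);
    }
}
```

The nap limit is an addition for safety in this sketch; a loop with no bound would hang forever if the flag never became true.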

This certainly works, but it lacks a certain elegance. Instead, we can use key-value observing (KVO) to track this, in conjunction with a semaphore. The idea is that the calling thread should wait on this semaphore until “readyForMoreMediaData” becomes true. We’ll need to define a semaphore and a helper variable:

We initialize these at the beginning of our setup code, and add an observer for “readyForMoreMediaData”:

So, instead of napping in a loop, we can wait on the input adaptor to signal us that it’s ready:

Our key-value observer function is this:

This simply checks to see if the primary thread is waiting for the input adaptor’s completion of its activity, and if so, signals the semaphore. The primary thread then can resume execution, hand its data off to the input adaptor, and iterate over the next frame (or return from the function and wait to be called again on the next frame).
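The wait-and-signal handshake can be modelled in portable C with pthreads. This is an analogy under assumptions rather than the KVO code itself: ReadyGate and its functions are hypothetical names, a mutex and condition variable stand in for the semaphore, and simulate_input_ready plays the role of the observer being fired from another thread.

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* A tiny counting semaphore plus the "is the writer waiting?" helper flag. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    unsigned        count;                 /* pending signals */
    bool            ready;                 /* last readiness value observed */
    bool            waitingForInputReady;  /* true while the writer is blocked */
} ReadyGate;

void gate_init(ReadyGate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->count = 0;
    g->ready = false;
    g->waitingForInputReady = false;
}

/* Writer side: return at once if already ready, otherwise block until the
   observer signals. */
void gate_wait_until_ready(ReadyGate *g)
{
    pthread_mutex_lock(&g->lock);
    if (!g->ready) {
        g->waitingForInputReady = true;
        while (g->count == 0)
            pthread_cond_wait(&g->cond, &g->lock);
        g->count--;
        g->waitingForInputReady = false;
    }
    pthread_mutex_unlock(&g->lock);
}

/* Observer side: record the new value and signal only if the writer is
   actually waiting. Returns true if a signal was sent. */
bool gate_observe_ready(ReadyGate *g, bool isReady)
{
    bool signalled = false;
    pthread_mutex_lock(&g->lock);
    g->ready = isReady;
    if (isReady && g->waitingForInputReady) {
        g->count++;
        pthread_cond_signal(&g->cond);
        signalled = true;
    }
    pthread_mutex_unlock(&g->lock);
    return signalled;
}

/* Demo thread: wait until the writer is blocked, then report readiness. */
void *simulate_input_ready(void *arg)
{
    ReadyGate *g = arg;
    for (;;) {
        pthread_mutex_lock(&g->lock);
        bool waiting = g->waitingForInputReady;
        pthread_mutex_unlock(&g->lock);
        if (waiting) break;
        sched_yield();
    }
    gate_observe_ready(g, true);
    return NULL;
}
```

Keeping the helper flag and the signal count under one lock avoids a missed wake-up between the writer deciding to wait and the observer deciding whether to signal.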

The Final Code