Operating System

Handling Media Files in MatLab

You might be wondering: does anyone love anything as much as I love MatLab?  I get it, another MatLab article… Well, this one is pretty cool.  Handling media files in MatLab is, not only extremely useful, but is also rewarding.  To the programming enthusiast, it can be hard to learn about data structures and search algorithms and have only the facilities to apply this knowledge to text documents and large arrays of numbers.  Learning about how to handle media files allows to you see how computation effects pictures, and hear how it effects music.  Paired with some of the knowledge for my last two articles, one can begin to see how a variety of media-processing tools can be created using MatLab.



Audio is, perhaps, the simplest place to start.  MathWorks provides two built-in functions for handling audio: audioread() & audiowrite().  As the names may suggest, audioread can read-in an audio file from your machine and turn it into a matrix; audiowrite can take a matrix and write it to your computer as a new audio file.  Both functions can tolerate most conventional audio file formats (WAV, FLAC, M4A, etc…); however, there is an asymmetry between the two function in that, while audioread can read-in MP3 files, audiowrite cannot write MP3 files.   Still, there are a number of good, free MP3 encoders out there that can turn your WAV or FLAC file into an MP3 after you’ve created it.

So let’s get into some details… audioread has only one input argument (actually, it can be used with more than one, but for our purposed, you only have to use one), the filename.  Please note, filename here means the directory too (:C\TheDirectory\TheFile.wav).  If you want to select the file off your computer, you could use uigetfile for this.

The audioread function has two output arguments: the matrix of samples from the audio file & the sample rate.  I would encourage the reader to save both since the sample rate will prove to be important in basically every useful process you could perform on the audio.  Sample values in the audio matrix are represented by doubles and are normalized (the maximum value is 1).

Once you have the audio file read-in to MatLab, you can do a whole host of things to it.  MatLab has in-built filtering and other digital signal processing tools that you can use to modify the audio.  You can also make plots of the audio magnitude as well as it’s frequency contents using the fft() function.  The plot shown below is of the frequency content of All Star by Smashmouth.Once you’re finished processing the audio, you can write it back to a file on your computer.  This is done using the audiowrite() function.  The input arguments to audiowrite are the filename, audio matrix in Matlab, and sample rate.  Once again, the filename should also include the directory you want to save in.  This time, the filename should also include the file extension (.wav, .ogg, .flac, .m4a, .mp4).  With only this information, MatLab will produce a usable audio file that can then be played through any of your standard media players.

The audiowrite function also allows for some more parameters to be specified when creating your audiofile.  Name-argument pairs can be sent as arguments to the function (after the filename, matrix, and sample-rate) and can be used to set a number of different parameters.  For example, ‘BitsPerSample’ allows you to specify the bit-depth of the output file (the default is 16 bits, the standard for audio CDs).  ‘BitRate’ allows you to specify the amount of compression if you’re creating an .m4a or .mp4 file.  You can also use these arguments to put in song titles and artist names for use with software like iTunes.



Yes, MatLab can also do pictures.  There are two functions associated with handling images: imread() and imwrite().  I think you can surmise from the names of these two function which one reads-in images and which one writes them out.  With images, samples exist in space rather than in time so there is no sample-rate to worry about.  Images still do have a bit-depth and, in my own experience, it tends to differ a lot more from image-to-image than it does for audio files.

When you import an image into MatLab, the image is represented by a three-dimensional matrix.  For each color channel (red, green, and blue), there is a two-dimensional matrix with the same vertical and horizontal resolution as your photo.  When you display the image, the three channels are summed together to produce a full-color image.

By the way, if you want to display an image in MatLab, use the image() function.

MathWorks provides a good deal of image-processing features built-into MatLab so if you are interested in doing some crazy stuff to your pictures, you’re covered!