[MUSIC PLAYING] SPEAKER 1: Multimedia, odds are you see it every day, you hear it every day, but what is it? Well, let's start with audio, what you hear coming out of a computer. Turns out computers are really good at recording and playing back audio, and they're really good at generating audio as well. And they can do so using any number of file formats, where a file format is just a way of storing zeros and ones on disk in a way that certain software knows how to interpret. So let's start with a particularly common file format for musical instruments known as MIDI. It turns out that using the MIDI format, M-I-D-I, you can store effectively the musical notes that compose some song. And you can do this for different instruments, and you can then play these instruments together by telling the computer to interpret those notes and then render them based on particular choices of instruments. For instance, this here is a program called GarageBand on Mac OS, and I've preloaded a MIDI file that I've downloaded online, and I daresay you will soon recognize the tune. Let me go ahead and hit play. [MUSIC PLAYING] All right, well, that doesn't sound as good as you might remember it sounding in the movie, but why is that? Well, that's because my computer was synthesizing that music based only on those musical notes. So that wasn't an actual recording of an orchestra performing that song, but rather it was a computer synthesizing or generating the music based on an interpretation of those notes. So MIDI is especially common among musicians who want to share music with each other. It's especially common in the digital musical space where you do want the computer to synthesize the music for you. But, of course, we humans are generally in the habit of listening to songs as we know and love them on the radio or from CDs back in the day or streaming media services. 
And those are songs that have actually been performed typically by humans and recorded often in a concert or in a sound studio, so they sound really, really good and really, really pristine. Well, you don't have to use MIDI for those kinds of experiences; rather, you can use any number of other file formats. For instance, one of the earliest formats for audio and still one of the most common for uncompressed audio is called the WAV (wave) file format, which can store data in an uncompressed form so that you have a really, really high quality version of some audio recording. But also popular and perhaps more popular among consumers is that known as MP3, or MPEG Audio Layer 3, which is a file format for audio that uses compression to significantly reduce, generally by a factor of more than 10, just how many bits are necessary to store some song on your hard drive or on your music device or on your phone or any other form of technology where you might store music. And it does so by really throwing away zeros and ones that we humans can't necessarily hear. Now, some people will disagree, and true audiophiles might disagree and insist that, actually, you can tell the difference among these file formats, and that may very well be the case, because there's a trade-off here. If you want to use fewer bits and really fewer megabytes to store your audio files, you might indeed have to sacrifice some of the quality. But the upside is that you might be able to store on your phone or your iPod or some other device 10 times as much music as a result of that compression. So audio compression is generally what's known as lossy, L-O-S-S-Y, whereby you're actually losing some of the quality or the fidelity of the music, but the gain is that you're using far less space to store that information. A similar file format in spirit is AAC, which is commonly used for audio files as well as inside video files for audio. 
And that's something that you might see when you download files via iTunes, for instance, or the like. And then there are streaming services these days like Google Play and the Amazon store and Apple Music and Spotify, Pandora, and others that don't necessarily transfer files outright to your computer, but stream the bits to you so that they're actually being played in real time so long as your internet connection can keep up with the required bandwidth. So how do we think about the quality of these recordings, whether we're using any number of these file formats? Well, you can think of it in terms of at least two parameters. One is sampling frequency, the number of times per second that we actually take a digital snapshot, so to speak, of what it is the human would otherwise be hearing in person so as to then represent it digitally using zeros and ones. And the second parameter would be the bit depth, just how many bits you are using for that snapshot in time, some number of times per second, in order to represent the pitch and the volume and what it is the human is hearing. And if you multiply those two values together, the bit depth and the sample rate, you will get just how many total bits are necessary to store, for instance, one second of music. And these file formats vary and allow you to vary exactly what these parameters are. So by using fewer bits, you might be able to save space but get a lower quality recording, or if you want a super high quality recording, you might use a higher bit rate altogether. So now let's transition to graphics, what we see in the world of multimedia. Turns out here too there are multiple file formats for representing graphics. And what is a graphic? Well, a graphic really, if you think about it, is just a whole bunch of dots otherwise known as pixels both horizontally and vertically. 
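Returning briefly to the audio arithmetic above, here is a minimal sketch of multiplying sample rate by bit depth. The function names are hypothetical, the `channels` parameter is an assumption the lecture does not mention (stereo recordings store one stream of samples per channel), and the figures of 44,100 samples per second, 16 bits, and two channels are the usual values for CD-quality audio.

```python
def bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Bits needed to store one second of uncompressed audio."""
    # channels is an assumption beyond the lecture: one sample stream per channel.
    return sample_rate_hz * bit_depth * channels


def megabytes(bits: int) -> float:
    """Convert a bit count to megabytes (8 bits per byte, 1,000,000 bytes per MB)."""
    return bits / 8 / 1_000_000


mono_bits = bits_per_second(44_100, 16)            # one channel
cd_bits = bits_per_second(44_100, 16, channels=2)  # stereo

print(mono_bits)                           # 705600
print(cd_bits)                             # 1411200
print(round(megabytes(cd_bits * 180), 2))  # a three-minute song: 31.75
```

Under these assumptions an uncompressed three-minute song needs roughly 32 megabytes, which is why a format that shrinks files by a factor of about 10 is attractive on a phone.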
Indeed most images that you and I see on the web, on our phones, on our computers are rectangular in nature, though you can make some of the images transparent, so they might appear to be other shapes. But at the end of the day, all file formats for images are rectangular in nature, and you can think of them as just a grid of pixels or dots. Now in the simplest form, each of those dots might just be represented by a single bit, a 1 or a 0. So for instance, here, if you look far enough back, is what appears to be a very happy smiley face. But it's pretty simply implemented. If you think of, again, this rectangular region as just having a whole bunch of dots or pixels, I've pretty much colored in in black only those dots necessary to convey the idea of a happy face and left in white any of the dots that are otherwise part of our background. And you might then consider the white pixels to be represented with a one, and the black pixels to be represented with a zero or vice versa. It doesn't really matter, so long as we're consistent in our file format. And so if you take a step back, you can, kind of, sort of, but it's really hard to see the same image even among those zeros and ones, but that might be the simplest mapping from binary to an image. You simply have to decide that there's some number of bits horizontally, some number of bits vertically. And if it's a 1, it's a white pixel, and if it's a 0, it's a black pixel or equivalently vice versa. But, of course, we don't generally use black and white images alone, on the internet, on our phones, on our computers. Indeed, the world would be pretty boring if it only looked like that. And that's, indeed, how it looked way back in the day even before there was digital and before we had file formats like this when you just had black and white TV. But that would really be similar in spirit to what we're looking at here with some grayscales as well. 
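The one-bit mapping described above can be sketched in a few lines. This is only an illustration: the `SMILEY` grid is made up for the example rather than copied from the slide, and `render` and `invert` are hypothetical helper names. It follows the lecture's convention that 1 is white and 0 is black.

```python
# Each string is one row of the image; each character is one pixel (one bit).
SMILEY = [
    "11111111",
    "11011011",
    "11011011",
    "11111111",
    "10111101",
    "11000011",
    "11111111",
]


def render(rows, white="  ", black="##"):
    """Draw 1s as white (blank) and 0s as black (filled) text cells."""
    return "\n".join(
        "".join(white if bit == "1" else black for bit in row) for row in rows
    )


def invert(rows):
    """Swap the convention: every 1 becomes 0 and every 0 becomes 1."""
    return ["".join("0" if bit == "1" else "1" for bit in row) for row in rows]


print(render(SMILEY))
print()
print(render(invert(SMILEY)))  # same picture with black and white exchanged
```

Either convention works, as the lecture notes, so long as the program writing the file and the program reading it agree.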
But here let's focus on color and the introduction of color in a digital context, RGB, red, green, blue. If you've ever heard this acronym, and even if you haven't, this represents the three colors that can be mixed together really to give us any color that we want-- RGB meaning red, green, and blue. So using three different values, how much red do you want, how much green do you want, how much blue do you want, you can tell a computer to colorize each of those dots in a certain way. Now if you have none of these colors, you'll actually get a black dot. And if you have all of these colors mixed together in equal form, you'll get a white dot. But it's in the grades in between that you get all sorts of disparate colors. So let's consider this. Here are three bytes before you, and each is a byte, because each of these is 8 bits where, again, a bit is just a 0 or a 1. So I have eight bits here, eight bits here, and eight bits here. The first byte of bits, first eight bits, is, of course, all ones apparently. The second byte is all zeros, and the third byte is all zeros as well. So if you view each of these bytes, 1, 2, 3 as representing how much of a certain color red, green, blue, RGB, this appears to be a lot of red, because all of these bits are ones, no green and no blue. So RGB, red, green, blue: lots of red, no green, no blue. And so indeed this is how a computer, typically using eight bits per color or 24 bits in total, 8 plus 8 plus 8, would represent the color we know as red. So that is to say if you think of this whole screen as just one dot-- it's not quite a square. It's a rectangle in this case-- but if you think of this whole screen as just one dot, if a computer wanted to make this dot red, it would store a pattern of 24 bits, the first eight of which are all ones, the second eight of which are all zeros, and the third eight of which are all zeros as well. 
And it will interpret the first eight of those bits as meaning give me a lot of red, give me no green, give me no blue, and thus you get a whole screen full of red or a whole pixel full of red. What if we change it up? What if we have a zero byte, a byte with all ones, and then another zero byte, thereby making the red zero, the green all ones, and the blue all zeros? Well, indeed, we'll get a screen filled with all green using that encoding of 24 bits. And you might guess in the end here, if we have zeros and zeros and then ones, RGB, this time we're going to get blue. That's how a computer using 24 bits would represent a dot that's entirely blue. Meanwhile, if you wanted to represent black, you would use all zeros for each of the R, G, and B values, and if you wanted to represent white, you would use all ones for each of the R, G, and B values. And you can get any number of colors in between these extremes in any number of variations of red, green, and blue by just mixing those colors together in different quantities. Now it turns out when we talk about graphical file formats, we don't typically talk in terms of or think in terms of binary. We rather use something called hexadecimal. Whereas binary just has two digits, zero and one, and whereas, recall, decimal has 10 digits, zero through nine, hexadecimal is a little different. It has 16 possible digits. And it's a little weird, but it's at least pretty straightforward. Those 16 digits are zero through nine, and then A through F. In other words, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. And so, of course, zero is the smallest value a single digit can represent, and 15 is going to be the largest value a single digit can represent, which is to say that F represents a 15. So, in fact, let's consider an example. Here is a pattern of eight bits, all of which are one. 
Let me go ahead and add a little bit of space to these eight bits just to separate them into two groups of four, because it turns out one of the nice features of hexadecimal mathematically is that each hexadecimal digit, zero through F, represents in total four bits. Which is to say that we can take a number in binary like this, look at it as two halves, one half of a byte followed by another half of a byte, and use one hexadecimal digit instead of four binary digits to represent the first four bits. And then one other hexadecimal digit to represent the other four bits. So we can take something that takes eight symbols to represent and whittle it down to just two, which is pretty convenient. And so, in fact, it turns out that if we had four bits that are all zeros, in hexadecimal that would just be 0. But if we have all ones, 1, 1, 1, 1, and we convert that to hexadecimal, that's going to be, if this is the ones place and the twos place and the fours place and the eights place, the number 15, otherwise known in hexadecimal as F. Which is to say if you have a byte of bits, 8 bits, all of which are ones, you can think of that same byte as being two hexadecimal digits, FF, as opposed to thinking of it as 1, 1, 1, 1, 1, 1, 1, 1; it's just FF. So it's a more succinct way of representing the exact same information. And so accordingly, if you want to think about red a little more succinctly, you don't have to think about it in terms of eight ones and eight zeros and eight zeros, you can think of it in terms of F, F, 0, 0, 0, 0 just because it's more succinct-- similarly for green, 0, 0, F, F, 0, 0 and for blue 0, 0, 0, 0, F, F. It's just a more succinct way of explaining oneself, and indeed a lot of graphical editing programs, Photoshop being one of the most popular, actually use this notation certainly instead of binary and also often instead of decimal, just by convention. So now let's consider some specific file formats. 
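As a small sketch of the conversion just described, the following splits a byte into two groups of four bits and maps each group to one hexadecimal digit, then writes a 24-bit RGB color as six hexadecimal digits. The function names `byte_to_hex` and `rgb_to_hex` are hypothetical, chosen only for this illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"  # the 16 hexadecimal digits, values 0 through 15


def byte_to_hex(bits: str) -> str:
    """Convert an 8-character string of 0s and 1s into two hexadecimal digits."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight binary digits")
    high, low = bits[:4], bits[4:]  # two halves of the byte, four bits each
    return HEX_DIGITS[int(high, 2)] + HEX_DIGITS[int(low, 2)]


def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Write a 24-bit color (8 bits per channel) as six hexadecimal digits."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each channel must fit in 8 bits (0 to 255)")
    return f"{red:02X}{green:02X}{blue:02X}"


print(byte_to_hex("11111111"))   # FF
print(byte_to_hex("00000000"))   # 00
print(rgb_to_hex(255, 0, 0))     # FF0000 (red)
print(rgb_to_hex(0, 255, 0))     # 00FF00 (green)
print(rgb_to_hex(0, 0, 255))     # 0000FF (blue)
```

Black comes out as `000000` and white as `FFFFFF`, matching the all-zeros and all-ones cases mentioned earlier.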
If you're a PC user, you might not have seen this in a while, but odds are when you did it was for quite a few years, this beautiful rolling hill with a beautiful cloudy sky behind it. This was, of course, the wallpaper or the background image that came by default with Windows XP, an operating system from Microsoft for PCs. So the very first time you turned on your computer and, perhaps, logged in, you would see a screen like this and maybe some of your icons and your recycle bin and the like. Now as an aside and spoiler, this is what that same hill apparently looks like today. So it hasn't necessarily aged well, but for our purposes what's interesting here is what this image was stored as. It turns out that this image originally was a bitmap file, BMP, or bitmap, B-I-T-M-A-P, to spell it out loud. And that file format really is what that word implies. It's a map of bits. It's a grid of bits, which is perfectly consistent with our definition earlier of a very simple smiley face using just zeros and ones or black and white dots. This case, clearly, has many more colors than that, and indeed it's certainly the case in general that graphical file formats on computers support dozens of colors, hundreds of colors, thousands, maybe even millions of colors, certainly more than just black and white alone. But there's a finite amount of information here. And even though this looks like a beautifully crisp green grassy area and a beautifully blue sky with some very smooth clouds, if we actually zoom in on those clouds, you'll see that indeed an image is really just a grid of dots. In fact, let me zoom in on those clouds, and I've not done any alterations. I simply used a graphical program to take that same sky and zoom in, zoom in, zoom in as much as I can. And as soon as you zoom in enough, you see that that cloud that previously looked especially smooth to the human eye really isn't. 
It's just that my human eyes can't really see dots especially clearly when they're really small, and there's a very high resolution so to speak-- a lot of pixels horizontally and a lot of pixels vertically in an image. But if I do zoom in on that, I actually do see the pixelation so to speak, whereby you actually see the dots. And you can see that those clouds are really just roughly represented as a grid of dots or a map of pixels, a rectangular region of pixels. So that's all very interesting now, because it would seem that we don't have an endless ability to zoom and zoom and zoom in and see more and more detail unless that information's already there. And so, much like with audio, when you have the choice over just how many bits to use, so in the world of images do you have discretion over how many bits to use. How many bits do you use to represent each dot's color? And that might indeed be just 8 bits for red, 8 bits for green, 8 bits for blue, AKA 24-bit color, but resolution also comes into play. If you have an image that's only 100 pixels, for instance, by 100 pixels, horizontally by vertically, it might only be this big. Now that might not be big enough to fill your whole background wallpaper on your computer, and so you might try to scale it up or zoom in on it. But when you do that, you're taking only a limited amount of information, 100 pixels by 100 pixels, and you're essentially just duplicating those pixels, making them bigger and blotchier just to fill your screen. Better would be to not start with an image with so few pixels, but rather get a much higher resolution image. And indeed, this is what you get with newer and better camera phones these days, newer and bigger, better digital cameras: among other things, you get higher and higher resolution. More and more dots, so that the dots ultimately that we humans see are so small on our screens, it looks ever more smooth than, say, an image like this. 
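To make the point about scaling up concrete, here is a minimal sketch of enlarging an image by simply duplicating its pixels. The `upscale` function is a hypothetical name, and real image software often uses smoother interpolation, but even that cannot add detail that was never stored.

```python
def upscale(pixels, factor):
    """Enlarge a grid of pixel values by repeating each pixel factor x factor times."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    scaled = []
    for row in pixels:
        # Repeat every pixel horizontally...
        wide_row = [value for value in row for _ in range(factor)]
        # ...then repeat the widened row vertically.
        for _ in range(factor):
            scaled.append(list(wide_row))
    return scaled


tiny = [
    [0, 1],
    [1, 0],
]
big = upscale(tiny, 3)

for row in big:
    print(row)

# The image is now 6 by 6, yet it still contains only the original 4 values' worth
# of information: each pixel just became a bigger, blotchier square.
print(len(big), len(big[0]))  # 6 6
```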
So generally speaking, higher resolution gives us higher fidelity and a cleaner image. Other factors in cameras certainly play into that as well. But there's something else I notice here. It seems a little silly that I'm using the same number of bits to represent the color of every one of the dots on the screen. Because even though I do see a few different shades of gray or white in there and light blue and dark blue, I see a lot of identical blue throughout this image. There's a lot of redundancy, and indeed if we rewind, there's a whole lot of blue in this image itself. There's a whole bunch of similar white, it would seem, in the middles of the clouds. There's a whole bunch of similar looking green. And yet we are using, it would seem by default, 24 bits for every pixel, which just seems wasteful if one pixel is identical to the one next to it. So it turns out that graphical file formats can often be compressed, and this can be done in different ways. It can be done losslessly or lossily. So earlier you'll recall that I proposed shrinking audio files by throwing away information that maybe my human ears can't necessarily hear or that my non-audiophile ears might not even notice is missing. And that would be lossy compression, in that I'm just throwing information away assuming that the user's not going to notice. But that's not always necessary. Sometimes you can do lossless compression, whereby you can use fewer bits to store the same information. You just have to store it more intelligently. So consider this example here where you have an apple against a blue backdrop that, much like our blue sky, seems pretty consistent throughout. And so it seems a little silly intuitively to record an image like this on disk as follows. If we think about me being a verbalisation of a file format, make this pixel blue, make this pixel blue, make this pixel blue, make this pixel blue, make this pixel blue, make this pixel blue. 
Literally saying the same sentence, or more technically using the same 24 bits for every pixel across that entire row even though my sentence might not be changing. And so instead what a clever file format might do is this. This is not what the user sees, but this is what the file format could store with respect to all of that redundant blue. Just remember, for instance, the leftmost pixel's color by saying this pixel is blue, and then for the rest of the row, or scanline as it's called in an image, just say, and so are the rest of the pixels in this row. So I can say much more concisely, essentially repeat this color throughout the entirety of the rest of the row, thereby saving myself any number of sentences let alone any number of 24 bits. And I can do the same here, make this pixel blue and then repeat that color again and again and again. Now it gets a little less efficient as soon as we hit, like, the stem on the apple, because then that sentence has to change. Then we have to say something like make this pixel brown, make this pixel blue, and then repeat again. So we have to, kind of, stop and start if there's some obstruction in the way. And the same thing for the red apple itself. But just look based on the white at how much information we're potentially saving or how many bits we're potentially saving, and yet we're saving those bits in a way that the original information is recoverable. Just because we don't store 24 bits representing blue for every one of these dots on the screen doesn't mean we can't display blue there just by interpreting this file format a little more cleverly. And so this is indeed how a file format might actually losslessly compress itself using fewer bits to store the same image, but in a way where you can recover the original image itself. Now let's take a look at another example, this time of lossy compression. Here is a beautiful sunflower taken somewhere here on campus at Harvard University. 
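As an aside on the apple example above, the "this pixel is blue, and so are the next several" idea is commonly called run-length encoding. The sketch below is an assumption-laden illustration, not how any particular format stores its data: real formats such as GIF and PNG use more elaborate lossless schemes, and the names `rle_encode` and `rle_decode` and the color labels are made up for the example.

```python
def rle_encode(scanline):
    """Collapse a row of pixels into (color, run_length) pairs."""
    runs = []
    for color in scanline:
        if runs and runs[-1][0] == color:
            runs[-1][1] += 1          # same color as before: extend the run
        else:
            runs.append([color, 1])   # color changed: start a new run
    return [(color, count) for color, count in runs]


def rle_decode(runs):
    """Rebuild the original row exactly from its runs."""
    scanline = []
    for color, count in runs:
        scanline.extend([color] * count)
    return scanline


# One row of the picture: blue sky interrupted by a single brown stem pixel.
row = ["blue"] * 6 + ["brown"] + ["blue"] * 5

encoded = rle_encode(row)
print(encoded)                     # [('blue', 6), ('brown', 1), ('blue', 5)]
print(rle_decode(encoded) == row)  # True: nothing was lost
```

Twelve pixels shrink to three runs, and decoding gives back exactly the original row, which is what makes the compression lossless.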
This is a high quality JPEG photograph where JPEG is a popular file format for photographs especially. And this image here was somewhat compressed, but not very compressed. In fact, only if I put my face really awkwardly close to the screen do I see that it's a little bit blotchy way up close. But from just a foot or so or beyond, it looks perfectly pristine. But not if we compress this image further. Suppose that this image is just too big to fit on my Facebook profile page, or it's just too big to email to a friend via my phone. In other words, I need to use fewer bits or fewer megabytes even if it's a really big file to store this same image and convey the gist of the image to that friend. Now I see a little bit of blue and I do see a bunch of yellow, but it's not quite the same clean pattern that we saw with the apple or even the blissful blue sky above the green grassy hill. And so if I were instead using a file format that can still be compressed, but lossily where we're actually throwing information away, this might be the before image. And now wait for it. This might be the after image. So it's still clearly a sunflower, though it looks a little more sickly at this point. But it definitely looks blotchier. In fact, from a foot or more away, I can actually see that my sky has become very pixelated. It almost looks like Super Mario Bros. back in the old Nintendo systems where you could really see the big dots. And the greenery here is just a grid of pixels too, and even the flower has really just become a collection of dots that I ever so clearly see on the screen. And certainly this flower looks none so good anymore. So let's rewind. This was before, after, before, after. And so this is what it means to lossily compress an image. I cannot go from this pretty poor version back to the original, if I have achieved this compression by just throwing away some of those bits. 
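To illustrate why throwing bits away cannot be undone, here is a deliberately simplified sketch that discards the low-order bits of each color channel. This is not how JPEG actually compresses images (its method is considerably more sophisticated); the `quantize` and `quantize_pixel` names and the sample sky colors are assumptions made up for the example.

```python
def quantize(value, keep_bits):
    """Keep only the top keep_bits of an 8-bit channel value, zeroing the rest."""
    if not 1 <= keep_bits <= 8:
        raise ValueError("keep_bits must be between 1 and 8")
    drop = 8 - keep_bits
    return (value >> drop) << drop


def quantize_pixel(pixel, keep_bits):
    """Apply the same quantization to the red, green, and blue channels."""
    return tuple(quantize(channel, keep_bits) for channel in pixel)


# Four slightly different shades of sky blue as (red, green, blue) triples.
sky = [(100, 150, 230), (101, 151, 231), (103, 149, 229), (98, 152, 233)]

coarse = [quantize_pixel(pixel, 3) for pixel in sky]
print(coarse)                              # every pixel becomes (96, 128, 224)
print(len(set(sky)), "->", len(set(coarse)))  # 4 -> 1
```

Once all four shades have collapsed into one, nothing in the stored data says which original shade each pixel had, so the original image cannot be recovered.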
So whereas before I was very cleverly just remembering repetition in the image, in this case using this file format, especially when you really turn the virtual knob and say compress this as much as you can, essentially what my graphical software is going to do is start to use approximations. Well, does this leaf here really need to be 20 different shades of green? How about just two? And that's why I get this big green blotch here and this other green blotch here. Does this sky really need to be 30 different shades of blue? How about two shades of blue and two shades of gray? And so that might be a way to use less information to still represent the same sky. I don't know in this file format just how clear the sky used to be, because those dots have essentially been thrown away and aggregated in this way. But it makes for a much smaller file. And so what are the formats that are at our disposal? Well, there's any number of options out there today, but perhaps the most common are these. There's the bitmap file format, which was commonly used originally in Windows and other contexts, not super common these days. Certainly not on the web, but it does indeed lay out all of your pixels in a grid essentially on disk of zeros and ones. Meanwhile, there's GIF, which is commonly used for low quality images in multiple senses of the word. This is often used for icons on the screen or clip art that you might see, and it's also increasingly used for internet memes or the kinds of images that you might forward along to friends or see popping up on your screen, in large part because GIFs can be animated. So they're, sort of, a very low end version of a video file where really it's like a video file with just a few images inside of it that often play on repeat, so one after the other, creating the illusion of some form of animation. 
But the resolution of GIFs tends to be not very high, although they can be losslessly compressed, as we saw with the apple before, but they only support 8-bit color. And 8 bits implies that we can only have a total of 256 colors in the image itself, which limits the range. And so they tend not to look great, especially when large, for things like photographs of humans and grassy knolls. JPEG, meanwhile, is the file format we saw just a moment ago of that beautiful sunflower. This actually supports 24-bit color, but is lossily compressed, so you might lose some information when shrinking those image files, but it allows you so many more colors that you can see images typically with much higher fidelity at much greater quality. Meanwhile, there's PNGs as well. PNGs are commonly used for high quality graphics that you might want to print or resize, supporting 24-bit color as well, and are generally used for images that you might indeed want to use in multiple contexts. Not so much photographs, but other artwork that's higher quality than GIFs. And here's just a few examples. This is, perhaps, the most ridiculous animated GIF that I could find. This here being a cat flying through the sky. And this is an animated GIF in the sense that it's really just one image after another, after another, after another, and they're repeating again and again and again and again. So even though it looks like motion, really you're just seeing a bunch of images each of which has the cat in a slightly different position, and its rainbow and the stars in a slightly different position. And if you loop these again and again, it looks like the cat's moving, but really you're just seeing a whole bunch of images every split second. Meanwhile, here is another JPEG in addition to the sunflower earlier. 
This is a beautiful shot of the ceiling here in Sanders Theater at Harvard University, and JPEG really lends itself to photography, because you have not only a huge range of colors, you also have the choice not really to compress the files very much. The fact that my sunflower got so ugly on the screen was because I deliberately said compress that sunflower as much as you can, but that doesn't need to be the case. If you can afford to spend the bytes on disk or you can afford to post a really big image on the internet, then you can certainly use minimal compression and capture a really beautiful image. As for a PNG, here might be a good opportunity for a PNG, a really high resolution version of say Harvard's crest that you might want to print small on some piece of paper or large on a banner or the like. And so this might lend itself especially to an application like that. Of course, we don't have an infinite amount of information at our disposal in graphics. Rather we only have the pixels and the dots and the colors that are there when that image was saved in some file format. And so it's quite all too common to see in popular television and film, sort of, abuses of what it means to be a multimedia format and a graphical file format at that. Such that there's entered the lexicon this notion of enhance where enhance essentially means apparently in the media make this image as clearly readable as possible no matter what format it was saved in. And we can see some examples of that with this popular TV show here. SPEAKER 2: We know. SPEAKER 3: That at 9:15 Ray Santoya was at the ATM. SPEAKER 2: The question is what was he doing at 9:16? SPEAKER 3: Shooting the 9 millimeter at something. Maybe he saw the sniper. SPEAKER 2: [INAUDIBLE] SPEAKER 3: Right. Go back one. SPEAKER 2: What do you see? SPEAKER 3: Bring his face up full screen. SPEAKER 2: His glasses. SPEAKER 3: There's a reflection. SPEAKER 2: [INAUDIBLE] baseball team. That's their logo. 
SPEAKER 3: And he's talking to whoever's wearing a jacket. SPEAKER 2: We may have a witness. SPEAKER 3: To both shootings. SPEAKER 1: All right, let's take a closer look at exactly what we just saw. So they're watching this video of some bad guy presumably, and they're trying to identify the suspect. So they're really just looking at what's called a frame in a video, which for all intents and purposes is just an image inside of a video. Because what's a video? Well, much like the animation we saw a moment ago, a video really is just a set of images being shown really fast to the human eye, generally at a rate of 24 frames or images per second or as many as 30 frames or images per second, thereby creating the illusion of motion or really motion pictures. But really it's just a whole bunch of pictures being shown to us super quickly. So here's one such picture. And here apparently is the key to solving this mystery. Indeed, if we enhance that glint in this fellow's eye, we apparently see exactly this. And by the magical incantation of enhance do we apparently see this. And this is where reality breaks down. If this is the entirety of the information that has been stored in some file format and indeed you can see the pixels and the pixelation, the blotchiness, because only so many bits and only so much resolution was used to store that image, and we are looking at a tiny, tiny, tiny fraction of it in the reflection of that fellow's sunglasses, this is all the information that we might have. Now, you might stare at this all day long and, kind of, sort of, think that you see who it is that had perpetrated this crime, but you're certainly not going to get from that anything close to the resolution of this, unless the original video and, therefore, the original frame or image was as high resolution as this output suggests. So the information, the bits, the pixels just aren't there. And even cartoons of today like Futurama know this. SPEAKER 4: Magnify that death spear. 
Why is it still blurry? SPEAKER 5: That's the resolution we have. Making it bigger doesn't make it clearer. SPEAKER 4: It does on CSI Miami. SPEAKER 1: All right, and what better segue than to video file formats themselves than these excerpts from some actual videos. Indeed, you can think of a video file format as very reminiscent of something from the real world. In fact, as a kid if you either made or played with these little flip books, you might have had the ability to actually see something animated really by just flipping through some physical pieces of paper really quickly. Well, that's all a video format is in the digital age. It is simply a file format that contains essentially a whole bunch of images inside of it, each of which is shown to you so fast that there appears to be the illusion, just like this, of motion. And you're seeing 24 images per second, 30 images per second, and it's not necessarily that they're all PNGs or JPEGs or GIFs or actual images inside of it; there are actually more complicated and sophisticated ways of storing the information so you're not just storing each of the frames. You can actually use algorithms and mathematics to actually go from one frame to another. And indeed, there are some very clever opportunities when it comes to videos for compressing video formats themselves. We can certainly leverage within frames, or intraframe so to speak, the exact same techniques that we saw earlier with something like a GIF and an apple, where we can actually leverage the fact that there's redundancy in a given frame of a video, throw that redundant information away, and just remember that the whole sky is blue or the whole rest of some line or row in a file is blue and, therefore, save on information and bits. But with videos you have another opportunity, because you don't just have an individual picture, you have a picture and every subsequent picture, which might look very similar as well. 
In fact, if I hold very still for multiple seconds, odds are almost everything in this video is staying the same except for my mouth, apparently my pointer finger and my lips and eyes as I blink, but everything else about me is pretty much the same. So why would you in your file format store all of the various colors that we see behind me and around me? You don't need to do that. You can also leverage something called interframe compression, whereby in simplest form you can take a look at the current frame of a video and look at the next frame and decide what has changed. And maybe look another frame after that, see what has changed, and another frame after that and see what has changed. And essentially store not every image from the starting point to the ending point, but really just the differences between those frames that are adjacent. So for instance, if we start off with this bee here on a trio of flowers and he moves and he moves and he moves, we could-- if not compressing this video and these four frames that compose the video-- we could just store each of those images essentially as is, even though flowers are not moving, the leaves are not moving. The only thing that's moving is the bee. Or we can be more clever about this just as we were with the blue sky behind the apple. We can recognize that between picture one and picture four, or the first four frames of this video, the only thing that's moving is indeed that bee. So maybe we should store just what we'll call keyframes or a snapshot in time of what the video looks like. And then on each subsequent frame, essentially, just remember what information has changed, in this case, the position of the bee and leave it to the computer playing the video to infer or interpolate these inner frames based on those so-called keyframes. Use a bit of clever math, use some algorithms to actually figure out that, oh, here's where the bee now is. 
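The keyframe-plus-differences idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: a frame is a short list of characters (F for a flower, B for the bee), every fourth frame is stored whole as a keyframe, and every other frame stores only the positions that changed since the previous frame. The names encode, decode, and make_frame are hypothetical, and real codecs rely on much more sophisticated motion estimation than this.

```python
def encode(frames, keyframe_interval=4):
    """Store a full keyframe every `keyframe_interval` frames, else only changes."""
    encoded = []
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            encoded.append(("key", list(frame)))  # expensive: the whole frame
        else:
            previous = frames[i - 1]
            changes = {pos: new
                       for pos, (old, new) in enumerate(zip(previous, frame))
                       if old != new}
            encoded.append(("delta", changes))    # cheap: just what moved
    return encoded

def decode(encoded):
    """Rebuild every frame by applying each delta to the frame before it."""
    frames = []
    for kind, data in encoded:
        if kind == "key":
            frame = list(data)
        else:
            frame = list(frames[-1])
            for pos, pixel in data.items():
                frame[pos] = pixel
        frames.append(frame)
    return frames

def make_frame(bee_position, width=8):
    """A toy frame: two flowers that never move, plus the bee."""
    frame = ["."] * width
    frame[2] = frame[5] = "F"
    frame[bee_position] = "B"
    return frame

frames = [make_frame(p) for p in (0, 1, 3, 4)]  # the bee moves, nothing else does
encoded = encode(frames)
print(encoded[1])  # ('delta', {0: '.', 1: 'B'})
assert decode(encoded) == frames
```

In this toy case the four frames hold 32 pixels in total, but the encoded form keeps one 8-pixel keyframe plus three deltas of two changes each.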
Let me redraw the exact same flower and the exact same leaves behind that bee. But I now only have to store really as many bits as it takes to remember where the bee now is there, where the bee now is here, and then just for good measure, to keep everything synchronized, maybe every few frames we'll have another keyframe that, even though it's a little expensive, stores the entirety of the frame. Just in case something goes wrong, we can guarantee ourselves that we can reconstruct what the video actually is even if there's a little bit of a glitch otherwise. So what are the file formats that we have at our disposal? Well, in the video world the terminology gets a little more complicated in that there are a number of different solutions to the problem of storing video. And indeed these are what the world might call containers. And a container is just as the name implies: it's a digital container inside of which you can put multiple types of data. And the types of data you might put into a container would be a video track, like the actual footage that you see on the screen, an audio track, which is the actual audio that you hear, and maybe a secondary audio track. If a film has been dubbed from one language into another, you might have multiple audio tracks in the same container. And then the software on your computer or even on your TV for that matter that's playing back this video can actually choose between those multiple audio tracks. You might have closed captions or some other track inside the container. So long story short, a container really is just that. It's this bucket inside of which is the video and the audio, but maybe multiple formats thereof so that you can play them back based on your own preferences. So AVI is a very popular format that's been commonly used in the Windows world for years, as has been DivX.
MP4 and QuickTime have been more common on the side of Macs, although MP4 is now pretty much universal across all browsers and operating systems and more. Matroska is more of an open source container that's meant to be even more versatile than these others on this screen, capable of storing any number of file formats inside. And as to those formats inside, they might indeed be video. They might indeed be audio. But within those worlds realize there are different ways of storing and encoding information, and those innermost wrappers use what are called codecs, where a codec is just a way of encoding information in a video or in an audio file format. And there are any number of these options as well, but perhaps one of the most common these days is something called H.264 for video, which is a way of storing video on disk inside of a container, or MPEG-4 Part 2, a little bit more verbosely. A popular alternative there too. And then in the world of audio files, two terms we've seen before, and this is where the world gets a little confusing: sometimes the container formats are the same as the actual media formats. And in this case, AAC and MP3 can be standalone files that you download and listen to in iTunes or some other software, or they can be tracks inside of a container that actually provide a video with the audio that accompanies it. But there aren't just these two-dimensional file formats, if you will. There are increasingly three-dimensional or virtual formats as well that allow you to capture the entirety of spaces like this. In fact, this is a picture that is knowingly a little bit distorted, because if you look up and around in reality at this space, it doesn't look so wide and stretched out. And the stage definitely isn't curved like this, but essentially what you're looking at now is a 360-degree photograph of this exact stage.
And that image, even though it's effectively a sphere that captures the entirety of this space, it's essentially like you've taken a sphere and cut it around the edges and then flattened it out, much like flattening a globe of the earth into a rectangular region, and what you get is something that's a little distorted. But if you kind of stare at this for just a moment and you imagine that the wooden stage here is really meant to be a straight line and all of these seats are supposed to be put together side by side, you can imagine re-forming a sphere out of this otherwise flat two-dimensional image and putting yourself inside of it and being able to experience a space like this. So increasingly some of these same file formats that we've discussed, among them JPEG, for instance, for photographs, do have the ability to inject what's called metadata, some additional, often textual data that the human looking at an image doesn't see. But programs like Photoshop and browsers and applications can actually read it and realize, oh, this image has not only a grid of pixels, compressed or otherwise, color or otherwise, that I can display to the user, there's also some additional metadata that tells me how to display this image in a way that's much more immersive, so that the image effectively wraps around the user. Now the user might look a little silly doing so, but if he or she has a headset quite like this one here, he or she can take a look at this image, pull it up on the digital screen that's before him or her, and thanks to two small lenses, left eye and right eye, start to look up and down and left and right and all around him or her and actually see a space like this and experience it in 360-degree virtual reality. So this is just a taste then of the file formats that currently exist, that are on the horizon today, and just who knows what more will exist.
But at the end of the day, it all boils down to bits, to zeros and ones, how you arrange them on disk, and what features you provide to the users with which to capture their imagination.