[MUSIC PLAYING] ZAMYLA CHAN: Let's tackle recover. Recover is probably my favorite PSET, mainly because I think it's really, really cool. Basically, you're given a memory card file in which pictures have been deleted. But what you're going to do is recover them all. OK. So it's really exciting, but maybe a little intimidating, because you're given an empty C file and you have to fill it in. OK, so let's break this into manageable parts. You'll want to open the memory card file. That seems simple enough. Then, find the beginning of a JPG image. All the files on this memory card are going to be JPGs. Then, once you find the beginning, you're going to open a new JPG, that is, create a JPG, and write 512 bytes at a time until a new JPG is found, ending the program once you detect the end of the file. So the first step is to open the memory card file. But you know this already, and there's a file I/O function that's going to prove very useful.
OK. So what are JPGs? Because we need to find the beginning of one. Well, JPGs, just like bitmaps, are just sequences of bytes. Luckily, every JPG starts with either 0xff, 0xd8, 0xff, 0xe0, one sequence of bytes, or another sequence of bytes. So those four bytes indicate the start of a JPG. None other than those two combinations of four bytes. And luckily for us, another fact that we can take advantage of is that every JPG is stored side-by-side on the memory card. I've represented the structure of a memory card schematically on this slide here. Here, every square, every rectangle, represents 512 bytes, and it starts with gray blocks, in that we don't have a JPG yet. But then we finally hit a block with a star. That means that the first four bytes out of those 512 are one of those two starting sequences of a JPG. And we go from there, and then once one JPG ends, the next one begins. We don't ever have any more gray space in between.
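As a rough illustration of the "star block" test just described, here is a minimal sketch in C. The function name `is_jpg_start` and the constant `BLOCK_SIZE` are made up for this example. The walkthrough only spells out the first starting sequence; the sketch assumes the second one differs only in its last byte (0xe1), which is an assumption and should be checked against the problem set's specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 512

// Returns true if the first four bytes of a 512-byte block match one of
// the two JPG starting sequences.
// ASSUMPTION: the second sequence (ending in 0xe1) is not named in the
// walkthrough; it is filled in here for illustration only.
bool is_jpg_start(const uint8_t block[BLOCK_SIZE])
{
    assert(block != NULL);
    return block[0] == 0xff &&
           block[1] == 0xd8 &&
           block[2] == 0xff &&
           (block[3] == 0xe0 || block[3] == 0xe1);
}
```

Only the first four bytes are inspected; the other 508 bytes of the block play no part in deciding whether a new JPG begins here.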
But how do we actually read this, and read the 512 bytes so that we can make the comparison in the first place? Well, let's go back to fread, which takes in a pointer to the struct that will contain the bytes that you're reading. So you're going to put those in there-- the size, the number, and then the inpointer that you're reading from. Now, we want to read 512 bytes at a time, and we want to store them in what I'm going to call a buffer. Basically, we're going to hold onto those 512 bytes and do things with them, right? We're either going to compare the first four bytes, or we're going to read it in, OK? So the data pointer will serve as your buffer, and the inpointer, well, that's just going to be your memory card.
Back to our memory card schematic. We're going to read 512 bytes at a time, storing every 512-byte block into a buffer, holding onto that buffer, those 512 bytes, until we know exactly what to do with them. So the beginning isn't anything, so we'll read the buffer, compare it, and we won't need to do anything with it. And then, we finally hit a star block, meaning that we've found our first JPG. So the buffer now holds bytes from that JPG. The next 512 bytes, because they're not a star block, are also part of that JPG. And JPGs are contiguous from there on in, until we hit the next JPG. And then the buffer holds 512 bytes for that JPG, and so on, and so forth.
OK. So once you hit the first starred block, the first JPG, how do you actually, well, open it? Let's make a new JPG. The filenames for a JPG are going to be in the format number, number, number.jpg, in that they're named in the order in which they are found, starting at 0. So the first JPG that you find will be 000.jpg. So it's probably a good idea to keep track of how many JPGs you've found so far. So that's the file name. But how do you actually make that? Well, we're going to use a function called sprintf.
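Before turning to sprintf, here is a minimal sketch of the block-by-block reading described above. The helper `count_blocks` and the constant `BLOCK_SIZE` are hypothetical names for this example; the sketch only shows the order of fread's arguments (buffer, size, number, inpointer) and does nothing with the bytes except count the blocks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE 512

// Reads an already-open card file one 512-byte block at a time and
// returns how many full blocks were read.
long count_blocks(FILE *card)
{
    assert(card != NULL);

    uint8_t buffer[BLOCK_SIZE];   // holds the current 512-byte block
    long blocks = 0;

    // fread(where to store, size of one item, number of items, source)
    // Here: one item of 512 bytes per call; stop when a full block
    // can no longer be read.
    while (fread(buffer, BLOCK_SIZE, 1, card) == 1)
    {
        blocks++;
    }
    return blocks;
}
```

A call such as `count_blocks(card)` on a freshly opened card file walks the whole file once; any trailing bytes that do not fill a complete block are not counted.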
sprintf is a little bit similar to printf, where you can use placeholders for strings, except that in this case, sprintf will print the result into a string, a char array, not into the terminal. OK. So here we see that we have title, a char array that will store the resultant string, and we pass in the format of the actual string with a placeholder, just like we've learned to do with printf. But this code that I have here will give 2.jpg, not 002.jpg. So I'll leave it to you to find out how to modify the placeholder to make the correct name. OK. So once you've sprintf'd, you can create and open that file with fopen, using the title and whatever mode you want to open that file in.
So now that we've opened a new JPG file, we can write 512 bytes at a time, until a new JPG is found. So let's take another look at the syntax of fwrite. I know that I'm showing this slide a lot, but I just want to make sure that you guys don't get too confused, because I know that it's very easy to mix up the first and the last argument, in particular. But remember that you're writing from your buffer into the outfile image.
Now that you know how to write 512 bytes into the JPG file that you've created, well, we want to stop that process once we've reached the end of our card, because there won't be any more images to be found. So let's go back to fread once more, I promise. fread returns how many items of size, size, were read in successfully. Ideally, this is going to be whatever you pass in for number, right? Because you're trying to read number elements of size, size. But if fread isn't able to read that number of elements, then it'll return whatever number it read successfully. Now, one important thing to note is that if you use another file I/O function like fgetc, it won't return how many items it read successfully; fgetc returns the character that it read, or EOF.
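Going back to creating the file for a moment, the sprintf, fopen, and fwrite steps can be sketched together as below. The function `start_jpg` is a hypothetical helper, not something from the problem set, and the sketch deliberately keeps the same placeholder as the walkthrough's slide, so number 2 gives 2.jpg; padding the name to three digits is still left as described above. The mode "w" is one reasonable choice and an assumption here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE 512

// Builds a file name in `title`, creates that file, and writes one block.
// Returns the open file so that more blocks can be written, or NULL on
// failure. `title` needs room for the name; 16 chars covers any int.
FILE *start_jpg(char title[16], int number, const uint8_t buffer[BLOCK_SIZE])
{
    assert(title != NULL && buffer != NULL);

    // sprintf prints into the char array, not onto the terminal.
    // Same placeholder as in the walkthrough: number 2 gives "2.jpg".
    sprintf(title, "%d.jpg", number);

    // Opening in "w" mode creates the file if it does not exist yet.
    FILE *img = fopen(title, "w");
    if (img == NULL)
    {
        return NULL;
    }

    // fwrite(what to write from, size of one item, number of items, outfile)
    if (fwrite(buffer, BLOCK_SIZE, 1, img) != 1)
    {
        fclose(img);
        return NULL;
    }
    return img;
}
```

The caller is responsible for calling fclose on the returned file once the image is complete.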
What's useful about fread's return value is that if you call a function inside of a condition, the function executes while that condition is being evaluated, which is just really useful. So if you have this condition, say, if fread buffer, sizeof DOG, 2, pointer, equals equals 1, that means that I'd like to read 2 dogs at a time. But if fread returns 1 instead of 2 as expected, that means that there aren't 2 dogs left in my file, but rather 1. But if it returns 2, then I still have those 2 dogs inside of my buffer. So now that gives you a sense of how to check for the end of the file, but let's go through the logic now. How do we actually piece all of these elements together?
Once we hit our first JPG, since we know that JPGs are stored contiguously, we'll be writing until we reach the end of the card file. But we don't want to write anything until we hit that first JPG. So it matters, not only whether we're at the start of a new JPG, but whether we've already found a JPG or not. If it's the start of a new JPG, we'll want to close our current JPG file if we have one open, and open a new one to write into. If it's not the start of a new JPG, though, we'll keep the same JPG file open and write into that. We'll write our buffer into whichever JPG file we have open, provided that we have one open, of course. If we haven't found our first JPG yet, we don't write anything. And this process continues until you reach the end of the card file. And finally, you'll want to make sure that you fclose any files that you've fopened.
Once you're comfortable with the concepts, take a look at some pseudocode, which I've included here. First, you want to open the card file, and then repeat the following process until you've reached the end of the card. You want to read 512 bytes into a buffer. Using that buffer, you'll want to check whether you're at the start of a new JPG or not. And the answer to that question will affect your file management-- which files you open, which ones you close.
Then, have you already found a JPG? How have you been keeping track of that? Then, depending on that, you'll either write into the current JPG that you have open, or not write at all, because you haven't found a JPG yet. Finally, once you've reached the end of the file, you'll want to close any remaining files that you have open. We want to be tidy here. And with that, you've recovered all of the missing files from that memory card, which is a pretty amazing feat. So pat yourself on the back.
But there's one more element to the PSET, which is the contest. You'll find that all of the pictures that you've recovered are actually pictures of CS50's staff. So if you're on campus or somewhere near, then you can take pictures with the staff, and the section that has the most pictures with staff members from their recovered files will get an awesome prize. With that, you've finished the recover PSET. My name is Zamyla, and this is CS50.
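For reference, the pseudocode from the walkthrough can be sketched in C roughly as follows. This is one possible arrangement under stated assumptions, not the course's reference solution: the names `recover`, `starts_jpg`, and `BLOCK_SIZE` are hypothetical, the second starting sequence (ending in 0xe1) is assumed rather than taken from the walkthrough, and the file name uses the same unpadded placeholder as the slide, so the output files are 0.jpg, 1.jpg, and so on rather than 000.jpg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE 512

// ASSUMPTION: the second starting sequence ends in 0xe1.
static bool starts_jpg(const uint8_t *b)
{
    return b[0] == 0xff && b[1] == 0xd8 && b[2] == 0xff &&
           (b[3] == 0xe0 || b[3] == 0xe1);
}

// Recovers JPGs from the card file at card_path into the current
// directory. Returns the number of JPGs found, or -1 on failure.
int recover(const char *card_path)
{
    assert(card_path != NULL);

    // Open the card file.
    FILE *card = fopen(card_path, "r");
    if (card == NULL)
    {
        return -1;
    }

    uint8_t buffer[BLOCK_SIZE];
    FILE *img = NULL;   // NULL means no JPG has been found yet
    int found = 0;      // how many JPGs have been found so far
    char title[16];

    // Repeat until a full 512-byte block can no longer be read.
    while (fread(buffer, BLOCK_SIZE, 1, card) == 1)
    {
        if (starts_jpg(buffer))
        {
            // Close the current JPG, if any, and open a new one.
            if (img != NULL)
            {
                fclose(img);
            }
            // Unpadded placeholder, as on the walkthrough's slide.
            sprintf(title, "%d.jpg", found);
            img = fopen(title, "w");
            if (img == NULL)
            {
                fclose(card);
                return -1;
            }
            found++;
        }

        // Write only if a JPG is currently open.
        if (img != NULL)
        {
            fwrite(buffer, BLOCK_SIZE, 1, img);
        }
    }

    // Close any remaining files.
    if (img != NULL)
    {
        fclose(img);
    }
    fclose(card);
    return found;
}
```

Using the file pointer itself (NULL or not) to record whether a JPG has been found is one design choice among several; a separate flag or the counter would work just as well.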