>> All right welcome back to CS50. This is the start of week 7. So it was actually around this time last year that I went into a bit of rant when I was in the subway station in Harvard Square and there was this nice lady who was clearly foreign, had never been to our subway system before and all she wanted to do was to get from a Harvard Square to Park Street, pretty reasonable goal, pretty straight forward route. And so she walked up to this device here. Blurry only in as much as I'm bad in photography, but this is--these machines if you've never used them before that they're very high-tech. You can touch the screens and interact with them and actually buy yourself a train ticket. Unfortunately, the sheer number of steps involved in buying something as simple as a one way ticket to Park Street involves some 7 plus steps. So you walk up as I did with this woman who had turned to me or really any one around for some help because she could not make sense of our American machines here. You walk up, you see the screen and you've got 3 huge buttons and the one you ultimately want is that one up there. To purchase new Charlie Ticket value tickets and passes press here. Now as an aside and from the perspective of user interface what is a Charlie Ticket? All right, we're already assuming that some outsider knows what we are talking about when we say Charlie Ticket but we can forgive that and move on. So now, I'm asked what I would like to purchase. This one is admittedly pretty straight forward. So you click on the left to actually take the subway and then you are asked for your fare here and so that one too is fairly straight forward and then you get to this point which is a curious one and that going downtown cost neither 5 dollars, 10 dollars, nor 20 dollars so you have to click other amount if you just want that one way pass which these days I think is like a dollar 50, 2 dollars, right. So clearly an up sell attempt here and I think if we do the math, even 5 dollars is not the cost of a round trip, I think it's probably 4 dollars or so. So we click other amounts, this woman and I and then you get to this and God forbid you wanna spend some triple digit amount of money just to get to Park Street. So now, you're on the screen and you input your number and finally you get to this screen which says you're buying an SV adult, whatever the heck that is. It turns out it's single value but who cares. And then amount to pay 4 dollars. Now I can credit card, debit card, and long story short. Dear God, it took us like 5 minutes just to buy a one way ticket to Park Street. Now, rewind to yesteryear just when I was here and we didn't have these fancy touchscreens. We instead had these devices where you put your dollars in and you get a T token, a metallic token drop it in a slot and then you're on your way. This is not an improvement. I would wager and yet this is representative, this kind of screen of a horrible user interface and sort of technology gone wrong. And unfortunately technology and by extension computer science too often get a bad rap for being beyond the scope of most mortals because we're so often presented with crap like this, right? Interfaces that are not optimized for the common case which I dare say is to get on subway and go somewhere not to go through the 7 some odd steps. So you'll find in problem set 5 I've formed this week to go online tomorrow as usual. You'll be asked to keep an eye out this week or to think about some piece of hardware or software whether in subway stations in your own life, on campus here at Harvard that could be better. And we're gonna ask you for your simply real world human instinct as to how this software or hardware device could be better in terms of its design. Now, recurring theme when we did this exercise for the last time last year is that there was a lot of Harvard centric things that were pointed out. I'm sure some of you can think of various web sites of which you are all too fond here within Harvard.edu that could use some tinkering. So do know that we will actually group some of your feedback by Harvard and non Harvard specific feedback so that we as a course will escalate your critiques, your informed judgments of various tools so that hopefully, collectively as a university we can rise up and present to each other interfaces that are not quite like this. Those of you who are not freshmen but lived through [inaudible] from college.harvard.edu in years past will know that that system is no more, replaced now with Gmail and I dare say that in particular was a recurring theme in last year's survey and look, that problem is gone. So perhaps, we as a class can similarly chip away this year. So with that rant aside, so it's a sad week over the past few days where we of course as a world lost Steve Jobst and to be honest, I would say in the context of this MBTA system in one of the most amazing things Apple really has done is take technologies and make them simpler, right. Mac OS itself is not fundamentally any simpler I would say than windows these days but these phones that we have, these iPads that we have like even technophiles have fallen in love with these things because they just make things easier and we would be remiss in not pointing at another gentleman who unfortunately passed away last week named Denis Ritchie. This is the author of the C language with which you have been all to acquainted this semester and he was also one of the co-contributors to the UNIX operating system which then gave rise to things like Linux and the like which we've been using in this course. So there's a really nice obituary in the New York Times. If you Google his name, Dennis Ritchie but know that it is to this man to whom we owe most everything we've been talking about on this class. So with that said, what's on the horizon ahead. So problem at 5 introduces not only the world of multimedia and imagery but also that of forensics and part of forensics will involve understanding how things like images work. Now it turns out that even though we look at images all day long on facebook or just your own computer in the form of JPEGs, and GIFs and PNGs, well, you can actually represent images as you will see in this week's pset fairly simply, right? If we have a trivial image that we wanna represent like a smiley face here, well you'll probably know before CS50 that an image is just a bunch of dots called pixels. It's a grid, horizontally and vertically of dots like this and so in the simplest form if all we wanted to do is implement a black and white smiley face, you know we could kinda do this using week one's material. We can just auto--arbitrarily say that the bit one will represent white, the bit zero will represent black. And so if we create a file using some kind of program or editor, whereby we say 11000011, we could get the top row aka scan line of the smiley's head there. Now of course images today are much more glamorous than this black and white images but all we do to represent things like JPEGs on facebook profiles is just more than a single bit per dot rather we typically use like 24 bits per dot to represent most any color that a human could possible see. And so you'll see that the problems that walks you through the progression from bitmaps BMPs which this thing here is to JPEGs which you then need to recover as part of the forensic challenge. But we also introduce some C in programming specific ideas. So we talked briefly in past weeks about structs and this table here is kind of overwhelming but what this table here represents is what you can find at the start of any bitmap file. BMP is a file format like JIF and JPEG. It's with the smiley face there it was but it turns out that it's not sufficient to just put zeros and ones in a file and say okay this is a BMP rather file formats like the bitmap file format needs what we generally call metadata. It needs some hints so that programs like photo shop and Ristretto photo viewer or whatever you're using to open up graphics know a little something about how to interpret those bits. Right, if you just hand a program zeros and ones, it's completely out of context. It's not clear if this is a word document, a text file, an image. So this metadata is what list the file extension the last 3 letters of a file name like BMP or JPG typically provide this information. Now not all of these is all that interesting but you'll see at the top that you'll--actually you'll see in the problem set that we do inquire about things like BI size and width and height. So information as to well, are all of these zeros and ones that you see, this wide, are they this tall and so forth. All of that information is hidden in here and as an aside, the reason for these new data types like word and DWORD and byte, these are actually typically used in the windows world when doing windows programming. Microsoft simply has some of its own data types that there is a mapping to Linux and UNIX data types like char and float and so forth. So you'll see in bmp.h one of the files we give you this week that we just create synonyms for these various things, for word and double word and long and bytes. So realize that we have all the expressiveness now and see to do this and those numbers if not clear on the left, just show you how far into the struct, at what byte position this particular field is. But without this information, you're kind of screwed right? Unless you understand something about the format of a file, it's pretty much impossible to read it unless you have these hints and similarly for the recovery part of these problems that we have to recover a whole bunch of JPEGs that I accidentally deleted just knowing a little something about this header information. This one again is BMPs. It's not JPEGs but we tell you a little about the JPEG format so that you can similarly recover like an investigator might a whole bunch of images. And now in a Hacker Edition for those who have been interested this week realize that this is most likely the last of our Hacker Editions 'cause we now kind of converging and all get on to the same page realize these one is actually a little more accessible than usual. What you'll do is much the same problem set but when challenged to resize images, BMPs to make them bigger, the Hacker Edition will also challenge you to make them smaller. And making an image smaller is actually non obvious whereas to make something twice as big, you can imagine conceptually just double all the pixels. Make everything twice as wide, twice as tall, and then you can just do that for any factor. But if you have to shrink an image and for instance you have a pixel, what would you do with that pixel? A pixel is a pixel. You can't shrink it. So instead you have to throw information away, if you wanna shrink an image. And so one of the challenges of the Hacker Edition is to decide which pixels should you throw away or if something is black and something is white, should you may be throw one of them away and just make one remaining pixel gray or something to that effect. So there are a lot of opportunities there for design. And there will also be opportunities as usual for office hours but per 2 weeks ago's announcement, we're gonna have to cross the river just this week to join the iLAB for one of its inaugural events, where we'll have the same iPod and then the same approach to office hours. But upon arrival, you'll see a nice sexy lobby like this. There will presumably be people by then behind the glass walls there and you'll pretty much be welcome to just spread out wherever. You can certainly bring friends from other classes. The goal is simply to hang out as a class, get the questions answered by the TFs and CAs and eat pizza and soda which will be supplied gratefully by the iLAB. All right, so quiz zero. Also what we will do. I'll be there tonight. If you have questions or wanna talk through your quiz zero, you can reach out to me directly. We will make one of the categories quiz zero questions tonight on q.cs50.net and also throughout the week. If you wanna walk through your quiz with one of the TFs or CAs realize that we will make ourselves available for that and lastly, if you're on that fence and you're here today at lecture but you are sort of planning to head to the registrar by 5 p.m. to withdraw from CS50, please do catch me after class for conversation if you're on that fence 'cause realized that we are now pretty much more than half way through the course's problem sets and pretty much half way through the course so we would hate to lose you at this point. All right, now for something technical. So one of the hardest things in [inaudible] is dealing with things like memory and pointers and malloc and free and increasingly will this be an opportunity to really screw up your programs and make programs that crash and seg fault and core dump. But thankfully, just as we have tools to write programs, we also have tools to fix or to diagnose or debug programs. We've used or we've said you should use gdb thus far, the new debugger but we also introduced this week a nice tool called Valgrind whose purpose in life is to analyze your code and figure out if you've screwed up with regard to memory management. Did you accidentally leak memory? Did you accidentally go beyond the boundaries of some array? Now the tool is imperfect. They can't necessarily perfectly test your code because of course any time you run programs, you give them input and Valgrind is not gonna be able to try your program on an infinite supply of inputs. So it's gonna do its best guest but you'll find that it can help us find some things quite nicely. Let me go ahead and open up a file called memory dotC. That's among our source code for today and this is a pretty stupid program that's deliberately buggy and that it does the following. So let's start from the bottom. Here is my main function. Takes no command line arguments, apparently calls a function F and then return 0. So that looks okay. But then let's roll upward to the F function. It returns void. It takes no arguments in but it does in this first line call malloc and just as a check, sanity check how many bytes get malloc in this highlighted line. So 40, right. If we assume in int is 4 bytes, then 4 times 10 is of course 40 and then what just to be technical are we storing in x? Then address. The address of that chunk of 40 bytes. So that's sort of from 2 weeks back and then on the next line unfortunately, I do something stupid. Why is this flawed? This second line of code. Extract a 10 get 0. Yeah? [ Inaudible Remark ] >> Perfect so the maximum index for this array should be 9 from 0 through 9 because I've only allocated 10 int. So of course this is an off by 1 error or I'm over reaching the boundaries of my array. In the worst case, this program might crash and seg fault and core dump but not always and that's one of the frustrating things. Often if you just go ever so slightly outside of your memory bounds--your memory's bounds, the program might not actually crash 'cause the operating system is being a bit generous and is allowing you if unofficially and in a dangerous way to go slightly beyond your array's bounds. So in other words, this is kind of hard to catch sometimes, certainly just by visual inspection. So let's instead try this. Let me open a terminal window. Let me go ahead then and compile my code with make memory. And now if I run this program, it actually seems to be fine, right? No core dump, no segmentation fault. So at first glance, all is okay but now instead let me run this new program Valgrind and let me go ahead and run it on memory. I still need the dot slash just to test this program here and then hit Enter. Unfortunately whoever first wrote Valgrind decided that let's make its output a crazy mess but if you read between the lines, you can actually see some juicy information. So let me scroll back up to the very start of this and you'll see here's where we started. This is where I ran the command and now we have some copyright information and stuff like that but here is where the helpful information is, invalid write of size 4 at such and such an address. Now this pointer, this address is on the left hand side are generally not all that useful unless we really get low level with our debugging but it does mean that you did something wrong at that address. Specifically at line 23 of what file? Right, so memory dotC, right? So the output is actually quite similar to gdb and so if I go back to line 23, which is this guy here sure enough, Valgrind has noticed that I wrote 4 bytes erroneously there. Now it's still up to me, the human, to determine where is my mistake and of course we already identified the mistake but the program found it. Even though when I ran this program, nothing appeared to be wrong. Now lastly, this here says--this actually being address is 0 bytes--actually, that's okay. Let me scroll down here and now if we go down here, leak summary definitely lost 40 bytes in 1 block. Now, this might be a much bigger program. Right now it's pretty simple. It's only a few lines so I can actually run Valgrind with more switches and notice it's telling me to rerun with hyphen, hyphen, leak, hyphen, check equals full to see details of leaked memory. And in fact if I go back to my slide here, what you should generally run even though it's a little tedious to type is exactly this. Valgrind-v--so I'm gonna try this, Valgrind-V. -V almost always means robust for a Linux program. Some of these switches take hyphen, hyphen not single hyphens. The convention just FYI is that if it's a single letter switch, it's usually a single hyphen. If it's a longer word like leak, then you would actually use hyphen, hyphen but that's just the human convention. So leak check equals full and then I have to say 8 dot out in general but in the concrete case, it's the program called memory. So let me hit Enter now and oh my God. So much more flew across the screen there but let's scroll up--scroll up and here we go. Now because I ran it with more robust output and I frankly knew where to look. You would kind to have to take a few seconds to sift through this but notice, 40 bytes in 1 block are definitely lost and lost record 1 of 1. I don't quite know what that means but it looks like the bug is in a function called malloc, right? >> All right, so probably not, right? So you've called malloc so let's at least chase this bug back a little further in the stack and what was the previous function called? Okay, it's probably me who screwed up so it's somewhere in memory dot C line 22, a function called F was called and somewhere in there 40 bytes were lost. Of course in a program, this small, it's pretty obvious to the reader what I did wrong which is what? How did I lose these 40 bytes? I didn't call free. I should have called free at some point and of course this would be kind of a silly exercise to just do it now because this is again kind of a bogus program but that would be the correct fix. Just hopefully in the real world, if I'm writing some real program, there would be some useful stuff going on in between those lines. But this one is just completely wrong. Going into bracket 10 is just incorrect. So in short, unglamorous program but a useful tool. So henceforth anytime you write any program whether for [inaudible] at five or onward that uses malloc, that uses arrays in any way do run your program through that tool. Take a few seconds to sift through the output and it might find a bug frankly before your TF does. Any questions? All right, so now we have the opportunity to start to implement with the same fundamentals from the first half of the semester some real world constructs so that we can begin to solve more interesting, more challenging problems with more sophisticated tools than just an array or variable. So this is a photograph from my favorite [inaudible] and this is of course a stack of trays. Well it turns out in computer science a stack is a data structure. It's a data structure that allows you to actually store data and potentially in more useful way then say a simple array does. The idea of a stack just like in the real world is that if you have a stack of trays, there are some up sides and some down sides. One upside of a stack is that if I put something down like a tray and then I put something else down like a tray, then I put something else down like a tray well it's very easy and very fast to get the most recent thing from that stack. You can just pop it off the top of the stack so to speak. Unfortunately, a stack of trays or a stack of paper here is not the most efficient system if you instead or for instance queuing up for something you really want, right. It would kind of suck if you got to the Apple store like 5 am before or even earlier to get your fancy new iPhone 3 or 4 or whatever and the staff in blue shirts starts popping people off the [inaudible] queue from the back, right? That is a bad data structure. A bad design when it comes to selling iPhones. You would really piss off the people at the front of the line. So instead in computer science, there's an alternative data structure that's similar in spirit. It's kind of a list of some sort but it's called technically a queue. And a queue is first in first out. So the acronym is fifo, FIFO. First in first out just means if you get there first in the morning, you're gonna be the first one allowed to buy that product. By contrast a stack is the opposite but we'll see utility of both of them. In fact, those of you familiar with HTML know that the prevalence in HTML is the idea of balancing tags, having an open tag, and a closed tag, and the notion of validating HTML involves checking. Do you close all of the tags that you opened or you can actually use it turns out a stack data structure to answer questions like that if you are the W3c's Validator. And for those for whom most of those words made no sense, not to fear just a couple of weeks, we'll be diving into the web programming portion of the course. But to begin to now use these data structures and solve more interesting problems, we need to be able to chain structures together. One of the biggest problems of an array is that as--even though it's super simple to use once you get to hang of it and that you can go to any elements in it. What can you not do easily with an array? [ Inaudible Remark ] >> Sorry? You can't extend it right. You have to decide in advance if you want 10 bytes or 100 bytes or whatever it is that you are allocating. You have to decide in advance. And what happens if you decide, I really need 1 additional element or twice as many elements what do you have to do if you're using arrays to store your data. What's that? [ Inaudible Remark ] >> Yeah. You pretty much have to make a new one. Copy the old data into the new one and then free the old one. Now, we haven't done that ourselves really but we've been using something that does. What function that we've been using since week 1 does exactly that, when it runs out of space, it allocates more, copies old into new and then frees the old. >> GetString. >> GetString, exactly. Remember GetString is a perfect example because we have no idea how long of a sentence or paragraph some students ever gonna type when we sit down at the start of the semester to implement GetString. So we have to be dynamic. We have to enable the function to grow the storage space it's using and we've seen as of 2 weeks ago that the technique for that involves using malloc and it turns out some other related functions like realloc and the like but it boils down to allocating and reallocating memory. So what if there were a data structure that's allowed you to have some fixed amount of memory first but then grow it super easily because the problem with using the array trick, is what? If your string is this big and you need 1 more byte or twice as many bytes, this is a lot of work, right? You have to ask the operating system for more memory. So temporarily, you're doubling your memory footprints then you have to copy as in a 4 or Y loop all of the old memory into the new memory, then you have to free that memory. That's a lot of steps. If the user has may be hit 1 extra character. So what's an alternative? You can avoid resizing arrays in this manner by just allocating maybe 2 gigabytes of memory. Any time the user might type something in, I'll just choose a huge upper bound, 2 gigabytes but of course what's the obvious problem here? [ Inaudible Remark ] >> Right, if you just allocate an excessive amount of memory, your program at some point is just gonna stop working or it's gonna slow the computer to a crawl because it appears to be using way more memory that it should and if you only have a couple gigabytes of RAM and your silly little program is using it when you're actually trying to do useful work with like a browser or word processor, you know everything will just slow to a crawl as we've discussed. So suppose there were a structure that looked a little something more like this. Enter into the picture something called a linked list which is pretty much what the name suggests. It is a linked list of what are generally called nodes. What's a node? For now, it's just a chunk of memory and we can implement these little chunks by way of that C construct known as a struct. So where have we used the struct before thus far? What comes to mind? [ Inaudible Remark ] >> Sudoku. So we had a struct in sudoku to represent things like the board and the y, x location. We had something for students a couple of weeks ago where we wrapped together students name and ID number and house. So any time you wanna kind of clump together some related data fields and call it something new like a student structure, you can do that. So these squares in just a moment are gonna represent what we'll generically call a node and each of these squares in this picture here stores a number, apparently an int, a 9, a 17, a 22, a 26, and a 34 and this is equivalent for the moment to an array of size 5. So I see now 5 numbers 9, 17, 22, 26, 34. so in effect, I could implement this very easily with week 1--with week 2 stuff with an array of size 5. But I propose that because I've now drawn the picture in this way using arrows and not by having these numbers back to back to back as is the case for arrays, it's gonna be super easy conceptually to insert more numbers into this, to append them to the end, to insert them into the beginning and there too is an advantage. What was really hard in an array? Remember when we had a whole bunch of people up here on the stage when we were doing sorting and we had to move someone who is over there, all the way over here. In the worst case, we had to ask everyone, "Can you move down? Can you move down?" So just to make 1 swap in the worst case, with an array we had to do like N-1 steps of work, but suppose the picture were like this and suppose that in between each number was literally a gap and there's some concept of an arrow saying the next number is over there--the next number is over there well maybe we could just have squeezed that students and their number into wherever we wanted in this list. So we just need a way of linking nodes together in this way. So odds are these arrows represent what? [ Inaudible Remark ] >> Yeah, so pointers right. This is how we've been generically drawing pointers on the blackboard and so really and underneath the hood, there are some memory addresses and hexadecimal notation and all of that. But conceptually, this is a linked list. It is something like an array but instead of having contiguous memory, your nodes, your integers can be stored all over you computer's memory which is great because then you don't need to find a huge contiguous block. It can come all fragmented like from the entirety of your RAM and it looks like we'll be able to more easily insert numbers into these lists because we can just temporarily move up 1 of those arrows, slot someone else and another number in there and then fix what the arrows are pointing at. So let's just take a look at what this might look like. I'm gonna go ahead and open a file called let's say list 1.h and I'll propose this as the container for 1 of these things called nodes. So we've seen the syntax before even though we haven't written it ourselves all that much. Type def does what? [ Inaudible Remark ] >> If you kind of wanna be a student, you just say well it defines a type. Well, what is that it makes a synonym. 1 type is a synonym for something else. So let's see what this means here. So we see the keyword struct and we see the word node. So G that is being friendly and highlighting all of the C specific words. So type def in struct are C key words. Node is just something I made up. It could be called foo but by convention we call these things nodes and then what composes a node? What is in one of those rectangles? Well, 2 things. One is an int and I can arbitrarily call it N but I could call it anything I want but then the curious feature is the second thing. The bottom half of this picture here. So for instance, here is N, the number 9, these fields here which I'm apparently gonna call next what is that? Well, if it's an arrow that suggests pictorially that it's a pointer. How do we write a pointer? Well with a star somehow but what is this a pointer to? Well, it's a pointer to the same kind of structure. So we say, struct node star and then we choose an arbitrary word for this arrow and I'll call it next just to imply that it's always pointing to the next object. And then down here, I have the word node again which feels a little redundant and it is in some sense. So at the end of this statement is the word node, that relates to the type def. So what this word and this word together mean is henceforth call this whole thing more succinctly node. But remember that C is not so generous. It doesn't look ahead in your code and so we kind of have a problem because if I want to create this pointer inside of 1 of these structures to another such structure, does the word node exists at this point in time? Not really, right. The word node is literally declared here. Well, wait a minute you also said node here but this is again just because. I need temporarily a name so that I can refer to myself inside of this structure so long story short, struct node creates this thing and then the combination of type def and node here rename it more succinctly from struct node to just node. Why? Why this complexity? Frankly it's just tedious typing struct node, struct node, struct node all over the place so now we can just call it node. Thanks to type def. But if that's a little unclear syntactically that's fine for now just take for granted that it makes one of these rectangles possible. So how do we now start to manipulate these things? Well, let's rewind just a moment to an example that we already looked at in the past. Recall this thing from a couple of weeks ago. This was a structure that we used to define a student and there's one syntactic difference. What is fundamentally different about this is there's no word here to the right struct. Why would I not need to say type def struct student using the logic from a moment ago? [ Simultaneous Talking ] >> Exactly and let me tweak the word. I don't mention student inside of the struct, so I don't need a temporary word there, but I could put this here but again just as a matter of style because I'm not using struct student anywhere. I might as well make it more succinct. I do need to say struct and it's in red because it is an important key word here but I don't need to say students up here unless I'm gonna refer to a student internally and this hopefully makes a little bit of intuitive sense right? If I'm a student object, there should fundamentally in the real world and in programming be no notion of next inside of a student's object. Right? I shouldn't as a human have some kind of data remembrance of someone to the left of me so again we don't need a pointer there but we do have two pointers for name and house. So how did we use this? Well recall students structs 1 dot C which we looked at briefly awhile ago and this was just a simple example where we created a bunch of structs of type students. We then got the user's ID and the user's name and their house and then we scrolled down and just yelled out so and so is in [inaudible] if they were in [inaudible]. So recall this was from a couple of weeks ago and there's only a couple of important take aways for now. For now, recall that this is how we declare a--in array of students and for those with prior programming experience class means like a class of students, not a class of objects. So student class gives me an array called class of student structures here. So what goes on here, this was the new syntax. If I wanna update the 8th student's ID number, I used the dot syntax. If I wanna update the 8th student's name, I use the dot syntax and if I wanna use the--update the 8th student's house, dot syntax. That was new. Then this was kind of old school stuff, strcmp for string comparison. What was I comparing, I was comparing their house against literally the word [inaudible] in double quote and then this was important here. I did need to, for good measure, call free on house and name, why? Because what, sorry? [ Inaudible Remark ] >> Because I stored it but how did I get those strings, name and house? So you getstring and even though this was kind of some garbage we used to kind of brush under the rug, this was kind of a flaw in our previous programs for several weeks which revealed a couple of weeks ago that get string all this time has of course been using malloc and so now anytime you use malloc whether directly or indirectly, you must free that memory. Otherwise, Valgrind, this tool I showed a moment ago that would actually catch. If you run Valgrind in fact on most of your prior programs that you getstring from early in the semester, almost all of them would be buggy in that they would all have some memory leak. Alright, but that's okay, that was deliberate and those were the so called training wheels that are now off. So let's just do one modification of this to kind of tie this together also to this week's problem set 5. So this is struct2.C. It's almost identical but I thought it would be kind of neat to stop writing programs that take an input and then produce output and then lose that output forever. It would kind of be nice if our program for instance could store permanently the names and IDs and names of all the students we typed in, right? If we want to approximate the idea of a database, a server that stores information long term will be kind of nice. If I could use these same ideas now and start saving my work to disk, just like we take for granted in Microsoft word and the like the ability to save, let's do that ourselves. So here is exactly the same code. Give me an array of students structs, populate them using get int and getstring and getstring but down here, go ahead and say, whoever is in [inaudible] just to be cute. But now let's save these students to disk which just means save a file. So how might we do this? So we could do this in any number of ways. Microsoft back in the day decided that dot doc format, d-o-c is going to store your essay and your bold facing and your table of contents in a certain way. Similarly, we as programmers, have to decide how are we gonna store something like a student ID and name and house perhaps for multiple people in a file. So you could actually do this in a bunch of ways but we decided totally arbitrarily that we're just gonna store these IDs, names, and house one per line in a text file just so we can see it. But this is of course a very unsophisticated approach but it will allow us to write files to disks. So let's see this. The very first line of interest is this thing here. So F open, you might guess is file open. We've not had to use this ourselves unless you already dove into problem set 5 but as the name suggests, it opens a file. What's the name of that file gonna be, arbitrarily I called it database. It could be called foo but again I chose database and then just take a guess, what does "W" means? So it means write. So it means write this file as opposed to reading it. So you can actually use the same F open function to read files in but for today, we'll just do it with W. As an aside, you might see in various books or tutorials, people saying WB for write binary or write ascii, realize that's on the appliance and on most modern systems, it suffices just to say R or W, you don't need to worry about the B, just FYI. So why do I check for this? If FP which is just generic notation for file pointer, if it equals null. When might something go wrong related to writing a file to disk? Why might F open this conjecture return null. [ Inaudible Remark ] >> sorry? [ Inaudible Remark ] >> So database is missing? Maybe that file doesn't exist. In this case, that's gonna be okay cause what's nice about F open is if you're writing a file and it doesn't exist, it will create it for you but a good thought, that would be the answer if we we're trying to read it back in certainly. Yeah? >> The disk is full. >> Yes, so if your disk is full, right. If your hard drive is just overflowing with other files and you just don't have enough space for any new file, F open could return null, other maybe more technical problem? If you've done something stupid or accidentally like try--change the permissions on some folder or you're in some folder on the system that doesn't belong to you, it's someone else's my document's folder for instance, well F open might return null in that case too. So in short, if you lack the ability to write a file, you're just gonna get back null but what is it you're gonna get back? Well FP is of type file and it is in all caps just because someone decided to make it that way and it turns out a file structure, F-I-L-E in all caps is some kind of structure not unlike conceptually a student's structure, inside of which is a bunch of stuff and that stuff is generally hidden from us but you can think of a file, kind of like a cassette tape if you even remember these things. A cassette tape of course has a whole bunch of tape inside and when you play a song in a cassette tape with these 2 things turn and the tape physically moves. A file on a computer system obviously has no tape but conceptually you read and write it in the same way. When you open a file with F open, it's like taking out a cassette tape and starting at the beginning of that tape and as you start to read from the cassette tape or read from the file, it's like advancing the reading head or the cursor so to speak or the file position indicator so to speak on that file and if you write to a file, similarly that happens. Now this is relevant because in problem set 5, one of the challenge is to enlarge a bit map file. So you have to take a pixel and double it or triple it and so forth. So realize that one of the challenges that arises in problem set 5 is, well if a file is being read from start to end just like a cassette tape, well what if you need to kind of rewind the tape and go back and reprint those same pixels, re-output those same pixels, well there's functions like F seek, file seek which can help you do this. There's F rewind although that's typically overkill but there's other ways altogether as Tommy discussed in the walk through involving arrays and just remembering some of those bits. But in short, for now, this is a pointer to a file structure and it's not clear to us just yet what's actually in that file. I accidentally clicked there so let me go back to structs2 dot C and we'll finish the story. So how do you write to this file? Well if I get as far as here inside of this if condition, the file opened okay and I'm good to go to write. So how do I write to a file. It's actually pretty familiar. So you don't use F--you don't print F, use F print F for file print F. The first argument is apparently a pointer to the thing that you wanna write to and what are the second and third arguments apparently. It's pretty much identical to print F, right. It's the format string. What do you wanna write and what you want to substitute in for any placeholders so apparently, I'm gonna write integer followed by a new line, a string new line, string new line and I'm gonna plug in each of these student's IDs, names and houses then I'm gonna close my file with F close as file close then I'm gonna go back and free the student's name and house. But one question, I don't free the student's ID number, why? I don't free ID. >> That's not a string. >> It's not a string, it's an int and we did not allocate that integer with malloc. It was just there inside of the struct recall. So let's see what the end result here. So this is structs2 dot C, let me go ahead and make structs2, okay. It seem to compile okay. Let me go ahead and run this now. So run structs2, enter, student's ID, let's Matt Kirkland [phonetic], student's ID, let's say Rob Kirkland [phonetic], student's ID 3 David Meather. So it still says David is in Meather but then nothing, but if I do LS now, there's a whole bunch of code from today as well as this file database. So let me go ahead and run G edit on database, enter and voila! We've just created this file. Again, so primitive this format, right? This is not what a word document looks like inside. But it is our own proprietary file format just as bit map and JPEG have their own file formats and ours too is in just ascii text which might not be sufficient if we wanna store something like an ID photo but at least we now have one way of expressing ourselves persistently by writing these things to disk. And it remains then to use these same fundamentals to start linking things together in an interesting way. Let's go ahead and take our 5-minute break now and then we'll return to that. [ Pause ] >> Alright, so we are back. Linked lists are much more exciting when we actually make a mess of them with actual humans. So if we could indulge, are 8 people willing to volunteer on center stage? Have to be comfy on camera. Okay, let me decide this randomly who is se gonna come up. Okay here we go, how about you two, three, four, five, six, like in the side of the room, seven, eight. Come on up. All right, you lost your chance. All right, come on up and we are going to hand you the same numbers we see here on the screen. So you will be number nine and if you wanna go on our usual spot over there welcome 17, welcome 22, welcome 26, we will fix the slide to make this consistent with reality. 34, you can be first. You can be pointer. >> Okay. >> You can be predecessor pointer, very dramatic. All right, so slights bug and this is an image too isn't it. May I see 20--who did I get wrong, 29 okay, debugging [laughter]. One moment please. Fixed, all right [laughter]. So that's 26. Okay, so here we have a linked list and let's ignore for the moment how we got to this point already. Let's just assume that we are-we've fast forwarded 10 minutes mentally and we already have the ability to create linked lists and here is the state of our world at this moment and actually let's pop out if you don't mind pointer and predecessor pointer if you could be here in the heap for just a moment. So first, this is good and now this is the point at which we need to really mimic this picture if you could somewhat awkwardly with your left hand point at the next person except for 34 where I would point down with your left hand, okay. So now we have implemented a linked list with humans. So what's going on here? So first of all each of these--you still have to point. All right, each of these structures has two things inside of the struct. We have an integer like the number 9 and then we also have a left arm known as next which is the pointer. So each of these humans represents a struct at this point with those two fields for a total of little check how many bytes does each person represent. How many bytes? >> Eight. >> Eight, right, so four bytes for the int and four bytes for the pointer. So each of these rectangles is 8 bytes on a system like the appliance. Now first is a little special, so first what's your name? >> Lisa. >> Lisa is especial in that she does not have an integer. She is only a pointer and so the reason for her being drawn as a smaller rectangle up there is that she is only 4 bytes but Lisa is super important because without Lisa we're gonna forget completely where this whole linked list is and without Lisa we would have a whole huge memory leak if we allocated all of these nodes but didn't remember where the first of those nodes is. So for now, each--Lisa is special. She is the pointer and Lisa really is our linked list. So we've had arrays before and we've had names of arrays which are really just the addresses as we've seen and so Lisa is sort of the equivalent of that. The only piece of data we have to have to have to keep around in memory to remember where the entirety of our linked list is, is the first person, a pointer to the start of the list and now all of these guys are fundamentally the same, they are incomplete structures except for 34 who apparently is pointing at the ground because that's the end of the list so that's probably gonna represent what, when pointing down. So null, we need a special value to say there is no one else to the left so when pointing down this represents storing the null pointer, all zeros in that pointer field. Okay, so the first challenge at hand and we will make use of--you don't technically have to keep standing there like that, we will use here in a moment. [ Laughter ] >> Okay so let's go ahead and try inserting at the end. So we have the--we need the number 55. Actually let me change it, let me steal a pointer, can you take on the rule 55. So what's your name? >> Lucas. >> So malloc Lucas, right. So now we have another 8 bytes inside of which we can point the number 55 and now let's figure out how to insert Lucas into this list and actually what's your name again? >> Joseph. >> Joseph oh okay and I will take that [laughter]. So Joseph if you wanna kind of take charge here and explain just in real human terms how do we insert 55 into our linked list. If you wanna go ahead and make the insertion, sure [laughter]. And let me give you this if this is on just to explain what it is you're doing here. >> Okay, so I guess you take him to the beginning of the list that you know this is where the list starts and--. >> Okay. >> You go down through each pointer to find the end of the list and then end up here. >> Okay, good. So we found the so-called tail of the list and then what did you do? >> Then you insert there. >> Okay. So let's push a little harder and what it means to insert into the list what you see, you put him there but-- >> So you take the pointer of the last element in the list and put. [ Laughter ]. >> Okay excellent and where should his left hand be pointing. >> To the ground. >> Okay good. And this is actually quite important just because you've malloced a node, a structure put in the number 55, you also have to change what the default next value is because just as with local variables if you just malloc a node and then put 55 there what's gonna be inside of the other 4 bytes called next. >> Garbage. >> It's just some garbage value and that's really bad in the context of pointers because if you follow a garbage value as a pointer you're gonna end up in some crazy place in memory and the odds are that you're going to seg fault or core dump. So that was really good. So now we have 55 at the right position, 34 is point at 55, let's now try to do a little something else with list. Supposed we instead now wanted to insert the number 5 so I need to malloc one more volunteer please. Okay come on up malloc number 5, what's your name. >> Emmy. >> So Emmy's been malloced here. We're gonna need you again with the mic to handle our insertion at the head of the list and consider now what is fundamentally different about insertion at the head of the list and the reason of course that we're doing head and tail is because we're trying to keep the list sorted which we've tried to do with arrays in the past, all right, go ahead. [ Inaudible Remark ] >> Oh, and talk as close as you can. >> Okay, so you have her at some place in memory. >> Okay. [ Laughter ] >> What the first person right to her and her to point to the next person. >> Okay good. And now even though as humans they did have to kind of shuffle left and right to make room for each other, in memory because each of these nodes is that some arbitrary location in memory as determined by malloc notice what did not have to happen was 19, 17, 22, 26, 34, and 55 did not have to move, right. If they did have to move, the running time of insert would not be constant time, it instead would have been what? They go of N right and the worse case if we had to move every one over by one spot in memory which we would had to do in an array its big O of N but instead insertion is big O of 1 constant time 'cause all we have to do is right to the start of the list and we know where Lisa is because she again is the special pointer that we always keep around and even though we had to do a couple things like point her at 5 and 5 at 9 well that's still two, three steps it's a finite number, it's constant so that was a really fast operation. By contrast, what was insertion at the tail of the list? >> The same thing. >> Well, big O of-- >> Big O of while you have to go-- >> Okay we had to go to every pointer so big O of N exactly. So big O of N. So let's see now what happens if we make this a little trickier and have one more node so I need a malloc one last volunteer number 20. Hansel, come on up. And so we've malloc another node and take it away. He belongs in the middle and is explain for the audience if you could how we find his location because it's not as simple as tail or the head this time. >> Okay. Well, in the middle is N. >> Well, so in sorted order. >> Sorted order. >> Yes and you can walk in front if its--if you need to see the actual numbers. >> Okay, so what's her name? >> You weren't expecting to be this involved we're you in this demo, okay [laughter]. >> But so you go to first pointer, follow it to the next value, compare the value. >> Good. >> Then if it's greater than the value in the first element. >> Okay. >> If you point to the second one and compare those. >> Good. Just please just watch where you are walking. >> Okay. >> Okay and good and so now how many pointers need to be updated? >> I will see [laughter]. So, however many--whatever element that is. >> Good. Okay well actually let's view it. Can we pop Hansel out for just a moment and 17 of you could keep pointing to 22 so just super deliberately now what needs to be updated in the way of left hand to insert Ansel in the right location. >> So he has to point to Hansel. >> Okay, that's good. >> Point to her. >> Good bug. So you just broke our linked list how and why and someone else can answer this, I don't want to put you too much on this problem. What was the bug in that? [ Inaudible Answer ] >> Right, so you have to point Hansel at 22 first because otherwise what happened? When we updated 17 to point to Hansel number 20 we now have lost a pointer namely where did 22 go, where did 26 go, we've both--we have a dangling reference we've orphaned frankly those four people there. So this is an example of perfectly memory leak, so well done so let's fix. >> Okay, so now you point. >> Okay Hansel points to 22 and-- >> Seventeen. >> Seventeen point to 20 and voila, oh. Okay that was a lot of work but incredibly important now the order of operation. So let's hold that pause for just a moment and let's see if we can round out just a couple more 'cause I promise you it's a lot more interesting to look at humans do this than actually C code. So supposed we wanted to remove the tail of the list, supposed that we said we needed to chop off number 55 or we wanna remove the largest element from the list, where are we go going with this? Well you can imagine now if we have the ability to represent a dynamic length of a list. That's more interesting--more dynamic than array we can start to implement that idea like a stack of trays or a queue in something like a line outside the Apple store. So suppose the goal now is to remove the highest numbered person 55 what needs to be updated? >> Okay so you can just point her to the ground. >> Good. >> Free him. >> Okay good! But what did--okay so that's good but technically we run the risk there of doing what again? What we technically might have lost 8 bites of memory because we pointed 34 down which means okay well where was 55 so we at least need and I can help out here. We at least need to either invoke me like a temporary variable and I point at 55 now 34 can point again at the ground and then what do we free? We free this pointer which is still pointing at the same node or what could you also do rather than-- [ Inaudible Remark ] >> Right you could free 55 first. There's nothing bad about freeing a node that's been pointed at by someone else so long as you immediately thereafter lower 34's hand. So either you can do it with a temporary variable or just do it in the right order. And lastly, if we wanna actually update say the head of the list such that it looks like this. We wanna remove number 5. Can I put pressure on you one last time to get this one right with no memory leaks? [ Inaudible Remark ] >> We wanna remove 5. Oh! What needs to happen? >> Involve the temporary pointer, pointer 5 >> Okay. [inaudible] mine. >> Okay. Point at 5. >> Good. >> Point her at whatever number that--and then you can free the point. >> Excellent! So if we could--a big round of applause for these guys. This is not easy. [ Applause ] >> Very nicely done. Guys you can head out to Rob there. Thank you so much. So now let us make linked list look much less exciting with actual code. Just to get a sense of what happens here and how to start thinking about this. You'll find that we've implemented all of that now for you and you'll find this to be particularly useful for problem set 6. One of the challenges in problems set 6 recall from a much earlier teaser is to implement the fastest spell checker possible. Now at first glance, this might seem a little mundane but it gets hard when the list of words, the list of English words you're gonna be handed is like 150 thousand words long and you're gonna need somehow read those in from a file much like you are doing already with problems set 5 this week. But then you're gonna have to somewhat somehow lay out all 150 thousand of those words in memory so that look ups spell checking is incredibly fast. Now the simplest implementation of a dictionary is actually pretty simple right. You allocate enough RAM, enough bytes to fit all 150 thousand words and you just load them up into a huge array. Write one word after another. Much like [inaudible] contains all the command line arguments. It's not hard to implement a spell checker or a dictionary if you just put everything in an array. But what would be the run time--what would be the big oh notation for spell checking a single word if that's your implementation? Big oh of? So if you have N words or 150 thousand words let's generalize it as N big O of what is the running time of spell check in the implementation that uses an array? Right it's N right? It might be N/2 on average but we always get rid of constant factors so big O of N is you know--actually pretty fast compared to some of the horrible sorting algorithms we played with like bubble sort selection sort those things really under performed compared to things like merge store but relatively speaking even linear time is something that you might wanna avoid. What are --what's an example of a running time that we saw that was actually better than linear? And what algorithm did we see that in? >> Binary search. >> Yeah so binary search. Right this divide and conquer approach where we counted students or we found numbers behind pieces of paper on the board so log in is even better. But now the Holy Grail, this week and next is going to completely went up ourselves and try to achieve constant time whereby even if you have a 150 thousand words the challenge ultimately in problem set six is going to be to get as close as possible to instantaneous look ups. No N no log N. Nothing other than constant time. Now we're almost going to be able to get there. But you'll see that it's gonna require some ingenuity among them tricks like linked list and something called hash tables or tries and trees and the like. So more on that to come but for now let's take a step toward it. So here is list 1.H and here again is just to be clear the structure that each of this guys except Lisa represented a moment ago. Each of them had an int and held up on paper. Each of them had a left tan called next and now those were our 8 bites that each of these guys represented. Now I'm gonna go ahead and open up list 1.C and we won't dwell on all of the minutia here 'cause it can actually frankly unless you kinda worked through it slowly it's very easy to get lost but let's take a look first at what this thing does. Let me go into a terminal window. Let me do make list 1 and let me go ahead now and run list 1 and you'll see that just so that this is so interesting I created sort of a poor man's menu here whereby it just waits for you at this command prompt to either type in delete, find and search or traverse. All of those should be self explanatory except traverse just means start at the beginning of you list and just prints out all of the numbers. That's all we mean there. So I just needed a little menu because I wanted to implement functions called delete, find, insert and traverse. So what do I wanna do here? Let me move this slightly over and do delete. Well there's no number to delete so the list is empty. So let's actually start with insert. I hit 3 for insert what number do I wanna insert? I'll insert the number 5 just like we started with there. So now just to see what's going on every time you execute a command I wrote the program in such a way that it just tells you. It reminds you what the current list is from left to right just like our humans. So let's do another one. Insert, I'm gonna insert let's do the exact same numbers. So I had a 5 in there before. Let's also put in the number 17. That was among our numbers. Let me do another insert this time for 9 so those are three of the numbers we had and notice now that the program is at the moment smart enough to actually insert at the right location. 55 is gonna go all the way at the end there. And I can continue this and I could insert all of the same numbers that we had up here on stage. So now let's peel back the implementation and see how this is working. So first, what do I need at the top of this program? So I've got a bunch of includes. We haven't seen all of this before but I'll wave my hands it's the new one there uni-standard.H. And just point out at 1 I'm including my own file and that has the type def we just saw. And I also have a global variable. I could avoid this because remember we've generally preached that global variables are bad but when you're implementing a program, as in this case, whose sole purpose in life is to do one thing, implement a linked list. It's pretty reasonable in this scenario to have a global variable. Just like in the game of 15 we had a big global variable there called G because the whole program was dedicated to the game. We might as well have one global rather than pass the same pointer around again and again. So here is essentially, I could to be clever, just rename this Lisa right? So Lisa represented this pointer here. So she's 4 bytes which is implied by the fact that there's a star there even though a node is 8 bites remember that every pointer is 4 bytes. So this variable now represents Lisa but I'll go back to the global variable here called first and that was the piece of paper that Lisa was holding up. So that's Lisa and she's initialized to null and this too is important if you don't initialize Lisa or the head of your linked list to null, you're gonna assume that whatever garbage value was in Lisa's left hand is an actual node and if you follow that pointer bad things are gonna happen if no one existed to the left of Lisa. So this is incredibly important now to always initialize your pointers to some known value for instance null. Now I've got some function prototypes and I promised that those things would exist. Let's scroll down to main. So main here's how I implemented the menu as we've done in the past. I have a do-while loop and that's how I keep prompting the user again and again. Here's just the menu it's kinda formatted in this way for just aesthetic purposes. As in the side if you've never known this--if you have a really long string that you wanna print with printer def and you don't want it to wrap and wrap and wrap around you can actually close the quotes as I did here. And then just hit enter and pull it multiple coded strings on each line with no other punctuation. No commas anything. And they'll be automatically contaminated together. They'll be attached to one another automatically. It's just a trick for keeping your code a little prettier. So here's where I say command I use a get ints font call to get the commands then I have a switch statements and this is nice and clean because if the user types in one I call delete, two fine and so forth. So all of that is kind of older constructs and then I have this trick here the user types in 0 that's how they would quit. Now this not the most friendly interface. It would be nice if I insert--if I supported Q for quit and the like but that's really not the goal of this program. Let's focus instead on the linked list. So notice here that at the very end of main I already have the foresight to free this list and so if you assume that the user might start inserting nodes that means that you'll gonna call malloc a whole bunch of times. So in my main function before I quit I wanna make sure to free any memory that's been allocated by this program. Now how does this work? We'll come back to this but here I have a temporary pointer that was me. I was this temporary variable this story. I'm pointing at the same thing Lisa's is at. So this is actually important. Lisa a moment ago, so Lisa took her paper but suppose this was Lisa here that's head first so if I say pointer gets first what am I pointing to? Here's first. Okay. So here's Lisa, here's me, what am I pointing to? Why not wedo this highlighted line of code pointer gets first. There are kind of two answers. Am I pointing to Lisa or am I pointing to whatever she is pointing at. So I'm actually pointing to whatever she's pointing because consider what's going on here. What is first? Well first is a pointer and it's declared up here. Well what does that represent even though we refer to this thing as Lisa, Lisa contains a value. Her left hand represented that value. So if I say pointer gets first, that means do exactly what Lisa is doing. Put inside of your 4 bytes exactly the address that Lisa was storing and that happened to be the address of like number 5 or number 9 whoever was there at that point in time. So I'm gonna actually be pointing at the exact same thing that Lisa is but not at Lisa herself. So what then happens? Well this is just a trick and this is very common in linked list and data structures where you need to free a bunch of stuff again and again. I have a wild loop and I'm saying wow my pointer is not null. So while I'm pointing at a legitimate person and you can see how the story ends. If I get to the end of the list and then I follow the pointer, eventually I, the temporary variable will be pointing at the ground also, null. So that's when that loop will stop and for now this is what's going on when you needed it to remember before freeing something how to keep track with fingers in code here. So I put here that's not this. What I mean here is this. This line of code obviously frees a pointer. It frees a struct. So what are these two doing? Well there's one new piece of syntax here. So when I am standing here and I am pointing at say the first node on the list, number 5. I am then going to go ahead and create a duplicate here pred pointer which is also--oops I have it here. These two are temporarily equivalent. So I'm pointing pointer is that this guy here say number 5. Pred pointer at the moment is also pointing to number 5. But what does this line of code then do? This is the only new one. Pointer who is currently pointing at 5 gets pointer arrow next. So that saying go to this guy follow his left hand and point at whoever was to his left which in this case was I think was the number 9. So in short, this piece of code here and it's nice that the syntax actually uses an arrow says update pointer to be equal to whatever it's already pointing at which is pointer but follow that next field. Follow the arrow. So this one line of code is just my way of again having a temporary variable pointing at this person and if I wanna point at the next one, I need to follow that arrow to 9 follow that arrow again and again and again and if I keep doing that remember that 55's left hand was pointing down. So as soon as I follow that arrow, my own PTR variable is also going to be null. No why did I keep pred pointer around? Well you could do this in different ways. But I needed to make sure that just because I advanced to the next pointer, I need to make sure that I don't orphan the previous person. I need to remember them 'cause I need to advancer this pointer and then delete or rather free this person. Suppose I did the order of operations differently. Suppose I did this free pointer which feels very reasonable and then do pointer gets pointer next. I mean frankly this is kind of intuitively I think much more straight forward. Free what you're pointing at and then update yourself to the next pointer and then repeat. But why is this bad? [ Inaudible Remark ] >> Because I already--exactly. If as soon as you call free on a pointer and thus a struct you are not allowed to touch that memory anymore because it's possible the OS even though it's just between two lines of code, it's possible the operating system might need that chunk of memory immediately allocated to someone else and voila you've just now followed a pointer that's no longer your own. In reality that's not quite how the story gonna go but the idea is the same. Once you have freed memory it is no longer yours to touch. It could in fact be reclaimed and used in some other forms. So that's why frankly we have kinda jump through these hoops and frankly make the code more complex and harder to understand but all we're doing really is implementing this idea of temporary fingers pointing at these nodes and only freeing them once we get the ordering right. But for now let me wave my hand at that and let's focus on say insert. So insert is actually a little long but we'll look at just a couple of cases. Al right. So this is where pointers gets interesting but realize we're providing this code for a reason so you don't need to implement at all from scratch. So how do you go about inserting a number like the number five or nine or 55 or whatever into the list? So first, I need a node. So I'm gonna call malloc and I need as much RAM as is the size of a node and then I store that addresses of memory and new pointer and then I just do a sanity check if new pointer is null I'm just gonna return because I'm obviously out of memory. Something went wrong, can't proceed. So that's what creates an empty human on stage with no number and no pointer just yet. Alright, now I initialize the node. Number to insert well I'm just using getint and here's how given a new pointer so given a human on stage, here is how I update his or her N field the integer and then I initialize the next field to null. So let me just point out one thing syntactically this is kind of new. This arrow notation and remember and before when we've used structs we used the dot notation. Why do I may not use the dot notation here for new pointer even though I claim new pointer is at the moment because I've called malloc this is what our picture looks like. This is our node. This is N. This is next. This is new pointer. And if we could try our camera trick so oops didn't see that. So this is what just happened when I call malloc I get this. When I store the return value of malloc in new pointer I get this picture. And so this is what's going on in memory right now. So when I called getints with my next line of code and suppose I type in the number 5 what happens? Well new pointer, arrow, N gets the number 5 and then this I'm initializing to null so I'll just write null in there. But the question I post was this if this is a struct why am I not using the dot notation as I did earlier today and two weeks ago? [ Inaudible Remark ] >> Yes, so it's an address. So new pointer is not a struct at this moment in time. If I had a struct it would be like this--node, a new node. That is a struct because recall we declared node as a data type so struct a new node. That is something that I could do this notation on. A new node.N gets one two three. And then a new node.next gets null. But I didn't do that because new pointers is what? Well it's an address. So just like this picture suggests it's a pointer pointing at a struct so I need to not use the dot because look there is no N or next field in this value. I have to follow the arrow to get here. But to reveal that the syntax is actually the same in previous lectures how did we start at a pointer and then go there? What was the syntax for saying here's a pointer go there? Yeah, so star. So you know what this arrow notation we're seeing now is actually equivalent to writing this. So this is exactly equivalent. So it's saying here's new pointer. Go there. Thanks to the star notation and once you are there, you can use the old school dot notation. So that begs the question why the arrow? Why not just write this code like this? This is more of this syntactic sugar right. It's the same effect but frankly I would claim that this is just a little more readable. In fact it kinda mirrors our idea of pictures much more effectively than star and parenthesis and dot. But these are equivalents. And then commentary here is just to reveal that they're doing nothing new. It's just a new piece of syntax. Alright so now we've initialized the null with both an N like 5 and null. Now I'm gonna do the following. So this case is easy. Suppose the list is empty. Suppose Lisa is just standing up here awkwardly. She's initialized to null so her hands are like these and there is no linked list like Lisa is null. Well this is actually the easiest case to handle because I malloc someone from the audience. I insert the number five into his or her N fields and then what do I do with their next field? Initialize it to null and then what do I do with Lisa. Just point Lisa's next field or rather point Lisa. She is a pointer at that node. So this scenario here check for empty list, we didn't mimic this on stage but this is by far the easiest scenario if Lisa is not pointing at anything already and we've just malloc a new node, ergo we have a brand new linked list just point Lisa at that node. Now it gets a little trickier if the list belongs elsewhere. We have to do a little thinking now. Suppose that first is not null. So Lisa is actually pointing at a node. But suppose that the new nodes N field is less than the first nodes N field. So suppose the list has the number the number 5 in it and I'm certain sorting number 1. Just conceptually to keep these things in sorted order I need to put someone between Lisa and that first node update Lisa to point to this new thing number 1 and update 1 to point to 5. So how do I do this? Well first just notice our syntax here. I'm using my arrow notation to check those values of N. So if the new node N is less than the first node's N what do I do? New pointer next. So here I can avoid temporary storage. If I am now the number 1 and I belong here I can just steal whatever Lisa's already pointing at. So I'm the new node of 1. I wanna point my left hand at whatever Lisa's already pointing at so at this point of the story both she and I are pointing at number 5. But now, who do I have to update to finish the story? I just say Lisa like point you r left hand not at five anymore point it at number 1 and so now we have a little insertion at the list's head. So if we scroll down further unfortunately it gets a little messier and we have the advantage before of using humans could've we just kind of walked back and forth over stage but suffice it to say that these lines of code here and that there is a loop involved here. There is inequality test. We're checking for duplicate values now and we're also checking whether something is less than or greater than. So here's a greater than side. Long story short, the remainder of this function is what implements all these humans working together where we had maybe a temporary variable. We walked down the list, looked for the right spot. Inserted the node into that location, kept track of the left hand to make sure we got the ordering right. And this is definitely more complex than those first scenarios but that's fine for now. The take away ultimately for today is that the basics have sunken. Any questions on linked list? They will recur. It's not the last time. Well that's it. Why don't we end on this slide and we'll see you on Wednesday.