1 00:00:00,000 --> 00:00:03,000 [Review] [Quiz 0] 2 00:00:03,000 --> 00:00:05,000 >> [Lexi Ross, Tommy MacWilliam, Lucas Freitas, Joseph Ong] [Harvard University] 3 00:00:05,000 --> 00:00:08,000 >> [This is CS50.] [CS50.TV] 4 00:00:08,000 --> 00:00:10,000 >> Hey, everyone. 5 00:00:10,000 --> 00:00:15,000 Welcome to the review session for Quiz 0, which is taking place this Wednesday. 6 00:00:15,000 --> 00:00:19,000 What we're going to do tonight, I'm with 3 other TFs, 7 00:00:19,000 --> 00:00:24,000 and together we're going to go through a review of what we've done in the course so far. 8 00:00:24,000 --> 00:00:27,000 It's not going to be 100% comprehensive, but it should give you a better idea 9 00:00:27,000 --> 00:00:31,000 of what you already have down and what you still need to study before Wednesday. 10 00:00:31,000 --> 00:00:34,000 And feel free to raise your hand with questions as we're going along, 11 00:00:34,000 --> 00:00:38,000 but keep in mind that we'll also have a little bit of time at the end— 12 00:00:38,000 --> 00:00:41,000 if we get through with a few minutes to spare—to do general questions, 13 00:00:41,000 --> 00:00:47,000 so keep that in mind, and so we're going to start at the beginning with Week 0. 14 00:00:47,000 --> 00:00:50,000 >> [Quiz 0 Review!] [Part 0] [Lexi Ross] But before we do that let's talk about 15 00:00:50,000 --> 00:00:53,000 the logistics of the quiz. 16 00:00:53,000 --> 00:00:55,000 >> [Logistics] [Quiz takes place on Wednesday 10/10 in lieu of lecture] 17 00:00:55,000 --> 00:00:57,000 >> [(See http://cdn.cs50.net/2012/fall/quizzes/0/about0.pdf for details)] It is on Wednesday, October 10th. 18 00:00:57,000 --> 00:01:00,000 >> That's this Wednesday, and if you go to this URL here, 19 00:01:00,000 --> 00:01:03,000 which is also accessible from CS50.net—there's a link to it— 20 00:01:03,000 --> 00:01:06,000 you can see information about where to go based on 21 00:01:06,000 --> 00:01:10,000 your last name or school affiliation as well as 22 00:01:10,000 --> 00:01:14,000 it tells about exactly what the quiz will cover and the types of questions that you're going to get. 23 00:01:14,000 --> 00:01:19,000 Keep in mind that you'll also have a chance to review for the quiz in section, 24 00:01:19,000 --> 00:01:21,000 so your TFs should be going over some practice problems, 25 00:01:21,000 --> 00:01:29,000 and that's another good chance to see where you still need to study up for the quiz. 26 00:01:29,000 --> 00:01:32,000 Let's start at the beginning with Bits 'n' Bytes. 27 00:01:32,000 --> 00:01:35,000 Remember a bit is just a 0 or a 1, 28 00:01:35,000 --> 00:01:38,000 and a byte is a collection of 8 of those bits. 29 00:01:38,000 --> 00:01:42,000 Let's look at this collection of bits right here. 30 00:01:42,000 --> 00:01:44,000 We should be able to figure out how many bits there are. 31 00:01:44,000 --> 00:01:48,000 Where we count there's just 8 of them, eight 0 or 1 units. 32 00:01:48,000 --> 00:01:51,000 And since there's 8 bits, that's 1 byte, 33 00:01:51,000 --> 00:01:53,000 and let's convert it to hexadecimal. 34 00:01:53,000 --> 00:01:58,000 Hexadecimal is base 16, and it's pretty easy to convert 35 00:01:58,000 --> 00:02:01,000 a number in binary, which is what that is, to a number in hexadecimal. 36 00:02:01,000 --> 00:02:04,000 All we do is we look at groups of 4, 37 00:02:04,000 --> 00:02:07,000 and we convert them to the appropriate hexadecimal digit. 38 00:02:07,000 --> 00:02:11,000 We start with the right-most group of 4, so 0011. 39 00:02:11,000 --> 00:02:16,000 That's going to be one 1 and one 2, so together that makes 3. 40 00:02:16,000 --> 00:02:19,000 And then let's look at the other block of 4. 41 00:02:19,000 --> 00:02:24,000 1101. That's going to be one 1, one 4, and one 8. 42 00:02:24,000 --> 00:02:28,000 Together that's going to be 13, which makes D. 43 00:02:28,000 --> 00:02:32,000 And we'll remember that in hexadecimal we don't just go 0 through 9. 44 00:02:32,000 --> 00:02:36,000 We go 0 through F, so after 9, 10 corresponds to A, 45 00:02:36,000 --> 00:02:40,000 11 to B, et cetera where F is 15. 46 00:02:40,000 --> 00:02:44,000 Here 13 is a D, 47 00:02:44,000 --> 00:02:49,000 so to convert it to decimal all we do is we actually 48 00:02:49,000 --> 00:02:52,000 treat each position as a power of 2. 49 00:02:52,000 --> 00:02:58,000 That's one 1, one 2, zero 4s, zero 8s, one 16, et cetera, 50 00:02:58,000 --> 00:03:03,000 and it's a little hard to compute in your head, but if we go to the next slide 51 00:03:03,000 --> 00:03:05,000 we can see the answer to that. 52 00:03:05,000 --> 00:03:09,000 >> Essentially we're going across from right back to left, 53 00:03:09,000 --> 00:03:14,000 and we're multiplying each digit by the corresponding power of 2. 54 00:03:14,000 --> 00:03:19,000 And remember, for hexadecimal we denote these numbers with 0x at the beginning 55 00:03:19,000 --> 00:03:23,000 so we don't confuse it with a decimal number. 56 00:03:23,000 --> 00:03:29,000 Continuing on, this is an ASCII Table, 57 00:03:29,000 --> 00:03:35,000 and what we use ASCII for is to map from characters to numerical values. 58 00:03:35,000 --> 00:03:39,000 Remember in the cryptography pset we made extensive use of the ASCII Table 59 00:03:39,000 --> 00:03:43,000 in order to use various methods of cryptography, 60 00:03:43,000 --> 00:03:47,000 the Caesar and the Vigenère cipher, to convert different letters 61 00:03:47,000 --> 00:03:52,000 in a string according to the key given by the user. 62 00:03:52,000 --> 00:03:56,000 Let's look at a little bit of ASCII math. 63 00:03:56,000 --> 00:04:02,000 Looking at 'P' + 1, in character form that would be Q, 64 00:04:02,000 --> 00:04:07,000 and remember that '5' ≠ 5. 65 00:04:07,000 --> 00:04:10,000 And how exactly would we convert between those 2 forms? 66 00:04:10,000 --> 00:04:13,000 It's not actually too hard. 67 00:04:13,000 --> 00:04:16,000 In order to get 5 we subtract '0' 68 00:04:16,000 --> 00:04:20,000 because there are 5 places between the '0' and the '5.' 69 00:04:20,000 --> 00:04:23,000 In order to go the other way we just add the 0, 70 00:04:23,000 --> 00:04:25,000 so it's sort of like regular arithmetic. 71 00:04:25,000 --> 00:04:29,000 Just remember that when something has quotes around it it's a character 72 00:04:29,000 --> 00:04:37,000 and thus corresponds to a value in the ASCII table. 73 00:04:37,000 --> 00:04:40,000 Moving into more general computer science topics. 74 00:04:40,000 --> 00:04:43,000 We learned what an algorithm is and how we use programming 75 00:04:43,000 --> 00:04:45,000 to implement algorithms. 76 00:04:45,000 --> 00:04:48,000 Some examples of algorithms are something really simple like 77 00:04:48,000 --> 00:04:51,000 checking whether a number is even or odd. 78 00:04:51,000 --> 00:04:54,000 For that remember we mod the number by 2 and check if the result is 0. 79 00:04:54,000 --> 00:04:57,000 If so, it's even. If not, it's odd. 80 00:04:57,000 --> 00:04:59,000 And that's an example of a really basic algorithm. 81 00:04:59,000 --> 00:05:02,000 >> A little bit of a more involved one is binary search, 82 00:05:02,000 --> 00:05:05,000 which we'll go over later in the review session. 83 00:05:05,000 --> 00:05:09,000 And programming is the term we use for taking an algorithm 84 00:05:09,000 --> 00:05:15,000 and converting it to code the computer can read. 85 00:05:15,000 --> 00:05:20,000 2 examples of programming is Scratch, 86 00:05:20,000 --> 00:05:22,000 which is what we did in Week 0. 87 00:05:22,000 --> 00:05:25,000 Even though we don't actually type out the code it's a way of implementing 88 00:05:25,000 --> 00:05:29,000 this algorithm, which is printing the numbers 1-10, 89 00:05:29,000 --> 00:05:32,000 and here we do the same in the C programming language. 90 00:05:32,000 --> 00:05:41,000 These are functionally equivalent, just written in different languages or syntax. 91 00:05:41,000 --> 00:05:44,000 We then learned about boolean expressions, 92 00:05:44,000 --> 00:05:48,000 and a boolean is a value that's either true or false, 93 00:05:48,000 --> 00:05:51,000 and here oftentimes boolean expressions 94 00:05:51,000 --> 00:05:55,000 go inside of conditions, so if (x ≤ 5), 95 00:05:55,000 --> 00:06:00,000 well, we already set x = 5, so that condition is going to evaluate to true. 96 00:06:00,000 --> 00:06:03,000 And if it's true, whatever code is beneath the condition 97 00:06:03,000 --> 00:06:08,000 is going to be evaluated by the computer, so that string is going to be printed 98 00:06:08,000 --> 00:06:12,000 to standard output, and the term condition 99 00:06:12,000 --> 00:06:16,000 refers to whatever is inside the parentheses of the if statement. 100 00:06:16,000 --> 00:06:20,000 Remember all the operators. 101 00:06:20,000 --> 00:06:26,000 Remember it's && and | | when we're trying to combine 2 or more conditions, 102 00:06:26,000 --> 00:06:30,000 == not = to check whether 2 things are equal. 103 00:06:30,000 --> 00:06:36,000 Remember that = is for assignment whereas == is a boolean operator. 104 00:06:36,000 --> 00:06:41,000 ≤, ≥ and then the final 2 are self-explanatory. 105 00:06:41,000 --> 00:06:45,000 A general review of boolean logic here. 106 00:06:45,000 --> 00:06:48,000 And boolean expressions are also important in loops, 107 00:06:48,000 --> 00:06:50,000 which we'll go over now. 108 00:06:50,000 --> 00:06:56,000 We learned about 3 types of loops so far in CS50, for, while, and do while. 109 00:06:56,000 --> 00:06:59,000 And it's important to know that while for most purposes 110 00:06:59,000 --> 00:07:02,000 we can actually use any type of loop generally 111 00:07:02,000 --> 00:07:06,000 there are certain types of purposes or common patterns 112 00:07:06,000 --> 00:07:09,000 in programming that specifically call for one of these loops 113 00:07:09,000 --> 00:07:13,000 that make it the most efficient or elegant to code it in that way. 114 00:07:13,000 --> 00:07:18,000 Let's go over what each of these loops tends to be used for most often. 115 00:07:18,000 --> 00:07:21,000 >> In a for loop we generally already know how many times we want to iterate. 116 00:07:21,000 --> 00:07:24,000 That's what we put in the condition. 117 00:07:24,000 --> 00:07:28,000 For, i = 0, i < 10, for example. 118 00:07:28,000 --> 00:07:31,000 We already know that we want to do something 10 times. 119 00:07:31,000 --> 00:07:34,000 Now, for a while loop, generally we don't necessarily 120 00:07:34,000 --> 00:07:36,000 know how many times we want the loop to run. 121 00:07:36,000 --> 00:07:39,000 But we do know some sort of condition that we want it to 122 00:07:39,000 --> 00:07:41,000 always be true or always be false. 123 00:07:41,000 --> 00:07:44,000 For example, while is set. 124 00:07:44,000 --> 00:07:46,000 Let's say that's a boolean variable. 125 00:07:46,000 --> 00:07:48,000 While that's true we want the code to evaluate, 126 00:07:48,000 --> 00:07:52,000 so a little bit more extensible, a little bit more general than a for loop, 127 00:07:52,000 --> 00:07:55,000 but any for loop can also be converted to a while loop. 128 00:07:55,000 --> 00:08:00,000 Finally, do while loops, which may be the trickiest to comprehend right away, 129 00:08:00,000 --> 00:08:04,000 are often used when we want to evaluate the code first 130 00:08:04,000 --> 00:08:06,000 before the first time we check the condition. 131 00:08:06,000 --> 00:08:09,000 A common use case for a do while loop 132 00:08:09,000 --> 00:08:12,000 is when you want to get user input, and you know you want to ask the user 133 00:08:12,000 --> 00:08:15,000 for input at least once, but if they don't give you good input right away 134 00:08:15,000 --> 00:08:18,000 you want to keep asking them until they give you the good input. 135 00:08:18,000 --> 00:08:21,000 That's the most common use of a do while loop, 136 00:08:21,000 --> 00:08:23,000 and let's look at the actual structure of these loops. 137 00:08:23,000 --> 00:08:27,000 They typically always tend to follow these patterns. 138 00:08:27,000 --> 00:08:30,000 >> On the for loop inside you have 3 components: 139 00:08:30,000 --> 00:08:35,000 initialization, typically something like int i = 0 where i is the counter, 140 00:08:35,000 --> 00:08:40,000 condition, where we want to say run this for loop as long as this condition still holds, 141 00:08:40,000 --> 00:08:44,000 like i < 10, and then finally, update, which is how we increment 142 00:08:44,000 --> 00:08:47,000 the counter variable at each point in the loop. 143 00:08:47,000 --> 00:08:50,000 A common thing to see there is just i++, 144 00:08:50,000 --> 00:08:52,000 which means increment i by 1 every time. 145 00:08:52,000 --> 00:08:55,000 You could also do something like i+ = 2, 146 00:08:55,000 --> 00:08:58,000 which means add 2 to i each time you go through the loop. 147 00:08:58,000 --> 00:09:03,000 And then the do this just refers to any code that actually runs as part of the loop. 148 00:09:03,000 --> 00:09:09,000 And for a while loop, this time we actually have the initialization outside of the loop, 149 00:09:09,000 --> 00:09:12,000 so for example, let's say we're trying to do the same type of loop as I just described. 150 00:09:12,000 --> 00:09:16,000 We would say int i = 0 before the loop begins. 151 00:09:16,000 --> 00:09:20,000 Then we could say while i < 10 do this, 152 00:09:20,000 --> 00:09:22,000 so the same block of code as before, 153 00:09:22,000 --> 00:09:26,000 and this time the update part of the code, for example, i++, 154 00:09:26,000 --> 00:09:29,000 actually goes inside of the loop. 155 00:09:29,000 --> 00:09:33,000 And finally, for a do while, it's similar to the while loop, 156 00:09:33,000 --> 00:09:36,000 but we have to remember that the code will evaluate once 157 00:09:36,000 --> 00:09:40,000 before the condition is checked, so it makes a lot more sense 158 00:09:40,000 --> 00:09:44,000 if you look at it in order of top to bottom. 159 00:09:44,000 --> 00:09:49,000 In a do while loop the code evaluates before you even look at the while condition, 160 00:09:49,000 --> 00:09:55,000 whereas a while loop, it checks first. 161 00:09:55,000 --> 00:09:59,000 Statements and variables. 162 00:09:59,000 --> 00:10:04,000 When we want to create a new variable we first want to initialize it. 163 00:10:04,000 --> 00:10:07,000 >> For example, int bar initializes the variable bar, 164 00:10:07,000 --> 00:10:10,000 but it doesn't give it a value, so what is bar's value now? 165 00:10:10,000 --> 00:10:12,000 We don't know. 166 00:10:12,000 --> 00:10:14,000 It could be some garbage value that was previously stored in memory there, 167 00:10:14,000 --> 00:10:16,000 and we don't want to use that variable 168 00:10:16,000 --> 00:10:19,000 until we actually give it a value, 169 00:10:19,000 --> 00:10:21,000 so we declare it here. 170 00:10:21,000 --> 00:10:24,000 Then we initialize it to be 42 below. 171 00:10:24,000 --> 00:10:28,000 Now, of course, we know this can be done on one line, int bar = 42. 172 00:10:28,000 --> 00:10:30,000 But just to be clear the multiple steps that are going on, 173 00:10:30,000 --> 00:10:34,000 the declaration and the initialization are happening separately here. 174 00:10:34,000 --> 00:10:38,000 It happens on one step, and the next one, int baz = bar + 1, 175 00:10:38,000 --> 00:10:44,000 this statement below, that increments baz, so at the end of this code block 176 00:10:44,000 --> 00:10:48,000 if we were to print the value of baz it would be 44 177 00:10:48,000 --> 00:10:52,000 because we declare and initialize it to be 1 > bar, 178 00:10:52,000 --> 00:10:58,000 and then we increment it once more with the ++. 179 00:10:58,000 --> 00:11:02,000 We went over this pretty briefly, but it's good to have a general 180 00:11:02,000 --> 00:11:04,000 understanding of what threads and events are. 181 00:11:04,000 --> 00:11:06,000 We mainly did this in Scratch, 182 00:11:06,000 --> 00:11:09,000 so you can think of threads as multiple sequences of code 183 00:11:09,000 --> 00:11:11,000 running at the same time. 184 00:11:11,000 --> 00:11:14,000 In actuality, it probably isn't running at the same time, 185 00:11:14,000 --> 00:11:17,000 but sort of abstractly we can think of it in that way. 186 00:11:17,000 --> 00:11:20,000 >> In Scratch, for example, we had the multiple sprites. 187 00:11:20,000 --> 00:11:22,000 It could be executing different code at the same time. 188 00:11:22,000 --> 00:11:26,000 One could be walking while the other is saying something 189 00:11:26,000 --> 00:11:29,000 in a different part of the screen. 190 00:11:29,000 --> 00:11:34,000 Events are another way of separating out the logic 191 00:11:34,000 --> 00:11:37,000 between different elements of your code, 192 00:11:37,000 --> 00:11:40,000 and in Scratch we were able to simulate events using the Broadcast, 193 00:11:40,000 --> 00:11:43,000 and that's actually When I Receive, not When I Hear, 194 00:11:43,000 --> 00:11:47,000 but essentially it's a way to transmit information 195 00:11:47,000 --> 00:11:49,000 from one sprite to another. 196 00:11:49,000 --> 00:11:52,000 For example, you may want to transmit game over, 197 00:11:52,000 --> 00:11:56,000 and when another sprite receives game over, 198 00:11:56,000 --> 00:11:58,000 it responds in a certain way. 199 00:11:58,000 --> 00:12:03,000 It's an important model to understand for programming. 200 00:12:03,000 --> 00:12:07,000 Just to go over the basic Week 0, what we've gone over so far, 201 00:12:07,000 --> 00:12:10,000 let's look at this simple C program. 202 00:12:10,000 --> 00:12:14,000 The text may be a little bit small from here, but I'll go over it really quick. 203 00:12:14,000 --> 00:12:20,000 We're including 2 header files at the top, cs50.h and stdio.h. 204 00:12:20,000 --> 00:12:23,000 We're then defining a constant called limit to be 100. 205 00:12:23,000 --> 00:12:26,000 We're then implementing our main function. 206 00:12:26,000 --> 00:12:29,000 Since we don't use command line arguments here we need to put void 207 00:12:29,000 --> 00:12:32,000 as the arguments for main. 208 00:12:32,000 --> 00:12:38,000 We see int above main. That's the return type, hence return 0 at the bottom. 209 00:12:38,000 --> 00:12:41,000 And we're using CS50 library function get int 210 00:12:41,000 --> 00:12:45,000 to ask the user for input, and we store it in this variable x, 211 00:12:45,000 --> 00:12:51,000 so we declare x above, and we initialize it with x = GetInt. 212 00:12:51,000 --> 00:12:53,000 >> We then check to see if the user gave us good input. 213 00:12:53,000 --> 00:12:59,000 If it's ≥ LIMIT we want to return an error code of 1 and print an error message. 214 00:12:59,000 --> 00:13:02,000 And finally, if the user has given us good input 215 00:13:02,000 --> 00:13:08,000 we're going to square the number and print out that result. 216 00:13:08,000 --> 00:13:11,000 Just to make sure that those all hit home 217 00:13:11,000 --> 00:13:17,000 you can see the labels of different parts of the code here. 218 00:13:17,000 --> 00:13:19,000 I mentioned constant, header files. 219 00:13:19,000 --> 00:13:21,000 Oh, int x. Make sure to remember that's a local variable. 220 00:13:21,000 --> 00:13:24,000 That contrasts it from a global variable, which we'll talk about 221 00:13:24,000 --> 00:13:27,000 a little bit later in the review session, 222 00:13:27,000 --> 00:13:30,000 and we are calling the library function printf, 223 00:13:30,000 --> 00:13:34,000 so if we hadn't included the stdio.h header file 224 00:13:34,000 --> 00:13:37,000 we would not be able to call printf. 225 00:13:37,000 --> 00:13:42,000 And I believe the arrow that got cut off here is pointing to the %d, 226 00:13:42,000 --> 00:13:45,000 which is a formatting string in printf. 227 00:13:45,000 --> 00:13:52,000 It says print out this variable as a number, %d. 228 00:13:52,000 --> 00:13:58,000 And that is it for Week 0. 229 00:13:58,000 --> 00:14:06,000 Now Lucas is going to continue. 230 00:14:06,000 --> 00:14:08,000 Hey, guys. My name is Lucas. 231 00:14:08,000 --> 00:14:10,000 I'm a sophomore in the best house on campus, Mather, 232 00:14:10,000 --> 00:14:14,000 and I'm going to talk a little bit about Week 1 and 2.1. 233 00:14:14,000 --> 00:14:16,000 [Week 1 and 2.1!] [Lucas Freitas] 234 00:14:16,000 --> 00:14:19,000 As Lexi was saying, when we started translating your code from Scratch to C 235 00:14:19,000 --> 00:14:23,000 one of the things that we noticed is that you cannot just 236 00:14:23,000 --> 00:14:26,000 write your code and run it using a green flag anymore. 237 00:14:26,000 --> 00:14:30,000 Actually, you have to use some steps to make your C program 238 00:14:30,000 --> 00:14:33,000 become an executable file. 239 00:14:33,000 --> 00:14:36,000 Basically what you do when you're writing a program is that 240 00:14:36,000 --> 00:14:40,000 you translate your idea into a language that a compiler can understand, 241 00:14:40,000 --> 00:14:44,000 so when you're writing a program in C 242 00:14:44,000 --> 00:14:47,000 what you're doing is actually writing something that your compiler is going to understand, 243 00:14:47,000 --> 00:14:50,000 and then the compiler is going to translate that code 244 00:14:50,000 --> 00:14:53,000 into something that your computer will understand. 245 00:14:53,000 --> 00:14:55,000 >> And the thing is, your computer is actually very dumb. 246 00:14:55,000 --> 00:14:57,000 Your computer can only understand 0s and 1s, 247 00:14:57,000 --> 00:15:01,000 so actually in the first computers people usually programmed 248 00:15:01,000 --> 00:15:04,000 using 0s and 1s, but not anymore, thank God. 249 00:15:04,000 --> 00:15:07,000 We don't have to memorize the sequences for 0s and 1s 250 00:15:07,000 --> 00:15:10,000 for a for loop or for a while loop and so on. 251 00:15:10,000 --> 00:15:13,000 That's why we have a compiler. 252 00:15:13,000 --> 00:15:17,000 What a compiler does is it basically translates the C code, 253 00:15:17,000 --> 00:15:21,000 in our case, to a language that your computer will understand, 254 00:15:21,000 --> 00:15:25,000 which is the object code, and the compiler that we're using 255 00:15:25,000 --> 00:15:30,000 is called clang, so this is actually the symbol for clang. 256 00:15:30,000 --> 00:15:33,000 When you have your program, you have to do 2 things. 257 00:15:33,000 --> 00:15:37,000 First, you have to compile your program, and then you're going to run your program. 258 00:15:37,000 --> 00:15:41,000 To compile your program you have a lot of options to do so. 259 00:15:41,000 --> 00:15:44,000 The first one is to do clang program.c 260 00:15:44,000 --> 00:15:47,000 in which program is the name of your program. 261 00:15:47,000 --> 00:15:51,000 In this case you can see they're just saying "Hey, compile my program." 262 00:15:51,000 --> 00:15:56,000 You're not saying "I want this name for my program" or anything. 263 00:15:56,000 --> 00:15:58,000 >> The second option is giving a name to your program. 264 00:15:58,000 --> 00:16:02,000 You can say clang -o and then the name that you want 265 00:16:02,000 --> 00:16:06,000 the executable file to be named as and then program.c. 266 00:16:06,000 --> 00:16:11,000 And you can also do make program, and see how in the first 2 cases 267 00:16:11,000 --> 00:16:15,000 I put .c, and in the third one I only have programs? 268 00:16:15,000 --> 00:16:18,000 Yeah, you actually should not put .c when you use make. 269 00:16:18,000 --> 00:16:22,000 Otherwise the compiler is actually going to yell at you. 270 00:16:22,000 --> 00:16:24,000 And also, I don't know if you guys remember, 271 00:16:24,000 --> 00:16:29,000 but a lot of times we also used -lcs50 or -lm. 272 00:16:29,000 --> 00:16:31,000 That is called linking. 273 00:16:31,000 --> 00:16:35,000 It just tells the compiler that you will use those libraries right there, 274 00:16:35,000 --> 00:16:39,000 so if you want to use cs50.h you actually have to type 275 00:16:39,000 --> 00:16:43,000 clang program.c -lcs50. 276 00:16:43,000 --> 00:16:45,000 If you don't do that, the compiler is not going to know 277 00:16:45,000 --> 00:16:50,000 that you're using those functions in cs50.h. 278 00:16:50,000 --> 00:16:52,000 And when you want to run your program you have 2 options. 279 00:16:52,000 --> 00:16:57,000 If you did clang program.c you didn't give a name to your program. 280 00:16:57,000 --> 00:17:01,000 You have to run it using ./a.out. 281 00:17:01,000 --> 00:17:06,000 A.out is a standard name that clang gives your program if you don't give it a name. 282 00:17:06,000 --> 00:17:11,000 Otherwise you're going to do ./program if you gave a name to your program, 283 00:17:11,000 --> 00:17:15,000 and also if you did make program the name that a program is going to get 284 00:17:15,000 --> 00:17:23,000 is already going to be programmed the same name as the c file. 285 00:17:23,000 --> 00:17:26,000 Then we talked about data types and data. 286 00:17:26,000 --> 00:17:31,000 >> Basically data types are the same thing as little boxes they use 287 00:17:31,000 --> 00:17:35,000 to store values, so data types are actually just like Pokémons. 288 00:17:35,000 --> 00:17:39,000 They come in all sizes and types. 289 00:17:39,000 --> 00:17:43,000 I don't know if that analogy makes sense. 290 00:17:43,000 --> 00:17:46,000 The data size actually depends on the machine architecture. 291 00:17:46,000 --> 00:17:49,000 All the data sizes that I'm going to show here 292 00:17:49,000 --> 00:17:53,000 are actually for a 32-bit machine, which is the case of our appliance, 293 00:17:53,000 --> 00:17:56,000 but if you are actually coding your Mac or in a Windows also 294 00:17:56,000 --> 00:17:59,000 probably you're going to have a 64-bit machine, 295 00:17:59,000 --> 00:18:03,000 so remember that the data sizes that I'm going to show here 296 00:18:03,000 --> 00:18:06,000 are for the 32-bit machine. 297 00:18:06,000 --> 00:18:08,000 The first one that we saw was an int, 298 00:18:08,000 --> 00:18:10,000 which is pretty straightforward. 299 00:18:10,000 --> 00:18:13,000 You use int to store an integer. 300 00:18:13,000 --> 00:18:16,000 We also saw the character, the char. 301 00:18:16,000 --> 00:18:20,000 If you want to use a letter or a little symbol you're probably going to use a char. 302 00:18:20,000 --> 00:18:26,000 A char has 1 byte, which means 8 bits, like Lexi said. 303 00:18:26,000 --> 00:18:31,000 Basically we have an ASCII Table that has 256 304 00:18:31,000 --> 00:18:34,000 possible combinations of 0s and 1s, 305 00:18:34,000 --> 00:18:37,000 and then when you type a char it's going to translate 306 00:18:37,000 --> 00:18:44,000 the character that inputs you a number that you have in the ASCII table, like Lexi said. 307 00:18:44,000 --> 00:18:48,000 We also have the float, which we use to store decimal numbers. 308 00:18:48,000 --> 00:18:53,000 If you want to choose 3.14, for example, you're going to use a float 309 00:18:53,000 --> 00:18:55,000 or a double that has more precision. 310 00:18:55,000 --> 00:18:57,000 A float has 4 bytes. 311 00:18:57,000 --> 00:19:01,000 A double has 8 bytes, so the only difference is the precision. 312 00:19:01,000 --> 00:19:04,000 We also have a long that is used for integers, 313 00:19:04,000 --> 00:19:09,000 and you can see for a 32-bit machine an int and a long have the same size, 314 00:19:09,000 --> 00:19:13,000 so it doesn't really make sense to use a long in a 32-bit machine. 315 00:19:13,000 --> 00:19:17,000 >> But if you're using a Mac and 64-bit machine, actually a long has size 8, 316 00:19:17,000 --> 00:19:19,000 so it really depends on the architecture. 317 00:19:19,000 --> 00:19:22,000 For the 32-bit machine it doesn't make sense to use a long really. 318 00:19:22,000 --> 00:19:25,000 And then a long long, on the other hand, has 8 bytes, 319 00:19:25,000 --> 00:19:30,000 so it is very good if you want to have a longer integer. 320 00:19:30,000 --> 00:19:34,000 And finally, we have string, which is actually a char *, 321 00:19:34,000 --> 00:19:37,000 which is a pointer to a char. 322 00:19:37,000 --> 00:19:40,000 It's very easy to think that the size of the string is going to be like 323 00:19:40,000 --> 00:19:42,000 the number of characters that you have there, 324 00:19:42,000 --> 00:19:45,000 but actually the char * itself 325 00:19:45,000 --> 00:19:49,000 has the size of a pointer to a char, which is 4 bytes. 326 00:19:49,000 --> 00:19:52,000 The size of a char * is 4 bytes. 327 00:19:52,000 --> 00:19:56,000 It doesn't matter if you have a small word or a letter or anything. 328 00:19:56,000 --> 00:19:58,000 It's going to be 4 bytes. 329 00:19:58,000 --> 00:20:01,000 We also learned a little bit about casting, 330 00:20:01,000 --> 00:20:04,000 so as you can see, if you have, for example, a program that says 331 00:20:04,000 --> 00:20:08,000 int x = 3 and then printf("%d", x/2) 332 00:20:08,000 --> 00:20:12,000 do you guys know what it's going to print on screen? 333 00:20:12,000 --> 00:20:14,000 >> Someone?>>[Students] 2. 334 00:20:14,000 --> 00:20:16,000 1.>>1, yeah. 335 00:20:16,000 --> 00:20:20,000 When you do 3/2 it's going to get 1.5, 336 00:20:20,000 --> 00:20:24,000 but since we're using an integer it's going to ignore the decimal part, 337 00:20:24,000 --> 00:20:26,000 and you're going to have 1. 338 00:20:26,000 --> 00:20:29,000 If you don't want that to happen what you can do, for example, 339 00:20:29,000 --> 00:20:33,000 is declare a float y = x. 340 00:20:33,000 --> 00:20:40,000 Then x that used to be 3 is now going to be 3.000 in y. 341 00:20:40,000 --> 00:20:44,000 And then you can print the y/2. 342 00:20:44,000 --> 00:20:50,000 Actually, I should have a 2. over there. 343 00:20:50,000 --> 00:20:55,000 It's going to do 3.00/2.00, 344 00:20:55,000 --> 00:20:58,000 and you're going to get 1.5. 345 00:20:58,000 --> 00:21:06,000 And we have this .2f just to ask for 2 decimal units in the decimal part. 346 00:21:06,000 --> 00:21:12,000 If you have .3f it's going to have actually 1.500. 347 00:21:12,000 --> 00:21:16,000 If it's 2 it's going to be 1.50. 348 00:21:16,000 --> 00:21:18,000 We also have this case here. 349 00:21:18,000 --> 00:21:22,000 If you do float x = 3.14 and then you printf x 350 00:21:22,000 --> 00:21:24,000 you're going to get 3.14. 351 00:21:24,000 --> 00:21:29,000 And if you do x = the int of x, 352 00:21:29,000 --> 00:21:34,000 which means treat x as an int and you print x now 353 00:21:34,000 --> 00:21:36,000 you're going to have 3.00. 354 00:21:36,000 --> 00:21:38,000 Does that make sense? 355 00:21:38,000 --> 00:21:41,000 Because you're first treating x as an integer, so you're ignoring the decimal part, 356 00:21:41,000 --> 00:21:45,000 and then you're printing x. 357 00:21:45,000 --> 00:21:47,000 And finally, you can also do this, 358 00:21:47,000 --> 00:21:52,000 int x = 65, and then you declare a char c = x, 359 00:21:52,000 --> 00:21:56,000 and then if you print the c you're actually going to get 360 00:21:56,000 --> 00:21:59,000 A, so basically what you're doing here 361 00:21:59,000 --> 00:22:02,000 is translating the integer into the character, 362 00:22:02,000 --> 00:22:05,000 just like the ASCII Table does. 363 00:22:05,000 --> 00:22:08,000 We also talked about math operators. 364 00:22:08,000 --> 00:22:14,000 Most of them are pretty straightforward, so +, -, *, /, 365 00:22:14,000 --> 00:22:20,000 and also we talked about mod, which is the remainder of a division of 2 numbers. 366 00:22:20,000 --> 00:22:23,000 If you have 10 % 3, for example, 367 00:22:23,000 --> 00:22:27,000 it means divide 10 by 3, and what is the remainder? 368 00:22:27,000 --> 00:22:30,000 It's going to be 1, so it's actually very useful for a lot of the programs. 369 00:22:30,000 --> 00:22:38,000 For Vigenère and Caesar I'm pretty sure that all of you guys used mod. 370 00:22:38,000 --> 00:22:43,000 About math operators, be very careful when combining * and /. 371 00:22:43,000 --> 00:22:48,000 >> For example, if you do (3/2) * 2 what are you going to get? 372 00:22:48,000 --> 00:22:50,000 [Students] 2. 373 00:22:50,000 --> 00:22:54,000 Yeah, 2, because 3/2 is going to be 1.5, 374 00:22:54,000 --> 00:22:57,000 but since you're doing operations between 2 integers 375 00:22:57,000 --> 00:22:59,000 you're actually just going to consider 1, 376 00:22:59,000 --> 00:23:03,000 and then 1 * 2 is going to be 2, so be very, very careful 377 00:23:03,000 --> 00:23:07,000 when doing arithmetic with integers because 378 00:23:07,000 --> 00:23:12,000 you might get that 2 = 3, in that case. 379 00:23:12,000 --> 00:23:14,000 And also be very careful about precedence. 380 00:23:14,000 --> 00:23:21,000 You should usually use parentheses to be sure that you know what you're doing. 381 00:23:21,000 --> 00:23:27,000 Some useful shortcuts, of course, one is i++ or i += 1 382 00:23:27,000 --> 00:23:30,000 or using +=. 383 00:23:30,000 --> 00:23:34,000 That is the same thing as doing i = i + 1. 384 00:23:34,000 --> 00:23:39,000 You can also do i-- or i -= 1, 385 00:23:39,000 --> 00:23:42,000 which is the same thing as i = i -1, 386 00:23:42,000 --> 00:23:46,000 something you guys use a lot in for loops, at least. 387 00:23:46,000 --> 00:23:52,000 Also, for *, if you use *= and if you do, for example, 388 00:23:52,000 --> 00:23:57,000 i *= 2 is the same thing as saying i = i * 2, 389 00:23:57,000 --> 00:23:59,000 and the same thing for division. 390 00:23:59,000 --> 00:24:08,000 If you do i /= 2 it's the same thing as i = i/2. 391 00:24:08,000 --> 00:24:10,000 >> Now about functions. 392 00:24:10,000 --> 00:24:13,000 You guys learned that functions are a very good strategy to save code 393 00:24:13,000 --> 00:24:16,000 while you're programming, so if you want to perform the same task 394 00:24:16,000 --> 00:24:20,000 in code again and again, probably you want to use a function 395 00:24:20,000 --> 00:24:25,000 just so you don't have to copy and paste the code over and over again. 396 00:24:25,000 --> 00:24:28,000 Actually, main is a function, and when I show you the format of a function 397 00:24:28,000 --> 00:24:32,000 you're going to see that that is pretty obvious. 398 00:24:32,000 --> 00:24:35,000 We also use functions from some libraries, 399 00:24:35,000 --> 00:24:39,000 for example, printf, GetIn, which is from the CS50 library, 400 00:24:39,000 --> 00:24:43,000 and other functions like toupper. 401 00:24:43,000 --> 00:24:46,000 All of those functions are actually implemented in other libraries, 402 00:24:46,000 --> 00:24:49,000 and when you put those tether files in the beginning of your program 403 00:24:49,000 --> 00:24:53,000 you're saying can you please give me the code for those functions 404 00:24:53,000 --> 00:24:57,000 so I don't have to implement them by myself? 405 00:24:57,000 --> 00:25:00,000 And you can also write your own functions, so when you start programming 406 00:25:00,000 --> 00:25:04,000 you realize that libraries don't have all the functions that you need. 407 00:25:04,000 --> 00:25:10,000 For the last pset, for example, we wrote draw, scramble, and lookup, 408 00:25:10,000 --> 00:25:13,000 and it's very, very important to be able to write functions 409 00:25:13,000 --> 00:25:17,000 because they are useful, and we use them all the time in programming, 410 00:25:17,000 --> 00:25:19,000 and it saves a lot of code. 411 00:25:19,000 --> 00:25:21,000 The format of a function is this one. 412 00:25:21,000 --> 00:25:24,000 We have return type in the beginning. What is the return type? 413 00:25:24,000 --> 00:25:27,000 It's just when your function is going to return. 414 00:25:27,000 --> 00:25:29,000 If you have a function, for example, factorial, 415 00:25:29,000 --> 00:25:31,000 that is going to calculate a factorial of an integer, 416 00:25:31,000 --> 00:25:34,000 probably it's going to return an integer also. 417 00:25:34,000 --> 00:25:37,000 Then the return type is going to be int. 418 00:25:37,000 --> 00:25:41,000 Printf actually has a return type void 419 00:25:41,000 --> 00:25:43,000 because you're not returning anything. 420 00:25:43,000 --> 00:25:45,000 You're just printing things to the screen 421 00:25:45,000 --> 00:25:48,000 and quitting the function afterwards. 422 00:25:48,000 --> 00:25:51,000 Then you have the name of the function that you can choose. 423 00:25:51,000 --> 00:25:55,000 You should be a little reasonable, like don't choose a name like xyz 424 00:25:55,000 --> 00:25:58,000 or like x2f. 425 00:25:58,000 --> 00:26:02,000 Try to make up a name that makes sense. 426 00:26:02,000 --> 00:26:04,000 >> For example, if it's factorial, say factorial. 427 00:26:04,000 --> 00:26:08,000 If it's a function that is going to draw something, name it draw. 428 00:26:08,000 --> 00:26:11,000 And then we have the parameters, which are also called arguments, 429 00:26:11,000 --> 00:26:14,000 which are like the resources that your function needs 430 00:26:14,000 --> 00:26:17,000 from your code to perform its task. 431 00:26:17,000 --> 00:26:20,000 If you want to calculate the factorial of a number 432 00:26:20,000 --> 00:26:23,000 probably you need to have a number to calculate a factorial. 433 00:26:23,000 --> 00:26:27,000 One of the arguments that you're going to have is the number itself. 434 00:26:27,000 --> 00:26:31,000 And then it's going to do something and return the value at the end 435 00:26:31,000 --> 00:26:35,000 unless it's a void function. 436 00:26:35,000 --> 00:26:37,000 Let's see an example. 437 00:26:37,000 --> 00:26:40,000 If I want to write a function that sums all the numbers in an array of integers, 438 00:26:40,000 --> 00:26:43,000 first of all, the return type is going to be int 439 00:26:43,000 --> 00:26:46,000 because I have an array of integers. 440 00:26:46,000 --> 00:26:51,000 And then I'm going to have the function name like sumArray, 441 00:26:51,000 --> 00:26:54,000 and then it's going to take the array itself, to int nums, 442 00:26:54,000 --> 00:26:58,000 and then the length of the array so I know how many numbers I have to sum. 443 00:26:58,000 --> 00:27:02,000 Then I have to initialize a variable called sum, for example, to 0, 444 00:27:02,000 --> 00:27:08,000 and every time I see an element in the array I should add it to sum, so I did a for loop. 445 00:27:08,000 --> 00:27:15,000 Just like Lexi said, you do int i = 0, i < length and i++. 446 00:27:15,000 --> 00:27:20,000 And for every element in the array I did sum += nums[i], 447 00:27:20,000 --> 00:27:24,000 and then I returned the sum, so it's very simple, and it saves a lot of code 448 00:27:24,000 --> 00:27:28,000 if you're using this function a lot of times. 449 00:27:28,000 --> 00:27:32,000 Then we took a look at conditions. 450 00:27:32,000 --> 00:27:38,000 We have if, else, and else if. 451 00:27:38,000 --> 00:27:42,000 Let's see what is the difference between those. 452 00:27:42,000 --> 00:27:45,000 Take a look at these 2 codes. What is the difference between them? 453 00:27:45,000 --> 00:27:49,000 The first one has—basically the codes want you to tell 454 00:27:49,000 --> 00:27:51,000 if a number is +, -, or 0. 455 00:27:51,000 --> 00:27:55,000 The first one says if it's > 0 then it's positive. 456 00:27:55,000 --> 00:28:00,000 If it's = to 0 then it's 0, and if it's < 0 then it's negative. 457 00:28:00,000 --> 00:28:04,000 >> And the other one is doing if, else if, else. 458 00:28:04,000 --> 00:28:07,000 The difference between the two is that this one is actually going to 459 00:28:07,000 --> 00:28:13,000 check if > 0, < 0 or = 0 three times, 460 00:28:13,000 --> 00:28:17,000 so if you have the number 2, for example, it's going to come here and say 461 00:28:17,000 --> 00:28:21,000 if (x > 0), and it's going to say yes, so I print positive. 462 00:28:21,000 --> 00:28:25,000 But even though I know that it's > 0 and it's not going to be 0 or < 0 463 00:28:25,000 --> 00:28:29,000 I'm still going to do is it 0, is it < 0, 464 00:28:29,000 --> 00:28:33,000 so I'm actually going inside of ifs that I didn't have to 465 00:28:33,000 --> 00:28:38,000 because I already know that it's not going to satisfy any of these conditions. 466 00:28:38,000 --> 00:28:41,000 I can use the if, else if, else statement. 467 00:28:41,000 --> 00:28:45,000 It basically says if x = 0 I print the positive. 468 00:28:45,000 --> 00:28:48,000 If it's not, I'm going to also test this. 469 00:28:48,000 --> 00:28:51,000 If it's 2 not I'm going to do this. 470 00:28:51,000 --> 00:28:54,000 Basically if I had x = 2 you would say 471 00:28:54,000 --> 00:28:57,000 if (x > 0), yes, so print this. 472 00:28:57,000 --> 00:29:00,000 Now that I know that it's > 0 and that it satisfied the first if 473 00:29:00,000 --> 00:29:02,000 I'm not even going to run this code. 474 00:29:02,000 --> 00:29:09,000 The code runs faster, actually, 3 times faster if you use this. 475 00:29:09,000 --> 00:29:11,000 We also learned about and and or. 476 00:29:11,000 --> 00:29:15,000 I'm not going to go through this because Lexi already talked about them. 477 00:29:15,000 --> 00:29:17,000 It's just the && and | | operator. 478 00:29:17,000 --> 00:29:21,000 >> The only thing I'll say is be careful when you have 3 conditions. 479 00:29:21,000 --> 00:29:24,000 Use parentheses because it's very confusing when you have a condition 480 00:29:24,000 --> 00:29:27,000 and another one or another one. 481 00:29:27,000 --> 00:29:30,000 Use parentheses just to be sure that your conditions make sense 482 00:29:30,000 --> 00:29:34,000 because in that case, for example, you can imagine that 483 00:29:34,000 --> 00:29:38,000 it could be the first condition and one or the other 484 00:29:38,000 --> 00:29:41,000 or the 2 conditions combined in an and 485 00:29:41,000 --> 00:29:45,000 or the third one, so just be careful. 486 00:29:45,000 --> 00:29:48,000 And finally, we talked about switches. 487 00:29:48,000 --> 00:29:53,000 A switch is very useful when you have a variable. 488 00:29:53,000 --> 00:29:55,000 Let's say that you have a variable like n 489 00:29:55,000 --> 00:29:59,000 that can be 0, 1, or 2, and for each of those cases 490 00:29:59,000 --> 00:30:01,000 you're going to perform a task. 491 00:30:01,000 --> 00:30:04,000 You can say switch the variable, and it indicates that 492 00:30:04,000 --> 00:30:08,000 the value then is like value1 I'm going to do this, 493 00:30:08,000 --> 00:30:12,000 and then I break, which means I'm not going to look at any of the other cases 494 00:30:12,000 --> 00:30:15,000 because we already satisfied that case 495 00:30:15,000 --> 00:30:20,000 and then value2 and so on, and I also can have a default switch. 496 00:30:20,000 --> 00:30:24,000 That means if it doesn't satisfy any of the cases that I had 497 00:30:24,000 --> 00:30:29,000 that I'm going to do something else, but that's optional. 498 00:30:29,000 --> 00:30:36,000 That's all for me. Now let's have Tommy. 499 00:30:36,000 --> 00:30:41,000 All right, this is going to be Week 3-ish. 500 00:30:41,000 --> 00:30:45,000 These are some of the topics we'll be covering, crypto, scope, arrays, et cetera. 501 00:30:45,000 --> 00:30:49,000 Just a quick word on crypto. We're not going to hammer this home. 502 00:30:49,000 --> 00:30:52,000 >> We did this in pset 2, but for the quiz make sure you know the difference 503 00:30:52,000 --> 00:30:54,000 between the Caesar cipher and the Vigenère cipher, 504 00:30:54,000 --> 00:30:57,000 how both of those ciphers work and what it's like to encrypt 505 00:30:57,000 --> 00:30:59,000 and decrypt text using those 2 ciphers. 506 00:30:59,000 --> 00:31:03,000 Remember, the Caesar cipher simply rotates each character by the same amount, 507 00:31:03,000 --> 00:31:06,000 making sure you mod by the number of letters in the alphabet. 508 00:31:06,000 --> 00:31:09,000 And the Vigenère cipher, on the other hand, rotates each character 509 00:31:09,000 --> 00:31:12,000 by a different amount, so rather than saying 510 00:31:12,000 --> 00:31:15,000 every character rotated by 3 Vigenère will rotate each character 511 00:31:15,000 --> 00:31:17,000 by a different amount depending on some keyword 512 00:31:17,000 --> 00:31:20,000 where each letter in the keyword represents some different amount 513 00:31:20,000 --> 00:31:26,000 to rotate the clear text by. 514 00:31:26,000 --> 00:31:28,000 Let's first talk about variable scope. 515 00:31:28,000 --> 00:31:30,000 There are 2 different types of variables. 516 00:31:30,000 --> 00:31:33,000 We have local variables, and these are going to be defined 517 00:31:33,000 --> 00:31:36,000 outside of main or outside any function or block, 518 00:31:36,000 --> 00:31:39,000 and these will be accessible anywhere in your program. 519 00:31:39,000 --> 00:31:41,000 If you have a function and in that function is a while loop 520 00:31:41,000 --> 00:31:44,000 the big global variable is accessible everywhere. 521 00:31:44,000 --> 00:31:48,000 A local variable, on the other hand, is scoped to the place where it is defined. 522 00:31:48,000 --> 00:31:53,000 >> If you have a function here, for example, we have this function g, 523 00:31:53,000 --> 00:31:56,000 and inside of g there is a variable here called y, 524 00:31:56,000 --> 00:31:58,000 and that means that this is a local variable. 525 00:31:58,000 --> 00:32:00,000 Even though this variable is called y 526 00:32:00,000 --> 00:32:03,000 and this variable is called y these 2 functions 527 00:32:03,000 --> 00:32:06,000 have no idea what each other's local variables are. 528 00:32:06,000 --> 00:32:10,000 On the other hand, up here we say int x = 5, 529 00:32:10,000 --> 00:32:12,000 and this is outside the scope of any function. 530 00:32:12,000 --> 00:32:16,000 It's outside the scope of main, so this is a global variable. 531 00:32:16,000 --> 00:32:20,000 That means that inside of these 2 functions when I say x-- or x++ 532 00:32:20,000 --> 00:32:26,000 I'm accessing the same x whereby this y and this y are different variables. 533 00:32:26,000 --> 00:32:30,000 That's the difference between a global variable and a local variable. 534 00:32:30,000 --> 00:32:33,000 As far as design is concerned, sometimes it's probably a better idea 535 00:32:33,000 --> 00:32:37,000 to keep variables local whenever you possibly can 536 00:32:37,000 --> 00:32:39,000 since having a bunch of global variables can get really confusing. 537 00:32:39,000 --> 00:32:42,000 If you have a bunch of functions all modifying the same thing 538 00:32:42,000 --> 00:32:45,000 you might forget what if this function accidentally modifies this global, 539 00:32:45,000 --> 00:32:47,000 and this other function doesn't know about it, 540 00:32:47,000 --> 00:32:50,000 and it does get pretty confusing as you get more code. 541 00:32:50,000 --> 00:32:53,000 Keeping variables local whenever you possibly can 542 00:32:53,000 --> 00:32:56,000 is just good design. 543 00:32:56,000 --> 00:33:00,000 Arrays, remember, are simply lists of elements of the same type. 544 00:33:00,000 --> 00:33:04,000 Inside of C I can't have a list like 1, 2.0, hello. 545 00:33:04,000 --> 00:33:06,000 We just can't do that. 546 00:33:06,000 --> 00:33:11,000 >> When we declare an array in C all of the elements have to be of the same type. 547 00:33:11,000 --> 00:33:14,000 Here I have an array of 3 integers. 548 00:33:14,000 --> 00:33:18,000 Here I have the length of the array, but if I'm just declaring it in this syntax 549 00:33:18,000 --> 00:33:21,000 where I specify what all of the elements are I don't technically need this 3. 550 00:33:21,000 --> 00:33:25,000 The compiler is smart enough to figure out how big the array should be. 551 00:33:25,000 --> 00:33:28,000 Now when I want to get or set the value of an array 552 00:33:28,000 --> 00:33:30,000 this is the syntax to do that. 553 00:33:30,000 --> 00:33:33,000 This will actually modify the second element of the array because, remember, 554 00:33:33,000 --> 00:33:36,000 numbering starts at 0, not at 1. 555 00:33:36,000 --> 00:33:42,000 If I want to read that value I can say something like int x = array[1]. 556 00:33:42,000 --> 00:33:44,000 Or if I want to set that value, like I'm doing here, 557 00:33:44,000 --> 00:33:47,000 I can say array[1] = 4. 558 00:33:47,000 --> 00:33:50,000 That time accessing elements by their index 559 00:33:50,000 --> 00:33:52,000 or their position or where they are in the array, 560 00:33:52,000 --> 00:33:57,000 and that listing starts at 0. 561 00:33:57,000 --> 00:34:00,000 We can also have arrays of arrays, 562 00:34:00,000 --> 00:34:03,000 and this is called a multi-dimensional array. 563 00:34:03,000 --> 00:34:05,000 When we have a multi-dimensional array 564 00:34:05,000 --> 00:34:07,000 that means we can have something like rows and columns, 565 00:34:07,000 --> 00:34:11,000 and this is just one way of visualizing this or thinking about it. 566 00:34:11,000 --> 00:34:14,000 When I have a multi-dimensional array that means I'm going to start needing 567 00:34:14,000 --> 00:34:17,000 more than 1 index because if I have a grid 568 00:34:17,000 --> 00:34:19,000 just saying what row you're in doesn't give us a number. 569 00:34:19,000 --> 00:34:22,000 That's really just going to give us a list of numbers. 570 00:34:22,000 --> 00:34:25,000 Let's say I have this array here. 571 00:34:25,000 --> 00:34:30,000 I have an array called grid, and I'm saying it's 2 rows and 3 columns, 572 00:34:30,000 --> 00:34:32,000 and so this is one way of visualizing it. 573 00:34:32,000 --> 00:34:37,000 When I say I want to get the element at [1] [2] 574 00:34:37,000 --> 00:34:41,000 that means that because these are rows first and then columns 575 00:34:41,000 --> 00:34:44,000 I'm going to jump to row 1 since I said 1. 576 00:34:44,000 --> 00:34:49,000 >> Then I'm going to come over here to column 2, and I'm going to get the value 6. 577 00:34:49,000 --> 00:34:51,000 Make sense? 578 00:34:51,000 --> 00:34:55,000 Multi-dimensional arrays, remember, are technically just an array of arrays. 579 00:34:55,000 --> 00:34:57,000 We can have arrays of arrays of arrays. 580 00:34:57,000 --> 00:35:00,000 We can keep going, but really one way to think about 581 00:35:00,000 --> 00:35:03,000 how this is being laid out and what's going on is to visualize it 582 00:35:03,000 --> 00:35:09,000 in a grid like this. 583 00:35:09,000 --> 00:35:12,000 When we pass arrays to functions, they're going to behave 584 00:35:12,000 --> 00:35:16,000 a little bit differently than when we pass regular variables to functions 585 00:35:16,000 --> 00:35:18,000 like passing an int or a float. 586 00:35:18,000 --> 00:35:21,000 When we pass in an int or char or any of these other data types 587 00:35:21,000 --> 00:35:24,000 we just took a look at if the function modifies 588 00:35:24,000 --> 00:35:28,000 the value of that variable that change is not going to propagate up 589 00:35:28,000 --> 00:35:32,000 to the calling function. 590 00:35:32,000 --> 00:35:35,000 With an array, on the other hand, that will happen. 591 00:35:35,000 --> 00:35:39,000 If I pass in an array to some function and that function changes some of the elements, 592 00:35:39,000 --> 00:35:43,000 when I come back up to the function that called it 593 00:35:43,000 --> 00:35:47,000 my array is now going to be different, and the vocabulary for that 594 00:35:47,000 --> 00:35:50,000 is arrays are passed by reference, as we'll see later. 595 00:35:50,000 --> 00:35:53,000 This is related to how pointers work, where these basic data types, 596 00:35:53,000 --> 00:35:55,000 on the other hand, are passed by value. 597 00:35:55,000 --> 00:35:59,000 >> We can think of that as making a copy of some variable and then passing in the copy. 598 00:35:59,000 --> 00:36:01,000 It doesn't matter what we do with that variable. 599 00:36:01,000 --> 00:36:06,000 The calling function will not be aware that it was changed. 600 00:36:06,000 --> 00:36:10,000 Arrays are just a little bit different in that regard. 601 00:36:10,000 --> 00:36:13,000 For example, as we just saw, main is simply a function 602 00:36:13,000 --> 00:36:15,000 that can take in 2 arguments. 603 00:36:15,000 --> 00:36:20,000 The first argument to the main function is argc, or the number of arguments, 604 00:36:20,000 --> 00:36:23,000 and the second argument is called argv, 605 00:36:23,000 --> 00:36:27,000 and those are the actual values of those arguments. 606 00:36:27,000 --> 00:36:30,000 Let's say I have a program called this.c, 607 00:36:30,000 --> 00:36:34,000 and I say make this, and I'm going to run this at the command line. 608 00:36:34,000 --> 00:36:38,000 Now to pass in some arguments to my program called this, 609 00:36:38,000 --> 00:36:42,000 I could say something like ./this is cs 50. 610 00:36:42,000 --> 00:36:45,000 This is what we imagine David to do every day at the terminal. 611 00:36:45,000 --> 00:36:48,000 But now the main function inside of that program 612 00:36:48,000 --> 00:36:52,000 has these values, so argc is 4. 613 00:36:52,000 --> 00:36:56,000 It might be a little confusing because really we're only passing in is cs 50. 614 00:36:56,000 --> 00:36:58,000 That's only 3. 615 00:36:58,000 --> 00:37:02,000 But remember that the first element of argv or the first argument 616 00:37:02,000 --> 00:37:05,000 is the name of the function itself. 617 00:37:05,000 --> 00:37:07,190 So that means that we have 4 things here, 618 00:37:07,190 --> 00:37:10,530 and the first element is going to be ./this. 619 00:37:10,530 --> 00:37:12,970 And this will be represented as a string. 620 00:37:12,970 --> 00:37:18,590 Then the remaining elements are what we typed in after the name of the program. 621 00:37:18,590 --> 00:37:22,720 So just as an aside, as we probably saw in pset 2, 622 00:37:22,720 --> 00:37:28,780 remember that the string 50 is ≠ the integer 50. 623 00:37:28,780 --> 00:37:32,520 So we can't say something like, 'int x = argv 3.' 624 00:37:32,520 --> 00:37:36,470 >> That's just not going to make sense, because this is a string, and this is an integer. 625 00:37:36,470 --> 00:37:38,510 So if you want to convert between the 2, remember, we're going to 626 00:37:38,510 --> 00:37:40,810 have this magic function called atoi. 627 00:37:40,810 --> 00:37:46,270 That takes a string and returns the integer represented inside of that string. 628 00:37:46,270 --> 00:37:48,360 So that's an easy mistake to make on the quiz, 629 00:37:48,360 --> 00:37:51,590 just thinking that this will automatically be the correct type. 630 00:37:51,590 --> 00:37:53,860 But just know that these will always be strings 631 00:37:53,860 --> 00:38:00,920 even if the string only contains an integer or a character or a float. 632 00:38:00,920 --> 00:38:03,380 So now let's talk about running time. 633 00:38:03,380 --> 00:38:06,700 When we have all these algorithms that do all these crazy things, 634 00:38:06,700 --> 00:38:11,580 it becomes really useful to ask the question, "How long do they take?" 635 00:38:11,580 --> 00:38:15,500 We represent that with something called asymptotic notation. 636 00:38:15,500 --> 00:38:18,430 So this means that--well, let's say we give our algorithm 637 00:38:18,430 --> 00:38:20,840 some really, really, really big input. 638 00:38:20,840 --> 00:38:23,840 We want to ask the question, "How long is it going to take? 639 00:38:23,840 --> 00:38:26,370 How many steps will it take our algorithm to run 640 00:38:26,370 --> 00:38:29,980 as a function of the size of the input?" 641 00:38:29,980 --> 00:38:33,080 So the first way we can describe run time is with big O. 642 00:38:33,080 --> 00:38:35,380 And this is our worst-case running time. 643 00:38:35,380 --> 00:38:38,590 So if we want to sort an array, and we give our algorithm an array 644 00:38:38,590 --> 00:38:41,000 that's in descending order when it should be in ascending order, 645 00:38:41,000 --> 00:38:43,130 that's going to be the worst case. 646 00:38:43,130 --> 00:38:49,800 This is our upper bound in the maximum length of time our algorithm will take. 647 00:38:49,800 --> 00:38:54,740 On the other hand, this Ω is going to describe best-case running time. 648 00:38:54,740 --> 00:38:58,210 So if we give an already sorted array to a sorting algorithm, 649 00:38:58,210 --> 00:39:00,940 how long will it take to sort it? 650 00:39:00,940 --> 00:39:06,610 And this, then, describes a lower bound on running time. 651 00:39:06,610 --> 00:39:10,980 So here are just some words that describe some common running times. 652 00:39:10,980 --> 00:39:13,120 These are in ascending order. 653 00:39:13,120 --> 00:39:16,060 The fastest running time we have is called constant. 654 00:39:16,060 --> 00:39:19,800 >> That means no matter how many elements we give our algorithm, 655 00:39:19,800 --> 00:39:22,280 no matter how big our array is, sorting it 656 00:39:22,280 --> 00:39:26,510 or doing whatever we're doing to the array will always take the same amount of time. 657 00:39:26,510 --> 00:39:30,270 So we can represent that just with a 1, which is a constant. 658 00:39:30,270 --> 00:39:32,410 We also looked at logarithmic run time. 659 00:39:32,410 --> 00:39:34,800 So something like binary search is logarithmic, 660 00:39:34,800 --> 00:39:37,140 where we cut the problem in half every time 661 00:39:37,140 --> 00:39:40,970 and then things just get higher from there. 662 00:39:40,970 --> 00:39:43,580 And if you're ever writing an O of any factorial algorithm, 663 00:39:43,580 --> 00:39:47,850 you probably shouldn't consider this as your day job. 664 00:39:47,850 --> 00:39:53,910 When we compare running times it's important to keep in mind these things. 665 00:39:53,910 --> 00:39:57,760 So if I have an algorithm that's O(n), and someone else 666 00:39:57,760 --> 00:40:03,590 has an algorithm of O(2n) these are actually asymptotically equivalent. 667 00:40:03,590 --> 00:40:06,590 So if we imagine n to be a big number like eleventy billion: 668 00:40:06,590 --> 00:40:13,090 so when we're comparing eleventy billion to something like eleventy billion + 3, 669 00:40:13,090 --> 00:40:17,640 suddenly that +3 doesn't really make a big difference anymore. 670 00:40:17,640 --> 00:40:20,980 That's why we're going to start considering these things to be equivalent. 671 00:40:20,980 --> 00:40:24,220 So things like these constants here, there's 2 x this, or adding a 3, 672 00:40:24,220 --> 00:40:27,180 these are just constants, and these are going to drop up. 673 00:40:27,180 --> 00:40:32,480 So that's why all 3 of these run times are the same as saying they're O(n). 674 00:40:32,480 --> 00:40:37,490 Similarly, if we have 2 other run times, let's say O(n³ + 2n²), we can add 675 00:40:37,490 --> 00:40:42,070 + n, + 7, and then we have another run time that's just O(n³). 676 00:40:42,070 --> 00:40:46,290 again, these are the same thing because these--these are not the same. 677 00:40:46,290 --> 00:40:49,840 These are the same things, sorry. So these are the same because 678 00:40:49,840 --> 00:40:53,090 this n³ is going to dominate this 2n². 679 00:40:53,090 --> 00:40:59,130 >> What is not the same thing is if we have run times like O(n³) and O(n²) 680 00:40:59,130 --> 00:41:02,820 because this n³ is much larger than this n². 681 00:41:02,820 --> 00:41:05,470 So if we have exponents, suddenly this starts to matter, 682 00:41:05,470 --> 00:41:08,280 but when we're just dealing with factors as we are up here, 683 00:41:08,280 --> 00:41:12,810 then it's not going to matter because they are just going to drop out. 684 00:41:12,810 --> 00:41:16,760 Let's take a look at some of the algorithms we've seen so far 685 00:41:16,760 --> 00:41:19,260 and talk about their run time. 686 00:41:19,260 --> 00:41:23,850 The first way of looking for a number in a list, that we saw, was linear search. 687 00:41:23,850 --> 00:41:26,950 And the implementation of linear search is super straightforward. 688 00:41:26,950 --> 00:41:30,490 We just have a list, and we're going to look at every single element in the list 689 00:41:30,490 --> 00:41:34,260 until we find the number we're looking for. 690 00:41:34,260 --> 00:41:38,370 So that means that in the worst case, this O(n). 691 00:41:38,370 --> 00:41:40,860 And the worst case here could be if the element is 692 00:41:40,860 --> 00:41:45,710 the last element, then using linear search we have to look at every single element 693 00:41:45,710 --> 00:41:50,180 until we get to the last one in order to know that it was actually in the list. 694 00:41:50,180 --> 00:41:52,910 We can't just give up halfway and say, "It's probably not there." 695 00:41:52,910 --> 00:41:55,980 With linear search we have to look at the whole thing. 696 00:41:55,980 --> 00:41:59,090 The best-case running time, on the other hand, is constant 697 00:41:59,090 --> 00:42:04,200 because in the best case the element we're looking for is just the first one in the list. 698 00:42:04,200 --> 00:42:08,930 So it's going to take us exactly 1 step, no matter how big the list is 699 00:42:08,930 --> 00:42:12,140 if we're looking for the first element every time. 700 00:42:12,140 --> 00:42:15,390 >> So when you search, remember, it does not require that our list be sorted. 701 00:42:15,390 --> 00:42:19,430 Because we're simply going to look over every single element, and it doesn't really matter 702 00:42:19,430 --> 00:42:23,560 what order those elements are in. 703 00:42:23,560 --> 00:42:28,110 A more intelligent search algorithm is something like binary search. 704 00:42:28,110 --> 00:42:31,500 Remember, the implementation of binary search is when you're going to 705 00:42:31,500 --> 00:42:34,320 keep looking at the middle of the list. 706 00:42:34,320 --> 00:42:38,000 And because we're looking at the middle, we require that the list is sorted 707 00:42:38,000 --> 00:42:40,580 or else we don't know where the middle is, and we have to look over 708 00:42:40,580 --> 00:42:44,480 the whole list to find it, and then at that point we're just wasting time. 709 00:42:44,480 --> 00:42:48,480 So if we have a sorted list and we find the middle, we're going to compare the middle 710 00:42:48,480 --> 00:42:51,590 to the element we're looking for. 711 00:42:51,590 --> 00:42:54,640 If it's too high, then we can forget the right half 712 00:42:54,640 --> 00:42:57,810 because we know that if our element is already too high 713 00:42:57,810 --> 00:43:01,080 and everything to the right of this element is even higher, 714 00:43:01,080 --> 00:43:02,760 then we don't need to look there anymore. 715 00:43:02,760 --> 00:43:05,430 Where on the other hand, if our element is too low, 716 00:43:05,430 --> 00:43:08,700 we know everything to the left of that element is also too low, 717 00:43:08,700 --> 00:43:11,390 so it doesn't really make sense to look there, either. 718 00:43:11,390 --> 00:43:15,760 This way, with every step and every time we look at the midpoint of the list, 719 00:43:15,760 --> 00:43:19,060 we're going to cut our problem in half because suddenly we know 720 00:43:19,060 --> 00:43:23,040 a whole bunch of numbers that can't be the one we're looking for. 721 00:43:23,040 --> 00:43:26,950 >> In pseudocode this would look something like this, 722 00:43:26,950 --> 00:43:30,990 and because we're cutting the list in half every single time, 723 00:43:30,990 --> 00:43:34,920 our worst-case run time jumps from linear to logarithmic. 724 00:43:34,920 --> 00:43:39,260 So suddenly we have log-in steps in order to find an element in a list. 725 00:43:39,260 --> 00:43:42,460 The best-case running time, though, is still constant 726 00:43:42,460 --> 00:43:45,180 because now, let's just say that the element we're looking for is 727 00:43:45,180 --> 00:43:48,380 always the exact middle of the original list. 728 00:43:48,380 --> 00:43:52,080 So we can grow our list as big as we want, but if the element we're looking for is at the middle, 729 00:43:52,080 --> 00:43:54,910 then it's only going to take us 1 step. 730 00:43:54,910 --> 00:44:00,920 So that's why we're O(log n) and Ω(1) or constant. 731 00:44:00,920 --> 00:44:04,510 Let's actually run binary search on this list. 732 00:44:04,510 --> 00:44:08,020 So let's say that we're looking for the element 164. 733 00:44:08,020 --> 00:44:11,650 The first thing we are going to do is find the midpoint of this list. 734 00:44:11,650 --> 00:44:15,060 It just so happens that the midpoint is going to fall in between these 2 numbers, 735 00:44:15,060 --> 00:44:18,960 so let's just arbitrarily say, every time the midpoint falls between 2 numbers, 736 00:44:18,960 --> 00:44:21,150 let's just round up. 737 00:44:21,150 --> 00:44:24,330 We just need to make sure we do this every step of the way. 738 00:44:24,330 --> 00:44:29,040 So we're going to round up, and we're going to say that 161 is the middle of our list. 739 00:44:29,040 --> 00:44:34,640 So 161 < 164, and every element to the left of 161 740 00:44:34,640 --> 00:44:39,120 is also < 164, so we know that it's not going to help us at all 741 00:44:39,120 --> 00:44:42,690 to start looking over here because the element we're looking for can't be there. 742 00:44:42,690 --> 00:44:47,060 So what we can do is we can just forget about that whole left half of the list, 743 00:44:47,060 --> 00:44:51,700 and now only consider from the right of the 161 onward. 744 00:44:51,700 --> 00:44:54,050 >> So again, this is the midpoint; let's just round up. 745 00:44:54,050 --> 00:44:56,260 Now 175 is too big. 746 00:44:56,260 --> 00:44:59,180 So we know it's not going to help us looking here or here, 747 00:44:59,180 --> 00:45:06,610 so we can just throw that away, and eventually we'll hit the 164. 748 00:45:06,610 --> 00:45:10,560 Any questions on binary search? 749 00:45:10,560 --> 00:45:14,180 Let's move on from searching through an already-sorted list 750 00:45:14,180 --> 00:45:17,660 to actually taking a list of numbers in any order 751 00:45:17,660 --> 00:45:20,960 and making that list in ascending order. 752 00:45:20,960 --> 00:45:24,060 The first algorithm we looked at was called bubble sort. 753 00:45:24,060 --> 00:45:27,300 And this would be simpler of the algorithms we saw. 754 00:45:27,300 --> 00:45:32,970 Bubble sort says that when any 2 elements inside the list are out of place, 755 00:45:32,970 --> 00:45:36,500 meaning there is a higher number to the left of a lower number, 756 00:45:36,500 --> 00:45:40,190 then we're going to swap them, because that means that the list will be 757 00:45:40,190 --> 00:45:42,860 "more sorted" than it was before. 758 00:45:42,860 --> 00:45:45,180 And we're just going to continue this process again and again and again 759 00:45:45,180 --> 00:45:52,100 until eventually the elements kind of bubble to their correct location and we have a sorted list. 760 00:45:52,100 --> 00:45:57,230 >> The run time of this is going to be O(n²). Why? 761 00:45:57,230 --> 00:46:00,370 Well, because in the worst case, we're going to take every element, and 762 00:46:00,370 --> 00:46:04,570 we're going to end up comparing it to every other element in the list. 763 00:46:04,570 --> 00:46:08,030 But in the best case, we have an already sorted list, bubble sort's 764 00:46:08,030 --> 00:46:12,230 just going to go through once, say "Nope. I didn't make any swaps, so I'm done." 765 00:46:12,230 --> 00:46:17,410 So we have a best-case running time of Ω(n). 766 00:46:17,410 --> 00:46:20,680 Let's run bubble sort on a list. 767 00:46:20,680 --> 00:46:23,560 Or first, let's just look at some pseudocode really quickly. 768 00:46:23,560 --> 00:46:28,160 We want to say we want to keep track of, in every iteration of the loop, 769 00:46:28,160 --> 00:46:32,190 keep track of whether or not we changed any elements. 770 00:46:32,190 --> 00:46:37,610 So the reason for that is, we're going to stop when we have not swapped any elements. 771 00:46:37,610 --> 00:46:41,980 So at the start of our loop we haven't swapped anything, so we'll say that's false. 772 00:46:41,980 --> 00:46:47,170 Now, we're going to go through the list and compare element i to element i + 1 773 00:46:47,170 --> 00:46:50,310 and if it is the case that there is a bigger number to the left of a smaller number, 774 00:46:50,310 --> 00:46:52,310 then we're just going to swap them. 775 00:46:52,310 --> 00:46:54,490 >> And then we're going to remember that we swapped an element. 776 00:46:54,490 --> 00:46:58,900 That means that we need to go through the list at least 1 more time 777 00:46:58,900 --> 00:47:02,160 because the condition in which we stopped is when the whole list is already sorted, 778 00:47:02,160 --> 00:47:04,890 meaning we have not made any swaps. 779 00:47:04,890 --> 00:47:09,960 So that's why our condition down here is 'while some elements have been swapped.' 780 00:47:09,960 --> 00:47:13,720 So now let's just look at this running on a list. 781 00:47:13,720 --> 00:47:16,640 I have the list 5,0,1,6,4. 782 00:47:16,640 --> 00:47:19,850 Bubble sort is going to start all the way at the left, and it's going to compare 783 00:47:19,850 --> 00:47:24,700 the i elements, so 0 to i + 1, which is element 1. 784 00:47:24,700 --> 00:47:29,020 It's going to say, well 5 > 0, but right now 5 is to the left, 785 00:47:29,020 --> 00:47:32,500 so I need to swap the 5 and the 0. 786 00:47:32,500 --> 00:47:35,470 When I swap them, suddenly I get this different list. 787 00:47:35,470 --> 00:47:38,260 Now 5 > 1, so we're going to swap them. 788 00:47:38,260 --> 00:47:42,160 5 is not > 6, so we don't need to do anything here. 789 00:47:42,160 --> 00:47:46,690 But 6 > 4, so we need to swap. 790 00:47:46,690 --> 00:47:49,740 Again, we need to run through the whole list to eventually discover 791 00:47:49,740 --> 00:47:52,330 that these are out of order; we swap them, 792 00:47:52,330 --> 00:47:57,120 and at this point we need to run through the list 1 more time 793 00:47:57,120 --> 00:48:05,390 to make sure that everything's in its order, and at this point bubble sort has finished. 794 00:48:05,390 --> 00:48:10,720 A different algorithm for taking some elements and sorting them is selection sort. 795 00:48:10,720 --> 00:48:15,740 The idea behind selection sort is that we're going to build up a sorted portion of the list 796 00:48:15,740 --> 00:48:18,150 1 element at a time. 797 00:48:18,150 --> 00:48:23,170 >> And the way we're going to do that is by building up the left segment of the list. 798 00:48:23,170 --> 00:48:27,510 And basically, every--on each step, we're going to take the smallest element we have left 799 00:48:27,510 --> 00:48:32,310 that hasn't been sorted yet, and we're going to move it into that sorted segment. 800 00:48:32,310 --> 00:48:35,850 That means we need to continuously find the minimum unsorted element 801 00:48:35,850 --> 00:48:40,720 and then take that minimum element and swap it with whatever 802 00:48:40,720 --> 00:48:45,090 left-most element that is not sorted. 803 00:48:45,090 --> 00:48:50,890 The run time of this is going to be O(n²) because in the worst case 804 00:48:50,890 --> 00:48:55,070 we need to compare every single element to every other element. 805 00:48:55,070 --> 00:48:59,250 Because we're saying that if we start at the left half of the list, we need 806 00:48:59,250 --> 00:49:02,970 to go through the entire right segment to find the smallest element. 807 00:49:02,970 --> 00:49:05,430 And then, again, we need to go over the entire right segment and 808 00:49:05,430 --> 00:49:08,210 keep going over that over and over and over again. 809 00:49:08,210 --> 00:49:11,350 That's going to be n². We're going to need a for loop inside of another for loop 810 00:49:11,350 --> 00:49:13,350 which suggests n². 811 00:49:13,350 --> 00:49:16,530 In the best case thought, let's say we give it an already sorted list; 812 00:49:16,530 --> 00:49:19,270 we actually don't do any better than n². 813 00:49:19,270 --> 00:49:21,730 Because selection sort has no way of knowing that 814 00:49:21,730 --> 00:49:25,540 the minimum element is just the one I happen to be looking at. 815 00:49:25,540 --> 00:49:28,970 It still needs to make sure that this is actually the minimum. 816 00:49:28,970 --> 00:49:31,670 >> And the only way to make sure that it's the minimum, using this algorithm, 817 00:49:31,670 --> 00:49:34,640 is to look at every single element again. 818 00:49:34,640 --> 00:49:38,420 So really, if you give it--if you give selection sort an already sorted list, 819 00:49:38,420 --> 00:49:42,720 it's not going to do any better than giving it a list that isn't sorted yet. 820 00:49:42,720 --> 00:49:46,320 By the way, if it happens to be the case that something is O(something) 821 00:49:46,320 --> 00:49:50,640 and the omega of something, we can just say more succinctly that it's θ of something. 822 00:49:50,640 --> 00:49:52,760 So if you see that come up anywhere, that's what that just means. 823 00:49:52,760 --> 00:49:57,580 >> If something is theta of n², it is both big O(n²) and Ω(n²). 824 00:49:57,580 --> 00:49:59,790 So best case and worst case, it doesn't make a difference, 825 00:49:59,790 --> 00:50:04,400 the algorithm is going to do the same thing every time. 826 00:50:04,400 --> 00:50:06,610 So this is what pseudocode for selection sort could look like. 827 00:50:06,610 --> 00:50:10,630 We're basically going to say that I want to iterate over the list 828 00:50:10,630 --> 00:50:15,180 from left to right, and at each iteration of the loop, I'm going to move 829 00:50:15,180 --> 00:50:19,780 the minimum element into this sorted portion of the list. 830 00:50:19,780 --> 00:50:23,260 And once I move something there, I never need to look at that element again. 831 00:50:23,260 --> 00:50:28,600 Because as soon as I swap an element in to the left segment of the list, it's sorted 832 00:50:28,600 --> 00:50:32,600 because we're doing everything in ascending order by using minimums. 833 00:50:32,600 --> 00:50:38,740 So we said, okay, we're at position i, and we need to look at all of the elements 834 00:50:38,740 --> 00:50:42,260 to the right of i in order to find the minimum. 835 00:50:42,260 --> 00:50:46,150 So that means we want to look from i + 1 to the end of the list. 836 00:50:46,150 --> 00:50:51,610 And now, if the element that we're currently looking at is less than our minimum so far, 837 00:50:51,610 --> 00:50:54,190 which, remember, we're starting the minimum off to just be 838 00:50:54,190 --> 00:50:57,020 whatever element we're currently at; I'll assume that's the minimum. 839 00:50:57,020 --> 00:51:00,270 If I find an element that's smaller than that, then I'm going to say, okay, 840 00:51:00,270 --> 00:51:02,700 well, I have found a new minimum. 841 00:51:02,700 --> 00:51:06,080 I'm going to remember where that minimum was. 842 00:51:06,080 --> 00:51:09,560 >> So now, once I've gone through that right unsorted segment, 843 00:51:09,560 --> 00:51:16,690 I can say I'm going to swap the minimum element with the element that is in position i. 844 00:51:16,690 --> 00:51:21,100 That's going to build up my list, my sorted portion of the list from left to right, 845 00:51:21,100 --> 00:51:25,190 and we don't ever need to look at an element again once it's in that portion. 846 00:51:25,190 --> 00:51:27,930 Once we've swapped it. 847 00:51:27,930 --> 00:51:30,260 So let's run selection sort on this list. 848 00:51:30,260 --> 00:51:38,220 The blue element here is going to be the i, and the red element is going to be the minimum element. 849 00:51:38,220 --> 00:51:41,570 So i starts all the way at the left of the list, so at 5. 850 00:51:41,570 --> 00:51:44,610 Now we need to find the minimum unsorted element. 851 00:51:44,610 --> 00:51:49,480 So we say 0 < 5, so 0 is my new minimum. 852 00:51:49,480 --> 00:51:53,820 >> But I can't stop there, because even though we can recognize that 0 is the smallest, 853 00:51:53,820 --> 00:51:59,390 we need to run through every other element of the list to make sure. 854 00:51:59,390 --> 00:52:01,760 So 1 is bigger, 6 is bigger, 4 is bigger. 855 00:52:01,760 --> 00:52:05,850 That means that after looking at all of these elements, I've determined 0 is the smallest. 856 00:52:05,850 --> 00:52:09,800 So I'm going to swap the 5 and the 0. 857 00:52:09,800 --> 00:52:15,480 Once I swap that, I'm going to get a new list, and I know that I never need to look at that 0 again 858 00:52:15,480 --> 00:52:19,380 because once I've swapped it, I've sorted it and we're done. 859 00:52:19,380 --> 00:52:22,730 Now it just so happens that the blue element is again the 5, 860 00:52:22,730 --> 00:52:26,030 and we need to look at the 1, the 6 and the 4 to determine that 1 861 00:52:26,030 --> 00:52:31,520 is the smallest minimum element, so we'll swap the 1 and the 5. 862 00:52:31,520 --> 00:52:36,890 Again, we need to look at--compare the 5 to the 6 and the 4, 863 00:52:36,890 --> 00:52:39,830 and we're going to swap the 4 and the 5, and finally, compare 864 00:52:39,830 --> 00:52:45,740 those 2 numbers and swap them until we get our sorted list. 865 00:52:45,740 --> 00:52:49,730 Any questions on selection sort? 866 00:52:49,730 --> 00:52:56,420 Okay. Let's move to the last topic here, and that is recursion. 867 00:52:56,420 --> 00:52:59,810 >> Recursion, remember, is this really meta thing where a function 868 00:52:59,810 --> 00:53:02,740 repeatedly calls itself. 869 00:53:02,740 --> 00:53:05,620 So at some point, while our fuction is repeatedly calling itself, 870 00:53:05,620 --> 00:53:10,100 there needs to be some point at which we stop calling ourselves. 871 00:53:10,100 --> 00:53:13,670 Because if we don't do that, then we're just going to continue to do this forever, 872 00:53:13,670 --> 00:53:16,660 and our program is just not going to terminate. 873 00:53:16,660 --> 00:53:19,200 We call this condition the base case. 874 00:53:19,200 --> 00:53:22,570 And the base case says, rather than calling a function again, 875 00:53:22,570 --> 00:53:25,330 I'm just going to return some value. 876 00:53:25,330 --> 00:53:28,080 So once we've returned a value, we've stopped calling ourselves, 877 00:53:28,080 --> 00:53:32,550 and the rest of the calls we've made so far can also return. 878 00:53:32,550 --> 00:53:36,050 The opposite of the base case is the recursive case. 879 00:53:36,050 --> 00:53:39,050 And this is when we want to make another call to the function that we're currently in. 880 00:53:39,050 --> 00:53:44,690 And we probably, although not always, want to use different arguments. 881 00:53:44,690 --> 00:53:48,940 >> So if we have a function called f, and f just called take 1 argument, 882 00:53:48,940 --> 00:53:52,010 and we just keep calling f(1), f(1), f(1), and it just so happens that 883 00:53:52,010 --> 00:53:56,510 the argument 1 falls into recursive case, we're still never going to stop. 884 00:53:56,510 --> 00:54:01,620 Even if we have a base case, we need to make sure that eventually we're going to hit that base case. 885 00:54:01,620 --> 00:54:04,250 We don't just keep staying in this recursive case. 886 00:54:04,250 --> 00:54:09,870 Generally, when we call ourselves, we're probably going to have a different argument each time. 887 00:54:09,870 --> 00:54:12,700 Here is a really simple recursive function. 888 00:54:12,700 --> 00:54:15,090 So this will compute the factorial of a number. 889 00:54:15,090 --> 00:54:17,790 Up top here we have our base case. 890 00:54:17,790 --> 00:54:22,330 In the case that n ≤ 1, we're not going to call factorial again. 891 00:54:22,330 --> 00:54:26,490 We're going to stop; we're just going to return some value. 892 00:54:26,490 --> 00:54:30,170 If this is not true, then we're going to hit our recursive case. 893 00:54:30,170 --> 00:54:33,550 Notice here that we're not just calling factorial(n), because that wouldn't be very helpful. 894 00:54:33,550 --> 00:54:36,810 We're going to call factorial of something else. 895 00:54:36,810 --> 00:54:40,850 >> And so you can see, eventually if we pass a factorial (5) or something, 896 00:54:40,850 --> 00:54:45,900 we're going to call factorial (4) and so on, and eventually we're going to hit this base case. 897 00:54:45,900 --> 00:54:51,730 So this looks good. Let's see what happens when we actually run this. 898 00:54:51,730 --> 00:54:57,840 This is the stack, and let's say that main is going to call this function with an argument (4). 899 00:54:57,840 --> 00:55:02,200 So once factorial sees and = 4, factorial will call itself. 900 00:55:02,200 --> 00:55:05,010 Now, suddenly, we have factorial(3). 901 00:55:05,010 --> 00:55:10,780 So these functions are going to keep growing until eventually we hit our base case. 902 00:55:10,780 --> 00:55:17,830 At this point, the return value of this is the return(n x the return value of this), 903 00:55:17,830 --> 00:55:21,290 the return value of this is n x the return value of this. 904 00:55:21,290 --> 00:55:23,290 Eventually we need to hit some number. 905 00:55:23,290 --> 00:55:26,560 At the top here, we say return 1. 906 00:55:26,560 --> 00:55:30,650 That means that once we return that number, we can pop this off the stack. 907 00:55:30,650 --> 00:55:36,570 So this factorial(1) is done. 908 00:55:36,570 --> 00:55:41,190 When 1 returns, this factorial(1) returns, this return to 1. 909 00:55:41,190 --> 00:55:46,910 The return value of this, remember, was n x the return value of this. 910 00:55:46,910 --> 00:55:50,720 So suddenly, this guy knows that I want to return 2. 911 00:55:50,720 --> 00:55:55,910 >> So remember, return value of this is just n x the return value up here. 912 00:55:55,910 --> 00:56:01,160 So now we can say 3 x 2, and finally, here we can say 913 00:56:01,160 --> 00:56:04,010 this is just going to be 4 x 3 x 2. 914 00:56:04,010 --> 00:56:09,570 And once this returns, we get down to a single integer inside of main. 915 00:56:09,570 --> 00:56:15,460 Any questions on recursion? 916 00:56:15,460 --> 00:56:17,090 All right. So there's more time for questions at the end, 917 00:56:17,090 --> 00:56:23,360 but now Joseph will cover the remaining topics. 918 00:56:23,360 --> 00:56:25,590 >> [Joseph Ong] All right. So now that we've talked about recursions, 919 00:56:25,590 --> 00:56:27,840 let's talk a little bit about what merge sort is. 920 00:56:27,840 --> 00:56:31,740 Merge sort is basically another way of sorting a list of numbers. 921 00:56:31,740 --> 00:56:36,430 And how it works is, with merge sort you have a list, and what we do is 922 00:56:36,430 --> 00:56:39,120 we say, let's split this into 2 halves. 923 00:56:39,120 --> 00:56:42,750 We'll first run merge sort again on the left half, 924 00:56:42,750 --> 00:56:45,040 then we'll run merge sort on the right half, 925 00:56:45,040 --> 00:56:50,240 and that gives us now 2 halves that are sorted, and now we're going to combine those halves together. 926 00:56:50,240 --> 00:56:55,010 It's a bit hard to see without an example, so we'll go through the motions and see what happens. 927 00:56:55,010 --> 00:56:59,590 So you start with this list, we split it into 2 halves. 928 00:56:59,590 --> 00:57:02,300 We run merge sort on the left half first. 929 00:57:02,300 --> 00:57:06,660 So that's the left half, and now we run them through this list again 930 00:57:06,660 --> 00:57:09,800 which gets passed into merge sort, and then we look, again, 931 00:57:09,800 --> 00:57:13,270 at the left side of this list and we run merge sort on it. 932 00:57:13,270 --> 00:57:15,880 Now, we get down to a list of 2 numbers, 933 00:57:15,880 --> 00:57:19,010 and now the left half is only 1 element long, and we can't 934 00:57:19,010 --> 00:57:23,380 split a list that's only 1 element into half, so we just say, once we have 50, 935 00:57:23,380 --> 00:57:26,400 which is just 1 element, it's already sorted. 936 00:57:26,400 --> 00:57:29,860 >> Once we're done with that, we can see that we can 937 00:57:29,860 --> 00:57:32,230 move on to the right half of this list, 938 00:57:32,230 --> 00:57:36,480 and 3 is also sorted, and so now that both halves of this list are sorted 939 00:57:36,480 --> 00:57:39,080 we can join these numbers back together. 940 00:57:39,080 --> 00:57:45,320 So we look at 50 and 3; 3 is smaller than 50, so it goes in first and then 50 comes in. 941 00:57:45,320 --> 00:57:49,340 Now, that's done; we go back up to that list and sort it's right half. 942 00:57:49,340 --> 00:57:52,440 42 is it's own number, so it's already sorted. 943 00:57:52,440 --> 00:57:57,850 So now we compare these 2 and 3 is smaller than 42, so that gets put in first, 944 00:57:57,850 --> 00:58:02,340 now 42 gets put in, and 50 gets put in. 945 00:58:02,340 --> 00:58:07,220 Now, that's sorted, we go all the way back to the top, 1337 and 15. 946 00:58:07,220 --> 00:58:14,560 Well, we now look at the left half of this list; 1337 is by itself so it's sorted and same with 15. 947 00:58:14,560 --> 00:58:19,020 So now we combine these 2 numbers to sort that original list, 15 < 1337, 948 00:58:19,020 --> 00:58:23,060 so it goes in first, then 1337 goes in. 949 00:58:23,060 --> 00:58:26,640 And now we sorted both halves of the original list up top. 950 00:58:26,640 --> 00:58:30,440 And all we have to do is combine these. 951 00:58:30,440 --> 00:58:36,890 We look at the first 2 numbers of this list, 3 < 15, so it goes into the sort array first. 952 00:58:36,890 --> 00:58:44,460 15 < 42, so it goes in. Now, 42 < 1337, that goes in. 953 00:58:44,460 --> 00:58:51,010 50 < 1337, so it goes in. And notice that we just took 2 numbers off of this list. 954 00:58:51,010 --> 00:58:53,640 So we're not just alternating between the 2 lists. 955 00:58:53,640 --> 00:58:56,050 We're just looking at the beginning, and we're taking the element 956 00:58:56,050 --> 00:59:00,270 that's smaller and then putting it into our array. 957 00:59:00,270 --> 00:59:04,080 Now we've merged all the halves and we're done. 958 00:59:04,080 --> 00:59:07,780 >> Any questions about merge sort? Yes? 959 00:59:07,780 --> 00:59:14,190 [Student] If it's splitting into different groups, why don't they just split it once 960 00:59:14,190 --> 00:59:19,970 and you have 3 and 2 in a group? [Rest of question unintelligible] 961 00:59:19,970 --> 00:59:24,940 The reason--so the question is, why can't we just merge them at that first step after we have them? 962 00:59:24,940 --> 00:59:29,530 The reason we can do this, start at the left-most elements of both sides, 963 00:59:29,530 --> 00:59:33,040 and then take the smaller one and put it in, is that we know that these 964 00:59:33,040 --> 00:59:35,290 individual lists are in sorted orders. 965 00:59:35,290 --> 00:59:37,290 So if I'm looking at the left-most elements of both halves, 966 00:59:37,290 --> 00:59:40,490 I know they're going to be the smallest elements of those lists. 967 00:59:40,490 --> 00:59:43,930 So I can put them into the smallest element spots of this large list. 968 00:59:43,930 --> 00:59:47,810 On the other hand, if I look at those 2 lists in the second level over there, 969 00:59:47,810 --> 00:59:51,640 50, 3, 42, 1337 and 15, those aren't sorted. 970 00:59:51,640 --> 00:59:55,770 So if I look at 50 and 1337, I'm going to put 50 into my list first. 971 00:59:55,770 --> 01:00:00,130 But that doesn't really make sense, because 3 is the smallest element out of all of those. 972 01:00:00,130 --> 01:00:04,390 So the only reason we can do this combining step is because our lists are already sorted. 973 01:00:04,390 --> 01:00:07,010 Which is why we have to get down all the way to the bottom 974 01:00:07,010 --> 01:00:09,800 because when we have just a single number, you know that a single number 975 01:00:09,800 --> 01:00:14,120 in and of itself is already a sorted list. 976 01:00:14,120 --> 01:00:19,360 >> Any questions? No? 977 01:00:19,360 --> 01:00:24,260 Complexity? Well, you can see that at each step there's end numbers, 978 01:00:24,260 --> 01:00:27,590 and we can divide a list in half log n times, 979 01:00:27,590 --> 01:00:31,700 which is where we get this n x log n complexity. 980 01:00:31,700 --> 01:00:34,940 And you'll see the best case for merge sort is n log n, and it just so happens 981 01:00:34,940 --> 01:00:39,340 that the worst case, or the Ω over there, is also n log n. 982 01:00:39,340 --> 01:00:42,480 Something to keep in mind. 983 01:00:42,480 --> 01:00:45,750 Moving on, let's go on to some super basic file I/O. 984 01:00:45,750 --> 01:00:48,830 If you looked at Scramble, you'll notice we had some sort of system 985 01:00:48,830 --> 01:00:51,270 where you could write to a log file if you read through the code. 986 01:00:51,270 --> 01:00:53,730 Let's see how you might do that. 987 01:00:53,730 --> 01:00:57,450 Well, we have fprintf, which you can think of as just printf, 988 01:00:57,450 --> 01:01:01,720 but just printing to a file instead, and hence the f at the beginning. 989 01:01:01,720 --> 01:01:07,570 This sort of code up here, what it does is, as you might have seen in Scramble, 990 01:01:07,570 --> 01:01:12,310 it goes through your 2-dimensional array printing out row by row what the numbers are. 991 01:01:12,310 --> 01:01:17,850 In this case, printf prints out to your terminal or what we call the standard output of section. 992 01:01:17,850 --> 01:01:22,170 >> And now, in this case, all we have to do is replace printf with fprintf, 993 01:01:22,170 --> 01:01:26,770 tell it what file you want to print to, and in this case it just prints it out to that file 994 01:01:26,770 --> 01:01:32,230 instead of printing it out to your terminal. 995 01:01:32,230 --> 01:01:36,500 Well, then that begs the question: Where do we get this sort of file from, right? 996 01:01:36,500 --> 01:01:39,840 We passed log in to this fprintf fuction but we had no idea where it came from. 997 01:01:39,840 --> 01:01:43,980 Well, early in the code, what we had was this chunk of code over here, 998 01:01:43,980 --> 01:01:48,340 which basically says that open the file calls log.txt. 999 01:01:48,340 --> 01:01:53,220 What we do after that is we have to make sure that the file is actually opened successfully. 1000 01:01:53,220 --> 01:01:57,070 So it might fail for multiple reasons; you don't have enough space on your computer, for example. 1001 01:01:57,070 --> 01:01:59,790 So it's always important before you do any operations with the file 1002 01:01:59,790 --> 01:02:03,300 that we check whether that file was opened successfully. 1003 01:02:03,300 --> 01:02:09,330 So what that a, that's an argument to fopen, well, we can open a file in many ways. 1004 01:02:09,330 --> 01:02:13,510 What we can do is, we can pass it w, which means override the file if it exits already, 1005 01:02:13,510 --> 01:02:18,070 We can pass an a, which they append to the end of the file instead of overriding it, 1006 01:02:18,070 --> 01:02:22,730 or we can specify r, which means, let's open the file as read-only. 1007 01:02:22,730 --> 01:02:24,890 So if the program tries to make any changes to the file, 1008 01:02:24,890 --> 01:02:30,140 yell at them and don't let them do it. 1009 01:02:30,140 --> 01:02:33,320 Finally, once we're done with the file, done doing operations on it, 1010 01:02:33,320 --> 01:02:35,860 we need to make sure we close the file. 1011 01:02:35,860 --> 01:02:38,830 And so at the end of your program, you are going to pass them again 1012 01:02:38,830 --> 01:02:42,120 this file that you opened, and just close it. 1013 01:02:42,120 --> 01:02:44,650 So this is something important that you have to make sure you do. 1014 01:02:44,650 --> 01:02:47,180 So remember you can open a file, then you can write to the file, 1015 01:02:47,180 --> 01:02:51,270 do operations in the file, but then you have to close the file at the end. 1016 01:02:51,270 --> 01:02:53,270 >> Any questions on basic file I/O? Yes? 1017 01:02:53,270 --> 01:02:58,050 [Student question, unintelligible] 1018 01:02:58,050 --> 01:03:02,480 Right here. The question is, where does this log.txt file appear? 1019 01:03:02,480 --> 01:03:07,890 Well, if you just give it log.txt, it creates it in the same directory as the executable. 1020 01:03:07,890 --> 01:03:10,500 So if you're-- >>[Student question, unintelligible] 1021 01:03:10,500 --> 01:03:18,830 Yes. In the same folder, or in the same directory, as you call it. 1022 01:03:18,830 --> 01:03:21,400 Now memory, stack, and heap. 1023 01:03:21,400 --> 01:03:23,400 So how is memory laid out in the computer? 1024 01:03:23,400 --> 01:03:26,270 Well, you can imagine memory as sort of this block here. 1025 01:03:26,270 --> 01:03:30,260 And in memory we have what's called the heap stuck over there, and the stack that's down there. 1026 01:03:30,260 --> 01:03:34,480 And the heap grows downward and the stack grows upward. 1027 01:03:34,480 --> 01:03:38,620 So as Tommy mentioned--oh, well, and we have these other 4 segments which I'll get to in a second-- 1028 01:03:38,620 --> 01:03:42,890 As Tommy said earlier, you know how his functions call themselves and call each other? 1029 01:03:42,890 --> 01:03:44,930 They build up this sort of stack frame. 1030 01:03:44,930 --> 01:03:47,360 Well, if main calls foo, foo gets put on the stack. 1031 01:03:47,360 --> 01:03:52,430 Foo calls bar, bar get's put on the stack, and that gets put on the stack after. 1032 01:03:52,430 --> 01:03:57,040 And as they return, they each get taken off the stack. 1033 01:03:57,040 --> 01:04:00,140 What do each of these locations and memory hold? 1034 01:04:00,140 --> 01:04:03,110 Well, the top, which is the text segment, contains the program itself. 1035 01:04:03,110 --> 01:04:06,390 So the machine code, that's there, once you compile your program. 1036 01:04:06,390 --> 01:04:08,520 Next, any initialized global variables. 1037 01:04:08,520 --> 01:04:12,660 >> So you have global variables in your program, and you say like, a = 5, 1038 01:04:12,660 --> 01:04:15,260 that gets put in that segment, and right under that, 1039 01:04:15,260 --> 01:04:18,990 you have any uninitialized global data, which is just int a, 1040 01:04:18,990 --> 01:04:20,990 but you don't say it's equal to anything. 1041 01:04:20,990 --> 01:04:23,870 Realize these are global variables, so they're outside of main. 1042 01:04:23,870 --> 01:04:28,560 So this means any global variables that are declared but are not initialized. 1043 01:04:28,560 --> 01:04:32,310 So what's in the heap? Memory allocated using malloc, which we'll get to in a little bit. 1044 01:04:32,310 --> 01:04:35,990 And finally, with the stack you have any local variables 1045 01:04:35,990 --> 01:04:39,950 and any functions you might call in any of their parameters. 1046 01:04:39,950 --> 01:04:43,720 The last thing, you don't really have to know what the environment variables do, 1047 01:04:43,720 --> 01:04:46,700 but whenever you run program, there is something associated, like 1048 01:04:46,700 --> 01:04:49,550 this is the username of the person who ran the program. 1049 01:04:49,550 --> 01:04:51,550 And that's going to be sort of at the bottom. 1050 01:04:51,550 --> 01:04:54,540 In terms of memory addresses, which are hexadecimal values, 1051 01:04:54,540 --> 01:04:58,170 the values at the top start at 0, and they go all the way down to the bottom. 1052 01:04:58,170 --> 01:05:00,440 In this case, if you're on the 32-bit system, 1053 01:05:00,440 --> 01:05:05,390 the address at the bottom is going to be 0x, followed by af, because that's 32 bits, 1054 01:05:05,390 --> 01:05:10,890 which is 8 bytes, and in this case 8 bytes corresponds to 8 hexadecimal digits. 1055 01:05:10,890 --> 01:05:20,110 So down here you're going to have, like, 0xffffff, and up there you're going to have 0. 1056 01:05:20,110 --> 01:05:23,660 So what are pointers? Some of you may not have covered this in section before. 1057 01:05:23,660 --> 01:05:26,660 but we did go over it in lecture, so a pointer is just a data type 1058 01:05:26,660 --> 01:05:34,030 which stores, instead of some sort of value like 50, it stores the address of some location in memory. 1059 01:05:34,030 --> 01:05:36,020 Like that memory [unintelligible]. 1060 01:05:36,020 --> 01:05:41,120 So in this case, what we have is, we have a pointer to an integer or an int *, 1061 01:05:41,120 --> 01:05:46,210 and it contains this hexadecimal address of 0xDEADBEEF. 1062 01:05:46,210 --> 01:05:50,880 >> So what we have is, now, this pointer points at some location in memory, 1063 01:05:50,880 --> 01:05:56,020 and that's just a, the value 50 is at this memory location. 1064 01:05:56,020 --> 01:06:01,810 On some 32-bit systems, on all 32-bit systems, pointers take up 32 bits or 4 bytes. 1065 01:06:01,810 --> 01:06:06,020 But, for example, on a 64-bit system, pointers are 64 bits. 1066 01:06:06,020 --> 01:06:08,040 So that's something you'll want to keep in mind. 1067 01:06:08,040 --> 01:06:12,310 So on an end-bit system, a pointer is end bits long. 1068 01:06:12,310 --> 01:06:17,320 Pointers are sort of hard to digest without extra things, 1069 01:06:17,320 --> 01:06:20,300 so let's go through an example of dynamic memory allocation. 1070 01:06:20,300 --> 01:06:25,130 What dynamic memory allocation does for you, or what we call malloc, 1071 01:06:25,130 --> 01:06:29,280 it lets you allocate some sort of data outside of the set. 1072 01:06:29,280 --> 01:06:31,830 So this data is sort of more permanent for the duration of the program. 1073 01:06:31,830 --> 01:06:36,430 Because as you know, if you declare x inside of a function, and that function returns, 1074 01:06:36,430 --> 01:06:40,910 you no longer have access to the data that was stored in x. 1075 01:06:40,910 --> 01:06:44,420 What pointers let us do is they let us store memory or store values 1076 01:06:44,420 --> 01:06:46,840 in a different segment of memory, namely the heap. 1077 01:06:46,840 --> 01:06:49,340 Now once we return out of function, as long as we have a pointer 1078 01:06:49,340 --> 01:06:54,960 to that location in memory, then what we can do is we can just look at the values there. 1079 01:06:54,960 --> 01:06:58,020 Let's look at an example: This is our memory layout again. 1080 01:06:58,020 --> 01:07:00,050 And we have this function, main. 1081 01:07:00,050 --> 01:07:06,870 What it does is--okay, so simple, right?--int x = 5, that's just a variable on the stack in main. 1082 01:07:06,870 --> 01:07:12,450 >> On the other hand, now we declare a pointer which calls the function giveMeThreeInts. 1083 01:07:12,450 --> 01:07:16,800 And so now we go into this function and we create a new stack frame for it. 1084 01:07:16,800 --> 01:07:20,440 However, in this stack frame, we declare int * temp, 1085 01:07:20,440 --> 01:07:23,210 which in mallocs 3 integers for us. 1086 01:07:23,210 --> 01:07:25,880 So size of int will give us how many bytes this int is, 1087 01:07:25,880 --> 01:07:29,620 and malloc gives us that many bytes of space on the heap. 1088 01:07:29,620 --> 01:07:32,890 So in this case, we have created enough space for 3 integers, 1089 01:07:32,890 --> 01:07:36,830 and the heap is way up there, which is why I've drawn it higher up. 1090 01:07:36,830 --> 01:07:42,900 Once we're done, we come back up here, you only need 3 ints returned, 1091 01:07:42,900 --> 01:07:47,000 and it returns the address, in this case over where that memory is. 1092 01:07:47,000 --> 01:07:51,250 And we set pointer = switch, and up there we have just another pointer. 1093 01:07:51,250 --> 01:07:54,550 But what that function returns is stacked here and disappears. 1094 01:07:54,550 --> 01:07:59,250 So temp disappears, but we still maintain the address of where 1095 01:07:59,250 --> 01:08:01,850 those 3 integers are inside of mains. 1096 01:08:01,850 --> 01:08:06,180 So in this set, the pointers are scoped locally for the stacked frame, 1097 01:08:06,180 --> 01:08:09,860 but the memory to which they refer is in the heap. 1098 01:08:09,860 --> 01:08:12,190 >> Does that make sense? 1099 01:08:12,190 --> 01:08:14,960 [Student] Could you repeat that? >>[Joseph] Yes. 1100 01:08:14,960 --> 01:08:20,270 So if I go back just a little bit, you see that temp allocated 1101 01:08:20,270 --> 01:08:23,500 some memory on the heap up there. 1102 01:08:23,500 --> 01:08:28,680 So when this function, giveMeThreeInts returns, this stack here is going to disappear. 1103 01:08:28,680 --> 01:08:35,819 And with it any of the variables, in this case, this pointer that was allocated in stacked frame. 1104 01:08:35,819 --> 01:08:39,649 That is going to disappear, but since we returned temp 1105 01:08:39,649 --> 01:08:46,330 and we set pointer = temp, pointer's now going to point the same memory of location as temp was. 1106 01:08:46,330 --> 01:08:50,370 So now, even though we lose temp, that local pointer, 1107 01:08:50,370 --> 01:08:59,109 we still retain the memory address of what it was pointing to inside of that variable pointer. 1108 01:08:59,109 --> 01:09:03,740 Questions? That can be kind of a confusing topic if you haven't gone over it in section. 1109 01:09:03,740 --> 01:09:09,240 We can, your TF will definitely go over it and of course we can answer questions 1110 01:09:09,240 --> 01:09:11,500 at the end of the review session for this. 1111 01:09:11,500 --> 01:09:14,220 But this is sort of a complex topic, and I have more examples that are going to show up 1112 01:09:14,220 --> 01:09:18,790 that will help clarify what pointers actually are. 1113 01:09:18,790 --> 01:09:22,500 >> In this case, pointers are equivalent to arrays, 1114 01:09:22,500 --> 01:09:25,229 so I can just use this pointer as the same thing as an int array. 1115 01:09:25,229 --> 01:09:29,840 So I'm indexing into 0, and changing the first integer to 1, 1116 01:09:29,840 --> 01:09:39,689 changing the second integer to 2, and the 3rd integer to 3. 1117 01:09:39,689 --> 01:09:44,210 So more on pointers. Well, recall Binky. 1118 01:09:44,210 --> 01:09:48,319 In this case we've allocated a pointer, or we declared a pointer, 1119 01:09:48,319 --> 01:09:52,760 but initially, when I just declared a pointer, it's not pointing to anywhere in memory. 1120 01:09:52,760 --> 01:09:54,930 It's just garbage values inside of it. 1121 01:09:54,930 --> 01:09:56,470 So I have no idea where this pointer is pointing to. 1122 01:09:56,470 --> 01:10:01,630 It has an address which is just filled with 0's and 1's where it was initially declared. 1123 01:10:01,630 --> 01:10:04,810 I can't do anything with this until I call malloc on it 1124 01:10:04,810 --> 01:10:08,390 and then it gives me a little space on the heap where I can put values inside. 1125 01:10:08,390 --> 01:10:11,980 Then again, I don't know what's inside of this memory. 1126 01:10:11,980 --> 01:10:16,780 So the first thing I have to do is check whether the system had enough memory 1127 01:10:16,780 --> 01:10:20,850 to give me back 1 integer in the first place, which is why I'm doing this check. 1128 01:10:20,850 --> 01:10:25,020 If pointer is null, that means that it didn't have enough space or some other error occurred, 1129 01:10:25,020 --> 01:10:26,320 so I should exit out of my program. 1130 01:10:26,320 --> 01:10:29,400 But if it did succeed, now I can use that pointer 1131 01:10:29,400 --> 01:10:35,020 and what * pointer does is it follows where the address is 1132 01:10:35,020 --> 01:10:38,480 to where that value is, and it sets it equal to 1. 1133 01:10:38,480 --> 01:10:41,850 So over here, we're checking if that memory existed. 1134 01:10:41,850 --> 01:10:45,380 >> Once you know it exists, you can put into it 1135 01:10:45,380 --> 01:10:50,460 what value you want to put into it; in this case 1. 1136 01:10:50,460 --> 01:10:53,060 Once we're done with it, you need to free that pointer 1137 01:10:53,060 --> 01:10:57,160 because we need to get back to the system that memory that you asked for in the first place. 1138 01:10:57,160 --> 01:10:59,690 Because the computer doesn't know when we're done with it. 1139 01:10:59,690 --> 01:11:02,510 In this case we're explicitly telling it, okay, we're done with that memory. 1140 01:11:02,510 --> 01:11:10,780 If some other process needs it, some other program needs it, feel free to go ahead and take it. 1141 01:11:10,780 --> 01:11:15,110 What we can also do is we can just get the address of local variables on the set. 1142 01:11:15,110 --> 01:11:19,080 So int x is inside the stacked frame of main. 1143 01:11:19,080 --> 01:11:23,060 And when we use this ampersand, this and operator, what it does is 1144 01:11:23,060 --> 01:11:27,310 it takes x, and x is just some data in memory, but it has an address. 1145 01:11:27,310 --> 01:11:33,790 It's located somewhere. So by calling & x, what this does is it gives us the address of x. 1146 01:11:33,790 --> 01:11:38,430 By doing this, we're making pointer point to where x is in memory. 1147 01:11:38,430 --> 01:11:41,710 Now we just do something like *x, we're going to get 5 back. 1148 01:11:41,710 --> 01:11:43,820 The star is called dereferencing it. 1149 01:11:43,820 --> 01:11:46,640 You follow the address and you get the value of it stored there. 1150 01:11:51,000 --> 01:11:53,310 >> Any questions? Yes? 1151 01:11:53,310 --> 01:11:56,500 [Student] If you don't do the 3-pointed thing, does it still compile? 1152 01:11:56,500 --> 01:11:59,490 Yes. If you don't do the 3-pointer thing, it's still going to compile, 1153 01:11:59,490 --> 01:12:02,720 but I'll show you what happens in a second, and without doing that, 1154 01:12:02,720 --> 01:12:04,860 that's what we call a memory leak. You're not giving the system 1155 01:12:04,860 --> 01:12:07,850 back its memory, so after a while the program is going to accumulate 1156 01:12:07,850 --> 01:12:10,940 memory that it's not using, and nothing else can use it. 1157 01:12:10,940 --> 01:12:15,750 If you've ever seen Firefox with 1.5 million kilobytes on your computer, 1158 01:12:15,750 --> 01:12:17,840 in the task manager, that's what's going on. 1159 01:12:17,840 --> 01:12:20,760 You have a memory leak in the program that they're not handling. 1160 01:12:23,080 --> 01:12:26,240 So how does pointer arithmetic work? 1161 01:12:26,240 --> 01:12:29,480 Well, pointer arithmetic is sort of like indexing into an array. 1162 01:12:29,480 --> 01:12:36,370 In this case, I have a pointer, and what I do is I make pointer point to the first element 1163 01:12:36,370 --> 01:12:42,100 of this array of 3 integers that I've allocated. 1164 01:12:42,100 --> 01:12:46,670 So now what I do, star pointer just changes the first element in the list. 1165 01:12:46,670 --> 01:12:49,140 Star pointer +1 points over here. 1166 01:12:49,140 --> 01:12:53,140 So pointer is over here, pointer +1 is over here, pointer +2 is over here. 1167 01:12:53,140 --> 01:12:56,610 >> So just adding 1 is the same thing as moving along this array. 1168 01:12:56,610 --> 01:12:59,880 What we do is, when we do pointer +1 you get the address over here, 1169 01:12:59,880 --> 01:13:04,180 and in order to get the value in here, you put a star in from the entire expression 1170 01:13:04,180 --> 01:13:05,990 to dereference it. 1171 01:13:05,990 --> 01:13:09,940 So, in this case, I'm setting the first location in this array to 1, 1172 01:13:09,940 --> 01:13:13,970 second location to 2, and third location to 3. 1173 01:13:13,970 --> 01:13:18,180 Then what I'm doing over here is I'm printing our pointer +1, 1174 01:13:18,180 --> 01:13:19,970 which just gives me 2. 1175 01:13:19,970 --> 01:13:23,650 Now I'm incrementing pointer, so pointer equals pointer +1, 1176 01:13:23,650 --> 01:13:26,780 which moves it forward. 1177 01:13:26,780 --> 01:13:30,810 And so now if I print out pointer +1, pointer +1 is now 3, 1178 01:13:30,810 --> 01:13:33,990 which in this case prints out 3. 1179 01:13:33,990 --> 01:13:36,560 And in order to free something, the pointer that I give it 1180 01:13:36,560 --> 01:13:40,540 must be pointing at the beginning of the array which I got back from malloc. 1181 01:13:40,540 --> 01:13:43,430 So, in this case, if I were to call 3 right here, this wouldn't be right, 1182 01:13:43,430 --> 01:13:45,070 because it's in the middle of the array. 1183 01:13:45,070 --> 01:13:48,820 I have to subtract to get to the original location 1184 01:13:48,820 --> 01:13:50,420 the initial first spot before I can free it. 1185 01:13:56,300 --> 01:13:58,450 So, here's a more involved example. 1186 01:13:58,450 --> 01:14:03,360 In this case, we're allocating 7 characters in a character array. 1187 01:14:03,360 --> 01:14:06,480 >> And in this case what we're doing is we're looping over the first 6 of them, 1188 01:14:06,480 --> 01:14:09,900 and we're setting them to Z. 1189 01:14:09,900 --> 01:14:13,350 So, for int i = 0, i > 6, i ++, 1190 01:14:13,350 --> 01:14:16,220 So, pointer +i will just give us, in this case, 1191 01:14:16,220 --> 01:14:20,860 pointer, pointer +1, pointer +2, pointer +3, and so on and so forth in the loop. 1192 01:14:20,860 --> 01:14:24,040 What it's going to do is it gets that address, dereferences it to get the value, 1193 01:14:24,040 --> 01:14:27,440 and changes that value to a Z. 1194 01:14:27,440 --> 01:14:30,350 Then at the end remember this is a string, right? 1195 01:14:30,350 --> 01:14:33,560 All strings have to end with the null terminating character. 1196 01:14:33,560 --> 01:14:38,620 So, what I do is in pointer 6 I put the null terminator character in. 1197 01:14:38,620 --> 01:14:43,980 And now what I'm basically doing over here is implementing printf for a string, right? 1198 01:14:43,980 --> 01:14:46,190 >> So, when does printf now when it's reached the end of a string? 1199 01:14:46,190 --> 01:14:48,230 When it hits the null terminating character. 1200 01:14:48,230 --> 01:14:52,030 So, in this case, my original pointer points to the beginning of this array. 1201 01:14:52,030 --> 01:14:56,410 I print the first character out. I move it over one. 1202 01:14:56,410 --> 01:14:58,420 I print that character out. I move it over. 1203 01:14:58,420 --> 01:15:02,180 And I keep doing this until I reach the end. 1204 01:15:02,180 --> 01:15:07,750 And now the end *pointer will dereference this and get the null terminating character back. 1205 01:15:07,750 --> 01:15:11,780 And so my while loop runs only when that value is not the null terminating character. 1206 01:15:11,780 --> 01:15:13,770 So, now I exit out of this loop. 1207 01:15:18,780 --> 01:15:21,180 And so if I subtract 6 from this pointer, 1208 01:15:21,180 --> 01:15:22,860 I go back all the way to the beginning. 1209 01:15:22,860 --> 01:15:27,880 Remember, I'm doing this because I have to go to the beginning in order to free it. 1210 01:15:27,880 --> 01:15:30,270 >> So, I know that was a lot. Are there any questions? 1211 01:15:30,270 --> 01:15:31,870 Please, yes? 1212 01:15:31,870 --> 01:15:36,610 [Student question unintelligible] 1213 01:15:36,610 --> 01:15:38,190 Can you say that louder? Sorry. 1214 01:15:38,190 --> 01:15:44,140 [Student] On the last slide right before you freed the pointer, 1215 01:15:44,140 --> 01:15:47,300 where were you actually changing the value of the pointer? 1216 01:15:47,300 --> 01:15:50,370 [Joseph] So, right here. >>[Student] Oh, okay. 1217 01:15:50,370 --> 01:15:51,890 [Joseph] So, I have a pointer minus minus, right, 1218 01:15:51,890 --> 01:15:54,140 which moves the thing back one, and then I free it, 1219 01:15:54,140 --> 01:15:57,000 because this pointer has to be pointed to the beginning of the array. 1220 01:15:57,000 --> 01:16:00,420 [Student] But that wouldn't be needed had you stopped after that line. 1221 01:16:00,420 --> 01:16:03,130 [Joseph] So, if I had stopped after this, this would be considered a memory leak, 1222 01:16:03,130 --> 01:16:04,810 because I didn't run the free. 1223 01:16:04,810 --> 01:16:11,290 [Student] I [unintelligible] after the first three lines where you had pointer +1 [unintelligible]. 1224 01:16:11,290 --> 01:16:13,140 [Joseph] Uh-huh. So, what's the question there? 1225 01:16:13,140 --> 01:16:14,780 Sorry. No, no. Go, go, please. 1226 01:16:14,780 --> 01:16:16,870 [Student] So, you're not changing the value of pointers. 1227 01:16:16,870 --> 01:16:19,130 You wouldn't have had to do pointer minus minus. 1228 01:16:19,130 --> 01:16:19,730 [Joseph] Yes, exactly. 1229 01:16:19,730 --> 01:16:21,890 So, when I do pointer +1 and pointer +2, 1230 01:16:21,890 --> 01:16:24,410 I'm not doing pointer equals pointer +1. 1231 01:16:24,410 --> 01:16:27,260 So, the pointer just stays pointing at the beginning of the array. 1232 01:16:27,260 --> 01:16:31,460 It's only when I do plus plus that it sets the value back inside the pointer, 1233 01:16:31,460 --> 01:16:33,550 that it actually moves this along. 1234 01:16:36,860 --> 01:16:37,780 All right. 1235 01:16:40,550 --> 01:16:42,030 More questions? 1236 01:16:44,680 --> 01:16:47,790 >> Again, if this is sort of overwhelming, this will be covered in session. 1237 01:16:47,790 --> 01:16:50,710 Ask your teaching fellow about it, and we can answer questions at the end. 1238 01:16:53,510 --> 01:16:56,600 And usually we don't like to do this minus thing. 1239 01:16:56,600 --> 01:16:59,760 This has to require me keeping track of how much I've offset in the array. 1240 01:16:59,760 --> 01:17:04,520 So, in general, this is just to explain how pointer arithmetic works. 1241 01:17:04,520 --> 01:17:07,970 But what we usually like to do is we like to create a copy of the pointer, 1242 01:17:07,970 --> 01:17:11,640 and then we'll use that copy when we're moving around in the string. 1243 01:17:11,640 --> 01:17:14,660 So, in these case you use the copy to print the entire string, 1244 01:17:14,660 --> 01:17:19,040 but we don't have to do like pointer minus 6 or keep track of how much we moved in this, 1245 01:17:19,040 --> 01:17:22,700 just because we know that our original point is still pointed to the beginning of the list 1246 01:17:22,700 --> 01:17:25,340 and all that we altered was this copy. 1247 01:17:25,340 --> 01:17:28,250 So, in general, alter copies of your original pointer. 1248 01:17:28,250 --> 01:17:32,350 Don't try to sort of like--don't alter original copies. 1249 01:17:32,350 --> 01:17:35,290 Trying to alter only copies of your original. 1250 01:17:41,540 --> 01:17:44,870 So, you notice when we pass the string into printf 1251 01:17:44,870 --> 01:17:48,990 you don't have to put a star in front of it like we did with all the other dereferences, right? 1252 01:17:48,990 --> 01:17:54,180 So, if you print out the entire string %s expects is an address, 1253 01:17:54,180 --> 01:17:57,610 and in this case a pointer or in this case like an array of characters. 1254 01:17:57,610 --> 01:18:00,330 >> Characters, char*s, and arrays are the same thing. 1255 01:18:00,330 --> 01:18:03,690 Pointer is to characters, and character arrays are the same thing. 1256 01:18:03,690 --> 01:18:05,720 And so, all we have to do is pass in pointer. 1257 01:18:05,720 --> 01:18:08,150 We don't have to pass in like *pointer or anything like that. 1258 01:18:13,110 --> 01:18:14,930 So, arrays and pointers are the same thing. 1259 01:18:14,930 --> 01:18:19,160 When you're doing something like x [ y ] over here for an array, 1260 01:18:19,160 --> 01:18:21,960 what it's doing under the hood is it's saying, okay, it's a character array, 1261 01:18:21,960 --> 01:18:23,690 so it's a pointer. 1262 01:18:23,690 --> 01:18:26,510 And so x are the same thing, 1263 01:18:26,510 --> 01:18:28,650 and so what it does is it adds y to x, 1264 01:18:28,650 --> 01:18:31,820 which is the same thing as moving forward in memory that much. 1265 01:18:31,820 --> 01:18:34,930 And now x + y gives us some sort of address, 1266 01:18:34,930 --> 01:18:37,570 and we dereference the address or follow the arrow 1267 01:18:37,570 --> 01:18:41,640 to where that location in memory is and we get the value out of that location in memory. 1268 01:18:41,640 --> 01:18:43,720 So, so these two are exactly the same thing. 1269 01:18:43,720 --> 01:18:45,840 It's just a syntactic sugar. 1270 01:18:45,840 --> 01:18:48,090 They do the same thing. They're just different syntactics for each other. 1271 01:18:51,500 --> 01:18:57,590 >> So, what can go wrong with pointers? Like, a lot. Okay. So, bad things. 1272 01:18:57,590 --> 01:19:02,410 Some bad things you can do are not checking if your malloc call returns null, right? 1273 01:19:02,410 --> 01:19:06,560 In this case, I'm asking the system to give me--what is that number? 1274 01:19:06,560 --> 01:19:11,200 Like 2 billion times 4, because the size of an integer is 4 bytes. 1275 01:19:11,200 --> 01:19:13,810 I'm asking it for like 8 billion bytes. 1276 01:19:13,810 --> 01:19:17,270 Of course my computer is not going to be able to give me that much memory back. 1277 01:19:17,270 --> 01:19:20,960 And we didn't check if this is null, so when we try to dereference it over there-- 1278 01:19:20,960 --> 01:19:24,270 follow the arrow to where it's going to--we don't have that memory. 1279 01:19:24,270 --> 01:19:27,150 This is what we call dereferencing a null pointer. 1280 01:19:27,150 --> 01:19:29,710 And this essentially causes you to segfault. 1281 01:19:29,710 --> 01:19:31,790 This is one of the ways you can segfault. 1282 01:19:34,090 --> 01:19:38,090 Other bad things you can do--oh well. 1283 01:19:38,090 --> 01:19:40,650 That was dereferencing a null pointer. Okay. 1284 01:19:40,650 --> 01:19:45,160 Other bad things--well, to fix that you just put a check in there 1285 01:19:45,160 --> 01:19:46,980 that checks whether the pointer is null 1286 01:19:46,980 --> 01:19:51,000 and exit out of the program if it happens that malloc returns a null pointer. 1287 01:19:55,110 --> 01:19:59,850 That's the xkcd comic. People understand it now. Sort of. 1288 01:20:06,120 --> 01:20:09,350 >> So, memory. And I went over this. 1289 01:20:09,350 --> 01:20:12,000 We're calling malloc in a loop, but every time we call malloc 1290 01:20:12,000 --> 01:20:14,370 we're losing track of where this pointer is pointing to, 1291 01:20:14,370 --> 01:20:15,750 because we're clobbering it. 1292 01:20:15,750 --> 01:20:18,410 So, the initial call to malloc gives me memory over here. 1293 01:20:18,410 --> 01:20:19,990 My pointer pointers to this. 1294 01:20:19,990 --> 01:20:23,020 Now, I don't free it, so now I call malloc again. 1295 01:20:23,020 --> 01:20:26,070 Now it points over here. Now my memory is pointing over here. 1296 01:20:26,070 --> 01:20:27,640 Pointing over here. Pointing over here. 1297 01:20:27,640 --> 01:20:31,820 But I've lost track of the addresses of all the memory over here that I allocated. 1298 01:20:31,820 --> 01:20:35,100 And so now I don't have any reference to them anymore. 1299 01:20:35,100 --> 01:20:37,230 So, I can't free them outside of this loop. 1300 01:20:37,230 --> 01:20:39,390 And so in order to fix something like this, 1301 01:20:39,390 --> 01:20:42,250 if you forget to free memory and you get this memory leak, 1302 01:20:42,250 --> 01:20:45,810 You have to free the memory inside of this loop once you're done with it. 1303 01:20:45,810 --> 01:20:51,400 Well, this is what happens. I know lots of you hate this. 1304 01:20:51,400 --> 01:20:55,270 But now--yay! You get like 44,000 kilobytes. 1305 01:20:55,270 --> 01:20:57,110 So, you free it at the end of the loop, 1306 01:20:57,110 --> 01:20:59,770 and that's going to just free the memory every time. 1307 01:20:59,770 --> 01:21:03,620 Essentially, your program doesn't have a memory leak anymore. 1308 01:21:03,620 --> 01:21:08,150 >> And now something else you can do is free some memory that you've asked for twice. 1309 01:21:08,150 --> 01:21:11,060 In this case, you malloc something, you change its value. 1310 01:21:11,060 --> 01:21:13,140 You free it once because you said you were done with it. 1311 01:21:13,140 --> 01:21:14,940 But then we freed it again. 1312 01:21:14,940 --> 01:21:16,730 This is something that's pretty bad. 1313 01:21:16,730 --> 01:21:18,820 It's not going to initially segfault, 1314 01:21:18,820 --> 01:21:23,350 but after a while what this does is double freeing this corrupts your heap structure, 1315 01:21:23,350 --> 01:21:27,200 and you'll learn a little bit more about this if you choose to take a class like CS61. 1316 01:21:27,200 --> 01:21:30,000 But essentially after a while your computer is going to get confused 1317 01:21:30,000 --> 01:21:33,010 about what memory locations are where and where it's stored-- 1318 01:21:33,010 --> 01:21:34,800 where data is stored in memory. 1319 01:21:34,800 --> 01:21:38,080 And so freeing a pointer twice is a bad thing that you don't want to do. 1320 01:21:38,080 --> 01:21:41,600 >> Other things that can go wrong is not using sizeof. 1321 01:21:41,600 --> 01:21:44,460 So, in this case you malloc 8 bytes, 1322 01:21:44,460 --> 01:21:46,700 and that's the same thing as two integers, right? 1323 01:21:46,700 --> 01:21:49,580 So, that's perfectly safe, but is it? 1324 01:21:49,580 --> 01:21:52,160 Well, as Lucas talked about on different architectures, 1325 01:21:52,160 --> 01:21:54,220 integers are of different lengths. 1326 01:21:54,220 --> 01:21:57,970 So, on the appliance that you're using, integers are about 4 bytes, 1327 01:21:57,970 --> 01:22:02,370 but on some other system they might be 8 bytes or they might be 16 bytes. 1328 01:22:02,370 --> 01:22:05,680 So, if I just use this number over here, 1329 01:22:05,680 --> 01:22:07,310 this program might work on the appliance, 1330 01:22:07,310 --> 01:22:10,360 but it's not going to allocate enough memory on some other system. 1331 01:22:10,360 --> 01:22:14,020 In this case, this is what the sizeof operator is used for. 1332 01:22:14,020 --> 01:22:16,880 When we call sizeof(int), what this does is 1333 01:22:16,880 --> 01:22:21,910 it gives us the size of an integer on the system that the program is running. 1334 01:22:21,910 --> 01:22:25,490 So, in this case, sizeof(int) will return 4 on something like the appliance, 1335 01:22:25,490 --> 01:22:29,980 and now this will 4 * 2, which is 8, 1336 01:22:29,980 --> 01:22:32,330 which is just the amount of space necessary for two integers. 1337 01:22:32,330 --> 01:22:36,710 On a different system, if an int is like 16 bytes or 8 bytes, 1338 01:22:36,710 --> 01:22:39,380 it's just going to return enough bytes to store that amount. 1339 01:22:41,830 --> 01:22:45,310 >> And finally, structs. 1340 01:22:45,310 --> 01:22:48,340 So, if you wanted to store a sudoku board in memory, how might we do this? 1341 01:22:48,340 --> 01:22:51,570 You might think of like a variable for the first thing, 1342 01:22:51,570 --> 01:22:53,820 a variable for the second thing, a variable for the third thing, 1343 01:22:53,820 --> 01:22:56,420 a variable for the fourth thing--bad, right? 1344 01:22:56,420 --> 01:23:00,750 So, one improvement you can make on top of this is to make a 9 x 9 array. 1345 01:23:00,750 --> 01:23:04,480 That's fine, but what if you wanted to associate other things with the sudoku board 1346 01:23:04,480 --> 01:23:06,490 like what the difficulty of the board is, 1347 01:23:06,490 --> 01:23:11,740 or, for example, what your score is, or how much time it's taken you to solve this board? 1348 01:23:11,740 --> 01:23:14,970 Well, what you can do is you can create a struct. 1349 01:23:14,970 --> 01:23:18,910 What I'm basically saying is I'm defining this structure over here, 1350 01:23:18,910 --> 01:23:23,230 and I'm defining a sudoku board which consists of a board that is 9 x 9. 1351 01:23:23,230 --> 01:23:26,650 >> And what it has it has pointers to the name of the level. 1352 01:23:26,650 --> 01:23:30,730 It also has x and y, which are the coordinates of where I am right now. 1353 01:23:30,730 --> 01:23:35,980 It also has time spent [unintelligible], and it has the total number of moves I've inputted so far. 1354 01:23:35,980 --> 01:23:40,010 And so in this case, I can group a whole bunch of data into just one structure 1355 01:23:40,010 --> 01:23:42,790 instead of having it like flying around in like different variables 1356 01:23:42,790 --> 01:23:44,540 that I can't really keep track of. 1357 01:23:44,540 --> 01:23:49,720 And this lets us have just nice syntax for sort of referencing different things inside of this struct. 1358 01:23:49,720 --> 01:23:53,430 I can just do board.board, and I get the sudoku board back. 1359 01:23:53,430 --> 01:23:56,320 Board.level, I get how tough it is. 1360 01:23:56,320 --> 01:24:00,540 Board.x and board.y give me the coordinates of where I might be in the board. 1361 01:24:00,540 --> 01:24:04,730 And so I'm accessing what we call fields in the struct. 1362 01:24:04,730 --> 01:24:08,840 This defines sudokuBoard, which is a type that I have. 1363 01:24:08,840 --> 01:24:14,800 And now we're here. I have a variable called "board" of type sudokuBoard. 1364 01:24:14,800 --> 01:24:18,820 And so now I can access all the fields that make up this structure over here. 1365 01:24:20,830 --> 01:24:22,450 >> Any questions about structs? Yes? 1366 01:24:22,450 --> 01:24:25,890 [Student] For int x, y, you declared both on one line? >>[Joseph] Uh-huh. 1367 01:24:25,890 --> 01:24:27,400 [Student] So, could you just do that with all of them? 1368 01:24:27,400 --> 01:24:31,200 Like in x, y comma times that total? 1369 01:24:31,200 --> 01:24:34,460 [Joseph] Yes, you could definitely do that, but the reason I put x and y on the same line-- 1370 01:24:34,460 --> 01:24:36,330 and the question is why can we just do this on the same line? 1371 01:24:36,330 --> 01:24:38,600 Why don't we just put all of these on the same line is 1372 01:24:38,600 --> 01:24:42,090 x and y are related to each other, 1373 01:24:42,090 --> 01:24:44,780 and this is just stylistically more correct, in a sense, 1374 01:24:44,780 --> 01:24:46,600 because it's grouping two things on the same line 1375 01:24:46,600 --> 01:24:49,340 that like sort of relate to the same thing. 1376 01:24:49,340 --> 01:24:51,440 And I just split these apart. It's just a style thing. 1377 01:24:51,440 --> 01:24:53,720 It functionally makes no difference whatsoever. 1378 01:24:58,150 --> 01:24:59,270 Any other questions on structs? 1379 01:25:03,030 --> 01:25:06,620 You can define a Pokédex with a struct. 1380 01:25:06,620 --> 01:25:11,720 A Pokémon has a number and it has a letter, an owner, a type. 1381 01:25:11,720 --> 01:25:16,990 And then if you have an array of Pokémon , you can make up a Pokédex, right? 1382 01:25:16,990 --> 01:25:20,810 Okay, cool. So, questions on structs. Those are related to structs. 1383 01:25:20,810 --> 01:25:25,270 >> Finally, GDB. What does GDB let you do? It lets you debug your program. 1384 01:25:25,270 --> 01:25:27,650 And if you haven't used GDB, I would recommended watching the short 1385 01:25:27,650 --> 01:25:31,250 and just going over what GDB is, how you work with it, how you might use it, 1386 01:25:31,250 --> 01:25:32,900 and test it on a program. 1387 01:25:32,900 --> 01:25:37,400 And so what GDB lets you do is it lets pause the [unintelligible] up your program 1388 01:25:37,400 --> 01:25:38,920 and a practical line. 1389 01:25:38,920 --> 01:25:42,600 For example, I want to pause execution at like line 3 of my program, 1390 01:25:42,600 --> 01:25:46,010 and while I'm at line 3 I can print out all the values that are there. 1391 01:25:46,010 --> 01:25:49,710 And so what we call like pausing in a line 1392 01:25:49,710 --> 01:25:52,350 is we call this putting a breakpoint at that line 1393 01:25:52,350 --> 01:25:55,920 and then we can print out the variables at the state of the program at that time. 1394 01:25:55,920 --> 01:25:58,990 >> We can then from there step through the program line-by-line. 1395 01:25:58,990 --> 01:26:03,200 And then we can look at the state of the stack at the time. 1396 01:26:03,200 --> 01:26:08,600 And so in order to use GDB, what we do is we call clang on the C file, 1397 01:26:08,600 --> 01:26:11,290 but we have to pass it the -ggdb flag. 1398 01:26:11,290 --> 01:26:15,850 And once we're done with that we just run gdb on the resulting output file. 1399 01:26:15,850 --> 01:26:18,810 And so you get some like mass of text like this, 1400 01:26:18,810 --> 01:26:21,990 but really all you have to do is type in commands at the beginning. 1401 01:26:21,990 --> 01:26:24,250 Break main puts a breakpoint at main. 1402 01:26:24,250 --> 01:26:28,470 List 400 lists the lines of code around line 400. 1403 01:26:28,470 --> 01:26:31,410 And so in this case you can just look around and say, oh, 1404 01:26:31,410 --> 01:26:34,360 I want to set a breakpoint at line 397, which is this line, 1405 01:26:34,360 --> 01:26:37,170 and then your program runs into that step and it's going to break. 1406 01:26:37,170 --> 01:26:41,120 It's going to pause there, and you can print out, for example, value of low or high. 1407 01:26:41,120 --> 01:26:46,410 And so there are a bunch of commands you need to know, 1408 01:26:46,410 --> 01:26:48,660 and this slideshow will go up on the website, 1409 01:26:48,660 --> 01:26:54,000 so if you just want to reference these or like put them on your cheat sheets, feel free. 1410 01:26:54,000 --> 01:27:00,650 >> Cool. That was Quiz Review 0, and we'll stick around if you have any questions. 1411 01:27:00,650 --> 01:27:03,850 All right. 1412 01:27:03,850 --> 01:27:09,030 >> [applause] 1413 01:27:09,030 --> 01:27:13,000 >> [CS50.TV]