[MUSIC PLAYING] 

DOUG LLOYD: OK. We've worked with integers, we've worked with characters, we've worked with floats, doubles, strings, and bools. We've exhausted pretty much all of the [INAUDIBLE] types that have been available to us all along. But now we want to do something more. How do we do that? How do we create different data types? We can do so by using structures. So structures allow us to unify variables of different types into a single, new variable type, which we can assign its own type name. This is a really strong thing to be able to do, because we can now group elements of different data types together that have a logical connection. We've been able to do this with arrays sort of, right? We can group variables of the same data type together in a large unit of memory, an array. 

But we haven't been able to mix up different data types together. We can't, say, pair an integer, and a character, and a double all in the same thing and call that a single unit. But with structures, frequently referred to as structs, we actually can. So a structure is sort of like a super variable. It's a variable that contains other variables inside of it. So here's an example of a very simple structure. This is what the syntax would look like to create a structure for a car. Now, let's go through the syntax here. Struct, that's the keyword that indicates that I'm creating a new data type here. In particular, the data type's name is going to be struct car, as we'll see. But this is the sort of tip off to the compiler that this is a group of variables that is going to be considered part of the same type in a minute. 

Car is just the name of the structure. Again, the data type here is going to be struct car, not just car. But if you have different-- if you create multiple structs in the same program, you need to distinguish between struct and struct. So struct car, I might also have struct student, for example, in the same program. Inside of the curly braces are all of the so-called fields, or members of the structure. So what are some of the things that are inherent in a car? Well, it usually has a year, has a model name, and a license plate, an odometer that usually has some number of miles on it, and maybe an engine size. And as you can see, I'm mixing up integers and characters and doubles. They're all going to be part of this new data type. 

Lastly, the final thing I need to do, don't forget this little semicolon at the end. After we finish defining the structure, we need to put a semicolon at the end. It's a very common syntactical mistake, because with a function, for example, you would just have open curly brace, close curly brace. You don't put a semicolon at the end of a function definition. This looks like a function definition, but it's not, and so the semicolon there is just a reminder that you need to put it there, because the compiler will otherwise not know what to do with it. It's a very common error to accidentally make when you're first defining structures. 
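The definition described above might look something like the following. This is a minimal sketch: the field names, types, and array sizes are assumptions chosen to match the spoken description (a year, a model name, a license plate, an odometer, an engine size), and struct student with its fields is a hypothetical second type included only to show two structs living in the same program.

```c
#include <stdio.h>

// A new data type named "struct car".
// Field names, types, and sizes are assumptions for illustration.
struct car
{
    int year;            // model year
    char model[10];      // model name (up to 9 characters plus '\0')
    char plate[7];       // license plate (up to 6 characters plus '\0')
    int odometer;        // miles driven
    double engine_size;  // engine size, e.g. in liters
};  // <- the semicolon that is easy to forget

// A second, unrelated structure (hypothetical) in the same program.
// Its type name is "struct student", which keeps it distinct from "struct car".
struct student
{
    char name[32];
    int id;
};
```

Leaving off the semicolon after the closing curly brace typically produces a compiler error that points at whatever line comes next, which is part of why the mistake can be confusing to track down.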

OK. So we usually define our structures at the very top of our programs because they're probably going to be used by multiple functions. We don't want to define a struct inside of a function, because then we can only-- the scope of the structure really only exists inside of that function. We'd probably want to define a structure so we can use it in multiple functions, or perhaps in multiple files that are tied together to create our single program. Sometimes also instead of defining the structure at the very top where you put your pound includes and your pound defines, for example, you might put them in separate dot h files, which you then pound include yourself. 
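As a rough sketch of the separate dot h file idea, a header might look like this. The file name cars.h, the guard macro CARS_H, and the field layout are all assumptions made up for illustration.

```c
// cars.h (hypothetical file name)
// The include guard keeps the definition from being seen twice
// if several files pound include this header.
#ifndef CARS_H
#define CARS_H

// Defined outside of any function, so every function in every file
// that includes this header can use the type "struct car".
struct car
{
    int year;
    char model[10];
    char plate[7];
    int odometer;
    double engine_size;
};

#endif // CARS_H
```

Any dot c file that needs the type would then start with a line like #include "cars.h", using double quotes rather than angle brackets because the header is one of your own files.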

So we have structures, but now we need to get inside of them. How do we get inside of a structure to access those sub-variables, those variables that exist inside the structure? Well, we have something called the dot operator, which allows us to access the fields of the structure. So for example, let's say I've declared my structure data type somewhere at the top of my program, or perhaps in a dot h file that I've pound included. If I then want to create a new variable of that data type, I can say, struct car, my car, semicolon. Just like I could say int x, or string name semicolon. 

The data type here is struct car, the name of the variable is my car, and then I can use the dot operator to access the various fields of my car. So I can say my car dot year equals 2011. That's perfectly fine. Year, if you recall, was defined as an integer field inside of this struct car data type. So any variable of the struct car data type, such as my car, I can say my car dot year equals and then assign it some integer value, 2011. My car dot plate equals CS50. My car dot odometer equals 50505 semicolon. All of those are perfectly fine and that's how we access the fields of the structure. 

Structures, though, do not need to be created on the stack. Just like any other variable, we can dynamically allocate them. If we have a program that might be generating many structures, we don't know how many we're going to need, then we need to dynamically allocate those structures as our program is running. And so if we're going to access the fields of a structure in that context, recall that we first need to dereference the pointer to the structure, and then once we dereference the pointer, then we can access the fields. If we only have a pointer to the structure, we can't just say pointer dot field name and get what we're looking for. There's the extra step of dereferencing. 

So let's say that instead of the previous-- just like the previous example, instead of declaring it on the stack, struct car, my car, semicolon, I say struct car, star, a pointer to a struct car called my car, equals malloc sizeof struct car. Sizeof will figure out how many bytes your new data type takes up. You don't necessarily only need to use sizeof with int, or char, or any of the built-in data types. The compiler is smart enough to figure out how many bytes are required by your new structure. So I malloc myself a unit of memory big enough to hold a struct car, and I get a pointer back to that block of memory, and that pointer is assigned to my car. 
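Written out as code, the two ways of creating a struct car described above could look roughly like this. It is a sketch under a few assumptions: the field layout is the illustrative one (with plate as a character array), the function names make_stack_car and allocate_car are hypothetical, and because a character array cannot be assigned with an equals sign, the plate is filled in with strcpy.

```c
#include <stdlib.h>
#include <string.h>

// Field layout is an assumption for illustration.
struct car
{
    int year;
    char model[10];
    char plate[7];
    int odometer;
    double engine_size;
};

// Hypothetical helper: build a car as an ordinary (stack) variable
// and access its fields with the dot operator.
struct car make_stack_car(void)
{
    struct car mycar = {0};       // zero every field first (added for safety)
    mycar.year = 2011;            // int field: plain assignment
    strcpy(mycar.plate, "CS50");  // char array field: copy the string in
    mycar.odometer = 50505;
    return mycar;                 // the caller receives a copy
}

// Hypothetical helper: dynamically allocate room for one struct car.
// sizeof(struct car) works just like sizeof(int) would.
// Returns NULL if malloc fails; the caller must free the result.
struct car *allocate_car(void)
{
    struct car *mycar = malloc(sizeof(struct car));
    return mycar;
}
```

A caller would check the pointer from allocate_car against NULL before using it and pass it to free once finished.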

Now, if I want to access the fields of my car, I first dereference my car using the dereference operator, star that we've seen from the pointers videos, and then after I dereference, then I can use the dot operator to access the various fields of my car. Star my car dot year equals 2011. That would have the effect we want in this case, because we've dynamically allocated my car. 
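One detail that is easy to miss when this is spoken aloud: in code, the dereference has to be wrapped in parentheses, because the dot operator binds more tightly than star. The sketch below assumes the same illustrative field layout, and the function name new_car_deref is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

// Field layout is an assumption for illustration.
struct car
{
    int year;
    char model[10];
    char plate[7];
    int odometer;
    double engine_size;
};

// Hypothetical helper: allocate a car and fill it in using
// dereference-then-dot. Returns NULL if malloc fails.
// The caller is responsible for calling free on the result.
struct car *new_car_deref(void)
{
    struct car *mycar = malloc(sizeof(struct car));
    if (mycar == NULL)
    {
        return NULL;
    }

    // (*mycar) is the struct itself, so the dot operator applies to it.
    // Without the parentheses, *mycar.year would be read as *(mycar.year),
    // which does not compile because mycar is a pointer, not a struct.
    (*mycar).year = 2011;
    strcpy((*mycar).plate, "CS50");
    (*mycar).odometer = 50505;

    return mycar;
}
```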

That's kind of annoying, though, right? There's a 2-step process now. Now we have to dereference-- we have a star operator, and we have a dot operator. And as you might expect, because C programmers love shorter ways to do things, there is a shorter way to do this. There is another operator called arrow, which makes this process a lot easier. The way arrow works is it first dereferences the pointer on the left side of the operator, and then, after having dereferenced the pointer on the left, it accesses the field on the right. And so previously we had this sort of star my car dot all this stuff, like there was a lot going on there. But what we can instead do is this-- my car arrow year equals 2011. 

Again, what's happening here? First, I'm dereferencing my car. Which again, is a pointer here. Then, after having dereferenced my car, I can then access the fields year, plate, and odometer just as I could before having first used star to dereference my car, and dot to access the field. So you can have structures, you can have pointers to structures, and you have ways to access the fields of those structures, whether you have pointers to them or the variables themselves. Dot or arrow, depending on how the variable was declared. I'm Doug Lloyd, this is CS50.
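For comparison, here is the same allocation written with the arrow operator, which is typed as a hyphen followed by a greater-than sign. As before, this is only a sketch: the field layout is assumed and the function name new_car_arrow is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

// Field layout is an assumption for illustration.
struct car
{
    int year;
    char model[10];
    char plate[7];
    int odometer;
    double engine_size;
};

// Hypothetical helper: allocate a car and fill it in using arrow.
// Returns NULL if malloc fails; the caller must free the result.
struct car *new_car_arrow(void)
{
    struct car *mycar = malloc(sizeof(struct car));
    if (mycar == NULL)
    {
        return NULL;
    }

    // mycar->year means exactly the same thing as (*mycar).year:
    // dereference the pointer on the left, then access the field on the right.
    mycar->year = 2011;
    strcpy(mycar->plate, "CS50");
    mycar->odometer = 50505;

    return mycar;
}
```

The rule of thumb is dot when the variable is the struct itself and arrow when the variable is a pointer to the struct.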