WEBVTT X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 00:00:00.994 --> 00:00:04.970 [MUSIC PLAYING] 00:00:16.388 --> 00:00:19.430 DAVID MALAN: There's any number of languages in which we can communicate, 00:00:19.430 --> 00:00:21.020 English being just one of them. 00:00:21.020 --> 00:00:24.720 But computers, of course, only understand binary zeros and ones. 00:00:24.720 --> 00:00:28.430 And so somehow we have to communicate our thoughts-- ultimately in binary-- 00:00:28.430 --> 00:00:30.800 in order to solve some problem with a computer. 00:00:30.800 --> 00:00:34.880 But certainly, we don't want to program computers by writing zeros and ones 00:00:34.880 --> 00:00:37.850 and memorizing the patterns that they'll understand, 00:00:37.850 --> 00:00:40.520 so we do somehow need to adopt a process by which 00:00:40.520 --> 00:00:43.880 we can express our thoughts and the solutions to the problems 00:00:43.880 --> 00:00:48.140 that we have in mind but in such a way that the computer can understand them. 00:00:48.140 --> 00:00:50.270 Now it turns out it's quite meaningful when 00:00:50.270 --> 00:00:53.570 you see something like Intel inside or AMD 00:00:53.570 --> 00:00:55.700 or any number of other computer manufacturers 00:00:55.700 --> 00:00:59.140 that make what are called CPUs, Central Processing Units, which 00:00:59.140 --> 00:01:02.240 you can think of as the brains of sorts inside of your computer. 00:01:02.240 --> 00:01:05.090 Well, turns out that Intel and AMD and other companies 00:01:05.090 --> 00:01:08.330 have decided in advance what patterns of bits-- 00:01:08.330 --> 00:01:11.330 zeros and ones-- that their CPUs understand. 00:01:11.330 --> 00:01:14.400 A certain pattern of zeros and ones might represent addition. 
00:01:14.400 --> 00:01:17.810 Another pattern of zeros and ones might represent subtraction or multiplication 00:01:17.810 --> 00:01:21.500 or division or the act of moving information around in memory 00:01:21.500 --> 00:01:24.290 or saving information from memory. 00:01:24.290 --> 00:01:28.430 And so those patterns are very much computer or CPU specific. 00:01:28.430 --> 00:01:31.520 And frankly, I'd like to be able to write software and write 00:01:31.520 --> 00:01:34.760 code that can run on your computer and my computer 00:01:34.760 --> 00:01:37.730 and some other manufacturer's computer without really having 00:01:37.730 --> 00:01:39.560 to know all about those zeros and ones. 00:01:39.560 --> 00:01:43.610 So there too it would be nice if there's a process, a workflow, a tool 00:01:43.610 --> 00:01:48.800 chain via which we can communicate our thoughts in a fairly accessible way 00:01:48.800 --> 00:01:52.370 but have them ultimately translated into those zeros and ones. 00:01:52.370 --> 00:01:54.950 So what is it that Intel inside really means? 00:01:54.950 --> 00:01:57.590 What is it that a CPU actually understands? 00:01:57.590 --> 00:02:01.940 Well, it's what's called machine code, zeros and ones that ultimately dictate 00:02:01.940 --> 00:02:05.330 what the computer should do-- add, subtract, multiply, or something else 00:02:05.330 --> 00:02:06.140 altogether. 00:02:06.140 --> 00:02:10.740 That machine code literally might look something like this. 00:02:10.740 --> 00:02:14.930 In fact, let me give you just a moment and from all of these zeros and ones, 00:02:14.930 --> 00:02:19.295 can you glean perhaps what this very program would do if run on a computer? 00:02:22.160 --> 00:02:22.660 No? 00:02:22.660 --> 00:02:26.080 Well, odds are you couldn't imagine what this would do because even 00:02:26.080 --> 00:02:27.970 I can't read these zeros and ones. 
00:02:27.970 --> 00:02:31.750 But if it turns out you fed these very zeros and ones to a computer, 00:02:31.750 --> 00:02:33.970 it would print out on the screen quite simply 00:02:33.970 --> 00:02:37.150 "hello world," which is sort of the canonical phrase, one 00:02:37.150 --> 00:02:40.300 of the first phrases ever printed on a computer screen back in the day 00:02:40.300 --> 00:02:42.860 when computer languages were first being invented. 00:02:42.860 --> 00:02:45.610 Now you, of course, would never know that, should never know that, 00:02:45.610 --> 00:02:48.070 nor should even the most sophisticated of programmers 00:02:48.070 --> 00:02:52.000 because this is just too low of a level to communicate one's thoughts in. 00:02:52.000 --> 00:02:55.990 Far more compelling would be to operate at a level closer to English 00:02:55.990 --> 00:02:58.000 or whatever your spoken language might be. 00:02:58.000 --> 00:03:01.210 And so quickly, when humans invented computers decades ago, 00:03:01.210 --> 00:03:04.120 did we decide we need something different from machine code. 00:03:04.120 --> 00:03:06.940 We need something at a higher level of abstraction, 00:03:06.940 --> 00:03:09.433 if you will, something that's more familiar to us 00:03:09.433 --> 00:03:12.100 but that's close enough that the computer can somehow figure out 00:03:12.100 --> 00:03:12.850 what to do. 00:03:12.850 --> 00:03:15.220 And so thus was born assembly code. 00:03:15.220 --> 00:03:17.140 Assembly code is an example more generally 00:03:17.140 --> 00:03:19.690 of what's called source code, which is typically 00:03:19.690 --> 00:03:24.850 English-like syntax more familiar to us humans that can somehow be translated 00:03:24.850 --> 00:03:27.040 eventually down to machine code. 00:03:27.040 --> 00:03:29.110 And assembly code, which is one of the earliest 00:03:29.110 --> 00:03:33.400 incarnations of this general idea, looked a little something like this. 
00:03:33.400 --> 00:03:35.560 Now it too looks pretty cryptic, though hopefully 00:03:35.560 --> 00:03:39.550 not quite as cryptic as just seemingly random patterns of zeros and ones 00:03:39.550 --> 00:03:42.850 because there is some organization to this text here. 00:03:42.850 --> 00:03:45.070 It's not quite English-like I would say, but there's 00:03:45.070 --> 00:03:47.910 some familiar sequences of characters that I can perhaps 00:03:47.910 --> 00:03:50.080 ascribe some meaning to. 00:03:50.080 --> 00:03:54.640 I see words that look a little familiar-- pushq or at least push, 00:03:54.640 --> 00:03:56.770 movq, or perhaps move. 00:03:56.770 --> 00:04:01.210 Sub-- maybe that means subtract-- or call, as in to call a function 00:04:01.210 --> 00:04:05.500 or a procedure, xor and add and pop and others-- 00:04:05.500 --> 00:04:07.810 these seem to be reminiscent of English words. 00:04:07.810 --> 00:04:08.830 And so while cryptic-- 00:04:08.830 --> 00:04:11.920 and I would surely need a manual in order to figure out what these mean-- 00:04:11.920 --> 00:04:15.850 this is a standardization of how you might communicate instructions 00:04:15.850 --> 00:04:16.720 to a computer. 00:04:16.720 --> 00:04:20.079 Indeed, all of those keywords pushq, movq, subq and so 00:04:20.079 --> 00:04:23.200 forth are literally called instructions, and those 00:04:23.200 --> 00:04:27.400 are the names given to the instructions, the commands that Intel and AMD have 00:04:27.400 --> 00:04:32.470 decided that their brains, their CPUs shall understand. 00:04:32.470 --> 00:04:35.800 To the right of these instructions is some cryptic-looking syntax now-- 00:04:35.800 --> 00:04:38.830 dollar signs and percent signs, commas, and others. 00:04:38.830 --> 00:04:42.100 Well, those are used to communicate what are called registers.
00:04:42.100 --> 00:04:45.370 It turns out that the smallest unit of useful memory 00:04:45.370 --> 00:04:47.230 typically inside of a computer-- 00:04:47.230 --> 00:04:48.910 in particular inside of a CPU-- 00:04:48.910 --> 00:04:50.290 is what's called a register. 00:04:50.290 --> 00:04:53.590 A register might be eight bits back in the day, 32 bits 00:04:53.590 --> 00:04:55.810 more modernly, or even 64 bits. 00:04:55.810 --> 00:04:58.450 And that is really the smallest piece of information 00:04:58.450 --> 00:05:03.310 that you can do some operation on, the smallest unit of information 00:05:03.310 --> 00:05:07.150 that you can add to something else or subtract from something else and so 00:05:07.150 --> 00:05:07.790 forth. 00:05:07.790 --> 00:05:11.450 So when a CPU is doing its arithmetic-- addition, subtraction, multiplication, 00:05:11.450 --> 00:05:12.570 division, and so forth-- 00:05:12.570 --> 00:05:14.650 it's operating on pretty small values. 00:05:14.650 --> 00:05:19.300 They might be big numbers, but they only take up maybe 32 or 64 bits. 00:05:19.300 --> 00:05:22.580 And those registers, those chunks of memory have names. 00:05:22.580 --> 00:05:25.720 The names to be fair are cryptic, but they're expressed 00:05:25.720 --> 00:05:27.880 in the same language, assembly code. 00:05:27.880 --> 00:05:31.030 So that you're telling the computer in this language what should 00:05:31.030 --> 00:05:34.390 you move to where and what should you add to what. 00:05:34.390 --> 00:05:37.930 And so once you acquire a taste, if you will, for this language, 00:05:37.930 --> 00:05:39.910 does all of this begin to make more sense. 00:05:39.910 --> 00:05:43.810 And frankly, if I really scour it, aha, down here at the bottom, 00:05:43.810 --> 00:05:47.320 I do see explicit mention of that phrase hello world. 00:05:47.320 --> 00:05:52.210 And these other lines simply call into action the printing of that phrase 00:05:52.210 --> 00:05:53.440 on the screen.
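To make the register sizes mentioned above concrete, here is a minimal Python sketch of how large a number fits in 8, 32, or 64 bits, and of what happens when a sum no longer fits. The function names are made up for this illustration, and the values are assumed to be unsigned; real CPUs also support signed arithmetic and status flags, which this sketch ignores.

```python
# How many values fit in a register of a given width, and what happens
# when an addition overflows that width (assuming unsigned values).

def max_unsigned(bits: int) -> int:
    """Largest unsigned integer that fits in `bits` bits."""
    return (1 << bits) - 1


def add_in_register(a: int, b: int, bits: int) -> int:
    """Add two values, keeping only the low `bits` bits of the sum."""
    return (a + b) & max_unsigned(bits)


for width in (8, 32, 64):
    print(f"{width}-bit register: 0 through {max_unsigned(width)}")

# 250 + 10 = 260 does not fit in 8 bits, so only the low 8 bits survive.
print(add_in_register(250, 10, 8))  # prints 4
```

Python's own integers grow as needed, which is why the sketch has to mask the result explicitly to imitate a fixed-width register.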
00:05:53.440 --> 00:05:55.870 But frankly, this doesn't look all that compelling still. 00:05:55.870 --> 00:05:57.620 It's certainly better than zeros and ones, 00:05:57.620 --> 00:05:59.590 but assembly code is generally considered 00:05:59.590 --> 00:06:03.340 to be fairly low level, not as low level as zeros and ones and not as low 00:06:03.340 --> 00:06:05.500 level as electricity from the wall. 00:06:05.500 --> 00:06:07.930 But it's still low level enough that it's not really 00:06:07.930 --> 00:06:09.940 that pleasant to program in. 00:06:09.940 --> 00:06:13.250 Now back in the day, decades ago, this was all you had at your disposal. 00:06:13.250 --> 00:06:15.530 And so surely, this was better than nothing else. 00:06:15.530 --> 00:06:18.610 And in fact, some of the earliest games and some of the earliest software 00:06:18.610 --> 00:06:20.380 were written in assembly language. 00:06:20.380 --> 00:06:24.670 So it truly was experts back in the day writing frequently 00:06:24.670 --> 00:06:26.920 in this low language, and you might still use it today 00:06:26.920 --> 00:06:28.840 for the smallest of details. 00:06:28.840 --> 00:06:33.550 But on top of assembly language have evolved more modern forms 00:06:33.550 --> 00:06:38.050 of source code-- newer languages with easier to understand syntax 00:06:38.050 --> 00:06:39.670 and more and more features. 00:06:39.670 --> 00:06:43.120 And one of the first successors to something like assembly code 00:06:43.120 --> 00:06:45.700 was a language called C, quite simply. 00:06:45.700 --> 00:06:47.450 C looks like this. 00:06:47.450 --> 00:06:49.990 Now I dare say this too remains fairly cryptic, 00:06:49.990 --> 00:06:52.810 but I feel like we're walking up a set of stairs 00:06:52.810 --> 00:06:55.815 here where things are finally starting to look a little more 00:06:55.815 --> 00:06:58.690 familiar and a little more comfortable, even though there might still 00:06:58.690 --> 00:07:00.850 be some distractions of syntax. 
00:07:00.850 --> 00:07:03.100 These angled braces and these curly braces 00:07:03.100 --> 00:07:06.280 and quotes and parentheses and a semicolon-- 00:07:06.280 --> 00:07:10.480 all of that you might get to in an actual course on programming itself. 00:07:10.480 --> 00:07:15.610 But here we use this as demonstrative of a fairly more English-like syntax 00:07:15.610 --> 00:07:17.860 with which you can express the same program. 00:07:17.860 --> 00:07:19.840 Printing hello world in a language called 00:07:19.840 --> 00:07:23.190 C can be implemented with precisely this code. 00:07:23.190 --> 00:07:27.570 But frankly, we're starting to stray pretty far from that low level language 00:07:27.570 --> 00:07:32.220 that computers ultimately understand and need to accept as their input binary. 00:07:32.220 --> 00:07:35.100 So how do we get from this higher level language, so to speak, 00:07:35.100 --> 00:07:38.040 called C down to those zeros and ones? 00:07:38.040 --> 00:07:40.920 Well, frankly, the process by which this happens 00:07:40.920 --> 00:07:44.830 tends to make an intermediate stop in what's called that assembly language. 00:07:44.830 --> 00:07:47.250 So a human might write code like this, like I did here 00:07:47.250 --> 00:07:53.260 in C. You might then use a program that converts the C code to assembly code 00:07:53.260 --> 00:07:56.100 and then another program that converts that assembly code down 00:07:56.100 --> 00:07:57.500 to those zeros and ones. 00:07:57.500 --> 00:07:59.250 And frankly, you could probably use a tool 00:07:59.250 --> 00:08:01.680 that does both of those steps at once so that it 00:08:01.680 --> 00:08:05.490 creates the illusion of going directly from this to so-called machine code. 00:08:05.490 --> 00:08:06.880 Those zeros and ones. 00:08:06.880 --> 00:08:10.020 So let's take a moment and actually do that on my computer here. 
00:08:10.020 --> 00:08:12.360 This is a process that you can do on a Mac or PC 00:08:12.360 --> 00:08:15.577 running Mac OS, Windows, Linux, or any number of operating systems. 00:08:15.577 --> 00:08:17.160 I happen to be doing it on a Mac here. 00:08:17.160 --> 00:08:20.340 And I'm going to use a fairly common program these days called a text 00:08:20.340 --> 00:08:20.970 editor. 00:08:20.970 --> 00:08:23.400 This is a very lightweight version of a word processor. 00:08:23.400 --> 00:08:25.980 It doesn't have bold facing and underline and italics. 00:08:25.980 --> 00:08:28.200 It really just allows you to type text, but it 00:08:28.200 --> 00:08:30.810 does tend to colorize it for you to draw your attention 00:08:30.810 --> 00:08:33.150 to the disparate parts of a program. 00:08:33.150 --> 00:08:36.450 And I'm also going to open up what's called a terminal window. 00:08:36.450 --> 00:08:41.580 A terminal window is a keyboard-only interface to what your computer can do. 00:08:41.580 --> 00:08:43.419 So while I still have access to my mouse, 00:08:43.419 --> 00:08:45.990 it's not going to be all that useful for this environment 00:08:45.990 --> 00:08:50.460 because anytime I want to run a program or execute a command or make my Mac 00:08:50.460 --> 00:08:53.570 do something, I'm going to have to do it from my keyboard alone. 00:08:53.570 --> 00:08:54.320 Let's take a look. 00:08:58.070 --> 00:08:59.930 Now here I am in front of my text editor, 00:08:59.930 --> 00:09:03.470 and I'm going to go ahead and create a file called, say, hello.c, 00:09:03.470 --> 00:09:06.560 dot c being a conventional file extension to indicate to the computer 00:09:06.560 --> 00:09:08.590 that the code I am about to write is going 00:09:08.590 --> 00:09:10.925 to be implemented in that language called C. Now 00:09:10.925 --> 00:09:13.550 at the top of this program is where I'm going to write my code, 00:09:13.550 --> 00:09:16.460 and I'm simply going to transcribe what we just saw.
00:09:16.460 --> 00:09:23.510 Include standard IO dot h int main void open curly brace 00:09:23.510 --> 00:09:26.480 followed by a closing curly brace ultimately 00:09:26.480 --> 00:09:32.420 and then printf quote unquote hello comma world backslash n, finally, 00:09:32.420 --> 00:09:33.480 a semicolon. 00:09:33.480 --> 00:09:35.330 So herein I've written my source code. 00:09:35.330 --> 00:09:37.760 And indeed, it's been colorized by the text editor 00:09:37.760 --> 00:09:40.640 simply to draw my attention to disparate parts of this program 00:09:40.640 --> 00:09:42.720 and were we to dive deeper into C, in particular, 00:09:42.720 --> 00:09:45.050 we'd see what each of these different symbols mean. 00:09:45.050 --> 00:09:46.985 But for now, let me propose that we only care 00:09:46.985 --> 00:09:48.860 about what this program is meant to do, which 00:09:48.860 --> 00:09:51.470 is to print ultimately hello world. 00:09:51.470 --> 00:09:54.320 But all I've done is write a program in C. 00:09:54.320 --> 00:09:57.830 I somehow have to get it to its form of zeros and ones 00:09:57.830 --> 00:09:59.900 that the computer ultimately understands. 00:09:59.900 --> 00:10:02.450 So it's not sufficient just to save the file because, indeed, 00:10:02.450 --> 00:10:05.900 all that's been saved are these letters and symbols in some file called 00:10:05.900 --> 00:10:06.890 hello.c. 00:10:06.890 --> 00:10:10.310 I somehow have to convert that file to zeros and ones. 00:10:10.310 --> 00:10:13.520 Well, it turns out that there exists tools called compilers. 
00:10:13.520 --> 00:10:15.530 A compiler is simply a piece of software-- 00:10:15.530 --> 00:10:17.390 written presumably by someone else-- 00:10:17.390 --> 00:10:21.500 that knows how to understand C, perhaps knows about assembly code, 00:10:21.500 --> 00:10:23.990 but definitely knows about those patterns of zeros and ones 00:10:23.990 --> 00:10:28.490 that Intel and AMD ultimately require that I output in order 00:10:28.490 --> 00:10:30.410 to get them to execute commands. 00:10:30.410 --> 00:10:33.240 So how do I go about compiling hello.c? 00:10:33.240 --> 00:10:35.240 Well, it turns out installed on my computer-- 00:10:35.240 --> 00:10:39.440 and perhaps even yours-- is a program called CC, the C compiler, if you 00:10:39.440 --> 00:10:42.560 will-- compile simply referring to this process of translating 00:10:42.560 --> 00:10:44.090 one language to another. 00:10:44.090 --> 00:10:47.330 And so I'm going to go down here to the lower portion of my screen 00:10:47.330 --> 00:10:50.450 wherein I have a prompt, dollar sign that for whatever 00:10:50.450 --> 00:10:54.470 historical reason simply represents a prompt in a terminal window. 00:10:54.470 --> 00:10:57.120 And it's in here that I can type these textual commands, 00:10:57.120 --> 00:11:02.540 cc -o hello space hello.c-- 00:11:02.540 --> 00:11:05.030 a fairly cryptic incantation of commands. 00:11:05.030 --> 00:11:07.490 But ultimately, this has created a new file 00:11:07.490 --> 00:11:11.662 on my computer called quite simply hello with no file extension. 00:11:11.662 --> 00:11:14.120 Now on a typical Mac or PC, when you want to run a program, 00:11:14.120 --> 00:11:17.240 you would typically just double click its icon, and it would be loaded up. 00:11:17.240 --> 00:11:19.500 And you would see its ultimate behavior.
00:11:19.500 --> 00:11:22.490 But here in this command line interface, so to speak, 00:11:22.490 --> 00:11:25.010 wherein I can only type commands textually, 00:11:25.010 --> 00:11:28.907 I have to tell the computer to run this program only via my keyboard. 00:11:28.907 --> 00:11:30.740 And the convention via which you can do this 00:11:30.740 --> 00:11:36.110 is quite simply to say dot slash hello where dot refers to the current folder 00:11:36.110 --> 00:11:38.570 or directory in which this file is. 00:11:38.570 --> 00:11:41.810 The slash just separates from its name, and hello, of course, 00:11:41.810 --> 00:11:43.910 is the name of the program I've written. 00:11:43.910 --> 00:11:44.720 And here we go. 00:11:44.720 --> 00:11:48.470 With the stroke of enter, I now see hello world. 00:11:48.470 --> 00:11:51.140 And thus was born my very first program in C. 00:11:51.140 --> 00:11:53.930 Of course, it took me quite a while to get to this point, 00:11:53.930 --> 00:11:56.750 and I didn't necessarily even understand all of those lines of code 00:11:56.750 --> 00:11:57.500 along the way. 00:11:57.500 --> 00:12:01.070 But I did understand that my goal at hand and the problem to be solved 00:12:01.070 --> 00:12:02.902 was to print quite simply hello. 00:12:02.902 --> 00:12:04.610 But there's a bit of overhead, of course, 00:12:04.610 --> 00:12:08.840 to a language like C wherein there's not only the syntactic complexity of it, 00:12:08.840 --> 00:12:11.580 but that frankly gets much more familiar over time. 00:12:11.580 --> 00:12:13.880 There's also this additional step, this middleman, 00:12:13.880 --> 00:12:17.960 a compiler that has to exist and somehow translate your source 00:12:17.960 --> 00:12:19.650 code to machine code. 00:12:19.650 --> 00:12:22.580 In fact, what we've effectively done just now is this. 
00:12:22.580 --> 00:12:26.450 If up here is my so-called source code stored in any file, 00:12:26.450 --> 00:12:29.960 for instance, hello.c, and I want to convert it ultimately 00:12:29.960 --> 00:12:34.490 to so-called machine code, the zeros and ones that my computer understands, 00:12:34.490 --> 00:12:37.880 I somehow have to get from input to output. 00:12:37.880 --> 00:12:40.280 And the middleman here is again this tool 00:12:40.280 --> 00:12:44.300 called a compiler in the context of my having written this program in C. 00:12:44.300 --> 00:12:48.950 I used a program, a compiler, called CC, but any number of other options exist. 00:12:48.950 --> 00:12:53.060 You might have heard of Visual Studio perhaps or Eclipse or yet others still. 00:12:53.060 --> 00:12:56.780 This middleman simply takes as input that source code in C 00:12:56.780 --> 00:13:00.980 and produces as its output that machine code that the computer expects. 00:13:00.980 --> 00:13:06.530 And so when I typed that command cc -o hello hello.c, 00:13:06.530 --> 00:13:11.450 it ultimately was telling my computer take as input hello.c, 00:13:11.450 --> 00:13:15.140 produce as output a new file called hello, inside 00:13:15.140 --> 00:13:17.240 of which are those zeros and ones. 00:13:17.240 --> 00:13:19.890 Now not all languages operate like this. 00:13:19.890 --> 00:13:23.450 It turns out that more modern languages skip that step of compilation 00:13:23.450 --> 00:13:26.570 altogether or at least hide that detail from the user 00:13:26.570 --> 00:13:30.740 so that he or she doesn't necessarily need to know how to compile their code. 00:13:30.740 --> 00:13:33.410 It's handled more automatically. 00:13:33.410 --> 00:13:37.160 Now some languages instead use not a compiler but an interpreter.
00:13:37.160 --> 00:13:41.300 Whereas a compiler takes as input one language like C and produces as output 00:13:41.300 --> 00:13:43.910 another language like machine code, an interpreter 00:13:43.910 --> 00:13:47.690 instead takes as input some source code and then runs it 00:13:47.690 --> 00:13:51.350 or interprets it line by line top to bottom, left to right. 00:13:51.350 --> 00:13:54.680 And whenever it sees an instruction like print, it thinks to itself, 00:13:54.680 --> 00:13:56.720 oh, I know how to print something on the screen, 00:13:56.720 --> 00:13:59.960 and it goes and does it on behalf of that source code. 00:13:59.960 --> 00:14:02.420 It does not, strictly speaking, convert those instructions 00:14:02.420 --> 00:14:04.460 instead to zeros and ones. 00:14:04.460 --> 00:14:06.830 It is instead the interpreter itself which 00:14:06.830 --> 00:14:10.520 is just a program that itself is implemented with zeros and ones 00:14:10.520 --> 00:14:12.560 that the CPU understands. 00:14:12.560 --> 00:14:14.810 And those zeros and ones collectively know 00:14:14.810 --> 00:14:19.690 how to recognize keywords or functions in that source code language 00:14:19.690 --> 00:14:23.940 it takes as input in order to execute it on the program's behalf. 00:14:23.940 --> 00:14:26.540 So what is an example of an interpreted language? 00:14:26.540 --> 00:14:29.600 Well, among the most popular ones today is that called Python. 00:14:29.600 --> 00:14:32.930 Python is especially popular in the world of data science and web 00:14:32.930 --> 00:14:35.450 programming and in command line applications, 00:14:35.450 --> 00:14:39.860 ones written at the so-called terminal window via which I can solve problems. 00:14:39.860 --> 00:14:44.930 And so Python is notable too for its relative simplicity-- certainly 00:14:44.930 --> 00:14:46.370 vis-a-vis something like C.
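The idea of an interpreter described above can be sketched in a few lines of Python. This is a toy, not how any real interpreter is built: the one-instruction language it accepts (a line of the form `print "some text"`) and the function name `interpret` are invented here purely for illustration.

```python
# A toy interpreter for a made-up one-instruction language (hypothetical,
# for illustration only): each line is either blank or `print "some text"`.

def interpret(source: str) -> list[str]:
    """Execute `source` line by line and return everything it printed."""
    printed = []
    for number, line in enumerate(source.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        keyword, _, argument = line.partition(" ")
        is_quoted = len(argument) >= 2 and argument[0] == argument[-1] == '"'
        if keyword == "print" and is_quoted:
            text = argument[1:-1]
            print(text)  # the interpreter itself does the printing
            printed.append(text)
        else:
            raise SyntaxError(f"line {number}: cannot interpret {line!r}")
    return printed


program = 'print "hello, world"\n'
interpret(program)
```

Nothing here is translated to machine code. The toy reads the text, recognizes the one keyword it knows, and acts on the program's behalf, which is the distinction the passage draws between an interpreter and a compiler.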
00:14:46.370 --> 00:14:49.840 In fact, in order to implement the equivalent program in Python 00:14:49.840 --> 00:14:54.330 that I just implemented in C, it suffices to write just this. 00:14:54.330 --> 00:14:56.300 Say what you mean and little more. 00:14:56.300 --> 00:14:57.560 There's less syntax here. 00:14:57.560 --> 00:14:59.820 There's fewer keywords that are unfamiliar. 00:14:59.820 --> 00:15:03.860 It's just instead the verb or function print followed by hello world-- 00:15:03.860 --> 00:15:08.390 no semicolon, no curly braces, fewer symbols altogether. 00:15:08.390 --> 00:15:11.150 But how do I go about running a program in Python? 00:15:11.150 --> 00:15:14.852 Well, it turns out that typically Python is interpreted, not compiled. 00:15:14.852 --> 00:15:17.060 So I'm not going to run it through a compiler per se, 00:15:17.060 --> 00:15:20.690 but I am instead going to interpret it line by line-- albeit just one 00:15:20.690 --> 00:15:22.800 line with this particular program. 00:15:22.800 --> 00:15:25.870 So let me go back into my text editor and terminal window 00:15:25.870 --> 00:15:28.171 and this time create a file called hello.py-- dot 00:15:28.171 --> 00:15:33.290 py being the conventional file extension for any program written in Python. 00:15:33.290 --> 00:15:36.110 And in hello.py, I am now going to write that one line 00:15:36.110 --> 00:15:41.150 program print open parenthesis quote unquote hello world. 00:15:41.150 --> 00:15:43.490 How do I interpret this file called hello.py? 00:15:43.490 --> 00:15:46.610 Well, it turns out I run a program that itself is 00:15:46.610 --> 00:15:48.830 called Python, which is my interpreter. 00:15:48.830 --> 00:15:52.250 And so I run in my terminal window Python space hello.py, 00:15:52.250 --> 00:15:55.340 and the output is now the same. 00:15:55.340 --> 00:15:56.570 So what has just happened? 
00:15:56.570 --> 00:16:00.830 Albeit just one line, what that program called Python has done 00:16:00.830 --> 00:16:04.740 is open up this file called hello, read it top to bottom, 00:16:04.740 --> 00:16:07.430 left to right, albeit quite quickly, recognized 00:16:07.430 --> 00:16:11.630 that it knows this keyword or function called print and therefore knew 00:16:11.630 --> 00:16:12.710 what to do next. 00:16:12.710 --> 00:16:17.520 It went ahead and printed hello world on the screen and then automatically quit. 00:16:17.520 --> 00:16:19.220 So this seems like a nice thing. 00:16:19.220 --> 00:16:23.360 No longer do I have to remember and take the time to compile my code, 00:16:23.360 --> 00:16:24.980 but surely, there must be some price. 00:16:24.980 --> 00:16:28.940 And indeed, one of the implications of saving that step no longer having 00:16:28.940 --> 00:16:31.400 to compile your code but instead just jumping right 00:16:31.400 --> 00:16:35.390 to its execution or interpretation is that you pay potentially 00:16:35.390 --> 00:16:36.860 a performance penalty. 00:16:36.860 --> 00:16:40.310 You certainly can't quite see it in a program as short as hello world. 00:16:40.310 --> 00:16:42.950 But if you were to write a program with hundreds or thousands 00:16:42.950 --> 00:16:46.010 or millions of lines, the overhead required 00:16:46.010 --> 00:16:48.658 to read that file top to bottom, left to right, 00:16:48.658 --> 00:16:50.450 and to figure out based on the instructions 00:16:50.450 --> 00:16:53.330 therein what it is the programmer intended actually 00:16:53.330 --> 00:16:55.830 does take non-zero amount of time. 00:16:55.830 --> 00:16:59.480 And it can surely add up for the most computationally complex of problems-- 00:16:59.480 --> 00:17:03.350 anything involving an analysis, anything involving loops or cycles, 00:17:03.350 --> 00:17:06.079 you can certainly begin to feel its effects. 
00:17:06.079 --> 00:17:09.140 But that's OK because we humans have been fairly creative over time. 00:17:09.140 --> 00:17:11.625 And as we've invented more and more programming languages, 00:17:11.625 --> 00:17:14.000 we have fortunately also invented more and more solutions 00:17:14.000 --> 00:17:15.530 to problems like these. 00:17:15.530 --> 00:17:19.520 And so it turns out that even though this all happened quite quickly, when 00:17:19.520 --> 00:17:24.109 I ran this interpreter called Python, odds are if my computer were smart, 00:17:24.109 --> 00:17:27.230 it was probably doing me a favor underneath the hood 00:17:27.230 --> 00:17:28.640 without my even knowing. 00:17:28.640 --> 00:17:31.130 And in fact, what Python and some other interpreters 00:17:31.130 --> 00:17:34.280 do is actually compile your code for you, 00:17:34.280 --> 00:17:38.990 save the results in a temporary file that you yourself might not even see, 00:17:38.990 --> 00:17:43.010 and the next time I run this program, especially if it's large and complex, 00:17:43.010 --> 00:17:46.790 Python will skip this step of reinterpreting the file again and again 00:17:46.790 --> 00:17:51.320 and instead look at that precompiled version of my same program-- 00:17:51.320 --> 00:17:54.500 therefore, saving some time but achieving the same results. 00:17:54.500 --> 00:17:58.010 Only if I go back and change my code and make changes to my program 00:17:58.010 --> 00:18:03.240 does Python need to regenerate that cached version of code, if you will, 00:18:03.240 --> 00:18:05.960 in order to reuse that again and again. 00:18:05.960 --> 00:18:10.040 And this intermediately cached code is generally called byte code. 00:18:10.040 --> 00:18:14.000 It's not quite zeros and ones, but it's closer to it than Python itself. 00:18:14.000 --> 00:18:17.270 And so for this same program in Python, were my computer 00:18:17.270 --> 00:18:22.070 to actually compile it for me, what I would actually see is code like this. 
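For readers following along without the slide, the bytecode for this one-line program can be inspected with Python's standard `dis` module. This is a sketch assuming the CPython interpreter; the exact instruction names in the listing differ between Python versions, so no particular output is shown. Note also that CPython writes its cached bytecode files (in a `__pycache__` folder) for modules that are imported; a script started directly, as `hello.py` was here, is compiled in memory each time it runs.

```python
import dis
import io

source = 'print("hello, world")\n'

# compile() turns source text into a code object that holds bytecode.
# "hello.py" is only a label used in error messages; no file is read.
code = compile(source, "hello.py", "exec")

# dis.dis() writes a human-readable listing of that bytecode.
listing = io.StringIO()
dis.dis(code, file=listing)
print(listing.getvalue())

# The same code object can then be executed by the interpreter.
exec(code)
```

In the listing, the name `print` and the string `'hello, world'` both appear as operands, much as the phrase was visible in the assembly code earlier.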
00:18:22.070 --> 00:18:24.830 Much like assembly code is it fairly cryptic, but at least 00:18:24.830 --> 00:18:27.650 in there is some familiar phrase hello world as well 00:18:27.650 --> 00:18:31.820 as the function I ultimately called, which is that known as print. 00:18:31.820 --> 00:18:35.030 Now Python and C are not the only languages out there. 00:18:35.030 --> 00:18:37.340 In fact, there are dozens in vogue these days. 00:18:37.340 --> 00:18:39.170 And there are hundreds-- if not thousands-- 00:18:39.170 --> 00:18:41.360 that humans have created over time. 00:18:41.360 --> 00:18:44.270 For instance, depicted here is perhaps one with which you 00:18:44.270 --> 00:18:46.580 yourself are familiar or at least heard of. 00:18:46.580 --> 00:18:50.300 It's called Java, and it happens to be an object-oriented programming 00:18:50.300 --> 00:18:52.430 language, which means it has features 00:18:52.430 --> 00:18:55.320 beyond those earliest of ones like C. Here, though, 00:18:55.320 --> 00:18:58.750 is perhaps the simplest way via which you can implement that same program 00:18:58.750 --> 00:19:01.330 hello world, but it-- not unlike C-- has a bit 00:19:01.330 --> 00:19:04.570 of overhead, a number of symbols and words that at first glance 00:19:04.570 --> 00:19:06.070 certainly are not obvious. 00:19:06.070 --> 00:19:08.620 But ultimately, that's all this program does. 00:19:08.620 --> 00:19:12.520 But Java is distinct in that it took a different approach to another problem 00:19:12.520 --> 00:19:14.140 that we've not yet tripped over. 00:19:14.140 --> 00:19:17.680 I've been running and running this program thus far in my Mac, 00:19:17.680 --> 00:19:21.683 and I compiled it particularly for this Mac on an Intel CPU.
00:19:21.683 --> 00:19:24.100 But it certainly stands to reason that you or someone else 00:19:24.100 --> 00:19:27.070 might not have the same computer or operating system as I, 00:19:27.070 --> 00:19:30.190 and it would seem to be quite the burden on the programmer 00:19:30.190 --> 00:19:33.580 if they have to compile their code in a different way for you and for me 00:19:33.580 --> 00:19:35.180 and for everyone else. 00:19:35.180 --> 00:19:39.040 And so it turns out that this cost of doing business, if you will, 00:19:39.040 --> 00:19:42.550 shipping different shrink wrapped boxes back in the day of the same program 00:19:42.550 --> 00:19:45.310 for different computers and OSes is ultimately 00:19:45.310 --> 00:19:47.590 solved by way of a virtual machine. 00:19:47.590 --> 00:19:49.840 A virtual machine, as the name implies, is not 00:19:49.840 --> 00:19:54.700 a physical machine but a virtual one implemented, as they say, in software-- 00:19:54.700 --> 00:19:58.090 software that humans have written that mimics the behavior 00:19:58.090 --> 00:20:00.790 of a virtual imaginary machine. 00:20:00.790 --> 00:20:04.180 And then companies like Sun and others have implemented support 00:20:04.180 --> 00:20:07.660 for that virtual machine for Macs and for PCs 00:20:07.660 --> 00:20:09.760 and for multiple operating systems. 00:20:09.760 --> 00:20:13.690 And so Java subscribes to the mantra of write once, run anywhere. 00:20:13.690 --> 00:20:16.870 You needn't compile it again and again for different platforms, 00:20:16.870 --> 00:20:21.010 rather you can install on each platform its own virtual machine 00:20:21.010 --> 00:20:23.590 and run the exact same code. 00:20:23.590 --> 00:20:27.520 So it's simply a different approach to an otherwise omnipresent problem, 00:20:27.520 --> 00:20:31.985 but Java falls into a class of languages that uses that particular technique. 00:20:31.985 --> 00:20:33.610 And what other languages are out there?
00:20:33.610 --> 00:20:37.060 Well, very popular these days on both servers and clients is a language 00:20:37.060 --> 00:20:39.193 called JavaScript, such as that pictured here. 00:20:39.193 --> 00:20:41.110 Here is yet another language-- this one called 00:20:41.110 --> 00:20:43.930 Ruby-- that replaces the word print with just puts, 00:20:43.930 --> 00:20:48.280 but it too is syntactically simpler, much like Python itself is. 00:20:48.280 --> 00:20:51.430 On the other hand, here is C++, incredibly common still, 00:20:51.430 --> 00:20:54.400 especially on PCs, along with other languages as well. 00:20:54.400 --> 00:20:59.860 But in C++, you see code reminiscent of C itself, also conventionally compiled. 00:20:59.860 --> 00:21:03.640 Now if you'd like to see any number of overwhelming examples of how you might 00:21:03.640 --> 00:21:07.450 quite simply in hundreds of languages say hello world, 00:21:07.450 --> 00:21:09.547 take a look at this URL here. 00:21:09.547 --> 00:21:11.380 And in fact, there's so many other languages 00:21:11.380 --> 00:21:14.390 that are popular and powerful, sometimes in different ways. 00:21:14.390 --> 00:21:17.560 In fact, it's not just the case that you use one particular language 00:21:17.560 --> 00:21:18.880 for one particular job. 00:21:18.880 --> 00:21:20.922 There are many tools that you might bring to bear 00:21:20.922 --> 00:21:22.990 on the exact same problem at hand. 00:21:22.990 --> 00:21:25.150 In fact, the reason that so many languages exist 00:21:25.150 --> 00:21:28.990 is that over time, we humans have perhaps rightly or arrogantly 00:21:28.990 --> 00:21:31.840 decided that we can do better than languages past. 00:21:31.840 --> 00:21:35.140 And so humans invent new languages that have other features, 00:21:35.140 --> 00:21:38.080 different approaches, and of course, reasonable people can disagree. 00:21:38.080 --> 00:21:41.170 And so you have some languages that can achieve the very same task.
00:21:41.170 --> 00:21:42.430 They just do it differently. 00:21:42.430 --> 00:21:43.883 The text looks a little different. 00:21:43.883 --> 00:21:45.550 The features are a little bit different. 00:21:45.550 --> 00:21:48.610 And so it's up to the programmer as part of the design process 00:21:48.610 --> 00:21:52.750 to decide what tool is best for the job, not unlike a physical tool 00:21:52.750 --> 00:21:55.060 that you might have in a home tool chest. 00:21:55.060 --> 00:21:58.660 Among the most popular these days perhaps are these here: Bash and C 00:21:58.660 --> 00:22:04.450 and C++ and C#, Clojure and Erlang and F# and Go, Haskell, Java, JavaScript, 00:22:04.450 --> 00:22:11.080 Objective-C, OCaml, and then PHP, Python, R, Ruby, Scala, Scheme, SQL, Swift, 00:22:11.080 --> 00:22:12.010 and so many more. 00:22:12.010 --> 00:22:14.800 In fact, if you'd like to see a nearly exhaustive list of all 00:22:14.800 --> 00:22:17.740 of the languages humans have invented, take a look at Wikipedia 00:22:17.740 --> 00:22:21.520 here, which goes into so much more detail. 00:22:21.520 --> 00:22:25.270 So suffice it to say that computers can print hello world, 00:22:25.270 --> 00:22:27.430 but they can do so much more as well. 00:22:27.430 --> 00:22:31.180 So what are the basic building blocks that can be found in languages like C, 00:22:31.180 --> 00:22:34.430 like Python, Java, C++, and any number of others? 00:22:34.430 --> 00:22:38.350 Well, let's return to the pseudocode with which we began to program, albeit 00:22:38.350 --> 00:22:40.150 verbally, some time ago. 00:22:40.150 --> 00:22:43.660 Here we had the algorithm via which to find Mike Smith in a phone book 00:22:43.660 --> 00:22:45.490 and to search more generally. 00:22:45.490 --> 00:22:48.970 Recall, though, that laced throughout this pseudocode or program 00:22:48.970 --> 00:22:53.380 really were a few constructs that were semantically distinct from each other.
00:22:53.380 --> 00:22:57.170 Let me go ahead and highlight in yellow some of the verbs that we saw last. 00:22:57.170 --> 00:23:01.450 Pick up and open to and look at, call, open and open and quit-- 00:23:01.450 --> 00:23:04.720 all of these were calls to actions, verbs or functions, 00:23:04.720 --> 00:23:06.400 as we described them previously. 00:23:06.400 --> 00:23:10.370 Well, it turns out that in C and in Python and other languages as well, 00:23:10.370 --> 00:23:14.112 we have not necessarily these same words but these same actions. 00:23:14.112 --> 00:23:15.820 For instance, we've seen in two languages 00:23:15.820 --> 00:23:18.640 already that you can print hello world on the screen. 00:23:18.640 --> 00:23:22.270 That function or verb in C happens to be called printf. 00:23:22.270 --> 00:23:25.480 In Python, it happens to be called more specifically print. 00:23:25.480 --> 00:23:29.740 And so those are examples of functions in those languages as well. 00:23:29.740 --> 00:23:31.120 What else did we see last time? 00:23:31.120 --> 00:23:33.840 Well, we had if and else if and else if and else, 00:23:33.840 --> 00:23:36.520 and we described these as conditions or branches 00:23:36.520 --> 00:23:40.090 via which you can make decisions and either go left or right down 00:23:40.090 --> 00:23:41.350 the fork in the road. 00:23:41.350 --> 00:23:43.690 Well, in Python, in C, and in other languages 00:23:43.690 --> 00:23:46.790 too do we have those same features as well. 00:23:46.790 --> 00:23:49.090 We then looked at Boolean expressions, questions 00:23:49.090 --> 00:23:53.260 that you can ask to which there are yes and no or true and false answers. 00:23:53.260 --> 00:23:55.970 In Python and C, we're going to see those as well. 00:23:55.970 --> 00:23:59.740 And lastly, we saw loops-- go back to step two, go back to step two-- 00:23:59.740 --> 00:24:02.890 a programming construct via which you can do something again and again.
00:24:02.890 --> 00:24:05.440 Now here we did this in order to keep looking for Mike. 00:24:05.440 --> 00:24:08.800 But in the real world, might you be searching for data in a large database 00:24:08.800 --> 00:24:12.227 or an Excel file, a CSV file, Comma Separated Values. 00:24:12.227 --> 00:24:14.560 And so you might want to have some programming code that 00:24:14.560 --> 00:24:17.920 opens that file and then iterates or loops over one line 00:24:17.920 --> 00:24:21.100 after another doing some calculation or analysis. 00:24:21.100 --> 00:24:23.840 Loops enable us to do exactly that. 00:24:23.840 --> 00:24:25.840 So let's go ahead now and see if we can see 00:24:25.840 --> 00:24:29.260 these constructs in a common modern programming language. 00:24:29.260 --> 00:24:32.920 We'll use Python if only because it's incredibly commonly used at the command 00:24:32.920 --> 00:24:34.420 line within a terminal window. 00:24:34.420 --> 00:24:37.480 It's incredibly commonly used nowadays for web programming. 00:24:37.480 --> 00:24:41.860 And it's ultimately incredibly useful in the context of data analysis and data 00:24:41.860 --> 00:24:45.410 science, more generally. 00:24:45.410 --> 00:24:48.430 So let's begin where we left off recreating a program in Python that 00:24:48.430 --> 00:24:50.590 quite simply says hello world. 00:24:50.590 --> 00:24:53.290 To do that in my text editor here, I've created 00:24:53.290 --> 00:24:57.070 a file now called hello0.py to suggest that we're going to do this 00:24:57.070 --> 00:25:01.470 a few times, starting at zero, in order to iteratively improve on this program. 00:25:01.470 --> 00:25:06.820 Well, herein I'm going to do print quote unquote hello world, saving my file. 00:25:06.820 --> 00:25:09.100 No need to compile it but I'm now going to go ahead 00:25:09.100 --> 00:25:12.550 and type Python space hello0.py. 00:25:12.550 --> 00:25:13.930 And there we have hello world.
00:25:13.930 --> 00:25:16.990 Now this program, of course, is by definition not terribly dynamic. 00:25:16.990 --> 00:25:18.850 No matter how many times I run this program, 00:25:18.850 --> 00:25:22.030 it's going to print hello world, hello world again and again. 00:25:22.030 --> 00:25:24.987 Suppose that I instead wanted to get user inputs. 00:25:24.987 --> 00:25:26.320 Well, it turns out that Python-- 00:25:26.320 --> 00:25:30.010 like other languages-- has a function via which you can do exactly that. 00:25:30.010 --> 00:25:33.310 And a function, of course, takes optionally inputs 00:25:33.310 --> 00:25:35.290 and optionally produces some output. 00:25:35.290 --> 00:25:38.830 Now in Python, perhaps the easiest function with which to get a user's input 00:25:38.830 --> 00:25:41.150 is called quite simply input. 00:25:41.150 --> 00:25:42.980 So let's see how we might use that. 00:25:42.980 --> 00:25:45.910 Let me go ahead and create a new file called hello1.py. 00:25:45.910 --> 00:25:49.360 And in this file am I first going to call that function called input, 00:25:49.360 --> 00:25:51.430 and input is called exactly that. 00:25:51.430 --> 00:25:56.680 But it takes as input some input, which is to say the string or the phrase, 00:25:56.680 --> 00:26:00.760 the sentence via which you want to prompt the human for his or her input. 00:26:00.760 --> 00:26:03.310 So for instance, I'll go ahead here and ask 00:26:03.310 --> 00:26:06.610 what is your name question mark space. 00:26:06.610 --> 00:26:10.810 And now this function is going to return to me a value, so to speak. 00:26:10.810 --> 00:26:12.700 I don't know how input is implemented. 00:26:12.700 --> 00:26:15.340 Someone else yesteryear implemented this for me.
00:26:15.340 --> 00:26:18.010 But I do know per its documentation that when I call it, 00:26:18.010 --> 00:26:21.310 it's going to do the equivalent of handing me back a slip of paper 00:26:21.310 --> 00:26:24.970 on which is written the user's input from his or her keyboard. 00:26:24.970 --> 00:26:28.660 But for me to make use of this input, I need to store it somewhere. 00:26:28.660 --> 00:26:30.720 And in a programming language, much like in math, 00:26:30.720 --> 00:26:32.690 you have access to what are called variables. 00:26:32.690 --> 00:26:36.160 Now a mathematician might call their variables x or y or z. 00:26:36.160 --> 00:26:38.630 And in programming could we also call our variables 00:26:38.630 --> 00:26:40.750 the same, but more conventional is to use 00:26:40.750 --> 00:26:43.660 names that are a little more descriptive-- actual words that 00:26:43.660 --> 00:26:49.210 describe their contents. And the means by which I can store the return value, 00:26:49.210 --> 00:26:49.810 so to speak, 00:26:49.810 --> 00:26:54.850 what is handed back from a function like input, is in a variable 00:26:54.850 --> 00:26:57.640 that I'll call perhaps aptly name. 00:26:57.640 --> 00:27:03.160 And in order to store the return value of input into a variable called name, 00:27:03.160 --> 00:27:05.958 it suffices simply to use a single equal sign. 00:27:05.958 --> 00:27:08.500 And this equal sign is not to be confused with the equal sign 00:27:08.500 --> 00:27:09.910 that you and I know in math. 00:27:09.910 --> 00:27:12.310 After all, in math, using an equal sign would 00:27:12.310 --> 00:27:15.700 imply that what is on the left equals what is on the right. 00:27:15.700 --> 00:27:17.050 And ultimately, that's our goal. 00:27:17.050 --> 00:27:20.190 But initially, we only know the value on the right-- 00:27:20.190 --> 00:27:24.050 the so-called return value from input that's handed back to me, so to speak, 00:27:24.050 --> 00:27:25.250 from the function.
00:27:25.250 --> 00:27:29.740 So in fact, what the equal sign in many programming languages is called-- 00:27:29.740 --> 00:27:32.770 Python among them-- is the so-called assignment operator. 00:27:32.770 --> 00:27:35.638 And so you can consider what's happening here as being this. 00:27:35.638 --> 00:27:37.180 On the right-hand side is a function. 00:27:37.180 --> 00:27:39.910 It's handing back some value, the user's name. 00:27:39.910 --> 00:27:43.660 The equal sign implies copy that value from right 00:27:43.660 --> 00:27:45.850 to left into whatever is on the left. 00:27:45.850 --> 00:27:48.520 And so if on the left I have a variable called name, 00:27:48.520 --> 00:27:51.250 that variable-- much like x or y or z-- is 00:27:51.250 --> 00:27:53.740 going to store the user's input ultimately, 00:27:53.740 --> 00:27:56.830 so I can use it subsequently in some other line of code. 00:27:56.830 --> 00:27:59.110 Now what might that subsequent line of code be? 00:27:59.110 --> 00:28:02.920 Well, suppose I'd like to print not just hello world but hello, David, 00:28:02.920 --> 00:28:04.520 or hello, someone else. 00:28:04.520 --> 00:28:08.590 Well, now I can simply structure the second line as taking as input 00:28:08.590 --> 00:28:10.670 the contents of that variable. 00:28:10.670 --> 00:28:15.340 So I might go ahead here and say print quote unquote hello comma, 00:28:15.340 --> 00:28:19.240 and I now somehow need to append to that shorter phrase 00:28:19.240 --> 00:28:21.217 the name that has actually been typed in. 00:28:21.217 --> 00:28:23.050 And there's a number of ways we can do this. 00:28:23.050 --> 00:28:25.310 For instance, we might do it as follows. 00:28:25.310 --> 00:28:27.670 It turns out in Python that you can use plus not 00:28:27.670 --> 00:28:30.490 only to add two numbers together but also, in some sense, 00:28:30.490 --> 00:28:34.780 to add two words or two strings as they're called in a language.
00:28:34.780 --> 00:28:38.590 And if I do plus in between hello and that variable name, 00:28:38.590 --> 00:28:43.000 the effect is going to be to join or to concatenate those two strings of text 00:28:43.000 --> 00:28:45.160 together so that what ultimately is printed 00:28:45.160 --> 00:28:49.580 is hopefully hello comma David or hello comma yourself. 00:28:49.580 --> 00:28:53.170 So let me go ahead and save this file and in my terminal window now run 00:28:53.170 --> 00:28:57.730 Python of hello1.py, I'll input my name David. 00:28:57.730 --> 00:29:01.810 And now notice my input is separated by one space from that question mark 00:29:01.810 --> 00:29:04.870 because on line one did I preemptively include 00:29:04.870 --> 00:29:07.210 that space between those double quotes. 00:29:07.210 --> 00:29:09.450 Now I'm going to go ahead and hit enter, the effect 00:29:09.450 --> 00:29:13.000 of which is to have input return that value, storing it 00:29:13.000 --> 00:29:16.150 in the variable called name, and then in line two, just go ahead 00:29:16.150 --> 00:29:19.270 and print hello comma followed by name. 00:29:19.270 --> 00:29:22.240 Now just to be clear that this is not hardcoded somewhere else, 00:29:22.240 --> 00:29:24.070 let me go ahead and run it one more time. 00:29:24.070 --> 00:29:25.730 Python hello1.py. 00:29:25.730 --> 00:29:28.570 And let's go ahead and type your name and there too 00:29:28.570 --> 00:29:30.130 do we see hello your name. 00:29:30.130 --> 00:29:33.340 So the program is now dynamic, and I've outputted dynamically 00:29:33.340 --> 00:29:35.020 whatever it is the human has typed in. 00:29:35.020 --> 00:29:37.840 But it turns out there's any number of other ways we can do this.
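A minimal sketch of the hello1.py idea just described, for readers following along in text. The prompt wording is taken from the narration; the name is hardcoded here in place of the call to input, purely as an assumption so the snippet runs without a keyboard.

```python
# Stand-in for: name = input("What is your name? ")
# (hardcoded so the sketch runs non-interactively)
name = "David"

# The single equal sign assigns right to left; the plus sign
# concatenates the two strings into one.
greeting = "hello, " + name
print(greeting)
```

Swapping the first line back to the commented-out input call gives the interactive version run in the terminal above.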
00:29:37.840 --> 00:29:40.060 And so just so that we've seen a few different ways, 00:29:40.060 --> 00:29:43.030 let me go ahead and create a new file called hello2.py 00:29:43.030 --> 00:29:46.600 wherein we have almost the same code, but we're displaying your name 00:29:46.600 --> 00:29:48.640 or mine a little bit differently. 00:29:48.640 --> 00:29:52.540 As before, I'll go ahead and define my variable called name, assign to it 00:29:52.540 --> 00:29:57.387 the return value of input asking the user for his or her name. 00:29:57.387 --> 00:29:59.720 And then on my second line of code, we'll again go ahead 00:29:59.720 --> 00:30:02.030 and say print hello comma. 00:30:02.030 --> 00:30:04.820 And then following that, I actually have a choice. 00:30:04.820 --> 00:30:07.970 I don't necessarily have to just concatenate 00:30:07.970 --> 00:30:10.880 one input onto that first string hello. 00:30:10.880 --> 00:30:15.125 I can actually combine these as two separate inputs to print. 00:30:15.125 --> 00:30:17.000 It turns out that print, like many functions, 00:30:17.000 --> 00:30:20.720 can take either zero or one or two or any number of inputs. 00:30:20.720 --> 00:30:23.000 You simply, as the programmer, need to separate them 00:30:23.000 --> 00:30:25.670 by commas, not the comma that's inside of the quotes. 00:30:25.670 --> 00:30:28.850 That's just my English grammar separating hello from your name 00:30:28.850 --> 00:30:30.230 but outside of the quotes. 00:30:30.230 --> 00:30:33.770 And so my text editor is displaying it a little bit disparately in white. 00:30:33.770 --> 00:30:35.690 So in this case, am I using the same function 00:30:35.690 --> 00:30:40.220 print passing in not one but two arguments or inputs, the second 00:30:40.220 --> 00:30:42.140 of which is that actual name. 00:30:42.140 --> 00:30:45.080 Now there's a minor bug here that I'm not yet anticipating, 00:30:45.080 --> 00:30:46.710 but let's see what happens next. 
00:30:46.710 --> 00:30:51.500 If I go ahead now and run Python hello2.py, type in my name, voila, 00:30:51.500 --> 00:30:54.380 it's so close and almost right. 00:30:54.380 --> 00:30:56.210 But here perhaps is my first bug. 00:30:56.210 --> 00:30:59.600 It's not necessarily a bug that's breaking the program altogether. 00:30:59.600 --> 00:31:03.670 But to be a little bit nit-picky, I think we could do a little bit better, 00:31:03.670 --> 00:31:06.920 say, grammatically. There seems to be two spaces instead of one, 00:31:06.920 --> 00:31:08.490 but where are they coming from? 00:31:08.490 --> 00:31:10.865 Well, it turns out that functions, when they take inputs, 00:31:10.865 --> 00:31:12.620 can also have default behavior. 00:31:12.620 --> 00:31:15.440 And in this case, print is designed by its authors 00:31:15.440 --> 00:31:17.630 whenever you pass in two or more arguments 00:31:17.630 --> 00:31:20.330 to just separate them by default with one space. 00:31:20.330 --> 00:31:22.530 Now I can fix this mistake in a couple of ways. 00:31:22.530 --> 00:31:25.520 But the simplest way is just to remove it from my own input, 00:31:25.520 --> 00:31:28.940 saving the file again and rerunning Python of hello2.py 00:31:28.940 --> 00:31:31.670 gives me now what is your name? 00:31:31.670 --> 00:31:32.450 David. 00:31:32.450 --> 00:31:35.090 And we're back-- hello, David. 00:31:35.090 --> 00:31:38.660 Now it turns out there are yet more ways that you can format information 00:31:38.660 --> 00:31:39.380 on the screen. 00:31:39.380 --> 00:31:41.210 And just so that we've seen one other way, 00:31:41.210 --> 00:31:43.890 allow me to create a fourth file called hello3.py. 00:31:43.890 --> 00:31:49.520 In hello3.py, am I going to start almost exactly the same declaring a variable 00:31:49.520 --> 00:31:53.000 called name, assigning to it the return value of input, 00:31:53.000 --> 00:31:55.640 asking the user for their name.
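Looking back at the hello2.py spacing bug, here is a small sketch of where the two spaces come from and two ways to get rid of them. Output is captured in an in-memory buffer only so the three variants can be compared side by side; the buffer and the hardcoded name are assumptions of this sketch, not part of the program on screen.

```python
import io

name = "David"  # stand-in for input("What is your name? ")
buffer = io.StringIO()

# Two arguments: print separates them with one space by default,
# so the space already inside the first string yields two spaces.
print("hello, ", name, file=buffer)

# The fix used above: drop the space from the string itself.
print("hello,", name, file=buffer)

# An alternative: keep the space in the string, override the separator.
print("hello, ", name, sep="", file=buffer)

lines = buffer.getvalue().splitlines()
print(lines)
```

The first entry has the doubled space; the other two read the same.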
00:31:55.640 --> 00:31:58.170 And then in my second line of code am I again going 00:31:58.170 --> 00:32:02.270 to use print starting with a parenthesis and a close parenthesis. 00:32:02.270 --> 00:32:06.650 In there am I ultimately going to say hello comma something, 00:32:06.650 --> 00:32:11.420 and the something this time is literally going to be name but not quite name 00:32:11.420 --> 00:32:12.290 on its own. 00:32:12.290 --> 00:32:14.630 Indeed, if I were to run this program now, 00:32:14.630 --> 00:32:17.060 what do you think might be printed? 00:32:17.060 --> 00:32:21.950 I've said to the program print hello comma name, but unfortunately, 00:32:21.950 --> 00:32:25.760 name is as written, N-A-M-E. And so I would literally see that on the screen 00:32:25.760 --> 00:32:27.350 no matter what I typed in. 00:32:27.350 --> 00:32:31.340 But it turns out in Python, if you surround your variable's name with 00:32:31.340 --> 00:32:34.340 curly braces, typically found just above the Enter key-- 00:32:34.340 --> 00:32:39.260 at least on a US keyboard-- you can tell Python that this N-A-M-E is actually 00:32:39.260 --> 00:32:43.130 the name of a variable, not just a string of text on its own. 00:32:43.130 --> 00:32:47.360 But in order to tell Python that it should treat this string, this input 00:32:47.360 --> 00:32:50.060 to print a little bit differently than usual, 00:32:50.060 --> 00:32:53.990 you have to fairly cryptically prefix it with a single F, 00:32:53.990 --> 00:32:56.000 thereby telling Python that this is what should 00:32:56.000 --> 00:32:58.910 be called a format string or f-string-- 00:32:58.910 --> 00:33:01.580 that is, a string of text, a sequence of characters 00:33:01.580 --> 00:33:04.400 that should be treated as formatted in a special way.
00:33:04.400 --> 00:33:06.380 And according to Python's own documentation, 00:33:06.380 --> 00:33:10.190 if you format your input using these curly braces inside of which 00:33:10.190 --> 00:33:12.770 is the name of a variable, Python will go 00:33:12.770 --> 00:33:15.530 through the trouble of plugging in the value of that variable 00:33:15.530 --> 00:33:18.260 there and therefore formatting it for you. 00:33:18.260 --> 00:33:23.390 And so I'll again go ahead and run Python of hello3.py, input my name, 00:33:23.390 --> 00:33:26.030 and voila, we're back to Hello, David. 00:33:26.030 --> 00:33:28.760 So what are the takeaways then from the simplest of programs? 00:33:28.760 --> 00:33:32.150 Well, we clearly have the ability to print information on the screen. 00:33:32.150 --> 00:33:34.850 But we have at our disposal a function, a piece of code 00:33:34.850 --> 00:33:38.420 that someone wrote long before me that takes as input perhaps just 00:33:38.420 --> 00:33:39.920 a string like hello world. 00:33:39.920 --> 00:33:43.170 But if you pass it multiple strings will it handle those as well. 00:33:43.170 --> 00:33:47.660 And if you pass it a special type of string will it handle that as well too. 00:33:47.660 --> 00:33:50.390 And so depending on the documentation for some language 00:33:50.390 --> 00:33:53.220 do you have these and so many more features available to you. 00:33:53.220 --> 00:33:56.270 And so one of the first steps in learning any programming language 00:33:56.270 --> 00:33:59.210 is not to take a formal lesson or class but simply, 00:33:59.210 --> 00:34:01.610 quite honestly, to read the documentation.
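To make the hello3.py contrast concrete, the sketch below puts the ordinary string and the f-string next to each other. As before, the name is hardcoded as a stand-in for input so the snippet can run unattended; that substitution is an assumption of the sketch.

```python
name = "David"  # stand-in for input("What is your name? ")

# Without the f prefix, the braces and the word name are printed literally.
plain = "hello, {name}"

# With the f prefix, {name} is replaced by the variable's value.
formatted = f"hello, {name}"

print(plain)
print(formatted)
```

Only the second line of output changes when a different name is supplied.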
00:34:01.610 --> 00:34:04.460 And once you have under your belt some knowledge of one or two 00:34:04.460 --> 00:34:08.179 or a few programming languages, it is much easier in the computer world 00:34:08.179 --> 00:34:11.540 to pick up new ones than I daresay it is in the human world where you 00:34:11.540 --> 00:34:15.355 have a much larger vocabulary at hand. 00:34:15.355 --> 00:34:16.730 All right, enough about printing. 00:34:16.730 --> 00:34:17.991 What more can we do? 00:34:17.991 --> 00:34:19.699 Well, programming languages can typically 00:34:19.699 --> 00:34:23.150 handle any number of arithmetic or mathematical operations-- 00:34:23.150 --> 00:34:25.790 addition, subtraction, and many others among them. 00:34:25.790 --> 00:34:28.880 So why don't we go ahead and create a program that quite simply prompts 00:34:28.880 --> 00:34:30.889 the user not just for their name or a string 00:34:30.889 --> 00:34:33.290 but rather two inputs, two numbers. 00:34:33.290 --> 00:34:35.912 And we'll call them more familiarly x and y. 00:34:35.912 --> 00:34:38.037 I'm going to go ahead and declare a variable called 00:34:38.037 --> 00:34:42.560 x, assigning to it from right to left the result of asking 00:34:42.560 --> 00:34:44.927 for the user's input of x. 00:34:44.927 --> 00:34:47.510 And then I'm going to go ahead and do precisely the same thing 00:34:47.510 --> 00:34:50.389 on another line of code defining a variable called y 00:34:50.389 --> 00:34:54.739 and assigning to it from right to left the return value of another call 00:34:54.739 --> 00:34:56.550 or invocation of input-- 00:34:56.550 --> 00:34:59.250 this time, prompting the user for y. 00:34:59.250 --> 00:35:02.520 And then I quite simply am going to go ahead and print out the result.
00:35:02.520 --> 00:35:04.590 So print x plus y-- 00:35:04.590 --> 00:35:07.260 this time not using any double quotes at all 00:35:07.260 --> 00:35:10.200 because what I literally want to print is x plus y, 00:35:10.200 --> 00:35:13.150 not the concatenation of two strings, not a string at all, 00:35:13.150 --> 00:35:15.480 but rather just a number. 00:35:15.480 --> 00:35:18.210 In my terminal window now will I go ahead and run Python 00:35:18.210 --> 00:35:20.680 on the name of this file, arithmetic.py. 00:35:20.680 --> 00:35:24.180 I'm prompted for x, which I'll say one. 00:35:24.180 --> 00:35:26.610 I'm prompted for y, so I'll say two. 00:35:26.610 --> 00:35:33.490 And indeed, 1 plus 2 is 12, but no, 1 plus 2 is not 12. 00:35:33.490 --> 00:35:35.500 So what's actually happened here? 00:35:35.500 --> 00:35:40.000 Well, this is the first real bug, if you will, that I've introduced to my code. 00:35:40.000 --> 00:35:44.860 Now step one here has x equals input, prompting the user for x, and step 00:35:44.860 --> 00:35:48.370 two has y equals input prompting the user for y-- 00:35:48.370 --> 00:35:51.700 little different from before, except that we've chosen different variable names. 00:35:51.700 --> 00:35:53.020 Now on line four-- 00:35:53.020 --> 00:35:55.180 and I've separated with this blank space just 00:35:55.180 --> 00:35:59.830 to make it a bit easier to read-- do I have print x plus y. 00:35:59.830 --> 00:36:02.260 But 1 plus 2 is surely not 12. 00:36:02.260 --> 00:36:03.940 But what appears to be happening? 00:36:03.940 --> 00:36:06.220 Well, it's no coincidence that the answer Python's 00:36:06.220 --> 00:36:09.730 giving me is my first input followed by my second. 00:36:09.730 --> 00:36:12.460 It would seem, in fact, that Python is concatenating 00:36:12.460 --> 00:36:15.820 my first input to my second, that is, joining them just 00:36:15.820 --> 00:36:17.800 like hello comma David. 00:36:17.800 --> 00:36:19.520 So why is that happening?
00:36:19.520 --> 00:36:22.000 Well, I would hope that plus when given two numbers 00:36:22.000 --> 00:36:24.040 would, in fact, add those two together. 00:36:24.040 --> 00:36:25.390 But here's a catch. 00:36:25.390 --> 00:36:29.350 It turns out that underneath the hood, Python, like many languages, 00:36:29.350 --> 00:36:32.380 actually has what are called types or data types-- 00:36:32.380 --> 00:36:35.050 that is to say, different ways of representing information 00:36:35.050 --> 00:36:37.870 if that information is a number like an integer 00:36:37.870 --> 00:36:40.420 or even a real number with a decimal point 00:36:40.420 --> 00:36:43.780 or a string of text that is a sequence of characters or maybe more 00:36:43.780 --> 00:36:48.070 generally, a value like true or false, a so-called Boolean value, and even 00:36:48.070 --> 00:36:49.060 others still. 00:36:49.060 --> 00:36:52.840 And in this case, indeed, it turns out-- and you would only know this by trial 00:36:52.840 --> 00:36:55.300 and error or by having read the documentation first-- 00:36:55.300 --> 00:36:58.480 does the input function built into Python return 00:36:58.480 --> 00:37:01.420 to you not a number, but a string. 00:37:01.420 --> 00:37:04.420 Even though what I typed on my keyboard looks like a number 00:37:04.420 --> 00:37:09.250 and surely is in practice, it's actually not being stored as such by Python. 00:37:09.250 --> 00:37:13.000 It's being stored instead in such a way that the computer is interpreting it 00:37:13.000 --> 00:37:15.130 as a string of text. 00:37:15.130 --> 00:37:17.140 So we have to be ever more explicit. 00:37:17.140 --> 00:37:19.810 And indeed, computers don't necessarily know what I intend. 00:37:19.810 --> 00:37:22.840 Maybe the goal at hand was to write a program that concatenates 00:37:22.840 --> 00:37:25.160 one number against another.
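The behavior described here can be reproduced in a few lines. This is only a sketch: the strings "1" and "2" are written out directly to stand in for what input would hand back after those keys were typed.

```python
# What input() hands back when the user types 1 and then 2:
# text, not numbers.
x = "1"
y = "2"

print(type(x).__name__)  # the type's name: str

# With two strings, plus concatenates rather than adds.
joined = x + y
print(joined)
```

The result is the two characters side by side, which is exactly the surprising 12 seen in the terminal.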
00:37:25.160 --> 00:37:28.120 And so if I really want the user's input to be treated as numbers, 00:37:28.120 --> 00:37:31.240 I somehow have to coerce it to such or convert it 00:37:31.240 --> 00:37:35.267 or more technically, to cast it from string to an int. 00:37:35.267 --> 00:37:37.600 Now it turns out Python has other functions with which I 00:37:37.600 --> 00:37:42.060 can fix this mistake, and I might do this in between these lines here. 00:37:42.060 --> 00:37:44.470 X should not just be whatever the user typed in. 00:37:44.470 --> 00:37:47.740 I want x to be the result of converting whatever 00:37:47.740 --> 00:37:53.320 the human typed in into the integer or int version of that string. 00:37:53.320 --> 00:37:58.420 Meanwhile, can I fix y by the same: y equals int of y, 00:37:58.420 --> 00:38:01.510 thereby telling Python even though, yes, the human typed something 00:38:01.510 --> 00:38:04.510 on a keyboard, therefore implying a string, 00:38:04.510 --> 00:38:07.180 go ahead and convert that one and that two 00:38:07.180 --> 00:38:11.320 to an actual integer underneath the hood, a pattern of bits, if you will, 00:38:11.320 --> 00:38:14.740 that represents not the ASCII value or the Unicode value 00:38:14.740 --> 00:38:17.710 that the user typed in but the underlying pattern of bits 00:38:17.710 --> 00:38:20.920 that represents one and represents two. 00:38:20.920 --> 00:38:25.870 Let's go ahead now and rerun this file as Python arithmetic.py, 00:38:25.870 --> 00:38:29.560 inputting again 1, inputting again 2, and lastly, hitting Enter. 00:38:29.560 --> 00:38:31.630 This time, we have three. 00:38:31.630 --> 00:38:35.458 Unfortunately, we've paid a bit of a price, albeit just a matter of style. 00:38:35.458 --> 00:38:38.500 I seem to have increased the length of this program from just three lines 00:38:38.500 --> 00:38:41.080 to five just to fix a simple mistake.
00:38:41.080 --> 00:38:45.250 But it turns out that just as functions take input and produce outputs, 00:38:45.250 --> 00:38:49.720 so can you pipeline them, so to speak, or nest them so that one function's 00:38:49.720 --> 00:38:51.520 output is another function's input. 00:38:51.520 --> 00:38:54.640 The result of which is that we can kind of tighten this up. 00:38:54.640 --> 00:38:57.840 I can instead on line 2 here delete what I have 00:38:57.840 --> 00:39:00.580 and on line one alone simply say that once 00:39:00.580 --> 00:39:02.980 I get back the return value of input-- 00:39:02.980 --> 00:39:06.490 the user's input, if you will, returned on a conceptual sheet of paper-- 00:39:06.490 --> 00:39:09.370 go ahead and pass it immediately to that function called 00:39:09.370 --> 00:39:12.220 int and surround the whole thing with parentheses, 00:39:12.220 --> 00:39:14.885 thereby passing the output of input into the input of int 00:39:14.885 --> 00:39:19.450 and then assigning the result from right to left. 00:39:19.450 --> 00:39:23.680 And again, here can I get rid of line 3, passing the output of this input 00:39:23.680 --> 00:39:26.860 to the input of this int and again surrounding it 00:39:26.860 --> 00:39:30.880 with parentheses in order to pass its output into its input, 00:39:30.880 --> 00:39:33.970 ultimately assigning from right to left that value. 00:39:33.970 --> 00:39:36.220 Let's, to be sure, go ahead and save this file. 00:39:36.220 --> 00:39:40.940 And then in my terminal window, run one final time Python of arithmetic.py, 00:39:40.940 --> 00:39:46.000 inputting 1, inputting 2, and still do we get 3. 00:39:46.000 --> 00:39:48.610 Suppose, though, that we not only want to take a user's input 00:39:48.610 --> 00:39:52.090 but make a decision based on it, taking out for a spin 00:39:52.090 --> 00:39:54.830 this notion of a condition with some Boolean expressions. 00:39:54.830 --> 00:39:55.940 How might we do that?
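The tightened arithmetic.py might look roughly like the sketch below. Because a snippet that waits on the keyboard is awkward to run unattended, it swaps in a hypothetical helper named fake_input that returns canned answers; that helper and the prompt strings are assumptions of this sketch, and in the real file each call would simply be input.

```python
answers = iter(["1", "2"])

def fake_input(prompt):
    """Hypothetical stand-in for input(): returns canned text
    instead of reading the keyboard."""
    return next(answers)

# Nesting: the string returned by the inner call is passed
# straight to int, and the resulting integer is assigned to x.
x = int(fake_input("x: "))
y = int(fake_input("y: "))

total = x + y
print(total)
```

With both values converted before they are ever stored, plus now performs addition rather than concatenation.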
00:39:55.940 --> 00:39:58.690 Well, let me go ahead and create here a file called conditions.py, 00:39:58.690 --> 00:40:02.110 inside of which I'm again going to ask the user for two inputs, 00:40:02.110 --> 00:40:06.550 call it x and y, and then I'd like to determine whether x is less than y, 00:40:06.550 --> 00:40:09.010 greater than y, or exactly the same. 00:40:09.010 --> 00:40:12.640 Well, as before, I can declare my variable called x, assign to it-- 00:40:12.640 --> 00:40:14.200 preemptively this time-- 00:40:14.200 --> 00:40:18.550 the result of passing to int the return value of input 00:40:18.550 --> 00:40:21.940 asking the user for just x and then define another variable 00:40:21.940 --> 00:40:24.610 called y, assigning to it the return value 00:40:24.610 --> 00:40:27.970 of int, which is passed, in turn, the return value of input, 00:40:27.970 --> 00:40:29.870 asking the user for y. 00:40:29.870 --> 00:40:32.330 And now with these two numbers in hand, am 00:40:32.330 --> 00:40:34.790 I going to proceed to do the following. 00:40:34.790 --> 00:40:39.740 If x is less than y followed by a colon, indented below that, 00:40:39.740 --> 00:40:41.900 I'm going to say quite simply, well, you know what? 00:40:41.900 --> 00:40:46.130 X is less than y, simply to inform the user as much. 00:40:46.130 --> 00:40:47.760 And then back aligns with the if. 00:40:47.760 --> 00:40:52.280 Am I going to say else if x is greater than y with a colon 00:40:52.280 --> 00:40:57.200 and then indented below that, print x is greater than y. 00:40:57.200 --> 00:40:59.510 Now those are the semantics that I intend, 00:40:59.510 --> 00:41:01.760 but it turns out you have to read the fine print. 00:41:01.760 --> 00:41:04.460 In Python, it is not else if that you can use. 00:41:04.460 --> 00:41:07.730 Humans some years ago decided that slightly more succinct than else 00:41:07.730 --> 00:41:10.310 if would be literally elif. 
00:41:10.310 --> 00:41:14.840 And that, in fact, is the correct syntax to use when you have a second fork here 00:41:14.840 --> 00:41:15.590 in the road. 00:41:15.590 --> 00:41:20.240 But indeed when we have now a third, we might do this else if-- 00:41:20.240 --> 00:41:21.360 there I go again-- 00:41:21.360 --> 00:41:29.270 elif x equals y, let me go ahead and say print x equals y. 00:41:29.270 --> 00:41:33.170 But there's a bug here already because equal does not mean what you think. 00:41:33.170 --> 00:41:36.650 Indeed, before we've been using the equal sign as the assignment operator, 00:41:36.650 --> 00:41:39.560 so to speak, copying from right to left some value. 00:41:39.560 --> 00:41:42.170 And in fact, here on line 8, there's already a bug. 00:41:42.170 --> 00:41:44.760 Using a single equal sign between my x and y 00:41:44.760 --> 00:41:47.510 here would have the effect not of comparing those two values 00:41:47.510 --> 00:41:50.600 but instead copying from right to left that value in y 00:41:50.600 --> 00:41:55.490 into that value for x, thereby making them equal no matter what. 00:41:55.490 --> 00:41:58.130 And so it turns out that we humans painted ourselves 00:41:58.130 --> 00:42:00.170 into a bit of a corner some years ago. 00:42:00.170 --> 00:42:03.410 We've already used equal to assign one thing to another. 00:42:03.410 --> 00:42:05.540 So it turns out that humans in many languages 00:42:05.540 --> 00:42:09.570 decided well, let's still use equal but two of them back to back. 00:42:09.570 --> 00:42:13.130 And so if you want to use not the assignment operator but the equality 00:42:13.130 --> 00:42:16.370 operator, do you want to use two of these things back to back. 00:42:16.370 --> 00:42:20.610 And so now I have a program that asks quite simply if x less than y, print 00:42:20.610 --> 00:42:21.110 as much. 00:42:21.110 --> 00:42:24.190 Elif x greater than y, print as much then. 
00:42:24.190 --> 00:42:29.120 Elif x equals y, print the same, x equals y. 00:42:29.120 --> 00:42:32.600 But we don't necessarily need this third condition, it turns out. 00:42:32.600 --> 00:42:35.390 Logically, if you've got two numbers, two integers, 00:42:35.390 --> 00:42:39.200 I'm pretty sure, by definition, one will either be greater than the other, 00:42:39.200 --> 00:42:42.200 less than the other, or exactly the same. 00:42:42.200 --> 00:42:45.050 And so we can save a tiny bit of time here 00:42:45.050 --> 00:42:47.300 by not even asking that third question. 00:42:47.300 --> 00:42:51.200 If I know that x is not less than y and I know that x is not greater than y, 00:42:51.200 --> 00:42:57.110 I might as well just infer logically that they must be actually equal. 00:42:57.110 --> 00:42:59.720 And so here we have the first of my Python programs 00:42:59.720 --> 00:43:02.570 that much like my pseudocode for finding Mike Smith allows 00:43:02.570 --> 00:43:05.150 me to take the user's input and then compare it 00:43:05.150 --> 00:43:09.120 in order to take a different fork in the road based on its value. 00:43:09.120 --> 00:43:13.880 So my Boolean expressions here are x less than y and x greater than y 00:43:13.880 --> 00:43:14.780 and that's it. 00:43:14.780 --> 00:43:18.230 And my conditions here or the syntax with which I induce these branches 00:43:18.230 --> 00:43:21.110 are my if, my elif and my else. 00:43:21.110 --> 00:43:23.060 But important in Python-- 00:43:23.060 --> 00:43:25.730 unlike in pseudocode-- is some of this syntax. 00:43:25.730 --> 00:43:30.710 The colon very specifically says do the following if this is true, 00:43:30.710 --> 00:43:33.140 and the indentation, the multiple spaces-- 00:43:33.140 --> 00:43:34.340 here I typed four-- 00:43:34.340 --> 00:43:35.990 is ever so important as well.
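The three-way fork described above can be sketched roughly as follows. The function name compare and the exact wording of the messages are assumptions; the program on screen prints inside the branches rather than returning a string.

```python
def compare(x, y):
    """Describe how x relates to y using if, elif, and else."""
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        # Neither less nor greater, so the two must be equal.
        return "x is equal to y"


print(compare(1, 2))  # x is less than y
print(compare(2, 1))  # x is greater than y
print(compare(1, 1))  # x is equal to y
```

Note that a comparison for equality, had it been kept as a third elif, would need two equal signs (x == y), since a single equal sign is assignment.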
00:43:35.990 --> 00:43:38.330 Whereas many programming languages are a bit loose 00:43:38.330 --> 00:43:40.460 when it comes to whitespace, so to speak, 00:43:40.460 --> 00:43:42.650 how many times you hit the spacebar or tab, 00:43:42.650 --> 00:43:46.880 Python enforces that you have the same amount of indentation everywhere. 00:43:46.880 --> 00:43:52.430 And so if I want lines four and six and eight here to all line up logically, 00:43:52.430 --> 00:43:54.440 they must literally do so in the file. 00:43:54.440 --> 00:43:57.380 Similarly, must lines five and seven and nine line 00:43:57.380 --> 00:44:00.140 up just right so that they are only executed 00:44:00.140 --> 00:44:02.690 if the lines just above them are true. 00:44:02.690 --> 00:44:04.370 Let me go ahead and save this file. 00:44:04.370 --> 00:44:07.280 And in my terminal window, run Python of conditions.py, 00:44:07.280 --> 00:44:11.630 typing in shall we say 1 for x, 2 for y. 00:44:11.630 --> 00:44:13.520 And indeed, x is less than y. 00:44:13.520 --> 00:44:15.950 Let's run it again, this time, flipping things around. 00:44:15.950 --> 00:44:16.970 X is 2. 00:44:16.970 --> 00:44:17.960 Y is 1. 00:44:17.960 --> 00:44:20.240 And indeed, x is greater than y. 00:44:20.240 --> 00:44:24.680 And a third time, where equality should be inferred: if x is 1 and y is 1, 00:44:24.680 --> 00:44:26.810 then, indeed, x equals y. 00:44:26.810 --> 00:44:28.730 Now programming is not all about numbers. 00:44:28.730 --> 00:44:30.650 In fact, pictured here in a program I wrote 00:44:30.650 --> 00:44:34.610 already is answer.py wherein we have lines of code that, again, 00:44:34.610 --> 00:44:38.310 prompt the user for input but this time, leave it as a string. 00:44:38.310 --> 00:44:41.360 And so on line four am I asking the user for his or her answer 00:44:41.360 --> 00:44:43.220 to a yes or no question.
00:44:43.220 --> 00:44:47.660 Presumably the user might type little y or little n or perhaps capital Y 00:44:47.660 --> 00:44:48.290 or capital N. 00:44:48.290 --> 00:44:51.710 And indeed, I'd like this program to handle any number of those cases, 00:44:51.710 --> 00:44:52.760 as any program might. 00:44:52.760 --> 00:44:56.450 On line 7 here am I asking then two possible questions. 00:44:56.450 --> 00:45:00.050 If c-- the name of the variable I gave to the user's input-- 00:45:00.050 --> 00:45:05.150 equals equals capital Y. Or to be robust, that variable c equals 00:45:05.150 --> 00:45:06.620 equals lowercase y. 00:45:06.620 --> 00:45:10.100 Let me go ahead and conclude and print that the user meant yes. 00:45:10.100 --> 00:45:13.610 Meanwhile, if c equals equals n capitalized 00:45:13.610 --> 00:45:17.510 or if c equals equals n in lowercase, similarly 00:45:17.510 --> 00:45:20.750 do I want to conclude that the user meant no. 00:45:20.750 --> 00:45:24.290 And so here the new operative word is quite literally or. 00:45:24.290 --> 00:45:27.050 Python tends to be fairly English-like, 00:45:27.050 --> 00:45:29.420 even more so than C and other languages, in that 00:45:29.420 --> 00:45:31.480 if you want to do something or something else, 00:45:31.480 --> 00:45:33.520 you literally say quite simply or. 00:45:33.520 --> 00:45:37.030 If I wanted both situations to be true, albeit illogically, 00:45:37.030 --> 00:45:39.340 could I use the actual word and. 00:45:39.340 --> 00:45:42.430 But of course, the user's input can't simultaneously be capital 00:45:42.430 --> 00:45:44.080 Y and lowercase y. 00:45:44.080 --> 00:45:47.088 And so here is using or apt. 00:45:47.088 --> 00:45:48.880 Now when programming, you don't have to use 00:45:48.880 --> 00:45:52.000 only those functions that are handed to you by the particular language. 00:45:52.000 --> 00:45:54.700 You yourself can invent functions of your own.
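The yes or no check in answer.py, as narrated above, could be sketched like this. The function name interpret and the returned words are assumptions made so the logic can be exercised without typing; the variable name c follows the narration.

```python
def interpret(c):
    """Map a typed answer to "yes" or "no" using the or keyword."""
    if c == "Y" or c == "y":
        return "yes"
    elif c == "N" or c == "n":
        return "no"
    # Anything else is not recognized in this sketch.
    return None


print(interpret("Y"))  # yes
print(interpret("n"))  # no
```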
00:45:54.700 --> 00:45:57.280 For instance, let me go ahead and in a file called 00:45:57.280 --> 00:46:01.840 return.py implement a program that takes as input an integer 00:46:01.840 --> 00:46:06.290 or number from the user and then quite simply prints out the square thereof. 00:46:06.290 --> 00:46:09.060 So if the human types in 2, I'll print out 4. 00:46:09.060 --> 00:46:11.710 If the human types in 3, I'll print out 9. 00:46:11.710 --> 00:46:13.910 And so how might we go about doing this? 00:46:13.910 --> 00:46:16.660 Well, perhaps in a familiar way now, x shall 00:46:16.660 --> 00:46:21.340 equal the result of converting to an integer whatever the user's input is 00:46:21.340 --> 00:46:23.380 after prompting them for x. 00:46:23.380 --> 00:46:28.030 And then I'm going to go ahead quite simply and print out, well, x times x. 00:46:28.030 --> 00:46:29.890 That is the square of x. 00:46:29.890 --> 00:46:32.800 I'll go ahead and save this and in my terminal window 00:46:32.800 --> 00:46:39.260 run Python on return.py and as proposed, square 2 and as proposed, square 3. 00:46:39.260 --> 00:46:43.600 And indeed, this program works exactly like this but to square a value 00:46:43.600 --> 00:46:45.310 has kind of a nice ring to it. 00:46:45.310 --> 00:46:48.190 And the fact that it happens to be implemented as x times x 00:46:48.190 --> 00:46:51.490 is really just a mathematical implementation detail-- 00:46:51.490 --> 00:46:54.550 something that I shouldn't really have to worry about or remember. 00:46:54.550 --> 00:46:57.310 I would just like to square the user's input. 00:46:57.310 --> 00:46:59.950 So wouldn't it be nice if there were a function in Python-- 00:46:59.950 --> 00:47:01.450 or any language for that matter-- 00:47:01.450 --> 00:47:03.040 quite simply called square? 00:47:03.040 --> 00:47:06.370 Indeed, I can make that happen, whether or not it exists, 00:47:06.370 --> 00:47:08.680 and simply define it myself.
00:47:08.680 --> 00:47:11.980 And so here I'm going to go and do the following. 00:47:11.980 --> 00:47:13.990 Using Python's keyword called def-- 00:47:13.990 --> 00:47:15.190 short for define-- 00:47:15.190 --> 00:47:18.100 I'm going to go ahead and define a function called square. 00:47:18.100 --> 00:47:22.300 I'm going to specify to Python that that new function shall take input 00:47:22.300 --> 00:47:26.140 that I'll arbitrarily but conventionally call n for number. 00:47:26.140 --> 00:47:29.530 And then with a colon as before and some indentation, too, 00:47:29.530 --> 00:47:34.810 I'm going to go ahead and return quite simply n times n. 00:47:34.810 --> 00:47:36.430 In other words, the math is the same. 00:47:36.430 --> 00:47:38.230 The implementation details are the same. 00:47:38.230 --> 00:47:41.320 But what's new here is this new keyword return. 00:47:41.320 --> 00:47:44.260 Just like with the input function built into Python, 00:47:44.260 --> 00:47:47.260 some human in that function's own implementation 00:47:47.260 --> 00:47:51.640 had a line of code that said return to the user whatever they have typed in. 00:47:51.640 --> 00:47:55.910 Here am I doing the same but returning not some user's input but rather 00:47:55.910 --> 00:47:57.340 n times n. 00:47:57.340 --> 00:48:00.430 And so you can think of this function called square much like that input 00:48:00.430 --> 00:48:04.060 function, jotting down on a digital piece of paper that value, 00:48:04.060 --> 00:48:08.680 handing it back to the caller or whoever uses this function, 00:48:08.680 --> 00:48:11.900 and letting them use it as they see fit. 00:48:11.900 --> 00:48:18.070 So now rather than use x times x myself can I more conceptually clearly say 00:48:18.070 --> 00:48:20.680 square of the user's input x. 00:48:20.680 --> 00:48:23.860 And the fact that x is not the same as n is quite OK. 00:48:23.860 --> 00:48:27.310 It is this function square that presumes to call its own input n.
00:48:27.310 --> 00:48:32.380 I can call my own input to square whatever I want, say, x. 00:48:32.380 --> 00:48:36.250 So let's go ahead now and run this program and see what happens. 00:48:36.250 --> 00:48:41.180 Based on these definitions, it would seem that I could square x in this way. 00:48:41.180 --> 00:48:45.890 If I go ahead and run this program again, Python of return.py, and type 2, 00:48:45.890 --> 00:48:48.550 I get more output than I surely intended. 00:48:48.550 --> 00:48:52.390 In fact, this is the first of my truly bad mistakes. 00:48:52.390 --> 00:48:55.360 Here do I see what's called a traceback, which is sort of a trace 00:48:55.360 --> 00:48:57.970 or a log of everything the computer tried to do. 00:48:57.970 --> 00:49:01.990 And you'll see some hints, however arcane this output, that line two is 00:49:01.990 --> 00:49:03.730 where my mistake probably is. 00:49:03.730 --> 00:49:06.520 In particular, I have some kind of name error in Python 00:49:06.520 --> 00:49:11.170 where the name square is not defined, and yet it's right here. 00:49:11.170 --> 00:49:13.630 Well, it turns out that Python and a lot of languages 00:49:13.630 --> 00:49:15.460 take things fairly literally. 00:49:15.460 --> 00:49:17.290 And if, when interpreting your file, 00:49:17.290 --> 00:49:19.720 they're reading top to bottom, left to right, 00:49:19.720 --> 00:49:23.860 unfortunately, it's too late to define square on line four 00:49:23.860 --> 00:49:27.370 if you yourself want to use it on line two. 00:49:27.370 --> 00:49:29.320 But we could surely fix this logically, 00:49:29.320 --> 00:49:34.510 as by moving that code up top down below, defining square at the top, 00:49:34.510 --> 00:49:39.220 writing my own logic below, now trusting that Python will see square and only 00:49:39.220 --> 00:49:41.380 use it on line five. 00:49:41.380 --> 00:49:46.210 Let's go ahead and save this file, rerunning Python of return.py, 00:49:46.210 --> 00:49:47.290 again typing 2.
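A small sketch of the ordering problem and its first fix, assuming a file laid out like return.py as narrated. The try and except around the early call is an addition for illustration only, so the sketch keeps running after the error, and the fixed value of x stands in for the user's typed input.

```python
# Using square before its def has run raises a NameError,
# because Python reads the file top to bottom.
try:
    square(2)
except NameError as error:
    print("too early:", error)


def square(n):
    """Return n multiplied by itself."""
    return n * n


# Now that the definition has been seen, the call succeeds.
x = 2  # stands in for int(input("x: "))
print(square(x))  # 4
```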
00:49:47.290 --> 00:49:49.690 And voila, now it works. 00:49:49.690 --> 00:49:53.110 But this isn't quite the most conventional way to solve this problem. 00:49:53.110 --> 00:49:56.380 As naive as Python is reading your code top to bottom, 00:49:56.380 --> 00:49:59.740 it's a bit of a regression now, a bit of a mistake 00:49:59.740 --> 00:50:01.780 that I'm putting the actual code that I care 00:50:01.780 --> 00:50:04.600 about at the bottom and the actual code that I 00:50:04.600 --> 00:50:08.950 was trying to abstract away, if you will, and give name to at the very top. 00:50:08.950 --> 00:50:11.380 At the end of this day, the program I care about 00:50:11.380 --> 00:50:14.740 are these lines here, not my implementation of square. 00:50:14.740 --> 00:50:17.890 And so I would actually prefer, albeit a bit nit pickily, 00:50:17.890 --> 00:50:20.590 to put that code actually where it was. 00:50:20.590 --> 00:50:24.040 And so if you do that, thereby keeping the main part of your program 00:50:24.040 --> 00:50:28.460 at the top, as is convention, you still have to solve the same problem. 00:50:28.460 --> 00:50:29.870 So how might we do that? 00:50:29.870 --> 00:50:33.160 Well, the Pythonic way or the conventional way in Python 00:50:33.160 --> 00:50:34.210 is to do this-- 00:50:34.210 --> 00:50:38.020 to define a function that most people call itself main, again, 00:50:38.020 --> 00:50:42.430 ending it with a colon, indenting below that the lines you have written. 00:50:42.430 --> 00:50:44.920 And then at the bottom of the file is the very last thing 00:50:44.920 --> 00:50:49.330 you do, telling Python to call that function called main. 00:50:49.330 --> 00:50:51.300 Because here now is what Python will do. 
00:50:51.300 --> 00:50:55.060 It will read this file in interpreting it top to bottom, left to right, 00:50:55.060 --> 00:50:59.380 defining a function called main and then here defining a function called square 00:50:59.380 --> 00:51:02.890 and then here calling one of those two functions, which 00:51:02.890 --> 00:51:04.760 in turn, calls the second. 00:51:04.760 --> 00:51:09.070 But in this way have you taken care to define all of your functions first 00:51:09.070 --> 00:51:13.250 and never calling any of them until everything's been defined. 00:51:13.250 --> 00:51:16.570 Now so that you've seen it too, it's not quite conventional to just run 00:51:16.570 --> 00:51:17.710 main at the bottom. 00:51:17.710 --> 00:51:21.310 Instead, you'll typically see a more magical incantation like this. 00:51:21.310 --> 00:51:28.060 If underscore name underscore underscore equals equals quote unquote underscore 00:51:28.060 --> 00:51:31.960 underscore main underscore colon, indented below that 00:51:31.960 --> 00:51:33.910 will be your actual call to main. 00:51:33.910 --> 00:51:36.610 This, for more arcane reasons, ensures that if you're 00:51:36.610 --> 00:51:39.280 using a small program as part of a bigger one, 00:51:39.280 --> 00:51:43.010 it won't necessarily get executed at the wrong time. 00:51:43.010 --> 00:51:44.860 But logically, what the key takeaways here 00:51:44.860 --> 00:51:47.940 are what you can actually do by defining your own functions. 00:51:47.940 --> 00:51:49.600 Here too do we have an abstraction. 00:51:49.600 --> 00:51:51.280 What does it mean to square two values? 00:51:51.280 --> 00:51:53.920 Simply multiplying one against the other. 00:51:53.920 --> 00:51:58.030 But it would be nice to just refer to that as a verb unto itself like square. 00:51:58.030 --> 00:52:02.110 And so by defining this function do we now abstract away that multiplication 00:52:02.110 --> 00:52:06.093 and just treat this as the idea we actually care about. 
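The layout just described, with main at the top, square below it, and the call to main last, might be sketched as follows. The fixed value of x is an assumption standing in for the user's input so the sketch runs unattended.

```python
def main():
    """The part of the program we actually care about."""
    x = 3  # stands in for int(input("x: "))
    print(square(x))


def square(n):
    """Implementation detail, abstracted away behind a name."""
    return n * n


# Both functions are defined by the time this line is reached.
# The check keeps main from running if this file is imported elsewhere.
if __name__ == "__main__":
    main()
```

Calling main only at the very bottom is what lets square be defined after main and still be found when main runs.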
00:52:06.093 --> 00:52:08.260 Now, of course, we're not saving all that much time, 00:52:08.260 --> 00:52:10.240 and my program's even bigger than need be. 00:52:10.240 --> 00:52:13.780 But it's demonstrative of this principle of abstracting away implementation 00:52:13.780 --> 00:52:18.190 details and building your more interesting product on top of work 00:52:18.190 --> 00:52:21.250 that you or someone else has already done. 00:52:21.250 --> 00:52:23.830 Now what if we want to do more than just square a number? 00:52:23.830 --> 00:52:28.000 We instead want to prompt the user for input, convert that input to a number, 00:52:28.000 --> 00:52:31.420 and then ensure that that number is the type of number we want. 00:52:31.420 --> 00:52:35.530 Indeed, it's not uncommon in a program to ask the user for a positive 00:52:35.530 --> 00:52:37.090 integer-- something useful-- 00:52:37.090 --> 00:52:40.030 again and again until he or she provides just that. 00:52:40.030 --> 00:52:43.630 For instance, the user might type 0 or a negative number or something else still, 00:52:43.630 --> 00:52:45.880 but you want to pester them again and again until they 00:52:45.880 --> 00:52:48.670 provide exactly the input you expect. 00:52:48.670 --> 00:52:51.190 Much like in a website where you're forced to type a number 00:52:51.190 --> 00:52:53.800 or email address or something else, similarly 00:52:53.800 --> 00:52:56.330 can we do that in Python in code. 00:52:56.330 --> 00:52:58.120 So let's go ahead and do exactly that. 00:52:58.120 --> 00:53:02.080 And assume for the moment that there exists already a function called, say, 00:53:02.080 --> 00:53:03.670 get positive int-- 00:53:03.670 --> 00:53:06.880 a function whose purpose in life is to get from the human an 00:53:06.880 --> 00:53:09.520 integer from one on up. 00:53:09.520 --> 00:53:11.740 Let me go ahead and preemptively this time 00:53:11.740 --> 00:53:14.470 define my own main function with def.
00:53:14.470 --> 00:53:19.060 And inside of that code, go ahead and declare a variable called i for integer 00:53:19.060 --> 00:53:21.520 and then just presume for the moment to call 00:53:21.520 --> 00:53:25.420 a function called get positive int, which itself will take a prompt as 00:53:25.420 --> 00:53:29.230 before asking the user, say, for i. And what do I 00:53:29.230 --> 00:53:30.770 want to do with this number? 00:53:30.770 --> 00:53:34.810 Well, let's keep it simple for now and just print i itself. 00:53:34.810 --> 00:53:38.260 But I now need to implement that function called get positive int. 00:53:38.260 --> 00:53:42.340 So for that, I can use def and say def get positive int. 00:53:42.340 --> 00:53:44.965 But I need this function to take itself a prompt. 00:53:44.965 --> 00:53:46.840 And I'm going to go ahead and call it exactly 00:53:46.840 --> 00:53:49.270 that, which is to say when I call this function, 00:53:49.270 --> 00:53:52.180 as I've done on line two passing in some string of text 00:53:52.180 --> 00:53:56.620 that I want the user to see, well, in my definition of get positive int on line 00:53:56.620 --> 00:54:00.130 five, I need to tell Python to give that input a name, 00:54:00.130 --> 00:54:02.170 so I can refer to it ultimately. 00:54:02.170 --> 00:54:04.830 Because, indeed, what I'm going to do after adding that colon 00:54:04.830 --> 00:54:07.180 and indenting underneath is ultimately 00:54:07.180 --> 00:54:10.930 call input, passing in precisely that prompt. 00:54:10.930 --> 00:54:13.540 After all, get positive int is not going to presume 00:54:13.540 --> 00:54:18.310 to know in advance what the programmer wants to prompt the user with. 00:54:18.310 --> 00:54:23.380 Instead, it's going to get it just in time via that input or argument. 00:54:23.380 --> 00:54:27.160 But I need to pester the user again and again if they don't actually 00:54:27.160 --> 00:54:29.410 give me a positive int.
00:54:29.410 --> 00:54:31.120 And so how might I do that? 00:54:31.120 --> 00:54:35.470 Well, just like in pseudocode, we might define for ourselves loops-- 00:54:35.470 --> 00:54:37.510 blocks of code that do something again 00:54:37.510 --> 00:54:42.760 and again, as with go to step two before. So can I do that in Python 00:54:42.760 --> 00:54:46.270 in any number of ways, but perhaps the simplest here is this-- 00:54:46.270 --> 00:54:48.490 to simply say you know what, Python? 00:54:48.490 --> 00:54:52.930 Go ahead and give me an infinite loop while true-- 00:54:52.930 --> 00:54:57.340 while being my operative word here, inducing a loop while something 00:54:57.340 --> 00:54:58.310 is true. 00:54:58.310 --> 00:55:01.270 Well, you know what's true always is the word true. 00:55:01.270 --> 00:55:04.120 And indeed, built into Python are Boolean values-- 00:55:04.120 --> 00:55:08.650 true and false literally-- by definition, capital T and capital F. 00:55:08.650 --> 00:55:11.470 So by saying while true colon, I'm saying Python, please 00:55:11.470 --> 00:55:16.090 go ahead and do something again and again until I tell you to stop. 00:55:16.090 --> 00:55:19.420 Well, what do I want Python to do in this loop? 00:55:19.420 --> 00:55:22.660 I want to go ahead and declare a variable called say n, 00:55:22.660 --> 00:55:26.080 assign to it the return value of calling that int function, 00:55:26.080 --> 00:55:28.660 passing to it the output of input. 00:55:28.660 --> 00:55:35.530 And then and only then if the human has obliged and given me a positive int, 00:55:35.530 --> 00:55:39.430 I'll go ahead and say, well, if n greater than zero, 00:55:39.430 --> 00:55:42.255 go ahead and break out of this loop. 00:55:42.255 --> 00:55:45.130 So a different approach than we saw in pseudocode where I simply said 00:55:45.130 --> 00:55:47.320 go to and go to and go to again. 00:55:47.320 --> 00:55:52.240 Here I've instead said, Python, do this forever until I say break.
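A sketch of the loop being built here, assuming the function is spelled get_positive_int in code. The read parameter is a hypothetical addition, defaulting to input, so that the loop can be exercised with simulated keystrokes; the version on screen calls input directly.

```python
def get_positive_int(prompt, read=input):
    """Keep prompting until the user supplies an integer greater than zero."""
    while True:
        n = int(read(prompt))
        if n > 0:
            break  # leave the otherwise infinite loop
    return n  # hand the accepted value back to the caller


# Simulate a user typing -1, then 0, then 1.
answers = iter(["-1", "0", "1"])
i = get_positive_int("i: ", read=lambda prompt: next(answers))
print(i)  # 1
```

This sketch does not handle non-numeric text; int would raise a ValueError in that case.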
00:55:52.240 --> 00:55:55.930 And only once n is greater than zero, as per the user's input, 00:55:55.930 --> 00:56:00.280 do I break out of this loop entirely and therefore return 00:56:00.280 --> 00:56:03.460 when I'm ready that value called n. 00:56:03.460 --> 00:56:07.090 And so here on my last line of code am I again using return, handing back, 00:56:07.090 --> 00:56:09.460 if you will, a sheet of paper on which is that number. 00:56:09.460 --> 00:56:14.620 But I only reach this line 10 after I've said break on line 9, 00:56:14.620 --> 00:56:19.960 and so does this function get positive int ultimately return exactly that. 00:56:19.960 --> 00:56:23.290 So as always, let me go ahead now and save this file but only 00:56:23.290 --> 00:56:26.020 after adding that last cryptic line. 00:56:26.020 --> 00:56:31.540 If the name of this file is implicitly underscore underscore main, 00:56:31.540 --> 00:56:33.880 then do I want to go ahead and call main so 00:56:33.880 --> 00:56:37.570 that we avoid all of those issues of code in the wrong order. 00:56:37.570 --> 00:56:40.660 I'll go ahead and click Save and do Python of positive.py, 00:56:40.660 --> 00:56:45.010 providing an input, say, negative 1, being 00:56:45.010 --> 00:56:47.980 prompted again for a number so negative 2-- still 00:56:47.980 --> 00:56:50.020 not a positive int nor a zero. 00:56:50.020 --> 00:56:52.810 But if I finally type a value like one do 00:56:52.810 --> 00:56:56.240 I actually see the one that I inputted. 00:56:56.240 --> 00:56:58.660 But it turns out Python supports other types of loops 00:56:58.660 --> 00:57:02.350 as well, not just via this keyword called while but actually 00:57:02.350 --> 00:57:04.195 via a preposition called for. 00:57:04.195 --> 00:57:07.660 For instance, suppose that I want to implement a program that, 00:57:07.660 --> 00:57:10.090 not unlike a charting program like Excel, 00:57:10.090 --> 00:57:12.520 prints for me some kind of bar chart.
00:57:12.520 --> 00:57:15.340 These bar charts will be purely textual using, say, 00:57:15.340 --> 00:57:17.470 hash marks to represent values. 00:57:17.470 --> 00:57:20.530 But to do this, I'm going to have to prompt the user for input 00:57:20.530 --> 00:57:24.490 and then print out precisely that many hashes horizontally. 00:57:24.490 --> 00:57:26.330 Well, let's see what I get. 00:57:26.330 --> 00:57:29.650 I'm going to go ahead and as always prompt the user for input. 00:57:29.650 --> 00:57:31.720 We'll call it say n. 00:57:31.720 --> 00:57:36.220 And that user's input shall be converted via int after asking them 00:57:36.220 --> 00:57:38.770 for that value of n. 00:57:38.770 --> 00:57:41.680 And then once I have that value do I want to iterate 00:57:41.680 --> 00:57:44.140 that is loop some number of times-- 00:57:44.140 --> 00:57:47.500 some number of times equal to whatever the user's input was. 00:57:47.500 --> 00:57:50.860 So if the user has inputted one, I'll print just one hash mark. 00:57:50.860 --> 00:57:54.250 If the user inputs 10, I want to print 10 of those hashes. 00:57:54.250 --> 00:57:55.630 But how do I do this? 00:57:55.630 --> 00:58:00.460 A while true loop or forever loop that infinitely loops is probably 00:58:00.460 --> 00:58:02.360 not the right approach here. 00:58:02.360 --> 00:58:05.440 But rather I want to iterate some finite number of times. 00:58:05.440 --> 00:58:08.320 And so a for loop allows us to do exactly that 00:58:08.320 --> 00:58:10.880 with built-in functionality as follows. 00:58:10.880 --> 00:58:16.270 Let me go ahead and say Python for I in the range 00:58:16.270 --> 00:58:20.530 of n, which is the user's input, go ahead per the colon 00:58:20.530 --> 00:58:22.330 and do the following next. 00:58:22.330 --> 00:58:26.430 Go ahead and print out a single hash for each value. 00:58:26.430 --> 00:58:28.450 And so what is this line of code doing? 00:58:28.450 --> 00:58:32.110 Here on line 3 do I have for I in the range of n. 
00:58:32.110 --> 00:58:35.080 Well, it turns out that range is a function built into Python 00:58:35.080 --> 00:58:37.780 that returns to you effectively a range of values. 00:58:37.780 --> 00:58:43.030 By default, that range starts at 0 and goes up to but not through the value 00:58:43.030 --> 00:58:44.170 you ask for. 00:58:44.170 --> 00:58:49.270 So if you pass to range the value like 1, you will iterate only one 00:58:49.270 --> 00:58:52.720 time the range of 0 to but not through 1. 00:58:52.720 --> 00:58:56.800 If you instead input a value of 10, you'll iterate over a range of 0 00:58:56.800 --> 00:59:03.160 through 9 up to but not through 10 and get precisely that many hashes. 00:59:03.160 --> 00:59:05.740 My goal, again, is to print a bar chart of sorts 00:59:05.740 --> 00:59:08.710 with one hash representing each of these values 00:59:08.710 --> 00:59:11.210 from left to right, a horizontal bar chart. 00:59:11.210 --> 00:59:14.860 So let me go ahead and save this file here and in my terminal window 00:59:14.860 --> 00:59:16.690 run Python of score.py. 00:59:16.690 --> 00:59:18.760 We'll input a number like 10. 00:59:18.760 --> 00:59:22.390 And unfortunately, they all seem to be vertical. 00:59:22.390 --> 00:59:25.750 And if I scrolled up higher in my terminal window would I see all 10 00:59:25.750 --> 00:59:28.210 but again, one on top of the other. 00:59:28.210 --> 00:59:32.500 So how do I somehow keep my cursor, if you will, on the same line? 00:59:32.500 --> 00:59:34.450 Well, all this time, I've been using print. 00:59:34.450 --> 00:59:37.870 I've been getting a new line for free, so to speak. 00:59:37.870 --> 00:59:39.850 At the end of printing anything has Python 00:59:39.850 --> 00:59:43.180 been moving my cursor, not unlike an old school typewriter, 00:59:43.180 --> 00:59:45.940 to the bottom left of the next line. 
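To make the behavior of range concrete, here is a short sketch. The fixed value of n is an assumption standing in for the user's input, and list is used only to display the values that range produces.

```python
# range starts at 0 and stops just before the value given.
print(list(range(1)))   # [0]
print(list(range(10)))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

n = 3  # stands in for int(input("n: "))
for i in range(n):
    # Each print ends with a newline by default,
    # so these hashes stack vertically, one per line.
    print("#")
```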
00:59:45.940 --> 00:59:49.540 But sometimes I want my cursor to stay on the same line, 00:59:49.540 --> 00:59:52.270 even as I do something again and again. 00:59:52.270 --> 00:59:53.510 And it turns out in Python-- 00:59:53.510 --> 00:59:56.110 and you'd only know this by having looked at the documentation-- 00:59:56.110 --> 00:59:58.990 that the print function can take 00:59:58.990 --> 01:00:01.540 a second input that is not necessarily just 01:00:01.540 --> 01:00:03.580 some other string you want to print. 01:00:03.580 --> 01:00:06.460 But instead it's a named parameter-- 01:00:06.460 --> 01:00:08.950 that is, an input that has a predetermined name-- 01:00:08.950 --> 01:00:10.540 in this case called end-- 01:00:10.540 --> 01:00:15.910 that you can set equal to a specific value like nothing. 01:00:15.910 --> 01:00:20.470 It turns out, albeit non-obviously, that, by default, Python 01:00:20.470 --> 01:00:25.510 ends each line with a carriage return, if you will, or a blank line, 01:00:25.510 --> 01:00:30.690 otherwise represented here technically as a backslash n, which itself 01:00:30.690 --> 01:00:33.750 is technically distinct from an old school carriage return 01:00:33.750 --> 01:00:38.400 but has the effect of moving that cursor down to the next line. 01:00:38.400 --> 01:00:39.442 So this is implicit. 01:00:39.442 --> 01:00:42.150 It would be incredibly annoying if any time you wrote Python code 01:00:42.150 --> 01:00:43.900 and wanted to print something that you had 01:00:43.900 --> 01:00:46.080 to type out that sequence of symbols. 01:00:46.080 --> 01:00:48.120 And so you get those for free, so to speak. 01:00:48.120 --> 01:00:50.850 But if you want to override that default behavior, 01:00:50.850 --> 01:00:53.540 you need to instead tell Python's print function, you know what? 01:00:53.540 --> 01:00:59.250 End your lines with nothing at all, quote unquote with nothing in between.
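Overriding the default line ending might look like the sketch below. The function name print_hashes is hypothetical; end is the real named parameter of print, whose default value is a newline.

```python
def print_hashes(n):
    """Print n hash marks on one line by ending each print with nothing."""
    for i in range(n):
        print("#", end="")  # end="" replaces the default "\n"


print_hashes(10)
```

As written, nothing moves the cursor down afterward, so whatever is printed next continues on the same line as the hashes.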
01:00:59.250 --> 01:01:01.980 But when I'm done printing all of them, it 01:01:01.980 --> 01:01:06.660 would be nice to move my cursor to the next line so that my next prompt-- 01:01:06.660 --> 01:01:09.300 that dollar sign we keep seeing in my terminal window-- 01:01:09.300 --> 01:01:10.860 is at least on its own. 01:01:10.860 --> 01:01:14.460 So I'm going to go ahead and say print open paren closed paren 01:01:14.460 --> 01:01:18.420 with nothing inside that because if I get for free a new line, 01:01:18.420 --> 01:01:20.280 I don't need to pass anything to print. 01:01:20.280 --> 01:01:26.020 That is as the very last step just going to move my cursor to the next line. 01:01:26.020 --> 01:01:28.680 So let me go ahead and save this program now and again, 01:01:28.680 --> 01:01:33.720 in my terminal window, run Python of score.py, typing in this time 10. 01:01:33.720 --> 01:01:37.620 And there do I get my 10 hashes horizontally. 01:01:37.620 --> 01:01:41.370 So it turns out Python has what are called types, and the only time you 01:01:41.370 --> 01:01:44.400 really need to know or care about this is when, frankly, it 01:01:44.400 --> 01:01:46.410 starts to bite you, like it did us. 01:01:46.410 --> 01:01:48.570 Indeed, when I asked for the user's input, 01:01:48.570 --> 01:01:51.480 expecting an integer, and the user typed exactly 01:01:51.480 --> 01:01:53.970 that but I didn't convert it in advance to an int, 01:01:53.970 --> 01:01:59.250 I got ultimately that weird behavior of concatenating one string to another. 01:01:59.250 --> 01:02:03.450 So underneath the hood there are any number of types built into Python-- 01:02:03.450 --> 01:02:08.430 a bool like true/false, integers like numbers, strs or strings of text, 01:02:08.430 --> 01:02:11.400 and even floats, real numbers that have a decimal point 01:02:11.400 --> 01:02:13.080 and some number of digits after.
01:02:13.080 --> 01:02:17.730 But beyond that are more sophisticated data types or data structures still. 01:02:17.730 --> 01:02:21.900 Dict or dictionary, which is as we'll call it a hash table of sorts, 01:02:21.900 --> 01:02:24.630 list which can be any number of values back to back, 01:02:24.630 --> 01:02:27.780 a range of values as we've just seen, or a set wherein 01:02:27.780 --> 01:02:32.970 you have no duplicate values and a tuple, not unlike x comma y or latitude 01:02:32.970 --> 01:02:34.470 comma longitude. 01:02:34.470 --> 01:02:37.770 But it turns out that an appreciation of these types 01:02:37.770 --> 01:02:41.610 can help you avoid some very serious mistakes in code 01:02:41.610 --> 01:02:45.330 because it turns out that depending on how you store your data in a computer's 01:02:45.330 --> 01:02:49.920 memory, you might actually get behavior that you didn't actually intend. 01:02:49.920 --> 01:02:53.310 For instance, let me quite simply write a program that prints out 01:02:53.310 --> 01:02:56.880 the value of, oh, say, 1 divided by 10. 01:02:56.880 --> 01:03:01.530 I have here a file called, say, imprecision that I quite simply 01:03:01.530 --> 01:03:03.120 am going to do this-- 01:03:03.120 --> 01:03:07.740 prompt the user for an input called x, converting their input to an int, 01:03:07.740 --> 01:03:12.390 as always, asking for x, and then defining another variable called y-- 01:03:12.390 --> 01:03:17.040 this time, converting to an integer the user's input after prompting for y. 01:03:17.040 --> 01:03:20.970 And then quite simply, I'm going to print x divided by y. 01:03:20.970 --> 01:03:23.340 Let me go ahead and save this and in my terminal window 01:03:23.340 --> 01:03:26.850 run Python of imprecision.py, hitting Enter here, 01:03:26.850 --> 01:03:30.000 typing in, say, 1 divided by 10. 01:03:30.000 --> 01:03:33.510 And so 0.1 is the answer, just as you'd expect. 
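The built-in types listed above can be sketched with a handful of sample values. The values themselves are made up for illustration; only the type names come from the lecture.

```python
# One sample value per type mentioned: bool, int, str, float,
# then dict, list, range, set, and tuple.
samples = [True, 42, "hello", 0.1, {"x": 1}, [1, 2, 3], range(3), {1, 2}, (42.37, -71.11)]
type_names = [type(value).__name__ for value in samples]
print(type_names)

# Why types matter: without converting to int first, + joins two strings.
as_strings = "1" + "2"          # concatenation
as_ints = int("1") + int("2")   # addition

# A set keeps no duplicate values.
unique = set([1, 1, 2, 2, 3])

# Dividing one int by another with / produces a float.
quotient = 1 / 10
print(as_strings, as_ints, unique, quotient)
```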
01:03:33.510 --> 01:03:37.650 Let's go ahead and print out more digits than 1 after that decimal point, 01:03:37.650 --> 01:03:43.740 just to make sure that 1/10 is indeed 0.1 with implicitly an infinite number 01:03:43.740 --> 01:03:45.660 of zeros to the right. 01:03:45.660 --> 01:03:48.330 Well, let me go ahead just for simplicity's sake 01:03:48.330 --> 01:03:53.100 and first store the value x divided by y in a third variable z. 01:03:53.100 --> 01:03:56.790 And then in my print statement here, let me print out exactly that z, 01:03:56.790 --> 01:03:59.520 but let me format it a bit differently than usual. 01:03:59.520 --> 01:04:02.640 Using Python's support for an f-string, or format string, 01:04:02.640 --> 01:04:06.360 using that prefix f, which connotes give me a format string, 01:04:06.360 --> 01:04:10.038 am I going to print exactly that quote unquote z. 01:04:10.038 --> 01:04:13.080 But recall that you need to surround it with those so-called curly braces 01:04:13.080 --> 01:04:17.460 to make clear to Python that you want to plug in its value and not literally z. 01:04:17.460 --> 01:04:21.120 Well, it turns out there's additional syntax we can use, albeit cryptic, 01:04:21.120 --> 01:04:26.220 via which to tell Python, yes, print z but to this many decimal places. 01:04:26.220 --> 01:04:30.450 And the syntax for that is a colon right after the variable, 01:04:30.450 --> 01:04:34.110 followed by a literal period, and the number of decimal places 01:04:34.110 --> 01:04:35.280 that you'd like to print. 01:04:35.280 --> 01:04:39.180 And because this is a so-called floating point value or real number, 01:04:39.180 --> 01:04:42.870 we need one additional f. Now with this syntax 01:04:42.870 --> 01:04:46.770 should I be able to print that same value but to a specific number 01:04:46.770 --> 01:04:48.120 of decimal places. 01:04:48.120 --> 01:04:48.720 Let's see.
01:04:48.720 --> 01:04:51.930 In my terminal window, let me go ahead and run Python of imprecision.py, 01:04:51.930 --> 01:04:56.490 again inputting 1, again inputting 10, and whew, 01:04:56.490 --> 01:05:00.330 I indeed see point 1 followed by nine more zeros-- 01:05:00.330 --> 01:05:02.430 a total of 10 digits. 01:05:02.430 --> 01:05:06.300 Well, let me get a little more curious and instead print out, oh, shall 01:05:06.300 --> 01:05:08.700 we say 20 decimal places-- 01:05:08.700 --> 01:05:10.860 again running my program imprecision.py, 01:05:10.860 --> 01:05:18.360 inputting 1, followed by 10, and all looks almost good until wow, 5, 5, 5. 01:05:18.360 --> 01:05:20.680 I'm a little curious now as to what's going on. 01:05:20.680 --> 01:05:25.700 Let me go ahead and run this one last time after printing 30 decimal places. 01:05:25.700 --> 01:05:28.200 Here I'm going to go ahead and run Python of imprecision.py, 01:05:28.200 --> 01:05:31.810 hoping that it's not going to get worse, and it does. 01:05:31.810 --> 01:05:36.130 It seems if you look far enough out, you start to see some weirdness. 01:05:36.130 --> 01:05:38.290 In fact, let me go as far out as-- 01:05:38.290 --> 01:05:42.040 I don't know-- 55 decimal places, going ahead and running 01:05:42.040 --> 01:05:45.460 Python of imprecision.py, inputting 1 and 10. 01:05:45.460 --> 01:05:47.800 Oh, my god, it does get worse. 01:05:47.800 --> 01:05:51.400 So it would seem that what all of us were taught in grade school, that one divided by 10 01:05:51.400 --> 01:05:54.700 indeed equals 1/10, is not quite true. 01:05:54.700 --> 01:05:58.300 If you look far enough beyond the decimal point, eventually, 01:05:58.300 --> 01:06:03.940 things go horribly, horribly awry, and that is because computers quite often, 01:06:03.940 --> 01:06:07.870 as powerful and sophisticated as they are, can't quite do everything 01:06:07.870 --> 01:06:10.750 and can't quite do everything that we humans do.
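The experiment above can be sketched as below. One assumption to flag: x and y are hard-coded to 1 and 10 rather than read with input, so the snippet runs without a prompt; the lecture's imprecision.py asks the user for both.

```python
x = 1   # the lecture reads these from the user; fixed here for repeatability
y = 10
z = x / y

# A colon, a period, a count of decimal places, and f for floating point.
ten_places = f"{z:.10f}"
twenty_places = f"{z:.20f}"
fifty_five_places = f"{z:.55f}"

print(ten_places)         # looks like exactly one tenth
print(twenty_places)      # stray digits begin to appear
print(fifty_five_places)  # the stored value is only close to one tenth
```

The value stored for z is the nearest number the float format can hold, so asking for more decimal places simply exposes more of that approximation.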
01:06:10.750 --> 01:06:11.950 Now why is this? 01:06:11.950 --> 01:06:14.290 It would seem that Python is ever so slightly 01:06:14.290 --> 01:06:17.320 off when it comes to the representation of this floating 01:06:17.320 --> 01:06:19.510 point or this real value. 01:06:19.510 --> 01:06:20.770 Now why is that? 01:06:20.770 --> 01:06:26.080 Well, inside of a computer is hardware like this, RAM or Random Access Memory, 01:06:26.080 --> 01:06:30.280 which is a little chip of memory inside of your computer wherein files 01:06:30.280 --> 01:06:33.460 and programs are stored when they're open or running. 01:06:33.460 --> 01:06:38.170 And inside of each of these black chips is some number of bytes or bits 01:06:38.170 --> 01:06:42.310 that are ultimately used to represent any of the values in your program. 01:06:42.310 --> 01:06:46.480 The catch here, though, is that this device, like any physical device 01:06:46.480 --> 01:06:51.820 in the real world, has only a finite amount of space or capacity, which 01:06:51.820 --> 01:06:56.230 is to say, no matter how big or expensive this particular RAM is, 01:06:56.230 --> 01:06:59.050 it has a finite number of bytes-- 01:06:59.050 --> 01:07:03.280 maybe one billion if it's a gigabyte or two billion if it's two gigs. 01:07:03.280 --> 01:07:05.800 But it's a finite number in total. 01:07:05.800 --> 01:07:08.860 And by default, what Python and most languages do is 01:07:08.860 --> 01:07:11.710 they decide a priori how many bits or bytes 01:07:11.710 --> 01:07:15.280 to use to represent any of the values in your program. 01:07:15.280 --> 01:07:19.690 And so if your number is so precise or so big 01:07:19.690 --> 01:07:23.620 that it can't quite be represented in only that many bits, 01:07:23.620 --> 01:07:27.640 the language, like Python, is going to come as close as it can 01:07:27.640 --> 01:07:30.932 and represent that value with some approximation. 01:07:30.932 --> 01:07:32.140 And that's what we're seeing.
01:07:32.140 --> 01:07:37.090 One divided by 10 is surely a mathematically well-defined number. 01:07:37.090 --> 01:07:45.220 Indeed, it's 1/10 or 0.1 and mathematically should be 0.10000 ad 01:07:45.220 --> 01:07:47.560 nauseam infinitely. 01:07:47.560 --> 01:07:52.660 But in Python, if you're only using, say, 32 or 64 or any number of bits, 01:07:52.660 --> 01:07:55.960 you can't possibly represent the infinite number 01:07:55.960 --> 01:07:59.560 of numbers that exist in the world and represent all of them 01:07:59.560 --> 01:08:01.420 perfectly, precisely. 01:08:01.420 --> 01:08:05.020 To do so, you would surely need an infinite number of bits, 01:08:05.020 --> 01:08:08.480 and we don't have that in our physical world. 01:08:08.480 --> 01:08:11.110 And so you have to suffer, unfortunately, 01:08:11.110 --> 01:08:15.160 this potential for floating point imprecision where values you care about 01:08:15.160 --> 01:08:17.979 are going to be close but not quite what you 01:08:17.979 --> 01:08:21.430 intend, unless you, the programmer or designer of the system, 01:08:21.430 --> 01:08:24.580 are willing to spend more than just 32 or 64 bits 01:08:24.580 --> 01:08:26.859 but more and more and more and enough that you 01:08:26.859 --> 01:08:31.450 can get that decimal point and those values as far off to the right, 01:08:31.450 --> 01:08:33.670 so to speak, as you can. 01:08:33.670 --> 01:08:36.729 But darn it, if there isn't another problem that derives 01:08:36.729 --> 01:08:38.950 from precisely the same constraint.
01:08:38.950 --> 01:08:42.760 Not only can you have imprecision when it comes to floating point values, 01:08:42.760 --> 01:08:46.569 even integers are potentially flawed, not necessarily in Python 01:08:46.569 --> 01:08:50.350 because in the latest version of Python have they designed in the language 01:08:50.350 --> 01:08:54.790 the ability to use as many bits as you need to represent integers-- 01:08:54.790 --> 01:08:58.210 specifically, numbers like negative 1 and 0 and 1 and everything 01:08:58.210 --> 01:09:00.100 to the left and everything to the right. 01:09:00.100 --> 01:09:02.800 The language itself will use more and more bits 01:09:02.800 --> 01:09:06.220 to store exactly the integer you want, but that 01:09:06.220 --> 01:09:09.160 did not use to be the case in Python and is still not the case 01:09:09.160 --> 01:09:10.609 in some languages. 01:09:10.609 --> 01:09:15.430 Some languages are vulnerable to what's called integer overflow whereby 01:09:15.430 --> 01:09:19.210 if in that language you try to count so high that you need 01:09:19.210 --> 01:09:23.890 to represent a number that's too big to fit in the amount of storage 01:09:23.890 --> 01:09:24.670 you've allocated-- 01:09:24.670 --> 01:09:27.580 32 or 64 or some number of other bits-- 01:09:27.580 --> 01:09:30.399 you're going to overflow the value. 01:09:30.399 --> 01:09:33.798 The result of which is that you might be going up and up and up and up 01:09:33.798 --> 01:09:35.590 and representing a bigger and bigger value. 01:09:35.590 --> 01:09:40.420 But it gets so big that all of the ones in that number become zeros. 01:09:40.420 --> 01:09:42.250 And somehow accidentally, 01:09:42.250 --> 01:09:46.540 you end up overflowing and starting all over numerically. 01:09:46.540 --> 01:09:47.620 Now how might that be? 01:09:47.620 --> 01:09:51.430 Well, consider a number that's represented with only three digits, 01:09:51.430 --> 01:09:54.670 and let's start counting, for instance, from 123.
01:09:54.670 --> 01:10:01.810 Adding ones to that gives you 124, followed by 125, 126, 127, 128, 129. 01:10:01.810 --> 01:10:04.780 And what do we do in our human mathematical world? 01:10:04.780 --> 01:10:08.950 Well, if you were about to hit nine and we now need to go to 10, 01:10:08.950 --> 01:10:10.120 you don't just write 10. 01:10:10.120 --> 01:10:14.170 Rather you write zero, and you carry the one, so to speak, 01:10:14.170 --> 01:10:19.090 continuing now with your logic, adding that one to that two, giving you 130. 01:10:19.090 --> 01:10:19.870 And that's OK. 01:10:19.870 --> 01:10:23.770 We've stayed within the confines of that three-digit number. 01:10:23.770 --> 01:10:28.420 But of course, if we count up long enough, we'll eventually reach 999. 01:10:28.420 --> 01:10:32.660 But if we have decided to only allocate three digits for this number, 01:10:32.660 --> 01:10:37.880 what is going to happen when we add one to this? 01:10:37.880 --> 01:10:41.420 Well, you might be inclined to carry the one and then carry the one again. 01:10:41.420 --> 01:10:44.930 And in the world where you have the luxury of pen and paper, 01:10:44.930 --> 01:10:48.590 you might simply write down 1,000, which is the right value. 01:10:48.590 --> 01:10:53.900 But if your computer or device is only representing values with three digits, 01:10:53.900 --> 01:10:56.460 you have overflowed this particular value, 01:10:56.460 --> 01:10:59.210 and you've overflowed in the sense that even though I have it here 01:10:59.210 --> 01:11:04.190 on the screen, that leading one doesn't actually have room in which to fit. 01:11:04.190 --> 01:11:08.240 And so your number 1,000 becomes mistaken for 0 0 0, 01:11:08.240 --> 01:11:12.350 thereby having overflowed and wrapped around from a big number 01:11:12.350 --> 01:11:13.910 to a small.
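The three-digit wraparound can be simulated as follows. Python's own integers do not overflow, so this sketch models a fixed width with the remainder operator; the function name add_one_fixed and its digits parameter are hypothetical, chosen only for this illustration.

```python
def add_one_fixed(value, digits=3):
    """Add 1 to a counter that can only hold the given number of decimal digits."""
    capacity = 10 ** digits          # three digits can hold 000 through 999
    return (value + 1) % capacity    # anything that does not fit is dropped


after_129 = add_one_fixed(129)   # carrying the one still fits in three digits
after_999 = add_one_fixed(999)   # the leading one has no room, so it wraps

# The same idea with 32 binary digits instead of three decimal ones.
largest_32_bit = 2 ** 32 - 1
wrapped_32_bit = (largest_32_bit + 1) % 2 ** 32

# Python's built-in int simply grows to hold the exact value.
exact = 999 + 1

print(after_129, after_999, wrapped_32_bit, exact)
```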
01:11:13.910 --> 01:11:16.670 Now you might think that this is fairly contrived and why would 01:11:16.670 --> 01:11:19.730 you ever do something so foolish as to only represent numbers 01:11:19.730 --> 01:11:20.480 with three digits. 01:11:20.480 --> 01:11:22.170 Well, we humans have done worse. 01:11:22.170 --> 01:11:26.120 Now it wasn't all that long ago that we humans made precisely this mistake 01:11:26.120 --> 01:11:29.210 using just two digits to store years. 01:11:29.210 --> 01:11:33.290 After all, if almost all of your dates start with 1900-something, 01:11:33.290 --> 01:11:36.200 you might as well just store those last two digits. 01:11:36.200 --> 01:11:41.690 Unfortunately, by December 31, 1999, were many of us quite a bit nervous 01:11:41.690 --> 01:11:44.930 that we hadn't found all of the code and all of the devices in the world 01:11:44.930 --> 01:11:48.350 that were still using just two digits because, as 01:11:48.350 --> 01:11:52.220 with integer overflow, if you have a number already counted up to 99 01:11:52.220 --> 01:11:57.680 and you only have two digits, you might run the risk that 99 rolls over, 01:11:57.680 --> 01:11:59.690 overflowing to 0 0. 01:11:59.690 --> 01:12:05.240 And all of a sudden, it is not the year 2000 but 1900 again-- 01:12:05.240 --> 01:12:07.490 quite simply the result of integer overflow 01:12:07.490 --> 01:12:10.040 by using a fixed amount of memory to represent 01:12:10.040 --> 01:12:15.770 something and not having anticipated that you might eventually need more. 01:12:15.770 --> 01:12:20.120 But you can certainly design for this and engineer defenses against this. 01:12:20.120 --> 01:12:23.690 Indeed, some games have done better than we as a society. 01:12:23.690 --> 01:12:27.770 Indeed, this game here Lego Star Wars has a point system wherein 01:12:27.770 --> 01:12:29.980 you can accumulate coins over time.
01:12:29.980 --> 01:12:31.730 And if you play this game long enough, you 01:12:31.730 --> 01:12:35.640 can accumulate apparently as many as 4 billion of these coins 01:12:35.640 --> 01:12:38.870 but unfortunately no more because the engineers 01:12:38.870 --> 01:12:43.070 who designed this game decided that the maximum number of points you can accrue 01:12:43.070 --> 01:12:44.150 is just that-- 01:12:44.150 --> 01:12:45.170 4 billion. 01:12:45.170 --> 01:12:46.370 But why? 01:12:46.370 --> 01:12:49.070 Well, it turns out that in many computers and game consoles, 01:12:49.070 --> 01:12:54.370 it's conventional to store your integers or ints as 32-bit values-- 01:12:54.370 --> 01:12:57.470 32 zeros or ones back to back, which means 01:12:57.470 --> 01:13:00.920 you have 2 to the 32 possible permutations 01:13:00.920 --> 01:13:05.750 thereof, which means you have roughly four billion possible values. 01:13:05.750 --> 01:13:07.850 You actually have a few more than 4 billion, 01:13:07.850 --> 01:13:10.100 but it's perhaps cleaner in a game to just choose 01:13:10.100 --> 01:13:11.780 a clean value with lots of zeros. 01:13:11.780 --> 01:13:15.890 But in this game did they anticipate that you'd play this game too long, 01:13:15.890 --> 01:13:18.050 and you might eventually overflow. 01:13:18.050 --> 01:13:20.690 And who knows what might happen to that gamer 01:13:20.690 --> 01:13:23.360 if he or she plays the game so long, and all of a sudden, 01:13:23.360 --> 01:13:25.640 their high score becomes zero. 01:13:25.640 --> 01:13:27.170 So these problems are solvable. 01:13:27.170 --> 01:13:31.650 You just have to anticipate and actually engineer those solutions. 01:13:31.650 --> 01:13:35.010 But sometimes we don't, including companies like Boeing.
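One way to engineer that kind of defense is to cap the counter before it can wrap. This is a guess at the general technique, not the game's actual code: MAX_COINS and add_coins are hypothetical names, and the cap of 4 billion is taken from the figure quoted above.

```python
POSSIBLE_VALUES = 2 ** 32       # how many values 32 bits can represent
MAX_COINS = 4_000_000_000       # hypothetical cap: a round number that still fits


def add_coins(current, earned):
    """Add coins but never exceed the cap, so the total cannot wrap around."""
    return min(current + earned, MAX_COINS)


near_limit = add_coins(3_999_999_990, 5)    # still under the cap
at_limit = add_coins(3_999_999_990, 500)    # clamped instead of overflowing

print(POSSIBLE_VALUES, near_limit, at_limit)
```

Because the cap sits below 2 to the 32, a capped total always fits in 32 bits and a long play session can never reset the score to zero.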
01:13:35.010 --> 01:13:40.190 It wasn't all that long ago that the Boeing 787 had a software bug in it 01:13:40.190 --> 01:13:43.250 whereby the plane's power system might actually 01:13:43.250 --> 01:13:48.020 turn off while the plane in the worst case was actually flying. 01:13:48.020 --> 01:13:50.210 One article put it as follows. 01:13:50.210 --> 01:13:55.250 A model 787 airplane that has been powered continuously for 248 days 01:13:55.250 --> 01:13:57.350 can lose all alternating current-- 01:13:57.350 --> 01:14:00.590 AC electrical power-- due to the Generator Control Units, 01:14:00.590 --> 01:14:05.330 GCUs, simultaneously going into failsafe mode, the memo stated. 01:14:05.330 --> 01:14:08.390 This condition is caused by the software counter 01:14:08.390 --> 01:14:15.080 internal to the GCUs that will overflow after 248 days of continuous power. 01:14:15.080 --> 01:14:19.100 Boeing is in the process of developing a GCU software upgrade that 01:14:19.100 --> 01:14:22.010 will remedy the unsafe condition. 01:14:22.010 --> 01:14:24.800 Another website analyzed the situation as follows. 01:14:24.800 --> 01:14:30.975 A simple guess suggests that the problem is a signed 32-bit overflow as 2 01:14:30.975 --> 01:14:37.790 to the 31st power is the number of seconds in 248 days multiplied by 100-- 01:14:37.790 --> 01:14:41.030 that is, a counter in hundredths of a second. 01:14:41.030 --> 01:14:45.830 Which is to say it is presumed that Boeing had stored some form of integer 01:14:45.830 --> 01:14:48.560 in its own software, and that integer was 01:14:48.560 --> 01:14:51.980 representing the number of hundredths of seconds 01:14:51.980 --> 01:14:54.050 for which the power had been on.
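The arithmetic behind that guess can be checked directly. The only assumption is the one the quoted analysis itself makes: a signed 32-bit counter that ticks once every hundredth of a second.

```python
TICKS_PER_SECOND = 100            # assumed: one tick per hundredth of a second
SECONDS_PER_DAY = 60 * 60 * 24

# A signed 32-bit integer spends one bit on the sign, leaving 2**31
# non-negative values before the counter overflows.
ticks_until_overflow = 2 ** 31

seconds_until_overflow = ticks_until_overflow / TICKS_PER_SECOND
days_until_overflow = seconds_until_overflow / SECONDS_PER_DAY

print(round(days_until_overflow, 2))
```

The result lands between 248 and 249 days, matching the figure in the memo.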
01:14:54.050 --> 01:14:57.140 But if you're only using 32 bits and only have at your disposal 01:14:57.140 --> 01:15:01.850 roughly 4 billion one hundredths of seconds, turns out mathematically, 01:15:01.850 --> 01:15:07.280 after 248 days, that counter, which is clearly important, 01:15:07.280 --> 01:15:10.700 might overflow and wrap around to zero, the result of which 01:15:10.700 --> 01:15:14.010 is that the power of an airplane might shut off. 01:15:14.010 --> 01:15:18.020 And so the temporary work around, before the software upgrade was deployed, 01:15:18.020 --> 01:15:22.430 was to quite literally reboot the airplane while on the ground 01:15:22.430 --> 01:15:27.150 before that 248th day. 01:15:27.150 --> 01:15:29.183 Now we've only just scratched the surface 01:15:29.183 --> 01:15:31.350 of what you could do with Python, and in fact, we've 01:15:31.350 --> 01:15:34.770 looked at some of those characteristics that are demonstrative of the features 01:15:34.770 --> 01:15:38.040 that you might find in any number of programming languages. 01:15:38.040 --> 01:15:40.590 But we're not even limited to the functions that come 01:15:40.590 --> 01:15:42.930 built into the core language itself. 01:15:42.930 --> 01:15:47.040 It turns out there are called libraries and frameworks and yet more 01:15:47.040 --> 01:15:51.000 in any number of languages that provide additional features that you somehow 01:15:51.000 --> 01:15:54.750 have to load or import manually in order to use. 01:15:54.750 --> 01:15:56.670 One such example might be in Python. 01:15:56.670 --> 01:16:00.930 If you want to generate random or technically pseudorandom numbers 01:16:00.930 --> 01:16:04.350 to create some kind of variation in how your program behaves, 01:16:04.350 --> 01:16:07.680 it turns out that you can't just call a random function right out of the box. 01:16:07.680 --> 01:16:12.910 You need to tell Python to please load or import that feature for you. 
01:16:12.910 --> 01:16:17.490 So let's go ahead and write a program in a file called pseudorandom.py 01:16:17.490 --> 01:16:23.100 that allows us to generate a random number between, say, 1 and 10. 01:16:23.100 --> 01:16:25.650 I want to go ahead, though, first and do this. 01:16:25.650 --> 01:16:29.580 From a library called random, go ahead and import a function 01:16:29.580 --> 01:16:32.310 called randint for random int. 01:16:32.310 --> 01:16:35.400 And then if I want to go ahead and generate or select and then 01:16:35.400 --> 01:16:39.120 print a pseudorandom number between 1 and 10 inclusive, 01:16:39.120 --> 01:16:43.470 I can simply do print randint 1 comma 10, 01:16:43.470 --> 01:16:46.500 and the effect will be to let Python somehow figure out 01:16:46.500 --> 01:16:49.210 how to choose a number between those two values 01:16:49.210 --> 01:16:51.795 and return it to me so that I can print it. 01:16:51.795 --> 01:16:54.420 Let's go ahead and save the file and then in my terminal window 01:16:54.420 --> 01:16:57.947 run Python of pseudorandom.py, and I get 10. 01:16:57.947 --> 01:16:59.280 Let's go ahead and run it again. 01:16:59.280 --> 01:17:00.840 And this time, I get six. 01:17:00.840 --> 01:17:03.780 Let's go ahead and run it yet again, and this time, I get 10. 01:17:03.780 --> 01:17:06.930 Yet again, let's go ahead and run it, and this time, I get one. 01:17:06.930 --> 01:17:10.980 And if I were to run it shall we say an infinite number of times, over time, 01:17:10.980 --> 01:17:15.870 I would see a uniform distribution, hopefully, of those 10 possible values. 01:17:15.870 --> 01:17:18.870 And that's what's meant by pseudorandom itself.
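The program just described can be sketched in a few lines. The variable name value is an addition for illustration; the lecture passes the result straight to print.

```python
# Load just the one function from the random library.
from random import randint

# A pseudorandom integer between 1 and 10, with both ends included.
value = randint(1, 10)
print(value)
```

Each run may print a different number, so no particular output should be expected from any single run.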
01:17:18.870 --> 01:17:22.860 It turns out that languages and computers more generally can't really 01:17:22.860 --> 01:17:26.130 pick a random number off the top of their head like you 01:17:26.130 --> 01:17:29.400 and I can, rather they need to use algorithms, which themselves 01:17:29.400 --> 01:17:33.150 are deterministic processes-- code that does the same thing again 01:17:33.150 --> 01:17:36.630 and again in order to create, if you will, the illusion of randomness 01:17:36.630 --> 01:17:41.460 by creating a statistically uniform distribution over some range of values. 01:17:41.460 --> 01:17:44.130 Now often, a computer will use something that's 01:17:44.130 --> 01:17:46.800 changing, like the clock that's built into it, 01:17:46.800 --> 01:17:49.080 taking a look at the current time, and then generating 01:17:49.080 --> 01:17:53.970 a random number or again pseudorandom number based on that variation 01:17:53.970 --> 01:17:58.050 or if it has a microphone or a camera taking some ambient noise of sorts 01:17:58.050 --> 01:18:01.020 and using that to feed into whatever algorithm it's 01:18:01.020 --> 01:18:03.540 using to choose something randomly. 01:18:03.540 --> 01:18:07.570 But suppose I want to now do something with this value and not just print it. 01:18:07.570 --> 01:18:10.530 Suppose that we want to implement a bit of a game for a user-- 01:18:10.530 --> 01:18:14.820 pick a random number between 1 and 10 and see if they can guess it correctly. 01:18:14.820 --> 01:18:18.060 Well, let's see how a program like that might be implemented. 01:18:18.060 --> 01:18:19.410 Let's first go ahead. 01:18:19.410 --> 01:18:23.040 And from that library called random, import as before, 01:18:23.040 --> 01:18:25.260 a function called randint. 01:18:25.260 --> 01:18:28.410 Although it turns out if you want to use not only this function but others, 01:18:28.410 --> 01:18:31.500 you can more succinctly just say import random.
01:18:31.500 --> 01:18:34.290 The difference being if you only import random, 01:18:34.290 --> 01:18:37.860 we are going to have to prefix with the word random followed 01:18:37.860 --> 01:18:40.320 by a dot every use of a function. 01:18:40.320 --> 01:18:41.850 So we'll do it that way this time. 01:18:41.850 --> 01:18:43.920 Let's go ahead and declare a variable called n 01:18:43.920 --> 01:18:47.620 and assign to it right to left the result of calling randint. 01:18:47.620 --> 01:18:50.670 But this time, because we've not mentioned randint by name, 01:18:50.670 --> 01:18:55.230 I need to qualify this symbol and say random dot randint, 01:18:55.230 --> 01:18:58.290 thereby making clear to Python that the function I'd like you to call 01:18:58.290 --> 01:19:01.710 is actually inside of that library called random. 01:19:01.710 --> 01:19:04.810 But I can otherwise use it as before passing in two values-- 01:19:04.810 --> 01:19:07.620 a lower and upper bound like this, one comma 10-- 01:19:07.620 --> 01:19:11.617 and that should give me an n, a random number in that range. 01:19:11.617 --> 01:19:13.950 Now I want to go ahead and ask the user for their guess, 01:19:13.950 --> 01:19:16.830 and I'll go ahead and define another variable called, say, guess. 01:19:16.830 --> 01:19:19.680 That is the result of converting to an int whatever 01:19:19.680 --> 01:19:22.770 the user's input is for their guess. 01:19:22.770 --> 01:19:25.800 And now, quite simply for this game, I want to compare those two values 01:19:25.800 --> 01:19:27.870 and print, say, correct or incorrect. 01:19:27.870 --> 01:19:31.890 And so I'll go ahead and say if the user's guess equals equals n, 01:19:31.890 --> 01:19:34.980 go ahead and print out as much correct. 01:19:34.980 --> 01:19:38.490 Else if the user's guess is not right, let's implicitly 01:19:38.490 --> 01:19:40.530 infer that, nope, incorrect. 01:19:40.530 --> 01:19:42.930 And so we'll print that instead.
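The guessing game can be sketched like so. Two departures from the lecture are assumptions made for the sake of a runnable snippet: the comparison lives in a hypothetical helper called check, and the guess is hard-coded where guess.py would convert the user's input with int.

```python
# Importing the whole library means every use needs the random. prefix.
import random


def check(guess, n):
    """Compare the guess with the secret number and report the outcome."""
    if guess == n:
        return "Correct"
    else:
        return "Incorrect"


n = random.randint(1, 10)   # lower and upper bound, both inclusive
guess = 5                   # stand-in for the user's input, converted to an int
print(check(guess, n))
```

Printing n as well would reveal the answer, which is the improvement hinted at for making the game less frustrating.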
01:19:42.930 --> 01:19:46.890 Let me go ahead and save this file now and run Python of guess.py. 01:19:46.890 --> 01:19:48.810 This time, I'll be prompted for my guess. 01:19:48.810 --> 01:19:50.500 I'll say five. 01:19:50.500 --> 01:19:52.320 Unfortunately, it's incorrect. 01:19:52.320 --> 01:19:55.470 Let's go ahead and play again this time running Python of guess.py. 01:19:55.470 --> 01:19:59.790 This time, I'll go with 10 since we saw it so many times before and correct. 01:19:59.790 --> 01:20:01.970 I'll run it a third time and see what's going on. 01:20:01.970 --> 01:20:05.070 This time, I'll guess one but incorrect. 01:20:05.070 --> 01:20:08.070 Unfortunately, I've implemented perhaps the most frustrating game ever 01:20:08.070 --> 01:20:11.610 because I'm not even telling the user what the actual number was. 01:20:11.610 --> 01:20:14.940 But surely, I could do that by printing the value of the variable 01:20:14.940 --> 01:20:20.238 I stored that random int in, but that would be yet another game altogether. 01:20:22.990 --> 01:20:26.740 All right, let's now bring all of this together and solve an actual problem-- 01:20:26.740 --> 01:20:28.450 one say from yesteryear. 01:20:28.450 --> 01:20:32.230 You might recall this game here, Super Mario Brothers, the original, and this 01:20:32.230 --> 01:20:35.710 was a two-dimensional world built up of images like this. 01:20:35.710 --> 01:20:39.250 Well, there in the sky, so to speak, do I see four question marks. 01:20:39.250 --> 01:20:42.370 And that seems like an opportunity to do something again and again. 01:20:42.370 --> 01:20:46.780 How might I print out four question marks in a row, not nearly as graphically 01:20:46.780 --> 01:20:50.440 as that here but just with my terminal window and text editor? 01:20:50.440 --> 01:20:52.690 Well, here let me go ahead and open those two.
01:20:52.690 --> 01:20:57.100 And in a file called, say, mario0.py, let me go ahead and print out, 01:20:57.100 --> 01:20:59.380 quite simply, four question marks. 01:20:59.380 --> 01:21:03.040 I'll go ahead and print out quote unquote question mark question mark 01:21:03.040 --> 01:21:04.540 question mark question mark. 01:21:04.540 --> 01:21:08.470 Saving that file again in mario0.py, running in my terminal window 01:21:08.470 --> 01:21:13.300 Python of mario0.py and voila do I get an approximation 01:21:13.300 --> 01:21:15.880 of what Nintendo did in yesteryear. 01:21:15.880 --> 01:21:17.920 Now of course, doing something again and again 01:21:17.920 --> 01:21:21.430 is clearly an opportunity for, say, a loop and not just printing it 01:21:21.430 --> 01:21:24.177 all at once but just doing something again and again. 01:21:24.177 --> 01:21:26.260 And while this will complicate the code initially, 01:21:26.260 --> 01:21:29.920 it sets us up for a more interesting solution thereafter. 01:21:29.920 --> 01:21:33.220 In a file then called mario1.py, let me go ahead 01:21:33.220 --> 01:21:36.160 and implement that same sequence of question marks 01:21:36.160 --> 01:21:40.750 but this time using that familiar loop, not a while loop or an infinite loop 01:21:40.750 --> 01:21:42.910 but perhaps just a for loop like this. 01:21:42.910 --> 01:21:48.400 For i in the range of 0 up to, but not through 4, 01:21:48.400 --> 01:21:51.880 go ahead and print out just one question mark. 01:21:51.880 --> 01:21:57.160 Saving that file brings me now to my terminal window and Python of mario1.py, 01:21:57.160 --> 01:21:58.060 Enter. 01:21:58.060 --> 01:22:01.000 Unfortunately, I have created not quite the right level. 01:22:01.000 --> 01:22:03.220 But that's OK because remember that with print, you 01:22:03.220 --> 01:22:06.510 get one new line for free every time you call it, 01:22:06.510 --> 01:22:09.140 unless you override that default behavior.
01:22:09.140 --> 01:22:12.160 So let's say, no, Python instead end your line with nothing, 01:22:12.160 --> 01:22:15.490 and only once I'm completely done do I want you to print 01:22:15.490 --> 01:22:17.470 one of those free new lines for me. 01:22:17.470 --> 01:22:19.900 I'll go ahead and save my file again in my terminal 01:22:19.900 --> 01:22:25.780 window, rerun Python of mario1.py, and now we have the same exact result. 01:22:25.780 --> 01:22:29.620 But later in the game do we see different aspects of Mario's world, not 01:22:29.620 --> 01:22:31.570 unlike this thing here underground. 01:22:31.570 --> 01:22:34.570 Pictured here are a number of blocks in the underworld, 01:22:34.570 --> 01:22:37.000 and it looks to me like that bigger block 01:22:37.000 --> 01:22:39.310 there is a composition of, say, 4. 01:22:39.310 --> 01:22:43.240 So let's go ahead now and print out a block of bricks, 01:22:43.240 --> 01:22:47.140 these all represented by hashes so that I have four horizontally, 01:22:47.140 --> 01:22:51.010 four vertically as well, and everything else filled in too. 01:22:51.010 --> 01:22:54.130 We've not yet printed out anything on multiple axes, 01:22:54.130 --> 01:22:57.190 if you will, both rows and columns of sorts. 01:22:57.190 --> 01:22:58.360 So how to do this? 01:22:58.360 --> 01:23:02.620 Well, in my terminal window, I'm going to create a file called mario2.py. 01:23:02.620 --> 01:23:07.760 And in this file, I'm going to decompose this problem conceptually, so to speak, 01:23:07.760 --> 01:23:09.910 into two different problems. 01:23:09.910 --> 01:23:13.000 Built into that underworld are some number 01:23:13.000 --> 01:23:16.250 of rows of bricks within which are these columns. 01:23:16.250 --> 01:23:19.060 And I bet I could bite those off each one at a time. 01:23:19.060 --> 01:23:20.830 So let me go ahead and do this. 01:23:20.830 --> 01:23:25.040 For i in range of 4, go ahead and print what?
01:23:25.040 --> 01:23:29.890 Well, for every row of bricks, do I want to print some number of columns too. 01:23:29.890 --> 01:23:33.910 Because it's a square, the same number of columns as rows and so I 01:23:33.910 --> 01:23:36.340 know how to print that many things too. 01:23:36.340 --> 01:23:39.130 I can simply use another loop perhaps with a different variable 01:23:39.130 --> 01:23:45.220 with which to count like for j, as is conventional in range also of 4. 01:23:45.220 --> 01:23:48.700 And then inside of this inner nested loop, so to speak, 01:23:48.700 --> 01:23:52.300 might I go ahead and print out just one hash ending each of my lines 01:23:52.300 --> 01:23:54.170 with, as before, nothing. 01:23:54.170 --> 01:23:57.070 In fact, I only want to move my cursor to the next line 01:23:57.070 --> 01:23:59.620 after I've printed each of those columns. 01:23:59.620 --> 01:24:04.780 And so only underneath that innermost loop do I want that call to print. 01:24:04.780 --> 01:24:09.030 Let me go ahead and save this file now and run Python of mario2.py. 01:24:09.030 --> 01:24:12.640 And if I've gotten this logic right, I can go ahead and print out 01:24:12.640 --> 01:24:15.280 those rows and columns too. 01:24:15.280 --> 01:24:16.547 And indeed, that's what I get. 01:24:16.547 --> 01:24:18.880 It's not quite a perfect square because those hashes are 01:24:18.880 --> 01:24:20.500 a little taller than they are wide. 01:24:20.500 --> 01:24:24.070 But via a for loop, one nested inside of another can 01:24:24.070 --> 01:24:26.080 I handle two problems at once-- 01:24:26.080 --> 01:24:30.520 the act of iterating from row to row to row and within each row, iterating 01:24:30.520 --> 01:24:32.260 left to right via column. 01:24:32.260 --> 01:24:34.690 And in fact, to make that ever more clear, why 01:24:34.690 --> 01:24:37.390 don't I even call my variables more aptly.
01:24:37.390 --> 01:24:41.890 For each row in the range of four and each column in the range of four, 01:24:41.890 --> 01:24:43.692 go ahead and print each of those hashes. 01:24:43.692 --> 01:24:46.900 Frankly, it doesn't even matter what I call these variables because I'm never 01:24:46.900 --> 01:24:48.940 actually using them per se. 01:24:48.940 --> 01:24:54.430 I'm simply telling Python to count from 0 up to, but not through, 4 using 01:24:54.430 --> 01:24:56.200 those particular names. 01:24:56.200 --> 01:24:58.180 Those then are some programming languages-- 01:24:58.180 --> 01:25:00.100 Python especially among them. 01:25:00.100 --> 01:25:03.670 And just as we saw in pseudocode, the ability to express functions 01:25:03.670 --> 01:25:07.480 and conditions with Boolean expressions and loops and then things like 01:25:07.480 --> 01:25:12.160 variables and more, so can we express those exact same ideas in Python, in C, 01:25:12.160 --> 01:25:13.825 C++, and Java. 01:25:13.825 --> 01:25:16.450 And with each of those languages do you get different features, 01:25:16.450 --> 01:25:18.992 with each of those languages do you get different techniques. 01:25:18.992 --> 01:25:21.610 But ultimately, those languages are all just 01:25:21.610 --> 01:25:27.540 tools for one's toolkit with which to solve any number of problems with data.
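The nested loop described for mario2.py might look something like the sketch below. It is a reconstruction from the spoken description rather than the code shown on screen; the function name `print_square` and its `size` parameter are hypothetical, added only so the example can be called and checked.

```python
# Sketch of the mario2.py idea: one loop for rows, one nested loop for columns.
# print_square and size are hypothetical names, not taken from the lecture.

def print_square(size=4):
    for row in range(size):          # one iteration per row of bricks
        for column in range(size):   # one iteration per brick within the row
            # End with nothing so the hashes stay on the same line.
            print("#", end="")
        # Move the cursor to the next line only after the row is complete.
        print()


if __name__ == "__main__":
    print_square()
```

Neither `row` nor `column` is read inside the loops; the names only document which axis each loop walks, and `range(size)` simply counts from 0 up to, but not through, `size`.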