1 00:00:07,060 --> 00:00:08,420 TOMMY: In this video, we'll learn about 2 00:00:08,420 --> 00:00:10,140 redirecting and pipes. 3 00:00:10,140 --> 00:00:12,780 So far, we've been using functions like printf to 4 00:00:12,780 --> 00:00:15,590 output data to the terminal and functions like GetString 5 00:00:15,590 --> 00:00:17,520 to allow the user to provide input to our 6 00:00:17,520 --> 00:00:19,490 program using the keyboard. 7 00:00:19,490 --> 00:00:21,880 Let's quickly take a look at a program that gets a line of 8 00:00:21,880 --> 00:00:25,960 input from the user and then outputs it. 9 00:00:25,960 --> 00:00:28,990 >> On line 7, we're prompting the user for a string, and 10 00:00:28,990 --> 00:00:31,680 then on line 8, we're printing it back out. 11 00:00:31,680 --> 00:00:35,220 Let's compile and run our program. 12 00:00:35,220 --> 00:00:35,900 Great. 13 00:00:35,900 --> 00:00:37,620 The string we provided was echoed back 14 00:00:37,620 --> 00:00:39,170 to us at the terminal. 15 00:00:39,170 --> 00:00:42,110 This happened because the printf function wrote to a 16 00:00:42,110 --> 00:00:46,220 stream called standard out, or s-t-d-out. 17 00:00:46,220 --> 00:00:49,230 When something is written to stdout, it's by default 18 00:00:49,230 --> 00:00:51,110 displayed by the terminal. 19 00:00:51,110 --> 00:00:53,720 >> So that's all well and good, but what if, instead of simply 20 00:00:53,720 --> 00:00:57,700 displaying the string, we wanted to save it to a file? 21 00:00:57,700 --> 00:01:00,470 For example, we might want to remember exactly what our 22 00:01:00,470 --> 00:01:04,450 program did when we gave it a particular input later. 23 00:01:04,450 --> 00:01:07,270 One approach would be to do this in our C program, using 24 00:01:07,270 --> 00:01:09,680 some special functions for writing to files that we'll 25 00:01:09,680 --> 00:01:11,270 see in another video. 
26 00:01:11,270 --> 00:01:13,260 Even easier, though, would be to somehow 27 00:01:13,260 --> 00:01:16,090 redirect stdout to a file. 28 00:01:16,090 --> 00:01:19,780 That way, when printf writes to stdout, the contents will 29 00:01:19,780 --> 00:01:21,720 be written to a file rather than 30 00:01:21,720 --> 00:01:23,410 displayed by the terminal. 31 00:01:23,410 --> 00:01:26,690 We can do just that by adding a greater-than sign, followed 32 00:01:26,690 --> 00:01:30,820 by a file name, to the command we use to execute our program. 33 00:01:30,820 --> 00:01:34,730 >> So, rather than simply executing ./redirect, we can 34 00:01:34,730 --> 00:01:38,880 run ./redirect, followed by a greater-than sign, followed by 35 00:01:38,880 --> 00:01:41,530 a filename, like file.txt. 36 00:01:41,530 --> 00:01:44,290 Let's see what happens. 37 00:01:44,290 --> 00:01:45,130 OK. 38 00:01:45,130 --> 00:01:48,470 Notice that this time, nothing was displayed at the terminal, 39 00:01:48,470 --> 00:01:50,290 but we haven't modified the contents of our 40 00:01:50,290 --> 00:01:52,040 C program at all. 41 00:01:52,040 --> 00:01:56,090 Let's now examine the contents of this directory with ls. 42 00:01:56,090 --> 00:01:56,630 >> All right. 43 00:01:56,630 --> 00:02:00,840 We now have a new file in our directory called file.txt, 44 00:02:00,840 --> 00:02:03,640 which is the file name we supplied when we ran our 45 00:02:03,640 --> 00:02:05,050 Redirect program. 46 00:02:05,050 --> 00:02:08,020 Let's open up file.txt. 47 00:02:08,020 --> 00:02:11,840 And here, we can see that the stdout of redirect was 48 00:02:11,840 --> 00:02:15,550 written to the file called file.txt. 49 00:02:15,550 --> 00:02:18,470 So let's run the previous command again, but supplying a 50 00:02:18,470 --> 00:02:20,075 different input this time. 51 00:02:25,140 --> 00:02:25,900 Okay. 52 00:02:25,900 --> 00:02:28,205 Let's take a look at file.txt now.
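As a rough sketch of the redirection just described, the commands below send a program's stdout to file.txt. The echo command is only a stand-in for ./redirect here (an assumption, so the example runs without the compiled C program or any typing at the keyboard), and the scratch directory is likewise just for illustration.

```shell
# Work in a scratch directory so no existing file is touched.
cd "$(mktemp -d)"

# Without redirection, stdout goes to the terminal.
echo "hello, world"

# With '>', the same output goes to file.txt and nothing is displayed.
echo "hello, world" > file.txt

# The new file shows up in the directory listing...
ls

# ...and it holds what would otherwise have been printed.
cat file.txt
```

Nothing about the command itself changes; only the destination of its stdout does.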
53 00:02:31,070 --> 00:02:34,580 >> We can see here that the file has been overwritten, so our 54 00:02:34,580 --> 00:02:37,120 original input isn't there anymore. 55 00:02:37,120 --> 00:02:40,280 If we instead want to append to this file, putting the new 56 00:02:40,280 --> 00:02:43,600 input below the existing contents of the file, we can 57 00:02:43,600 --> 00:02:46,800 use two greater-than signs instead of just one. 58 00:02:46,800 --> 00:02:48,050 Let's try that. 59 00:02:52,160 --> 00:02:57,910 Now, if we open file.txt again, we can see both of our 60 00:02:57,910 --> 00:02:59,580 input lines. 61 00:02:59,580 --> 00:03:02,180 In some cases, we might want to discard any 62 00:03:02,180 --> 00:03:03,850 output of our program. 63 00:03:03,850 --> 00:03:06,450 Rather than writing the output to a file and then deleting 64 00:03:06,450 --> 00:03:09,310 the file when we're done with it, we can write to a special 65 00:03:09,310 --> 00:03:12,360 file called /dev/null. 66 00:03:12,360 --> 00:03:15,160 When anything is written to /dev/null-- 67 00:03:15,160 --> 00:03:16,960 or just devnull for short-- 68 00:03:16,960 --> 00:03:18,950 it is automatically discarded. 69 00:03:18,950 --> 00:03:23,290 So think of devnull as a black hole for your data. 70 00:03:23,290 --> 00:03:26,070 >> So now that we've seen how the greater-than sign can redirect 71 00:03:26,070 --> 00:03:29,610 stdout, let's see how we can redirect standard in-- 72 00:03:29,610 --> 00:03:31,250 or s-t-d-in-- 73 00:03:31,250 --> 00:03:33,550 the analogue of stdout. 74 00:03:33,550 --> 00:03:36,010 While functions like printf write to the stream called 75 00:03:36,010 --> 00:03:40,500 stdout, GetString and similar functions read from the stream 76 00:03:40,500 --> 00:03:43,770 called stdin, which, by default, is the stream of 77 00:03:43,770 --> 00:03:46,290 characters typed at the keyboard.
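The difference between one greater-than sign, two greater-than signs, and /dev/null can be sketched like this. Again, echo is just a stand-in for ./redirect (an assumption), and the input strings are made up for illustration.

```shell
# Work in a scratch directory so no existing file is touched.
cd "$(mktemp -d)"

# '>' creates file.txt, or replaces its contents if it already exists.
echo "first input" > file.txt
echo "second input" > file.txt
cat file.txt          # only "second input" is left

# '>>' appends to the end of the file instead.
echo "third input" >> file.txt
cat file.txt          # "second input", then "third input"

# Anything written to /dev/null is simply discarded.
echo "nobody will ever see this" > /dev/null
```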
78 00:03:46,290 --> 00:03:50,010 We can redirect stdin using the less-than sign, followed 79 00:03:50,010 --> 00:03:51,370 by a filename. 80 00:03:51,370 --> 00:03:54,000 Now, rather than prompting the user for input at the 81 00:03:54,000 --> 00:03:57,870 terminal, a program will open the file we specified and use 82 00:03:57,870 --> 00:03:59,790 its lines as input. 83 00:03:59,790 --> 00:04:02,620 >> Let's see what happens. 84 00:04:02,620 --> 00:04:03,280 Great. 85 00:04:03,280 --> 00:04:07,590 The first line of file.txt has been printed to the terminal 86 00:04:07,590 --> 00:04:10,160 because we're calling GetString once. 87 00:04:10,160 --> 00:04:13,170 If we had another call to GetString in our program, the 88 00:04:13,170 --> 00:04:16,149 next line of file.txt would have been used as 89 00:04:16,149 --> 00:04:17,990 input to that call. 90 00:04:17,990 --> 00:04:21,050 Again, we haven't modified our C program at all. 91 00:04:21,050 --> 00:04:23,620 We're only changing how we run it. 92 00:04:23,620 --> 00:04:27,080 And also remember, we haven't redirected stdout this time, 93 00:04:27,080 --> 00:04:28,970 so the output of the program was still 94 00:04:28,970 --> 00:04:31,040 displayed at the terminal. 95 00:04:31,040 --> 00:04:33,500 We can, of course, redirect both stdin 96 00:04:33,500 --> 00:04:37,320 and stdout like this. 97 00:04:37,320 --> 00:04:43,550 Now, file2.txt contains the first line of file.txt. 98 00:04:43,550 --> 00:04:46,140 >> So, using these operators, we've been able to read from and 99 00:04:46,140 --> 00:04:48,130 write to text files. 100 00:04:48,130 --> 00:04:51,890 Now, let's see how we can use the output of one program as 101 00:04:51,890 --> 00:04:54,710 the input to another program. 102 00:04:54,710 --> 00:04:56,650 So here's another simple C program I 103 00:04:56,650 --> 00:05:00,190 have here called hello.c. 104 00:05:00,190 --> 00:05:02,617 As you can see, this simply outputs "Hi 105 00:05:02,617 --> 00:05:04,430 there!"
to the user. 106 00:05:04,430 --> 00:05:08,890 If I want redirect to use as input the output of hello-- 107 00:05:08,890 --> 00:05:10,190 another program-- 108 00:05:10,190 --> 00:05:13,920 I could first redirect the stdout of hello to a file called 109 00:05:13,920 --> 00:05:18,960 input.txt, then redirect the stdin of redirect to that same 110 00:05:18,960 --> 00:05:21,190 file--input.txt. 111 00:05:21,190 --> 00:05:26,730 So I can do ./hello > input.txt. 112 00:05:26,730 --> 00:05:28,810 Press Enter to execute this. 113 00:05:28,810 --> 00:05:31,910 Followed by ./redirect < 114 00:05:31,910 --> 00:05:35,270 input.txt, and execute that. 115 00:05:35,270 --> 00:05:38,290 So we can shorten this a bit with a semicolon, which allows 116 00:05:38,290 --> 00:05:41,360 us to run two or more commands on the same line. 117 00:05:41,360 --> 00:05:47,920 So I can say, ./hello > input.txt, semicolon, 118 00:05:47,920 --> 00:05:50,580 ./redirect < input.txt. 119 00:05:53,850 --> 00:05:56,740 >> So this works, but it still feels pretty inelegant. 120 00:05:56,740 --> 00:05:59,530 I mean, do we really need this intermediary text file that's 121 00:05:59,530 --> 00:06:02,520 no longer necessary after redirect runs? 122 00:06:02,520 --> 00:06:05,780 Luckily, we can avoid this extra text file using what's 123 00:06:05,780 --> 00:06:07,220 called a pipe. 124 00:06:07,220 --> 00:06:13,740 If I say, ./hello | ./redirect, then the stdout of 125 00:06:13,740 --> 00:06:15,310 the program on the left-- 126 00:06:15,310 --> 00:06:16,740 in this case, hello-- 127 00:06:16,740 --> 00:06:18,970 will be used as the standard input for the 128 00:06:18,970 --> 00:06:20,370 program on the right. 129 00:06:20,370 --> 00:06:24,850 In this case, redirect. So let's run this. 130 00:06:24,850 --> 00:06:25,930 >> There we go. 131 00:06:25,930 --> 00:06:30,080 We can see that the output of hello was used as the input 132 00:06:30,080 --> 00:06:31,520 for redirect. 
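Here is a small sketch that puts the two approaches side by side: going through an intermediate file, and using a pipe. The shell functions hello and redirect are hypothetical stand-ins for the compiled ./hello and ./redirect programs (an assumption, so the example runs on its own): hello prints one line, and redirect echoes back the first line it reads from stdin. The file name output.txt is also just illustrative.

```shell
# Work in a scratch directory so no existing file is touched.
cd "$(mktemp -d)"

# Hypothetical stand-ins for the two C programs.
hello() { echo "Hi there!"; }
redirect() { head -n 1; }   # read one line from stdin, print it

# Two commands on one line, joined by a semicolon,
# passing data through an intermediate file.
hello > input.txt; redirect < input.txt

# Both stdin and stdout can be redirected in the same command.
redirect < input.txt > output.txt

# The pipe does the same job with no intermediate file:
# stdout of the left command becomes stdin of the right one.
hello | redirect
```

The semicolon version and the pipe version both print the same line; the pipe just skips the leftover input.txt.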
133 00:06:31,520 --> 00:06:34,890 By stringing commands together using pipes, we form what's 134 00:06:34,890 --> 00:06:38,120 called a pipeline, since our output is essentially moving 135 00:06:38,120 --> 00:06:40,590 through a sequence of commands. 136 00:06:40,590 --> 00:06:43,570 Using pipes, we can do some cool stuff without needing to 137 00:06:43,570 --> 00:06:45,870 write any code at all. 138 00:06:45,870 --> 00:06:48,760 For example, let's say we want to know how many files are 139 00:06:48,760 --> 00:06:50,630 inside of this directory. 140 00:06:50,630 --> 00:06:55,200 Using a pipe, we can combine the ls command with the wc-- 141 00:06:55,200 --> 00:06:56,460 or wordcount-- 142 00:06:56,460 --> 00:06:57,850 command. 143 00:06:57,850 --> 00:07:02,230 Ls will output each file in the directory to stdout, and 144 00:07:02,230 --> 00:07:08,040 wc will tell us how many lines were given to it via stdin. 145 00:07:08,040 --> 00:07:12,440 So, if we say ls | wc -l-- 146 00:07:12,440 --> 00:07:16,800 supplying the -l flag to wc to tell it to count lines-- 147 00:07:16,800 --> 00:07:19,260 we can see exactly how many files are 148 00:07:19,260 --> 00:07:21,940 in the current directory. 149 00:07:21,940 --> 00:07:24,570 >> So let's take a look at one more example. 150 00:07:24,570 --> 00:07:27,740 I have here a file called students.txt, 151 00:07:27,740 --> 00:07:29,600 with a list of names. 152 00:07:29,600 --> 00:07:32,730 However, these names aren't in any order at all, and it looks 153 00:07:32,730 --> 00:07:34,850 like a few names are repeated. 154 00:07:34,850 --> 00:07:38,510 What we want is a list of unique names in alphabetical 155 00:07:38,510 --> 00:07:42,550 order, saved to a file called final.txt. 156 00:07:42,550 --> 00:07:45,210 We could, of course, write a C program to do this for us. 157 00:07:45,210 --> 00:07:46,560 But it's going to get unnecessarily 158 00:07:46,560 --> 00:07:48,560 complex pretty quickly.
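Before moving on, the file-counting pipeline from a moment ago can be sketched as follows. The three file names are made up for illustration, and the scratch directory is an assumption so the count is predictable.

```shell
# Work in a scratch directory containing exactly three files.
cd "$(mktemp -d)"
touch a.txt b.txt c.txt

# When its output goes to a pipe, ls writes one name per line,
# so counting lines with wc -l counts the directory entries.
ls | wc -l
```

Here the count comes out to 3, one line per file created above.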
159 00:07:48,560 --> 00:07:51,740 Let's instead use pipes and some built-in tools to solve 160 00:07:51,740 --> 00:07:53,300 this problem. 161 00:07:53,300 --> 00:07:57,760 >> The first thing we need to do is read the file students.txt. 162 00:07:57,760 --> 00:08:00,530 The cat command will do just that. 163 00:08:00,530 --> 00:08:03,230 It will read in the specified file and write 164 00:08:03,230 --> 00:08:05,750 its contents to stdout. 165 00:08:05,750 --> 00:08:07,570 After we've read the text file, we'll 166 00:08:07,570 --> 00:08:09,490 want to sort the names. 167 00:08:09,490 --> 00:08:12,510 The sort command can handle this for us. 168 00:08:12,510 --> 00:08:16,830 Sort will output the lines supplied via stdin to stdout 169 00:08:16,830 --> 00:08:19,310 in sorted order. 170 00:08:19,310 --> 00:08:23,450 In order to supply the contents of students.txt to 171 00:08:23,450 --> 00:08:29,600 sort's stdin, we could use a pipe to combine cat and sort. 172 00:08:29,600 --> 00:08:34,440 So I can execute cat students.txt | sort and 173 00:08:34,440 --> 00:08:35,640 press Enter. 174 00:08:35,640 --> 00:08:39,309 And now, we see the contents of students.txt in 175 00:08:39,309 --> 00:08:40,909 alphabetical order. 176 00:08:40,909 --> 00:08:42,860 >> So let's add another command-- 177 00:08:42,860 --> 00:08:44,730 uniq, or unique-- 178 00:08:44,730 --> 00:08:46,230 to our pipeline. 179 00:08:46,230 --> 00:08:49,810 As you might guess, uniq, when supplied a sorted sequence of 180 00:08:49,810 --> 00:08:53,650 lines via stdin, will output the unique lines. 181 00:08:53,650 --> 00:08:56,910 So now we have cat students.txt 182 00:08:56,910 --> 00:09:00,040 | sort | uniq. 183 00:09:00,040 --> 00:09:03,330 Finally, we can save the output of the pipeline to a 184 00:09:03,330 --> 00:09:09,090 file via cat students.txt | sort | uniq 185 00:09:09,090 --> 00:09:12,440 > final.txt.
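The pipeline we just built can be tried out with a small sample file. The names written to students.txt below are hypothetical sample data (an assumption, since the real file's contents aren't listed), and the scratch directory is only there so nothing existing is overwritten.

```shell
# Work in a scratch directory so no existing file is touched.
cd "$(mktemp -d)"

# Hypothetical sample data: unordered names, some repeated.
printf '%s\n' Carol Alice Bob Alice Dave Carol > students.txt

# cat writes the file to stdout, sort orders the lines,
# uniq drops adjacent duplicates, and '>' saves the result.
cat students.txt | sort | uniq > final.txt

cat final.txt
```

Sorting first matters: uniq only collapses repeated lines that are next to each other, which is why it comes after sort in the pipeline.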
186 00:09:12,440 --> 00:09:16,260 So, if we open up final.txt, we have exactly what we were 187 00:09:16,260 --> 00:09:17,270 looking for: 188 00:09:17,270 --> 00:09:20,180 a list of unique names in alphabetical order, 189 00:09:20,180 --> 00:09:22,150 saved to a text file. 190 00:09:22,150 --> 00:09:26,020 By the way, we also could have said sort < 191 00:09:26,020 --> 00:09:32,290 students.txt | uniq > final.txt to do exactly 192 00:09:32,290 --> 00:09:35,400 the same thing, using each of the operators we've seen in 193 00:09:35,400 --> 00:09:36,580 this video. 194 00:09:36,580 --> 00:09:39,540 >> My name is Tommy, and this is CS50.