1 00:00:00,000 --> 00:00:03,290 [MUSIC PLAYING] 2 00:00:03,290 --> 00:00:05,650 3 00:00:05,650 --> 00:00:09,440 SPEAKER: Well, hello, one and all, and welcome to our short on vectors. 4 00:00:09,440 --> 00:00:12,220 We'll take a look at how we can use these things called vectors 5 00:00:12,220 --> 00:00:15,040 to store information and also how to quickly build up 6 00:00:15,040 --> 00:00:20,470 vectors for use in R. Now, vectors are this way of storing data that 7 00:00:20,470 --> 00:00:24,100 has all the same type, or we might say in R, the same storage mode, 8 00:00:24,100 --> 00:00:27,650 for instance, all numbers or all character strings. 9 00:00:27,650 --> 00:00:31,018 And they're really handy to build up all kinds of other structures. 10 00:00:31,018 --> 00:00:33,310 So we'll focus on them today and see if we can actually 11 00:00:33,310 --> 00:00:37,240 quickly make a lot of them using functions built into R. 12 00:00:37,240 --> 00:00:41,530 So let's say I want to have a vector here of words. 13 00:00:41,530 --> 00:00:46,270 And to make a new vector, I can actually use this function called c(), 14 00:00:46,270 --> 00:00:48,710 where c() stands for Combine. 15 00:00:48,710 --> 00:00:51,140 This is a pretty key idea with vectors. 16 00:00:51,140 --> 00:00:56,120 I'm combining different individual pieces of data into one vector. 17 00:00:56,120 --> 00:01:04,760 So here, I might have a vector of words, like "It's", "a", "beautiful", "day", 18 00:01:04,760 --> 00:01:05,720 just like this. 19 00:01:05,720 --> 00:01:10,640 And I can store it, let's say, inside of this object that I'll call words. 20 00:01:10,640 --> 00:01:12,380 And I can run this on line 1. 21 00:01:12,380 --> 00:01:16,140 And we'll see that it has been run down in my R console down below. 22 00:01:16,140 --> 00:01:19,520 So if I want to see what's inside of this words vector, 23 00:01:19,520 --> 00:01:22,010 I can simply type "words" and run that line. 
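For anyone following along without the video, here is a minimal sketch of what the editor likely shows at this point. The object name `words` and the four strings come from the short itself; the `length()` and `typeof()` checks are additions for illustration, not something the speaker runs.

```r
# Combine four character strings into a single vector with c().
words <- c("It's", "a", "beautiful", "day")

# Typing the object's name prints its contents in the console.
words

# Added checks (not in the short): four elements, all stored as character.
length(words)  # 4
typeof(words)  # "character"
```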
24 00:01:22,010 --> 00:01:25,500 And I'll see that I have down below those four words-- 25 00:01:25,500 --> 00:01:27,450 "It's" "a" "beautiful" "day." 26 00:01:27,450 --> 00:01:29,180 And again, this works because this vector 27 00:01:29,180 --> 00:01:34,130 is created from individual pieces of data that are all of the same type, 28 00:01:34,130 --> 00:01:36,560 in this case, all character strings. 29 00:01:36,560 --> 00:01:42,750 So c()-- great function to get to know to build up vectors from scratch. 30 00:01:42,750 --> 00:01:47,540 But if you want to make other kinds of vectors, if you want to manipulate them, 31 00:01:47,540 --> 00:01:49,910 it's worth thinking about other functions that 32 00:01:49,910 --> 00:01:52,400 can help you do so pretty quickly. 33 00:01:52,400 --> 00:01:56,870 Let's say I wanted to have this vector, "It's", "a", "beautiful", "day". 34 00:01:56,870 --> 00:02:00,603 But I want to repeat these words, these four words, multiple times, 35 00:02:00,603 --> 00:02:03,270 like "It's" "a" "beautiful" "day", "It's" "a" "beautiful" "day", 36 00:02:03,270 --> 00:02:04,478 "It's" "a" "beautiful" "day." 37 00:02:04,478 --> 00:02:09,620 I can actually do that using a function, one called rep() where rep() stands 38 00:02:09,620 --> 00:02:10,580 for Repeat. 39 00:02:10,580 --> 00:02:15,590 And rep() can actually take as input a vector and very quickly make for me some 40 00:02:15,590 --> 00:02:18,630 new vector based on that old vector. 41 00:02:18,630 --> 00:02:19,800 Let's try this out. 42 00:02:19,800 --> 00:02:23,900 The first argument to rep() is the vector to begin with, so in this case, 43 00:02:23,900 --> 00:02:24,530 words. 44 00:02:24,530 --> 00:02:28,080 Or it could even be a single value. 45 00:02:28,080 --> 00:02:32,340 And here, one of the arguments to rep() is one called times. 46 00:02:32,340 --> 00:02:34,730 So I can say, times = 2 here. 
47 00:02:34,730 --> 00:02:38,420 And this will actually take my vector, "It's", "a", "beautiful", "day", 48 00:02:38,420 --> 00:02:42,290 and give me a new vector with this, "It's", "a", "beautiful", "day", 49 00:02:42,290 --> 00:02:44,340 repeated two times. 50 00:02:44,340 --> 00:02:45,560 I'll try running this here. 51 00:02:45,560 --> 00:02:49,530 And we'll see down below that I get "It's" "a" "beautiful" "day", 52 00:02:49,530 --> 00:02:50,990 "It's" "a" "beautiful" "day". 53 00:02:50,990 --> 00:02:51,930 Pretty good. 54 00:02:51,930 --> 00:02:54,110 I could even do times = 3. 55 00:02:54,110 --> 00:02:58,790 And now I'll get that repeated three times down below as well. 56 00:02:58,790 --> 00:03:02,600 So rep() good for repeating things some number of times. 57 00:03:02,600 --> 00:03:07,140 But it turns out rep() also has another argument, one called each. 58 00:03:07,140 --> 00:03:12,090 So notice how when I used times, this took my vector as it is-- 59 00:03:12,090 --> 00:03:17,090 "It's", "a", "beautiful", "day"-- and repeated these values in that same order 60 00:03:17,090 --> 00:03:18,930 two or three different times. 61 00:03:18,930 --> 00:03:24,350 Well, if I use each, that will actually look at every individual element I have 62 00:03:24,350 --> 00:03:28,970 in my vector, like "It's", "a", "beautiful", "day," individually now, 63 00:03:28,970 --> 00:03:31,708 and repeat each of those two times. 64 00:03:31,708 --> 00:03:33,000 So let's see what happens here. 65 00:03:33,000 --> 00:03:33,980 I'll run this. 66 00:03:33,980 --> 00:03:37,710 And I'll get "It's" "It's" "a" "a" "beautiful" "beautiful" "day" "day". 67 00:03:37,710 --> 00:03:40,700 So less of a utility here, but you could imagine 68 00:03:40,700 --> 00:03:44,720 this being useful if you want to transform a vector where every element 69 00:03:44,720 --> 00:03:48,900 you have might be repeated each some number of times. 
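The difference between `times` and `each` can be sketched side by side as follows. The object names `twice` and `doubled` are hypothetical, chosen here only so both results can be kept and compared; the short runs the calls directly.

```r
words <- c("It's", "a", "beautiful", "day")

# times = 2 repeats the whole vector, in order, two times.
twice <- rep(words, times = 2)
# "It's" "a" "beautiful" "day" "It's" "a" "beautiful" "day"

# each = 2 repeats every element twice before moving on to the next one.
doubled <- rep(words, each = 2)
# "It's" "It's" "a" "a" "beautiful" "beautiful" "day" "day"

# Both results have eight elements; only the ordering differs.
length(twice)
length(doubled)
```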
70 00:03:48,900 --> 00:03:53,480 So rep() a good tool to have in your toolkit when working with vectors like 71 00:03:53,480 --> 00:03:54,513 these. 72 00:03:54,513 --> 00:03:56,430 Let's think of other kinds of vectors, though. 73 00:03:56,430 --> 00:04:00,240 So this is here a character vector, one that is filled with character strings. 74 00:04:00,240 --> 00:04:03,780 But, of course, vectors can also be composed of numbers. 75 00:04:03,780 --> 00:04:05,240 So let me clear my console. 76 00:04:05,240 --> 00:04:09,930 And let's try making a new vector, one called numbers. 77 00:04:09,930 --> 00:04:16,040 And I'll, just for simplicity, make this a vector of 1, 2, 3, 4, 5, 6, 7, 8, 9, 78 00:04:16,040 --> 00:04:17,040 and 10. 79 00:04:17,040 --> 00:04:20,019 So this is my vector of numbers 1 through 10. 80 00:04:20,019 --> 00:04:25,250 If I run line 1 to assign this vector to the object numbers and type "numbers" 81 00:04:25,250 --> 00:04:29,000 down below, we'll all see that vector down below as well. 82 00:04:29,000 --> 00:04:32,780 So c() again coming in handy when I want to make, in this case, 83 00:04:32,780 --> 00:04:34,700 a vector of some numbers. 84 00:04:34,700 --> 00:04:38,070 But you could imagine it's getting pretty tiring pretty quickly 85 00:04:38,070 --> 00:04:42,140 if I were to enter in beyond just 10, like 100 or 200 or so. 86 00:04:42,140 --> 00:04:45,870 And I wanted 1, 2, 3, all the way up to some number. 87 00:04:45,870 --> 00:04:53,210 So thankfully, R comes with the ability to use what's called this colon here, 88 00:04:53,210 --> 00:04:59,450 which will give me a vector of numbers, starting with the first one, inclusive, 89 00:04:59,450 --> 00:05:04,010 and ending with the last one, inclusive, and going up in ascending order one 90 00:05:04,010 --> 00:05:05,240 number at a time. 
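As a rough sketch of the colon operator just described, the snippet below compares the typed-out vector with its colon equivalent. The names `typed_out`, `with_colon`, and `one_to_hundred` are hypothetical and used only for this comparison.

```r
# The long way: every number typed out and combined with c().
typed_out <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# The short way: start:end, both ends inclusive, stepping up by one.
with_colon <- 1:10

# Element by element, the two vectors hold the same values.
all(typed_out == with_colon)  # TRUE

# The same idea scales to longer runs without any extra typing.
one_to_hundred <- 1:100
length(one_to_hundred)  # 100
```

One small difference worth knowing, given the earlier remark about storage modes: `1:10` is stored as integers, while `c(1, 2, ..., 10)` is stored as doubles. The values still compare as equal.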
91 00:05:05,240 --> 00:05:08,830 So here, if I run line 1 and line 3, I'll 92 00:05:08,830 --> 00:05:11,710 get the same result but with far fewer lines-- 93 00:05:11,710 --> 00:05:14,270 well, much less typing, let's say. 94 00:05:14,270 --> 00:05:15,410 I could even do this. 95 00:05:15,410 --> 00:05:17,710 I believe I could do without the c(). 96 00:05:17,710 --> 00:05:21,650 And I could do numbers 1 through 10, and same thing here. 97 00:05:21,650 --> 00:05:25,220 So 1:10 actually creates the vector for me. 98 00:05:25,220 --> 00:05:28,630 I don't need c() to do that in this particular case. 99 00:05:28,630 --> 00:05:32,920 Now, what else could we do when it comes to things like these sequences? 100 00:05:32,920 --> 00:05:35,920 You might often want to create some sequence of numbers. 101 00:05:35,920 --> 00:05:41,930 But you might not always want it to be in ascending order one number at a time. 102 00:05:41,930 --> 00:05:45,190 So there is a function to be aware of, one called seq(), 103 00:05:45,190 --> 00:05:50,020 which stands for sequence, and allows you to construct numeric vectors 104 00:05:50,020 --> 00:05:55,660 by specifying some start point, some end point, and some, let's say-- 105 00:05:55,660 --> 00:05:58,630 what do you call it-- maybe an increase along the way. 106 00:05:58,630 --> 00:06:04,460 So here, why don't we try the "from" argument to seq() and start with 1, 107 00:06:04,460 --> 00:06:08,280 and the "to" argument to seq() and end with 10. 108 00:06:08,280 --> 00:06:11,780 And this will by default give me the same thing we saw before, 109 00:06:11,780 --> 00:06:13,820 a sequence of numbers 1 through 10. 110 00:06:13,820 --> 00:06:15,960 I'll hit Enter on this, Enter on this. 111 00:06:15,960 --> 00:06:20,730 And now we'll see here that I get 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 112 00:06:20,730 --> 00:06:24,650 so seq() then doing the same work we had with 1:10.
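The two points made here, that `c()` is optional around `1:10` and that `seq()` with only a start and an end gives the same run, can be sketched as below. The names `wrapped` and `from_seq` are hypothetical; the short reuses the single name `numbers` throughout.

```r
# 1:10 already creates a vector on its own.
numbers <- 1:10

# Wrapping it in c() is harmless but adds nothing.
wrapped <- c(1:10)
identical(numbers, wrapped)  # TRUE

# seq() with only from and to steps up by one, just like the colon.
from_seq <- seq(from = 1, to = 10)
all(from_seq == numbers)  # TRUE
```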
113 00:06:24,650 --> 00:06:27,630 But let's say I want to get a little more fancy with this. 114 00:06:27,630 --> 00:06:28,610 I certainly can. 115 00:06:28,610 --> 00:06:33,650 seq() also has an argument called "by." 116 00:06:33,650 --> 00:06:39,680 And by specifies a number to increase by as we go through 117 00:06:39,680 --> 00:06:41,280 and create our sequence. 118 00:06:41,280 --> 00:06:46,250 So to be clear, with by = 2, we would start at 1 and then jump up two to 3 119 00:06:46,250 --> 00:06:52,290 and then jump up two to 5, all the way until the next jump would take us past 10. 120 00:06:52,290 --> 00:06:53,790 So let's see what happens here. 121 00:06:53,790 --> 00:06:55,890 I'll run numbers. 122 00:06:55,890 --> 00:06:58,200 I'll run numbers and then numbers again. 123 00:06:58,200 --> 00:07:06,620 And here, we see 1, 3, 5, 7, 9, so now increasing, we'll see, by 2 each time. 124 00:07:06,620 --> 00:07:09,740 Well, one other thing we can use seq() for 125 00:07:09,740 --> 00:07:14,600 is maybe you want a vector of some specific length. 126 00:07:14,600 --> 00:07:19,650 And you want it to start with some number and end with some number. 127 00:07:19,650 --> 00:07:23,010 And all numbers in the middle should be equally spaced apart. 128 00:07:23,010 --> 00:07:30,830 Well, seq() can do that as well using an argument called length.out. 129 00:07:30,830 --> 00:07:34,190 A bit of a weird argument name here, but length.out 130 00:07:34,190 --> 00:07:38,690 can take as input some number and create a sequence for us 131 00:07:38,690 --> 00:07:43,070 where the first number is 1, the last number is 10, 132 00:07:43,070 --> 00:07:47,810 and the total vector's length will be this number here, in this case, 3, 133 00:07:47,810 --> 00:07:50,670 with all the numbers evenly spaced apart. 134 00:07:50,670 --> 00:07:52,050 So let's try this out. 135 00:07:52,050 --> 00:07:55,640 I'll run line 1 here, and I'll run line 3.
136 00:07:55,640 --> 00:07:58,910 And now we'll see, true to its word, I get 137 00:07:58,910 --> 00:08:04,130 a vector of three numbers, where each is evenly spaced out. 138 00:08:04,130 --> 00:08:07,970 And it begins with 1 and ends with 10. 139 00:08:07,970 --> 00:08:14,030 So this was a brief foray into vectors: how to create them manually with c(), 140 00:08:14,030 --> 00:08:18,320 how to repeat them using rep(), how to, let's say, 141 00:08:18,320 --> 00:08:23,120 get a sequence of values using colon, and how to modify or adjust that 142 00:08:23,120 --> 00:08:27,210 sequence using, in this case, the seq() function. 143 00:08:27,210 --> 00:08:29,630 So this then was our short on vectors. 144 00:08:29,630 --> 00:08:32,350 And we'll see you next time. 145 00:08:32,350 --> 00:08:34,000
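As a recap of the two `seq()` variations from the second half of the short, here is a small sketch. The names `odds` and `three_points` are hypothetical; the short assigns each result to `numbers` instead.

```r
# by sets the step size: start at 1, add 2 each time, never go past 10.
odds <- seq(from = 1, to = 10, by = 2)
odds  # 1 3 5 7 9

# length.out sets how many values to produce; seq() works out the spacing
# so that the first value is 1, the last is 10, and the gaps are equal.
three_points <- seq(from = 1, to = 10, length.out = 3)
three_points  # 1.0 5.5 10.0
```

With `by`, the end point is a limit and not a guarantee: 10 itself is not in the first result, because stepping by 2 from 1 never lands on it. With `length.out`, both end points are always included.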