1 00:00:00,000 --> 00:00:00,780 2 00:00:00,780 --> 00:00:11,880 >> [MUSIC PLAYING] 3 00:00:11,880 --> 00:00:16,480 >> DAVID CHOUINARD: I'm David Chouinard, and this is D3. 4 00:00:16,480 --> 00:00:17,700 Welcome. 5 00:00:17,700 --> 00:00:21,270 We're going to learn about D3 today. 6 00:00:21,270 --> 00:00:25,020 D3 is a JavaScript framework for building high quality 7 00:00:25,020 --> 00:00:28,110 interactive visualizations for the web. 8 00:00:28,110 --> 00:00:30,870 Things like what we're seeing in back of me, 9 00:00:30,870 --> 00:00:34,230 we're going to learn to make those things, kind of the basics of it. 10 00:00:34,230 --> 00:00:36,452 But it's going to be cool. 11 00:00:36,452 --> 00:00:38,160 Let's get started making pretty pictures. 12 00:00:38,160 --> 00:00:41,108 13 00:00:41,108 --> 00:00:44,350 We've got more demos of prospects available. 14 00:00:44,350 --> 00:00:48,040 15 00:00:48,040 --> 00:00:50,760 Let's do it. 16 00:00:50,760 --> 00:00:58,700 >> Act I, DOM manipulation-- we're going to start right away making cool things. 17 00:00:58,700 --> 00:01:01,240 First of all, on the left, we have code. 18 00:01:01,240 --> 00:01:03,470 On the right, we have the result of our code. 19 00:01:03,470 --> 00:01:04,900 Let's go through it. 20 00:01:04,900 --> 00:01:05,780 >> Let's make a circle. 21 00:01:05,780 --> 00:01:08,570 How does that sound? 22 00:01:08,570 --> 00:01:14,934 svg.append circle-- we just made a circle. 23 00:01:14,934 --> 00:01:16,100 You don't believe me, right? 24 00:01:16,100 --> 00:01:18,190 It's not there. 25 00:01:18,190 --> 00:01:21,830 >> So what we did right here is, SVG is scalable vector graphics. 26 00:01:21,830 --> 00:01:27,530 This is the way we tell the browser to make vector graphics in the browser. 27 00:01:27,530 --> 00:01:30,740 What we just did right now is add a circle to the browser.
28 00:01:30,740 --> 00:01:34,790 >> The problem is that the circle requires a few basic attributes 29 00:01:34,790 --> 00:01:36,850 before we can actually see it. 30 00:01:36,850 --> 00:01:40,045 We need to tell it its x position, its y position, its radius. 31 00:01:40,045 --> 00:01:43,310 We didn't tell it any of that, so we're not seeing it right now. 32 00:01:43,310 --> 00:01:46,210 But let's tell it stuff. 33 00:01:46,210 --> 00:01:49,510 >> So first of all, we've got to give our circle a name. 34 00:01:49,510 --> 00:01:53,070 So let's call it circle. 35 00:01:53,070 --> 00:01:54,406 Our circle has a name now. 36 00:01:54,406 --> 00:01:57,230 37 00:01:57,230 --> 00:01:59,490 And let's give it a few attributes. 38 00:01:59,490 --> 00:02:03,690 How about cx, which is center x, so the x position of the center. 39 00:02:03,690 --> 00:02:06,730 Let's say, 200 for 200 pixels. 40 00:02:06,730 --> 00:02:10,220 >> Let's give it a y of 200 pixels as well. 41 00:02:10,220 --> 00:02:16,032 And an r, a radius, of about 40 pixels. 42 00:02:16,032 --> 00:02:16,950 Now let's see. 43 00:02:16,950 --> 00:02:21,740 44 00:02:21,740 --> 00:02:23,440 I cannot spell. 45 00:02:23,440 --> 00:02:30,430 46 00:02:30,430 --> 00:02:31,520 >> There you go. 47 00:02:31,520 --> 00:02:37,330 We have a circle at position 200 pixels, 200 pixels, radius of 40 pixels. 48 00:02:37,330 --> 00:02:38,280 Kind of cool, right? 49 00:02:38,280 --> 00:02:38,988 We have a circle. 50 00:02:38,988 --> 00:02:40,880 Yeah. 51 00:02:40,880 --> 00:02:42,670 >> So no need to follow along. 52 00:02:42,670 --> 00:02:45,790 All these examples, all of the code I'm doing today 53 00:02:45,790 --> 00:02:51,300 will be provided online at the end in the form of interactive examples 54 00:02:51,300 --> 00:02:54,010 with checkpoints at every act, and so on. 55 00:02:54,010 --> 00:02:55,160 >> Let's do more stuff. 56 00:02:55,160 --> 00:02:58,901 This black circle is really ugly.
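The circle steps so far can be sketched in code. This is a minimal sketch and not the presenter's exact source: the `svg` selection, the canvas size in the comment, and the helper name `addCircle` are assumptions for illustration, while `append` and `attr` are standard D3 selection methods.

```javascript
// Attributes from the walkthrough: center at (200, 200), radius 40.
const circleAttrs = { cx: 200, cy: 200, r: 40 };

// Hypothetical helper (not from the talk): appends a <circle> to a D3
// selection and sets each attribute with selection.attr(name, value).
function addCircle(svg, attrs) {
  const circle = svg.append("circle");
  Object.entries(attrs).forEach(([name, value]) => circle.attr(name, value));
  return circle;
}

// In a browser with D3 loaded (assumed setup; the canvas size is a guess):
//   const svg = d3.select("body").append("svg")
//       .attr("width", 700).attr("height", 400);
//   const circle = addCircle(svg, circleAttrs);
```

An SVG circle's radius defaults to 0, which is why nothing shows up until `r` is set.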
57 00:02:58,901 --> 00:03:01,541 I'm sorry for those error messages right there. 58 00:03:01,541 --> 00:03:05,340 There we go. 59 00:03:05,340 --> 00:03:06,350 >> Let's give it a color. 60 00:03:06,350 --> 00:03:07,170 How's that? 61 00:03:07,170 --> 00:03:08,340 I like steel blue. 62 00:03:08,340 --> 00:03:13,280 63 00:03:13,280 --> 00:03:16,030 Well, our circle changed color. 64 00:03:16,030 --> 00:03:17,320 That's great. 65 00:03:17,320 --> 00:03:31,330 Let's make it semi-transparent too-- semi-transparent. 66 00:03:31,330 --> 00:03:33,670 >> So these are attributes we're defining on the circle. 67 00:03:33,670 --> 00:03:36,774 The first thing we did is we put a circle on the page. 68 00:03:36,774 --> 00:03:38,690 And then we're defining a bunch of attributes. 69 00:03:38,690 --> 00:03:41,610 Some of these are required, like cx, cy, and radius. 70 00:03:41,610 --> 00:03:42,680 And others are optional. 71 00:03:42,680 --> 00:03:44,730 >> There are a lot more attributes. 72 00:03:44,730 --> 00:03:46,760 There's a lot of them. 73 00:03:46,760 --> 00:03:53,070 For example, we could have a stroke as well, a stroke of red. 74 00:03:53,070 --> 00:03:55,630 But let's remove that. 75 00:03:55,630 --> 00:04:00,450 We're back to a circle, a blue circle. 76 00:04:00,450 --> 00:04:01,600 >> So let's make more circles. 77 00:04:01,600 --> 00:04:02,810 How's that? 78 00:04:02,810 --> 00:04:04,665 Let's make another circle. 79 00:04:04,665 --> 00:04:05,985 This is exciting, right? 80 00:04:05,985 --> 00:04:09,630 81 00:04:09,630 --> 00:04:12,300 >> So say I just copy-pasted what we had already. 82 00:04:12,300 --> 00:04:13,570 Let's call it circle2. 83 00:04:13,570 --> 00:04:15,840 And let's do the exact same thing and give it 84 00:04:15,840 --> 00:04:20,450 attributes, give it an x position of 300. 85 00:04:20,450 --> 00:04:24,140 Yay, we have two circles now. 86 00:04:24,140 --> 00:04:27,240 >> And of course, we could update these values.
87 00:04:27,240 --> 00:04:31,640 I could put it at 400, and now it moves. 88 00:04:31,640 --> 00:04:35,470 And since it's annoying, let's remove it, so circle2.remove. 89 00:04:35,470 --> 00:04:39,000 90 00:04:39,000 --> 00:04:40,730 It's gone now. 91 00:04:40,730 --> 00:04:43,170 >> So what we're doing is just very, very-- this 92 00:04:43,170 --> 00:04:46,030 is very similar to what you might do in jQuery, for example. 93 00:04:46,030 --> 00:04:48,240 We're just manipulating the DOM, it's called. 94 00:04:48,240 --> 00:04:50,040 You might have heard that word before. 95 00:04:50,040 --> 00:04:53,255 We're creating stuff, setting attributes on stuff, removing stuff. 96 00:04:53,255 --> 00:04:58,950 97 00:04:58,950 --> 00:05:02,360 >> Now, here's where it gets interesting. 98 00:05:02,360 --> 00:05:07,250 So later in the code, we could still refer to the original circle here. 99 00:05:07,250 --> 00:05:14,100 So let's reset its attribute cx. 100 00:05:14,100 --> 00:05:18,260 Let's say, its x position to 400. 101 00:05:18,260 --> 00:05:22,406 And I'm going to transition that, so it's obvious. 102 00:05:22,406 --> 00:05:23,360 There we go. 103 00:05:23,360 --> 00:05:24,780 >> So we added a circle. 104 00:05:24,780 --> 00:05:26,440 We set some attributes. 105 00:05:26,440 --> 00:05:28,210 We added another circle, removed it. 106 00:05:28,210 --> 00:05:31,650 And then we're modifying the original circle. 107 00:05:31,650 --> 00:05:35,400 >> But here's where it gets a lot more interesting. 108 00:05:35,400 --> 00:05:39,070 Not only can we set attributes as just values, we can say, 109 00:05:39,070 --> 00:05:41,610 hey, circle, go to position 200. 110 00:05:41,610 --> 00:05:44,540 We can also set them as functions. 111 00:05:44,540 --> 00:05:48,850 >> So instead of giving 400 here, we can make some calculation 112 00:05:48,850 --> 00:05:53,950 on the fly for what we want that attribute to be. 113 00:05:53,950 --> 00:05:56,580 So this is how you'd express that.
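As a rough sketch of these steps (removing an element, animating a change, and passing a function instead of a fixed value), assuming `circle` and `circle2` are D3 selections like the ones created earlier. The helper names and the canvas width are hypothetical; `remove`, `transition`, and `attr` are standard D3 methods.

```javascript
const width = 700; // assumed canvas width in pixels

// Hypothetical helper: animate a selection's center to a new x position.
// selection.transition() returns a transition whose attr() interpolates
// from the current value to the target instead of jumping.
function slideTo(circle, x) {
  return circle.transition().attr("cx", x);
}

// Hypothetical helper: take an element off the page.
function discard(selection) {
  return selection.remove();
}

// A function can stand in for a fixed value; D3 calls it to compute
// the attribute on the fly.
function randomX() {
  return Math.random() * width;
}

// In the browser (assuming circle and circle2 exist as in the walkthrough):
//   circle2.attr("cx", 400);   // moves immediately
//   discard(circle2);          // circle2 is gone
//   slideTo(circle, 400);      // original circle glides to x = 400
//   slideTo(circle, randomX);  // glides to a freshly computed position
```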
114 00:05:56,580 --> 00:06:00,660 We say, instead of 400, let me give you a function instead. 115 00:06:00,660 --> 00:06:04,180 And here, inside this function, we can make any crazy calculation. 116 00:06:04,180 --> 00:06:06,820 >> We could take the time and look at some other thing 117 00:06:06,820 --> 00:06:11,230 and dynamically decide for the circle what value we want. 118 00:06:11,230 --> 00:06:15,266 How about we just give it a random x position? 119 00:06:15,266 --> 00:06:20,360 120 00:06:20,360 --> 00:06:21,120 So that's that. 121 00:06:21,120 --> 00:06:25,490 >> So what that says is, for every x, run this function. 122 00:06:25,490 --> 00:06:29,340 And what we're doing is calculating something, a random number times the width, 123 00:06:29,340 --> 00:06:30,410 and returning that. 124 00:06:30,410 --> 00:06:34,765 So every time we run that, we get a circle that goes to a random place. 125 00:06:34,765 --> 00:06:36,394 It's kind of cool. 126 00:06:36,394 --> 00:06:38,310 I feel like I could look at this for a little. 127 00:06:38,310 --> 00:06:44,274 128 00:06:44,274 --> 00:06:46,440 We're starting to get to something interesting here. 129 00:06:46,440 --> 00:06:49,120 130 00:06:49,120 --> 00:06:51,390 Let's make this data driven now. 131 00:06:51,390 --> 00:06:53,420 There's no data here. 132 00:06:53,420 --> 00:06:54,482 Let's change that. 133 00:06:54,482 --> 00:06:57,440 134 00:06:57,440 --> 00:07:12,140 >> Act II, Data Driven Documents-- So let's return to here. 135 00:07:12,140 --> 00:07:15,340 And let's just get rid of circle2, because we're just adding and removing 136 00:07:15,340 --> 00:07:15,840 it. 137 00:07:15,840 --> 00:07:17,382 So we don't really need it. 138 00:07:17,382 --> 00:07:21,421 We need to be a lot more clever here. 139 00:07:21,421 --> 00:07:23,170 Let's say, we have some data of some sort. 140 00:07:23,170 --> 00:07:31,540 141 00:07:31,540 --> 00:07:40,020 One moment-- let's say, we had data of this form.
142 00:07:40,020 --> 00:07:41,800 We had an array, just a bunch of numbers. 143 00:07:41,800 --> 00:07:45,750 We have seven numbers here, whatever these represent-- amount 144 00:07:45,750 --> 00:07:48,810 in people's bank account, how much they weigh, god knows what. 145 00:07:48,810 --> 00:07:51,310 >> These are numbers, and we want to use our circles 146 00:07:51,310 --> 00:07:53,240 to represent those numbers somehow. 147 00:07:53,240 --> 00:07:55,515 We want to tie our circles to those numbers. 148 00:07:55,515 --> 00:07:58,750 149 00:07:58,750 --> 00:07:59,626 So what do we do? 150 00:07:59,626 --> 00:08:01,500 Let's say, we want a circle for every number. 151 00:08:01,500 --> 00:08:03,590 We could do the old thing we were doing-- 152 00:08:03,590 --> 00:08:06,020 circle append and circle2 and circle3. 153 00:08:06,020 --> 00:08:10,020 But this gets out of hand, and there's a lot of repeating logic. 154 00:08:10,020 --> 00:08:12,760 >> So let's get more clever with that. 155 00:08:12,760 --> 00:08:17,810 Instead of using the var circle svg.append that we were just using, 156 00:08:17,810 --> 00:08:21,580 we're going to use this little block here. 157 00:08:21,580 --> 00:08:24,510 I don't want to go in-depth into what all these parts do. 158 00:08:24,510 --> 00:08:26,020 And it's kind of an advanced topic. 159 00:08:26,020 --> 00:08:27,830 And I wish I could. 160 00:08:27,830 --> 00:08:31,370 >> But the key thing to recognize-- and you'll see this very often in D3 code. 161 00:08:31,370 --> 00:08:36,840 This block of text basically creates as many circles 162 00:08:36,840 --> 00:08:41,360 as there are data elements in this array right here. 163 00:08:41,360 --> 00:08:53,420 164 00:08:53,420 --> 00:08:55,780 So this creates as many circles as there are elements. 165 00:08:55,780 --> 00:08:58,520 It's going to create us seven circles. 166 00:08:58,520 --> 00:09:01,710 And it does a really, really key thing. 167 00:09:01,710 --> 00:09:02,460 So let's run that.
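The block described here is most likely D3's standard data join. The sketch below shows that common pattern rather than the presenter's exact code: the array holds only three stand-in values (the talk's array has seven), `drawCircles` is a hypothetical name, and the fill and opacity values are approximations of the earlier styling.

```javascript
// Stand-in data; the talk's array has seven numbers.
const data = [10, 45, 105];

// Hypothetical helper wrapping the usual D3 data join.
function drawCircles(svg, values) {
  return svg.selectAll("circle") // select circles (none exist yet)
    .data(values)                // bind one datum per circle
    .enter()                     // placeholders for data with no element yet
    .append("circle")            // create a circle for each placeholder
    .attr("cx", 200)
    .attr("cy", 200)
    .attr("r", 40)
    .attr("fill", "steelblue")
    .attr("opacity", 0.5);       // assumed value for "semi-transparent"
}

// In the browser, with an existing svg selection:
//   const circles = drawCircles(svg, data);
```

Because every circle gets the same `cx` and `cy` here, the circles are drawn on top of one another.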
168 00:09:02,460 --> 00:09:05,460 Let's remove our other circle. 169 00:09:05,460 --> 00:09:09,565 Let's just comment this part out and run this again. 170 00:09:09,565 --> 00:09:13,840 171 00:09:13,840 --> 00:09:15,260 >> There we go. 172 00:09:15,260 --> 00:09:18,030 So our circle here is a lot darker, because we 173 00:09:18,030 --> 00:09:20,720 have seven circles, one on top of the other. 174 00:09:20,720 --> 00:09:25,425 We just created seven circles, one for each of these data elements. 175 00:09:25,425 --> 00:09:28,860 But there's a key thing that happened with this snippet right here. 176 00:09:28,860 --> 00:09:31,030 >> It's that data was bound. 177 00:09:31,030 --> 00:09:33,440 So every single one of those data elements, 178 00:09:33,440 --> 00:09:38,830 10, 45, 105, was bound to a particular circle. 179 00:09:38,830 --> 00:09:40,960 So this not only created a bunch of circles 180 00:09:40,960 --> 00:09:43,420 but tied those two things together. 181 00:09:43,420 --> 00:09:48,740 >> And in the future, because we created those circles with this D3 function, 182 00:09:48,740 --> 00:09:52,430 if I give you a circle, you can give me the data associated with it. 183 00:09:52,430 --> 00:09:53,280 So we can ask D3. 184 00:09:53,280 --> 00:09:54,840 Hey, D3, I have this circle. 185 00:09:54,840 --> 00:09:57,350 What's the data that the circle has? 186 00:09:57,350 --> 00:10:01,290 And D3 would tell us 10 or 45 or 105. 187 00:10:01,290 --> 00:10:02,380 >> These things are bound. 188 00:10:02,380 --> 00:10:04,490 That's a very, very fundamental concept. 189 00:10:04,490 --> 00:10:06,070 Let's look at that. 190 00:10:06,070 --> 00:10:12,210 >> So the way we'd ask D3-- so this is irrelevant for this, 191 00:10:12,210 --> 00:10:16,620 but just trust me on it. 192 00:10:16,620 --> 00:10:17,620 This is how we ask D3. 193 00:10:17,620 --> 00:10:21,312 Hey, D3, give me the first circle that you can find.
194 00:10:21,312 --> 00:10:23,580 Give me the first circle you can find. 195 00:10:23,580 --> 00:10:29,660 And then we could ask D3, what's the data on that, like this, 10. 196 00:10:29,660 --> 00:10:33,380 >> So we just ask D3, find me the first circle you can find. 197 00:10:33,380 --> 00:10:34,400 What's its data? 198 00:10:34,400 --> 00:10:36,650 10, that is indeed our first data element. 199 00:10:36,650 --> 00:10:42,150 We could ask it, hey, D3, find us our third circle. 200 00:10:42,150 --> 00:10:44,450 105. 201 00:10:44,450 --> 00:10:45,740 Why is this really important? 202 00:10:45,740 --> 00:10:49,790 203 00:10:49,790 --> 00:10:52,250 >> So right here, I mentioned that we could use functions. 204 00:10:52,250 --> 00:10:54,910 And I mentioned that was a very powerful thing. 205 00:10:54,910 --> 00:11:03,070 So not only can our functions do things like do some computation, for example, 206 00:11:03,070 --> 00:11:09,170 return a random number, they can also do things based on the data. 207 00:11:09,170 --> 00:11:11,550 This is what data driven documents means. 208 00:11:11,550 --> 00:11:13,750 That's what D3 stands for. 209 00:11:13,750 --> 00:11:17,800 >> So this x position-- instead of just saying, all the circles, 210 00:11:17,800 --> 00:11:21,735 get x position 200, we could give it a function. 211 00:11:21,735 --> 00:11:26,140 212 00:11:26,140 --> 00:11:30,140 And here, we can make some calculation. 213 00:11:30,140 --> 00:11:33,710 And d here stands in place for the data. 214 00:11:33,710 --> 00:11:36,120 So every time we have a circle, basically, 215 00:11:36,120 --> 00:11:37,750 D3 will create these seven circles. 216 00:11:37,750 --> 00:11:38,500 And then for every 217 00:11:38,500 --> 00:11:41,920 >> circle, it's going to go, hey, circle1 what's your x position. 218 00:11:41,920 --> 00:11:45,210 Previously, we were always answering 200.
219 00:11:45,210 --> 00:11:48,630 But now, every time D3 asks us what's your x position, 220 00:11:48,630 --> 00:11:51,790 it's going to give us-- we have that circle, so we have the data. 221 00:11:51,790 --> 00:11:55,290 It's going to give us the data and say, what do you want the x position to be, 222 00:11:55,290 --> 00:11:57,120 based on that data. 223 00:11:57,120 --> 00:11:59,590 >> Let's just return the actual data. 224 00:11:59,590 --> 00:12:04,910 So if we run this, this gives us data driven documents. 225 00:12:04,910 --> 00:12:08,040 These circles are positioned in relation-- 226 00:12:08,040 --> 00:12:11,120 they're placed as a function of the data. 227 00:12:11,120 --> 00:12:13,100 >> So for the first circle, D3 puts a circle. 228 00:12:13,100 --> 00:12:16,770 And then D3 asks us, what do you want the x position to be. 229 00:12:16,770 --> 00:12:19,620 And we just say, whatever the data is. 230 00:12:19,620 --> 00:12:21,185 Make the x position 10. 231 00:12:21,185 --> 00:12:26,320 >> Then it asks, what do you want the x position to be for the second circle. 232 00:12:26,320 --> 00:12:27,270 And we answer, 45. 233 00:12:27,270 --> 00:12:30,000 234 00:12:30,000 --> 00:12:32,230 And we, of course, can make some computation here. 235 00:12:32,230 --> 00:12:35,510 I find that those circles are kind of squished up. 236 00:12:35,510 --> 00:12:38,965 >> So multiply it by 3, multiply data by 3. 237 00:12:38,965 --> 00:12:41,870 238 00:12:41,870 --> 00:12:43,840 Our circles just got expanded out. 239 00:12:43,840 --> 00:12:46,730 Our value was tripled. 240 00:12:46,730 --> 00:12:51,010 >> The circle is really on the edge, so let's maybe kind of offset it. 241 00:12:51,010 --> 00:12:53,632 Let's say, by 20. 242 00:12:53,632 --> 00:12:56,070 Here you go. 243 00:12:56,070 --> 00:12:57,590 >> This is a data visualization. 244 00:12:57,590 --> 00:13:01,767 It's a very basic one, but this gives us some insight into our data.
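The data-driven x position described above can be sketched as an accessor function. This is a minimal illustration: `xFromDatum` is a hypothetical name, and `circles` in the comment is assumed to be a selection created by a data join.

```javascript
// Accessor of the kind passed to attr(): D3 calls it once per circle with
// that circle's bound datum d and uses the return value as the attribute.
function xFromDatum(d) {
  return d * 3 + 20; // spread out by a factor of 3, then offset by 20 pixels
}

// With circles created by a data join (assumed to exist as `circles`):
//   circles.attr("cx", xFromDatum);
//   d3.select("circle").datum();  // reads back the first circle's datum

// The three values named in the talk, mapped to x positions:
console.log([10, 45, 105].map(xFromDatum)); // [50, 155, 335]
```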
245 00:13:01,767 --> 00:13:04,600 It tells us that, for example, we have a little cluster of elements. 246 00:13:04,600 --> 00:13:06,340 And we have a big outlier here. 247 00:13:06,340 --> 00:13:10,830 This gives us some information about the distribution. 248 00:13:10,830 --> 00:13:20,830 >> If we were, for example, to change the data to 150 here and refresh, 249 00:13:20,830 --> 00:13:22,630 our visualization is changed. 250 00:13:22,630 --> 00:13:24,285 This document is data driven. 251 00:13:24,285 --> 00:13:32,640 252 00:13:32,640 --> 00:13:36,180 >> So of course, all these elements, all these attributes here, 253 00:13:36,180 --> 00:13:38,430 we can use a function, not just the numbers, not just 254 00:13:38,430 --> 00:13:39,900 the x and y positions. 255 00:13:39,900 --> 00:13:42,120 So we can use a function for the color. 256 00:13:42,120 --> 00:13:45,260 257 00:13:45,260 --> 00:13:46,360 So we'll do the same. 258 00:13:46,360 --> 00:13:49,360 We'll give it a function. 259 00:13:49,360 --> 00:13:52,320 >> And let's say, we could have conditionals in our function. 260 00:13:52,320 --> 00:13:54,770 This function can be hundreds of lines long. 261 00:13:54,770 --> 00:13:57,150 It can do very, very complicated things. 262 00:13:57,150 --> 00:13:59,080 >> So let's put an if statement here. 263 00:13:59,080 --> 00:14:03,420 Let's say, if our data is less than 50, that's some threshold 264 00:14:03,420 --> 00:14:05,817 that we're interested in for some reason. 265 00:14:05,817 --> 00:14:06,650 Let's make it green. 266 00:14:06,650 --> 00:14:09,830 267 00:14:09,830 --> 00:14:15,320 Otherwise, let's make it red. 268 00:14:15,320 --> 00:14:16,110 How's that? 269 00:14:16,110 --> 00:14:19,630 270 00:14:19,630 --> 00:14:21,220 Nice. 271 00:14:21,220 --> 00:14:24,860 >> So our data visualization is starting to convey more interesting information
273 00:14:26,727 --> 00:14:28,560 So now we know a bit about the distribution. 274 00:14:28,560 --> 00:14:31,768 And we know that there's some sort of cut off at 50 that we're interested in. 275 00:14:31,768 --> 00:14:35,630 We know that there are two data points below that threshold and most of them 276 00:14:35,630 --> 00:14:36,130 above. 277 00:14:36,130 --> 00:14:41,510 278 00:14:41,510 --> 00:14:46,160 >> So as a final step, this data here, it's very rare to see this like that. 279 00:14:46,160 --> 00:14:52,610 So let's just move it out to a variable because that's cleaner, like this. 280 00:14:52,610 --> 00:15:02,980 281 00:15:02,980 --> 00:15:05,197 And then we use that variable here. 282 00:15:05,197 --> 00:15:06,280 It's the exact same thing. 283 00:15:06,280 --> 00:15:07,280 It's just a bit cleaner. 284 00:15:07,280 --> 00:15:25,300 285 00:15:25,300 --> 00:15:35,300 >> Next up, Act III, Scales-- So one problem right 286 00:15:35,300 --> 00:15:38,920 here is, if we change our data in this 200 value-- 287 00:15:38,920 --> 00:15:41,685 if we change it to 400 or something and refresh, 288 00:15:41,685 --> 00:15:44,540 then this value just went offscreen. 289 00:15:44,540 --> 00:15:49,040 So our logic right here of how we do the times 3 290 00:15:49,040 --> 00:15:52,570 and 20, to spread it out and then offset it a bit is really clunky. 291 00:15:52,570 --> 00:15:54,150 >> What do those numbers mean? 292 00:15:54,150 --> 00:15:55,400 They're just hard coded there. 293 00:15:55,400 --> 00:15:58,830 And they're very much tied to the data. 294 00:15:58,830 --> 00:16:00,550 We want a data driven document. 295 00:16:00,550 --> 00:16:05,460 We want a very flexible document, that given data, adapts to it 296 00:16:05,460 --> 00:16:07,900 and represents it. 297 00:16:07,900 --> 00:16:11,330 >> What we basically need is, we have this range of numbers 10. 298 00:16:11,330 --> 00:16:12,640 45, 105. 
299 00:16:12,640 --> 00:16:17,630 And we want to map that out onto the width, the full width here. 300 00:16:17,630 --> 00:16:20,620 So we have the range of numbers going from 0 to 100. 301 00:16:20,620 --> 00:16:24,980 And we have this campus I goes from 20 to 700, in this case. 302 00:16:24,980 --> 00:16:26,515 >> We kind of want to map that on. 303 00:16:26,515 --> 00:16:30,002 We want to scale that up and then offset it a little bit. 304 00:16:30,002 --> 00:16:33,165 It turns out that D3 has these. 305 00:16:33,165 --> 00:16:34,220 It's called a scale. 306 00:16:34,220 --> 00:16:37,410 307 00:16:37,410 --> 00:16:38,250 So let's use it. 308 00:16:38,250 --> 00:16:46,300 309 00:16:46,300 --> 00:16:49,670 >> The way that works-- I'm going to type this up and then explain it. 310 00:16:49,670 --> 00:17:01,530 311 00:17:01,530 --> 00:17:02,450 This is a scale. 312 00:17:02,450 --> 00:17:08,670 What it will do is, it will map out values from 1 to 200 on to 20 to 600. 313 00:17:08,670 --> 00:17:10,990 We can check that. 314 00:17:10,990 --> 00:17:13,329 We can see that here. 315 00:17:13,329 --> 00:17:21,704 >> So if I feed it 1-- one moment. 316 00:17:21,704 --> 00:17:47,764 317 00:17:47,764 --> 00:17:48,555 Give me one second. 318 00:17:48,555 --> 00:17:53,680 319 00:17:53,680 --> 00:17:55,080 I must have mistyped it. 320 00:17:55,080 --> 00:18:15,320 321 00:18:15,320 --> 00:18:15,990 There you go. 322 00:18:15,990 --> 00:18:17,930 I'm sorry about that. 323 00:18:17,930 --> 00:18:22,050 >> So what a scale will do is, it will take a value 324 00:18:22,050 --> 00:18:24,930 and then convert that, expand that out, so it 325 00:18:24,930 --> 00:18:27,320 fills the full range you're asking for. 326 00:18:27,320 --> 00:18:32,910 So in this case, if we give it one, it's going to map that out onto 20. 327 00:18:32,910 --> 00:18:37,750 And if we give it 200, it's going to map that on to 600. 
328 00:18:37,750 --> 00:18:40,460 And somewhere in between, if we give it 100, it's 329 00:18:40,460 --> 00:18:44,610 going to be somewhere in between 20 and 600. 330 00:18:44,610 --> 00:18:51,480 >> And of course, now this is what we need to remove those hard coded 331 00:18:51,480 --> 00:18:53,402 things we have right there. 332 00:18:53,402 --> 00:18:55,950 So what we want to do is take the data that we're 333 00:18:55,950 --> 00:19:00,950 given, that individual data element, and pass it to scale first. 334 00:19:00,950 --> 00:19:02,635 So scale will scale it up. 335 00:19:02,635 --> 00:19:27,020 336 00:19:27,020 --> 00:19:48,880 >> Well-- Oh, we have a little error here. 337 00:19:48,880 --> 00:19:50,120 We're missing data. 338 00:19:50,120 --> 00:19:51,290 There you go. 339 00:19:51,290 --> 00:19:58,550 340 00:19:58,550 --> 00:19:59,550 And that expands it out. 341 00:19:59,550 --> 00:20:01,383 >> That gives us the same result we had before, 342 00:20:01,383 --> 00:20:04,030 but without those hard coded constants. 343 00:20:04,030 --> 00:20:07,790 And if the size of our canvas changes, for example, 344 00:20:07,790 --> 00:20:11,790 if we want to have this over 400 pixels and it squishes out, 345 00:20:11,790 --> 00:20:15,440 we can have it over-- we can expand it, or we 346 00:20:15,440 --> 00:20:21,890 can reduce this left margin to something less or more than 20. 347 00:20:21,890 --> 00:20:25,470 These numbers, these hard coded numbers now make sense to us. 348 00:20:25,470 --> 00:20:28,110 349 00:20:28,110 --> 00:20:30,520 >> And we could do a lot more interesting things as well. 350 00:20:30,520 --> 00:20:35,990 So instead of having a linear scale, we might want a log scale. 351 00:20:35,990 --> 00:20:37,840 And that will give us a log scale. 352 00:20:37,840 --> 00:20:41,269 >> So now our scale, instead of just expanding out that range, 353 00:20:41,269 --> 00:20:42,810 it's doing more sophisticated things.
354 00:20:42,810 --> 00:20:48,790 355 00:20:48,790 --> 00:20:53,790 Instead of having this range hard coded, and instead of having that 600, 356 00:20:53,790 --> 00:20:58,465 we might want to just use the width, so from 20 to the width minus 40, 357 00:20:58,465 --> 00:21:02,392 2 times the margin on the other side. 358 00:21:02,392 --> 00:21:05,350 And this makes a lot more sense to somebody who might look at the code. 359 00:21:05,350 --> 00:21:08,080 360 00:21:08,080 --> 00:21:11,850 >> Interestingly, the scales get very, very sophisticated as well. 361 00:21:11,850 --> 00:21:13,350 They do a lot of interesting things. 362 00:21:13,350 --> 00:21:17,620 So scales don't necessarily have to operate just with numbers. 363 00:21:17,620 --> 00:21:18,955 Let's make a color scale. 364 00:21:18,955 --> 00:21:23,120 365 00:21:23,120 --> 00:21:26,120 >> So our range could be-- our domain is 1 to 200. 366 00:21:26,120 --> 00:21:28,220 That's the input thing. 367 00:21:28,220 --> 00:21:33,793 But we might want to map from green to red, for example. 368 00:21:33,793 --> 00:21:39,710 369 00:21:39,710 --> 00:21:42,910 And now, if we pass it 1, we're going to get green. 370 00:21:42,910 --> 00:21:45,110 If we give it 200, we'll get red. 371 00:21:45,110 --> 00:21:49,480 And if we pass it something in between, it's going to be some mix of that, 372 00:21:49,480 --> 00:21:52,520 somewhere on the gradient between green and red. 373 00:21:52,520 --> 00:21:55,210 >> And instead of having this kind of clunky logic 374 00:21:55,210 --> 00:21:58,550 we have here with the conditional right there, 375 00:21:58,550 --> 00:22:03,250 we could have something-- a linear scale between those. 376 00:22:03,250 --> 00:22:07,100 So we'd use the scale we just created, which we called color. 377 00:22:07,100 --> 00:22:09,060 And we'd give it d, which is our data element. 378 00:22:09,060 --> 00:22:14,250 379 00:22:14,250 --> 00:22:15,060 And there we go. 
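The scale variations from this act can be sketched together. This assumes the D3 v4+ names `d3.scaleLinear` and `d3.scaleLog` (version 3 spells them `d3.scale.linear()` and `d3.scale.log()`); the canvas width, the `makeScales` helper, and the `circles` selection in the comment are hypothetical.

```javascript
const width = 700;  // assumed canvas width in pixels
const margin = 20;

// Output range derived from the canvas size instead of a hard coded 600,
// following the "20 to the width minus 40" wording in the talk.
const xRange = [margin, width - 2 * margin];

// Hypothetical helper; expects the d3 object to be passed in.
function makeScales(d3) {
  return {
    x: d3.scaleLinear().domain([1, 200]).range(xRange),
    logX: d3.scaleLog().domain([1, 200]).range(xRange),
    // Scales can interpolate between colors as well as numbers.
    color: d3.scaleLinear().domain([1, 200]).range(["green", "red"]),
  };
}

// In the browser:
//   const { x, color } = makeScales(d3);
//   circles.attr("cx", (d) => x(d)).attr("fill", (d) => color(d));
```

Passing `d3` in as a parameter is only a convenience for this sketch; in a page that loads D3 globally the scales would normally be created inline.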
380 00:22:15,060 --> 00:22:18,070 We have a color scale. 381 00:22:18,070 --> 00:22:18,940 >> So this is mapping. 382 00:22:18,940 --> 00:22:20,960 So the far left is completely green. 383 00:22:20,960 --> 00:22:22,560 The far right is completely red. 384 00:22:22,560 --> 00:22:24,828 And everything in between is a function of d. 385 00:22:24,828 --> 00:22:33,369 386 00:22:33,369 --> 00:22:35,160 We have an interesting visualization here. 387 00:22:35,160 --> 00:22:36,952 But our data was kind of boring. 388 00:22:36,952 --> 00:22:39,410 Let's see what we could do if we had more interesting data. 389 00:22:39,410 --> 00:22:44,420 390 00:22:44,420 --> 00:22:50,500 >> Act IV, Working With Data-- the first thing 391 00:22:50,500 --> 00:22:53,560 we'll want to do to make our visualization more interesting 392 00:22:53,560 --> 00:22:56,140 is to move the data somewhere else. 393 00:22:56,140 --> 00:22:58,310 It's very clunky to have the data hard coded here. 394 00:22:58,310 --> 00:23:01,220 And generally, we'll be asking somebody else for the data. 395 00:23:01,220 --> 00:23:05,400 We'll be maybe asking the government, the Census Bureau, what's your data 396 00:23:05,400 --> 00:23:10,170 and then plotting that or asking some third-party entity for some data 397 00:23:10,170 --> 00:23:13,330 and then building a visualization on that. 398 00:23:13,330 --> 00:23:17,170 >> So the first thing we want to do is move that to somewhere else. 399 00:23:17,170 --> 00:23:24,130 So I'm going to create a file here called data.json. 400 00:23:24,130 --> 00:23:25,600 JSON is the data format. 401 00:23:25,600 --> 00:23:29,210 You don't have to know much about that. 402 00:23:29,210 --> 00:23:33,210 And we're going to copy the little data we have there, 403 00:23:33,210 --> 00:23:40,330 paste it in there verbatim, go back to our visualization code 404 00:23:40,330 --> 00:23:45,362 here, and use this function right here.
405 00:23:45,362 --> 00:23:46,820 You don't have to know the details. 406 00:23:46,820 --> 00:23:49,800 But what this will do is, it will find that file, 407 00:23:49,800 --> 00:23:51,780 fetch it, and return it to us. 408 00:23:51,780 --> 00:24:11,660 409 00:24:11,660 --> 00:24:15,220 So what this does is, it goes and gets the data.json file. 410 00:24:15,220 --> 00:24:18,570 And then all the code that's indented inside-- essentially, 411 00:24:18,570 --> 00:24:21,800 all the code we have there-- will run only when we get the data back. 412 00:24:21,800 --> 00:24:25,760 And then it's going to run that code with the data we have. 413 00:24:25,760 --> 00:24:28,870 Great, we have a visualization that queries 414 00:24:28,870 --> 00:24:31,390 for some code somewhere else, which is usually 415 00:24:31,390 --> 00:24:36,110 where it queries some data from somewhere else, which is usually 416 00:24:36,110 --> 00:24:38,656 how visualizations work. 417 00:24:38,656 --> 00:24:41,400 >> But I want to go back to the data. 418 00:24:41,400 --> 00:24:48,030 So the data fundamentally in D3-- D3 consumes data that's a list of things. 419 00:24:48,030 --> 00:24:53,000 D3 expects the data to just be a list of things, an array of things. 420 00:24:53,000 --> 00:24:58,780 It doesn't matter what those things are, so long as it's an array of them. 421 00:24:58,780 --> 00:25:02,460 >> So here, for example, we could of course have floating point values. 422 00:25:02,460 --> 00:25:04,830 We could have negatives. 423 00:25:04,830 --> 00:25:09,400 D3 doesn't care, so long as it's a list of things. 424 00:25:09,400 --> 00:25:13,270 >> As another interesting thing, we could also 425 00:25:13,270 --> 00:25:19,410 have a list of strings like that. 426 00:25:19,410 --> 00:25:25,440 So these are the Crimson headlines I picked up a few days ago. 427 00:25:25,440 --> 00:25:29,220 And maybe you can find some interesting things about these headlines.
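Loading the file can be sketched as follows. This assumes D3 version 5 or later, where `d3.json(url)` returns a Promise; in version 3 the same function took a callback instead, which may be closer to what was typed in the talk. The `loadAndDraw` wrapper and its array check are hypothetical additions for illustration.

```javascript
// Hypothetical wrapper: fetch a JSON file, then draw once the data arrives.
// Assumes D3 v5 or later, where d3.json(url) returns a Promise.
function loadAndDraw(d3, url, draw) {
  return d3.json(url).then((data) => {
    // D3 joins work on arrays, so fail early on anything else.
    if (!Array.isArray(data)) {
      throw new TypeError("expected the file to contain an array");
    }
    draw(data); // runs only after the data has come back
    return data;
  });
}

// In the browser:
//   loadAndDraw(d3, "data.json", (data) => { /* visualization code */ });
```

Note that browsers generally need the page to be served over HTTP for the file request to succeed; opening the HTML file directly from disk often blocks it.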
428 00:25:29,220 --> 00:25:30,970 >> So again, this is a list of things. 429 00:25:30,970 --> 00:25:32,360 D3 doesn't care. 430 00:25:32,360 --> 00:25:35,572 These happen to be strings. 431 00:25:35,572 --> 00:25:36,530 We've changed our data. 432 00:25:36,530 --> 00:25:38,210 >> Let's return to our visualization. 433 00:25:38,210 --> 00:25:42,495 Now, our visualization expects the input to be numbers. 434 00:25:42,495 --> 00:25:44,370 So we're going to have to make a few changes. 435 00:25:44,370 --> 00:25:47,180 436 00:25:47,180 --> 00:25:52,180 So for example, first of all, maybe we want to put these circles along 437 00:25:52,180 --> 00:25:56,870 by the length of the headline, the number of characters in the headline. 438 00:25:56,870 --> 00:26:03,600 >> So what we'll do is-- every time our function is called with a string, 439 00:26:03,600 --> 00:26:09,095 we'll find its length and then pass that to scale. 440 00:26:09,095 --> 00:26:11,550 The color, I'll return that to steel blue. 441 00:26:11,550 --> 00:26:19,060 442 00:26:19,060 --> 00:26:20,420 And there we go. 443 00:26:20,420 --> 00:26:23,190 We have a visualization of Crimson headlines. 444 00:26:23,190 --> 00:26:25,500 >> Our scale is a bit off. 445 00:26:25,500 --> 00:26:29,680 Let's assume that the longest headline is 100 characters long, 446 00:26:29,680 --> 00:26:32,244 so span that out a bit. 447 00:26:32,244 --> 00:26:33,410 And we have a visualization. 448 00:26:33,410 --> 00:26:36,710 So it seems that most headlines are pretty close together, 449 00:26:36,710 --> 00:26:38,750 in terms of character length. 450 00:26:38,750 --> 00:26:41,200 But one there really stands out. 451 00:26:41,200 --> 00:26:46,660 >> We could build some tools to explore that more. 452 00:26:46,660 --> 00:26:50,710 But when I was working on this, I was curious whether, in this data set, 453 00:26:50,710 --> 00:26:53,880 headlines with a colon in them would be longer. 454 00:26:53,880 --> 00:26:55,770 I assumed they would.
455 00:26:55,770 --> 00:26:56,660 >> So let's find out. 456 00:26:56,660 --> 00:27:00,650 Let's use the color channel like we did before, 457 00:27:00,650 --> 00:27:04,540 to encode something about whether there's a colon or not. 458 00:27:04,540 --> 00:27:07,220 So we'll use a conditional again. 459 00:27:07,220 --> 00:27:09,350 You don't have to know the details of this, 460 00:27:09,350 --> 00:27:14,260 but this is how we check a string for a particular character 461 00:27:14,260 --> 00:27:16,355 in JavaScript, again, not relevant. 462 00:27:16,355 --> 00:27:18,910 463 00:27:18,910 --> 00:27:23,270 >> But if we don't find a colon, we'll return green. 464 00:27:23,270 --> 00:27:26,100 And if we do, we'll return red. 465 00:27:26,100 --> 00:27:29,010 So again, headlines that have a colon will be red. 466 00:27:29,010 --> 00:27:34,980 This is what this means-- nice. 467 00:27:34,980 --> 00:27:38,040 >> So it seems that my hypothesis is debunked. 468 00:27:38,040 --> 00:27:39,360 There's only two. 469 00:27:39,360 --> 00:27:42,380 We only have six data points and only two had colons. 470 00:27:42,380 --> 00:27:45,510 But it seems a bit more on the lower end, in fact. 471 00:27:45,510 --> 00:27:47,830 Headlines with colons seem to generally be shorter, 472 00:27:47,830 --> 00:27:52,370 at least in our data set-- interesting. 473 00:27:52,370 --> 00:27:55,830 >> Let's return that to steel blue and then see 474 00:27:55,830 --> 00:28:00,601 what we can make with even more interesting data. 475 00:28:00,601 --> 00:28:04,370 476 00:28:04,370 --> 00:28:09,070 So again, I mentioned that data in D3 is a list of things. 477 00:28:09,070 --> 00:28:11,080 We've seen numbers of many types. 478 00:28:11,080 --> 00:28:12,810 We've seen strings. 479 00:28:12,810 --> 00:28:15,700 But the things can also be objects. 480 00:28:15,700 --> 00:28:20,080 >> They can be complicated things that include a lot of things.
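[Editor's sketch] The headline example above can be approximated in plain JavaScript. This is a minimal sketch, not the code shown on screen: the headlines are made up, `linearScale` is a hypothetical stand-in for a D3 linear scale, and the 0 to 600 pixel range is an assumption. The two accessor functions are the kind of callbacks the talk passes in for position and color.

```javascript
// Hypothetical headlines standing in for the Crimson data (not the real list).
const headlines = [
  "Campus Debate: Who Pays for Late-Night Dining?",
  "Students Rally for Longer Library Hours",
  "Football Wins"
];

// Plain-JS stand-in for a linear scale: maps the domain [d0, d1] onto the range [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Assumed numbers: headlines of up to 100 characters, spread over 0..600 pixels.
const scale = linearScale([0, 100], [0, 600]);

// Position: the string's length, passed through the scale.
const cx = d => scale(d.length);

// Color: indexOf returns -1 when the character is absent,
// so no colon gives green and a colon gives red.
const fill = d => (d.indexOf(":") === -1 ? "green" : "red");

headlines.forEach(d => console.log(cx(d), fill(d), d));
```

With D3 loaded, functions like `cx` and `fill` would be handed to the selection as attribute callbacks, and D3 would call them once per bound headline.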
481 00:28:20,080 --> 00:28:24,510 To say that more clearly, in most cases, we 482 00:28:24,510 --> 00:28:28,384 want every data point to be more complicated than just one value. 483 00:28:28,384 --> 00:28:30,175 If you'd imagine a database about students, 484 00:28:30,175 --> 00:28:32,470 there might be a student name, a student ID, 485 00:28:32,470 --> 00:28:36,370 and a lot of things associated with a particular record, 486 00:28:36,370 --> 00:28:39,834 not just a string or a number. 487 00:28:39,834 --> 00:28:40,750 So let's look at that. 488 00:28:40,750 --> 00:28:55,180 489 00:28:55,180 --> 00:28:56,760 >> This is one such data set. 490 00:28:56,760 --> 00:28:59,090 This is a data set about earthquakes. 491 00:28:59,090 --> 00:29:01,910 492 00:29:01,910 --> 00:29:08,430 So everything here on our list or array of things contains many things itself. 493 00:29:08,430 --> 00:29:11,380 So every data point has a magnitude and a coordinate. 494 00:29:11,380 --> 00:29:13,425 And coordinates themselves contain two things. 495 00:29:13,425 --> 00:29:15,960 496 00:29:15,960 --> 00:29:20,450 >> So every datum is now a lot more complicated and a lot more interesting 497 00:29:20,450 --> 00:29:22,700 and contains much more interesting information. 498 00:29:22,700 --> 00:29:26,730 Let's see what we could build out of that. 499 00:29:26,730 --> 00:29:36,130 Returning back to here, again, using our histogram circle visualization 500 00:29:36,130 --> 00:29:42,110 we've built, let's see if we can build a visualization of magnitude distribution 501 00:29:42,110 --> 00:29:43,305 in our data set. 502 00:29:43,305 --> 00:29:45,850 503 00:29:45,850 --> 00:29:48,660 >> So here, it's the same concept. 504 00:29:48,660 --> 00:29:51,920 But now, d contains more things. 505 00:29:51,920 --> 00:29:54,780 d contains many data elements. 506 00:29:54,780 --> 00:29:57,946 So we get d back. 507 00:29:57,946 --> 00:29:59,670 D3 gives us d.
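[Editor's sketch] A data set of objects like the one described might look roughly like this. The values are invented, and the field names `magnitude` and `coordinates`, as well as the longitude-first ordering, are assumptions rather than the file used in the talk.

```javascript
// Hypothetical earthquake records: each data point is an object, not a single value.
// Field names and the [longitude, latitude] order are assumptions for illustration.
const earthquakes = [
  { magnitude: 3.1, coordinates: [-122.4, 37.8] },
  { magnitude: 5.6, coordinates: [-149.9, 61.2] },
  { magnitude: 3.2, coordinates: [-97.5, 35.5] }
];

// When the callback is handed one data point `d`, it picks out the field it needs.
const magnitudeOf = d => d.magnitude;

// The same accessor works over the whole list.
const magnitudes = earthquakes.map(magnitudeOf);
console.log(magnitudes);
```

The only change from the string example is inside the callback: instead of using `d` directly, it reads one property of `d`.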
508 00:29:59,670 --> 00:30:06,080 And we respond by finding the magnitude of d and then passing that to scale. 509 00:30:06,080 --> 00:30:08,490 >> And then we need to change our scale, of course. 510 00:30:08,490 --> 00:30:12,980 So magnitudes simply don't go much more than 10. 511 00:30:12,980 --> 00:30:15,485 Actually, there's never been a 10 magnitude earthquake. 512 00:30:15,485 --> 00:30:19,360 But that's kind of our upper end, our upper spectrum. 513 00:30:19,360 --> 00:30:20,240 >> Let's refresh. 514 00:30:20,240 --> 00:30:22,990 Nice, we have a visualization. 515 00:30:22,990 --> 00:30:25,490 It's interesting to note-- so there are two data points that 516 00:30:25,490 --> 00:30:29,010 are almost exactly on top of each other, in terms of magnitude. 517 00:30:29,010 --> 00:30:31,350 You see this by the opacity we're using. 518 00:30:31,350 --> 00:30:40,810 519 00:30:40,810 --> 00:30:42,690 >> We have geographic data now. 520 00:30:42,690 --> 00:30:44,710 We have latitudes and longitudes. 521 00:30:44,710 --> 00:30:47,549 Maybe we could do something a lot more interesting with that. 522 00:30:47,549 --> 00:30:49,590 Let's find some more interesting way to visualize 523 00:30:49,590 --> 00:30:53,500 this more complicated data we have access to. 524 00:30:53,500 --> 00:31:04,950 >> Act V, Mapping-- fundamentally, we want to put these on a map. 525 00:31:04,950 --> 00:31:07,690 I mean, this is where this is going. 526 00:31:07,690 --> 00:31:13,130 We want to encode information about the position of these earthquake readings, 527 00:31:13,130 --> 00:31:16,350 as well as their magnitude, because we have that now. 528 00:31:16,350 --> 00:31:21,310 We understand how to consume more complicated data. 529 00:31:21,310 --> 00:31:26,200 >> The first thing we'll do is create a map, a background map. 530 00:31:26,200 --> 00:31:29,360 I'm going to go through this very quickly. 531 00:31:29,360 --> 00:31:30,560 This is tricky code.
532 00:31:30,560 --> 00:31:33,110 It's another one of those recipes you don't really 533 00:31:33,110 --> 00:31:35,690 have to understand fully for you to use. 534 00:31:35,690 --> 00:31:38,510 535 00:31:38,510 --> 00:31:39,740 But this is code. 536 00:31:39,740 --> 00:31:43,580 This code right here creates a map. 537 00:31:43,580 --> 00:31:45,730 >> We're not going to go in detail. 538 00:31:45,730 --> 00:31:54,210 But superficially, what it does is, it queries this us.json file, which 539 00:31:54,210 --> 00:31:57,150 is a data file like the one we had before. 540 00:31:57,150 --> 00:31:59,150 It's more complex, of course. 541 00:31:59,150 --> 00:32:02,920 But in this case, every data point is a state 542 00:32:02,920 --> 00:32:05,420 and has a list of latitudes and longitudes 543 00:32:05,420 --> 00:32:10,500 that define the polygon that forms that state. 544 00:32:10,500 --> 00:32:13,280 >> So what D3 will do is similar to what we did before. 545 00:32:13,280 --> 00:32:18,140 It will request that and bind that to an element. 546 00:32:18,140 --> 00:32:20,890 And there's a function that will map that element out, 547 00:32:20,890 --> 00:32:23,410 based on the latitudes and longitudes. 548 00:32:23,410 --> 00:32:24,580 You can read more on that. 549 00:32:24,580 --> 00:32:27,385 And I recommend it. 550 00:32:27,385 --> 00:32:30,090 >> There are links at the end of the posted code. 551 00:32:30,090 --> 00:32:31,570 And the code is commented. 552 00:32:31,570 --> 00:32:34,050 In there are links for further reading on this. 553 00:32:34,050 --> 00:32:36,590 I recommend you look it up. 554 00:32:36,590 --> 00:32:39,460 But what we care about is this projection function. 555 00:32:39,460 --> 00:32:41,210 I want to go through that. 556 00:32:41,210 --> 00:32:43,522 >> First of all, let me show you that, yes, we have a map. 557 00:32:43,522 --> 00:32:47,300 558 00:32:47,300 --> 00:32:49,970 Maps are cool.
559 00:32:49,970 --> 00:32:52,330 So let's look at this projection function. 560 00:32:52,330 --> 00:32:56,481 >> Projection is very much like a scale, scales again. 561 00:32:56,481 --> 00:32:59,210 So what this projection function 562 00:32:59,210 --> 00:33:06,610 does is, we could pass it longitudes and latitudes-- in this case, 563 00:33:06,610 --> 00:33:09,590 these values here are the lat-longs of the building 564 00:33:09,590 --> 00:33:13,990 we're sitting in right now-- to projection. 565 00:33:13,990 --> 00:33:20,560 And projection will convert that into x and y pixel values. 566 00:33:20,560 --> 00:33:23,300 >> So what projection is doing is very similar to our scale. 567 00:33:23,300 --> 00:33:27,270 It's taking our latitudes and longitudes that represent a whole globe 568 00:33:27,270 --> 00:33:31,390 and shrinking and sizing that down to the square that we want, 569 00:33:31,390 --> 00:33:33,510 that we've given it. 570 00:33:33,510 --> 00:33:35,220 In this case, we're passing these values. 571 00:33:35,220 --> 00:33:41,370 And it's giving us, well, that on your screen means 640 pixels. 572 00:33:41,370 --> 00:33:46,250 This whole screen is 700 pixels wide, so that makes us about here, 573 00:33:46,250 --> 00:33:53,310 and 154 pixels down, which I would estimate is pretty much here. 574 00:33:53,310 --> 00:33:57,250 >> So taking those lat-longs, which represent something on the whole globe 575 00:33:57,250 --> 00:34:02,850 and squishing and moving that around to give us x and y pixel values, 576 00:34:02,850 --> 00:34:05,450 this is the first thing that's done in this mapping code. 577 00:34:05,450 --> 00:34:07,920 And then the rest of the code consumes the data 578 00:34:07,920 --> 00:34:14,310 and then maps those lat-longs onto something on your screen. 579 00:34:14,310 --> 00:34:18,380 >> But we're going to use this projection function, because it turns out 580 00:34:18,380 --> 00:34:20,270 we have lat-longs as well.
581 00:34:20,270 --> 00:34:24,509 Looking back at our data, we have latitude and longitude coordinates 582 00:34:24,509 --> 00:34:25,425 for every observation. 583 00:34:25,425 --> 00:34:28,131 584 00:34:28,131 --> 00:34:29,130 So let's use projection. 585 00:34:29,130 --> 00:34:33,250 586 00:34:33,250 --> 00:34:37,639 >> So looking at our x position, we want our x position-- 587 00:34:37,639 --> 00:34:39,590 we have a latitude and a longitude. 588 00:34:39,590 --> 00:34:40,770 But we want pixel values. 589 00:34:40,770 --> 00:34:43,510 And it turns out, we have exactly what we want-- projection. 590 00:34:43,510 --> 00:34:46,239 Very much like we were using scale right here, 591 00:34:46,239 --> 00:34:52,075 we're now going to use projection and pass it coordinates. 592 00:34:52,075 --> 00:34:55,241 593 00:34:55,241 --> 00:34:56,949 So the first thing we're doing-- so we're 594 00:34:56,949 --> 00:35:01,520 getting d, which is an individual data element of an individual earthquake 595 00:35:01,520 --> 00:35:02,370 reading. 596 00:35:02,370 --> 00:35:04,640 The first thing we do is get the coordinates. 597 00:35:04,640 --> 00:35:06,150 All right, we have the coordinates. 598 00:35:06,150 --> 00:35:09,160 >> The second thing we do is pass that on to projection. 599 00:35:09,160 --> 00:35:13,440 Projection converts those coordinates into pixel values, x and y. 600 00:35:13,440 --> 00:35:16,680 And then the last thing we want to do is just get the x, 601 00:35:16,680 --> 00:35:19,342 which in this case is the first one. 602 00:35:19,342 --> 00:35:22,050 It's the first of the two things that are returned by projection. 603 00:35:22,050 --> 00:35:27,840 604 00:35:27,840 --> 00:35:29,630 >> We'll do the same for y. 605 00:35:29,630 --> 00:35:34,960 But instead, we'll return the second element, the y. 606 00:35:34,960 --> 00:35:35,980 Get ready to refresh.
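[Editor's sketch] The three steps just described (get the coordinates, pass them to projection, take the first or second element) can be sketched as follows. The `projection` function here is a deliberately simple, hypothetical stand-in that stretches longitude and latitude linearly over an assumed 700 by 400 pixel area; it is not the map projection used in the talk, but like a D3 projection it takes a `[longitude, latitude]` pair and returns an `[x, y]` pair.

```javascript
// Assumed drawing area in pixels.
const width = 700;
const height = 400;

// Toy stand-in for a map projection: [longitude, latitude] in, [x, y] pixels out.
// Longitude -180..180 spans the width; latitude 90..-90 spans the height (y grows downward).
function projection([longitude, latitude]) {
  const x = ((longitude + 180) / 360) * width;
  const y = ((90 - latitude) / 180) * height;
  return [x, y];
}

// Callbacks for the circle's center: same projection call, different element.
const cx = d => projection(d.coordinates)[0]; // first element is x
const cy = d => projection(d.coordinates)[1]; // second element is y

// Hypothetical reading at longitude 0, latitude 0, which lands mid-canvas.
const quake = { magnitude: 4.2, coordinates: [0, 0] };
console.log(cx(quake), cy(quake)); // 350 200
```

Swapping in a real projection would only change the body of `projection`; the `cx` and `cy` callbacks would stay the same.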
607 00:35:35,980 --> 00:35:39,830 608 00:35:39,830 --> 00:35:46,450 Ooh, extra character here-- nice, we have 609 00:35:46,450 --> 00:35:51,730 a data driven document that's consuming this JSON file of objects, 610 00:35:51,730 --> 00:35:57,560 making a map, and changing the attribute in relation to the data 611 00:35:57,560 --> 00:35:59,600 to project it on a map. 612 00:35:59,600 --> 00:36:00,840 This is really interesting. 613 00:36:00,840 --> 00:36:03,770 This is cool. 614 00:36:03,770 --> 00:36:05,640 >> Let's take it up a notch. 615 00:36:05,640 --> 00:36:08,795 I mean, we have two pieces of information with every data point. 616 00:36:08,795 --> 00:36:10,000 I mean, three. 617 00:36:10,000 --> 00:36:12,540 We have the coordinates, which is an x and y. 618 00:36:12,540 --> 00:36:15,700 And we have the magnitude. 619 00:36:15,700 --> 00:36:17,420 >> We need to encode magnitude somehow. 620 00:36:17,420 --> 00:36:18,920 We have a lot of channels. 621 00:36:18,920 --> 00:36:20,370 We can use color. 622 00:36:20,370 --> 00:36:21,890 We can use radius. 623 00:36:21,890 --> 00:36:23,040 We could use opacity. 624 00:36:23,040 --> 00:36:25,540 We could use many things to encode. 625 00:36:25,540 --> 00:36:29,180 Any of these attributes and many more that are not listed there, 626 00:36:29,180 --> 00:36:33,065 because they're optional, we could use to encode this data, the stroke 627 00:36:33,065 --> 00:36:35,670 and all these things I've mentioned. 628 00:36:35,670 --> 00:36:36,690 >> Let's do radius. 629 00:36:36,690 --> 00:36:38,830 I think radius is the most intuitive. 630 00:36:38,830 --> 00:36:46,210 So again, we'll replace that hard-coded 40 and make some calculations. 631 00:36:46,210 --> 00:36:48,810 We'll use our favorite scale again. 632 00:36:48,810 --> 00:36:50,290 And we'll pass d. 633 00:36:50,290 --> 00:36:55,850 But not d because we want the magnitude of d. d is just the data point. 634 00:36:55,850 --> 00:36:57,430 We'll pass the magnitude to scale.
635 00:36:57,430 --> 00:36:58,470 >> Let's try that again. 636 00:36:58,470 --> 00:37:00,230 Ooh, it doesn't work. 637 00:37:00,230 --> 00:37:02,940 Why does it not work? 638 00:37:02,940 --> 00:37:04,387 >> So remember what scale does. 639 00:37:04,387 --> 00:37:05,470 Let's look at scale again. 640 00:37:05,470 --> 00:37:10,800 Scale maps from 1 to 10 on to 22 to 600, more or less. 641 00:37:10,800 --> 00:37:12,030 600 is huge. 642 00:37:12,030 --> 00:37:14,730 This is why we're getting this. 643 00:37:14,730 --> 00:37:18,420 >> So we want to change our scale to something more reasonable. 644 00:37:18,420 --> 00:37:22,610 Let's say, we want 0 to 60. 645 00:37:22,610 --> 00:37:25,340 60 is big, but magnitude 10 earthquakes are incredibly rare. 646 00:37:25,340 --> 00:37:27,880 In fact, they've never happened. 647 00:37:27,880 --> 00:37:31,830 >> So what this will do is, it'll take our magnitude that goes from 1 to 10 648 00:37:31,830 --> 00:37:34,490 and map it on to expand it out. 649 00:37:34,490 --> 00:37:37,370 And map it on to 0 to 60. 650 00:37:37,370 --> 00:37:38,840 Let's refresh. 651 00:37:38,840 --> 00:37:41,850 >> Nice, we have a visualization. 652 00:37:41,850 --> 00:37:42,500 This is great. 653 00:37:42,500 --> 00:37:43,736 This is actual data. 654 00:37:43,736 --> 00:37:46,360 You'll notice, in my little toy example, the biggest earthquake 655 00:37:46,360 --> 00:37:49,417 is right on top of us. 656 00:37:49,417 --> 00:37:50,000 But that's it. 657 00:37:50,000 --> 00:37:54,422 We have a data-driven visualization that consumes the data 658 00:37:54,422 --> 00:37:56,255 and gives us really interesting information. 659 00:37:56,255 --> 00:38:02,600 660 00:38:02,600 --> 00:38:06,420 Yeah, let's add some interactivity to it. 661 00:38:06,420 --> 00:38:08,675 I mentioned that was the strong suit of D3. 662 00:38:08,675 --> 00:38:11,490 663 00:38:11,490 --> 00:38:15,060 >> So here, for every element, we're describing a bunch of attributes.
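[Editor's sketch] The rescaling described above, from magnitudes of 1 to 10 onto radii of 0 to 60 pixels, works out as below. `linearScale` is a hypothetical plain-JavaScript stand-in for D3's linear scale (`d3.scale.linear()` in D3 version 3, `d3.scaleLinear()` in version 4 and later), and the `magnitude` field name is an assumption.

```javascript
// Plain-JS stand-in for a linear scale: maps the domain [d0, d1] onto the range [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Magnitudes from 1 to 10 mapped onto radii from 0 to 60 pixels.
const radiusScale = linearScale([1, 10], [0, 60]);

// Radius callback: take the data point, read its magnitude, pass that to the scale.
const r = d => radiusScale(d.magnitude);

console.log(r({ magnitude: 1 }));   // 0
console.log(r({ magnitude: 5.5 })); // 30
console.log(r({ magnitude: 10 }));  // 60
```

With the earlier range that topped out at 600, the same magnitude 10 reading would have produced a 600 pixel radius, which is why the first attempt filled the screen.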
664 00:38:15,060 --> 00:38:20,230 But we can also describe what we want to happen with interactivity elements. 665 00:38:20,230 --> 00:38:26,190 For example, we could describe what happens when we mouse over. 666 00:38:26,190 --> 00:38:28,740 667 00:38:28,740 --> 00:38:33,640 And very similar to that, that'll take a function, 668 00:38:33,640 --> 00:38:36,700 very similar to the attributes we had before, 669 00:38:36,700 --> 00:38:44,650 where we do something to the element when we hover over it. 670 00:38:44,650 --> 00:38:47,100 >> So first thing we need to do is select that element, 671 00:38:47,100 --> 00:38:49,435 to find it basically, in the browser. 672 00:38:49,435 --> 00:38:57,090 673 00:38:57,090 --> 00:39:00,920 And then we could set an attribute to it. 674 00:39:00,920 --> 00:39:06,870 So what I'm doing here is, when we hover over something, we'll get that element 675 00:39:06,870 --> 00:39:11,197 and then set its opacity back to 1, to completely opaque. 676 00:39:11,197 --> 00:39:12,488 Let's see what that looks like. 677 00:39:12,488 --> 00:39:29,430 678 00:39:29,430 --> 00:39:39,080 >> It appears we have an extra semicolon here. 679 00:39:39,080 --> 00:39:42,420 So if we hover over here, it gets full. 680 00:39:42,420 --> 00:39:46,530 681 00:39:46,530 --> 00:39:48,960 But now, of course, it stays full, because we 682 00:39:48,960 --> 00:39:53,240 have to describe what happens when we remove our cursor. 683 00:39:53,240 --> 00:39:59,990 So let's do exactly that on mouseout, as opposed to mouseover. 684 00:39:59,990 --> 00:40:06,399 >> And we'll reset it to what we had before-- 0.5. 685 00:40:06,399 --> 00:40:10,260 And now, every time we hover, we get a full circle. 686 00:40:10,260 --> 00:40:13,468 It helps us see what we're selecting essentially. 687 00:40:13,468 --> 00:40:19,210 688 00:40:19,210 --> 00:40:22,860 >> And now let's make this really great. 689 00:40:22,860 --> 00:40:26,210 Let's connect this to real data.
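[Editor's sketch] The hover behavior can be written as two small handlers. This is a sketch under assumptions rather than the code typed in the talk: it sets the attribute directly on the element with `setAttribute` instead of selecting it through D3 first, and `fakeCircle` is a hypothetical stand-in for a real SVG circle so the snippet runs outside a browser. D3 calls event listeners with `this` set to the element that was hovered, which is what the handlers rely on.

```javascript
// Handlers meant to be registered with D3, e.g.
//   circles.on("mouseover", highlight).on("mouseout", unhighlight);
// (`circles` would be a D3 selection; it is not defined in this sketch.)
function highlight() {
  this.setAttribute("opacity", 1); // fully opaque while the cursor is over the circle
}

function unhighlight() {
  this.setAttribute("opacity", 0.5); // back to the resting opacity used in the talk
}

// Hypothetical stand-in for an SVG element, just enough to try the handlers.
const fakeCircle = {
  attributes: { opacity: 0.5 },
  setAttribute(name, value) {
    this.attributes[name] = value;
  }
};

highlight.call(fakeCircle);
console.log(fakeCircle.attributes.opacity); // 1
unhighlight.call(fakeCircle);
console.log(fakeCircle.attributes.opacity); // 0.5
```

Without the second handler the circle would stay opaque after the first hover, which is the behavior seen in the demo before mouseout was added.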
690 00:40:26,210 --> 00:40:30,890 So let's ask the USGS about their data. 691 00:40:30,890 --> 00:40:35,630 So the US Geological Survey has data about earthquakes. 692 00:40:35,630 --> 00:40:41,460 They have a public API that's able to be consumed in JSON format. 693 00:40:41,460 --> 00:40:42,548 So let's do that. 694 00:40:42,548 --> 00:40:49,730 695 00:40:49,730 --> 00:40:55,900 >> So this is a bit of code that connects to the USGS API. 696 00:40:55,900 --> 00:40:57,990 And there's a bit of processing on it. 697 00:40:57,990 --> 00:41:02,200 This is not relevant but simplifies it to a simple data format like the one 698 00:41:02,200 --> 00:41:03,800 we had before. 699 00:41:03,800 --> 00:41:08,140 So I get rid of our call to our fake data.json file. 700 00:41:08,140 --> 00:41:13,110 And instead, I'm calling the USGS essentially. 701 00:41:13,110 --> 00:41:16,700 >> Let's refresh, nice. 702 00:41:16,700 --> 00:41:21,260 This is actual, real-life data from this week for earthquakes. 703 00:41:21,260 --> 00:41:23,217 This is really interesting. 704 00:41:23,217 --> 00:41:25,050 This is not surprising for us, but there are 705 00:41:25,050 --> 00:41:27,909 a lot of earthquakes on the West Coast in California. 706 00:41:27,909 --> 00:41:30,950 But I thought it was very interesting that there were so many earthquakes 707 00:41:30,950 --> 00:41:34,350 in Alaska, and apparently, here in the Midwest. 708 00:41:34,350 --> 00:41:37,630 I mean, interesting, and we're good. 709 00:41:37,630 --> 00:41:40,410 That's the conclusion. 710 00:41:40,410 --> 00:41:43,760 >> But fundamentally, this is what D3 helps us do. 711 00:41:43,760 --> 00:41:48,030 It helps us take data, bind it to elements in the DOM, 712 00:41:48,030 --> 00:41:51,620 and have those elements change as a function of the data, 713 00:41:51,620 --> 00:41:54,780 have those attributes, all the many attributes of the elements, 714 00:41:54,780 --> 00:41:57,393 all be useful for channels to convey information.
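[Editor's sketch] The "bit of processing" on the USGS response is not shown in the transcript, so the following is only a guess at its shape. It assumes the response is a GeoJSON feature collection in which each feature carries its magnitude in `properties.mag` and its position in `geometry.coordinates` as longitude, latitude, depth; the output field names `magnitude` and `coordinates` are likewise assumptions. The sample feed is invented so the snippet runs without a network call.

```javascript
// Reduce a GeoJSON feature collection to the simple list-of-objects format:
// one { magnitude, coordinates } record per earthquake.
function simplify(feed) {
  return feed.features.map(feature => ({
    magnitude: feature.properties.mag,
    // Keep longitude and latitude, drop the third value (depth).
    coordinates: feature.geometry.coordinates.slice(0, 2)
  }));
}

// Invented sample in the assumed feed shape (not real USGS readings).
const sampleFeed = {
  type: "FeatureCollection",
  features: [
    {
      type: "Feature",
      properties: { mag: 4.5 },
      geometry: { type: "Point", coordinates: [-150.1, 61.2, 35.0] }
    },
    {
      type: "Feature",
      properties: { mag: 2.8 },
      geometry: { type: "Point", coordinates: [-118.3, 34.1, 9.7] }
    }
  ]
};

const simplified = simplify(sampleFeed);
console.log(simplified);
```

Because the result has the same shape as the hand-written test data, the drawing code would not need to change when the data source does.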
715 00:41:57,393 --> 00:42:05,440 716 00:42:05,440 --> 00:42:09,290 >> D3 is an incredibly powerful library and amazingly well run. 717 00:42:09,290 --> 00:42:12,260 This is some powerful stuff. 718 00:42:12,260 --> 00:42:15,960 Data visualization is an incredibly powerful tool 719 00:42:15,960 --> 00:42:21,530 for conveying to people deep insights that gets to their core 720 00:42:21,530 --> 00:42:25,430 and helps them understand, in this profound and intuitive way, 721 00:42:25,430 --> 00:42:29,760 how data works and how data changes our life. 722 00:42:29,760 --> 00:42:31,019