1 00:00:00,000 --> 00:00:00,442 2 00:00:00,442 --> 00:00:01,900 MICHAEL: Hello, my name is Michael. 3 00:00:01,900 --> 00:00:05,200 And welcome to Data Visualization in D3. 4 00:00:05,200 --> 00:00:06,970 Let's get started. 5 00:00:06,970 --> 00:00:08,510 So what is the D3? 6 00:00:08,510 --> 00:00:11,620 D3 stands for data-driven documents, and it's essentially 7 00:00:11,620 --> 00:00:14,970 a JavaScript library-- a collection of JavaScript functions, 8 00:00:14,970 --> 00:00:18,220 that's going to make it easy to visualize and interact 9 00:00:18,220 --> 00:00:21,530 with data in web pages. 10 00:00:21,530 --> 00:00:26,800 So there are a few prominent examples of media companies that use D3. 11 00:00:26,800 --> 00:00:28,330 One is The New York Times. 12 00:00:28,330 --> 00:00:31,130 Let me pull up a quick example of that. 13 00:00:31,130 --> 00:00:33,520 So this is a New York Times D3 visualization 14 00:00:33,520 --> 00:00:37,120 of Obama's 2013 budget proposal. 15 00:00:37,120 --> 00:00:39,260 So you can see it is interactive. 16 00:00:39,260 --> 00:00:42,400 You can see how people spend different ways. 17 00:00:42,400 --> 00:00:46,780 They have different arrangements of the data in different sorts of charts. 18 00:00:46,780 --> 00:00:51,490 And it all moves pretty seamlessly, and D3 allows us to do that. 19 00:00:51,490 --> 00:00:53,779 The other prominent example, which I had included, 20 00:00:53,779 --> 00:00:56,320 but then took out because I didn't really want to look at it, 21 00:00:56,320 --> 00:01:00,170 was the 538 projections for the presidential election 22 00:01:00,170 --> 00:01:01,660 and for the Senate races. 23 00:01:01,660 --> 00:01:09,320 But that was also built in D3 along with a few other JavaScript libraries. 24 00:01:09,320 --> 00:01:13,810 So what can we expect to achieve during the next 40 minutes or so 25 00:01:13,810 --> 00:01:16,120 of this seminar? 26 00:01:16,120 --> 00:01:18,070 So we're going to build a visualization. 
27 00:01:18,070 --> 00:01:21,410 We're going to look at shots in the NBA, and we're 28 00:01:21,410 --> 00:01:23,410 going to build some sort of visualization that's 29 00:01:23,410 --> 00:01:26,140 going to help us analyze that. 30 00:01:26,140 --> 00:01:31,480 But the goal of this lecture is not so much to provide a few functions in D3 31 00:01:31,480 --> 00:01:36,550 or look at a few components and just give a basic knowledge of those. 32 00:01:36,550 --> 00:01:40,600 Rather, we want to provide some sort of literacy in D3. 33 00:01:40,600 --> 00:01:46,240 And by this I mean that, in D3 there is a lot of creativity. 34 00:01:46,240 --> 00:01:51,610 We have several different ways that we can analyze and express data 35 00:01:51,610 --> 00:01:54,230 to express different things. 36 00:01:54,230 --> 00:02:00,850 So we have to be creative, and we have to be very versatile in how we use it. 37 00:02:00,850 --> 00:02:02,740 And the way that versatility is built up is 38 00:02:02,740 --> 00:02:05,560 by looking at examples of what other people have done 39 00:02:05,560 --> 00:02:09,430 and being able to read that code and understand how it translates 40 00:02:09,430 --> 00:02:11,590 to the visual you see on the screen. 41 00:02:11,590 --> 00:02:13,990 After a certain amount of time doing this, 42 00:02:13,990 --> 00:02:16,330 you're going to be able to start thinking 43 00:02:16,330 --> 00:02:18,670 about the visuals you want to make and the code 44 00:02:18,670 --> 00:02:22,069 that you're going to have to write to make those visuals in parallel. 45 00:02:22,069 --> 00:02:23,860 And that's really the goal of this lecture, 46 00:02:23,860 --> 00:02:28,720 is to get you started on that path. Well, let's take a step back 47 00:02:28,720 --> 00:02:34,060 for a moment and review a little basic HTML and JavaScript. 
48 00:02:34,060 --> 00:02:41,800 So remember that HTML sort of follows this Document Object Model, which 49 00:02:41,800 --> 00:02:44,270 is something of a tree structure. 50 00:02:44,270 --> 00:02:50,260 So here we have some very simple HTML and next to it we have the DOM tree. 51 00:02:50,260 --> 00:02:53,860 So this is just a document with the HTML tag. 52 00:02:53,860 --> 00:02:56,470 Within that HTML, there's a head and a body. 53 00:02:56,470 --> 00:02:59,630 The head contains the title, and within the title and the body, 54 00:02:59,630 --> 00:03:00,790 there is some text. 55 00:03:00,790 --> 00:03:03,070 And it's going to all be represented as a tree. 56 00:03:03,070 --> 00:03:07,270 And any of the JavaScript libraries that we use to interact with HTML 57 00:03:07,270 --> 00:03:10,720 are going to be about grabbing nodes from this tree 58 00:03:10,720 --> 00:03:12,800 and doing something with them in some manner. 59 00:03:12,800 --> 00:03:15,670 60 00:03:15,670 --> 00:03:20,410 And JavaScript, remember, can interact with and change these DOM elements, 61 00:03:20,410 --> 00:03:25,240 so change the nodes of this tree on the client side of the operation. 62 00:03:25,240 --> 00:03:29,240 So let's take a look at that quickly. 63 00:03:29,240 --> 00:03:32,530 So here is my Flask application, which I have put together just 64 00:03:32,530 --> 00:03:36,040 so we can look at these D3 examples. 65 00:03:36,040 --> 00:03:37,460 And you can see it's very simple. 66 00:03:37,460 --> 00:03:41,260 All it does is when we go to the site, it serves up the file we want. 67 00:03:41,260 --> 00:03:43,960 And this is just reinforcing the fact that JavaScript is all 68 00:03:43,960 --> 00:03:45,820 being done on the client side. 69 00:03:45,820 --> 00:03:49,210 The server really does nothing here. 70 00:03:49,210 --> 00:03:51,161 And here's the HTML I have. 71 00:03:51,161 --> 00:03:53,410 So I can actually even take out these scripts for now. 
72 00:03:53,410 --> 00:03:56,050 We'll use them later. 73 00:03:56,050 --> 00:03:58,240 And so we just have a basic page. 74 00:03:58,240 --> 00:03:59,530 It's titled hello, world. 75 00:03:59,530 --> 00:04:02,080 And we have three paragraphs-- hello, world, hi, world, hey, 76 00:04:02,080 --> 00:04:05,860 world-- with three different IDs. 77 00:04:05,860 --> 00:04:07,240 We reload the page. 78 00:04:07,240 --> 00:04:08,620 That's what we have. 79 00:04:08,620 --> 00:04:14,440 So this is that page where we went to javascript.html. 80 00:04:14,440 --> 00:04:17,450 And we have a console, a JavaScript console. 81 00:04:17,450 --> 00:04:20,000 So we can type a little bit of JavaScript directly. 82 00:04:20,000 --> 00:04:27,820 So if we do document.getElementById("greeting") 83 00:04:27,820 --> 00:04:32,770 and I make sure to close my quotes, then you can see we get that paragraph with 84 00:04:32,770 --> 00:04:33,850 the ID greeting. 85 00:04:33,850 --> 00:04:39,572 And we can edit that paragraph so we can change the text to say hi, world. 86 00:04:39,572 --> 00:04:42,410 87 00:04:42,410 --> 00:04:45,280 So in this manner, JavaScript allows us to edit the text. 88 00:04:45,280 --> 00:04:47,890 And you can imagine that if instead of selecting the greeting, 89 00:04:47,890 --> 00:04:51,460 we selected the whole body and rather than selecting text content, 90 00:04:51,460 --> 00:04:54,670 we affected the whole HTML, we could actually 91 00:04:54,670 --> 00:04:56,890 add different tags to our body. 92 00:04:56,890 --> 00:05:00,770 We could expand our HTML in a lot of different ways. 93 00:05:00,770 --> 00:05:02,510 And JavaScript allows us to do that. 94 00:05:02,510 --> 00:05:03,970 But it's kind of slow. 95 00:05:03,970 --> 00:05:05,220 We have to type a lot of code. 96 00:05:05,220 --> 00:05:08,500 We have to go through things one by one. 
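The console step described here can be sketched in plain JavaScript. This is a minimal sketch, assuming the page has a paragraph with the ID greeting as in the talk; the helper name `setText` is hypothetical and not something shown in the lecture.

```javascript
// Minimal sketch: change the text of the element with a given id.
// `doc` is anything with a getElementById method (normally `document`).
// `setText` is a hypothetical helper name, not part of any library.
function setText(doc, id, text) {
  const node = doc.getElementById(id);
  if (node === null) {
    return false; // no element with that id
  }
  node.textContent = text; // replaces the text inside the node
  return true;
}

// In a browser, this mirrors what is typed into the console in the talk.
if (typeof document !== "undefined") {
  setText(document, "greeting", "hi, world");
}
```

Doing this for many nodes means one lookup and one assignment per node, which is the repetition the talk goes on to call slow.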
97 00:05:08,500 --> 00:05:11,920 So just doing it in pure JavaScript to make these sorts of visualizations 98 00:05:11,920 --> 00:05:13,166 is not really feasible. 99 00:05:13,166 --> 00:05:16,500 100 00:05:16,500 --> 00:05:20,360 So another thing, just to briefly introduce you guys to another library, 101 00:05:20,360 --> 00:05:21,500 is jQuery. 102 00:05:21,500 --> 00:05:24,260 And jQuery, you can see from its description, 103 00:05:24,260 --> 00:05:28,010 is a JavaScript library that handles document traversal and manipulation, 104 00:05:28,010 --> 00:05:31,840 event handling, animation, and Ajax. 105 00:05:31,840 --> 00:05:33,110 And it's a very broad library. 106 00:05:33,110 --> 00:05:36,110 It's actually much broader than D3. 107 00:05:36,110 --> 00:05:38,630 Often jQuery is going to be used in parallel with D3. 108 00:05:38,630 --> 00:05:40,910 So it's going to cover some things that it's good at, 109 00:05:40,910 --> 00:05:43,940 and D3 is going to cover its much narrower focus, which 110 00:05:43,940 --> 00:05:45,590 is really visualizations. 111 00:05:45,590 --> 00:05:51,500 And I think in that area, it's going to be a lot more effective than jQuery. 112 00:05:51,500 --> 00:05:55,820 So let's talk about D3, and let's get into what D3 is. 113 00:05:55,820 --> 00:06:00,520 So the main interface in D3 for traversing documents 114 00:06:00,520 --> 00:06:03,560 and really for doing anything is called a selection, 115 00:06:03,560 --> 00:06:06,590 which is essentially a group of HTML elements, a group 116 00:06:06,590 --> 00:06:10,310 of those nodes from our DOM tree. 117 00:06:10,310 --> 00:06:15,780 And selections are created using this d3.select or d3.selectAll. 118 00:06:15,780 --> 00:06:18,230 So remember, in jQuery we have this dollar 119 00:06:18,230 --> 00:06:22,620 sign which starts all of our sort of jQuery functions, our jQuery calls. 120 00:06:22,620 --> 00:06:27,130 In D3, it's all started with just lowercase d3. 
121 00:06:27,130 --> 00:06:31,670 And we can select in a very similar manner that we do in jQuery. 122 00:06:31,670 --> 00:06:35,120 So if we do d3.select with a tag name like body, 123 00:06:35,120 --> 00:06:37,820 that's going to be the same as doing dollar 124 00:06:37,820 --> 00:06:43,280 body in jQuery to select all of the body tags or to select the one body tag. 125 00:06:43,280 --> 00:06:49,200 And select all is going to select, in this case, all of the block classes. 126 00:06:49,200 --> 00:06:52,460 So remember in CSS, we have that dot represents a class. 127 00:06:52,460 --> 00:06:58,330 So the d3.selectAll(".block") selects all the nodes which have the class block. 128 00:06:58,330 --> 00:07:02,810 And this is, again, the same syntax that we would see in jQuery. 129 00:07:02,810 --> 00:07:05,870 We can also select on ID using a hashtag. 130 00:07:05,870 --> 00:07:07,970 And this could be done the same way in jQuery. 131 00:07:07,970 --> 00:07:13,242 It's also the same as we do in pure JavaScript when we do document.getElementById. 132 00:07:13,242 --> 00:07:15,200 And we'll come back to this filter in a moment. 133 00:07:15,200 --> 00:07:17,310 But let's look at this in action. 134 00:07:17,310 --> 00:07:20,900 So now I'm going to add back in this script. 135 00:07:20,900 --> 00:07:23,750 And this script is going to include the D3 library. 136 00:07:23,750 --> 00:07:28,910 So you can see it's hosted on d3js.org and we're getting this JavaScript 137 00:07:28,910 --> 00:07:33,440 file which contains the source for D3. 138 00:07:33,440 --> 00:07:40,380 Now if we reload this, we can do d3.selectAll, 139 00:07:40,380 --> 00:07:42,400 and if we select all the paragraphs-- so all 140 00:07:42,400 --> 00:07:48,870 of the nodes with the tag paragraph, we're going to get this sort of object 141 00:07:48,870 --> 00:07:49,370 back. 
142 00:07:49,370 --> 00:07:52,220 And it's sort of a weird look at an object, a selection, 143 00:07:52,220 --> 00:07:54,140 but if we look inside this _groups element, 144 00:07:54,140 --> 00:07:58,100 you can see it has this node list, which is each of the paragraphs. 145 00:07:58,100 --> 00:08:02,990 So the question is how do we interact with these selections? 146 00:08:02,990 --> 00:08:05,040 And there's a lot of different ways. 147 00:08:05,040 --> 00:08:09,650 So once we make a selection, so once we do d3.selectAll or d3.select, 148 00:08:09,650 --> 00:08:13,370 there are several functions which can be used to manipulate that selection 149 00:08:13,370 --> 00:08:15,710 and to manipulate the nodes within that selection. 150 00:08:15,710 --> 00:08:20,620 So a couple of examples of these are .attr, which stands for attribute. 151 00:08:20,620 --> 00:08:26,750 So that's going to be changing attributes of the HTML documents, 152 00:08:26,750 --> 00:08:29,040 of the HTML nodes. 153 00:08:29,040 --> 00:08:33,590 .style, which is going to change the CSS for those nodes. 154 00:08:33,590 --> 00:08:39,740 And .append, which is actually going to add HTML elements between the tags 155 00:08:39,740 --> 00:08:40,520 of those nodes. 156 00:08:40,520 --> 00:08:44,310 So let's do one example, one which I didn't actually show here, 157 00:08:44,310 --> 00:08:47,810 which is .text, which changes the text inside the node. 158 00:08:47,810 --> 00:08:52,800 So if we do d3.selectAll("p"), and then we do .text, 159 00:08:52,800 --> 00:08:54,440 this will change the text of the nodes. 160 00:08:54,440 --> 00:08:58,121 So if we changed them all to yo, world. 161 00:08:58,121 --> 00:09:00,294 You can see we've selected all the paragraphs, 162 00:09:00,294 --> 00:09:02,460 and we've changed the text inside them to yo, world. 
163 00:09:02,460 --> 00:09:07,385 So you can see that D3 allows us to act on all the nodes at the same time, all 164 00:09:07,385 --> 00:09:08,936 of the nodes in our selection. 165 00:09:08,936 --> 00:09:12,620 166 00:09:12,620 --> 00:09:16,700 We could have also selected just one of them using d3.select on the ID. 167 00:09:16,700 --> 00:09:22,850 So if we just did d3.select("#greeting"), 168 00:09:22,850 --> 00:09:27,020 then this is going to select the node with the ID greeting. 169 00:09:27,020 --> 00:09:32,920 And if we do .text("hello,world") here, then you can see it changes that one 170 00:09:32,920 --> 00:09:35,330 node back to hello, world. 171 00:09:35,330 --> 00:09:38,690 So why would we use select rather than select all? 172 00:09:38,690 --> 00:09:40,900 There's no really specific reason why. 173 00:09:40,900 --> 00:09:43,571 I mean it's going to be better style in some ways, because it's 174 00:09:43,571 --> 00:09:46,070 going to make it clear that we're selecting only one element 175 00:09:46,070 --> 00:09:47,360 rather than selecting many. 176 00:09:47,360 --> 00:09:50,242 If we had done d3.selectAll("#greeting"), of course, 177 00:09:50,242 --> 00:09:51,950 we're only going to get one element back, 178 00:09:51,950 --> 00:09:54,320 because elements have to have unique IDs. 179 00:09:54,320 --> 00:09:57,920 180 00:09:57,920 --> 00:10:02,960 OK, so this is all well and good, right? 181 00:10:02,960 --> 00:10:04,760 But the question is why would we ever want 182 00:10:04,760 --> 00:10:08,410 to change all the attributes of a bunch of nodes to the same thing. 183 00:10:08,410 --> 00:10:11,580 And maybe there's a couple instances where we would do that, 184 00:10:11,580 --> 00:10:15,890 but usually what would be useful is if we could change nodes 185 00:10:15,890 --> 00:10:20,210 in different ways, but in one, sort of a short period of code. 
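The two console experiments in this part of the talk, changing every paragraph at once and then changing only the one with the ID greeting, can be sketched as follows. This is a sketch under assumptions: it expects the D3 library to be loaded as the global `d3`, and the wrapper function `relabel` is a hypothetical name used only to keep the example self-contained.

```javascript
// Sketch: `d3lib` is the D3 library object (the global `d3` in a browser).
// `relabel` is a hypothetical wrapper, not a D3 function.
function relabel(d3lib) {
  // selectAll returns every matching node; .text sets the text of all of them.
  d3lib.selectAll("p").text("yo, world");
  // select returns only the first match; "#greeting" selects by ID.
  d3lib.select("#greeting").text("hello, world");
}

// Only runs where D3 has actually been loaded, e.g. in the browser console.
if (typeof d3 !== "undefined") {
  relabel(d3);
}
```

Using `select` for an ID and `selectAll` for a tag or class makes it visible in the code whether one node or many nodes are expected.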
186 00:10:20,210 --> 00:10:23,870 And for that we're going to have to have some reason 187 00:10:23,870 --> 00:10:25,250 to treat the nodes differently. 188 00:10:25,250 --> 00:10:30,290 And that comes back to what D3 is really for, and D3 is for data. 189 00:10:30,290 --> 00:10:34,400 And how do we associate the nodes we want to change with the data 190 00:10:34,400 --> 00:10:35,570 that we have. 191 00:10:35,570 --> 00:10:38,600 And that's done with this idiom basically, 192 00:10:38,600 --> 00:10:40,850 which is going to be d3.select(...) 193 00:10:40,850 --> 00:10:43,170 .selectAll(...).data(...).enter(). 194 00:10:43,170 --> 00:10:45,870 So there's a bunch of functions that we string together here. 195 00:10:45,870 --> 00:10:50,070 And let's explain what each of those do in an example. 196 00:10:50,070 --> 00:10:55,897 So here I have a simple JS script, which has that idiom that we talked about. 197 00:10:55,897 --> 00:10:57,980 And let's just explain what each of the lines does. 198 00:10:57,980 --> 00:11:00,200 So we first do d3.select("body"). 199 00:11:00,200 --> 00:11:02,960 So this is selecting that first body tag. 200 00:11:02,960 --> 00:11:06,990 And now we're going to select all the paragraphs inside the body. 201 00:11:06,990 --> 00:11:09,560 So you can see we can string together selections, 202 00:11:09,560 --> 00:11:13,250 where we call a select on something and that's 203 00:11:13,250 --> 00:11:15,710 going to select whatever we've decided to select. 204 00:11:15,710 --> 00:11:17,060 Here, it's the body tag. 205 00:11:17,060 --> 00:11:20,300 And then if we string together a select all or a select on that, 206 00:11:20,300 --> 00:11:22,850 it's going to call selectAll on all the nodes 207 00:11:22,850 --> 00:11:27,120 below that element in the DOM tree. 208 00:11:27,120 --> 00:11:29,780 So here it would select all three of these paragraphs 209 00:11:29,780 --> 00:11:31,890 because they're within the body tag. 
210 00:11:31,890 --> 00:11:34,970 But suppose I comment those out. 211 00:11:34,970 --> 00:11:37,470 Then the selection is going to be empty. 212 00:11:37,470 --> 00:11:41,660 So it's a little weird that we've done that, but let's go a bit further. 213 00:11:41,660 --> 00:11:44,090 Next we're going to specify some data. 214 00:11:44,090 --> 00:11:47,029 So .data is a function which is going to take in an array. 215 00:11:47,029 --> 00:11:49,820 And that array should represent the data that we want to visualize. 216 00:11:49,820 --> 00:11:56,210 Here it's just an array of greetings-- hello, hi, yo, hey, and so on. 217 00:11:56,210 --> 00:12:03,530 And .enter, the next function, binds the data we've input to our selection. 218 00:12:03,530 --> 00:12:07,100 So every element in our array is going to be 219 00:12:07,100 --> 00:12:10,082 bound to one node in our selection. 220 00:12:10,082 --> 00:12:11,790 But wait a second, our selection's empty. 221 00:12:11,790 --> 00:12:14,270 There's nothing to bind to. 222 00:12:14,270 --> 00:12:18,710 In that case, if we just add a .append below this, 223 00:12:18,710 --> 00:12:22,670 it's going to try to fill out our selection so that it can bind each data 224 00:12:22,670 --> 00:12:25,620 point to one element of the selection. 225 00:12:25,620 --> 00:12:30,230 So here by doing .append("p"), it's going to append one paragraph per 226 00:12:30,230 --> 00:12:31,500 element of our array. 227 00:12:31,500 --> 00:12:34,280 So in this case we have seven elements of our array, 228 00:12:34,280 --> 00:12:38,000 so it's going to append seven paragraphs. 229 00:12:38,000 --> 00:12:42,860 So we have one last thing which we're doing here, which is this .text. 230 00:12:42,860 --> 00:12:45,770 So once it's appended all of these paragraphs, right, 231 00:12:45,770 --> 00:12:49,400 we now have a selection of seven paragraphs and we're going to call 232 00:12:49,400 --> 00:12:51,990 .text on those paragraphs. 
233 00:12:51,990 --> 00:12:56,180 Now remember, here we could have put something like .text hi world 234 00:12:56,180 --> 00:12:57,210 or whatnot. 235 00:12:57,210 --> 00:13:00,200 So we could have just typed hi world and then 236 00:13:00,200 --> 00:13:04,864 we would have ended up and deleted this function here. 237 00:13:04,864 --> 00:13:07,280 And if we did that, then we'd end up with seven paragraphs 238 00:13:07,280 --> 00:13:08,852 that say, hi, world. 239 00:13:08,852 --> 00:13:10,310 So let's look at that real quickly. 240 00:13:10,310 --> 00:13:13,160 So if we uncomment out this script here, so we 241 00:13:13,160 --> 00:13:15,380 have some script which is executed within the body. 242 00:13:15,380 --> 00:13:17,600 And that script is this example. 243 00:13:17,600 --> 00:13:20,450 And then go over and run it, we can see that we end up 244 00:13:20,450 --> 00:13:22,970 with seven paragraphs called hi, world. 245 00:13:22,970 --> 00:13:25,770 But we can actually use the data to make these distinct. 246 00:13:25,770 --> 00:13:28,890 So if we go back and put back in this function. 247 00:13:28,890 --> 00:13:33,290 So it's a nameless function, but it takes one argument d. 248 00:13:33,290 --> 00:13:37,100 And d is going to be the data associated with that node. 249 00:13:37,100 --> 00:13:39,680 So remember, each of these elements of the array 250 00:13:39,680 --> 00:13:43,377 are associated with one node in our selection. 251 00:13:43,377 --> 00:13:44,460 So it's a simple function. 252 00:13:44,460 --> 00:13:49,230 It takes in the data d, and it returns d plus comma world. 253 00:13:49,230 --> 00:13:52,340 So we're going to end up with hello, world, hi, world, yo, world, 254 00:13:52,340 --> 00:13:54,100 and hey, world. 255 00:13:54,100 --> 00:13:58,200 And let's see if we do that, and that's indeed what we end up with. 256 00:13:58,200 --> 00:14:01,580 So we can see how we've bound data to each of these nodes. 
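The idiom walked through here can be sketched in full. This is a minimal sketch, not the lecturer's actual file: the array below is an illustrative subset of the greetings, the function name `greet` is hypothetical (the talk uses a nameless function), and the D3 part only runs where the global `d3` is loaded.

```javascript
// Illustrative subset of the greetings array; the talk's array is longer.
const greetings = ["hello", "hi", "yo", "hey"];

// The function passed to .text: D3 calls it once per node,
// with `d` set to the datum bound to that node.
function greet(d) {
  return d + ", world";
}

if (typeof d3 !== "undefined") {
  d3.select("body")      // start from the body tag
    .selectAll("p")      // paragraphs inside it (possibly none yet)
    .data(greetings)     // pair array elements with nodes
    .enter()             // the data points that have no node yet
    .append("p")         // create one paragraph for each of those
    .text(greet);        // set each paragraph's text from its datum
}
```

If the page already contains paragraphs, only the data points beyond the existing nodes get new paragraphs, which is why the talk starts from an empty selection.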
257 00:14:01,580 --> 00:14:04,880 And then using that, we've been able to treat the node differently, but still 258 00:14:04,880 --> 00:14:09,590 with only sort of writing this function once. 259 00:14:09,590 --> 00:14:13,540 This is beginning to show the power of D3. 260 00:14:13,540 --> 00:14:16,935 Now, it's a pretty simple dataset, just a list of strings. 261 00:14:16,935 --> 00:14:20,060 We're going to be working with a little bit more complicated dataset today, 262 00:14:20,060 --> 00:14:23,300 but I think it's a fun one, which is all the shots taken by the New York 263 00:14:23,300 --> 00:14:26,150 Knicks in the 2013-2014 season. 264 00:14:26,150 --> 00:14:29,440 So let's take a quick look at what that looks like. 265 00:14:29,440 --> 00:14:32,840 And this is a little hard to see. 266 00:14:32,840 --> 00:14:38,810 But basically we have games, all the games in the 2013 267 00:14:38,810 --> 00:14:41,990 season specified by game ID. 268 00:14:41,990 --> 00:14:44,480 The period in which they occurred, so the quarter, 269 00:14:44,480 --> 00:14:48,260 the score at that time, the remaining time, the team, which is always 270 00:14:48,260 --> 00:14:51,560 New York Knicks, the player who took the shot, whether the shot was made 271 00:14:51,560 --> 00:14:53,900 or missed, the x and y-coordinates of the shot, 272 00:14:53,900 --> 00:14:58,100 the distance it was taken from, and who, if anybody, got the assist. 273 00:14:58,100 --> 00:14:59,570 So we have a bunch of data here. 274 00:14:59,570 --> 00:15:02,200 The main data we're going to be working with today 275 00:15:02,200 --> 00:15:04,550 is just these four columns, which are the player who 276 00:15:04,550 --> 00:15:07,810 took the shot, the result, and where the shot came from. 277 00:15:07,810 --> 00:15:11,680 And we're going to try to visualize that in some sense. 278 00:15:11,680 --> 00:15:14,830 So the question is how do we actually read in data like this? 
279 00:15:14,830 --> 00:15:18,300 So in this case all the data is stored in a CSV format, 280 00:15:18,300 --> 00:15:23,880 and D3 provides some excellent interfaces for reading in CSV data. 281 00:15:23,880 --> 00:15:27,390 And that's going to be done with this function d3.csv. 282 00:15:27,390 --> 00:15:29,850 The first argument to d3.csv is just going 283 00:15:29,850 --> 00:15:33,510 to be the file name, as to be expected. 284 00:15:33,510 --> 00:15:36,010 And the second argument is a nameless function, 285 00:15:36,010 --> 00:15:38,630 which takes one argument data. 286 00:15:38,630 --> 00:15:42,660 And what's going to be passed to this function, what's going to be executed 287 00:15:42,660 --> 00:15:46,620 is the data which is contained in filename.csv. 288 00:15:46,620 --> 00:15:51,340 And by that I mean it's going to be an array of JavaScript objects, 289 00:15:51,340 --> 00:15:56,670 where each JavaScript object has the keys that are the column names. 290 00:15:56,670 --> 00:16:00,510 So these are going to be the keys, and values, 291 00:16:00,510 --> 00:16:02,080 which are the values within the row. 292 00:16:02,080 --> 00:16:04,360 So these are going to be the values. 293 00:16:04,360 --> 00:16:07,560 So let's take a look at that real quickly. 294 00:16:07,560 --> 00:16:10,260 So as you guys can see, I'm not quite as confident in my ability 295 00:16:10,260 --> 00:16:11,950 to type code live as David. 296 00:16:11,950 --> 00:16:13,950 So I've typed a lot of this code beforehand. 297 00:16:13,950 --> 00:16:17,340 And we'll go through and comment it out line by line. 298 00:16:17,340 --> 00:16:23,640 But for the start, all we really have here is we read in the shots.csv file. 299 00:16:23,640 --> 00:16:28,260 And we're going to log what the data looks like. 300 00:16:28,260 --> 00:16:33,300 And we should also look at what the actual HTML looks like here. 301 00:16:33,300 --> 00:16:37,909 So at shots.html, we can ignore this SVG for now. 
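To make the shape of the loaded data concrete, here is a small sketch. The column names `player`, `result`, `x`, and `y` and the sample values are hypothetical, since the transcript does not show the real headers of shots.csv, and `toNumbers` is an illustrative helper. The `d3.csv` call uses the callback form described in the talk; newer D3 releases (version 5 onward) return a promise instead, so treat the call as tied to the older API.

```javascript
// Illustration only: the kind of array handed to the callback.
// One object per row; keys are column names (hypothetical here).
const sampleRows = [
  { player: "Carmelo Anthony", result: "made", x: "12", y: "30" },
  { player: "Carmelo Anthony", result: "missed", x: "40", y: "8" },
];

// Values read from a CSV file arrive as strings, so convert the
// coordinates to numbers before doing arithmetic with them.
function toNumbers(row) {
  return { player: row.player, result: row.result, x: +row.x, y: +row.y };
}

if (typeof d3 !== "undefined") {
  // Callback form as described in the talk (older D3 versions).
  d3.csv("shots.csv", function (data) {
    console.log(data); // inspect the array of row objects
  });
}
```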
302 00:16:37,909 --> 00:16:39,450 That's not going to be important yet. 303 00:16:39,450 --> 00:16:41,400 And we can ignore this selector for now. 304 00:16:41,400 --> 00:16:43,370 But we're just running the script shots.js. 305 00:16:43,370 --> 00:16:46,020 306 00:16:46,020 --> 00:16:52,820 So if we go to shots.html, take a look at that. 307 00:16:52,820 --> 00:16:54,727 Here's what our data looks like. 308 00:16:54,727 --> 00:16:56,310 And as you can see, it's what we said. 309 00:16:56,310 --> 00:17:00,232 It's an array where each element of the array 310 00:17:00,232 --> 00:17:02,940 is a JavaScript object, where the keys are the column names 311 00:17:02,940 --> 00:17:06,519 and the values are the values within the rows. 312 00:17:06,519 --> 00:17:07,019 OK. 313 00:17:07,019 --> 00:17:09,720 314 00:17:09,720 --> 00:17:12,660 So the question is, how are we actually going to visualize this? 315 00:17:12,660 --> 00:17:17,690 And there are maybe some limitations with just using pure HTML elements. 316 00:17:17,690 --> 00:17:19,440 Like we're probably not just going to want 317 00:17:19,440 --> 00:17:24,180 to put a bunch of paragraphs saying, Carmelo Anthony made this shot here, 318 00:17:24,180 --> 00:17:28,810 or a div that has maybe, we fill in blue, you know, if someone made a shot. 319 00:17:28,810 --> 00:17:31,680 Green, if someone made a shot, red if someone didn't. 320 00:17:31,680 --> 00:17:36,320 We want to have a little bit more freedom to draw things than that. 321 00:17:36,320 --> 00:17:40,320 And that's going to be allowed by this-- I really shouldn't say beyond the DOM 322 00:17:40,320 --> 00:17:44,540 because it still is the DOM, but it's beyond the basic HTML elements. 323 00:17:44,540 --> 00:17:47,040 We're going to be dealing with this image format called SVG. 324 00:17:47,040 --> 00:17:50,610 And SVG stands for scalable vector graphics. 325 00:17:50,610 --> 00:17:54,390 And in SVGs, we're going to be able to draw lines, shapes, and text. 
326 00:17:54,390 --> 00:17:57,777 So what do I mean by SVG follows the DOM? 327 00:17:57,777 --> 00:17:58,860 So let's take a look here. 328 00:17:58,860 --> 00:18:02,820 We have this basic HTML file called svg.html. 329 00:18:02,820 --> 00:18:05,540 And you can see it looks, it is HTML. 330 00:18:05,540 --> 00:18:10,720 We open up an SVG with the attributes width 600 and height 200. 331 00:18:10,720 --> 00:18:17,340 Within the SVG I have three tags, each called circle, each a circle tag. 332 00:18:17,340 --> 00:18:22,350 And I've specified the center x, the center y, and the radius. 333 00:18:22,350 --> 00:18:24,330 So that's going to be the x and y-coordinates 334 00:18:24,330 --> 00:18:26,940 of the center of the circle and the radius of the circle. 335 00:18:26,940 --> 00:18:31,870 And for the last one I've also specified that it should be a fill of blue. 336 00:18:31,870 --> 00:18:33,730 So it's going to be colored blue. 337 00:18:33,730 --> 00:18:41,290 So if I go to svg.html, you can see I get this image here, 338 00:18:41,290 --> 00:18:44,130 which is these three circles that I've specified. 339 00:18:44,130 --> 00:18:51,610 So we can see that SVGs follow the same DOM tree that we work with in HTML. 340 00:18:51,610 --> 00:18:53,670 So we're going to be able to interact with them 341 00:18:53,670 --> 00:18:57,330 using D3 in pretty much the same way we interact with HTML. 342 00:18:57,330 --> 00:19:00,060 And that's going to allow us to create a lot of these drawings. 343 00:19:00,060 --> 00:19:03,870 And this is the manner in which The New York Times created a drawing like this, 344 00:19:03,870 --> 00:19:07,950 by appending circles to an SVG, and doing a lot of other stuff, 345 00:19:07,950 --> 00:19:11,340 but that's the starting point. 346 00:19:11,340 --> 00:19:16,842 So a couple important tips for using SVGs in these visualizations. 
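The svg.html file itself is not reproduced in the transcript, so the following is only a sketch of markup with the structure described: an SVG of width 600 and height 200 containing three circle tags, the last one filled blue. The coordinates and radii are hypothetical, and the helper names `circleTag` and `svgMarkup` are made up for illustration.

```javascript
// Hypothetical circle specs; the talk does not give the actual numbers.
const circles = [
  { cx: 100, cy: 100, r: 40 },
  { cx: 300, cy: 100, r: 40 },
  { cx: 500, cy: 100, r: 40, fill: "blue" },
];

// Build one <circle> tag: cx and cy are the center, r is the radius.
// The fill attribute is only written when a fill is given.
function circleTag(c) {
  const fill = c.fill ? ` fill="${c.fill}"` : "";
  return `<circle cx="${c.cx}" cy="${c.cy}" r="${c.r}"${fill}></circle>`;
}

// Wrap the circle tags in an <svg> element of the given size.
function svgMarkup(width, height, specs) {
  return `<svg width="${width}" height="${height}">` +
    specs.map(circleTag).join("") +
    "</svg>";
}

console.log(svgMarkup(600, 200, circles));
```

Because the circles are ordinary child tags of the svg tag, they sit in the DOM tree like any paragraph or div, which is what lets D3 select and append them.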
347 00:19:16,842 --> 00:19:19,800 So it's going to be tempting at first to just append a bunch of circles 348 00:19:19,800 --> 00:19:24,540 and specify their center x and center y since that's what we've just seen. 349 00:19:24,540 --> 00:19:28,860 But this can be a little bit annoying down the road. 350 00:19:28,860 --> 00:19:34,200 So take for example, if we wanted to draw a smiley face, for instance, 351 00:19:34,200 --> 00:19:36,300 on an SVG. 352 00:19:36,300 --> 00:19:39,750 Then we might have one circle for the head, two circles for the eyes, 353 00:19:39,750 --> 00:19:41,510 and an arc for the mouth. 354 00:19:41,510 --> 00:19:44,250 And no matter where we put the smiley face on the SVG, 355 00:19:44,250 --> 00:19:46,860 the relative positions of all of those elements 356 00:19:46,860 --> 00:19:50,860 to each other are going to be the same. 357 00:19:50,860 --> 00:19:53,700 However, if we want to move that smiley face around, 358 00:19:53,700 --> 00:19:57,450 then we need to change the absolute location of each of those elements 359 00:19:57,450 --> 00:19:58,440 independently, right? 360 00:19:58,440 --> 00:20:03,520 So they each need to specify-- if we move it 500 down, 361 00:20:03,520 --> 00:20:07,470 500 left, then we need to move each element 500 down, 500 to the left. 362 00:20:07,470 --> 00:20:09,720 And this can get a little bit annoying if for each one 363 00:20:09,720 --> 00:20:11,610 we're specifying a center. 364 00:20:11,610 --> 00:20:18,510 So to deal with that, we can use this g tag, and g stands for group. 365 00:20:18,510 --> 00:20:21,690 And the idea is we can specify a g tag and put 366 00:20:21,690 --> 00:20:27,860 all the elements that we want to group together inside of these two tags. 367 00:20:27,860 --> 00:20:30,920 The g tag has this attribute transform. 368 00:20:30,920 --> 00:20:35,300 And transform is going to be some transformation which we 369 00:20:35,300 --> 00:20:38,180 apply to all the elements of the group. 
370 00:20:38,180 --> 00:20:43,440 So this first one we have a translation, 100 down and 50 to the left. 371 00:20:43,440 --> 00:20:46,344 So if we have the smiley face within this group, 372 00:20:46,344 --> 00:20:49,010 then it would move all of the elements 100 down and 50 the left. 373 00:20:49,010 --> 00:20:54,530 And this is clearly a lot easier than moving them individually. 374 00:20:54,530 --> 00:20:57,560 We can apply other sorts of transformations. 375 00:20:57,560 --> 00:21:01,100 So in the second one, you have a rotation by 20 degrees 376 00:21:01,100 --> 00:21:05,720 and then a translation 100 to the right. 377 00:21:05,720 --> 00:21:08,390 One thing to keep in mind, 100 to the right. 378 00:21:08,390 --> 00:21:13,970 One thing to keep in mind here is that the transformations 379 00:21:13,970 --> 00:21:15,960 are going to occur right to left. 380 00:21:15,960 --> 00:21:20,517 So the rotation here is applied before the translation. 381 00:21:20,517 --> 00:21:22,850 You can imagine using a rotation if you wanted some sort 382 00:21:22,850 --> 00:21:25,020 of circular layout of your data. 383 00:21:25,020 --> 00:21:28,550 So if you wanted to draw a bunch of circles or rectangles 384 00:21:28,550 --> 00:21:31,520 or text in a circle around some center, then you 385 00:21:31,520 --> 00:21:36,500 might use the rotation transform for a group. 386 00:21:36,500 --> 00:21:37,850 So let's see this in action. 387 00:21:37,850 --> 00:21:42,320 And let's get started putting together the shots visualization. 388 00:21:42,320 --> 00:21:47,700 So back in shots.html, let's go back to the SVG which we have here. 389 00:21:47,700 --> 00:21:52,050 So I have an SVG with the ID canvas, a height of 600 pixels 390 00:21:52,050 --> 00:21:56,060 and a width of 1,200 pixels. 391 00:21:56,060 --> 00:21:59,750 And over in shots.js, we've read in our data. 
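The values of the transform attribute discussed here are plain strings, so they can be sketched with a few small helpers. The helper names `translate`, `rotate`, and `compose` are hypothetical, and the numbers are illustrative rather than taken from the slide. In SVG, the first argument of translate moves along x (positive is to the right) and the second along y (positive is down).

```javascript
// Build a translate(...) transform: tx along x, ty along y.
function translate(tx, ty) {
  return `translate(${tx},${ty})`;
}

// Build a rotate(...) transform, angle in degrees.
function rotate(deg) {
  return `rotate(${deg})`;
}

// Several transforms in one attribute are separated by spaces.
// As the talk notes, they take effect right to left.
function compose(...parts) {
  return parts.join(" ");
}

const moved = translate(50, 100);
// Rotation is written last, so it is applied before the translation.
const both = compose(translate(100, 0), rotate(20));

console.log(moved); // value for <g transform="...">
console.log(both);
```

Setting one of these strings as the transform of a g tag moves or rotates everything inside the group together, so the elements inside can keep positions relative to the group.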
392 00:21:59,750 --> 00:22:01,940 So let's start putting a little bit of-- let's start 393 00:22:01,940 --> 00:22:03,860 doing some work with this data. 394 00:22:03,860 --> 00:22:07,580 So first I'm going to select the SVG. 395 00:22:07,580 --> 00:22:10,520 And let's ignore that I've set it equal to shots for now. 396 00:22:10,520 --> 00:22:13,280 We can even delete that for now and just select the SVG 397 00:22:13,280 --> 00:22:16,010 and start working with that. 398 00:22:16,010 --> 00:22:19,330 And then I'm going to use the same idiom which we had before, 399 00:22:19,330 --> 00:22:20,810 where we select all the groups. 400 00:22:20,810 --> 00:22:27,260 And we're going to bind the data, so each of these shots to one group. 401 00:22:27,260 --> 00:22:31,310 So remember it's going to append one group per shot in our dataset, per row 402 00:22:31,310 --> 00:22:33,860 in our dataset. 403 00:22:33,860 --> 00:22:36,660 I'm going to give each of these groups a class. 404 00:22:36,660 --> 00:22:41,810 This is a very good practice in D3 when we append these general objects, 405 00:22:41,810 --> 00:22:44,160 general objects like groups or circles. 406 00:22:44,160 --> 00:22:47,750 You're going to use groups in possibly a lot of contexts. 407 00:22:47,750 --> 00:22:49,850 And you're going to want to be able to select 408 00:22:49,850 --> 00:22:52,070 all the groups in a certain context. 409 00:22:52,070 --> 00:22:56,330 And the best way to do that is to say, well, if we want to, say, 410 00:22:56,330 --> 00:22:59,030 draw all the shots and we want to draw all the rebounds, 411 00:22:59,030 --> 00:23:01,800 then we might have one which has the class shot and one 412 00:23:01,800 --> 00:23:03,080 which has the class rebound.
413 00:23:03,080 --> 00:23:06,980 And then we can just select all the ones with class shot 414 00:23:06,980 --> 00:23:09,380 rather than having to select all the groups 415 00:23:09,380 --> 00:23:12,830 and then do some filtering in order to get only the shots. 416 00:23:12,830 --> 00:23:16,310 So this makes it a little bit easier in the long run. 417 00:23:16,310 --> 00:23:19,190 And then we're going to have our transform. 418 00:23:19,190 --> 00:23:21,497 So here I've done a little work already. 419 00:23:21,497 --> 00:23:23,330 But sort of the first thing that makes sense 420 00:23:23,330 --> 00:23:25,861 is to do just the x and y-coordinates. 421 00:23:25,861 --> 00:23:27,860 So we're going to translate it by the x-coordinate 422 00:23:27,860 --> 00:23:30,560 from the left and by the y-coordinate down. 423 00:23:30,560 --> 00:23:33,950 So we have some group here. 424 00:23:33,950 --> 00:23:37,430 Now what you can see here is that at the end of this 425 00:23:37,430 --> 00:23:44,220 we have a selection, which is all the groups, one group per data point. 426 00:23:44,220 --> 00:23:46,670 So if I go back and I put in what I had before, where 427 00:23:46,670 --> 00:23:50,510 I say shots equals the selection, what I now have is 428 00:23:50,510 --> 00:23:53,780 shots is a selection of all the groups, so it's 429 00:23:53,780 --> 00:23:58,840 going to be a selection where each node in the selection represents one shot. 430 00:23:58,840 --> 00:24:01,940 So this makes a lot of sense now. 431 00:24:01,940 --> 00:24:06,500 And to each group I'm going to append a circle of radius 5. 432 00:24:06,500 --> 00:24:09,510 So for each shot, we're going to have one circle. 433 00:24:09,510 --> 00:24:12,010 And let's take a look at what we end up with. 434 00:24:12,010 --> 00:24:15,780 OK, so now that I've reloaded the page, it's a little bit cramped, right?
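The steps just described (select the SVG, bind one group per shot, give each group a class and a transform, then append a circle) might look roughly like the sketch below. The column names `x` and `y` and the function names are assumptions, since the lecture's file is not shown; the D3 calls themselves (`select`, `selectAll`, `data`, `enter`, `append`, `attr`) are the ones named in the lecture.

```javascript
// Assumed column names: x and y. The real CSV may use different ones.
function shotTransform(d) {
  return "translate(" + d.x + "," + d.y + ")";
}

// Hypothetical wrapper: d3 is passed in so the function is easy to reuse.
function drawShots(d3, data) {
  // One <g class="shot"> per row of data.
  const shots = d3.select("svg")
    .selectAll("g")
    .data(data)
    .enter()
    .append("g")
    .attr("class", "shot")
    .attr("transform", shotTransform);

  // One circle of radius 5 inside each group.
  shots.append("circle").attr("r", 5);
  return shots;
}

// Only runs in a page where D3 is loaded; the two rows are made-up sample data.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
  drawShots(d3, [{ x: 10, y: 20 }, { x: 30, y: 5 }]);
}
```

Keeping the returned selection in a variable such as `shots` is what lets later steps attach circles, colors, and event handlers to the same groups.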
435 00:24:15,780 --> 00:24:19,190 So it turns out that our x and y-coordinates are perhaps not perfectly 436 00:24:19,190 --> 00:24:22,730 spaced for a visualization like this. 437 00:24:22,730 --> 00:24:24,650 So let's expand those out a little bit. 438 00:24:24,650 --> 00:24:31,110 So let's multiply the x-coordinate by 10 and the y-coordinate by 10. 439 00:24:31,110 --> 00:24:34,850 And we can see that now it spaces it out a bit nicer. 440 00:24:34,850 --> 00:24:37,100 Still though, we can't see the lower half of the court 441 00:24:37,100 --> 00:24:40,659 because our SVG is not big enough. 442 00:24:40,659 --> 00:24:42,950 And typically, when we're looking at basketball courts, 443 00:24:42,950 --> 00:24:46,040 we want them turned to the side here. 444 00:24:46,040 --> 00:24:49,922 So we can do that as well by flipping the y and x-coordinates. 445 00:24:49,922 --> 00:24:54,170 446 00:24:54,170 --> 00:24:57,960 And now when we run it, it's going to look about how we expect. 447 00:24:57,960 --> 00:25:02,150 And yeah, it looks pretty much like what we might expect from a shot chart, 448 00:25:02,150 --> 00:25:04,910 where we have some clustering around the three point line 449 00:25:04,910 --> 00:25:06,870 and a lot of shots near the basket. 450 00:25:06,870 --> 00:25:09,580 451 00:25:09,580 --> 00:25:14,240 OK, so let's start doing some work with this data in a more interesting way. 452 00:25:14,240 --> 00:25:15,680 So what data do we have? 453 00:25:15,680 --> 00:25:17,610 We have whether the shot was made or missed. 454 00:25:17,610 --> 00:25:21,020 So how about we make the circles green 455 00:25:21,020 --> 00:25:24,650 if it was a make and red if it was a miss. 456 00:25:24,650 --> 00:25:28,910 So remember we have this attribute fill for the circles. 457 00:25:28,910 --> 00:25:30,430 So we can edit that attribute.
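The rescaling and the swap of x and y described above can be sketched as a small function that produces the transform string. The column names `x` and `y` and the name `courtTransform` are assumptions; the factor of 10 is the one used in the lecture.

```javascript
const SCALE = 10; // the multiplier chosen in the lecture

// Swapping the roles of x and y turns the court on its side.
// Multiplying also converts CSV strings such as "12" into numbers.
function courtTransform(d) {
  return "translate(" + SCALE * d.y + "," + SCALE * d.x + ")";
}

console.log(courtTransform({ x: "3", y: "12" }));
```

A function like this would be passed as the second argument to `.attr("transform", ...)` on the groups.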
458 00:25:30,430 --> 00:25:34,397 And we're going to have a function here, which is again 459 00:25:34,397 --> 00:25:36,980 going to be a nameless function that's going to take that data 460 00:25:36,980 --> 00:25:40,550 bound to that node as the input. 461 00:25:40,550 --> 00:25:42,940 And it's going to output the value we want for fill. 462 00:25:42,940 --> 00:25:45,710 So if we uncomment each of these lines, we have this function. 463 00:25:45,710 --> 00:25:48,570 And it says, if d.result equals made. 464 00:25:48,570 --> 00:25:51,740 So remember, the data bound to the node 465 00:25:51,740 --> 00:25:56,820 is a JavaScript object representing one row, whose keys 466 00:25:56,820 --> 00:25:59,580 are the column names in our CSV file. 467 00:25:59,580 --> 00:26:03,170 So here we have a column name called result, which is either made or missed. 468 00:26:03,170 --> 00:26:06,400 So if d.result equals made, then we're going to turn it green. 469 00:26:06,400 --> 00:26:08,210 And otherwise, we're going to turn it red. 470 00:26:08,210 --> 00:26:10,418 So we're going to have a green circle if it's a make, 471 00:26:10,418 --> 00:26:13,066 and a red circle if it's a miss. 472 00:26:13,066 --> 00:26:18,330 If we look at this again, then that's what we end up with. 473 00:26:18,330 --> 00:26:23,070 And we can begin to see some more things so we can see that out in this area, 474 00:26:23,070 --> 00:26:24,510 people are missing a lot of shots. 475 00:26:24,510 --> 00:26:27,530 So these sort of end of the half, half court shots 476 00:26:27,530 --> 00:26:31,350 are not very effective, especially since this is the Knicks and not 477 00:26:31,350 --> 00:26:34,290 the Warriors. 478 00:26:34,290 --> 00:26:37,920 So we've begun to look at our data and really 479 00:26:37,920 --> 00:26:40,830 be able to pull maybe not the most interesting thing out, 480 00:26:40,830 --> 00:26:44,726 but beginning to pull something out from this data.
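The fill function described above is small enough to write out. This is a sketch assuming the column is named `result` and holds the strings `made` or `missed`, as stated in the lecture; the name `shotColor` and the selector `g.shot circle` are illustrative.

```javascript
// Takes the data bound to a node and returns the fill color for its circle.
function shotColor(d) {
  return d.result === "made" ? "green" : "red";
}

// Only runs in a page where D3 is loaded and the shot groups already exist.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
  d3.selectAll("g.shot circle").attr("fill", shotColor);
}
```

D3 calls the function once per node, passing that node's bound data, so each circle gets its own color.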
481 00:26:44,726 --> 00:26:49,550 The question is, how can we add more and can we make this visualization 482 00:26:49,550 --> 00:26:51,120 interactive? 483 00:26:51,120 --> 00:26:55,890 So the main interface for adding interactivity is this .on function, 484 00:26:55,890 --> 00:26:59,690 which again is called on selections. 485 00:26:59,690 --> 00:27:03,470 And the first argument to .on is going to be a string representing an event. 486 00:27:03,470 --> 00:27:05,720 And this string can be click. 487 00:27:05,720 --> 00:27:11,640 It could be mousemove if you move the mouse over the selection. 488 00:27:11,640 --> 00:27:16,110 It can be keydown if you press a key. 489 00:27:16,110 --> 00:27:19,090 We're going to use mouseover and mouseout. 490 00:27:19,090 --> 00:27:21,840 So mouseover is going to be an event that 491 00:27:21,840 --> 00:27:24,910 triggers when you move the mouse over a node in the selection. 492 00:27:24,910 --> 00:27:27,410 And mouseout is going to be an event that triggers when you 493 00:27:27,410 --> 00:27:29,820 move the mouse away from the selection. 494 00:27:29,820 --> 00:27:32,400 The second argument for .on is the function that's 495 00:27:32,400 --> 00:27:35,550 called when that event is triggered. 496 00:27:35,550 --> 00:27:38,210 And again, the function is going to take a single argument. 497 00:27:38,210 --> 00:27:43,490 And that argument is going to be the data associated with the node which 498 00:27:43,490 --> 00:27:45,730 triggered the event. 499 00:27:45,730 --> 00:27:49,170 So let's look at that. 500 00:27:49,170 --> 00:27:54,940 So if we do .on mouseover, we are going to have a few lines here, 501 00:27:54,940 --> 00:27:58,560 which we can explain. 502 00:27:58,560 --> 00:28:02,820 So we have first this idiom d3.select(this). 503 00:28:02,820 --> 00:28:07,250 So d3.select(this)-- this represents the actual node.
504 00:28:07,250 --> 00:28:09,260 So remember the argument here is not the node 505 00:28:09,260 --> 00:28:12,920 itself, but the data associated with the node. 506 00:28:12,920 --> 00:28:15,410 But often, when an event's triggered on the node, 507 00:28:15,410 --> 00:28:18,740 we want to actually change just that node. 508 00:28:18,740 --> 00:28:23,260 And the way to get just that node, is to call this d3.select(this). 509 00:28:23,260 --> 00:28:27,630 And it's going to return a selection which contains just that node which 510 00:28:27,630 --> 00:28:29,370 triggered the event. 511 00:28:29,370 --> 00:28:34,510 I'm going to call .raise, to raise that node above the other ones. 512 00:28:34,510 --> 00:28:38,580 And let's take that off for now and see what happens if we don't do that. 513 00:28:38,580 --> 00:28:43,460 But that's going to essentially move that element to the end of the SVG 514 00:28:43,460 --> 00:28:45,040 so it's placed above everything else. 515 00:28:45,040 --> 00:28:48,840 But let's see what happens if we don't have that first. 516 00:28:48,840 --> 00:28:51,650 To that node, we're going to append some text. 517 00:28:51,650 --> 00:28:54,140 We're going to give it a class called player name. 518 00:28:54,140 --> 00:28:56,430 And again, it's good practice whenever we append something 519 00:28:56,430 --> 00:28:58,076 to give it some sort of class. 520 00:28:58,076 --> 00:29:00,200 We probably should have given these circles a class 521 00:29:00,200 --> 00:29:02,070 if we want to draw other circles. 522 00:29:02,070 --> 00:29:04,050 We're actually not going to draw other circles, 523 00:29:04,050 --> 00:29:06,050 so I'm not going to bother with it now. 524 00:29:06,050 --> 00:29:09,260 But it is a good practice to have that class attribute.
525 00:29:09,260 --> 00:29:14,100 And we're going to add this text to this text SVG 526 00:29:14,100 --> 00:29:17,370 tag, which is going to be the player name, so d.player, 527 00:29:17,370 --> 00:29:19,800 so we're using that argument. 528 00:29:19,800 --> 00:29:27,020 So now, if we run this, and we move our mouse over some nodes, 529 00:29:27,020 --> 00:29:31,750 we can see that we indeed end up with the player's name. 530 00:29:31,750 --> 00:29:34,020 And we have a little bit of a problem, because when 531 00:29:34,020 --> 00:29:38,070 we move the mouse off the node, it doesn't delete the player's name. 532 00:29:38,070 --> 00:29:42,940 So now let's go back, and we add this mouseout function. 533 00:29:42,940 --> 00:29:46,420 So here we have on mouseout we're going to trigger another function. 534 00:29:46,420 --> 00:29:50,610 This function is just going to select all the text which has the class player 535 00:29:50,610 --> 00:29:53,300 name, and it's going to remove that. 536 00:29:53,300 --> 00:29:55,365 So here we wouldn't even need text. 537 00:29:55,365 --> 00:29:57,740 We can just select everything with the class player name. 538 00:29:57,740 --> 00:30:02,750 And we could actually even do everything with just the tag text. 539 00:30:02,750 --> 00:30:06,460 However, probably the best practice is to be as specific as possible, 540 00:30:06,460 --> 00:30:10,460 because we may want to add in later other sorts of text 541 00:30:10,460 --> 00:30:13,350 onto our visualization that we don't want to delete 542 00:30:13,350 --> 00:30:15,900 every time we move off a node. 543 00:30:15,900 --> 00:30:20,370 So now that we've added this, let's go back and reload our visualization, 544 00:30:20,370 --> 00:30:22,710 and see what happens as we move around. 545 00:30:22,710 --> 00:30:29,429 And now, we can indeed see that once we move off a node, the text goes away.
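The two handlers described above might be wired up as in this sketch. It assumes the D3 version current at the time of the lecture (v4 or v5), where the listener receives the bound data as its first argument and `this` is the node; from D3 v6 on, the listener receives the event first and the data second. The class name `playername`, the selector `g.shot`, and the function names are assumptions.

```javascript
function playerLabel(d) {
  return d.player;
}

// Hypothetical wrapper: attaches hover behavior to a selection of shot groups.
function addHover(d3, shots) {
  return shots
    .on("mouseover", function (d) {
      // `this` is the node that triggered the event, `d` its bound data.
      d3.select(this)
        .raise() // move the group last so its text draws on top
        .append("text")
        .attr("class", "playername")
        .text(playerLabel(d));
    })
    .on("mouseout", function () {
      // Be specific: remove only text elements carrying the label class.
      d3.selectAll("text.playername").remove();
    });
}

// Only runs in a page where D3 is loaded and the shot groups already exist.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
  addHover(d3, d3.selectAll("g.shot"));
}
```

The handlers are ordinary functions rather than arrow functions because they rely on `this` being set to the node.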
546 00:30:29,429 --> 00:30:31,220 Maybe we can make our nodes a little larger 547 00:30:31,220 --> 00:30:34,050 if we want it to be a little easier to get over them. 548 00:30:34,050 --> 00:30:36,050 One of the things that I love about this dataset 549 00:30:36,050 --> 00:30:41,020 is that we can see J.R. Smith likes to take a lot of shots from way out. 550 00:30:41,020 --> 00:30:43,460 That's an interesting point. 551 00:30:43,460 --> 00:30:47,600 But one thing we can see is that once we get into here, 552 00:30:47,600 --> 00:30:50,209 the text is going to be added behind a bunch of other nodes. 553 00:30:50,209 --> 00:30:52,250 And the reason this is occurring is because we're 554 00:30:52,250 --> 00:30:55,400 adding it to the group, which has already been placed. 555 00:30:55,400 --> 00:30:58,220 So if that group was placed before another group, 556 00:30:58,220 --> 00:31:01,670 even when we add the text, it's going to be behind all the circles 557 00:31:01,670 --> 00:31:04,130 in the groups placed after it. 558 00:31:04,130 --> 00:31:09,260 That's why we had this .raise here, so this selection is going to be that one 559 00:31:09,260 --> 00:31:09,960 group. 560 00:31:09,960 --> 00:31:17,300 And .raise is going to raise that group, so it's the last group in the SVG, 561 00:31:17,300 --> 00:31:19,100 so it's going to be on top. 562 00:31:19,100 --> 00:31:22,760 And now if we look at it again with that .raise, 563 00:31:22,760 --> 00:31:27,050 we can see that now the names come up above all of the other circles. 564 00:31:27,050 --> 00:31:33,290 So often we're not going to have data in necessarily the best form 565 00:31:33,290 --> 00:31:35,660 for the visualizations we want to make. 566 00:31:35,660 --> 00:31:38,130 In this case, we have rows of data. 567 00:31:38,130 --> 00:31:39,830 And we basically want one row per shot. 568 00:31:39,830 --> 00:31:41,660 So this works out pretty well.
569 00:31:41,660 --> 00:31:46,610 But often we are going to want to restructure data in some manner. 570 00:31:46,610 --> 00:31:51,060 And there are a bunch of functions that D3 provides in order to do that. 571 00:31:51,060 --> 00:31:53,150 We're going to work with d3.nest a little bit 572 00:31:53,150 --> 00:31:55,810 to do some stuff with the shots data. 573 00:31:55,810 --> 00:31:59,810 But I'm going to talk a little bit about d3.stratify and d3.hierarchy first, 574 00:31:59,810 --> 00:32:02,530 which deal with hierarchical data. 575 00:32:02,530 --> 00:32:06,720 So the classic example for hierarchical data is a family tree. 576 00:32:06,720 --> 00:32:09,170 So suppose we have some data, which is again 577 00:32:09,170 --> 00:32:13,190 represented as a table in some sort of CSV or whatnot, 578 00:32:13,190 --> 00:32:16,520 where we have the name of a person and then the name of their parent. 579 00:32:16,520 --> 00:32:19,220 So this could clearly be represented with a tree structure. 580 00:32:19,220 --> 00:32:21,410 And a lot of D3 visualizations are going to be 581 00:32:21,410 --> 00:32:26,810 trees or something that represents sort of a tree structure or a hierarchy. 582 00:32:26,810 --> 00:32:33,230 So we can use stratify in order to convert this sort of CSV format 583 00:32:33,230 --> 00:32:38,780 into something which is easier to work with when making trees. 584 00:32:38,780 --> 00:32:40,760 Where we actually want to end up with something 585 00:32:40,760 --> 00:32:45,710 like this, where we have the JavaScript object with three keys-- the ID, 586 00:32:45,710 --> 00:32:49,880 so the person's name, the parent ID, so who their parent is, 587 00:32:49,880 --> 00:32:54,800 and their children, which is a list of all of the nodes 588 00:32:54,800 --> 00:32:58,010 which have the parent of Eve. 589 00:32:58,010 --> 00:33:00,110 And this is going to be sort of recursive, right? 
590 00:33:00,110 --> 00:33:02,700 So each of these children is going to have the same format. 591 00:33:02,700 --> 00:33:05,210 So it'll have their ID, their parent ID, and then 592 00:33:05,210 --> 00:33:10,710 their children, which is again going to be another array of a similar format. 593 00:33:10,710 --> 00:33:14,140 So if we write something like strat equals d3.stratify, 594 00:33:14,140 --> 00:33:16,730 so creating some stratifier. 595 00:33:16,730 --> 00:33:20,900 In .id and .parentId, we're going to specify a function which is 596 00:33:20,900 --> 00:33:24,140 going to act on the data in that row. 597 00:33:24,140 --> 00:33:27,240 And it's going to return the ID we want to use 598 00:33:27,240 --> 00:33:28,850 and the parent ID we want to use. 599 00:33:28,850 --> 00:33:32,060 So here the ID is going to be the name of the person. 600 00:33:32,060 --> 00:33:34,950 And here the parent ID is going to be the parent. 601 00:33:34,950 --> 00:33:44,050 So you can see that d3.stratify was sort of designed with family trees in mind. 602 00:33:44,050 --> 00:33:46,940 And then if we call strat on whatever our data is, the data 603 00:33:46,940 --> 00:33:49,430 we read in maybe with d3.csv, then we'll 604 00:33:49,430 --> 00:33:53,220 end up with a JavaScript object that looks something like this. 605 00:33:53,220 --> 00:33:55,640 And there's going to be a lot of visualizations in D3 606 00:33:55,640 --> 00:33:59,210 that are built with formats something like this. 607 00:33:59,210 --> 00:34:02,294 So I just wanted to touch on that briefly. 608 00:34:02,294 --> 00:34:04,460 But now let's return to what we're going to actually 609 00:34:04,460 --> 00:34:06,830 work with, with just this d3.nest. 610 00:34:06,830 --> 00:34:10,429 And this is used to group together data. 611 00:34:10,429 --> 00:34:14,090 And in our case, we're going to group together data by player. 612 00:34:14,090 --> 00:34:18,340 So we're going to take all the shots taken by certain players.
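To show the shape that the table-to-tree conversion aims for, here is a hand-rolled sketch in plain JavaScript. It is a stand-in, not `d3.stratify` itself: the real call would be along the lines of `d3.stratify().id(...).parentId(...)`, and the nodes it returns carry more than the three keys shown here. The sample rows, the column names `name` and `parent`, and the function `buildTree` are made up for illustration.

```javascript
// Made-up rows, shaped like what a name,parent table would load as.
const familyRows = [
  { name: "Eve", parent: "" },
  { name: "Cain", parent: "Eve" },
  { name: "Abel", parent: "Eve" },
  { name: "Enoch", parent: "Cain" },
];

// Stand-in for a stratifier: idOf and parentIdOf play the roles of the
// accessor functions given to .id and .parentId.
function buildTree(rows, idOf, parentIdOf) {
  const nodes = new Map();
  rows.forEach(function (row) {
    nodes.set(idOf(row), {
      id: idOf(row),
      parentId: parentIdOf(row) || null, // empty parent marks the root
      children: [],
    });
  });

  let root = null;
  nodes.forEach(function (node) {
    if (node.parentId === null) {
      root = node;
    } else {
      nodes.get(node.parentId).children.push(node);
    }
  });
  return root;
}

const familyTree = buildTree(
  familyRows,
  function (d) { return d.name; },
  function (d) { return d.parent; }
);
console.log(JSON.stringify(familyTree, null, 2));
```

Each child has the same three keys as the root, which is the recursive structure that tree layouts expect.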
613 00:34:18,340 --> 00:34:22,040 D3.nest has three methods which we're going to use. 614 00:34:22,040 --> 00:34:27,170 The first is .entries, and the argument to entries is just going to be the data 615 00:34:27,170 --> 00:34:28,100 we want to group. 616 00:34:28,100 --> 00:34:32,690 So in this case, we're going to do d3.nest().entries(data). 617 00:34:32,690 --> 00:34:35,150 The second method we're going to use is this key. 618 00:34:35,150 --> 00:34:38,600 So key is going to specify what we want to group on. 619 00:34:38,600 --> 00:34:40,400 So here we're going to have a function. 620 00:34:40,400 --> 00:34:44,050 The function is going to take in a row of our data. 621 00:34:44,050 --> 00:34:47,159 And it's going to return what the key should be for that row. 622 00:34:47,159 --> 00:34:49,909 So in this case, it's going to return the player name. 623 00:34:49,909 --> 00:34:52,710 So let's take a look at that. 624 00:34:52,710 --> 00:34:57,000 So if we do players equals d3.nest, the key is going to be a player. 625 00:34:57,000 --> 00:35:01,570 We're going to come back to rollup, and the entries is just going to be our data. 626 00:35:01,570 --> 00:35:08,920 And let's log what that looks like once we do this grouping. 627 00:35:08,920 --> 00:35:12,560 So we reload. 628 00:35:12,560 --> 00:35:16,280 We can see that we have an array with 17 objects where the keys are the players' 629 00:35:16,280 --> 00:35:21,140 names, and the values are each of the shots taken by that player. 630 00:35:21,140 --> 00:35:23,360 So if we open up one of these, we can see 631 00:35:23,360 --> 00:35:27,230 that this is a shot indeed taken by Carmelo Anthony assisted 632 00:35:27,230 --> 00:35:29,380 by Tyson Chandler. 633 00:35:29,380 --> 00:35:31,580 OK. 634 00:35:31,580 --> 00:35:41,150 So we now have this object, which has the shots grouped by player. 635 00:35:41,150 --> 00:35:45,350 The last method, which we're going to use from d3.nest is this .rollup.
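The grouping that `d3.nest().key(...).entries(data)` performs can be imitated in plain JavaScript to see what comes out. This is a sketch of the output shape (an array of objects with `key` and `values`), not D3's own code; `groupByKey` and the sample rows are invented here. Note that `d3.nest` belongs to the D3 versions of the lecture's era and was removed in D3 v6, where `d3.group` and `d3.rollup` take its place.

```javascript
// Groups rows by the string that keyOf returns, keeping first-seen order.
function groupByKey(rows, keyOf) {
  const groups = [];
  const index = new Map();
  rows.forEach(function (row) {
    const key = keyOf(row);
    if (!index.has(key)) {
      const group = { key: key, values: [] };
      index.set(key, group);
      groups.push(group);
    }
    index.get(key).values.push(row);
  });
  return groups;
}

// Made-up sample rows with the player column used in the lecture.
const sampleShots = [
  { player: "Carmelo Anthony", result: "made" },
  { player: "Tyson Chandler", result: "missed" },
  { player: "Carmelo Anthony", result: "missed" },
];

console.log(groupByKey(sampleShots, function (d) { return d.player; }));
```

Each element of the result holds one player's name under `key` and that player's rows under `values`.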
636 00:35:45,350 --> 00:35:49,460 So .rollup allows us, once we've done this grouping, 637 00:35:49,460 --> 00:35:54,920 to do some aggregation on the rows that have been grouped together. 638 00:35:54,920 --> 00:36:01,550 So rollup takes a function, which rather than acting on the rows individually, 639 00:36:01,550 --> 00:36:06,020 is going to act on the whole array of rows. 640 00:36:06,020 --> 00:36:08,810 So in our case, we're going to do the simplest thing possible, 641 00:36:08,810 --> 00:36:12,270 which is we're just going to return the length of that array. 642 00:36:12,270 --> 00:36:15,170 So the length of the array representing the number of shots 643 00:36:15,170 --> 00:36:17,700 taken by the player. 644 00:36:17,700 --> 00:36:19,670 So here we do .rollup. 645 00:36:19,670 --> 00:36:23,660 Here I do V, maybe V stands for vector or whatnot. 646 00:36:23,660 --> 00:36:26,480 We can change it to A to stand for array. 647 00:36:26,480 --> 00:36:29,510 So it's just going to be this function A, which 648 00:36:29,510 --> 00:36:31,880 takes in the array of shots for that player 649 00:36:31,880 --> 00:36:34,410 and returns the length of that array. 650 00:36:34,410 --> 00:36:36,030 So now if we run this again. 651 00:36:36,030 --> 00:36:38,930 And log what we get, we can indeed see that we 652 00:36:38,930 --> 00:36:44,870 get this array where each object has two keys-- 653 00:36:44,870 --> 00:36:52,940 key-- Carmelo Anthony, value-- number of shots taken by Carmelo Anthony, 1,643. 654 00:36:52,940 --> 00:36:56,030 So nest has allowed us to sort of pull some other things out from our data 655 00:36:56,030 --> 00:36:58,841 and group it in some manner that we can then use. 656 00:36:58,841 --> 00:37:01,670 657 00:37:01,670 --> 00:37:06,132 So let's use this a little bit. 658 00:37:06,132 --> 00:37:08,340 So we haven't actually added it to our visualization. 659 00:37:08,340 --> 00:37:10,680 We've done some sort of computations here.
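The effect of adding `.rollup` can also be imitated in plain JavaScript: each group's array of rows is replaced by whatever the rollup function returns, stored under `value`. This is a sketch of the output shape, not D3's implementation; `rollupByKey` and the sample rows are made up.

```javascript
// Groups rows by key, then replaces each group's rows with rollup(rows).
function rollupByKey(rows, keyOf, rollup) {
  const buckets = new Map();
  rows.forEach(function (row) {
    const key = keyOf(row);
    if (!buckets.has(key)) {
      buckets.set(key, []);
    }
    buckets.get(key).push(row);
  });

  const result = [];
  buckets.forEach(function (groupRows, key) {
    result.push({ key: key, value: rollup(groupRows) });
  });
  return result;
}

// Made-up sample rows.
const shotRows = [
  { player: "Carmelo Anthony" },
  { player: "Tyson Chandler" },
  { player: "Carmelo Anthony" },
];

// The rollup from the lecture: the number of shots is the array's length.
const shotCounts = rollupByKey(
  shotRows,
  function (d) { return d.player; },
  function (a) { return a.length; }
);
console.log(shotCounts);
```

Any aggregate works in place of the length, for example counting only the rows whose result is a make.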
660 00:37:10,680 --> 00:37:15,360 But let's actually add it to our visualization and do something with it. 661 00:37:15,360 --> 00:37:18,570 So back in the shots.html, you saw that I 662 00:37:18,570 --> 00:37:22,550 had this selector with the ID selector. 663 00:37:22,550 --> 00:37:25,430 And we're going to want to be able to just look at the shots taken 664 00:37:25,430 --> 00:37:27,290 by a specific player. 665 00:37:27,290 --> 00:37:32,030 So now we can go back and start thinking about using D3 with the HTML 666 00:37:32,030 --> 00:37:34,460 rather than in the SVG context. 667 00:37:34,460 --> 00:37:37,580 So I'm just going to select the selector. 668 00:37:37,580 --> 00:37:41,230 I'm going to select-- I'm going to use this idiom again. 669 00:37:41,230 --> 00:37:44,600 .selectAll, .data, .enter, .append. 670 00:37:44,600 --> 00:37:50,270 So I'm going to add one option for each element of this players array 671 00:37:50,270 --> 00:37:52,050 that I've just created. 672 00:37:52,050 --> 00:37:58,700 So one option per player where the text of that option is going to be the key, 673 00:37:58,700 --> 00:38:02,120 so the player name and a colon and then the number of shots 674 00:38:02,120 --> 00:38:03,590 taken by the player. 675 00:38:03,590 --> 00:38:05,990 And the value-- so this attribute value is just 676 00:38:05,990 --> 00:38:09,060 going to be the key of that player. 677 00:38:09,060 --> 00:38:13,010 So if I do that, and I look at what I end up with, 678 00:38:13,010 --> 00:38:16,220 you can see now I have a selector where I have the player names 679 00:38:16,220 --> 00:38:17,120 and the shots taken. 680 00:38:17,120 --> 00:38:19,960 And I can go through and click on that selector. 681 00:38:19,960 --> 00:38:22,280 And you can see the selector isn't so pretty, 682 00:38:22,280 --> 00:38:25,010 but you can imagine we could also combine this with Bootstrap, 683 00:38:25,010 --> 00:38:26,968 and we could have a very nice looking selector.
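Filling the selector with one option per player might look like the sketch below. It assumes the `select` element has the ID `selector`, as described, and that `players` is the array of `{ key, value }` objects from the rollup step; the function names and sample entries are illustrative.

```javascript
// Text shown for one option: player name, a colon, and the shot count.
function optionLabel(p) {
  return p.key + ": " + p.value;
}

// Hypothetical wrapper around the selectAll / data / enter / append idiom.
function fillSelector(d3, players) {
  return d3.select("#selector")
    .selectAll("option")
    .data(players)
    .enter()
    .append("option")
    .text(optionLabel)
    .attr("value", function (p) { return p.key; });
}

// Only runs in a page where D3 is loaded; the entries are made-up samples.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
  fillSelector(d3, [
    { key: "Carmelo Anthony", value: 3 },
    { key: "Tyson Chandler", value: 1 },
  ]);
}
```

The same data join used for SVG groups works on ordinary HTML elements such as `option`.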
684 00:38:26,968 --> 00:38:30,710 So we can start to use a lot of these different CSS and JS libraries 685 00:38:30,710 --> 00:38:35,210 together to make a more beautiful application. 686 00:38:35,210 --> 00:38:37,880 So the selector doesn't do anything right now, 687 00:38:37,880 --> 00:38:41,300 but we can combine some of the things we've 688 00:38:41,300 --> 00:38:46,370 learned in terms of interaction and selections 689 00:38:46,370 --> 00:38:50,240 in order to make it only show the shots for that player. 690 00:38:50,240 --> 00:38:53,810 So we're going to, rather than using on click here, 691 00:38:53,810 --> 00:38:55,520 we're going to use on change. 692 00:38:55,520 --> 00:38:58,907 So that when they-- rather than just when they click on the same thing, 693 00:38:58,907 --> 00:39:00,740 we don't want it to change anything, so only 694 00:39:00,740 --> 00:39:03,050 when they change the value of the selector. 695 00:39:03,050 --> 00:39:06,970 We're going to select all-- well, let's leave that for now. 696 00:39:06,970 --> 00:39:09,350 We're going to get the value of the selector. 697 00:39:09,350 --> 00:39:15,950 So what's interesting about these .attr and here we're using 698 00:39:15,950 --> 00:39:20,510 .property, and .style, is that they can be used as accessors as well 699 00:39:20,510 --> 00:39:22,520 as setters, so getters as well as setters. 700 00:39:22,520 --> 00:39:25,700 So if we don't provide this second argument here to set it, 701 00:39:25,700 --> 00:39:28,350 then it's going to return the current value. 702 00:39:28,350 --> 00:39:31,190 So here we're just getting the current value of the selector. 703 00:39:31,190 --> 00:39:34,880 So whatever we set here is the value. 704 00:39:34,880 --> 00:39:40,070 And once we get the value, we're going to select all the shots. 705 00:39:40,070 --> 00:39:43,610 We're going to apply this filter function which 706 00:39:43,610 --> 00:39:46,230 is going to restrict our selection.
707 00:39:46,230 --> 00:39:50,720 So .filter, which I mentioned way back in terms of selections, 708 00:39:50,720 --> 00:39:53,360 and said I would come back to, well, this is the time. 709 00:39:53,360 --> 00:39:56,330 So once I've created a selection I may not want 710 00:39:56,330 --> 00:39:57,870 all of the nodes in that selection. 711 00:39:57,870 --> 00:40:01,580 And I may want to be able to pick out the nodes in a more specific sense 712 00:40:01,580 --> 00:40:05,860 than just class, ID, and tag. 713 00:40:05,860 --> 00:40:10,100 And I can do that by using this .filter function. 714 00:40:10,100 --> 00:40:15,650 So the argument to filter is just going to be a function, which 715 00:40:15,650 --> 00:40:19,500 takes in the data bound to that node, and if it returns true, 716 00:40:19,500 --> 00:40:22,020 then we keep the node in the selection, and if it returns false, 717 00:40:22,020 --> 00:40:23,330 we don't keep the node. 718 00:40:23,330 --> 00:40:29,030 So here my function takes the data, and it returns d.player not equal to value. 719 00:40:29,030 --> 00:40:32,420 So it returns true if the player for that row 720 00:40:32,420 --> 00:40:35,900 is not equal to the player I've selected. 721 00:40:35,900 --> 00:40:40,190 So it's going to select all the shots not taken by the player. 722 00:40:40,190 --> 00:40:42,520 And it's going to set their opacity to 0.1. 723 00:40:42,520 --> 00:40:45,170 And so it's going to make them transparent. 724 00:40:45,170 --> 00:40:47,410 Then let's see that in action. 725 00:40:47,410 --> 00:40:50,720 So now if I reload and click, you can see that it only 726 00:40:50,720 --> 00:40:53,270 shows me the shots for Beno Udrih. 727 00:40:53,270 --> 00:40:55,280 Problem is, if I select another player, then 728 00:40:55,280 --> 00:40:56,780 they're all going to be transparent. 729 00:40:56,780 --> 00:41:00,210 Because I'm only making more nodes transparent, I'm never resetting them.
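The filter predicate and the getter form of `.property` described above can be sketched as follows. The names `notTakenBy` and `fadeOthers`, the selector `g.shot`, and the use of `.style` for opacity are assumptions; the lecture does not show which setter it uses. This version deliberately stops at the behavior described so far, so it still has the problem of never resetting faded shots.

```javascript
// Builds the predicate given to .filter: true for shots by anyone else.
function notTakenBy(player) {
  return function (d) {
    return d.player !== player;
  };
}

// Hypothetical handler body for the selector's change event.
function fadeOthers(d3) {
  // With no second argument, .property returns the current value.
  const value = d3.select("#selector").property("value");
  d3.selectAll("g.shot")
    .filter(notTakenBy(value))
    .style("opacity", 0.1);
}

// Only runs in a page where D3 is loaded.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
  d3.select("#selector").on("change", function () {
    fadeOthers(d3);
  });
}

// The same predicate works on a plain array, which makes it easy to check.
const demoRows = [{ player: "Beno Udrih" }, { player: "Carmelo Anthony" }];
console.log(demoRows.filter(notTakenBy("Beno Udrih")));
```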
730 00:41:00,210 --> 00:41:02,960 So in order to change that, I go back to the start of my function. 731 00:41:02,960 --> 00:41:04,160 I select all the shots. 732 00:41:04,160 --> 00:41:12,260 And I change their opacity to 1 before I go and change the opacity of the shots 733 00:41:12,260 --> 00:41:14,760 not taken by that player. 734 00:41:14,760 --> 00:41:22,220 So if I do that, and reload, and now I select Raymond Felton. 735 00:41:22,220 --> 00:41:25,050 And see he spreads out his shots pretty well. 736 00:41:25,050 --> 00:41:30,540 If I go and look at Tyson Chandler, a little bit tighter to the hoop. 737 00:41:30,540 --> 00:41:33,960 Metta World Peace, maybe not taking that many shots. 738 00:41:33,960 --> 00:41:37,170 Amare Stoudemire, another guy who plays tight to the hoop, and of course, 739 00:41:37,170 --> 00:41:40,210 Carmelo Anthony, all over the floor. 740 00:41:40,210 --> 00:41:44,350 So again, we can begin to pull more things out of our data. 741 00:41:44,350 --> 00:41:47,250 Now the one thing which I don't like about this is it starts out 742 00:41:47,250 --> 00:41:49,980 with Carmelo Anthony, right? 743 00:41:49,980 --> 00:41:52,200 But it's not just Carmelo Anthony's shots here. 744 00:41:52,200 --> 00:41:56,900 Like nothing gets faded until we change the selector. 745 00:41:56,900 --> 00:42:00,080 So to remedy that-- I'm just going to add-- 746 00:42:00,080 --> 00:42:05,940 remember players is just an array-- so I can just add one JavaScript 747 00:42:05,940 --> 00:42:08,790 object at the start of players, one associative array 748 00:42:08,790 --> 00:42:10,040 at the start of players. 749 00:42:10,040 --> 00:42:14,700 I'm going to have my key be all, and I'm going to have my value be this d3.sum. 750 00:42:14,700 --> 00:42:16,975 751 00:42:16,975 --> 00:42:19,350 And we're not going to go into too much detail on d3.sum.
752 00:42:19,350 --> 00:42:21,750 That's something that can be looked at, and it's not 753 00:42:21,750 --> 00:42:24,220 too hard to understand given the rest of this lecture. 754 00:42:24,220 --> 00:42:29,610 But it's just going to sum up the values of the elements in players. 755 00:42:29,610 --> 00:42:31,980 So it's going to be the total number of shots taken. 756 00:42:31,980 --> 00:42:38,820 And I'm only going to fade shots if the value is not equal to all. 757 00:42:38,820 --> 00:42:45,000 So now we have this selector, where we can select through different players. 758 00:42:45,000 --> 00:42:48,720 And if we go back to all, then all the shots are shown. 759 00:42:48,720 --> 00:42:51,490 So now this works exactly as we would want it to. 760 00:42:51,490 --> 00:42:53,910 And we have all the lines of code uncommented, 761 00:42:53,910 --> 00:42:56,054 so we're done with what I had planned. 762 00:42:56,054 --> 00:42:59,580 763 00:42:59,580 --> 00:43:01,792 So we've created this visualization. 764 00:43:01,792 --> 00:43:03,750 And of course, there are a lot of ways that I'm 765 00:43:03,750 --> 00:43:06,750 sure you are beginning to think about, how we can iterate on this, 766 00:43:06,750 --> 00:43:10,260 how we can improve on this, even based on just the few functions 767 00:43:10,260 --> 00:43:11,460 that we've seen. 768 00:43:11,460 --> 00:43:15,150 We can think about looking at how assists play into this. 769 00:43:15,150 --> 00:43:17,640 We can look at maybe coloring the nodes differently 770 00:43:17,640 --> 00:43:20,760 based on shot distance or shot difficulty 771 00:43:20,760 --> 00:43:22,920 and whether they made or missed. 772 00:43:22,920 --> 00:43:25,870 And those are some great things to think about going forward.
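The finished selector logic (an "all" entry at the front of `players`, a reset to full opacity, then fading only when a specific player is chosen) might be put together like this sketch. The key string `All`, the function names, the selector `g.shot`, and the use of `.style` are assumptions; the total is computed with a plain `reduce` as a stand-in for the `d3.sum` call mentioned in the lecture.

```javascript
const ALL_KEY = "All"; // assumed spelling of the extra entry's key

// Returns a new array with a total entry placed before the players.
function withAllEntry(players) {
  const total = players.reduce(function (sum, p) {
    return sum + p.value;
  }, 0);
  return [{ key: ALL_KEY, value: total }].concat(players);
}

// Hypothetical handler body for the selector's change event.
function onSelectorChange(d3) {
  const value = d3.select("#selector").property("value");
  const shots = d3.selectAll("g.shot");

  shots.style("opacity", 1); // reset everything first
  if (value !== ALL_KEY) {
    shots
      .filter(function (d) { return d.player !== value; })
      .style("opacity", 0.1); // fade shots by other players
  }
}

// Only runs in a page where D3 is loaded.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
  d3.select("#selector").on("change", function () {
    onSelectorChange(d3);
  });
}

// Made-up counts, just to show the shape of the result.
console.log(withAllEntry([
  { key: "Carmelo Anthony", value: 3 },
  { key: "Tyson Chandler", value: 1 },
]));
```

Because the extra entry comes first, the selector starts on it, which matches the unfaded chart shown on load.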
773 00:43:25,870 --> 00:43:28,740 However, as I said, really the way to work with D3 774 00:43:28,740 --> 00:43:32,850 is to look at other examples to begin to formulate your own ideas about what 775 00:43:32,850 --> 00:43:35,460 visualizations you want to create and what 776 00:43:35,460 --> 00:43:39,870 different kinds of code, how D3 is being used in versatile ways 777 00:43:39,870 --> 00:43:41,580 to create those visualizations. 778 00:43:41,580 --> 00:43:44,450 So I'm providing a couple of resources here that you can look at. 779 00:43:44,450 --> 00:43:48,180 So we have d3js.org, which is going to have a wide gallery of sort 780 00:43:48,180 --> 00:43:54,120 of complete D3 applications as well as the code for them 781 00:43:54,120 --> 00:43:56,740 and a ton of great documentation. 782 00:43:56,740 --> 00:43:58,770 There's also this site Blocks by this guy 783 00:43:58,770 --> 00:44:03,600 Mike Bostock, which has a lot of simpler examples which sort of illustrate 784 00:44:03,600 --> 00:44:05,340 single aspects of D3. 785 00:44:05,340 --> 00:44:07,740 So here he has this circle dragging example, 786 00:44:07,740 --> 00:44:13,560 which just demonstrates basically a good way to implement circles 787 00:44:13,560 --> 00:44:14,620 that you can drag around. 788 00:44:14,620 --> 00:44:16,860 So this is a pretty simple example. 789 00:44:16,860 --> 00:44:18,660 It's not really a full visualization. 790 00:44:18,660 --> 00:44:21,240 But if you're looking for a more specific thing 791 00:44:21,240 --> 00:44:24,240 you want to implement in D3, then Blocks is a good place 792 00:44:24,240 --> 00:44:27,310 to look to see if someone has already done it. 793 00:44:27,310 --> 00:44:29,280 And the last thing I'm linking is a library 794 00:44:29,280 --> 00:44:32,580 that 538 put together called D3-pre. 795 00:44:32,580 --> 00:44:35,670 As you saw, at some points our application 796 00:44:35,670 --> 00:44:37,590 was a little laggy in terms of loading.
797 00:44:37,590 --> 00:44:41,240 It took a little while for it to render. 798 00:44:41,240 --> 00:44:45,360 D3-pre allows us to pre-render a lot of our visualizations. 799 00:44:45,360 --> 00:44:50,040 So if we want to have a lot of stuff going on in a visualization, 800 00:44:50,040 --> 00:44:56,850 we can use this D3-pre so it renders more quickly when 801 00:44:56,850 --> 00:45:00,690 we want to actually show it to users. 802 00:45:00,690 --> 00:45:04,530 So that should complete what I wanted to talk about with D3. 803 00:45:04,530 --> 00:45:08,310 I hope that you guys are inspired to look at some examples 804 00:45:08,310 --> 00:45:13,610 and go forward, and feel that you are literate with the language. 805 00:45:13,610 --> 00:45:15,320 Thank you. 806 00:45:15,320 --> 00:45:16,780