1 00:00:00,000 --> 00:00:12,080 2 00:00:12,080 --> 00:00:13,799 >> JAMES CUFF: Hi, good afternoon, everyone. 3 00:00:13,799 --> 00:00:14,715 My name is James Cuff. 4 00:00:14,715 --> 00:00:18,970 I'm the Assistant Dean for Research Computing here at Harvard University. 5 00:00:18,970 --> 00:00:24,540 And today I'm going to talk to you about why scale-out computing is essential. 6 00:00:24,540 --> 00:00:26,810 >> So I guess, first up, who is this guy? 7 00:00:26,810 --> 00:00:27,750 Why am I here? 8 00:00:27,750 --> 00:00:29,200 Why am I talking to you? 9 00:00:29,200 --> 00:00:33,730 I have a background in scientific computing and research computing, 10 00:00:33,730 --> 00:00:38,530 stretching back to the United Kingdom-- The Wellcome Trust Sanger 11 00:00:38,530 --> 00:00:43,270 Institute for the human genome-- and then more recently in the United States 12 00:00:43,270 --> 00:00:50,170 working at the Broad and other esteemed places of learning, such as Harvard. 13 00:00:50,170 --> 00:00:53,930 >> I guess what that really means is that I'm a recovering molecular 14 00:00:53,930 --> 00:00:55,740 biophysicist. 15 00:00:55,740 --> 00:01:01,250 So what right have I got to tell you about scale-out computing? 16 00:01:01,250 --> 00:01:03,570 There's a however. 17 00:01:03,570 --> 00:01:09,530 Over 18 years or so I've just seen the most dramatic increases in scale, complexity, 18 00:01:09,530 --> 00:01:13,570 and overall efficiency of computing systems. 19 00:01:13,570 --> 00:01:18,890 >> When I was doing my PhD at Oxford, I was pretty excited with a 200 megahertz 20 00:01:18,890 --> 00:01:23,830 Silicon Graphics machine with 18 gigabytes of storage and a single CPU. 21 00:01:23,830 --> 00:01:24,910 Times have changed. 22 00:01:24,910 --> 00:01:29,860 If you fast forward now, we're spinning over 60,000 CPUs here at Harvard. 23 00:01:29,860 --> 00:01:32,810 Many other organizations are spinning many more.
24 00:01:32,810 --> 00:01:37,740 >> The important takeaway from this is that scale is now not only inevitable, 25 00:01:37,740 --> 00:01:41,910 it's happened and it's going to continue to happen. 26 00:01:41,910 --> 00:01:44,760 So let's, for a moment, kind of rewind and talk very quickly 27 00:01:44,760 --> 00:01:50,530 about science, my favorite subject, the scientific method. 28 00:01:50,530 --> 00:01:53,180 >> If you are to be a scientist, you have to do a few key things. 29 00:01:53,180 --> 00:01:56,140 If you don't do these things you cannot consider yourself a scientist 30 00:01:56,140 --> 00:02:03,250 and you will struggle being able to understand your area of discipline. 31 00:02:03,250 --> 00:02:07,290 >> So first of all, you would formulate your question, you generate hypotheses, 32 00:02:07,290 --> 00:02:09,289 but more importantly, you predict your results-- 33 00:02:09,289 --> 00:02:13,090 you have a guess as to what the results will be. 34 00:02:13,090 --> 00:02:19,560 And then finally, you test your hypothesis and analyze your results. 35 00:02:19,560 --> 00:02:25,460 >> So this scientific method is extremely important in computing. 36 00:02:25,460 --> 00:02:28,450 Computing both the prediction and being able to test your results 37 00:02:28,450 --> 00:02:33,660 are a key part of what we need to do in the scientific method. 38 00:02:33,660 --> 00:02:37,310 These predictions and testings are the real two cornerstones 39 00:02:37,310 --> 00:02:42,350 of the scientific method, and each requires the most significant advances 40 00:02:42,350 --> 00:02:45,240 in modern computation. 41 00:02:45,240 --> 00:02:51,210 >> The two pillars of science are that of theory and that of experimentation. 42 00:02:51,210 --> 00:02:54,300 And more recently, computing is often mentioned 43 00:02:54,300 --> 00:02:58,090 as being the third pillar of science. 44 00:02:58,090 --> 00:03:01,440 So if you students are watching this, you have absolutely no pressure.
45 00:03:01,440 --> 00:03:03,960 46 00:03:03,960 --> 00:03:08,720 Third pillar of science-- no big deal-- computing, kind of important. 47 00:03:08,720 --> 00:03:14,000 So glad this is the computing part of computer science course 50. 48 00:03:14,000 --> 00:03:16,220 >> So enough of the background. 49 00:03:16,220 --> 00:03:20,226 I want to tell you the plan of what we're going to talk about today. 50 00:03:20,226 --> 00:03:22,870 I'm going to go over some history. 51 00:03:22,870 --> 00:03:25,250 I'm going to explain why we got here. 52 00:03:25,250 --> 00:03:27,750 I'm going to talk about some of the history of the computing 53 00:03:27,750 --> 00:03:33,890 here at Harvard, some activities around social media, 54 00:03:33,890 --> 00:03:36,200 green things-- very passionate about all things 55 00:03:36,200 --> 00:03:43,640 green-- storage-- computer storage-- how chaos affects scale-out systems, 56 00:03:43,640 --> 00:03:45,640 and distributed systems in particular. 57 00:03:45,640 --> 00:03:48,473 >> And then I'm going to touch on some of the scale-out hardware that's 58 00:03:48,473 --> 00:03:51,370 required to be able to do computing at scale. 59 00:03:51,370 --> 00:03:55,830 And then finally, we're going to wrap up with some awesome science. 60 00:03:55,830 --> 00:04:00,894 >> So, let's take a minute to look at our actual history. 61 00:04:00,894 --> 00:04:01,810 Computing has evolved. 62 00:04:01,810 --> 00:04:07,370 So since the '60s, all the way through to today, 63 00:04:07,370 --> 00:04:11,260 we've seen basically a change of scope from centralized computing 64 00:04:11,260 --> 00:04:14,679 to decentralized computing, to collaborative and then independent 65 00:04:14,679 --> 00:04:15,970 computing and right back again. 66 00:04:15,970 --> 00:04:17,709 >> And let me annotate that a little bit. 67 00:04:17,709 --> 00:04:20,370 When we first started off with computers, we had mainframes.
68 00:04:20,370 --> 00:04:22,824 They were inordinately expensive devices. 69 00:04:22,824 --> 00:04:23,990 Everything had to be shared. 70 00:04:23,990 --> 00:04:25,556 The computing was complex. 71 00:04:25,556 --> 00:04:29,060 You can see, it filled rooms and there were operators and tapes 72 00:04:29,060 --> 00:04:32,780 and all sorts of whirry, clicky, spinny devices. 73 00:04:32,780 --> 00:04:39,930 >> Around the '70s, early '80s, you started to see an impact of the VAX machines. 74 00:04:39,930 --> 00:04:43,620 So you're starting to see computing start to appear back in laboratories 75 00:04:43,620 --> 00:04:45,880 and become closer to you. 76 00:04:45,880 --> 00:04:49,800 The rise of the personal computer, certainly 77 00:04:49,800 --> 00:04:57,460 in the '80s, early part of the decade, really changed computing. 78 00:04:57,460 --> 00:04:59,570 >> And there's a clue in the title, because it 79 00:04:59,570 --> 00:05:04,080 was called the personal computer, which meant it belonged to you. 80 00:05:04,080 --> 00:05:07,630 So as the evolution of computing continued, 81 00:05:07,630 --> 00:05:10,530 people realized that their personal computer wasn't really big enough 82 00:05:10,530 --> 00:05:15,020 to be able to do anything of any merit, or significant merit, in science. 83 00:05:15,020 --> 00:05:17,790 >> And so folks started to develop network device 84 00:05:17,790 --> 00:05:21,920 drivers to be able to connect PCs together to be able to build clusters. 85 00:05:21,920 --> 00:05:26,430 And so this begat the era of the Beowulf cluster. 86 00:05:26,430 --> 00:05:32,470 Linux exploded as a response to proprietary operating systems, both cost 87 00:05:32,470 --> 00:05:33,650 and complexity.
88 00:05:33,650 --> 00:05:36,530 >> And then, here we are today, where, yet again, we're 89 00:05:36,530 --> 00:05:40,610 faced with rooms full of computer equipment and the ability 90 00:05:40,610 --> 00:05:44,570 to swipe one's credit card and get access to these computing facilities, 91 00:05:44,570 --> 00:05:45,290 remotely. 92 00:05:45,290 --> 00:05:49,680 >> And so you can then see, in terms of history impacting 93 00:05:49,680 --> 00:05:52,180 how we do computing today, it's definitely 94 00:05:52,180 --> 00:05:56,090 evolved from machine rooms full of computers 95 00:05:56,090 --> 00:05:59,160 through some personal computing all the way right back again 96 00:05:59,160 --> 00:06:02,400 to machine rooms full of computers. 97 00:06:02,400 --> 00:06:06,620 >> So this is my first cluster. 98 00:06:06,620 --> 00:06:10,170 So 2000, we built a computer system in Europe 99 00:06:10,170 --> 00:06:13,900 to effectively annotate the human genome. 100 00:06:13,900 --> 00:06:16,521 There's a lot of technology listed on the right hand side 101 00:06:16,521 --> 00:06:18,520 there that, unfortunately, is no longer with us. 102 00:06:18,520 --> 00:06:23,460 It's passed off to the great technology in the sky. 103 00:06:23,460 --> 00:06:26,610 >> The machine itself is probably the equivalent of a few decent laptops 104 00:06:26,610 --> 00:06:29,020 today, and that just kind of shows you. 105 00:06:29,020 --> 00:06:36,260 However, we did carefully annotate the human genome and protected it, 106 00:06:36,260 --> 00:06:43,190 with this particular paper in Nature, from the concerns about the data 107 00:06:43,190 --> 00:06:45,380 being public or private. 108 00:06:45,380 --> 00:06:48,610 >> So this is awesome, right? 109 00:06:48,610 --> 00:06:50,280 So we've got a human genome. 110 00:06:50,280 --> 00:06:51,510 We've done computing. 111 00:06:51,510 --> 00:06:53,400 I'm feeling very pleased with myself.
112 00:06:53,400 --> 00:06:59,090 I rolled up to Harvard in 2006, feeling a lot less pleased with myself. 113 00:06:59,090 --> 00:07:00,210 >> This is what I inherited. 114 00:07:00,210 --> 00:07:03,575 This is a departmental mail and file server. 115 00:07:03,575 --> 00:07:05,450 You can see here there's a little bit of tape 116 00:07:05,450 --> 00:07:07,710 that's used to hold the system together. 117 00:07:07,710 --> 00:07:09,890 This is our license and print server. 118 00:07:09,890 --> 00:07:13,990 I'm pretty sure there may be passwords on some of these Post-it Notes. 119 00:07:13,990 --> 00:07:16,560 120 00:07:16,560 --> 00:07:17,360 >> Not awesome. 121 00:07:17,360 --> 00:07:18,530 Pretty far from awesome. 122 00:07:18,530 --> 00:07:22,060 And so, I realized, from this little chart that I showed you at the beginning, 123 00:07:22,060 --> 00:07:25,350 from sharing to ownership back to sharing, 124 00:07:25,350 --> 00:07:27,930 that we needed to change the game. 125 00:07:27,930 --> 00:07:31,330 And so we changed the game by providing incentives. 126 00:07:31,330 --> 00:07:34,250 And so human beings, as this little Wikipedia article 127 00:07:34,250 --> 00:07:35,990 says here, are purposeful creatures. 128 00:07:35,990 --> 00:07:39,250 And the study of incentive structures is essential to the study 129 00:07:39,250 --> 00:07:41,100 of economic activity. 130 00:07:41,100 --> 00:07:44,580 >> So we started to incentivize our faculty and our researchers. 131 00:07:44,580 --> 00:07:47,720 And so we incentivized them with a really big computer system. 132 00:07:47,720 --> 00:07:52,720 So in 2008, we built a 4,096 processor machine-- 10 racks, 133 00:07:52,720 --> 00:07:54,470 couple hundred kilowatts of power. 134 00:07:54,470 --> 00:07:56,178 >> What I think is interesting is it doesn't 135 00:07:56,178 --> 00:07:58,300 matter where you are in the cycle. 136 00:07:58,300 --> 00:08:03,510 This same amount of power and compute, the power is the constant.
137 00:08:03,510 --> 00:08:06,270 It was 200 kilowatts when we were building systems in Europe. 138 00:08:06,270 --> 00:08:09,770 It's 200 kilowatts in 2008, and that 139 00:08:09,770 --> 00:08:15,820 seems to be the quantum of small university-based computing systems. 140 00:08:15,820 --> 00:08:20,540 >> So Harvard today-- fast forward, I'm no longer sad panda, quite a happy panda. 141 00:08:20,540 --> 00:08:25,860 We have 60-odd thousand load balanced CPUs, and they're climbing dramatically. 142 00:08:25,860 --> 00:08:28,780 We have 15 petabytes of storage, also climbing. 143 00:08:28,780 --> 00:08:30,720 Again, this 200 kilowatt increment, we seem 144 00:08:30,720 --> 00:08:33,000 to be adding that every six or so months. 145 00:08:33,000 --> 00:08:35,480 Lots and lots of virtual machines. 146 00:08:35,480 --> 00:08:37,620 And more importantly, about 1.8 megawatts 147 00:08:37,620 --> 00:08:39,669 of research computing equipment. 148 00:08:39,669 --> 00:08:41,820 >> And I'm going to come back to this later on, 149 00:08:41,820 --> 00:08:46,913 as to why I now no longer necessarily count how much CPU we have, 150 00:08:46,913 --> 00:08:48,980 but how big is the electricity bill. 151 00:08:48,980 --> 00:08:52,690 20 or so dedicated research computing staff. 152 00:08:52,690 --> 00:08:57,250 And more importantly, we're starting to grow our GPGPUs. 153 00:08:57,250 --> 00:09:05,030 I was staggered at how much of this is being added on a day-to-day basis. 154 00:09:05,030 --> 00:09:07,310 So, history lesson over, right? 155 00:09:07,310 --> 00:09:11,280 >> So how do we get there from here? 156 00:09:11,280 --> 00:09:14,560 Let's look at some modern scale-out compute examples. 157 00:09:14,560 --> 00:09:18,290 158 00:09:18,290 --> 00:09:23,230 I'm a little bit obsessed with the size and scale of social media.
159 00:09:23,230 --> 00:09:30,850 There are a number of extremely successful large scale computing 160 00:09:30,850 --> 00:09:34,820 organizations now on the planet, providing support and services 161 00:09:34,820 --> 00:09:36,810 to us all. 162 00:09:36,810 --> 00:09:39,340 So that's the disclaimer. 163 00:09:39,340 --> 00:09:42,990 >> And I want to start with a number of ounces in an Instagram. 164 00:09:42,990 --> 00:09:48,336 It's not actually a lead-in to a joke, it's 165 00:09:48,336 --> 00:09:50,460 not even that funny, actually, come to think of it. 166 00:09:50,460 --> 00:09:52,751 But anyway, we're going to look at ounces in Instagram. 167 00:09:52,751 --> 00:09:55,260 And we're going to start with "My bee and a flower." 168 00:09:55,260 --> 00:09:57,600 I was at [INAUDIBLE] Village and I took a little picture 169 00:09:57,600 --> 00:10:00,460 of a bee sitting on a flower. 170 00:10:00,460 --> 00:10:03,270 And then I started to think about what does this actually mean. 171 00:10:03,270 --> 00:10:07,013 And I took this picture off my phone and counted how many bytes are in it, 172 00:10:07,013 --> 00:10:09,070 and it's about 256 kilobytes. 173 00:10:09,070 --> 00:10:13,550 Which when I started, would basically fill a 5 and 1/4 inch floppy. 174 00:10:13,550 --> 00:10:15,340 And started to think, well, that's cool. 175 00:10:15,340 --> 00:10:18,630 >> And I started to look and do some research on the network. 176 00:10:18,630 --> 00:10:22,490 And I found out that Instagram has 200 million MAUs. 177 00:10:22,490 --> 00:10:25,105 I wasn't actually that sure what a MAU was. 178 00:10:25,105 --> 00:10:28,960 And a MAU, down here, is a monthly active user. 179 00:10:28,960 --> 00:10:34,270 >> So, 200 million MAUs-- pretty cool. 180 00:10:34,270 --> 00:10:38,190 20 billion photographs-- so quite a lot of photographs. 
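As a rough check on the two figures just quoted (about 256 kilobytes per photo and 20 billion photographs), here is a minimal sketch of the storage arithmetic. It assumes decimal units (1 kilobyte = 1,000 bytes) and that every photograph is about the size of the bee picture; neither assumption comes from the talk, and the function name is made up for illustration.

```python
# Back-of-the-envelope storage estimate from the figures quoted in the talk.
# Assumptions (not from the talk): decimal units, and every photo is
# roughly the same size as the 256 KB bee picture.

BYTES_PER_KB = 1_000
BYTES_PER_PB = 10**15


def total_storage_pb(photo_count: int, kb_per_photo: float) -> float:
    """Return the raw storage needed, in petabytes."""
    total_bytes = photo_count * kb_per_photo * BYTES_PER_KB
    return total_bytes / BYTES_PER_PB


if __name__ == "__main__":
    # 20 billion photographs at roughly 256 KB each.
    print(f"{total_storage_pb(20_000_000_000, 256):.2f} PB")
```

Under those assumptions the estimate comes out at a little over five petabytes of raw photo data, before any replication or metadata.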
181 00:10:38,190 --> 00:10:42,300 60 million new photographs each and every day 182 00:10:42,300 --> 00:10:46,990 coming out at about .002 gig per photo. 183 00:10:46,990 --> 00:10:51,290 That's about five petabytes of disk just right there. 184 00:10:51,290 --> 00:10:55,480 And that's really not the central part of what we're going to talk about. 185 00:10:55,480 --> 00:10:57,830 That is small potatoes. 186 00:10:57,830 --> 00:11:00,710 Or as we say in England, tiny spuds. 187 00:11:00,710 --> 00:11:05,050 >> So let's look at the real elephant in the room-- unique faces. 188 00:11:05,050 --> 00:11:09,170 Again, let's measure in this new quanta called a MAU. 189 00:11:09,170 --> 00:11:13,260 Facebook itself has 1.3 billion MAUs. 190 00:11:13,260 --> 00:11:17,510 WhatsApp, which I hadn't even heard of until recently, it's 191 00:11:17,510 --> 00:11:23,260 some sort of messaging service, is 500 million MAUs. 192 00:11:23,260 --> 00:11:26,620 Instagram, which we just talked about, 200 million MAUs. 193 00:11:26,620 --> 00:11:29,370 And Messenger, which is another messaging service, 194 00:11:29,370 --> 00:11:31,120 is also 200 million MAUs. 195 00:11:31,120 --> 00:11:35,920 >> So total that up, it's about 2.2 billion total users. 196 00:11:35,920 --> 00:11:39,880 Clearly there's some overlap, but that's equivalent to a third of the planet. 197 00:11:39,880 --> 00:11:44,270 And they send something in the region of 12 billion messages a day. 198 00:11:44,270 --> 00:11:46,680 And again, there's only 7 billion people on the planet. 199 00:11:46,680 --> 00:11:48,550 Not everyone has a smartphone. 200 00:11:48,550 --> 00:11:53,960 So these are insane numbers. 201 00:11:53,960 --> 00:12:02,050 >> And I'm going to argue that it's not even about the storage or the compute. 202 00:12:02,050 --> 00:12:05,610 And to quote the song, it's all about that graph. 203 00:12:05,610 --> 00:12:09,045 Here's our lovely Meghan Trainor down here, singing about all the bass.
204 00:12:09,045 --> 00:12:12,570 Note, she also has quite a bit of bass herself-- 207, 205 00:12:12,570 --> 00:12:16,460 well 218 million people have seen this young lady singing her song. 206 00:12:16,460 --> 00:12:19,910 >> So my argument is it's all about the graph. 207 00:12:19,910 --> 00:12:23,480 So we took some open source software and started to look at a graph. 208 00:12:23,480 --> 00:12:27,740 And this is LinkedIn, so this is a Facebook for old people. 209 00:12:27,740 --> 00:12:29,910 And so, this is my LinkedIn graph. 210 00:12:29,910 --> 00:12:34,080 I have 1,200 or so nodes, so-called "Friends." 211 00:12:34,080 --> 00:12:36,360 And here's me at the top. 212 00:12:36,360 --> 00:12:38,140 And here's all of the interconnections. 213 00:12:38,140 --> 00:12:40,570 >> Now, think back to the Instagram story. 214 00:12:40,570 --> 00:12:42,815 Each one of these is not just the photo, it 215 00:12:42,815 --> 00:12:46,860 has a whole plethora of connections between this particular individual 216 00:12:46,860 --> 00:12:48,220 and many others. 217 00:12:48,220 --> 00:12:52,190 This central piece is either a bug in the graph drawing algorithm, 218 00:12:52,190 --> 00:12:55,982 or this may be David Malan, I'm not sure yet. 219 00:12:55,982 --> 00:12:57,690 So you can redraw the graphs in all sorts 220 00:12:57,690 --> 00:13:02,510 of ways-- gephi.github.io is where you can pull that software from. 221 00:13:02,510 --> 00:13:05,410 It's really cool for being able to organize communities. 222 00:13:05,410 --> 00:13:08,640 You can see here, this is Harvard and various other places that I've worked, 223 00:13:08,640 --> 00:13:12,160 because this is my work-related data. 224 00:13:12,160 --> 00:13:15,080 >> So just think about the complexity of the graph and all of the data 225 00:13:15,080 --> 00:13:17,070 that you pull along with. 226 00:13:17,070 --> 00:13:20,870 So meanwhile, back at FriendFace, right?
227 00:13:20,870 --> 00:13:24,360 We looked at the Instagram data that was of the order of five petabytes. 228 00:13:24,360 --> 00:13:25,300 No big deal. 229 00:13:25,300 --> 00:13:28,830 Still quite a lot of data, but no big deal in the greater scheme of things. 230 00:13:28,830 --> 00:13:33,850 >> From this article on the old internet, "Scaling the Facebook data warehouse 231 00:13:33,850 --> 00:13:36,250 to 300 petabytes." 232 00:13:36,250 --> 00:13:38,110 That's a whole different game changer now, 233 00:13:38,110 --> 00:13:40,234 when you're starting to think of data and the graph 234 00:13:40,234 --> 00:13:41,690 and what you bring along with. 235 00:13:41,690 --> 00:13:47,480 And their Hive data is growing of the order of 600 terabytes a day. 236 00:13:47,480 --> 00:13:52,980 >> Now, you know, well, then-- I mean, 600 terabytes a day, 237 00:13:52,980 --> 00:13:55,670 300 petabytes-- they're also now starting 238 00:13:55,670 --> 00:13:58,550 to get very concerned about how to keep this stuff 239 00:13:58,550 --> 00:14:01,160 and to make sure this data stays around. 240 00:14:01,160 --> 00:14:04,630 And this gentleman here, Jay Parikh, is looking 241 00:14:04,630 --> 00:14:08,250 at how to store an exabyte of data. 242 00:14:08,250 --> 00:14:10,180 >> Just for those of you who are watching along 243 00:14:10,180 --> 00:14:13,940 at home, an exabyte-- 10 to the 18. 244 00:14:13,940 --> 00:14:18,210 It's got its own Wikipedia page, it's that big of a number. 245 00:14:18,210 --> 00:14:23,120 That is the size and scale of what we're looking at, to be able to store data. 246 00:14:23,120 --> 00:14:27,090 And these guys aren't mucking around, they're storing that amount of data. 247 00:14:27,090 --> 00:14:29,550 So one of the clues that they're looking at here 248 00:14:29,550 --> 00:14:32,185 is data centers for so-called cold storage. 249 00:14:32,185 --> 00:14:35,020 250 00:14:35,020 --> 00:14:36,470 >> Which brings me to being green.
251 00:14:36,470 --> 00:14:38,340 And here is Kermit. 252 00:14:38,340 --> 00:14:43,050 He and I agree-- it's extremely difficult to be green, 253 00:14:43,050 --> 00:14:44,920 but we give it our best try. 254 00:14:44,920 --> 00:14:47,430 Kermit can't help it, he has to be green all the time, 255 00:14:47,430 --> 00:14:49,945 can't take his greenness off at all. 256 00:14:49,945 --> 00:14:55,410 >> So, green concepts-- a few kind of core concepts 257 00:14:55,410 --> 00:14:59,510 of greenness, when it relates to computing. 258 00:14:59,510 --> 00:15:05,510 The one that is the most important is the longevity of the product. 259 00:15:05,510 --> 00:15:09,405 If your product has a short lifetime, you cannot, by definition, be green. 260 00:15:09,405 --> 00:15:13,280 The energy taken to manufacture a disk drive, a motherboard, a computer 261 00:15:13,280 --> 00:15:17,890 system, a tablet, whatever it may be, the longevity of your systems 262 00:15:17,890 --> 00:15:21,700 is a key part of how green you can be. 263 00:15:21,700 --> 00:15:27,960 >> The important part, as all of you are building software algorithms-- 264 00:15:27,960 --> 00:15:30,455 algorithm's a posh word for software, right? 265 00:15:30,455 --> 00:15:34,000 So, your algorithm design is absolutely critical in terms 266 00:15:34,000 --> 00:15:43,080 of how you are going to be able to make quick and accurate computations to use 267 00:15:43,080 --> 00:15:44,710 the least amount of energy possible. 268 00:15:44,710 --> 00:15:47,280 And I'll get to this in a little bit. 269 00:15:47,280 --> 00:15:51,270 >> Data center design-- you've seen that we already have thousands 270 00:15:51,270 --> 00:15:54,870 upon thousands of machines, sitting quietly in small, dark corners 271 00:15:54,870 --> 00:15:57,760 of the world, computing. 272 00:15:57,760 --> 00:16:01,670 Resource allocation-- how to get to the compute, to the storage, 273 00:16:01,670 --> 00:16:03,840 through the network.
274 00:16:03,840 --> 00:16:08,530 Operating systems are a key part of this, and a lot of virtualization 275 00:16:08,530 --> 00:16:12,080 to be able to pack more and more compute into a small space. 276 00:16:12,080 --> 00:16:15,530 >> I'll give you a small example from research computing. 277 00:16:15,530 --> 00:16:18,220 We needed more ping, more power, and more pipe. 278 00:16:18,220 --> 00:16:21,030 We needed more bigger, better, faster computers, 279 00:16:21,030 --> 00:16:23,390 and needed to use less juice. 280 00:16:23,390 --> 00:16:26,856 And we couldn't work out how to do this. 281 00:16:26,856 --> 00:16:29,980 I don't know if the hashtag gowest has probably been used by the Kardashians, 282 00:16:29,980 --> 00:16:32,560 but anyway, gowest. 283 00:16:32,560 --> 00:16:33,220 And we did. 284 00:16:33,220 --> 00:16:36,610 >> We picked up our operation and we moved it out 285 00:16:36,610 --> 00:16:39,660 to Western Massachusetts in a small mill town 286 00:16:39,660 --> 00:16:45,000 called Holyoke, just north of Chicopee and Springfield. 287 00:16:45,000 --> 00:16:49,280 We did this for a couple of reasons. 288 00:16:49,280 --> 00:16:55,150 The main one was that we had a very, very large dam. 289 00:16:55,150 --> 00:17:00,080 And this very large dam is able to put out 30 plus megawatts of energy, 290 00:17:00,080 --> 00:17:02,980 and it was underutilized at the time. 291 00:17:02,980 --> 00:17:06,170 >> More importantly, we also had a very complicated network 292 00:17:06,170 --> 00:17:07,254 that was already in place. 293 00:17:07,254 --> 00:17:09,711 If you look at where the network goes in the United States, 294 00:17:09,711 --> 00:17:11,230 it follows all the train tracks. 295 00:17:11,230 --> 00:17:14,290 This particular piece of network was owned by our colleagues and friends 296 00:17:14,290 --> 00:17:16,480 at Massachusetts Institute of Technology, 297 00:17:16,480 --> 00:17:19,720 and it was basically built all the way out to Route 90.
298 00:17:19,720 --> 00:17:24,760 >> So we had a large river tick, Route 90 tick, we had a short path of 100 miles, 299 00:17:24,760 --> 00:17:26,960 and a long path of about 1,000 miles. 300 00:17:26,960 --> 00:17:29,890 We did have to do a very large network splice, as you can see here, 301 00:17:29,890 --> 00:17:32,990 to basically put a link in, to be able to connect to Holyoke, 302 00:17:32,990 --> 00:17:36,390 but we had all of the requisite infrastructure-- ping, power, pipe. 303 00:17:36,390 --> 00:17:37,280 Life was good. 304 00:17:37,280 --> 00:17:38,980 And again, big dam. 305 00:17:38,980 --> 00:17:42,120 >> So we built basically the Massachusetts Green High Performance Computing 306 00:17:42,120 --> 00:17:42,850 Center. 307 00:17:42,850 --> 00:17:46,580 This was a labor of love through five universities-- MIT, Harvard, UMass, 308 00:17:46,580 --> 00:17:47,870 Northeastern, and BU. 309 00:17:47,870 --> 00:17:49,554 Five megawatt day one connected load. 310 00:17:49,554 --> 00:17:51,845 We did all sorts of cleverness with airside economizers 311 00:17:51,845 --> 00:17:53,585 to keep things green. 312 00:17:53,585 --> 00:18:03,330 And we built out 640-odd racks, dedicated for research computing. 313 00:18:03,330 --> 00:18:08,770 >> It was an old brownfield site, so we had some reclamation and some tidy-up 314 00:18:08,770 --> 00:18:10,500 and some clean-up of the site. 315 00:18:10,500 --> 00:18:13,590 And then we started to build the facility 316 00:18:13,590 --> 00:18:19,710 and, boom-- lovely facility with the ability to run sandbox computing, 317 00:18:19,710 --> 00:18:24,430 to have conferences and seminars, and also a massive data center floor. 318 00:18:24,430 --> 00:18:26,007 >> Here is my good self. 319 00:18:26,007 --> 00:18:27,590 I'm obviously wearing the same jacket. 
320 00:18:27,590 --> 00:18:29,423 I may only have one jacket, but there's me 321 00:18:29,423 --> 00:18:34,030 and John Goodhue-- he's the executive director of the Center-- 322 00:18:34,030 --> 00:18:36,740 standing on the machine room floor, which, as you can see, 323 00:18:36,740 --> 00:18:40,560 is pretty dramatic, and it goes back a long, long way. 324 00:18:40,560 --> 00:18:44,830 >> I often play games driving from Boston out to Holyoke, 325 00:18:44,830 --> 00:18:47,260 pretending that I'm a TCP/IP packet. 326 00:18:47,260 --> 00:18:54,290 And I do worry about my latency driving around in my car. 327 00:18:54,290 --> 00:18:56,690 So that's the green piece. 328 00:18:56,690 --> 00:19:00,070 So let's just take a minute and think about stacks. 329 00:19:00,070 --> 00:19:04,060 So we're trying very carefully to build data centers efficiently, 330 00:19:04,060 --> 00:19:08,770 computing efficiently, make good selection for the computing equipment 331 00:19:08,770 --> 00:19:12,060 and deliver, more importantly, our application, 332 00:19:12,060 --> 00:19:17,860 be it a messaging service or a scientific application. 333 00:19:17,860 --> 00:19:19,110 >> So here are the stacks. 334 00:19:19,110 --> 00:19:22,762 So physical layer, all the way up through application-- 335 00:19:22,762 --> 00:19:25,220 hoping that this is going to be a good part of your course. 336 00:19:25,220 --> 00:19:31,450 OSI seven layer model is basically, you will live, eat, and breathe 337 00:19:31,450 --> 00:19:35,270 this throughout your computing careers. 338 00:19:35,270 --> 00:19:37,800 This whole concept of physical infrastructure-- wires, 339 00:19:37,800 --> 00:19:40,080 cables, data centers, links. 340 00:19:40,080 --> 00:19:42,190 And this is just describing the network.
341 00:19:42,190 --> 00:19:44,780 >> Up here is, well, obviously, this is an old slide, 342 00:19:44,780 --> 00:19:49,342 because this should say HTTP, because nobody cares about simple mail 343 00:19:49,342 --> 00:19:50,550 transfer protocols, anymore. 344 00:19:50,550 --> 00:19:53,960 It's all happening in the HTTP space. 345 00:19:53,960 --> 00:19:55,850 So that's one level of stack. 346 00:19:55,850 --> 00:19:59,460 >> Here's another set of stacks, where you have a server, a host, a hypervisor, 347 00:19:59,460 --> 00:20:02,470 a guest, binary library, and then your application. 348 00:20:02,470 --> 00:20:06,070 Or, in this case, the device driver, a Linux kernel, native C, 349 00:20:06,070 --> 00:20:08,080 Java virtual machine, Java API, then Java 350 00:20:08,080 --> 00:20:11,220 applications, and so on and so forth. 351 00:20:11,220 --> 00:20:14,090 This is a description of a virtual machine. 352 00:20:14,090 --> 00:20:15,450 >> Holy stacks, Batman! 353 00:20:15,450 --> 00:20:18,260 Think about this in terms of how much compute 354 00:20:18,260 --> 00:20:20,850 you need to get from what's happening here, 355 00:20:20,850 --> 00:20:23,110 all the way up to the top of this stack, to then 356 00:20:23,110 --> 00:20:26,840 be able to do your actual delivery of the application. 357 00:20:26,840 --> 00:20:29,130 >> And if you kind of rewind and start to think 358 00:20:29,130 --> 00:20:33,450 about what it takes to provide a floating point operation, 359 00:20:33,450 --> 00:20:37,650 your floating point rate is a product of the sockets, the number of cores 360 00:20:37,650 --> 00:20:44,490 in the socket, a clock, which is how fast can the clock turn over-- 361 00:20:44,490 --> 00:20:47,490 four gigahertz, two gigahertz-- and then the number 362 00:20:47,490 --> 00:20:50,890 of operations you can do in a given hertz. 363 00:20:50,890 --> 00:20:54,350 >> So those microprocessors today do between four and six FLOPs 364 00:20:54,350 --> 00:20:55,400 per clock cycle.
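The four terms just listed (sockets, cores per socket, clock rate, and operations per clock cycle) multiply together to give a theoretical peak. A minimal sketch of that arithmetic follows; the function and parameter names are made up for illustration, and the example values (one socket, one core, 2.5 GHz, four operations per cycle) are chosen only to make the numbers easy to follow.

```python
# Theoretical peak floating-point rate as the product of the four terms
# named in the talk. Function and parameter names are illustrative only.


def peak_flops(sockets: int, cores_per_socket: int,
               clock_hz: float, flops_per_cycle: int) -> float:
    """Return the theoretical peak in floating-point operations per second."""
    return sockets * cores_per_socket * clock_hz * flops_per_cycle


if __name__ == "__main__":
    # One socket, one core, 2.5 GHz, 4 FLOPs per clock cycle.
    peak = peak_flops(1, 1, 2.5e9, 4)
    print(f"{peak / 1e9:.1f} GFLOPS")
```

This is a ceiling rather than a measurement: real applications rarely sustain the peak, because memory, network, and every layer of the stacks above get in the way.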
365 00:20:55,400 --> 00:20:59,810 And so a single-core 2.5 gig clock has a theoretical performance 366 00:20:59,810 --> 00:21:03,490 of about 10 gigaFLOPS, give or take. 367 00:21:03,490 --> 00:21:05,940 >> But, as with everything, we have choices. 368 00:21:05,940 --> 00:21:12,280 So an Intel Core 2, Nehalem, Sandy Bridge, Haswell, AMD, 369 00:21:12,280 --> 00:21:13,920 take your choices-- Intel Atom. 370 00:21:13,920 --> 00:21:17,670 All of these processor architectures all have a slightly different way 371 00:21:17,670 --> 00:21:19,650 of being able to add two numbers together, 372 00:21:19,650 --> 00:21:23,520 which is basically their purpose in life. 373 00:21:23,520 --> 00:21:24,535 Must be tough. 374 00:21:24,535 --> 00:21:27,100 There's millions of them sitting in data centers, now though. 375 00:21:27,100 --> 00:21:30,410 >> So, FLOPs per watt-- this is the big thing. 376 00:21:30,410 --> 00:21:37,780 So if I want to get more of this to get through this stack, faster, 377 00:21:37,780 --> 00:21:41,800 I've got to work on how many floating point operations a second 378 00:21:41,800 --> 00:21:43,770 I can do in a given watt. 379 00:21:43,770 --> 00:21:46,160 And fortunately, folks have thought about this. 380 00:21:46,160 --> 00:21:49,140 >> So there's a large contest every year to see 381 00:21:49,140 --> 00:21:52,310 who can build the fastest computer that can diagonalize a matrix. 382 00:21:52,310 --> 00:21:53,980 It's called the Top 500. 383 00:21:53,980 --> 00:21:56,420 They pick the top from the best 500 computers 384 00:21:56,420 --> 00:21:58,610 on the planet that can diagonalize matrices. 385 00:21:58,610 --> 00:22:00,760 And you get some amazing results. 386 00:22:00,760 --> 00:22:04,660 >> A lot of those machines are between 10 and 20 megawatts. 387 00:22:04,660 --> 00:22:09,380 They can diagonalize matrices inordinately quickly.
388 00:22:09,380 --> 00:22:13,550 They don't necessarily diagonalize them as efficiently per watt, 389 00:22:13,550 --> 00:22:18,060 so there was this big push to look at what a Green 500 list would look like. 390 00:22:18,060 --> 00:22:20,360 And here is the list from June. 391 00:22:20,360 --> 00:22:22,410 There should be a new one very shortly. 392 00:22:22,410 --> 00:22:26,590 >> And it calls out-- I'll take the top of this particular list. 393 00:22:26,590 --> 00:22:32,187 There are two specific machines-- one from the Tokyo Institute of Technology 394 00:22:32,187 --> 00:22:34,520 and one from Cambridge University in the United Kingdom. 395 00:22:34,520 --> 00:22:37,700 And these have pretty staggering megaFLOPS per watt ratios. 396 00:22:37,700 --> 00:22:42,620 This one's 4,389, and the next one down is 3,631. 397 00:22:42,620 --> 00:22:47,660 >> I'll explain the difference between these two in the next slide. 398 00:22:47,660 --> 00:22:51,320 But these are moderately sized test clusters. 399 00:22:51,320 --> 00:22:54,732 These are just 34 kilowatts or 52 kilowatts. 400 00:22:54,732 --> 00:22:56,940 There are some larger ones here-- this particular one 401 00:22:56,940 --> 00:22:58,860 at the Swiss National Supercomputing Centre. 402 00:22:58,860 --> 00:23:00,693 The take-home message for this is that we're 403 00:23:00,693 --> 00:23:04,270 trying to find computers that can operate efficiently. 404 00:23:04,270 --> 00:23:09,860 >> And so, let's look at this top one, cutely called the KFC. 405 00:23:09,860 --> 00:23:12,960 And a little bit of advertising here. 406 00:23:12,960 --> 00:23:15,730 This particular food company has nothing to do with this. 407 00:23:15,730 --> 00:23:18,240 It's the fact that this particular system 408 00:23:18,240 --> 00:23:23,830 is soaked in a very clever oil-based compound.
409 00:23:23,830 --> 00:23:27,590 And so they got their chicken fryer moniker 410 00:23:27,590 --> 00:23:30,040 when they first started to build these types of systems. 411 00:23:30,040 --> 00:23:32,740 >> But basically what they've taken here is a number of blades, 412 00:23:32,740 --> 00:23:37,560 put them in this sophisticated mineral oil, 413 00:23:37,560 --> 00:23:40,979 and then worked out how to get all the networking in and out of it. 414 00:23:40,979 --> 00:23:42,895 Then, not only that, they've put it outside so 415 00:23:42,895 --> 00:23:46,095 that it can exploit outside air cooling. 416 00:23:46,095 --> 00:23:47,520 It was pretty impressive. 417 00:23:47,520 --> 00:23:49,630 So you have to do all of this shenanigans 418 00:23:49,630 --> 00:23:53,280 to be able to get this amount of compute delivered for small wattage. 419 00:23:53,280 --> 00:23:57,360 >> And you can see this is the shape of where things are heading. 420 00:23:57,360 --> 00:24:01,240 The challenge is that regular air cooling is the economy of scale 421 00:24:01,240 --> 00:24:08,459 and is driving a lot of the development of both regular computing, 422 00:24:08,459 --> 00:24:09,750 and high performance computing. 423 00:24:09,750 --> 00:24:11,080 So, this is pretty disruptive. 424 00:24:11,080 --> 00:24:13,280 I think this is fascinating. 425 00:24:13,280 --> 00:24:15,530 It's a bit messy when you try to swap the disk drives, 426 00:24:15,530 --> 00:24:18,090 but it's a really cool idea. 427 00:24:18,090 --> 00:24:22,200 >> So not only that, there's a whole bunch of work 428 00:24:22,200 --> 00:24:25,450 being built around what we're calling the Open Compute Project. 429 00:24:25,450 --> 00:24:29,400 And so, more about that a little bit later. 430 00:24:29,400 --> 00:24:32,740 But the industry's starting to realize that the FLOPs per watt 431 00:24:32,740 --> 00:24:33,670 is becoming important. 
432 00:24:33,670 --> 00:24:39,256 And you, as folks here, as you design your algorithms 433 00:24:39,256 --> 00:24:41,130 and you design your code, you should be aware 434 00:24:41,130 --> 00:24:43,620 that your code can have a knock-on effect. 435 00:24:43,620 --> 00:24:48,380 >> When Mark was sitting here in his dorm room writing Facebook 1.0, 436 00:24:48,380 --> 00:24:51,050 I'm pretty sure he had a view that it was going to be huge. 437 00:24:51,050 --> 00:24:54,945 But how huge it would be on the environment is a big dealio. 438 00:24:54,945 --> 00:24:58,340 And so all of y'all could come up with algorithms 439 00:24:58,340 --> 00:25:01,370 that could be the next challenging thing for folks like me, 440 00:25:01,370 --> 00:25:02,700 trying to run systems. 441 00:25:02,700 --> 00:25:07,360 >> So let's just think about real world power limits. 442 00:25:07,360 --> 00:25:09,930 This paper by Landauer is not a new thing. 443 00:25:09,930 --> 00:25:12,480 It was published in 1961 in the IBM Journal. 444 00:25:12,480 --> 00:25:15,590 This is the canonical "Irreversibility and Heat 445 00:25:15,590 --> 00:25:17,630 Generation in the Computing Process." 446 00:25:17,630 --> 00:25:22,050 And so he argued that machines inevitably 447 00:25:22,050 --> 00:25:25,070 perform logical functions that don't have a single-valued inverse. 448 00:25:25,070 --> 00:25:29,130 >> So the whole point of this is that back in the '60s, 449 00:25:29,130 --> 00:25:31,890 folks knew that this was going to be a problem. 450 00:25:31,890 --> 00:25:37,080 And so the law of limits said that at 25 degrees C, a sort of canonical room 451 00:25:37,080 --> 00:25:41,120 temperature, the limit represents roughly 0.018 electron volts. 452 00:25:41,120 --> 00:25:44,920 But theoretically, this is the theory, computer memory 453 00:25:44,920 --> 00:25:51,410 operating at this limit could be changed at one billion bits a second.
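For anyone who wants to check the numbers behind that limit, here is a minimal worked sketch. It assumes the usual statement of Landauer's bound, an energy of k·T·ln 2 per irreversible bit operation, and uses the SI values of the Boltzmann constant and the electron volt; the function and variable names are made up for illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann constant, joules per kelvin
JOULES_PER_EV = 1.602176634e-19    # one electron volt in joules


def landauer_limit_joules(temp_kelvin: float) -> float:
    """Minimum energy to erase one bit at a given temperature: k * T * ln 2."""
    return BOLTZMANN_J_PER_K * temp_kelvin * math.log(2)


room_temp_k = 25.0 + 273.15                 # 25 degrees C in kelvin
energy_j = landauer_limit_joules(room_temp_k)
energy_ev = energy_j / JOULES_PER_EV
power_w = energy_j * 1e9                    # one billion bit changes a second

print(f"{energy_j:.3e} J per bit")
print(f"{energy_ev:.4f} eV per bit")
print(f"{power_w:.2e} W at one billion bits a second")
```

Under these assumptions the power comes out at a few trillionths of a watt, many orders of magnitude below what real memory draws.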
454 00:25:51,410 --> 00:25:54,620 >> I don't know about you, but I've not come across many one billion bits 455 00:25:54,620 --> 00:25:57,190 a second data rate exchanges. 456 00:25:57,190 --> 00:26:01,360 The argument there was that only 2.8 trillionths of a watt of power 457 00:26:01,360 --> 00:26:03,180 ought to ever be expended. 458 00:26:03,180 --> 00:26:08,160 >> All right, real world example-- this is my electric bill. 459 00:26:08,160 --> 00:26:10,347 I'm 65% of that lovely data center 460 00:26:10,347 --> 00:26:11,930 I showed you, at this particular time. 461 00:26:11,930 --> 00:26:15,520 This is back in June last year. 462 00:26:15,520 --> 00:26:21,300 I've taken an older version so that we can sort of anonymize it a little bit. 463 00:26:21,300 --> 00:26:25,470 I was spending $45,000 a month for energy there. 464 00:26:25,470 --> 00:26:34,990 >> So the reason for that is that we have over 50,000 processors in the room. 465 00:26:34,990 --> 00:26:38,110 So could you imagine your own residential electricity bill 466 00:26:38,110 --> 00:26:39,540 being that high? 467 00:26:39,540 --> 00:26:46,180 But it was for 199 million watt-hours over a month. 468 00:26:46,180 --> 00:26:51,670 >> So the question I pose is, can you imagine Mr. Zuckerberg's electric bill? 469 00:26:51,670 --> 00:26:54,730 Mine is pretty big, and I struggle. 470 00:26:54,730 --> 00:26:56,600 And I'm not alone in this. 471 00:26:56,600 --> 00:26:59,450 There are a lot of people with big data centers. 472 00:26:59,450 --> 00:27:04,800 And so, I guess, full disclosure-- my Facebook friends are a little bit odd. 473 00:27:04,800 --> 00:27:07,900 >> So my Facebook friend is the Prineville data center, 474 00:27:07,900 --> 00:27:14,030 which is one of Facebook's largest, newest, lowest-energy data centers.
475 00:27:14,030 --> 00:27:19,360 And they post to me things like power usage effectiveness, 476 00:27:19,360 --> 00:27:24,020 as in how effective is the data center versus how much energy you're 477 00:27:24,020 --> 00:27:26,370 putting into it, how much water are they using, what's 478 00:27:26,370 --> 00:27:27,810 the humidity and temperature. 479 00:27:27,810 --> 00:27:29,980 >> And they have these lovely, lovely plots. 480 00:27:29,980 --> 00:27:32,600 I think this is an awesome Facebook page, 481 00:27:32,600 --> 00:27:35,400 but I guess I'm a little bit weird. 482 00:27:35,400 --> 00:27:39,930 >> So one more power thing: the research computing that I do 483 00:27:39,930 --> 00:27:44,060 is significantly different to what Facebook and Yahoo and Google 484 00:27:44,060 --> 00:27:50,020 and other on-demand, fully, always available services do. 485 00:27:50,020 --> 00:27:53,530 And so I have the advantage that when ISO New England-- and ISO New England 486 00:27:53,530 --> 00:27:58,910 helps set the energy rates for the region. 487 00:27:58,910 --> 00:28:01,110 >> And it says it's extending a request to consumers 488 00:28:01,110 --> 00:28:05,870 to voluntarily conserve energy, because of the high heat and humidity. 489 00:28:05,870 --> 00:28:08,680 And this was back on the 18th of July. 490 00:28:08,680 --> 00:28:12,600 And so I happily tweet back, Hey, ISO New England, Green Harvard. 491 00:28:12,600 --> 00:28:14,880 We're doing our part over here in research computing. 492 00:28:14,880 --> 00:28:16,760 And this is because we're doing science. 493 00:28:16,760 --> 00:28:20,380 >> And as much as people say science never sleeps, science can wait. 494 00:28:20,380 --> 00:28:25,030 So we are able to quiesce our systems, take advantage of great rates 495 00:28:25,030 --> 00:28:30,550 on our energy bill, and help the entire New England 496 00:28:30,550 --> 00:28:35,910 region by shedding many megawatts of load.
497 00:28:35,910 --> 00:28:40,020 So that's the unique thing that differs between scientific computing data 498 00:28:40,020 --> 00:28:48,890 centers and those that are in full production 24/7. 499 00:28:48,890 --> 00:28:51,670 >> So let's just change gear here. 500 00:28:51,670 --> 00:28:55,170 So, I want to discuss chaos a little bit. 501 00:28:55,170 --> 00:28:59,900 And I want to put it under the auspices of storage. 502 00:28:59,900 --> 00:29:03,150 So for those that kind of were struggling 503 00:29:03,150 --> 00:29:08,680 getting their head around what petabytes of storage look like, this is an example. 504 00:29:08,680 --> 00:29:11,660 And this is the sort of stuff I deal with all the time. 505 00:29:11,660 --> 00:29:15,550 >> Each one of these little fellas is a four terabyte hard drive, 506 00:29:15,550 --> 00:29:17,420 so you can kind of count them up. 507 00:29:17,420 --> 00:29:21,370 We're getting now between 1 and 1 1/2 petabytes 508 00:29:21,370 --> 00:29:22,970 in a standard industry rack. 509 00:29:22,970 --> 00:29:26,430 And we have rooms and rooms, as you saw in that earlier picture with John 510 00:29:26,430 --> 00:29:31,230 and me, full of these racks of equipment. 511 00:29:31,230 --> 00:29:40,400 So it's becoming very, very easy to build massive storage arrays. 512 00:29:40,400 --> 00:29:44,140 >> It's mostly easy inside of Unix to kind of count up how things are going. 513 00:29:44,140 --> 00:29:48,270 So this is counting how many mount points I have got there. 514 00:29:48,270 --> 00:29:50,880 So that's 423 mount points. 515 00:29:50,880 --> 00:29:55,660 And then if I run some sketchy awk, I can add up, in this particular system, 516 00:29:55,660 --> 00:29:59,080 there were 7.3 petabytes of available storage. 517 00:29:59,080 --> 00:30:01,350 >> So that's a lot of stuff. 518 00:30:01,350 --> 00:30:03,030 And storage is really hard. 519 00:30:03,030 --> 00:30:06,850 And yet, for some reason, this is an industry trend.
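The counting and adding-up step mentioned above can be sketched as follows. This is not the command from the slide; it is a small Python stand-in that assumes POSIX `df -P -k` style output (six whitespace-separated columns, sizes in 1024-byte blocks). The sample text, the `/n/` prefix, and the server names are hypothetical.

```python
def total_available_kib(df_output: str, prefix: str = "/") -> tuple[int, int]:
    """Return (mount point count, summed 'Available' KiB) from `df -P -k` text.

    Assumes mount points contain no spaces, so each row splits into six fields.
    """
    count = 0
    total = 0
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore malformed or wrapped rows
        available_kib, mount_point = int(fields[3]), fields[5]
        if mount_point.startswith(prefix):
            count += 1
            total += available_kib
    return count, total


# Hypothetical sample in the POSIX portable output format.
SAMPLE = """\
Filesystem 1024-blocks Used Available Capacity Mounted on
nfs01:/vol1 1000 400 600 40% /n/lab_a
nfs02:/vol2 2000 500 1500 25% /n/lab_b
tmpfs 100 10 90 10% /run
"""

count, kib = total_available_kib(SAMPLE, prefix="/n/")
print(count, kib)  # 2 2100
```

On a live system the same function could be fed the text captured from running `df -P -k`; dividing the KiB total by 1024**4 gives pebibytes.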
520 00:30:06,850 --> 00:30:11,500 Whenever I talk to our researchers and our faculty and say, 521 00:30:11,500 --> 00:30:14,180 hey, I can run storage for you. 522 00:30:14,180 --> 00:30:17,690 Unfortunately, I have to recover the cost of the storage. 523 00:30:17,690 --> 00:30:19,430 I get this business. 524 00:30:19,430 --> 00:30:23,300 And people reference Newegg or they reference Staples 525 00:30:23,300 --> 00:30:27,040 or how much they can buy a single terabyte disk drive for. 526 00:30:27,040 --> 00:30:29,390 >> So this, you'll note here, that there's a clue. 527 00:30:29,390 --> 00:30:31,310 There's one disk drive here. 528 00:30:31,310 --> 00:30:33,290 And if we go back, I have many. 529 00:30:33,290 --> 00:30:36,130 Not only do I have many, I have sophisticated interconnects 530 00:30:36,130 --> 00:30:38,750 to be able to stitch these things together. 531 00:30:38,750 --> 00:30:44,080 So the risk associated with these large storage arrays is not insignificant. 532 00:30:44,080 --> 00:30:46,370 >> In fact, we took to the internet and we wrote 533 00:30:46,370 --> 00:30:51,670 a little story about a well-meaning, mild-mannered director of research 534 00:30:51,670 --> 00:30:54,640 computing-- happens to have a strange English accent-- trying 535 00:30:54,640 --> 00:30:59,930 to explain to a researcher what the no underscore backup folder actually 536 00:30:59,930 --> 00:31:01,070 meant. 537 00:31:01,070 --> 00:31:05,690 It was quite a long, little story, a good four minutes of discovery. 538 00:31:05,690 --> 00:31:09,380 >> And note, I have an awful lot less space than the lady 539 00:31:09,380 --> 00:31:11,800 that sings about all the bass. 540 00:31:11,800 --> 00:31:13,910 We're quite a few accounts lower. 541 00:31:13,910 --> 00:31:16,160 But anyway, this is an important thing to think about, 542 00:31:16,160 --> 00:31:18,532 in terms of what could go wrong. 
543 00:31:18,532 --> 00:31:20,990 So if I get a disk drive, and I throw it in a Unix machine, 544 00:31:20,990 --> 00:31:24,300 and I start writing things to it, there's a magnet, there's a drive head, 545 00:31:24,300 --> 00:31:30,150 there's ostensibly, a one or a zero being written down on to that device. 546 00:31:30,150 --> 00:31:32,180 >> Motors-- spinny, twirly things always break. 547 00:31:32,180 --> 00:31:33,490 Think about things that break. 548 00:31:33,490 --> 00:31:35,170 It's always been spinny, twirly things. 549 00:31:35,170 --> 00:31:38,560 Printers, disk drives, motor vehicles, etc. 550 00:31:38,560 --> 00:31:40,590 Anything that moves is likely to break. 551 00:31:40,590 --> 00:31:42,575 >> So you need motors, you need drive firmware, 552 00:31:42,575 --> 00:31:47,110 you need SAS/SATA controllers, wires, firmware on the SAS/SATA controllers, 553 00:31:47,110 --> 00:31:48,530 low level blocks. 554 00:31:48,530 --> 00:31:54,580 Pick your storage controller file system code, whichever one it may be, 555 00:31:54,580 --> 00:31:56,780 how you stitch things together. 556 00:31:56,780 --> 00:32:00,956 And your virtual memory manager pages, DRAM fetch and stores. 557 00:32:00,956 --> 00:32:02,705 Then, you get another stack, which is kind 558 00:32:02,705 --> 00:32:05,440 of down the list on this one, algorithms, users. 559 00:32:05,440 --> 00:32:09,050 >> And if you multiply this up, I don't know how many, 560 00:32:09,050 --> 00:32:11,640 there's a lot of places where stuff can go sideways. 561 00:32:11,640 --> 00:32:14,430 I mean, that's an example about math. 562 00:32:14,430 --> 00:32:18,070 But it's kind of fun to think of how many ways things could go wrong, 563 00:32:18,070 --> 00:32:21,650 just for a disk drive. 564 00:32:21,650 --> 00:32:25,440 We're already at 300 petabytes, so imagine the number of disk drives 565 00:32:25,440 --> 00:32:27,741 you need at 300 petabytes that can go wrong. 
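To put a rough number on that, here is a back-of-the-envelope sketch. The four terabyte drive size is the figure quoted earlier in the talk; the 2% annual failure rate is purely an assumed placeholder, and the calculation ignores RAID, replication, and spares, all of which push the real drive count higher.

```python
PETABYTE = 10**15
TERABYTE = 10**12


def drives_needed(capacity_bytes: int, drive_bytes: int) -> int:
    """Smallest whole number of drives whose raw capacity covers the target."""
    return -(-capacity_bytes // drive_bytes)  # ceiling division on integers


def expected_failures_per_day(drive_count: int, annual_failure_rate: float) -> float:
    """Average drive failures per day, assuming failures are spread evenly."""
    return drive_count * annual_failure_rate / 365


drives = drives_needed(300 * PETABYTE, 4 * TERABYTE)
print(drives)                                              # 75000
print(round(expected_failures_per_day(drives, 0.02), 1))   # 4.1
```

Even with that modest assumed rate, somebody is swapping several drives every day, before counting controllers, cables, and firmware.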
566 00:32:27,741 --> 00:32:28,240 567 00:32:28,240 --> 00:32:30,390 Not only that-- so that's storage. 568 00:32:30,390 --> 00:32:34,220 And that alludes to the person I'd like to see 569 00:32:34,220 --> 00:32:38,780 enter stage left, which is the Chaos Monkey. 570 00:32:38,780 --> 00:32:43,140 So at a certain point, it gets even bigger than just the disk drive 571 00:32:43,140 --> 00:32:43,920 problem. 572 00:32:43,920 --> 00:32:50,610 >> And so, these fine ladies and gentlemen that run a streaming video service 573 00:32:50,610 --> 00:32:55,430 realized that their computers were also huge and also very complicated 574 00:32:55,430 --> 00:33:00,010 and also providing service to an awful lot of people. 575 00:33:00,010 --> 00:33:05,180 They've got 37 million members-- and this slide's maybe a year or so old-- 576 00:33:05,180 --> 00:33:07,350 thousands of devices. 577 00:33:07,350 --> 00:33:10,810 There are billions of hours of video. 578 00:33:10,810 --> 00:33:13,600 They log billions of events a day. 579 00:33:13,600 --> 00:33:17,330 >> And you can see, most people watch the telly later on in the evening, 580 00:33:17,330 --> 00:33:19,429 and it far outweighs everything. 581 00:33:19,429 --> 00:33:21,220 And so, they wanted to be able to make sure 582 00:33:21,220 --> 00:33:24,854 that the service was up and reliable and working for them. 583 00:33:24,854 --> 00:33:27,020 So they came up with this thing called Chaos Monkey. 584 00:33:27,020 --> 00:33:29,000 It's a piece of software which, when you think 585 00:33:29,000 --> 00:33:34,190 about the title of this whole presentation, 586 00:33:34,190 --> 00:33:36,530 scale-out means you should test this stuff. 587 00:33:36,530 --> 00:33:38,585 It's no good just having a million machines.
588 00:33:38,585 --> 00:33:40,460 So the nice thing about this is, Chaos Monkey 589 00:33:40,460 --> 00:33:43,090 is a service which identifies groups of systems 590 00:33:43,090 --> 00:33:47,220 and randomly terminates one of the systems in a group. 591 00:33:47,220 --> 00:33:48,429 Awesome. 592 00:33:48,429 --> 00:33:50,220 So I don't know about you, but if you've ever 593 00:33:50,220 --> 00:33:52,990 built a system that relies on other systems talking to each other, 594 00:33:52,990 --> 00:33:55,865 and you take one of them out, the likelihood of the entire thing working 595 00:33:55,865 --> 00:33:57,130 diminishes rapidly. 596 00:33:57,130 --> 00:34:00,475 >> And so this piece of software runs around Netflix's infrastructure. 597 00:34:00,475 --> 00:34:03,100 Luckily, it says it runs only in business hours with the intent 598 00:34:03,100 --> 00:34:05,810 that engineers will be alert and able to respond. 599 00:34:05,810 --> 00:34:08,020 So these are the types of things we're now 600 00:34:08,020 --> 00:34:13,360 having to do to perturb our computing environments, to introduce chaos 601 00:34:13,360 --> 00:34:15,739 and to introduce complexity. 602 00:34:15,739 --> 00:34:19,139 >> So who, in their right mind, would willingly choose 603 00:34:19,139 --> 00:34:22,540 to work with a Chaos Monkey? 604 00:34:22,540 --> 00:34:24,150 Hang on, he seems to be pointing at me. 605 00:34:24,150 --> 00:34:28,719 Well, I guess I should-- cute. 606 00:34:28,719 --> 00:34:32,909 But the problem is you don't get the choice. 607 00:34:32,909 --> 00:34:37,440 The Chaos Monkey, as you can see, chooses you. 608 00:34:37,440 --> 00:34:42,650 >> And the problem with computing at scale is that you can't avoid this. 609 00:34:42,650 --> 00:34:49,989 It's an inevitability of complexity and of scale and of our evolution, 610 00:34:49,989 --> 00:34:53,280 in some ways, of computing expertise.
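The behaviour described above (pick a group, terminate one member at random, only during business hours) can be sketched in a few lines. This is not Netflix's implementation or its API; the group names, the 09:00 to 17:00 weekday window, and both function names are assumptions, and the sketch only prints what it would terminate.

```python
import random


def pick_victims(groups: dict[str, list[str]], rng: random.Random) -> dict[str, str]:
    """Choose one instance at random from each non-empty group."""
    return {name: rng.choice(members) for name, members in groups.items() if members}


def in_business_hours(hour: int, weekday: int) -> bool:
    """True Monday to Friday (weekday 0-4) from 09:00 up to 17:00; an assumed window."""
    return weekday < 5 and 9 <= hour < 17


# Hypothetical inventory: group name -> instance identifiers.
groups = {"web": ["web-1", "web-2", "web-3"], "db": ["db-1", "db-2"], "empty": []}

if in_business_hours(hour=10, weekday=2):
    for group, victim in pick_victims(groups, random.Random(42)).items():
        # A real tool would call the platform's terminate operation here.
        print(f"would terminate {victim} in group {group}")
```

Passing in the random generator keeps a run reproducible, which helps when you want to replay exactly which instance was taken out.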
611 00:34:53,280 --> 00:34:55,510 And remember, this is one thing to remember, 612 00:34:55,510 --> 00:35:00,030 Chaos Monkeys love snowflakes-- love snowflakes. 613 00:35:00,030 --> 00:35:03,470 A snowflake-- we've explained the Chaos Monkey-- but a snowflake 614 00:35:03,470 --> 00:35:09,630 is a server that is unique and special and delicate and individual 615 00:35:09,630 --> 00:35:11,770 and will never be reproduced. 616 00:35:11,770 --> 00:35:14,790 >> We often find snowflake servers in our environment. 617 00:35:14,790 --> 00:35:16,700 And we always try and melt snowflake servers. 618 00:35:16,700 --> 00:35:18,880 But if you find a server in your environment 619 00:35:18,880 --> 00:35:23,240 that is critical to the longevity of your organization and it melts, 620 00:35:23,240 --> 00:35:25,300 you can't put it back together again. 621 00:35:25,300 --> 00:35:28,071 So Chaos Monkey's job was to go and terminate instances. 622 00:35:28,071 --> 00:35:30,820 If the Chaos Monkey melts the snowflake, you're over, you're done. 623 00:35:30,820 --> 00:35:34,390 624 00:35:34,390 --> 00:35:37,950 I want to talk about some hardware that we're 625 00:35:37,950 --> 00:35:40,415 seeing in terms of sort of scale-out activities too. 626 00:35:40,415 --> 00:35:43,810 And some unique things that are in and around the science activity. 627 00:35:43,810 --> 00:35:46,990 We are now starting to see, remember this unit of issue, this rack? 628 00:35:46,990 --> 00:35:51,780 So this is a rack of GPGPUs-- so general purpose graphics processing units. 629 00:35:51,780 --> 00:35:55,790 >> We have these located in our data center, 100 or so miles away. 630 00:35:55,790 --> 00:35:59,780 This particular rack is about 96 teraFLOPS 631 00:35:59,780 --> 00:36:04,090 of single-precision math that it's able to deliver out the back of it. 632 00:36:04,090 --> 00:36:10,530 And we have on the order of 130-odd cards in an instance 633 00:36:10,530 --> 00:36:16,620 that we-- multiple racks of this instance.
634 00:36:16,620 --> 00:36:22,730 >> So this is interesting in the sense that the general purpose graphics processors 635 00:36:22,730 --> 00:36:27,880 are able to do mathematics incredibly quickly for very low amounts of energy. 636 00:36:27,880 --> 00:36:32,060 So there's a large uptick in the scientific computing areas, 637 00:36:32,060 --> 00:36:36,400 looking at graphics processing units in a big way. 638 00:36:36,400 --> 00:36:41,990 >> So I ran some MCollective through our Puppet infrastructure 639 00:36:41,990 --> 00:36:45,330 yesterday, very excited about this. 640 00:36:45,330 --> 00:36:48,260 We're just short of a petaflop of single-precision. 641 00:36:48,260 --> 00:36:52,440 Just to be clear here, this little multiplier is 3.95. 642 00:36:52,440 --> 00:36:54,820 Double-precision math would be about 1.2, 643 00:36:54,820 --> 00:36:57,010 but my Twitter feed looked way better if I 644 00:36:57,010 --> 00:37:02,670 said we had almost a petaflop of single-precision GPGPUs. 645 00:37:02,670 --> 00:37:04,220 >> But it's getting there. 646 00:37:04,220 --> 00:37:06,280 It's getting to be very, very impressive. 647 00:37:06,280 --> 00:37:08,550 And why are we doing this? 648 00:37:08,550 --> 00:37:11,570 Because of quantum chemistry, among other things, 649 00:37:11,570 --> 00:37:15,300 but we're starting to design some new photovoltaics. 650 00:37:15,300 --> 00:37:20,210 >> And so Alan Aspuru-Guzik, who's a professor in chemistry-- my partner 651 00:37:20,210 --> 00:37:22,390 in crime for the last few years-- 652 00:37:22,390 --> 00:37:25,660 we've been pushing the envelope on computing. 653 00:37:25,660 --> 00:37:30,250 And the GPGPU is the ideal technology to be able to do 654 00:37:30,250 --> 00:37:34,760 an awful lot of complicated math, very, very quickly. 655 00:37:34,760 --> 00:37:36,750 >> So with scale come new challenges. 656 00:37:36,750 --> 00:37:41,070 So huge scale-- you have to be careful how you wire this stuff.
657 00:37:41,070 --> 00:37:45,300 And we have certain levels of obsessive compulsive disorder. 658 00:37:45,300 --> 00:37:49,530 These pictures probably drive a lot of people nuts. 659 00:37:49,530 --> 00:37:53,390 And cabinets that aren't wired particularly well 660 00:37:53,390 --> 00:37:56,050 drive our network and facilities engineers nuts. 661 00:37:56,050 --> 00:37:58,620 Plus there's also airflow issues that you have to contain. 662 00:37:58,620 --> 00:38:01,430 >> So these are things that I would never have thought of. 663 00:38:01,430 --> 00:38:03,480 With scale, comes more complexity. 664 00:38:03,480 --> 00:38:05,869 This is a new type of file system. 665 00:38:05,869 --> 00:38:06,410 It's awesome. 666 00:38:06,410 --> 00:38:07,660 It's a petabyte. 667 00:38:07,660 --> 00:38:09,905 It can store 1.1 billion files. 668 00:38:09,905 --> 00:38:15,940 It can read and write to 13 gigabytes and 20 gigabytes a second-- gigabytes 669 00:38:15,940 --> 00:38:17,150 a second. 670 00:38:17,150 --> 00:38:20,900 So it can unload terabytes in no time at all. 671 00:38:20,900 --> 00:38:22,070 >> And it's highly available. 672 00:38:22,070 --> 00:38:26,989 And it's got amazing lookup rates-- 220,000 lookups a second. 673 00:38:26,989 --> 00:38:29,780 And there are many different people building these kind of systems. 674 00:38:29,780 --> 00:38:32,830 And you can see it here graphically. 675 00:38:32,830 --> 00:38:35,800 This is one of our file systems that's under load, quite 676 00:38:35,800 --> 00:38:41,250 happily reading at just short of 22 gigabytes a second. 677 00:38:41,250 --> 00:38:42,790 So that's cool-- so complexity. 678 00:38:42,790 --> 00:38:47,230 >> So with complexity and scale, comes more complexity, right? 
679 00:38:47,230 --> 00:38:51,830 This is one of our many, many network diagrams, 680 00:38:51,830 --> 00:38:54,970 where you have many different chassis all supporting up 681 00:38:54,970 --> 00:38:57,730 into a main core switch, connected to storage, 682 00:38:57,730 --> 00:39:00,731 connecting to low latency interconnects. 683 00:39:00,731 --> 00:39:03,605 And then all of this side of the house, is just all of the management 684 00:39:03,605 --> 00:39:09,740 that you need to be able to address these systems from a remote location. 685 00:39:09,740 --> 00:39:12,070 So scale has a lot of complexity with it. 686 00:39:12,070 --> 00:39:14,910 687 00:39:14,910 --> 00:39:17,785 >> Change gear again, let's go back and have a little spot of science. 688 00:39:17,785 --> 00:39:21,450 So, remember, research computing and this little shim-- 689 00:39:21,450 --> 00:39:25,310 little pink shim between the faculty and all of their algorithms 690 00:39:25,310 --> 00:39:30,650 and all of the cool science and all of this power and cooling and data center 691 00:39:30,650 --> 00:39:35,330 floor and networking and big computers and service desks and help desks 692 00:39:35,330 --> 00:39:39,330 and so forth-- and so, we're just this little shim between them. 693 00:39:39,330 --> 00:39:42,820 >> What we've started to see is that the world's 694 00:39:42,820 --> 00:39:45,730 been able to build these large data centers 695 00:39:45,730 --> 00:39:48,020 and be able to build these large computers. 696 00:39:48,020 --> 00:39:49,420 We've gotten pretty good at it. 697 00:39:49,420 --> 00:39:53,600 What we're not very good at is this little shim between the research 698 00:39:53,600 --> 00:39:56,670 and the bare metal and the technology. 699 00:39:56,670 --> 00:39:58,600 And it's hard. 700 00:39:58,600 --> 00:40:03,330 >> And so we've been able to hire folks that live in this world. 
701 00:40:03,330 --> 00:40:07,590 And more recently, we spoke to the National Science Foundation and said, 702 00:40:07,590 --> 00:40:11,440 this scale-out stuff is great, but we can't get our scientists 703 00:40:11,440 --> 00:40:13,690 on to these big complicated machines. 704 00:40:13,690 --> 00:40:16,040 And so, there have been a number of different programs 705 00:40:16,040 --> 00:40:20,100 where we really were mostly concerned about trying 706 00:40:20,100 --> 00:40:22,800 to see if we could transform the campus infrastructure. 707 00:40:22,800 --> 00:40:25,850 >> There are a lot of programs around national centers. 708 00:40:25,850 --> 00:40:28,300 And so, ourselves, our friends at Clemson, 709 00:40:28,300 --> 00:40:32,620 University of Wisconsin Madison, Southern California, Utah, and Hawaii 710 00:40:32,620 --> 00:40:35,780 kind of got together to look at this problem. 711 00:40:35,780 --> 00:40:39,340 And this little graph here is the long tail of science. 712 00:40:39,340 --> 00:40:41,602 >> So this is-- it doesn't matter what's on this axis, 713 00:40:41,602 --> 00:40:45,485 but this axis is actually number of jobs going through the cluster. 714 00:40:45,485 --> 00:40:48,940 So there's 350,000 over whatever time period. 715 00:40:48,940 --> 00:40:51,730 These are our usual suspects along the bottom here. 716 00:40:51,730 --> 00:40:55,992 In fact, there's Alan Aspuru-Guzik, who we were just talking about-- tons 717 00:40:55,992 --> 00:40:58,700 and tons of compute, really effective, knows what he's doing. 718 00:40:58,700 --> 00:41:02,840 >> Here's another lab that I'll talk about in a moment-- John Kovac's lab . 719 00:41:02,840 --> 00:41:03,610 They've got it. 720 00:41:03,610 --> 00:41:04,210 They're good. 721 00:41:04,210 --> 00:41:04,830 They're happy. 722 00:41:04,830 --> 00:41:05,960 They're computing. 723 00:41:05,960 --> 00:41:07,664 Great science is getting done . 
724 00:41:07,664 --> 00:41:09,580 And then, as you kind of come down here, there 725 00:41:09,580 --> 00:41:12,110 are other groups that aren't running many jobs. 726 00:41:12,110 --> 00:41:13,410 >> And why is that? 727 00:41:13,410 --> 00:41:15,080 Is it because the computing is too hard? 728 00:41:15,080 --> 00:41:19,580 Is it because they don't know how to? 729 00:41:19,580 --> 00:41:22,880 We don't know, because we haven't gone and looked. 730 00:41:22,880 --> 00:41:25,620 And so that's what this project is all about, 731 00:41:25,620 --> 00:41:27,830 is locally, within each of these regions, 732 00:41:27,830 --> 00:41:32,660 to look for avenues where we can engage with the faculty and researchers 733 00:41:32,660 --> 00:41:36,400 actually in the bottom end of the tail, and understand what they're doing. 734 00:41:36,400 --> 00:41:37,920 >> So that's something that we're actually passionate about. 735 00:41:37,920 --> 00:41:39,920 And science won't continue 736 00:41:39,920 --> 00:41:44,260 to move forward until we solve some of these edge cases. 737 00:41:44,260 --> 00:41:46,590 Other bits of science that are going on-- everyone's 738 00:41:46,590 --> 00:41:48,260 seen the Large Hadron Collider. 739 00:41:48,260 --> 00:41:49,540 Awesome, right? 740 00:41:49,540 --> 00:41:52,960 This stuff all ran out at Holyoke. 741 00:41:52,960 --> 00:41:56,510 We built-- the very first science that happened in Holyoke 742 00:41:56,510 --> 00:41:59,130 was the collaboration between ourselves and Boston University. 743 00:41:59,130 --> 00:42:01,510 So it's really, really cool. 744 00:42:01,510 --> 00:42:04,410 >> This is a fun piece of science for scale. 745 00:42:04,410 --> 00:42:07,650 This is Digital Access to a Sky Century at Harvard. 746 00:42:07,650 --> 00:42:09,170 Basically, it's a plate archive.
747 00:42:09,170 --> 00:42:13,350 If you go down Oxford-- Garden Street, sorry, 748 00:42:13,350 --> 00:42:16,560 you'll find one of the observatory buildings is basically full 749 00:42:16,560 --> 00:42:19,480 of about half a million plates. 750 00:42:19,480 --> 00:42:24,410 >> And these are pictures of the sky at night, over 100 years. 751 00:42:24,410 --> 00:42:28,760 So there's a whole rig set up here to digitize those plates, 752 00:42:28,760 --> 00:42:32,100 take pictures of them, register them, put them on a computer. 753 00:42:32,100 --> 00:42:36,410 And that's a petabyte and a half, just right there-- one small project. 754 00:42:36,410 --> 00:42:37,530 >> These are other projects. 755 00:42:37,530 --> 00:42:42,800 This Pan-STARRS project is doing a full wide panoramic survey, 756 00:42:42,800 --> 00:42:47,390 looking for near-Earth asteroids and transient celestial events. 757 00:42:47,390 --> 00:42:52,100 As a molecular biophysicist, I love the phrase transient celestial event. 758 00:42:52,100 --> 00:42:55,050 I'm not quite sure what it is, but anyway, we're looking for them. 759 00:42:55,050 --> 00:43:00,372 >> And we're generating 30 terabytes a night out of those telescopes. 760 00:43:00,372 --> 00:43:03,330 And that's not really a bandwidth problem, that's like a FedEx problem. 761 00:43:03,330 --> 00:43:08,420 So you put the storage on the van and you send it, whatever it is. 762 00:43:08,420 --> 00:43:10,570 >> BICEP is really interesting-- so Background Imaging 763 00:43:10,570 --> 00:43:13,850 of Cosmic Extragalactic Polarization. 764 00:43:13,850 --> 00:43:16,880 When I first started working at Harvard seven or 765 00:43:16,880 --> 00:43:21,440 eight years ago, I remember working on this project 766 00:43:21,440 --> 00:43:26,010 and it didn't really sink in as to why polarized light 767 00:43:26,010 --> 00:43:29,770 from the cosmic microwave background would be important, 768 00:43:29,770 --> 00:43:30,800 until this happened.
769 00:43:30,800 --> 00:43:34,580 >> And this was John Kovac, who I talked to before, 770 00:43:34,580 --> 00:43:42,030 using millions upon millions of CPU hours, in our facility and others, 771 00:43:42,030 --> 00:43:46,600 to basically stare into the inside of the universe's first moments 772 00:43:46,600 --> 00:43:49,150 after the Big Bang, and trying to understand 773 00:43:49,150 --> 00:43:51,290 Einstein's general theory of relativity. 774 00:43:51,290 --> 00:43:56,040 It's mind blowing that our computers are helping us unravel and stare 775 00:43:56,040 --> 00:43:59,280 into the very origins of why we're here. 776 00:43:59,280 --> 00:44:03,450 >> So when you talk about scale, this is some serious scale. 777 00:44:03,450 --> 00:44:09,260 The other thing of scale is, that particular project hit these guys. 778 00:44:09,260 --> 00:44:15,320 And this is the response curve for BICEP [INAUDIBLE]. This was our little survey. 779 00:44:15,320 --> 00:44:19,220 >> And you can see here, life was good until about here, 780 00:44:19,220 --> 00:44:21,200 which was when the announcement came out. 781 00:44:21,200 --> 00:44:24,120 And you have got literally seconds to respond 782 00:44:24,120 --> 00:44:29,020 to the scaling event which corresponds to this little dot here, 783 00:44:29,020 --> 00:44:32,200 which ended up shifting four or so terabytes of data 784 00:44:32,200 --> 00:44:36,370 through the web server that day-- pretty hairy. 785 00:44:36,370 --> 00:44:38,210 >> And so, these are the types of things that 786 00:44:38,210 --> 00:44:43,040 can happen to you in your infrastructure if you do not design for scale. 787 00:44:43,040 --> 00:44:45,630 We had a bit of a scramble that day, to be 788 00:44:45,630 --> 00:44:50,440 able to spin up enough web servers to keep the site up and running. 789 00:44:50,440 --> 00:44:53,399 And we were successful. 790 00:44:53,399 --> 00:44:55,190 This is a little email that's kind of cute.
791 00:44:55,190 --> 00:45:00,245 This is a mail to Mark Vogelsberger and Lars Hernquist, who's 792 00:45:00,245 --> 00:45:02,650 a faculty member here at Harvard. 793 00:45:02,650 --> 00:45:03,570 More about Mark later. 794 00:45:03,570 --> 00:45:05,990 But I think this one sort of sums up kind 795 00:45:05,990 --> 00:45:09,920 of where the computing is in research computing. 796 00:45:09,920 --> 00:45:12,070 Hey, team, since last Tuesday, you guys racked up 797 00:45:12,070 --> 00:45:15,470 over 28% of the new cluster, which combined 798 00:45:15,470 --> 00:45:20,040 is over 78 years of CPU in just three days. 799 00:45:20,040 --> 00:45:22,502 And I said, it's still only just Friday morning. 800 00:45:22,502 --> 00:45:23,460 This is pretty awesome! 801 00:45:23,460 --> 00:45:24,740 Happy Friday! 802 00:45:24,740 --> 00:45:27,450 >> Then I give them the data points. 803 00:45:27,450 --> 00:45:30,260 And so that was kind of interesting. 804 00:45:30,260 --> 00:45:34,840 So remember about Mark, he'll come back into the picture in a little bit. 805 00:45:34,840 --> 00:45:36,935 So scale-out computing is everywhere. 806 00:45:36,935 --> 00:45:41,080 >> We're even helping folks look at how the NBA functions, 807 00:45:41,080 --> 00:45:43,140 and where people are throwing balls from. 808 00:45:43,140 --> 00:45:47,580 I don't really understand this game too well, but seemingly, it's a big deal. 809 00:45:47,580 --> 00:45:50,610 There's hoops and balls and money. 810 00:45:50,610 --> 00:45:55,300 >> And so, our database, we built a little 500 [INAUDIBLE] 811 00:45:55,300 --> 00:45:58,170 parallel processor cluster, a couple of terabytes of RAM, 812 00:45:58,170 --> 00:46:03,590 to be able to build this for Kirk and his team. 813 00:46:03,590 --> 00:46:08,524 And they're doing computing in a whole other way.
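
A small sketch of what the email's figures quoted above imply, for scale. It assumes "78 years of CPU" means 78 core-years accumulated over exactly three wall-clock days, and that the 28% share refers to that same window. Both readings are assumptions rather than anything stated in the talk, and the function names are illustrative.

```python
# Sketch: turn "78 years of CPU in three days" into an average core count,
# then into a rough cluster size using the quoted 28% share.

DAYS_PER_YEAR = 365.25


def average_busy_cores(cpu_years: float, wall_days: float) -> float:
    """Average number of cores kept busy to accumulate `cpu_years` in `wall_days`."""
    return cpu_years * DAYS_PER_YEAR / wall_days


def implied_cluster_cores(busy_cores: float, share: float) -> float:
    """Cluster size if `busy_cores` represents the given fraction of it."""
    return busy_cores / share


busy = average_busy_cores(78, 3)
total = implied_cluster_cores(busy, 0.28)
print(f"about {busy:,.0f} cores busy on average")
print(f"about {total:,.0f} cores in the cluster, if that was 28% of it")
```

Since the email says "over" 28% and "over" 78 years, both numbers are lower bounds and the result is an order-of-magnitude estimate only.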
814 00:46:08,524 --> 00:46:10,440 Now this is a project we're involved with that's 815 00:46:10,440 --> 00:46:14,880 absolutely fascinating, around neural plasticity, connectomics, and genomic 816 00:46:14,880 --> 00:46:20,960 imprinting-- three very heavy hitting areas of research 817 00:46:20,960 --> 00:46:24,650 that we fight with on a day-to-day basis. 818 00:46:24,650 --> 00:46:30,670 The idea that our brains are under plastic stress when we are young. 819 00:46:30,670 --> 00:46:34,980 And much of our adult behavior is sculpted by experience in infancy. 820 00:46:34,980 --> 00:46:37,040 So this is a big dealio. 821 00:46:37,040 --> 00:46:41,360 >> And so this is work that's funded by the National Institute of Mental Health. 822 00:46:41,360 --> 00:46:46,860 And we are trying to basically, through a lot of large data 823 00:46:46,860 --> 00:46:51,970 and big data analysis, kind of peer into our human brain 824 00:46:51,970 --> 00:46:54,870 through a variety of different techniques. 825 00:46:54,870 --> 00:47:00,360 >> So I wanted to stop and kind of just pause for a little moment. 826 00:47:00,360 --> 00:47:04,160 The challenge with remote data centers is it's far away. 827 00:47:04,160 --> 00:47:05,520 It can't possibly work. 828 00:47:05,520 --> 00:47:07,590 I need my data close by. 829 00:47:07,590 --> 00:47:10,730 I need to do my research in my lab. 830 00:47:10,730 --> 00:47:18,620 >> And so I kind of took an example of a functional magnetic resonance imaging 831 00:47:18,620 --> 00:47:22,260 data set from our data center in Western Mass. 832 00:47:22,260 --> 00:47:24,660 and connected it to my desktop in Cambridge. 833 00:47:24,660 --> 00:47:27,440 And I'll play this little video. 834 00:47:27,440 --> 00:47:29,750 Hopefully it will kind of work. 835 00:47:29,750 --> 00:47:33,480 >> So this is me going through checking my GPUs are working. 836 00:47:33,480 --> 00:47:35,430 And I'm checking that VNC's up.
837 00:47:35,430 --> 00:47:36,810 And this is a clever VNC. 838 00:47:36,810 --> 00:47:38,970 This is a VNC with 3D pieces. 839 00:47:38,970 --> 00:47:41,975 And so, as you can see shortly, this is me spinning this brain around. 840 00:47:41,975 --> 00:47:44,460 I'm trying to kind of get it oriented. 841 00:47:44,460 --> 00:47:49,574 And then I can move through many different slices of MRI data. 842 00:47:49,574 --> 00:47:51,490 And the only thing that's different about this 843 00:47:51,490 --> 00:47:55,160 is, it's coming over the wire from Western Mass. to my desktop. 844 00:47:55,160 --> 00:47:57,300 And it's rendering faster than my desktop, 845 00:47:57,300 --> 00:48:02,840 because I don't have a $4,000 graphics card in my desktop, which 846 00:48:02,840 --> 00:48:04,262 we have out in Western Mass. 847 00:48:04,262 --> 00:48:05,720 Of course, I'm trying to be clever. 848 00:48:05,720 --> 00:48:08,859 I'm running glxgears in the background, whilst doing all this, 849 00:48:08,859 --> 00:48:10,900 to make sure that I can stress the graphics card, 850 00:48:10,900 --> 00:48:14,140 and that it all kind of works and all the rest of it. 851 00:48:14,140 --> 00:48:16,700 But the important thing is, this is 100 miles away. 852 00:48:16,700 --> 00:48:20,460 And you can see from this that there's no obvious latency. 853 00:48:20,460 --> 00:48:24,600 Things are holding together fairly well. 854 00:48:24,600 --> 00:48:28,907 >> And so that, in and of itself, is an example and some insight 855 00:48:28,907 --> 00:48:31,490 into how computing and scale-out computing is going to happen. 856 00:48:31,490 --> 00:48:35,330 We're all working on thinner and thinner devices. 857 00:48:35,330 --> 00:48:36,870 Our use of tablets is increasing.
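
A quick sketch of why 100 miles need not produce obvious latency, as noted above. It assumes a straight-line fiber path and that light in fiber travels at roughly two thirds of its vacuum speed. Real routes are longer and add switching and encoding delays, so this is a lower bound and not a measurement from the demo.

```python
# Sketch: minimum round-trip time over 100 miles of fiber, compared with
# the time budget of a single frame on a 60 Hz display.

SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_FRACTION = 2 / 3             # assumed: light in glass is roughly 2/3 c
KM_PER_MILE = 1.609344


def round_trip_ms(miles: float) -> float:
    """Lower bound on round-trip propagation delay, in milliseconds."""
    km = miles * KM_PER_MILE
    one_way_s = km / (SPEED_OF_LIGHT_KM_S * FIBER_FRACTION)
    return 2 * one_way_s * 1000


FRAME_MS_60HZ = 1000 / 60  # one frame at 60 Hz

rtt = round_trip_ms(100)
print(f"round trip over 100 miles: at least {rtt:.2f} ms")
print(f"one 60 Hz frame: {FRAME_MS_60HZ:.2f} ms")
```

On these assumptions the propagation delay is a small fraction of one frame, so the network path leaves most of the budget for rendering and compression.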
858 00:48:36,870 --> 00:48:39,160 >> So therefore, my carbon footprint is basically 859 00:48:39,160 --> 00:48:42,060 moving from what would've 860 00:48:42,060 --> 00:48:46,060 been a huge machine under my desk, to what 861 00:48:46,060 --> 00:48:49,550 is now a facility-- could be anywhere. 862 00:48:49,550 --> 00:48:50,800 It could be anywhere at all. 863 00:48:50,800 --> 00:48:54,790 And yet, it's still able to bring back high performance graphics 864 00:48:54,790 --> 00:48:56,630 to my desktop. 865 00:48:56,630 --> 00:49:00,900 >> So, getting near the end-- remember Mark? 866 00:49:00,900 --> 00:49:04,480 Well, smart lad is Mark. 867 00:49:04,480 --> 00:49:09,360 He decided that he was going to build a realistic virtual universe. 868 00:49:09,360 --> 00:49:12,820 That's quite a project, when you think you've got to pitch this. 869 00:49:12,820 --> 00:49:14,740 I'm going to use a computer, and I'm going 870 00:49:14,740 --> 00:49:21,040 to model the 12 million years after the Big Bang to represent a day. 871 00:49:21,040 --> 00:49:27,080 And then I'm going to do 13.8 billion years of cosmic evolution. 872 00:49:27,080 --> 00:49:28,270 All right. 873 00:49:28,270 --> 00:49:30,970 >> This actually uses a computer that was bigger than our computer, 874 00:49:30,970 --> 00:49:35,040 and it spilled over onto the national resources to our friends down in Texas. 875 00:49:35,040 --> 00:49:38,820 And to the national facilities, this was a lot of compute. 876 00:49:38,820 --> 00:49:40,750 But we did a lot of the simulation locally 877 00:49:40,750 --> 00:49:44,820 to make sure that the software worked and the systems worked. 878 00:49:44,820 --> 00:49:47,790 >> And it's days like this when you realize that you're supporting science 879 00:49:47,790 --> 00:49:51,090 at this level of scale, that people can now say things 880 00:49:51,090 --> 00:49:52,840 like, I'm going to model a universe. 881 00:49:52,840 --> 00:49:54,145 And this is his first model.
882 00:49:54,145 --> 00:49:56,422 And this is his team's first model. 883 00:49:56,422 --> 00:49:58,130 There are many other folks that are going 884 00:49:58,130 --> 00:50:01,520 to come behind Mark, who are going to want to model with high resolution, 885 00:50:01,520 --> 00:50:04,652 with more specificity, with more accuracy. 886 00:50:04,652 --> 00:50:09,105 >> And so, in the last couple of minutes, I just want to show you this video 887 00:50:09,105 --> 00:50:15,270 of Mark and Lars's that to me, again, as a life scientist, is kind of cute. 888 00:50:15,270 --> 00:50:17,890 889 00:50:17,890 --> 00:50:20,970 So this, at the bottom here, to orient you, 890 00:50:20,970 --> 00:50:23,640 this is telling you the time since the Big Bang. 891 00:50:23,640 --> 00:50:26,570 So we're at about 0.7 billion years. 892 00:50:26,570 --> 00:50:28,740 And this is showing the current update. 893 00:50:28,740 --> 00:50:33,450 So you're seeing at the moment, dark matter and the evolution 894 00:50:33,450 --> 00:50:39,910 of the fine structure and early structures in our known universe. 895 00:50:39,910 --> 00:50:45,690 >> And the point with this is that this is all done inside the computer. 896 00:50:45,690 --> 00:50:48,530 This is a set of parameters and a set of physics 897 00:50:48,530 --> 00:50:52,840 and a set of mathematics and a set of models 898 00:50:52,840 --> 00:50:59,284 that are carefully selected, and then carefully connected to each other 899 00:50:59,284 --> 00:51:00,825 to be able to model the interactions. 900 00:51:00,825 --> 00:51:04,850 >> So you can see some starts of some gaseous explosions here. 901 00:51:04,850 --> 00:51:06,880 And gas temperature is changing. 902 00:51:06,880 --> 00:51:13,720 And you can start to see the structure of the visible universe change.
903 00:51:13,720 --> 00:51:18,130 And the important part with this is, each little tiny, tiny, tiny dot 904 00:51:18,130 --> 00:51:21,070 is a piece of physics and has a set of mathematics around, 905 00:51:21,070 --> 00:51:23,030 informing its friend and its neighbor. 906 00:51:23,030 --> 00:51:27,245 >> So from a scaling perspective, these computers have to all work in concert 907 00:51:27,245 --> 00:51:29,470 and talk to each other efficiently. 908 00:51:29,470 --> 00:51:31,060 So they can't be too chatty. 909 00:51:31,060 --> 00:51:33,520 They have to store their results. 910 00:51:33,520 --> 00:51:37,902 And they have to continue to inform all of their friends. 911 00:51:37,902 --> 00:51:40,860 Indeed, you'll see now, this model's getting more and more complicated. 912 00:51:40,860 --> 00:51:42,590 There's more and more stuff going on. 913 00:51:42,590 --> 00:51:45,210 There's more and more material flying around. 914 00:51:45,210 --> 00:51:48,410 >> And this is what the early cosmos would've looked like. 915 00:51:48,410 --> 00:51:49,770 It was a pretty hairy place. 916 00:51:49,770 --> 00:51:55,140 There's explosions all over the place, powerful collisions. 917 00:51:55,140 --> 00:51:58,620 And formation of heavy metals and elements. 918 00:51:58,620 --> 00:52:03,910 And these big clouds smashing into each other with the extreme force. 919 00:52:03,910 --> 00:52:08,530 >> And so now we're 9.6 billion years from this initial explosion. 920 00:52:08,530 --> 00:52:12,310 You're starting to see things are kind of calmed down a little bit, just 921 00:52:12,310 --> 00:52:15,660 a little bit, because the energy is now starting to relax. 922 00:52:15,660 --> 00:52:19,420 And so the mathematical models have got that in place. 923 00:52:19,420 --> 00:52:22,510 And you're starting to see coalescence of different elements. 924 00:52:22,510 --> 00:52:26,220 And starting to see this thing kind of come together and slowly cool. 
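
A toy sketch of the idea described above, that each dot informs its neighbor and the computers must talk without being too chatty. It is not the simulation's actual code. It assumes a one-dimensional array of values, a simple three-point averaging update, and a split into chunks that each receive only one boundary value (often called a halo cell) from each neighboring chunk. All names are hypothetical.

```python
# Toy sketch: the same neighbour-averaging step done serially and done in
# chunks that exchange only their edge values with adjacent chunks.


def step_serial(u):
    """One step: every interior cell becomes the mean of itself and its two neighbours."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return new


def step_decomposed(u, parts):
    """Same step with the array split into `parts` chunks (requires parts <= len(u))."""
    n = len(u)
    chunks = [u[p * n // parts:(p + 1) * n // parts] for p in range(parts)]

    # Halo exchange: each chunk learns one value from each adjacent chunk.
    halos = []
    for p in range(parts):
        left = chunks[p - 1][-1] if p > 0 else None
        right = chunks[p + 1][0] if p < parts - 1 else None
        halos.append((left, right))

    # Local update: each chunk works only on its own cells plus its halo.
    result = []
    for chunk, (left, right) in zip(chunks, halos):
        padded = chunk[:]
        offset = 0
        if left is not None:
            padded = [left] + padded
            offset = 1
        if right is not None:
            padded = padded + [right]
        new = chunk[:]
        for j in range(len(chunk)):
            k = j + offset
            # Cells at the global ends have no neighbour and stay fixed.
            if 0 < k < len(padded) - 1:
                new[j] = (padded[k - 1] + padded[k] + padded[k + 1]) / 3.0
        result.extend(new)
    return result


values = [float(i * i % 7) for i in range(16)]
print(step_decomposed(values, 4) == step_serial(values))
```

The point of the sketch is the communication pattern: each chunk sends and receives two numbers per step regardless of how many cells it owns, which is what keeps the workers from being too chatty.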
925 00:52:26,220 --> 00:52:32,260 >> And it's starting to look a little bit more like the night sky, a little bit. 926 00:52:32,260 --> 00:52:37,870 And it's [? QSing. ?] We're now 13.2 billion years and we're kind of done. 927 00:52:37,870 --> 00:52:41,130 And then what they did was that they took this model, 928 00:52:41,130 --> 00:52:44,580 and then looked at the visible universe. 929 00:52:44,580 --> 00:52:48,560 And basically then, were able to take that and overlay 930 00:52:48,560 --> 00:52:50,580 it with what you can see. 931 00:52:50,580 --> 00:52:56,160 And the fidelity is staggering, as to how accurate the computer models are. 932 00:52:56,160 --> 00:52:58,760 >> Of course, the astrophysicists and the research groups 933 00:52:58,760 --> 00:53:02,780 need even better fidelity and even higher resolution. 934 00:53:02,780 --> 00:53:06,230 But if you think about what I've been talking to you about today 935 00:53:06,230 --> 00:53:11,850 through this little voyage through both storage and structure and networking 936 00:53:11,850 --> 00:53:18,000 and stacks, the important thing is, is scale-out computing essential? 937 00:53:18,000 --> 00:53:22,050 That was my original hypothesis-- back to our scientific method. 938 00:53:22,050 --> 00:53:24,810 >> I hope that at the early part of this I would 939 00:53:24,810 --> 00:53:29,400 predict that I would be able to explain to you about scale-out computing. 940 00:53:29,400 --> 00:53:32,870 And we kind of tested some of those hypotheses. 941 00:53:32,870 --> 00:53:34,585 We went through this conversation. 942 00:53:34,585 --> 00:53:38,920 And I'm just going to say scale-out computing is essential-- oh, 943 00:53:38,920 --> 00:53:42,480 yes, very much yes.
944 00:53:42,480 --> 00:53:44,790 >> So when you're thinking about your codes, when 945 00:53:44,790 --> 00:53:49,230 you're doing the CS50 final projects, when you're thinking about your legacy 946 00:53:49,230 --> 00:53:52,990 to humanity and the resources that we need to be able to run these computer 947 00:53:52,990 --> 00:53:56,650 systems, think very carefully about the FLOPS per watt, 948 00:53:56,650 --> 00:53:58,560 and think about the Chaos Monkey. 949 00:53:58,560 --> 00:54:02,240 >> Think about your snowflakes, don't do one-offs, reuse libraries, 950 00:54:02,240 --> 00:54:06,453 build reusable codes-- all of the things that the tutors have been teaching you 951 00:54:06,453 --> 00:54:08,630 in this class. 952 00:54:08,630 --> 00:54:11,942 These are fundamental aspects. 953 00:54:11,942 --> 00:54:13,150 They're not just lip service. 954 00:54:13,150 --> 00:54:15,660 These are real things. 955 00:54:15,660 --> 00:54:20,680 >> And if any of you want to follow me, I am obsessive with the Twitter thing. 956 00:54:20,680 --> 00:54:22,770 I've got to somehow give that up. 957 00:54:22,770 --> 00:54:24,960 But a lot of the background information is 958 00:54:24,960 --> 00:54:29,260 on our research computing website at rc.fas.harvard.edu. 959 00:54:29,260 --> 00:54:34,010 >> I try and keep a blog up to date with modern technologies 960 00:54:34,010 --> 00:54:38,390 and how we do distributed computing and so forth. 961 00:54:38,390 --> 00:54:43,600 And then our staff are always available through odybot.org. 962 00:54:43,600 --> 00:54:46,270 And odybot is our little helper. 963 00:54:46,270 --> 00:54:49,280 He often has little contests on his website 964 00:54:49,280 --> 00:54:51,630 too, where you can try and spot him around campus. 965 00:54:51,630 --> 00:54:55,200 He's the friendly little face of research computing. 966 00:54:55,200 --> 00:54:59,730 >> And I'll kind of wrap up there and thank you all for your time.
967 00:54:59,730 --> 00:55:05,660 And I hope you remember that scale-out computing is a real thing. 968 00:55:05,660 --> 00:55:08,162 And there are a lot of people who've got a lot of prior art 969 00:55:08,162 --> 00:55:09,370 who will be able to help you. 970 00:55:09,370 --> 00:55:14,330 And all of the best of luck with your future endeavors in making 971 00:55:14,330 --> 00:55:18,280 sure that our computing both scales, is high performing, 972 00:55:18,280 --> 00:55:20,370 and helps humanity more than anything else. 973 00:55:20,370 --> 00:55:22,850 So, thank you for your time. 974 00:55:22,850 --> 00:55:23,947