1 00:00:00,000 --> 00:00:00,300 2 00:00:00,300 --> 00:00:01,260 DAVID J. MALAN: All right, welcome, everyone. 3 00:00:01,260 --> 00:00:02,468 Thank you so much for coming. 4 00:00:02,468 --> 00:00:07,700 So I'm really happy to have a classmate of mine, Matt, from the class of 1999, 5 00:00:07,700 --> 00:00:10,500 where both of us were undergrads some time ago. 6 00:00:10,500 --> 00:00:13,760 And we've been super excited to bring Transit Screen-- and Matt especially-- 7 00:00:13,760 --> 00:00:16,270 to campus, since you might have seen here in CS50's office, 8 00:00:16,270 --> 00:00:19,220 or in Cabot House, or in Currier House, and soon elsewhere, 9 00:00:19,220 --> 00:00:22,404 these so-called Transit Screens that allow students to get up-to-the-minute 10 00:00:22,404 --> 00:00:25,362 information on when the next Harvard shuttle is, when the next Uber is, 11 00:00:25,362 --> 00:00:27,653 when the next Hubway bikes are available, and the like. 12 00:00:27,653 --> 00:00:30,660 So today Matt is here with us to talk about smart cities more generally, 13 00:00:30,660 --> 00:00:32,906 and the power of harnessing this kind of data. 14 00:00:32,906 --> 00:00:33,780 So over to you, Matt. 15 00:00:33,780 --> 00:00:34,010 MATT: Great. 16 00:00:34,010 --> 00:00:34,560 DAVID J. MALAN: Thank you, Matt. 17 00:00:34,560 --> 00:00:35,518 MATT: Thank you, David. 18 00:00:35,518 --> 00:00:38,100 And if you ever get tired of teaching CS50, 19 00:00:38,100 --> 00:00:40,560 we might have a sales job for you. 20 00:00:40,560 --> 00:00:41,484 All right. 21 00:00:41,484 --> 00:00:42,900 DAVID J. MALAN: Welcome to campus. 22 00:00:42,900 --> 00:00:44,530 MATT: Thank you. 23 00:00:44,530 --> 00:00:45,030 Well great. 24 00:00:45,030 --> 00:00:48,780 So I'm going to talk a little bit about what I do, and the kind of company 25 00:00:48,780 --> 00:00:53,730 that I and my colleagues and co-founders have built at Transit Screen. 
26 00:00:53,730 --> 00:00:56,670 But I also want to talk more generally about urban data, 27 00:00:56,670 --> 00:00:59,430 about smart cities data, and about how you too 28 00:00:59,430 --> 00:01:04,170 can get started building interesting interventions that 29 00:01:04,170 --> 00:01:06,760 can help solve problems in cities. 30 00:01:06,760 --> 00:01:11,460 And I'm going to take a little bit of a step back, sort of a 30,000 foot view 31 00:01:11,460 --> 00:01:15,780 at the beginning, and explain why cities and why now. 32 00:01:15,780 --> 00:01:19,350 And so the most salient thing you can take away here 33 00:01:19,350 --> 00:01:24,300 is that there are more people living inside this circle on the globe than 34 00:01:24,300 --> 00:01:24,930 outside it. 35 00:01:24,930 --> 00:01:27,120 And we have a lot of international students here. 36 00:01:27,120 --> 00:01:28,578 So of course, they appreciate that. 37 00:01:28,578 --> 00:01:31,770 Actually let me-- sorry, let me just stop my slide show from going. 38 00:01:31,770 --> 00:01:33,570 I knew it was going to do that. 39 00:01:33,570 --> 00:01:35,506 OK, much better. 40 00:01:35,506 --> 00:01:40,630 41 00:01:40,630 --> 00:01:41,920 OK, thank you. 42 00:01:41,920 --> 00:01:45,130 So there are more people living inside this circle, China and India 43 00:01:45,130 --> 00:01:46,700 primarily, than outside of it. 44 00:01:46,700 --> 00:01:52,990 And the pace of urbanization has only increased over the last 22 to 40 years. 45 00:01:52,990 --> 00:01:56,080 So just within India, for instance, the urban population 46 00:01:56,080 --> 00:01:59,380 is this red bar continuing to increase. 47 00:01:59,380 --> 00:02:03,400 And the rural population is sort of saturated, or declining. 48 00:02:03,400 --> 00:02:07,930 So what that means is tons of people moving to cities, density increasing. 
49 00:02:07,930 --> 00:02:11,500 And that means all of the issues that come with that kind of density 50 00:02:11,500 --> 00:02:12,820 are also cropping up. 51 00:02:12,820 --> 00:02:16,540 And one of those major issues is transportation. 52 00:02:16,540 --> 00:02:19,480 And the way we see transportation at Transit Screen 53 00:02:19,480 --> 00:02:23,920 is that, in the words of Bogota mayor, Enrique Penalosa-- who 54 00:02:23,920 --> 00:02:27,160 is in a great documentary called Urbanized that you might like-- 55 00:02:27,160 --> 00:02:30,070 the sign of an advanced society is not where the people who are poor 56 00:02:30,070 --> 00:02:32,680 drive cars, but where the rich ride transit. 57 00:02:32,680 --> 00:02:34,120 And that's kind of inevitable. 58 00:02:34,120 --> 00:02:36,550 Because when you have enough people and enough density, 59 00:02:36,550 --> 00:02:40,140 you just can't have everyone driving a car and getting around. 60 00:02:40,140 --> 00:02:41,890 And you'll just have gridlock, like you do 61 00:02:41,890 --> 00:02:43,970 in many cities in the developing world. 62 00:02:43,970 --> 00:02:48,360 And so what we have to do is we have to find a way to shift some of that 63 00:02:48,360 --> 00:02:52,780 to different modes in order to make things achievable. 64 00:02:52,780 --> 00:02:54,040 So here's an example. 65 00:02:54,040 --> 00:02:58,000 I believe this is from South America-- Brazil I think. 66 00:02:58,000 --> 00:03:01,210 And this is the kind of issue with transportation supply 67 00:03:01,210 --> 00:03:04,390 that you see with cities with all this urbanization. 68 00:03:04,390 --> 00:03:08,920 And then, back here in the US, we have a great example 69 00:03:08,920 --> 00:03:10,730 from the last decade of Houston. 70 00:03:10,730 --> 00:03:14,830 Houston has the largest freeway in the US, and possibly the world. 71 00:03:14,830 --> 00:03:16,142 It's 23 lanes wide. 72 00:03:16,142 --> 00:03:17,350 It's called the Katy Freeway. 
73 00:03:17,350 --> 00:03:19,150 And it goes from Houston west. 74 00:03:19,150 --> 00:03:24,310 And they put $3 billion into expanding it in 2011. 75 00:03:24,310 --> 00:03:28,600 And now traffic is 33% slower, even though they increased tolls. 76 00:03:28,600 --> 00:03:29,290 What happened? 77 00:03:29,290 --> 00:03:32,350 Well, when they built all those new lanes, 78 00:03:32,350 --> 00:03:36,280 and they allowed sprawling development so that people who lived out 79 00:03:36,280 --> 00:03:39,790 in these suburbs who could only get to the city, to their jobs, 80 00:03:39,790 --> 00:03:44,170 using this freeway, the result was what's called induced demand. 81 00:03:44,170 --> 00:03:47,290 So you increase the supply, but that creates its own demand. 82 00:03:47,290 --> 00:03:48,410 It's a feedback loop. 83 00:03:48,410 --> 00:03:52,060 And so the end result is actually worse than when they started. 84 00:03:52,060 --> 00:03:53,770 And this is paradoxical. 85 00:03:53,770 --> 00:03:56,770 But this is the mode that transportation solutions 86 00:03:56,770 --> 00:04:01,390 were for basically the last 50 years, was all this thinking that we 87 00:04:01,390 --> 00:04:02,660 could just fix it with supply. 88 00:04:02,660 --> 00:04:04,750 And that's not actually the case. 89 00:04:04,750 --> 00:04:07,750 Another manifestation of this is parking lots. 90 00:04:07,750 --> 00:04:11,960 So in terms of cars, the number of cars continues to increase. 91 00:04:11,960 --> 00:04:19,750 There are actually 1.2 billion cars on the road today, and 2 billion by 2035. 92 00:04:19,750 --> 00:04:26,050 With the urbanization picking up as well, the cars are sitting unused 96% 93 00:04:26,050 --> 00:04:26,710 of the time. 94 00:04:26,710 --> 00:04:30,310 And so this vast sea of parking is made for the day 95 00:04:30,310 --> 00:04:32,710 after Thanksgiving, when everyone goes shopping. 96 00:04:32,710 --> 00:04:34,350 Most of the time it's not used. 
97 00:04:34,350 --> 00:04:37,520 Almost none of the time are these cars actually used. 98 00:04:37,520 --> 00:04:41,620 So if you think, the amount of this that is actually necessary, 99 00:04:41,620 --> 00:04:44,170 is just a fraction of this little car. 100 00:04:44,170 --> 00:04:48,400 So that's a way to think about the scope of the problem, 101 00:04:48,400 --> 00:04:51,360 and how people are looking for solutions. 102 00:04:51,360 --> 00:04:54,770 So what are the solutions in transportation going to look like? 103 00:04:54,770 --> 00:04:59,170 Well, I think one way to think about it is that the transportation itself is 104 00:04:59,170 --> 00:05:01,030 changing, and it's becoming mobility. 105 00:05:01,030 --> 00:05:04,330 So car companies are saying, we're now mobility companies. 106 00:05:04,330 --> 00:05:05,360 What does that mean? 107 00:05:05,360 --> 00:05:11,270 Well, mobility means, essentially, just getting around in a variety of ways. 108 00:05:11,270 --> 00:05:15,170 And one of those ways, that's very current and very relevant 109 00:05:15,170 --> 00:05:19,800 to CS50, I think, and what people from computer science backgrounds 110 00:05:19,800 --> 00:05:23,280 are going to end up doing in the future, is autonomous vehicles. 111 00:05:23,280 --> 00:05:26,300 So right now, there are 26 companies actively developing 112 00:05:26,300 --> 00:05:29,870 autonomous vehicle technologies-- ones you've heard of, of course, 113 00:05:29,870 --> 00:05:31,050 Google, Uber. 114 00:05:31,050 --> 00:05:35,970 But Tesla, General Motors, Ford, and every other car company you could name 115 00:05:35,970 --> 00:05:38,790 has a research project in this area. 116 00:05:38,790 --> 00:05:42,420 One Google scientist, Sebastian Thrun, said 117 00:05:42,420 --> 00:05:47,270 that the going rate for an acquisition of one scientist who's 118 00:05:47,270 --> 00:05:50,450 working in autonomous vehicles right now is $10 million. 
119 00:05:50,450 --> 00:05:54,462 So if you have an active team working on this, 120 00:05:54,462 --> 00:05:56,670 you could just get scooped up by one of these people. 121 00:05:56,670 --> 00:06:01,470 And that's how the math works out. 122 00:06:01,470 --> 00:06:06,370 Nevertheless, autonomous vehicles are here in some places. 123 00:06:06,370 --> 00:06:07,620 They're coming in some places. 124 00:06:07,620 --> 00:06:11,610 There are some autonomous taxis driving around Pittsburgh-- autonomous Ubers. 125 00:06:11,610 --> 00:06:13,470 And Google cars driving around too. 126 00:06:13,470 --> 00:06:15,050 But it's not really here yet. 127 00:06:15,050 --> 00:06:17,990 And it's not clear how they're going to work in an urban environment. 128 00:06:17,990 --> 00:06:19,490 That's still getting worked out. 129 00:06:19,490 --> 00:06:22,470 So in the meantime, let's focus on some other technologies that 130 00:06:22,470 --> 00:06:26,270 are really here, and are really growing very fast right now. 131 00:06:26,270 --> 00:06:30,660 One of those is, surprisingly to some people, bikeshare. 132 00:06:30,660 --> 00:06:37,020 And so bikeshare, in the last year, 1.5 billion trips were taken on bikeshare. 133 00:06:37,020 --> 00:06:40,920 There is now a company in China that's a bikesharing company that's 134 00:06:40,920 --> 00:06:45,480 valued at over a billion dollars, a so-called unicorn of bikesharing 135 00:06:45,480 --> 00:06:46,200 companies. 136 00:06:46,200 --> 00:06:50,160 So this is both a real commercial marketplace, and a real transportation 137 00:06:50,160 --> 00:06:52,870 solution, with 1.5 billion trips. 138 00:06:52,870 --> 00:06:56,180 Carshare has also been growing tremendously. 139 00:06:56,180 --> 00:07:01,110 And every single car share vehicle, like a Zipcar, or a Car2go vehicle, 140 00:07:01,110 --> 00:07:07,170 has been studied and has been shown to take up to about 8 to 12 vehicles 141 00:07:07,170 --> 00:07:08,060 off the road. 
142 00:07:08,060 --> 00:07:11,120 So private cars, people who have second private cars, 143 00:07:11,120 --> 00:07:14,240 will give up their car, because it costs a lot to maintain. 144 00:07:14,240 --> 00:07:16,980 And they'll use a carshare instead as their second car 145 00:07:16,980 --> 00:07:19,920 for those rare occasions when they actually need it. 146 00:07:19,920 --> 00:07:22,520 So this is still a pretty significant number of trips. 147 00:07:22,520 --> 00:07:25,730 It's grown tremendously in Europe, but it's still popular here 148 00:07:25,730 --> 00:07:27,500 in North America. 149 00:07:27,500 --> 00:07:32,340 Ridesharing, of course, everyone's familiar with Uber, and Lyft, and Didi, 150 00:07:32,340 --> 00:07:35,210 and the growth of all these services. 151 00:07:35,210 --> 00:07:40,170 It's still only four billion rides, but it's increasing very rapidly. 152 00:07:40,170 --> 00:07:43,830 So it's about twice as big as bikeshare right now. 153 00:07:43,830 --> 00:07:48,510 And then mass transit, I don't have the number off hand for how many people 154 00:07:48,510 --> 00:07:53,120 it's carrying, but it's a lot more than that, like by probably a factor of 10. 155 00:07:53,120 --> 00:07:58,170 And mass transit is still continuing to grow tremendously. 156 00:07:58,170 --> 00:08:01,470 Sometimes people think subways are old technology or something like that. 157 00:08:01,470 --> 00:08:03,230 But that's not actually true. 158 00:08:03,230 --> 00:08:07,080 And there have been 40 brand new metro systems 159 00:08:07,080 --> 00:08:10,310 built across the world in just the last decade, which basically means 160 00:08:10,310 --> 00:08:14,430 a doubling of the number of cities that have mass transit, so especially 161 00:08:14,430 --> 00:08:18,240 in countries like China, which have been building a ton of them. 162 00:08:18,240 --> 00:08:21,480 So mobility has really changed. 
163 00:08:21,480 --> 00:08:25,980 And the result of all this is that now, more than ever before, 164 00:08:25,980 --> 00:08:29,460 it's more complicated to try to get around cities. 165 00:08:29,460 --> 00:08:33,740 You have more choices, but you need solutions for using those choices 166 00:08:33,740 --> 00:08:37,690 and for getting informed about your different options. 167 00:08:37,690 --> 00:08:45,230 So this is just one example of how the diversity of things that people 168 00:08:45,230 --> 00:08:47,850 are doing in this space has exploded. 169 00:08:47,850 --> 00:08:50,520 You can find this on our website, but it includes 170 00:08:50,520 --> 00:08:53,670 different on-demand mobility options, and microtransit and stuff, 171 00:08:53,670 --> 00:08:57,050 as well as some of these other self-driving cars, 172 00:08:57,050 --> 00:08:59,750 and other associated technologies. 173 00:08:59,750 --> 00:09:03,380 OK, so mobility has changed. 174 00:09:03,380 --> 00:09:05,300 Tremendous amount of activity there. 175 00:09:05,300 --> 00:09:08,060 What else is happening in cities? 176 00:09:08,060 --> 00:09:11,940 And one of the trends that's enabling all these changes in mobility 177 00:09:11,940 --> 00:09:15,180 is what we call smart cities. 178 00:09:15,180 --> 00:09:19,180 And related to that is this concept called the internet of things. 179 00:09:19,180 --> 00:09:21,930 So I'll talk a little bit about the internet of things 180 00:09:21,930 --> 00:09:25,010 first, because it has a more specific definition. 181 00:09:25,010 --> 00:09:27,660 The internet of things is really putting 182 00:09:27,660 --> 00:09:31,200 sensors and other devices in the real world 183 00:09:31,200 --> 00:09:33,210 so that they are now connected to the internet, 184 00:09:33,210 --> 00:09:37,450 and are now enabled for technology. 
185 00:09:37,450 --> 00:09:39,960 The idea of a smart city is a city in which 186 00:09:39,960 --> 00:09:42,230 you take all these connected devices and you 187 00:09:42,230 --> 00:09:44,730 use that to get some sort of intelligence 188 00:09:44,730 --> 00:09:50,000 or some sort of operation that makes everything more efficient, often more 189 00:09:50,000 --> 00:09:52,470 sustainable, greener, less CO2. 190 00:09:52,470 --> 00:09:56,970 And so you see all of these things, solar, and carsharing, 191 00:09:56,970 --> 00:10:01,380 and energy generation, and houses, and everything are all connected together. 192 00:10:01,380 --> 00:10:05,240 And they all run efficiently like a giant machine. 193 00:10:05,240 --> 00:10:08,360 The reality of that is that a lot of this stuff is still emerging. 194 00:10:08,360 --> 00:10:10,110 Here are a few examples of some smart city 195 00:10:10,110 --> 00:10:12,380 technologies that you might see today. 196 00:10:12,380 --> 00:10:15,180 And they're all sensor-based, these ones. 197 00:10:15,180 --> 00:10:19,250 And they're all generating data that often is being collected by cities, 198 00:10:19,250 --> 00:10:22,140 but sometimes is available for you and I to use 199 00:10:22,140 --> 00:10:25,030 in different ways in our own projects. 200 00:10:25,030 --> 00:10:29,420 So here's an example of smart parking systems. 201 00:10:29,420 --> 00:10:35,150 So in the old days, cities had no idea who was parking in what space when. 202 00:10:35,150 --> 00:10:40,330 And so the reason that's a problem is because, in many cities, 33% of traffic 203 00:10:40,330 --> 00:10:43,540 is people circling looking for a parking spot. 204 00:10:43,540 --> 00:10:46,600 And so when they can't find one, they keep circling. 205 00:10:46,600 --> 00:10:48,640 And they keep causing massive amounts of congestion. 
206 00:10:48,640 --> 00:10:51,574 So the theory is, well, if we knew exactly what parking spots were 207 00:10:51,574 --> 00:10:53,740 available, we could direct people in their smartcars 208 00:10:53,740 --> 00:10:56,530 directly to those spaces, and cut down on all the congestion, 209 00:10:56,530 --> 00:10:59,650 and all the wasted gasoline. 210 00:10:59,650 --> 00:11:04,540 So smart parking is being done by both-- people put sensors 211 00:11:04,540 --> 00:11:07,510 in the parking spaces themselves. 212 00:11:07,510 --> 00:11:11,590 And that technology unfortunately needs a lot of maintenance, 213 00:11:11,590 --> 00:11:14,060 and batteries need to be replaced, and so on. 214 00:11:14,060 --> 00:11:18,520 So people are now looking at sensors that are on light posts. 215 00:11:18,520 --> 00:11:21,460 You see all these light poles around the urban environment. 216 00:11:21,460 --> 00:11:23,920 And those can be retrofitted to put sensors on them 217 00:11:23,920 --> 00:11:25,310 for a variety of things. 218 00:11:25,310 --> 00:11:30,310 One of them is monitoring using either video or some sort of laser sensor 219 00:11:30,310 --> 00:11:33,970 or something where the cars are parked. 220 00:11:33,970 --> 00:11:37,139 But you could also imagine other uses for that space. 221 00:11:37,139 --> 00:11:39,430 And so people are coming up with lots of neat proposals 222 00:11:39,430 --> 00:11:43,240 for how to use those sensor packs. 223 00:11:43,240 --> 00:11:46,900 Another technology, and this is one that actually is, I believe, 224 00:11:46,900 --> 00:11:49,420 deployed in Cambridge, Massachusetts here, 225 00:11:49,420 --> 00:11:54,430 is a network of microphones called Shotspotter. 226 00:11:54,430 --> 00:12:01,690 And what Shotspotter is is, if someone is shooting a gun in the city, 227 00:12:01,690 --> 00:12:04,630 these sensors, which are basically just microphones, 228 00:12:04,630 --> 00:12:06,550 will triangulate the location. 
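The talk does not say how the microphone network actually computes a position, so the following is only a minimal sketch of one common approach, time-difference-of-arrival localization. The microphone coordinates, the 1 km search square, the 10 m grid step, and the function names `arrival_times` and `locate` are all assumptions made for illustration, not details of any deployed system.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 degrees Celsius


def arrival_times(source, mics, t0=0.0):
    """Simulate when a sound emitted at `source` at time t0 reaches each microphone."""
    return [t0 + math.dist(source, m) / SPEED_OF_SOUND for m in mics]


def locate(mics, times, size=1000.0, step=10.0):
    """Grid-search the square [0, size] x [0, size] for the best-fitting source point."""
    best, best_err = None, float("inf")
    steps = int(size / step) + 1
    for i in range(steps):
        for j in range(steps):
            p = (i * step, j * step)
            dists = [math.dist(p, m) for m in mics]
            # Compare arrival-time differences relative to microphone 0,
            # so the unknown emission time cancels out.
            err = sum(
                ((times[k] - times[0]) - (dists[k] - dists[0]) / SPEED_OF_SOUND) ** 2
                for k in range(1, len(mics))
            )
            if err < best_err:
                best, best_err = p, err
    return best


# Hypothetical layout: four microphones at the corners of a 1 km square.
mics = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0)]
times = arrival_times((320.0, 740.0), mics, t0=12.5)
estimate = locate(mics, times)
print(estimate)
```

Only the differences between arrival times are used, because the sensors never learn the moment the sound was made. A real system would also have to cope with echoes, clock drift, and noisy timestamps, none of which this sketch models.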
229 00:12:06,550 --> 00:12:10,340 And in real time, that data then gets fed to the police department, 230 00:12:10,340 --> 00:12:16,900 so that they know where to start their search for this presumed gunman. 231 00:12:16,900 --> 00:12:19,600 You could imagine other more peaceful applications 232 00:12:19,600 --> 00:12:21,040 for this kind of technology. 233 00:12:21,040 --> 00:12:23,950 For instance, you've got someone with a motorcycle that's 234 00:12:23,950 --> 00:12:27,100 generating sound at 120 decibels. 235 00:12:27,100 --> 00:12:28,460 It's a real nuisance. 236 00:12:28,460 --> 00:12:30,911 And in the past, the police couldn't find them, 237 00:12:30,911 --> 00:12:33,160 because they just rode away as soon as you got there. 238 00:12:33,160 --> 00:12:35,320 But with a technology like this, maybe you 239 00:12:35,320 --> 00:12:39,310 can actually find them and stop them, confiscate their motorcycle, 240 00:12:39,310 --> 00:12:43,070 and restore peace and quiet to the neighborhood. 241 00:12:43,070 --> 00:12:46,720 Another set of technologies that's creating 242 00:12:46,720 --> 00:12:49,300 interesting sensor data-- and most of this data 243 00:12:49,300 --> 00:12:52,930 is private for privacy reasons, but is widely available 244 00:12:52,930 --> 00:12:56,320 in the commercial sector-- is technology that 245 00:12:56,320 --> 00:13:01,150 monitors your location through the use of your mobile phone, 246 00:13:01,150 --> 00:13:04,300 and your connection to the cellular towers, and the cellular networks. 247 00:13:04,300 --> 00:13:06,460 So they know you're in this car. 248 00:13:06,460 --> 00:13:07,660 You're driving. 249 00:13:07,660 --> 00:13:09,120 The cellular network here. 250 00:13:09,120 --> 00:13:11,590 It knows where you are, because it has to transfer you 251 00:13:11,590 --> 00:13:13,090 from one tower to another. 252 00:13:13,090 --> 00:13:15,760 And then that data says, you're moving. 253 00:13:15,760 --> 00:13:16,450 You're in a car. 
254 00:13:16,450 --> 00:13:18,190 And you're over there. 255 00:13:18,190 --> 00:13:21,520 It says, there's a train going by, and there's 200 people on the train, 256 00:13:21,520 --> 00:13:22,610 et cetera, et cetera. 257 00:13:22,610 --> 00:13:27,170 And so all of that location data can be used for a variety of purposes, 258 00:13:27,170 --> 00:13:29,080 including transportation planning, trying 259 00:13:29,080 --> 00:13:33,560 to figure out how congested things are, and what we should do about it. 260 00:13:33,560 --> 00:13:37,390 So these are all examples of smart city sensors. 261 00:13:37,390 --> 00:13:39,280 And so all of these things are collecting 262 00:13:39,280 --> 00:13:45,910 data that could be used to make cities run more efficiently and smarter. 263 00:13:45,910 --> 00:13:50,860 Now I'm a PhD in neuroscience. 264 00:13:50,860 --> 00:13:56,710 And so although I work mostly on vision, which is one of the five senses, 265 00:13:56,710 --> 00:14:00,190 I actually learned, during the course of my studies, 266 00:14:00,190 --> 00:14:02,690 that sensing is only sort of half of the problem. 267 00:14:02,690 --> 00:14:05,230 And there's a definition from biology you can use, 268 00:14:05,230 --> 00:14:09,070 which says that a sensor is something that transforms energy into data. 269 00:14:09,070 --> 00:14:11,680 So you're walking around the world and your senses 270 00:14:11,680 --> 00:14:13,600 are acquiring data about the world. 271 00:14:13,600 --> 00:14:19,030 At the same time, you as a person, as an animal or an organism, 272 00:14:19,030 --> 00:14:20,776 that's not the point. 273 00:14:20,776 --> 00:14:22,150 The point isn't just to get data. 274 00:14:22,150 --> 00:14:24,130 The point is actually to do things in the world. 275 00:14:24,130 --> 00:14:27,209 And so what you need are actuators or activators, things 276 00:14:27,209 --> 00:14:28,750 that transform that data into energy. 277 00:14:28,750 --> 00:14:30,220 For us, it's our muscles. 
278 00:14:30,220 --> 00:14:31,600 We walk over there. 279 00:14:31,600 --> 00:14:35,590 We take a drink, et cetera. 280 00:14:35,590 --> 00:14:38,860 For cities, a lot of the smart city stuff 281 00:14:38,860 --> 00:14:45,190 that you'll see in the media or just around the world is focused on sensing. 282 00:14:45,190 --> 00:14:47,950 And it's not focused on actually getting stuff done, 283 00:14:47,950 --> 00:14:51,920 making a change in the world, activating things. 284 00:14:51,920 --> 00:14:56,290 And so this is actually the area that we're working on with Transit Screen. 285 00:14:56,290 --> 00:15:00,730 And so our goal-- and you can see the Transit Screens in CS50 286 00:15:00,730 --> 00:15:05,680 is one very small part of that-- is to put real time 287 00:15:05,680 --> 00:15:09,370 information all around the world in places where you live, 288 00:15:09,370 --> 00:15:13,960 places where you work, like right here in CS50, places where you play, 289 00:15:13,960 --> 00:15:18,070 like these bar jukeboxes that we just launched with TouchTunes that 290 00:15:18,070 --> 00:15:20,450 now have transit information. 291 00:15:20,450 --> 00:15:24,410 You can have city halls, on the streets, or when you're traveling, 292 00:15:24,410 --> 00:15:26,570 and vacationing at a hotel. 293 00:15:26,570 --> 00:15:32,750 So what all this information is provided to do 294 00:15:32,750 --> 00:15:35,900 is to make people make different decisions, 295 00:15:35,900 --> 00:15:37,710 and make people make better decisions. 296 00:15:37,710 --> 00:15:39,380 They might be more efficient decisions. 297 00:15:39,380 --> 00:15:41,750 For them, they might be more sustainable decisions. 298 00:15:41,750 --> 00:15:44,242 Usually when you give people more information, 299 00:15:44,242 --> 00:15:46,700 they'll shift away from whatever their default behavior is. 
300 00:15:46,700 --> 00:15:49,520 And right now, in transportation, people's default behavior 301 00:15:49,520 --> 00:15:50,270 is pretty bad. 302 00:15:50,270 --> 00:15:52,160 It's usually, I get in my car and I drive, 303 00:15:52,160 --> 00:15:54,330 or I take an Uber, or something like that. 304 00:15:54,330 --> 00:15:57,280 So what you want is you want people to see the choices. 305 00:15:57,280 --> 00:15:59,600 Most of the choices are actually more sustainable. 306 00:15:59,600 --> 00:16:03,930 And so you end up with a better result when you provide the information. 307 00:16:03,930 --> 00:16:09,094 The key is that, in a smart city, people are the activators. 308 00:16:09,094 --> 00:16:10,010 They're the actuators. 309 00:16:10,010 --> 00:16:13,520 So you need to actually provide people with the information. 310 00:16:13,520 --> 00:16:15,560 It's not enough just to collect the data. 311 00:16:15,560 --> 00:16:17,300 You actually have to turn it into information, 312 00:16:17,300 --> 00:16:22,220 and then you have to get people to use it to change their behavior. 313 00:16:22,220 --> 00:16:26,000 So here's one of our transit screens close up 314 00:16:26,000 --> 00:16:28,850 here from Cabot House, Harvard. 315 00:16:28,850 --> 00:16:31,760 Just got some very nice feedback about this one. 316 00:16:31,760 --> 00:16:37,880 We had one of the CS50 TFs, who chooses his breakfast location so that he 317 00:16:37,880 --> 00:16:42,990 can see this transit screen, and see whether the shuttle is coming, 318 00:16:42,990 --> 00:16:44,840 or maybe he'll take a Hubway bikeshare. 319 00:16:44,840 --> 00:16:47,000 He has a choice based on that. 320 00:16:47,000 --> 00:16:49,840 And so that's a great example of behavioral change. 321 00:16:49,840 --> 00:16:52,460 He's changing where he sits in the dining hall. 322 00:16:52,460 --> 00:17:00,680 And then based on that, he's also changing what kind of travel he uses. 
323 00:17:00,680 --> 00:17:05,599 So what other examples are there of behavioral change? 324 00:17:05,599 --> 00:17:10,700 Well, one very interesting example is another company 325 00:17:10,700 --> 00:17:14,869 called Opower that's also based in the Washington, DC area. 326 00:17:14,869 --> 00:17:17,690 And what they did is they took all this smart city 327 00:17:17,690 --> 00:17:20,750 data about people's energy usages. 328 00:17:20,750 --> 00:17:24,619 And you might have an apartment or a home, 329 00:17:24,619 --> 00:17:27,020 and you use energy at a certain rate. 330 00:17:27,020 --> 00:17:29,510 This is how many kilowatt hours you used. 331 00:17:29,510 --> 00:17:34,130 And then this is how much you paid last month for your energy bills. 332 00:17:34,130 --> 00:17:39,810 Now this is the average, because they have data from everyone in Cambridge, 333 00:17:39,810 --> 00:17:40,310 say. 334 00:17:40,310 --> 00:17:43,970 They know what the average home in Cambridge uses, in terms of energy. 335 00:17:43,970 --> 00:17:49,310 And then, in this case, this person here, Garrett, he's using more. 336 00:17:49,310 --> 00:17:53,540 And so the objective of Opower's software 337 00:17:53,540 --> 00:17:56,690 here is that they have this dashboard here, 338 00:17:56,690 --> 00:17:59,000 so that he can see that he is using more. 339 00:17:59,000 --> 00:18:03,110 And maybe there are some things he can do, specifically, turn off lights, 340 00:18:03,110 --> 00:18:05,955 or compete against other people to reduce his energy usage. 341 00:18:05,955 --> 00:18:07,080 There are things he can do. 342 00:18:07,080 --> 00:18:11,060 They're kind of long term things, but they can, ultimately, save him money, 343 00:18:11,060 --> 00:18:13,530 and improve the environment. 344 00:18:13,530 --> 00:18:16,340 So this is something where they're taking 345 00:18:16,340 --> 00:18:20,762 data that was previously unavailable, because meters didn't used to be smart. 
346 00:18:20,762 --> 00:18:21,470 But now they are. 347 00:18:21,470 --> 00:18:23,390 They're feeding that data into the internet. 348 00:18:23,390 --> 00:18:27,080 And Opower built a business using a combination of web technologies, 349 00:18:27,080 --> 00:18:30,440 and also they even mail this information to you in a letter, 350 00:18:30,440 --> 00:18:32,780 because they know you're more likely to look at it. 351 00:18:32,780 --> 00:18:38,270 They were able to find ways to reduce energy usage in different cities 352 00:18:38,270 --> 00:18:40,250 around the country. 353 00:18:40,250 --> 00:18:44,030 And coincidentally enough, Opower was actually 354 00:18:44,030 --> 00:18:46,340 founded by a couple of our other classmates, mine 355 00:18:46,340 --> 00:18:50,840 and David's, these guys, Alex and Dan. 356 00:18:50,840 --> 00:18:54,050 And they actually went public a couple of years ago. 357 00:18:54,050 --> 00:18:56,420 And this is them on the New York Stock Exchange floor. 358 00:18:56,420 --> 00:19:00,620 So there's a lot of real impact and also real potential 359 00:19:00,620 --> 00:19:02,360 in some of these kinds of businesses that 360 00:19:02,360 --> 00:19:06,290 are based on the idea of smart cities. 361 00:19:06,290 --> 00:19:13,030 All right, so that's kind of the background for smart cities and data, 362 00:19:13,030 --> 00:19:17,450 and both the sensor and the activator side. 363 00:19:17,450 --> 00:19:21,200 So what I want to talk about next is a little bit 364 00:19:21,200 --> 00:19:25,440 of how you can get started by yourself in working in this area. 365 00:19:25,440 --> 00:19:30,680 And really the easiest way to get started is with open data. 366 00:19:30,680 --> 00:19:36,340 And in case you haven't heard of open data, the idea of open data is this. 367 00:19:36,340 --> 00:19:40,070 It's data that can be freely used, reused, or redistributed by anyone, 368 00:19:40,070 --> 00:19:44,090 subject only to, at most, you say it's from a certain source. 
369 00:19:44,090 --> 00:19:49,670 And so it's often required to be available in machine-readable formats. 370 00:19:49,670 --> 00:19:53,690 So if I give you a stack of paper, that's not open data. 371 00:19:53,690 --> 00:19:56,690 It's got to be in data format. 372 00:19:56,690 --> 00:20:00,100 And then it also ought to be available in its entirety. 373 00:20:00,100 --> 00:20:04,910 So if I have to log into some website, and take a screenshot, 374 00:20:04,910 --> 00:20:09,350 or just scrape the web or something like that, that's not open data either. 375 00:20:09,350 --> 00:20:13,250 I have to be able to get it like a table, like an Excel file. 376 00:20:13,250 --> 00:20:18,280 Or I have to be able to get some sort of interchange format like XML, 377 00:20:18,280 --> 00:20:20,350 or JSON, or these other kinds of things. 378 00:20:20,350 --> 00:20:25,000 So open data is often generated by the government, but not always. 379 00:20:25,000 --> 00:20:28,760 Because governments have to be public and transparent about their data. 380 00:20:28,760 --> 00:20:30,730 So they have to share it with citizens. 381 00:20:30,730 --> 00:20:34,570 And the result of this-- and there's been a lot of open data momentum 382 00:20:34,570 --> 00:20:39,130 recently-- the result of this is that it's a really good way to get started. 383 00:20:39,130 --> 00:20:43,900 Because with some of this other stuff like this data, 384 00:20:43,900 --> 00:20:47,240 a few years ago, you weren't able to get this data from the energy companies. 385 00:20:47,240 --> 00:20:50,290 Your utility, you'd say, well, I have a really great idea 386 00:20:50,290 --> 00:20:53,760 and I'd like to be able to do something to make my home more efficient. 387 00:20:53,760 --> 00:20:54,760 And they'd say, too bad. 388 00:20:54,760 --> 00:20:56,830 You can't have the data. 389 00:20:56,830 --> 00:20:57,580 You're a customer. 390 00:20:57,580 --> 00:20:59,350 You're not our partner. 
391 00:20:59,350 --> 00:21:02,260 But with the government involved and generating this data, 392 00:21:02,260 --> 00:21:06,910 for instance, the Boston T, the MBTA, is a government agency. 393 00:21:06,910 --> 00:21:10,450 And so they provide open data that's available to a lot of people, 394 00:21:10,450 --> 00:21:14,170 and without very many restrictions on it. 395 00:21:14,170 --> 00:21:17,500 So one of the impacts of this is that you can develop your app, 396 00:21:17,500 --> 00:21:20,450 or you can even develop a business based on open data. 397 00:21:20,450 --> 00:21:24,820 And so you're no longer required, say, for the T-- 398 00:21:24,820 --> 00:21:27,900 I'm not even sure if they have their own MBTA app or not. 399 00:21:27,900 --> 00:21:29,740 But there are a lot of other apps you can 400 00:21:29,740 --> 00:21:34,000 use on your mobile phone that allow you to get information about them. 401 00:21:34,000 --> 00:21:36,430 Transit Screen is another example of something 402 00:21:36,430 --> 00:21:42,640 that uses open data from the T, as well as dozens and dozens, if not hundreds, 403 00:21:42,640 --> 00:21:46,060 of other agencies around the world. 404 00:21:46,060 --> 00:21:48,506 And then another nice thing about this is 405 00:21:48,506 --> 00:21:51,130 that there are many different people contributing to open data. 406 00:21:51,130 --> 00:21:53,590 So it doesn't just have to be sourced from the government. 407 00:21:53,590 --> 00:21:55,660 Someone can come along and say, well, I think 408 00:21:55,660 --> 00:21:57,649 this data set would be better in this format, 409 00:21:57,649 --> 00:21:59,440 or it would be easier to use, or something. 410 00:21:59,440 --> 00:22:01,990 And you can clean it up, and then redistribute it. 411 00:22:01,990 --> 00:22:05,499 And then the results of all this kind of activity 412 00:22:05,499 --> 00:22:07,540 is that people can build little businesses on it. 
413 00:22:07,540 --> 00:22:10,810 And so I'm just going to plug the startup incubator that we 414 00:22:10,810 --> 00:22:13,840 come from, which is in Washington, DC. 415 00:22:13,840 --> 00:22:15,400 And it's called 1776. 416 00:22:15,400 --> 00:22:17,800 And their mission is to solve hard problems 417 00:22:17,800 --> 00:22:21,340 in the areas of cities, and government, and health care, and education, 418 00:22:21,340 --> 00:22:24,220 and transportation, a lot of which can be 419 00:22:24,220 --> 00:22:26,664 addressed using some form of open data. 420 00:22:26,664 --> 00:22:29,980 421 00:22:29,980 --> 00:22:32,620 Like I said, the US federal government has 422 00:22:32,620 --> 00:22:36,430 gotten behind this crusade of openness and transparency. 423 00:22:36,430 --> 00:22:39,530 And so the Obama Administration had a pronouncement. 424 00:22:39,530 --> 00:22:41,650 Then they created an independent authority 425 00:22:41,650 --> 00:22:45,430 under the US CTO's office to promote open data. 426 00:22:45,430 --> 00:22:49,210 And this was me meeting the President at 1776. 427 00:22:49,210 --> 00:22:52,330 I just wanted to show this slide. 428 00:22:52,330 --> 00:22:56,740 All right, so I'm going to walk you through a couple 429 00:22:56,740 --> 00:23:01,180 of different examples of open data, and things you can actually use. 430 00:23:01,180 --> 00:23:03,790 These ones are ones that I know well, because they're 431 00:23:03,790 --> 00:23:06,340 relevant to transportation for the most part. 432 00:23:06,340 --> 00:23:10,360 But there's a whole variety of different sources out there that you can go to. 433 00:23:10,360 --> 00:23:13,510 So I'm going to talk about four different areas very quickly. 
434 00:23:13,510 --> 00:23:19,540 One of them is open geographic data, so maps, OpenStreetMap specifically, 435 00:23:19,540 --> 00:23:23,680 which is like the Wikipedia of maps, GTFS, 436 00:23:23,680 --> 00:23:26,500 which is how transit schedules are represented, 437 00:23:26,500 --> 00:23:31,270 and it's been a very successful open data standard, Realtime APIs-- 438 00:23:31,270 --> 00:23:33,170 so this is just transit schedules, but if you 439 00:23:33,170 --> 00:23:36,010 want to know where a train is right now, you 440 00:23:36,010 --> 00:23:40,870 need to use a Realtime API-- and then energy data. 441 00:23:40,870 --> 00:23:45,250 So there's some new standards there that we can talk about. 442 00:23:45,250 --> 00:23:48,160 So OpenStreetMap, a lot of people have heard about Google Maps. 443 00:23:48,160 --> 00:23:49,480 They're kind of the standard. 444 00:23:49,480 --> 00:23:52,300 Some people have even heard of Apple Maps. 445 00:23:52,300 --> 00:23:58,450 And OpenStreetMap is less well-known, but it underlies a tremendous amount 446 00:23:58,450 --> 00:23:59,980 of activity on the internet. 447 00:23:59,980 --> 00:24:03,230 And it's really at sort of world-class standards right now. 448 00:24:03,230 --> 00:24:07,990 So this was Sochi Olympics, the Winter Olympics before the last summer 449 00:24:07,990 --> 00:24:08,830 Olympics. 450 00:24:08,830 --> 00:24:12,430 And during that time-- Sochi is a Russian city that not a lot of people 451 00:24:12,430 --> 00:24:13,750 are familiar with. 452 00:24:13,750 --> 00:24:15,880 They kind of built it up for the Olympics. 453 00:24:15,880 --> 00:24:18,700 This is what OpenStreetMap looked like during that Olympics. 454 00:24:18,700 --> 00:24:22,551 So if you were a visitor, and you were trying to find your way around, 455 00:24:22,551 --> 00:24:23,800 you can see all the buildings. 456 00:24:23,800 --> 00:24:26,900 You can see all the new paths, the Olympic Park, et cetera. 
457 00:24:26,900 --> 00:24:30,580 And this is what it looked like in Google Maps. 458 00:24:30,580 --> 00:24:36,430 So Google, which, reminder, is a company that has hundreds of billions 459 00:24:36,430 --> 00:24:39,730 of dollars sitting around, didn't really prepare for the Olympics, 460 00:24:39,730 --> 00:24:41,920 or even get the data together. 461 00:24:41,920 --> 00:24:45,220 Whereas a community of people, just like Wikipedia editors, 462 00:24:45,220 --> 00:24:48,010 people generating their own open data, managed 463 00:24:48,010 --> 00:24:49,780 to achieve this in OpenStreetMap. 464 00:24:49,780 --> 00:24:54,010 So last Olympics, it wasn't quite as clear 465 00:24:54,010 --> 00:24:56,860 a distinction, because Google actually put a lot of money 466 00:24:56,860 --> 00:25:01,250 into making sure that the last summer Olympics wasn't like this for them. 467 00:25:01,250 --> 00:25:03,850 468 00:25:03,850 --> 00:25:07,341 So OpenStreetMap is one of these open data 469 00:25:07,341 --> 00:25:10,090 sets that's not generated by a government, although in many cases, 470 00:25:10,090 --> 00:25:11,530 it uses government data. 471 00:25:11,530 --> 00:25:13,900 It's generated by crowdsourcing, like Wikipedia. 472 00:25:13,900 --> 00:25:16,960 So here's an interesting map showing the number of edits 473 00:25:16,960 --> 00:25:19,990 in each of these geographic areas. 474 00:25:19,990 --> 00:25:23,190 One thing to notice is that, in many of these areas in Europe, 475 00:25:23,190 --> 00:25:27,400 there's-- I think this is per square kilometer or something-- 476 00:25:27,400 --> 00:25:33,040 it's like over 500,000 edits per square kilometer, which is just amazing. 477 00:25:33,040 --> 00:25:37,420 You're really converging very quickly on sort of ground truth 478 00:25:37,420 --> 00:25:39,340 when you have that kind of activity. 479 00:25:39,340 --> 00:25:41,700 And the US is not too far behind. 
480 00:25:41,700 --> 00:25:44,560 481 00:25:44,560 --> 00:25:47,340 So you, yourself, can actually edit OpenStreetMap. 482 00:25:47,340 --> 00:25:48,130 It's very easy. 483 00:25:48,130 --> 00:25:52,159 You just go to OpenStreetMap.org and start editing stuff that you know. 484 00:25:52,159 --> 00:25:55,200 One thing you can do with it-- and I think this is kind of a neat thing-- 485 00:25:55,200 --> 00:25:59,872 is in some cities, the sidewalks and pedestrian paths 486 00:25:59,872 --> 00:26:01,080 aren't very well represented. 487 00:26:01,080 --> 00:26:07,710 So let's say that you have an elderly relative who uses like a wheelchair, 488 00:26:07,710 --> 00:26:09,930 or a scooter, or something like that, they're not 489 00:26:09,930 --> 00:26:12,030 going to be able to get from point A to point B, 490 00:26:12,030 --> 00:26:18,082 or to have an app tell them how they can safely get from one point to another. 491 00:26:18,082 --> 00:26:20,040 They'll just have to discover it by themselves. 492 00:26:20,040 --> 00:26:23,320 But you can actually go into OpenStreetMap and edit these paths. 493 00:26:23,320 --> 00:26:25,530 Say, here's an accessible path. 494 00:26:25,530 --> 00:26:27,930 There's a curb here, so if you're using a wheelchair, 495 00:26:27,930 --> 00:26:30,090 you can't jump over that, and so on. 496 00:26:30,090 --> 00:26:34,410 And then this all gets put into trip planning apps, 497 00:26:34,410 --> 00:26:40,390 so that, when you're using one of these apps like a city navigation app, 498 00:26:40,390 --> 00:26:42,360 it will actually now give you these paths. 499 00:26:42,360 --> 00:26:44,850 And you could build an app for someone who's 500 00:26:44,850 --> 00:26:48,270 using a wheelchair that says, how can I actually get from point A 501 00:26:48,270 --> 00:26:50,430 to point B in a safe way? 502 00:26:50,430 --> 00:26:53,100 And so you can build that for your local community, 503 00:26:53,100 --> 00:26:56,190 and then extend that out as you grow.
504 00:26:56,190 --> 00:27:00,330 So I think this is really an interesting use of open data 505 00:27:00,330 --> 00:27:02,530 where you can actually solve a real problem. 506 00:27:02,530 --> 00:27:05,940 And then you can also maybe build an app, or even a company 507 00:27:05,940 --> 00:27:08,010 on something like that. 508 00:27:08,010 --> 00:27:09,750 Here's another interesting use of data. 509 00:27:09,750 --> 00:27:18,540 You can use this open satellite data to measure the solar potential 510 00:27:18,540 --> 00:27:20,310 of different buildings in the city. 511 00:27:20,310 --> 00:27:22,560 So this is a solar map of Cambridge, Massachusetts. 512 00:27:22,560 --> 00:27:26,280 And this is showing that some parts of the roof 513 00:27:26,280 --> 00:27:29,700 here could actually have solar installations on here that 514 00:27:29,700 --> 00:27:34,230 could theoretically generate a lot of power for this building. 515 00:27:34,230 --> 00:27:38,730 And then you could even measure, using open data from the electric utility 516 00:27:38,730 --> 00:27:40,710 about how much they charge and so on, you 517 00:27:40,710 --> 00:27:43,626 could say, well, how long will it take me to pay off the solar panels, 518 00:27:43,626 --> 00:27:45,090 and so on? 519 00:27:45,090 --> 00:27:47,070 Does CS50 have solar panels? 520 00:27:47,070 --> 00:27:48,250 Not yet. 521 00:27:48,250 --> 00:27:51,600 All right, working on it. 522 00:27:51,600 --> 00:27:54,950 So another interesting application of all this Geodata 523 00:27:54,950 --> 00:27:56,580 is you can do custom styles. 524 00:27:56,580 --> 00:27:59,100 And there's a great company we work with called 525 00:27:59,100 --> 00:28:01,800 Mapbox that's based in Washington, DC. 526 00:28:01,800 --> 00:28:04,530 And they've built a whole business on open data 527 00:28:04,530 --> 00:28:07,800 and on using different ways to interact with it. 
528 00:28:07,800 --> 00:28:10,980 So they have a styling language that's kind 529 00:28:10,980 --> 00:28:15,510 of like how you use HTML to make the web pages look different. 530 00:28:15,510 --> 00:28:18,700 They've created a language for making maps look different. 531 00:28:18,700 --> 00:28:22,920 And so these all look completely different. 532 00:28:22,920 --> 00:28:25,650 And yet, they're all generated with the same open data, 533 00:28:25,650 --> 00:28:29,640 with the same OpenStreetMap data, just with different styling rules applied. 534 00:28:29,640 --> 00:28:32,400 So you guys could actually see these examples online. 535 00:28:32,400 --> 00:28:36,120 You could edit them, and change them yourself, and come up 536 00:28:36,120 --> 00:28:39,570 with maps that look ranging from, this is very similar to Google Maps, 537 00:28:39,570 --> 00:28:45,570 to something that's more like a hiking map, or sort of a '50s style map, 538 00:28:45,570 --> 00:28:48,690 or even a sort of comic book kind of map. 539 00:28:48,690 --> 00:28:54,690 So all these different styles are now available because the data exists. 540 00:28:54,690 --> 00:28:56,820 So that's geographical data. 541 00:28:56,820 --> 00:29:00,870 And that's a really important source of knowledge about the world. 542 00:29:00,870 --> 00:29:05,830 Lots of great solutions to urban problems can be found with Geodata. 543 00:29:05,830 --> 00:29:08,370 Another kind of open data that I know very well, 544 00:29:08,370 --> 00:29:11,820 and that I think has been a real success story, is called GTFS. 545 00:29:11,820 --> 00:29:14,744 And this is the General Transit Feed Specification. 546 00:29:14,744 --> 00:29:15,660 You don't need to know that. 547 00:29:15,660 --> 00:29:18,790 Basically, all you need to know is that it's transit schedules. 548 00:29:18,790 --> 00:29:22,770 So when is the bus running from A to B, and where?
549 00:29:22,770 --> 00:29:26,190 So there are a couple of places where you can explore this kind of data 550 00:29:26,190 --> 00:29:27,870 and get started. 551 00:29:27,870 --> 00:29:31,330 One great one is called transit feeds. 552 00:29:31,330 --> 00:29:37,074 And just search for transit feeds and you can find it. 553 00:29:37,074 --> 00:29:38,490 You can actually click through it. 554 00:29:38,490 --> 00:29:39,960 This is San Francisco. 555 00:29:39,960 --> 00:29:41,310 You can find the latest file. 556 00:29:41,310 --> 00:29:43,050 You can look for a certain route. 557 00:29:43,050 --> 00:29:46,740 And then the data is all available to download that powers 558 00:29:46,740 --> 00:29:48,540 this particular visualization. 559 00:29:48,540 --> 00:29:52,110 Another really nice tool that's been developing very quickly from a company 560 00:29:52,110 --> 00:29:56,220 called Mapzen that we work with is called Transitland. 561 00:29:56,220 --> 00:30:02,160 And so Transitland is what's called a data commons, which 562 00:30:02,160 --> 00:30:05,100 is that they're taking all open data from all over the world, all 563 00:30:05,100 --> 00:30:07,650 over the planet, and bringing it together in one place, 564 00:30:07,650 --> 00:30:10,530 so that you can see everything sort of on equal footing. 565 00:30:10,530 --> 00:30:15,090 You have data from Sydney, Australia, and then you 566 00:30:15,090 --> 00:30:16,410 have data from San Francisco. 567 00:30:16,410 --> 00:30:18,900 And it's all in the same standard and the same format. 568 00:30:18,900 --> 00:30:22,290 So you can take a solution that you develop that works for San Francisco, 569 00:30:22,290 --> 00:30:25,654 and you can immediately make it work for Sydney, Australia. 570 00:30:25,654 --> 00:30:28,880 571 00:30:28,880 --> 00:30:32,500 So GTFS, one of the reasons I like it, and one of the reasons 572 00:30:32,500 --> 00:30:35,350 I'm talking about it, is that it's actually really simple.
573 00:30:35,350 --> 00:30:39,500 And so it's plain text files, and it's written in tables. 574 00:30:39,500 --> 00:30:43,310 So if you download a GTFS file, it's like a zip file. 575 00:30:43,310 --> 00:30:45,760 And it's got this stuff in it. 576 00:30:45,760 --> 00:30:47,170 These things all have formats. 577 00:30:47,170 --> 00:30:52,309 But if you just double click on one of them, stops.txt, open that up, 578 00:30:52,309 --> 00:30:53,600 and this is what it looks like. 579 00:30:53,600 --> 00:30:58,390 It's got a list of stops, the names of those stops, latitudes and longitudes, 580 00:30:58,390 --> 00:31:00,730 so you can plot it on your OpenStreetMap, 581 00:31:00,730 --> 00:31:02,450 and then some other information here. 582 00:31:02,450 --> 00:31:05,890 So it's the kind of data that you can really start exploring 583 00:31:05,890 --> 00:31:10,060 and start building on very readily. 584 00:31:10,060 --> 00:31:12,190 Of course, it's a little bit more complicated. 585 00:31:12,190 --> 00:31:14,770 And if you really want to do some complicated things, 586 00:31:14,770 --> 00:31:18,460 you'll have to start working a little bit with databases, or at least 587 00:31:18,460 --> 00:31:20,410 matching data from one thing to another. 588 00:31:20,410 --> 00:31:23,170 But it's not terribly complex. 589 00:31:23,170 --> 00:31:25,180 And there are tools that are available that 590 00:31:25,180 --> 00:31:27,280 are mostly open source, easy to download, 591 00:31:27,280 --> 00:31:30,140 that can be used for this kind of stuff. 592 00:31:30,140 --> 00:31:32,650 So for instance, those stops I showed you 593 00:31:32,650 --> 00:31:38,080 are linked to particular stop times, and to certain bus trips, and so on. 594 00:31:38,080 --> 00:31:40,780 So you can actually go from one thing to another, 595 00:31:40,780 --> 00:31:45,100 and match all these things up in the database.
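To make the stops.txt description concrete, here is a minimal sketch in Python of reading a GTFS stops table and matching it to stop times. The column names (stop_id, stop_name, stop_lat, stop_lon, trip_id, arrival_time) are standard GTFS fields; the sample rows, stop names, and the helper names read_table and arrivals_by_stop are made up for illustration, and a real feed would be read from the files inside the downloaded zip rather than from inline strings.

```python
import csv
import io

# Hypothetical sample rows standing in for two files from a GTFS zip.
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Main St,42.3730,-71.1190
S2,Oak Ave,42.3660,-71.1050
"""

STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:00:00,08:00:00,S1,1
T1,08:07:00,08:07:00,S2,2
T2,08:30:00,08:30:00,S1,1
"""


def read_table(text):
    """Parse one GTFS text file (CSV with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def arrivals_by_stop(stops, stop_times):
    """Match each stop name to its scheduled arrival times via the shared stop_id."""
    names = {row["stop_id"]: row["stop_name"] for row in stops}
    result = {name: [] for name in names.values()}
    for row in stop_times:
        result[names[row["stop_id"]]].append(row["arrival_time"])
    return result


stops = read_table(STOPS_TXT)
stop_times = read_table(STOP_TIMES_TXT)
schedule = arrivals_by_stop(stops, stop_times)
for name, times in schedule.items():
    print(name, times)
```

The dictionary lookup on stop_id is the same kind of matching a database join would do; for a full-size feed, loading the tables into a database is likely the more practical route.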
596 00:31:45,100 --> 00:31:48,940 Another interesting source of data from around here is bikeshare data. 597 00:31:48,940 --> 00:31:51,640 And a lot of bikeshare data is open. 598 00:31:51,640 --> 00:31:54,070 In DC, I helped open that up. 599 00:31:54,070 --> 00:31:59,020 But then other cities, like Boston, have just done that from the beginning. 600 00:31:59,020 --> 00:32:01,810 And so here's an interesting example of someone 601 00:32:01,810 --> 00:32:06,590 who-- they had a contest for visualizing Hubway data. 602 00:32:06,590 --> 00:32:12,390 And so that resulted in a really nice visualization of trips versus, I think, 603 00:32:12,390 --> 00:32:13,960 distance. 604 00:32:13,960 --> 00:32:18,555 And they have actual trip history here, so you can see not just your trips, 605 00:32:18,555 --> 00:32:20,680 or just what the stations are, but you can actually 606 00:32:20,680 --> 00:32:25,070 see how many people have traveled from what part to what other part. 607 00:32:25,070 --> 00:32:30,100 And so that's the kind of more the sort of monitoring data, 608 00:32:30,100 --> 00:32:32,940 or the real historical data that we're talking about. 609 00:32:32,940 --> 00:32:34,690 A lot of that is available with bikeshare. 610 00:32:34,690 --> 00:32:36,700 So if you're interested in what you could 611 00:32:36,700 --> 00:32:39,100 do if you had data from everyone's cellular records 612 00:32:39,100 --> 00:32:41,502 of how they got from point A to point B, then bikeshare 613 00:32:41,502 --> 00:32:44,710 is an interesting way you might want to get started with something like that, 614 00:32:44,710 --> 00:32:45,701 because it is open. 615 00:32:45,701 --> 00:32:48,790 616 00:32:48,790 --> 00:32:51,340 So all of that data is static data. 617 00:32:51,340 --> 00:32:53,050 The maps are just static. 618 00:32:53,050 --> 00:32:54,820 The schedule never changes. 619 00:32:54,820 --> 00:32:58,742 And the bikeshare data is the history of the bikeshare system. 
620 00:32:58,742 --> 00:33:00,700 When you're talking about real time data, stuff 621 00:33:00,700 --> 00:33:04,690 that's actually happening right now in real time, 622 00:33:04,690 --> 00:33:06,670 you're not talking about data sets anymore. 623 00:33:06,670 --> 00:33:08,860 You're really talking about APIs. 624 00:33:08,860 --> 00:33:13,490 And I think APIs are covered later in the course, sort of towards the end. 625 00:33:13,490 --> 00:33:20,050 But basically, all an API is is the way you talk to another program, 626 00:33:20,050 --> 00:33:22,180 or the way you talk to a database about something. 627 00:33:22,180 --> 00:33:25,900 And so you can get information from that program, or from that database, 628 00:33:25,900 --> 00:33:27,280 that you can then use. 629 00:33:27,280 --> 00:33:29,890 And if that database is changing, is updating 630 00:33:29,890 --> 00:33:33,730 in real time, then you have that information and it's up to the minute, 631 00:33:33,730 --> 00:33:34,600 or up to the second. 632 00:33:34,600 --> 00:33:35,620 And you can use it. 633 00:33:35,620 --> 00:33:41,000 So a lot of APIs can actually be accessed just through your web browser. 634 00:33:41,000 --> 00:33:43,690 So in many cases, if you have an API, you 635 00:33:43,690 --> 00:33:50,440 can just type the name of the so-called endpoint into your web browser. 636 00:33:50,440 --> 00:33:54,750 And then you'll get out, in a readable format-- this one is called JSON, 637 00:33:54,750 --> 00:33:57,220 J-S-O-N-- you can get information. 638 00:33:57,220 --> 00:33:59,830 So here's one from a transit service. 639 00:33:59,830 --> 00:34:02,282 And it says, well, here are the stop times. 640 00:34:02,282 --> 00:34:03,490 There's an arrival coming up. 641 00:34:03,490 --> 00:34:05,050 Don't worry about the numbers. 642 00:34:05,050 --> 00:34:06,880 This is in a particular format. 643 00:34:06,880 --> 00:34:10,719 But the point is, this one is just 0 seconds from the stop. 
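As a rough sketch of what a program does with a JSON response like the one described, the Python below parses a response body and prints upcoming arrivals. Everything about the response is an assumption for illustration: the field names (stop, arrivals, route, seconds_away), the station name, and the values are invented, and each real transit API defines its own endpoint and format. The body is inlined as a string so the example runs without a network call.

```python
import json

# Hypothetical response body; real transit APIs use their own field names.
RESPONSE = """
{
  "stop": "Example Station",
  "arrivals": [
    {"route": "Red", "seconds_away": 0},
    {"route": "Red", "seconds_away": 240},
    {"route": "Blue", "seconds_away": 95}
  ]
}
"""


def describe_arrivals(body):
    """Turn the JSON text into display lines, soonest arrival first."""
    data = json.loads(body)
    lines = []
    for arrival in sorted(data["arrivals"], key=lambda a: a["seconds_away"]):
        secs = arrival["seconds_away"]
        when = "arriving now" if secs == 0 else f"{secs // 60} min {secs % 60} s"
        lines.append(f'{arrival["route"]}: {when}')
    return lines


board = describe_arrivals(RESPONSE)
print("\n".join(board))
```

Because the data is structured, the program can ask for exactly the pieces it wants by name, which is what makes this kind of response easy to put on a screen.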
644 00:34:10,719 --> 00:34:13,210 So the train has just gotten to the stop. 645 00:34:13,210 --> 00:34:17,409 And now you know that because you called that API, and you got information back. 646 00:34:17,409 --> 00:34:19,179 And now your program can use that too. 647 00:34:19,179 --> 00:34:21,159 You can put it up on a screen, like we do. 648 00:34:21,159 --> 00:34:24,699 Or you can do other things with it. 649 00:34:24,699 --> 00:34:27,060 APIs are readable by code. 650 00:34:27,060 --> 00:34:28,870 This is Twitter. 651 00:34:28,870 --> 00:34:30,520 And this is someone's tweet. 652 00:34:30,520 --> 00:34:32,560 This is not what I would consider to be an API. 653 00:34:32,560 --> 00:34:38,199 You can get tweets from an API, but it's not the same, 654 00:34:38,199 --> 00:34:41,170 because this information isn't structured. 655 00:34:41,170 --> 00:34:42,820 It's just someone's typing. 656 00:34:42,820 --> 00:34:46,139 And it's very hard to know what to do with that to write a program that 657 00:34:46,139 --> 00:34:47,889 can use all of this text. 658 00:34:47,889 --> 00:34:50,949 Whereas with this, I can write a program that says, I want this, 659 00:34:50,949 --> 00:34:53,020 then I want that, then I want this part. 660 00:34:53,020 --> 00:34:54,699 So there's a big difference. 661 00:34:54,699 --> 00:34:58,680 And it's a lot easier to work with APIs that generate machine-readable data 662 00:34:58,680 --> 00:35:00,910 than other data. 663 00:35:00,910 --> 00:35:06,520 Another key property of APIs is that some of them are giving you bulk data, 664 00:35:06,520 --> 00:35:11,560 and some of them are like family style, and some are giving you single serving. 665 00:35:11,560 --> 00:35:14,020 So we just went to a Chinese restaurant. 666 00:35:14,020 --> 00:35:15,160 There was a buffet. 667 00:35:15,160 --> 00:35:16,770 You could take as much as you want. 
668 00:35:16,770 --> 00:35:20,230 If you're a particularly rude, you could just grab everything and walk out 669 00:35:20,230 --> 00:35:21,770 the door with it. 670 00:35:21,770 --> 00:35:25,660 APIs are like that, except of course, you can get to the food, 671 00:35:25,660 --> 00:35:27,670 and someone else can also get the food. 672 00:35:27,670 --> 00:35:31,420 And so some of them will tell you about, for instance, where every train is 673 00:35:31,420 --> 00:35:34,810 in the entire metro system in DC here. 674 00:35:34,810 --> 00:35:37,560 And so one API call will tell you all of that. 675 00:35:37,560 --> 00:35:42,102 There are also some APIs for metro that will tell you about a particular stop. 676 00:35:42,102 --> 00:35:45,060 And so you just ask about one stop, but that's your single-serving API. 677 00:35:45,060 --> 00:35:53,200 And that can say, there's a train right now going to 16th and Colorado. 678 00:35:53,200 --> 00:36:01,720 So want to show you another example just related to the energy data stuff 679 00:36:01,720 --> 00:36:03,640 that I was talking about. 680 00:36:03,640 --> 00:36:09,191 So there's also a website for energy data, 681 00:36:09,191 --> 00:36:10,690 like I showed in the Opower example. 682 00:36:10,690 --> 00:36:13,090 And it's called it's called Green Button. 683 00:36:13,090 --> 00:36:18,220 And what it is USdata.gov office and the EPA 684 00:36:18,220 --> 00:36:20,960 created a standard for interchanging energy data. 685 00:36:20,960 --> 00:36:27,370 So if you have an apartment, or a condo, or whatever house, 686 00:36:27,370 --> 00:36:32,290 and you're paying energy bills to a certain set of utilities, 687 00:36:32,290 --> 00:36:37,810 they are required to give you access to your own historical energy usage data. 688 00:36:37,810 --> 00:36:42,640 So there should be, on the Customer Portal website, 689 00:36:42,640 --> 00:36:44,740 a log in that you can get. 
690 00:36:44,740 --> 00:36:47,910 And then there should be one of these green buttons that 691 00:36:47,910 --> 00:36:50,684 allows you to access your own data, and pull that data down. 692 00:36:50,684 --> 00:36:52,600 And then you can do whatever you want with it. 693 00:36:52,600 --> 00:36:54,430 You can plot your own historical use. 694 00:36:54,430 --> 00:36:55,930 You can try to optimize it. 695 00:36:55,930 --> 00:36:57,760 You can compare it to other things. 696 00:36:57,760 --> 00:37:00,810 And so this is also now an open data ecosystem. 697 00:37:00,810 --> 00:37:03,190 And last time I looked, there were about 250 apps 698 00:37:03,190 --> 00:37:05,440 people had built using this kind of green button data. 699 00:37:05,440 --> 00:37:08,150 So this is another interesting place to get started 700 00:37:08,150 --> 00:37:11,020 if you have paying a utility bill. 701 00:37:11,020 --> 00:37:14,160 Not sure if Harvard is doing this right now for students. 702 00:37:14,160 --> 00:37:19,270 But if you're living off campus, you can probably get this. 703 00:37:19,270 --> 00:37:23,470 So some places you can get inspired to come up 704 00:37:23,470 --> 00:37:28,470 with different project ideas for smart cities using open data, one of them 705 00:37:28,470 --> 00:37:30,640 is, like I said, data.gov. 706 00:37:30,640 --> 00:37:35,380 And data.gov is a place where you can find almost an infinite number 707 00:37:35,380 --> 00:37:36,810 of open data sets. 708 00:37:36,810 --> 00:37:39,330 It can be a little bit overwhelming, just trying 709 00:37:39,330 --> 00:37:41,180 to figure out what's available there. 710 00:37:41,180 --> 00:37:44,347 But they let you sort them by different types of things 711 00:37:44,347 --> 00:37:45,430 that you're interested in. 712 00:37:45,430 --> 00:37:47,180 You might be interested in transportation. 713 00:37:47,180 --> 00:37:48,720 You might be interested in energy. 714 00:37:48,720 --> 00:37:50,560 You might be interested in agriculture. 
715 00:37:50,560 --> 00:37:52,740 All of those data sets are available there. 716 00:37:52,740 --> 00:37:57,000 And you can find a catalog of them, and then start generating ideas 717 00:37:57,000 --> 00:38:00,510 for how you might use some of that data to get some insight, 718 00:38:00,510 --> 00:38:05,100 or maybe build an app that would be useful to people. 719 00:38:05,100 --> 00:38:09,370 There's another class of kind of places like that called developer portals. 720 00:38:09,370 --> 00:38:13,060 And so, for instance, for MBTA, for the Boston T, 721 00:38:13,060 --> 00:38:15,810 they have a developer portal that you can use. 722 00:38:15,810 --> 00:38:18,590 And you get log in for that. 723 00:38:18,590 --> 00:38:20,830 And then you get access to their open data. 724 00:38:20,830 --> 00:38:25,520 You get access to their APIs as well, and a variety of other things. 725 00:38:25,520 --> 00:38:30,970 So often, if you're looking to find some open data, 726 00:38:30,970 --> 00:38:35,540 you might be able to look for it by looking for a developer site. 727 00:38:35,540 --> 00:38:37,570 There are some other catalogs of other APIs. 728 00:38:37,570 --> 00:38:40,890 There are a lot of private APIs from other services, 729 00:38:40,890 --> 00:38:44,080 so things you might use-- Yelp, if you use 730 00:38:44,080 --> 00:38:47,100 that to search for restaurants or local businesses, 731 00:38:47,100 --> 00:38:48,340 they have an API you can use. 732 00:38:48,340 --> 00:38:50,310 Foursquare has an API. 733 00:38:50,310 --> 00:38:53,520 A variety of other different commercial APIs exist. 734 00:38:53,520 --> 00:38:56,760 Google Maps, you can use their data to some extent. 735 00:38:56,760 --> 00:39:00,540 And all these things have some way to access them. 736 00:39:00,540 --> 00:39:01,770 And it's often free. 737 00:39:01,770 --> 00:39:06,120 So you can find some of them just by searching for developer portals, 738 00:39:06,120 --> 00:39:07,680 like Yelp developer portal. 
739 00:39:07,680 --> 00:39:10,390 But there are also other catalogs you can use too, 740 00:39:10,390 --> 00:39:15,850 like publicapis.com is one that seems fairly recent that I came across. 741 00:39:15,850 --> 00:39:19,000 There's an old one called programmableweb.com 742 00:39:19,000 --> 00:39:24,640 that's been around for a long time, and has a lot of APIs, some of which 743 00:39:24,640 --> 00:39:28,660 go back almost 10 or even 15 years. 744 00:39:28,660 --> 00:39:31,960 So all of these places are good ways to sort of discover the richness of what's 745 00:39:31,960 --> 00:39:35,310 available, in terms of APIs. 746 00:39:35,310 --> 00:39:35,950 and open data. 747 00:39:35,950 --> 00:39:39,180 And then green button data, that's the one that I mentioned for their energy 748 00:39:39,180 --> 00:39:40,200 data as well. 749 00:39:40,200 --> 00:39:43,352 And even if you don't have a apartment, or house, 750 00:39:43,352 --> 00:39:45,310 and you don't pay a utility bill, you can still 751 00:39:45,310 --> 00:39:47,700 get test data from these kinds of sites that 752 00:39:47,700 --> 00:39:52,300 would allow you to build an app that would be based on that test data. 753 00:39:52,300 --> 00:39:58,060 And many of these will also allow you to test certain things against them. 754 00:39:58,060 --> 00:40:02,680 So I hope that's been sort of a rapid-fire introduction 755 00:40:02,680 --> 00:40:06,750 to some of the different aspects of data and APIs, 756 00:40:06,750 --> 00:40:10,770 and especially where it relates to cities, and the kinds of data sources 757 00:40:10,770 --> 00:40:15,390 that you have in cities, and for solving urban problems. 758 00:40:15,390 --> 00:40:19,570 So I just want to wrap up a little bit by showing 759 00:40:19,570 --> 00:40:21,450 some more slides of our team. 
760 00:40:21,450 --> 00:40:26,580 This is our DC team for Transit Screen, and of course, 761 00:40:26,580 --> 00:40:31,180 mention that we're very interested in having any of the CS50 students, 762 00:40:31,180 --> 00:40:34,870 or anyone else who's interested, please drop us a line. 763 00:40:34,870 --> 00:40:38,430 If you're interested in a summer internship, or a job, or whatever, 764 00:40:38,430 --> 00:40:41,760 please let us know. 765 00:40:41,760 --> 00:40:43,270 We tend to be very mission driven. 766 00:40:43,270 --> 00:40:46,420 And we're focused on making cities more sustainable, 767 00:40:46,420 --> 00:40:48,750 promoting walkability, urbanization, and public health. 768 00:40:48,750 --> 00:40:51,929 And our own operations are actually carbon neutral. 769 00:40:51,929 --> 00:40:54,970 And I wish I had a better way to do that and to show that with open data. 770 00:40:54,970 --> 00:41:00,330 But we've managed to audit our own energy usage and prove that. 771 00:41:00,330 --> 00:41:03,240 And then also, zero of our employees commute by car. 772 00:41:03,240 --> 00:41:05,580 So we get around a lot of fun ways in the city. 773 00:41:05,580 --> 00:41:09,430 And so I just wanted to say, sort of in conclusion, 774 00:41:09,430 --> 00:41:14,010 the way that my personal opinion about this goes, 775 00:41:14,010 --> 00:41:18,600 is that in a truly smart city is not one just where all the data gets 776 00:41:18,600 --> 00:41:22,140 brought together and kind of funneled into some sort of silo 777 00:41:22,140 --> 00:41:24,060 to be used by people in the government. 
778 00:41:24,060 --> 00:41:27,180 It's one where data is open and data is shared, 779 00:41:27,180 --> 00:41:29,700 so that that data and the technology can be 780 00:41:29,700 --> 00:41:36,780 used to improve the lives of everyone, from the janitor to the CEO, people 781 00:41:36,780 --> 00:41:42,570 who are old, people who are young, and just the whole variety of residents 782 00:41:42,570 --> 00:41:43,870 and citizens of the city. 783 00:41:43,870 --> 00:41:46,830 So I really tend to think about smart citizens 784 00:41:46,830 --> 00:41:49,210 as being the goal, rather than smart cities. 785 00:41:49,210 --> 00:41:55,930 And I think that's a good way to keep in mind the real problems that 786 00:41:55,930 --> 00:42:00,120 need to be addressed, and the real problems that can be solved. 787 00:42:00,120 --> 00:42:01,680 So this is my email. 788 00:42:01,680 --> 00:42:02,850 It's matt@transitscreen.com. 789 00:42:02,850 --> 00:42:05,200 Feel free to drop me a line. 790 00:42:05,200 --> 00:42:10,250 This is a picture from 1999 of me in Leverett House Dining Hall, 791 00:42:10,250 --> 00:42:12,030 which is, I believe, closed right now. 792 00:42:12,030 --> 00:42:15,780 This painting was hung upside down for half the year that year, 793 00:42:15,780 --> 00:42:16,612 and no one noticed. 794 00:42:16,612 --> 00:42:19,660 795 00:42:19,660 --> 00:42:24,000 And these are a couple shots of me on the computer programming competition. 796 00:42:24,000 --> 00:42:27,490 Especially if you're just starting CS50 and you're interested, 797 00:42:27,490 --> 00:42:29,820 I think there are tryout processes for these. 798 00:42:29,820 --> 00:42:35,640 So you can go and try to get yourself on the international programming team. 799 00:42:35,640 --> 00:42:39,492 I got to go to exciting San Jose this year. 800 00:42:39,492 --> 00:42:41,200 And this year was a little more exciting, 801 00:42:41,200 --> 00:42:45,990 we got to go to Amsterdam for the international finals.
802 00:42:45,990 --> 00:42:51,030 Look at these computers, just going to say. 803 00:42:51,030 --> 00:42:54,380 And then a bit later, I got to-- see, the computer 804 00:42:54,380 --> 00:42:56,120 is a lot newer in this one. 805 00:42:56,120 --> 00:42:59,340 That's all I'm going to say about that one. 806 00:42:59,340 --> 00:43:01,230 All right, so thanks very much. 807 00:43:01,230 --> 00:43:03,510 And I appreciate your attention. 808 00:43:03,510 --> 00:43:05,621 And again, please drop me a line if you'd like. 809 00:43:05,621 --> 00:43:08,412 And I'd love to take any questions from the audience at this point. 810 00:43:08,412 --> 00:43:11,010 811 00:43:11,010 --> 00:43:11,902 Yeah. 812 00:43:11,902 --> 00:43:13,240 AUDIENCE: Thanks so much, Matt. 813 00:43:13,240 --> 00:43:14,198 This is really helpful. 814 00:43:14,198 --> 00:43:15,527 My name's Amira. 815 00:43:15,527 --> 00:43:18,020 I'm a CS50 student. 816 00:43:18,020 --> 00:43:21,232 I feel like, when I hear about smart cities, the topics 817 00:43:21,232 --> 00:43:22,940 that you brought up come into play a lot. 818 00:43:22,940 --> 00:43:25,700 A lot of it is about transportation, and a bit about energy. 819 00:43:25,700 --> 00:43:27,080 Where aren't people looking? 820 00:43:27,080 --> 00:43:29,260 And where do you think the next wave of progress in smart cities 821 00:43:29,260 --> 00:43:30,120 could be? 822 00:43:30,120 --> 00:43:36,539 MATT: Yeah, well, back when we started Transit Screen, 823 00:43:36,539 --> 00:43:38,830 people weren't really working in transportation enough. 824 00:43:38,830 --> 00:43:43,560 So that was one area that we thought was really open for exploration. 825 00:43:43,560 --> 00:43:47,910 I think that's less true now than it was at the time. 826 00:43:47,910 --> 00:43:49,740 I think I'd sort of turn that back to you 827 00:43:49,740 --> 00:43:52,770 and say, what are you really interested in?
828 00:43:52,770 --> 00:43:57,190 And where do you see data that's kind of being underused? 829 00:43:57,190 --> 00:44:01,260 Because there's definitely-- if you think of, 830 00:44:01,260 --> 00:44:06,220 what are the different sectors that make up a city, there's housing. 831 00:44:06,220 --> 00:44:08,680 There's transportation. 832 00:44:08,680 --> 00:44:09,450 There's retail. 833 00:44:09,450 --> 00:44:10,320 There's energy. 834 00:44:10,320 --> 00:44:14,790 There's all these different inputs and outputs to the cities. 835 00:44:14,790 --> 00:44:18,560 And some of those are going to be on the verge of opening up more data, 836 00:44:18,560 --> 00:44:22,570 or where you can put together the data you 837 00:44:22,570 --> 00:44:26,010 need from interesting sources, kind of combine things in ways 838 00:44:26,010 --> 00:44:28,740 that people haven't looked at before. 839 00:44:28,740 --> 00:44:33,490 So for instance, I'll just give an example off the top of my head. 840 00:44:33,490 --> 00:44:35,950 I talked to a guy who is in DC. 841 00:44:35,950 --> 00:44:40,590 And he was in a sort of software development training program. 842 00:44:40,590 --> 00:44:43,872 And he was a labor inspector. 843 00:44:43,872 --> 00:44:46,830 So he was looking for workers who were being cheated out of their wages 844 00:44:46,830 --> 00:44:50,850 by businesses that were unethical. 845 00:44:50,850 --> 00:44:55,320 And he thought, well, there's no data set 846 00:44:55,320 --> 00:44:58,554 I can use that will tell me whether a business is shady or not. 847 00:44:58,554 --> 00:45:01,470 I mean, maybe you can find bad reviews on Yelp or something like that, 848 00:45:01,470 --> 00:45:03,600 but maybe not.
849 00:45:03,600 --> 00:45:09,490 So he thought, well, what else would be a sign that the business is either 850 00:45:09,490 --> 00:45:14,220 shady, or may be falling apart and unable to manage itself properly, 851 00:45:14,220 --> 00:45:17,520 which makes it likely to underpay its workers? 852 00:45:17,520 --> 00:45:21,490 And health inspection data for restaurants 853 00:45:21,490 --> 00:45:25,200 turns out to be an open data set that's available in a lot of cities, including 854 00:45:25,200 --> 00:45:28,450 New York City, and I think maybe even Cambridge. 855 00:45:28,450 --> 00:45:33,190 So he took that data set and combined it with some internal data 856 00:45:33,190 --> 00:45:37,620 sets that he had access to, and showed that actually, it was true. 857 00:45:37,620 --> 00:45:41,920 If you just went to inspect restaurants that were already failing their health 858 00:45:41,920 --> 00:45:47,369 exams, you could probably also predict, with some accuracy, 859 00:45:47,369 --> 00:45:49,660 which ones were likely to be underpaying their workers, 860 00:45:49,660 --> 00:45:52,826 because those are probably restaurants that are falling apart, or something. 861 00:45:52,826 --> 00:45:58,190 So I thought that was a really interesting example of using a data 862 00:45:58,190 --> 00:46:00,700 set that was maybe not intuitive. 863 00:46:00,700 --> 00:46:02,820 And so I think there's a little bit of creativity 864 00:46:02,820 --> 00:46:05,250 that you have to pull in when it comes to this. 865 00:46:05,250 --> 00:46:07,521 And it really should be kind of problem-focused. 866 00:46:07,521 --> 00:46:10,818 867 00:46:10,818 --> 00:46:15,150 AUDIENCE: I was wondering, in terms of Transit Screen, 868 00:46:15,150 --> 00:46:19,132 how is the data for people who drive, and if they see your transit screen, 869 00:46:19,132 --> 00:46:22,265 they would-- because what I think is people 870 00:46:22,265 --> 00:46:24,916 who use a transit screen are people who use transit anyway.
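[Editor's sketch] The health-inspection story above amounts to joining an open data set with an internal one on a shared business identifier and comparing outcomes between groups. The following is a minimal illustration of that idea only, not the inspector's actual method: every record, field name (`business_id`, `passed_health_inspection`, `wage_violation`), and the function name are made up for this example.

```python
# All records below are hypothetical, invented for illustration only.
# "inspections" stands in for an open health-inspection data set;
# "wage_cases" stands in for an internal data set of wage investigations.
inspections = [
    {"business_id": "r1", "passed_health_inspection": False},
    {"business_id": "r2", "passed_health_inspection": True},
    {"business_id": "r3", "passed_health_inspection": False},
    {"business_id": "r4", "passed_health_inspection": True},
    {"business_id": "r5", "passed_health_inspection": False},
    {"business_id": "r6", "passed_health_inspection": True},  # no wage record
]
wage_cases = [
    {"business_id": "r1", "wage_violation": True},
    {"business_id": "r2", "wage_violation": False},
    {"business_id": "r3", "wage_violation": False},
    {"business_id": "r4", "wage_violation": False},
    {"business_id": "r5", "wage_violation": True},
]


def violation_rate_by_inspection(inspections, wage_cases):
    """Join the two data sets on business_id and return the share of
    businesses with a wage violation, split by inspection outcome."""
    violations = {row["business_id"]: row["wage_violation"] for row in wage_cases}
    totals = {"failed": 0, "passed": 0}
    flagged = {"failed": 0, "passed": 0}
    for row in inspections:
        business_id = row["business_id"]
        if business_id not in violations:
            continue  # skip businesses that appear in only one data set
        group = "passed" if row["passed_health_inspection"] else "failed"
        totals[group] += 1
        flagged[group] += int(violations[business_id])
    # None marks a group with no matched businesses.
    return {g: (flagged[g] / totals[g] if totals[g] else None) for g in totals}


rates = violation_rate_by_inspection(inspections, wage_cases)
print(rates)
```

A noticeably higher rate in the "failed" group would suggest, but not prove, that failed health inspections are a useful signal for where to look; with real data you would want far more records and a proper statistical test.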
871 00:46:24,916 --> 00:46:28,030 So you're not really taking away from people who actually drive. 872 00:46:28,030 --> 00:46:29,010 MATT: Yeah. 873 00:46:29,010 --> 00:46:30,140 AUDIENCE: How does-- yeah. 874 00:46:30,140 --> 00:46:34,640 MATT: Yeah, so this gets into sort of interesting areas of something called 875 00:46:34,640 --> 00:46:37,640 mobility management or transportation demand management, which is, 876 00:46:37,640 --> 00:46:41,870 instead of paying a ton of money to build bigger roads, 877 00:46:41,870 --> 00:46:47,810 how can we be more efficient and change people's demand behavior 878 00:46:47,810 --> 00:46:50,990 for those roads, or for transit, or something? 879 00:46:50,990 --> 00:46:54,210 And there are a lot of incentives you can use. 880 00:46:54,210 --> 00:46:56,330 Information is sort of fundamental. 881 00:46:56,330 --> 00:46:59,470 Because if people don't have information about a bus, that they don't know 882 00:46:59,470 --> 00:47:02,510 it runs by them, they're never going to use it. 883 00:47:02,510 --> 00:47:07,340 So someone who might actually have a very convenient trip by bus 884 00:47:07,340 --> 00:47:09,980 might just drive because they've never heard of the bus before. 885 00:47:09,980 --> 00:47:10,896 They've never seen it. 886 00:47:10,896 --> 00:47:13,780 So there's a real education component we try to address. 887 00:47:13,780 --> 00:47:15,530 And the other thing we try to do is we try 888 00:47:15,530 --> 00:47:19,990 to be in places where people have just moved. 889 00:47:19,990 --> 00:47:23,420 So you just move into a new condo building, 890 00:47:23,420 --> 00:47:26,720 and so you could do a variety of things. 891 00:47:26,720 --> 00:47:27,950 You might have a car. 892 00:47:27,950 --> 00:47:29,930 You might bring the car and pay for parking. 893 00:47:29,930 --> 00:47:33,370 Or you might decide to leave the car and just use transit, and bikeshare, 894 00:47:33,370 --> 00:47:34,370 and Uber.
895 00:47:34,370 --> 00:47:36,470 And so if we can get you at that point when 896 00:47:36,470 --> 00:47:39,950 you're making that kind of decision to keep the car, to get rid of the car, 897 00:47:39,950 --> 00:47:44,120 to start driving to work, or to start learning how to use transit 898 00:47:44,120 --> 00:47:46,550 to get to work-- if we can get you then, then we 899 00:47:46,550 --> 00:47:48,620 have the ability to change your behavior. 900 00:47:48,620 --> 00:47:50,360 And we have a special amount of leverage. 901 00:47:50,360 --> 00:47:53,871 It's when people are new to a place, or people's habits are already 902 00:47:53,871 --> 00:47:55,370 changing, that you have the ability. 903 00:47:55,370 --> 00:48:00,010 And there's some good research in this area in sort of behavioral change. 904 00:48:00,010 --> 00:48:05,010 There are a few other books I'd like to mention in that area. 905 00:48:05,010 --> 00:48:07,220 There's one called Nudge, which is a really 906 00:48:07,220 --> 00:48:10,550 sort of interesting popular read about how you can influence 907 00:48:10,550 --> 00:48:14,180 people's behavior using economic principles, 908 00:48:14,180 --> 00:48:16,730 written by a guy who taught here. 909 00:48:16,730 --> 00:48:24,050 And there's a couple other books in the area called action design. 910 00:48:24,050 --> 00:48:27,560 And so this is about design as it applies 911 00:48:27,560 --> 00:48:31,460 to getting people to change their behavior for the better. 912 00:48:31,460 --> 00:48:36,760 And so there's a whole interesting area of exploration there. 913 00:48:36,760 --> 00:48:37,260 Yeah. 914 00:48:37,260 --> 00:48:38,760 AUDIENCE: Oh, hi. 915 00:48:38,760 --> 00:48:41,260 My name is [INAUDIBLE] and I work at the Planning Office 916 00:48:41,260 --> 00:48:43,233 at Harvard, the campus planning. 917 00:48:43,233 --> 00:48:47,070 So you're talking about the smart cities, so in your eyes, 918 00:48:47,070 --> 00:48:50,070 [INAUDIBLE], what is a smart campus?
919 00:48:50,070 --> 00:48:58,233 And what kind of data is smart data that students want to see? 920 00:48:58,233 --> 00:49:01,937 But considering that a lot of data is actually private, 921 00:49:01,937 --> 00:49:06,070 so you cannot expose it, so [INAUDIBLE]. 922 00:49:06,070 --> 00:49:09,080 MATT: Yeah, so the question is bringing it 923 00:49:09,080 --> 00:49:13,640 close to home, what makes a smart campus, as opposed 924 00:49:13,640 --> 00:49:15,800 to sort of a broader smart city? 925 00:49:15,800 --> 00:49:20,360 And I think the fact that you are from the Office of Planning, 926 00:49:20,360 --> 00:49:22,730 and the University has an office of planning, 927 00:49:22,730 --> 00:49:26,450 sort of underscores the idea that a campus is kind 928 00:49:26,450 --> 00:49:28,130 of like a miniature city in many ways. 929 00:49:28,130 --> 00:49:31,940 I'm sure you look at it under a similar lens. 930 00:49:31,940 --> 00:49:38,210 And so I would say that the areas for a smart campus, 931 00:49:38,210 --> 00:49:42,050 obviously, there are some standards that everyone on the campus 932 00:49:42,050 --> 00:49:43,850 can agree on, similar to the city. 933 00:49:43,850 --> 00:49:46,260 You want it to be a sustainable campus. 934 00:49:46,260 --> 00:49:49,510 You want it to be an efficient campus. 935 00:49:49,510 --> 00:49:54,680 And you want people to feel safe on campus, and so on. 936 00:49:54,680 --> 00:49:58,820 So I think really, you look at all the different areas, 937 00:49:58,820 --> 00:50:02,480 whether it's law enforcement, or it's sustainability, 938 00:50:02,480 --> 00:50:07,160 or transportation, everything where there's a similarity between a campus 939 00:50:07,160 --> 00:50:13,730 and the city, all of those areas are areas where you 940 00:50:13,730 --> 00:50:17,260 can help make the student body smarter. 941 00:50:17,260 --> 00:50:21,350 At the same time, there's also a workforce who has to get to the campus.
942 00:50:21,350 --> 00:50:23,690 So you're dealing with those transportation problems 943 00:50:23,690 --> 00:50:24,960 at the same time. 944 00:50:24,960 --> 00:50:29,530 And so how do you get people from the suburbs, or from Boston? 945 00:50:29,530 --> 00:50:31,760 And how do you get them in here in a way that 946 00:50:31,760 --> 00:50:34,740 doesn't create problems for everyone? 947 00:50:34,740 --> 00:50:39,740 And so I think, for a campus, you need to think a little bit more 948 00:50:39,740 --> 00:50:42,350 expansively, not just about the footprint of the campus, 949 00:50:42,350 --> 00:50:45,050 but about the whole network of how everyone 950 00:50:45,050 --> 00:50:49,990 is coming to and from the campus. 951 00:50:49,990 --> 00:50:54,530 Transit Screen, just to tell us another short story, 952 00:50:54,530 --> 00:50:57,200 comes from a city called Arlington, Virginia. 953 00:50:57,200 --> 00:51:01,250 We actually did it within the city government as a pilot project 954 00:51:01,250 --> 00:51:02,930 before we launched the startup. 955 00:51:02,930 --> 00:51:05,810 And Arlington is a sort of interesting, almost 956 00:51:05,810 --> 00:51:10,100 like a campus, in the way that it has only 150,000 people, but almost 957 00:51:10,100 --> 00:51:11,960 250,000 people working in it. 958 00:51:11,960 --> 00:51:14,840 So there's a flood of people coming in every day, 959 00:51:14,840 --> 00:51:19,010 because the Pentagon is there, and a variety of other major employers. 960 00:51:19,010 --> 00:51:23,260 So there's a real need for solutions that take into account 961 00:51:23,260 --> 00:51:25,259 the whole footprint of the campus. 962 00:51:25,259 --> 00:51:27,050 That's kind of the direction I would go in. 963 00:51:27,050 --> 00:51:29,547 But love to talk with you about that more. 
964 00:51:29,547 --> 00:51:33,800 965 00:51:33,800 --> 00:51:34,300 [INAUDIBLE] 966 00:51:34,300 --> 00:51:37,868 AUDIENCE: [INAUDIBLE] sensors, who has access to information from the sensors? 967 00:51:37,868 --> 00:51:40,010 Can anyone request access? 968 00:51:40,010 --> 00:51:44,185 MATT: Yeah, so that's the challenge, is that a lot-- like, 969 00:51:44,185 --> 00:51:46,750 for instance, the gunshot detectors. 970 00:51:46,750 --> 00:51:47,920 That's the police's thing. 971 00:51:47,920 --> 00:51:50,110 They're not going to share that. 972 00:51:50,110 --> 00:51:53,620 But sometimes with sensors, you can get a certain data set, 973 00:51:53,620 --> 00:51:58,750 or you can get a historical data set that preserves some kind of privacy. 974 00:51:58,750 --> 00:52:02,950 So for instance, with New York City, you can 975 00:52:02,950 --> 00:52:07,900 get a data set that has all the taxi trips that were taken in the city. 976 00:52:07,900 --> 00:52:11,020 Or with bikeshare, you've got all the history of the bikeshare trips. 977 00:52:11,020 --> 00:52:18,280 And so some of those were collected with sensors. 978 00:52:18,280 --> 00:52:21,460 But you can't get them in real time, but you can get them historically. 979 00:52:21,460 --> 00:52:24,910 And so I would focus on that kind of data. 980 00:52:24,910 --> 00:52:30,590 Or of course, you might even try to put your own sensor out somewhere. 981 00:52:30,590 --> 00:52:34,270 If you're interested in video, put a camera up somewhere, 982 00:52:34,270 --> 00:52:36,723 and start collecting that data. 983 00:52:36,723 --> 00:52:39,889 AUDIENCE: So what would incentivize those companies to release their real time 984 00:52:39,889 --> 00:52:42,234 data for everyone? 985 00:52:42,234 --> 00:52:44,110 Why would companies do that? 986 00:52:44,110 --> 00:52:45,410 MATT: Generally they don't.
987 00:52:45,410 --> 00:52:47,662 They don't share that kind of sensor data because-- 988 00:52:47,662 --> 00:52:52,105 AUDIENCE: But we are able to access API, like the transit information. 989 00:52:52,105 --> 00:52:54,400 Is that just like the government wants to? 990 00:52:54,400 --> 00:52:57,910 MATT: So there are two things that make companies often share their data. 991 00:52:57,910 --> 00:53:01,240 One is if there's a public policy reason, like the government says, 992 00:53:01,240 --> 00:53:05,230 you should have open data because of transparency. 993 00:53:05,230 --> 00:53:10,960 Let's say the subway isn't running very well, or it's not running on time, 994 00:53:10,960 --> 00:53:13,450 maybe people should have the right to look at the data 995 00:53:13,450 --> 00:53:17,229 and say whether it's running on time or not. 996 00:53:17,229 --> 00:53:18,770 Subways don't always see it that way. 997 00:53:18,770 --> 00:53:21,070 But that might be one reason to release the data. 998 00:53:21,070 --> 00:53:24,550 The other reason people release data is often just related 999 00:53:24,550 --> 00:53:26,650 to marketing and promotion. 1000 00:53:26,650 --> 00:53:30,430 So why do we get data from so many public and private transportation 1001 00:53:30,430 --> 00:53:31,840 providers at Transit Screen? 1002 00:53:31,840 --> 00:53:35,740 Because we promote their services on public screens. 1003 00:53:35,740 --> 00:53:39,460 And so we're free advertising for these kinds of agencies. 1004 00:53:39,460 --> 00:53:43,750 And so if you can identify a common interest 1005 00:53:43,750 --> 00:53:46,960 where-- another thing is, maybe a company doesn't 1006 00:53:46,960 --> 00:53:49,840 have enough people to do all the analysis they'd like. 1007 00:53:49,840 --> 00:53:52,720 Maybe they don't have data scientists, but they have data. 
1008 00:53:52,720 --> 00:53:56,350 So you might be able to say, well, if you share this data with me, 1009 00:53:56,350 --> 00:54:01,240 I can do some things with it that might provide insight or information to you 1010 00:54:01,240 --> 00:54:04,910 that make you run your business better. 1011 00:54:04,910 --> 00:54:08,172 So there are a variety of different pitches from the regulation. 1012 00:54:08,172 --> 00:54:09,880 There's even something called the Freedom 1013 00:54:09,880 --> 00:54:12,130 of Information Act, where you can request data 1014 00:54:12,130 --> 00:54:14,770 from a lot of public agencies. 1015 00:54:14,770 --> 00:54:16,530 It's pretty easy to use. 1016 00:54:16,530 --> 00:54:20,570 There's a site called muckrock.com that's really good for it. 1017 00:54:20,570 --> 00:54:23,740 But then, there's often some kind of common interest. 1018 00:54:23,740 --> 00:54:26,779 And I think that's maybe, in many cases with private companies, 1019 00:54:26,779 --> 00:54:27,695 is a better way to go. 1020 00:54:27,695 --> 00:54:30,220 1021 00:54:30,220 --> 00:54:32,970 All right, thank you very much. 1022 00:54:32,970 --> 00:54:34,593