DAVID J. MALAN: All right, welcome, everyone. Thank you so much for coming. So I'm really happy to have a classmate of mine, Matt, from the class of 1999, where both of us were undergrads some time ago. And we've been super excited to bring Transit Screen-- and Matt especially-- to campus, since you might have seen here in CS50's office, or in Cabot House, or in Currier House, and soon elsewhere, these so-called Transit Screens that allow students to get up-to-the-minute information on when the next Harvard shuttle is, when the next Uber is, when the next Hubway bikes are available, and the like. So today Matt is here with us to talk about smart cities more generally, and the power of harnessing this kind of data. So over to you, Matt. MATT: Great. DAVID J. MALAN: Thank you, Matt. MATT: Thank you, David. And if you ever get tired of teaching CS50, we might have a sales job for you. All right. DAVID J. MALAN: Welcome to campus. MATT: Thank you. Well great. So I'm going to talk a little bit about what I do, and the kind of company that I and my colleagues and co-founders have built at Transit Screen. But I also want to talk more generally about urban data, about smart cities data, and about how you too can get started building interesting interventions that can help solve problems in cities. And I'm going to take a little bit of a step back, sort of a 30,000 foot view at the beginning, and explain why cities and why now. And so the most salient thing you can take away here is that there are more people living inside this circle on the globe than outside it. And we have a lot of international students here. So of course, they appreciate that. Actually let me-- sorry, let me just stop my slide show from going. I knew it was going to do that. OK, much better. OK, thank you. So there are more people living inside this circle, China and India primarily, than outside of it. 
And the pace of urbanization has only increased over the last 20 to 40 years. So just within India, for instance, the urban population is this red bar continuing to increase. And the rural population is sort of saturated, or declining. So what that means is tons of people moving to cities, density increasing. And that means all of the issues that come with that kind of density are also cropping up. And one of those major issues is transportation. And the way we see transportation at Transit Screen is that, in the words of Bogota mayor, Enrique Penalosa-- who is in a great documentary called Urbanized that you might like-- the sign of an advanced society is not where the people who are poor drive cars, but where the rich ride transit. And that's kind of inevitable. Because when you have enough people and enough density, you just can't have everyone driving a car and getting around. You'll just have gridlock, like you do in many cities in the developing world. And so what we have to do is we have to find a way to shift some of that to different modes in order to make things achievable. So here's an example. I believe this is from South America-- Brazil I think. And this is the kind of issue with transportation supply that you see with cities with all this urbanization. And then, back here in the US, we have a great example from the last decade of Houston. Houston has the largest freeway in the US, and possibly the world. It's 23 lanes wide. It's called the Katy Freeway. And it goes from Houston west. And they put $3 billion into expanding it in 2011. And now traffic is 33% slower, even though they increased tolls. What happened? Well, they built all those new lanes, and they allowed sprawling development, so that people who lived out in these suburbs could only get to the city, to their jobs, using this freeway. The result was what's called induced demand. So you increase the supply, but that creates its own demand. It's a feedback loop. 
And so the end result is actually worse than when they started. And this is paradoxical. But this is the mode that transportation solutions were in for basically the last 50 years, all this thinking that we could just fix it with supply. And that's not actually the case. Another manifestation of this is parking lots. So in terms of cars, the number of cars continues to increase. There are actually 1.2 billion cars on the road today, and there will be 2 billion by 2035. With the urbanization picking up as well, the cars are sitting unused 96% of the time. And so this vast sea of parking is made for the day after Thanksgiving, when everyone goes shopping. Most of the time it's not used. Almost none of the time are these cars actually used. So if you think, the amount of this that is actually necessary, is just a fraction of this little car. So that's a way to think about the scope of the problem, and how people are looking for solutions. So what are the solutions in transportation going to look like? Well, I think one way to think about it is that transportation itself is changing, and it's becoming mobility. So car companies are saying, we're now mobility companies. What does that mean? Well, mobility means, essentially, just getting around in a variety of ways. And one of those ways, that's very current and very relevant to CS50, I think, and what people from computer science backgrounds are going to end up doing in the future, is autonomous vehicles. So right now, there are 26 companies actively developing autonomous vehicle technologies-- ones you've heard of, of course, Google, Uber. But Tesla, General Motors, Ford, and every other car company you could name has a research project in this area. One Google scientist, Sebastian Thrun, said that the going rate for an acquisition of one scientist who's working in autonomous vehicles right now is $10 million. So if you have an active team working on this, you could just get scooped up by one of these people. 
And that's how the math works out. Nevertheless, autonomous vehicles are here in some places. They're coming in some places. There are some autonomous taxis driving around Pittsburgh-- autonomous Ubers. And Google cars driving around too. But it's not really here yet. And it's not clear how they're going to work in an urban environment. That's still getting worked out. So in the meantime, let's focus on some other technologies that are really here, and are really growing very fast right now. One of those is, surprisingly to some people, bikeshare. And so bikeshare, in the last year, 1.5 billion trips were taken on bikeshare. There is now a company in China that's a bikesharing company that's valued at over a billion dollars, a so-called unicorn of bikesharing companies. So this is both a real commercial marketplace, and a real transportation solution, with 1.5 billion trips. Carshare has also been growing tremendously. And every single carshare vehicle, like a Zipcar, or a Car2go vehicle, has been studied and has been shown to take up to about 8 to 12 vehicles off the road. So private cars, people who have second private cars, will give up their car, because it costs a lot to maintain. And they'll use a carshare instead as their second car for those rare occasions when they actually need it. So this is still a pretty significant number of trips. It's grown tremendously in Europe, but it's still popular here in North America. Ridesharing, of course, everyone's familiar with Uber, and Lyft, and Didi, and the growth of all these services. It's still only four billion rides, but it's increasing very rapidly. So it's about twice as big as bikeshare right now. And then mass transit, I don't have the number off hand for how many people it's carrying, but it's a lot more than that, like by probably a factor of 10. And mass transit is still continuing to grow tremendously. Sometimes people think subways are old technology or something like that. But that's not actually true. 
And there have been 40 brand new metro systems built across the world in just the last decade, which basically means a doubling of the number of cities that have mass transit, so especially in countries like China, which have been building a ton of them. So mobility has really changed. And the result of all this is that now, more than ever before, it's more complicated to try to get around cities. You have more choices, but you need solutions for using those choices and for getting informed about your different options. So this is just one example of how the diversity of things that people are doing in this space has exploded. You can find this on our website, but it includes different on-demand mobility options, and microtransit and stuff, as well as some of these other self-driving cars, and other associated technologies. OK, so mobility has changed. Tremendous amount of activity there. What else is happening in cities? And one of the trends that's enabling all these changes in mobility is what we call smart cities. And related to that is this concept called the internet of things. So I'll talk a little bit about the internet of things first, because it has a more specific definition. What the internet of things is is really putting sensors and other devices in the real world so that they are now connected to the internet, and are now enabled for technology. The idea of a smart city is a city in which you take all these connected devices and you use that to get some sort of intelligence or some sort of operation that makes everything more efficient, often more sustainable, greener, less CO2. And so you see all of these things, solar, and carsharing, and energy generation, and houses, and everything are all connected together. And they all run efficiently like a giant machine. The reality of that is that a lot of this stuff is still emerging. Here are a few examples of some smart city technologies that you might see today. And they're all sensor-based, these ones. 
And they're all generating data that often is being collected by cities, but sometimes is available for you and I to use in different ways in our own projects. So here's an example of smart parking systems. So in the old days, cities had no idea who was parking in what space when. And so the reason that's a problem is because, in many cities, 33% of traffic is people circling looking for a parking spot. And so when they can't find one, they keep circling. And they keep causing a massive amount of congestion. So the theory is, well, if we knew exactly what parking spots were available, we could direct people in their smart cars directly to those spaces, and cut down on all the congestion, and all the wasted gasoline. So smart parking is being done by both-- people put sensors in the parking spaces themselves. And that technology unfortunately needs a lot of maintenance, and batteries need to be replaced, and so on. So people are now looking at sensors that are on light posts. You see all these light poles around the urban environment. And those can be retrofitted to put sensors on them for a variety of things. One of them is monitoring, using either video or some sort of laser sensor or something, where the cars are parked. But you could also imagine other uses for that space. And so people are coming up with lots of neat proposals for how to use those sensor packs. Another technology, and this is one that actually is, I believe, deployed in Cambridge, Massachusetts here, is a network of microphones called ShotSpotter. And what ShotSpotter is is, if someone is shooting a gun in the city, these sensors, which are basically just microphones, will triangulate the location. And in real time, that data then gets fed to the police department, so that they know where to start their search for this presumed gunman. You could imagine other more peaceful applications for this kind of technology. 
For instance, you've got someone with a motorcycle that's generating sound at 120 decibels. It's a real nuisance. And in the past, the police couldn't find them, because they just rode away as soon as you got there. But with a technology like this, maybe you can actually find them and stop them, confiscate their motorcycle, and restore peace and quiet to the neighborhood. Another set of technologies that's creating interesting sensor data-- and most of this data is private for privacy reasons, but is widely available in the commercial sector-- is technology that monitors your location through the use of your mobile phone, and your connection to the cellular towers, and the cellular networks. So they know you're in this car. You're driving. The cellular network here. It knows where you are, because it has to transfer you from one tower to another. And then that data says, you're moving. You're in a car. And you're over there. It says, there's a train going by, and there's 200 people on the train, et cetera, et cetera. And so all of that location data can be used for a variety of purposes, including transportation planning, trying to figure out how congested things are, and what we should do about it. So these are all examples of smart city sensors. And so all of these things are collecting data that could be used to make cities run more efficiently and smarter. Now I'm a PhD in neuroscience. And so although I work mostly on vision, which is one of the five senses, I actually learned, during the course of my studies, that sensing is only sort of half of the problem. And there's a definition from biology you can use, which says that a sensor is something that transforms energy into data. So you're walking around the world and your senses are acquiring data about the world. At the same time, you as a person, as an animal or an organism, that's not the point. The point isn't just to get data. The point is actually to do things in the world. 
And so what you need are actuators or activators, things that transform that data into energy. For us, it's our muscles. We walk over there. We take a drink, et cetera. For cities, a lot of the smart city stuff that you'll see in the media or just around the world is focused on sensing. And it's not focused on actually getting stuff done, making a change in the world, activating things. And so this is actually the area that we're working on with Transit Screen. And so our goal-- and you can see the Transit Screens in CS50 is one very small part of that-- is to put real time information all around the world in places where you live, places where you work, like right here in CS50, places where you play, like these bar jukeboxes that we just launched with TouchTunes that now have transit information. You can have city halls, on the streets, or when you're traveling, and vacationing or at a hotel. So what all this information is provided to do is to make people make different decisions, and make people make better decisions. They might be more efficient decisions. For them, they might be more sustainable decisions. Usually when you give people more information, they'll shift away from whatever their default behavior is. And right now, in transportation, people's default behavior is pretty bad. It's usually, I get in my car and I drive, or I take an Uber, or something like that. So what you want is you want people to see the choices. Most of the choices are actually more sustainable. And so you end up with a better result when you provide the information. The key is that, in a smart city, people are the activators. They're the actuators. So you need to actually provide people with the information. It's not enough just to collect the data. You actually have to turn it into information, and then you have to get people to use it to change their behavior. So here's one of our transit screens close up here from Cabot House, Harvard. 
Just got some very nice feedback about this one. We had one of the CS50 TFs, who chooses his breakfast location so that he can see this transit screen and tell whether the shuttle is coming, or maybe he'll take a Hubway bikeshare. He has a choice based on that. And so that's a great example of behavioral change. He's changing where he sits in the dining hall. And then based on that, he's also changing what kind of travel he uses. So what other examples are there of behavioral change? Well, one very interesting example is another company called Opower that's also based in the Washington, DC area. And what they did is they took all this smart city data about people's energy usage. And you might have an apartment or a home, and you use energy at a certain rate. This is how many kilowatt hours you used. And then this is how much you paid last month for your energy bills. Now this is the average, because they have data from everyone in Cambridge, say. So they know what the average home in Cambridge uses, in terms of energy. And then, in this case, this person here, Garrett, he's using more. And so the objective of Opower's software here is that they have this dashboard here, so that he can see that he is using more. And maybe there are some things he can do, specifically, turn off lights, or compete against other people to reduce his energy usage. There are things he can do. They're kind of long term things, but they can, ultimately, save him money, and improve the environment. So this is something where they're taking data that was previously unavailable, because meters didn't used to be smart. But now they are. They're feeding that data into the internet. And Opower built a business using a combination of web technologies, and also they even mail this information to you in a letter, because they know you're more likely to look at it. They were able to find ways to reduce energy usage in different cities around the country. 
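The comparison behind that kind of dashboard, one home's usage set against the local average, can be sketched in a few lines of Python. This is only an illustration of the idea, not Opower's software: the home names, the kilowatt-hour figures, and the `compare_to_average` helper are all made up for the example.

```python
from statistics import mean

# Hypothetical monthly usage in kilowatt-hours for a handful of homes.
# These numbers are invented for illustration only.
neighborhood_kwh = {"home_a": 620, "home_b": 540, "home_c": 710, "garrett": 890}


def compare_to_average(usage_by_home, home):
    """Return (usage, average, percent difference from the average)."""
    average = mean(usage_by_home.values())
    usage = usage_by_home[home]
    percent_diff = (usage - average) / average * 100
    return usage, average, percent_diff


usage, average, percent_diff = compare_to_average(neighborhood_kwh, "garrett")
print(
    f"garrett used {usage} kWh; the average home used {average:.0f} kWh "
    f"({percent_diff:+.0f}% versus average)"
)
```

A real service would presumably compare against similar homes rather than a simple mean, but the basic signal shown to the customer is the same kind of difference.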
And coincidentally enough, Opower was actually founded by a couple of our other classmates, mine and David's, these guys, Alex and Dan. And they actually went public a couple of years ago. And this is them on the New York Stock Exchange floor. So there's a lot of real impact and also real potential in some of these kinds of businesses that are based on the idea of smart cities. All right, so that's kind of the background for smart cities and data, and both the sensor and the activator side. So what I want to talk about next is a little bit of how you can get started by yourself in working in this area. And really the easiest way to get started is with open data. And in case you haven't heard of open data, the idea of open data is this. It's data that can be freely used, reused, or redistributed by anyone, subject only to, at most, you say it's from a certain source. And so it's often required to be available in machine-readable formats. So if I give you a stack of paper, that's not open data. It's got to be in data format. And then it also ought to be available in its entirety. So if I have to log into some website, and take a screenshot, or just scrape the web or something like that, that's not open data either. I have to be able to get it like a table, like an Excel file. Or I have to be able to get some sort of interchange format like XML, or JSON, or these other kinds of things. So open data is often generated by the government, but not always. Because governments have to be public and transparent about their data. So they have to share it with citizens. And the result of this-- and there's been a lot of open data momentum recently-- the result of this is that it's a really good way to get started. Because with some of this other stuff like this data, a few years ago, you weren't able to get this data from the energy companies. Your utility, you'd say, well, I have a really great idea and I'd like to be able to do something to make my home more efficient. 
And they'd say, too bad. You can't have the data. You're a customer. You're not our partner. But with the government involved and generating this data, for instance, the Boston T, the MBTA, is a government agency. And so they provide open data that's available to a lot of people, and without very many restrictions on it. So one of the impacts of this is that you can develop your app, or you can even develop a business based on open data. And so you're no longer required, say, for the T-- I'm not even sure if they have their own MBTA app or not. But there are a lot of other apps you can use on your mobile phone that allow you to get information about them. Transit Screen is another example of something that uses open data from the T, as well as dozens and dozens, if not hundreds, of other agencies around the world. And then another nice thing about this is that there are many different people contributing to open data. So it doesn't just have to be sourced from the government. Someone can come along and say, well, I think this data set would be better in this format, or it would be easier to use, or something. And you can clean it up, and then redistribute it. And then the results of all this kind of activity is that people can build little businesses on it. And so I'm just going to plug the startup incubator that we come from, which is in Washington, DC. And it's called 1776. And their mission is to solve hard problems in the areas of cities, and government, and health care, and education, and transportation, a lot of which can be addressed using some form of open data. Like I said, the US federal government has gotten behind this crusade of openness and transparency. And so the Obama Administration had a pronouncement. Then they created an independent authority under the US CTO's office to promote open data. And this was me meeting the President at 1776. I just wanted to show this slide. 
All right, so I'm going to walk you through a couple of different examples of open data, and things you can actually use. These ones are ones that I know well, because they're relevant to transportation for the most part. But there's a whole variety of different sources out there that you can go to. So I'm going to talk about four different areas very quickly. One of them is open geographic data, so maps, OpenStreetMap specifically, which is like the Wikipedia of maps, GTFS, which is how transit schedules are represented, and it's been a very successful open data standard, Realtime APIs-- so this is just transit schedules, but if you want to know where a train is right now, you need to use a Realtime API-- and then energy data. So there's some new standards there that we can talk about. So OpenStreetMap, a lot of people have heard about Google Maps. They're kind of the standard. Some people have even heard of Apple Maps. And OpenStreetMap is less well-known, but it underlies a tremendous amount of activity on the internet. And it's really at sort of world-class standards right now. So this was Sochi Olympics, the Winter Olympics before the last summer Olympics. And during that time-- Sochi is a Russian city that not a lot of people are familiar with. They kind of built it up for the Olympics. This is what OpenStreetMap looked like during that Olympics. So if you were a visitor, and you were trying to find your way around, you can see all the buildings. You can see all the new paths, the Olympic Park, et cetera. And this is what it looked like in Google Maps. So Google, which, reminder, is a company that has hundreds of billions of dollars sitting around, didn't really prepare for the Olympics, or even get the data together. Whereas a community of people, just like Wikipedia editors, people generating their own open data, managed to achieve this in OpenStreetMap. 
So last Olympics, it wasn't quite as clear a distinction, because Google actually put a lot of money into making sure that the last summer Olympics wasn't like this for them. So OpenStreetMap is one of these open data sets that's not generated by a government, although in many cases, it uses government data. It's generated by crowdsourcing, like Wikipedia. So here's an interesting map showing the number of edits in each of these geographic areas. One thing to notice is that, in many of these areas in Europe, there's-- I think this is per square kilometer or something-- it's like over 500,000 edits per square kilometer, which is just amazing. You're really converging very quickly on sort of ground truth when you have that kind of activity. And the US is not too far behind. So you, yourself, can actually edit OpenStreetMap. It's very easy. You just go to OpenStreetMap.org and start editing stuff that you know. One thing you can do with it-- and I think this is kind of a neat thing-- is in some cities, the sidewalks and pedestrian paths aren't very well represented. So let's say that you have an elderly relative who uses like a wheelchair, or a scooter, or something like that, they're not going to be able to get from point A to point B, or to have an app tell them how they can safely get from one point to another. They'll just have to discover it by themselves. But you can actually go into OpenStreetMap and edit these paths. Say, here's an accessible path. There's a curb here, so if you're using a wheelchair, you can't jump over that, and so on. And then this all gets put into trip planning apps, so that, when you're using one of these apps like a city navigation app, it will actually now give you these paths. And you could build an app for someone who's using a wheelchair that says, how can I actually get from point A to point B in a safe way? 
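As a rough sketch of how an app might use those edits, the Python below filters path segments by OpenStreetMap-style tags such as `highway`, `wheelchair`, and `kerb`. The segments themselves, their ids, and the `is_wheelchair_friendly` rule are assumptions made up for illustration; a real router would read actual OpenStreetMap data and apply far more careful rules (in OpenStreetMap itself, for example, kerbs are usually mapped as points along a path rather than on the path).

```python
# Hypothetical path segments carrying OpenStreetMap-style tags.
# The data is invented; only the tag names follow OpenStreetMap conventions.
segments = [
    {"id": 1, "tags": {"highway": "footway", "wheelchair": "yes", "kerb": "lowered"}},
    {"id": 2, "tags": {"highway": "footway", "kerb": "raised"}},
    {"id": 3, "tags": {"highway": "steps"}},
    {"id": 4, "tags": {"highway": "footway", "surface": "asphalt"}},
]


def is_wheelchair_friendly(tags):
    """Very rough rule of thumb: reject steps, explicit 'no', and raised kerbs."""
    if tags.get("highway") == "steps":
        return False
    if tags.get("wheelchair") == "no":
        return False
    if tags.get("kerb") == "raised":
        return False
    return True


accessible_ids = [s["id"] for s in segments if is_wheelchair_friendly(s["tags"])]
print(accessible_ids)
```

A trip planner would then search for a route using only the segments that pass the filter.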
And so you can build that for your local community, and then extend that out as you grow. So I think this is really an interesting use of open data where you can actually solve a real problem. And then you can also maybe build an app, or even a company on something like that. Here's another interesting use of data. You can use this open satellite data to measure the solar potential of different buildings in the city. So this is a solar map of Cambridge, Massachusetts. And this is showing that some parts of the roof here could actually have solar installations on here that could theoretically generate a lot of power for this building. And then you could even measure, using open data from the electric utility about how much they charge and so on, you could say, well, how long will it take me to pay off the solar panels, and so on? Does CS50 have solar panels? Not yet. All right, working on it. So another interesting application of all this Geodata is you can do custom styles. And there's a great company we work with called Mapbox that's based in Washington, DC. And they've built a whole business on open data and on using different ways to interact with it. So they have a styling language that's kind of like how you use HTML to make the web pages look different. They've created a language for making maps look different. And so these all look completely different. And yet, they're all generated with the same open data, with the same OpenStreetMap data, just with different styling rules applied. So you guys could actually see these examples online. You could edit them, and change them yourself, and come up with maps that look ranging from, this is very similar to Google Maps, to something that's more like a hiking map, or sort of a '50s style map, or even a sort of comic book kind of map. So all these different styles are now available because the data exists. So that's geographical data. And that's a really important source of knowledge about the world. 
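Going back to the solar question of how long the panels take to pay off, the simplest version of that arithmetic is the installation cost divided by the yearly savings. The sketch below assumes a made-up installation cost, yearly generation, and electricity price; it ignores incentives, financing, and panel degradation, so it is only a first approximation.

```python
def payback_years(install_cost, annual_kwh, price_per_kwh):
    """Simple payback period: upfront cost divided by yearly bill savings."""
    annual_savings = annual_kwh * price_per_kwh
    if annual_savings <= 0:
        raise ValueError("annual savings must be positive")
    return install_cost / annual_savings


# Hypothetical figures: a $15,000 installation generating 6,000 kWh a year,
# with electricity priced at $0.25 per kWh.
years = payback_years(install_cost=15000, annual_kwh=6000, price_per_kwh=0.25)
print(f"Estimated payback: {years:.1f} years")
```

The yearly generation would come from something like the solar map, and the price per kilowatt-hour from the utility's published rates.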
Lots of great solutions to urban problems can be found with Geodata. Another kind of open data that I know very well, and that I think has been a real success story, is called GTFS. And this is the General Transit Feed Specification. You don't need to know that. Basically, all you need to know is that it's transit schedules. So when is the bus running from A to B, and where? So there are a couple of places where you can explore this kind of data and get started. One great one is called transit feeds. And just search for transit feeds and you can find it. You can actually click through it. This is San Francisco. You can find the latest file. You can look for a certain route. And then the data is all available to download that powers this particular visualization. Another really nice tool that's been developing very quickly from a company called Mapzen that we work with is called Transitland. And so Transitland is what's called a data commons, which is that they're taking all open data from all over the world, all over the planet, and bringing it together in one place, so that you can see everything sort of on equal footing. You have data from Sydney, Australia, and then you have data from San Francisco. And it's all in the same standard and the same format. So you can take a solution that you develop that works for San Francisco, and you can immediately make it work for Sydney, Australia. So GTFS, one of the reasons I like it, and one of the reasons I'm talking about it, is that it's actually really simple. And so it's plain text files, and it's written in tables. So if you download a GTFS file, it's like a zip file. And it's got this stuff in it. These things all have formats. But if you just double click on one of them, stops.txt, open that up, and this is what it looks like. It's got a list of stops, the names of those stops, latitudes and longitudes, so you can plot it on your OpenStreetMap, and then some other information here. 
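Because stops.txt is just a comma-separated table, it can be read with Python's standard `csv` module. The sketch below uses a tiny made-up sample in place of a real feed; the column names `stop_id`, `stop_name`, `stop_lat`, and `stop_lon` come from the GTFS specification, while the stop names and coordinates are invented for illustration.

```python
import csv
import io

# A tiny invented sample in the shape of a GTFS stops.txt file.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
1001,Main St & 1st Ave,42.3601,-71.0589
1002,Main St & 2nd Ave,42.3612,-71.0575
"""


def read_stops(text):
    """Parse stops.txt content into a list of dicts with numeric coordinates."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "id": row["stop_id"],
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
        for row in reader
    ]


stops = read_stops(SAMPLE_STOPS_TXT)
for stop in stops:
    print(stop["name"], stop["lat"], stop["lon"])
```

With a real feed, the same function could be given the text of the stops.txt file extracted from the downloaded zip, and the coordinates plotted on a map.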
So it's the kind of data that you can really start exploring and start building on very readily. Of course, it's a little bit more complicated. And if you really want to do some complicated things, you'll have to start working a little bit with databases, or at least matching data from one thing to another. But it's not terribly complex. And there are tools that are available that are mostly open source, easy to download, that can be used for this kind of stuff. So for instance, those stops I showed you are linked to particular stop times, and to certain bus trips, and so on. So you can actually go from one thing to another, and match all these things up in the database. Another interesting source of data from around here is bikeshare data. And a lot of bikeshare data is open. In DC, I helped open that up. But then other cities, like Boston, have just done that from the beginning. And so here's an interesting example of someone who-- they had a contest for visualizing Hubway data. And so that resulted in a really nice visualization of trips versus, I think, distance. And they have actual trip history here, so you can see not just your trips, or just what the stations are, but you can actually see how many people have traveled from what part to what other part. And so that's more the sort of monitoring data, or the real historical data that we're talking about. A lot of that is available with bikeshare. So if you're interested in what you could do if you had data from everyone's cellular records of how they got from point A to point B, then bikeshare is an interesting way you might want to get started with something like that, because it is open. So all of that data is static data. The maps are just static. The schedule never changes. And the bikeshare data is the history of the bikeshare system. 
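To make the earlier point about matching stops to stop times and trips concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names follow the GTFS files stops.txt, trips.txt, and stop_times.txt, but only a few columns are kept, and every row (stop names, trip ids, route ids, times) is invented for illustration.

```python
import sqlite3

# An in-memory database with cut-down versions of three GTFS tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stops (stop_id TEXT PRIMARY KEY, stop_name TEXT);
    CREATE TABLE trips (trip_id TEXT PRIMARY KEY, route_id TEXT);
    CREATE TABLE stop_times (
        trip_id TEXT, stop_id TEXT, arrival_time TEXT, stop_sequence INTEGER
    );
    """
)
# Invented sample rows.
conn.executemany(
    "INSERT INTO stops VALUES (?, ?)",
    [("1001", "Main St & 1st Ave"), ("1002", "Main St & 2nd Ave")],
)
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("t1", "route_5"), ("t2", "route_9")],
)
conn.executemany(
    "INSERT INTO stop_times VALUES (?, ?, ?, ?)",
    [
        ("t1", "1001", "08:00:00", 1),
        ("t1", "1002", "08:04:00", 2),
        ("t2", "1001", "08:10:00", 1),
    ],
)


def arrivals_at(connection, stop_id):
    """Join stop_times to stops and trips to list scheduled arrivals at a stop."""
    query = """
        SELECT stops.stop_name, stop_times.arrival_time, trips.route_id
        FROM stop_times
        JOIN stops ON stops.stop_id = stop_times.stop_id
        JOIN trips ON trips.trip_id = stop_times.trip_id
        WHERE stop_times.stop_id = ?
        ORDER BY stop_times.arrival_time
    """
    return connection.execute(query, (stop_id,)).fetchall()


arrivals = arrivals_at(conn, "1001")
for name, time, route in arrivals:
    print(name, time, route)
```

The join is the whole trick: stop_times carries the ids that link a stop to the trips serving it, and each trip points to its route.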
When you're talking about real time data, stuff that's actually happening right now in real time, you're not talking about data sets anymore. You're really talking about APIs. And I think APIs are covered later in the course, sort of towards the end. But basically, all an API is is the way you talk to another program, or the way you talk to a database about something. And so you can get information from that program, or from that database, that you can then use. And if that database is changing, is updating in real time, then you have that information and it's up to the minute, or up to the second. And you can use it. So a lot of APIs can actually be accessed just through your web browser. So in many cases, if you have an API, you can just type the name of the so-called endpoint into your web browser. And then you'll get out, in a readable format-- this one is called JSON, J-S-O-N-- you can get information. So here's one from a transit service. And it says, well, here are the stop times. There's an arrival coming up. Don't worry about the numbers. This is in a particular format. But the point is, this one is just 0 seconds from the stop. So the train has just gotten to the stop. And now you know that because you called that API, and you got information back. And now your program can use that too. You can put it up on a screen, like we do. Or you can do other things with it. APIs are readable by code. This is Twitter. And this is someone's tweet. This is not what I would consider to be an API. You can get tweets from an API, but it's not the same, because this information isn't structured. It's just someone's typing. And it's very hard to know what to do with that to write a program that can use all of this text. Whereas with this, I can write a program that says, I want this, then I want that, then I want this part. So there's a big difference. And it's a lot easier to work with APIs that generate machine-readable data than other data. 
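To show what "I want this, then I want that" looks like in code, here is a small Python sketch that parses a JSON response and picks out the upcoming arrivals. The response text and its field names (`stop_id`, `arrivals`, `route`, `seconds_away`) are made up for illustration and do not match any particular transit agency's API; a real program would fetch the text from the agency's endpoint instead of using a hard-coded string.

```python
import json

# An invented response in the general shape a real-time arrivals API might return.
SAMPLE_RESPONSE = """
{"stop_id": "1001", "arrivals": [
  {"route": "Red", "seconds_away": 0},
  {"route": "Red", "seconds_away": 240},
  {"route": "Blue", "seconds_away": 95}
]}
"""


def next_arrivals(response_text):
    """Return (route, seconds_away) pairs, soonest arrival first."""
    data = json.loads(response_text)
    ordered = sorted(data["arrivals"], key=lambda a: a["seconds_away"])
    return [(a["route"], a["seconds_away"]) for a in ordered]


upcoming = next_arrivals(SAMPLE_RESPONSE)
for route, seconds in upcoming:
    print(f"{route} line: {seconds} seconds away")
```

Because the data is structured, the program can address each field by name, which is exactly what is hard to do with free-form text like a tweet.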
Another key property of APIs is that some of them are giving you bulk data, and some of them are like family style, and some are giving you single serving. So we just went to a Chinese restaurant. There was a buffet. You could take as much as you want. If you're particularly rude, you could just grab everything and walk out the door with it. APIs are like that, except of course, you can get to the food, and someone else can also get the food. And so some of them will tell you about, for instance, where every train is in the entire metro system in DC here. And so one API call will tell you all of that. There are also some APIs for metro that will tell you about a particular stop. And so you just ask about one stop, and that's your single-serving API. And that can say, there's a train right now going to 16th and Colorado. So I want to show you another example just related to the energy data stuff that I was talking about. So there's also a website for energy data, like I showed in the Opower example. And it's called Green Button. And what it is, is that the US data.gov office and the EPA created a standard for interchanging energy data. So if you have an apartment, or a condo, or a house, or whatever, and you're paying energy bills to a certain set of utilities, they are required to give you access to your own historical energy usage data. So there should be, on the Customer Portal website, a login that you can get. And then there should be one of these green buttons that allows you to access your own data, and pull that data down. And then you can do whatever you want with it. You can plot your own historical use. You can try to optimize it. You can compare it to other things. And so this is also now an open data ecosystem. And last time I looked, there were about 250 apps people had built using this kind of green button data. So this is another interesting place to get started if you are paying a utility bill. Not sure if Harvard is doing this right now for students. 
But if you're living off campus, you can probably get this. So as for places where you can get inspired to come up with different project ideas for smart cities using open data, one of them is, like I said, data.gov. And data.gov is a place where you can find almost an infinite number of open data sets. It can be a little bit overwhelming, just trying to figure out what's available there. But they let you sort them by different types of things that you're interested in. You might be interested in transportation. You might be interested in energy. You might be interested in agriculture. All of those data sets are available there. And you can find a catalog of them, and then start generating ideas for how you might use some of that data to get some insight, or maybe build an app that would be useful to people. There's another class of places like that called developer portals. And so, for instance, for MBTA, for the Boston T, they have a developer portal that you can use. And you get a login for that. And then you get access to their open data. You get access to their APIs as well, and a variety of other things. So often, if you're looking to find some open data, you might be able to look for it by looking for a developer site. There are some other catalogs of other APIs. There are a lot of private APIs from other services, so things you might use-- Yelp, if you use that to search for restaurants or local businesses, they have an API you can use. Foursquare has an API. A variety of other different commercial APIs exist. Google Maps, you can use their data to some extent. And all these things have some way to access them. And it's often free. So you can find some of them just by searching for developer portals, like Yelp developer portal. But there are also other catalogs you can use too. publicapis.com is one that seems fairly recent that I came across. 
There's an old one called programmableweb.com that's been around for a long time, and has a lot of APIs, some of which go back almost 10 or even 15 years. So all of these places are good ways to sort of discover the richness of what's available, in terms of APIs and open data. And then green button data, that's the one that I mentioned for energy data as well. And even if you don't have an apartment, or house, and you don't pay a utility bill, you can still get test data from these kinds of sites that would allow you to build an app that would be based on that test data. And many of these will also allow you to test certain things against them. So I hope that's been sort of a rapid-fire introduction to some of the different aspects of data and APIs, and especially where it relates to cities, and the kinds of data sources that you have in cities, and for solving urban problems. So I just want to wrap up a little bit by showing some more slides of our team. This is our DC team for Transit Screen, and of course, I'll mention that we're very interested in hearing from any of the CS50 students, or anyone else who's interested, so please drop us a line. If you're interested in a summer internship, or a job, or whatever, please let us know. We tend to be very mission driven. And we're focused on making cities more sustainable, promoting walkability, urbanization, and public health. And our own operations are actually carbon neutral. And I wish I had a better way to do that and to show that with open data. But we've managed to audit our own energy usage and prove that. And then also, zero of our employees commute by car. So we get around a lot of fun ways in the city. And so I just wanted to say, sort of in conclusion, that my personal opinion about this is that a truly smart city is not just one where all the data gets brought together and kind of funneled into some sort of silo to be used by people in the government. 
It's one where data is open and data is shared, so that that data and the technology can be used to improve the lives of everyone, from the janitor to the CEO, people who are old, people who are young, and just the whole variety of residents and citizens of the city. So I really tend to think about smart citizens as being the goal, rather than smart cities. And I think that's a good way to keep in mind the real problems that need to be addressed, and the real problems that can be solved. So this is my email. It's matt@transitscreen.com. Feel free to drop me a line. This is a picture from 1999 of me in Leverett House Dining Hall, which is, I believe, closed right now. This painting was hung upside down for half the year that year, and no one noticed. 00:42:19,660 --> 00:42:24,000 And these are a couple shots of me on the computer programming competition. Especially if you're just starting CS50 and you're interested, I think there are tryout processes for these. So you can go and try to get yourself on the international programming team. I got to go to exciting San Jose this year. And this year was a little more exciting, we got to go to Amsterdam for the international finals. Look at these computers, just going to say. And then a bit later, I got to-- see, the computer is a lot newer in this one. That's all I'm going to say about that one. All right, so thanks very much. And I appreciate your attention. And again, please drop me a line if you'd like. And I'd love to take any questions from the audience at this point. 00:43:11,010 --> 00:43:11,902 Yeah. AUDIENCE: Thanks so much, Matt. This is really helpful. My name's Amira. I'm a CS50 student. I feel like, when I hear about smart cities, the topics that you brought up come into play a lot. A lot of it is about transportation, and a bit about energy. Where aren't people looking? And where do you think the next wave of progress in smart cities could be? 
MATT: Yeah, well, back when we started Transit Screen, people weren't really working in transportation enough. So that was one area that we thought was really open for exploration. I think that's less true now than it was at the time. I think I'd sort of turn that back to you and say, what are you really interested in? And where do you see data that's kind of being underused? Because there's definitely-- if you think of, what are the different sectors that make up a city, there's housing. There's transportation. There's retail. There's energy. There's all these different inputs and outputs to the cities. And some of those are going to be on the verge of opening up more data, or where you can put together the data you need from interesting sources, kind of combine things in ways that people haven't looked at before. So for instance, I'll just give an example off the top of my head. I talked to a guy who is in DC. And he was in a sort of software development training program. And he was a labor inspector. So he was looking for workers who were being cheated out of their wages by businesses that were unethical. And he thought, well, there's no data set I can use that will tell me whether a business is shady or not. I mean, maybe you can find bad reviews on Yelp or something like that, but maybe not. So he thought, well, what else would be a sign that the business is either shady, or may be falling apart and unable to manage itself properly, which makes it likely to underpay its workers? And health inspection data for restaurants turns out to be an open data set that's available in a lot of cities, including New York City, and I think maybe even Cambridge. 
So he took that data set and combined it with some internal data sets that he had access to, and showed that actually, it was true. If you just went to inspect restaurants that were already failing their health exams, you could probably also predict, with some accuracy, which ones were likely to be underpaying their workers, because those are probably restaurants that are falling apart, or something. So I thought that was a really interesting example of using a data set that was maybe not intuitive. And so I think there's a little bit of creativity that you have to pull in when it comes to this. And it really should be kind of problem-focused. 00:46:10,818 --> 00:46:15,150 AUDIENCE: I was wondering, in terms of Transit Screen, how is the data for people who drive, and if they see your transit screen, they would-- because what I think is people who use a transit screen are people who use transit anyway. So you're not really taking away from people who actually drive. MATT: Yeah. AUDIENCE: How does-- yeah. MATT: Yeah, so this gets into sort of interesting areas of something called mobility management or transportation demand management, which is, instead of paying a ton of money to build bigger roads, how can we be more efficient and change people's demand behavior for those roads, or for transit, or something? And there are a lot of incentives you can use. Information is sort of fundamental. Because if people don't have information about a bus, if they don't know it runs by them, they're never going to use it. So someone who might actually have a very convenient trip by bus might just drive because they've never heard of the bus before. They've never seen it. So there's a real education component we try to address. And the other thing we try to do is we try to be in places where people have just moved. So you just move into a new condo building, and so you could do a variety of things. You might have a car. You might bring the car and pay for parking. 
Or you might decide to leave the car and just use transit, and bikeshare, and Uber. And so if we can get you at that point when you're making that kind of decision to keep the car, to get rid of the car, to start driving to work, or to start learning how to use transit to get to work-- if we can get you then, then we have the ability to change your behavior. And we have a special amount of leverage. It's when people are new to a place, or people's habits are already changing, that you have the ability. And there's some good research in this area in sort of behavioral change. There are a few other books I'd like to mention in that area. There's one called Nudge, which is a really interesting popular read about how you can influence people's behavior using economic principles, written by a guy who taught here. And there are a couple of other books in an area called action design. And so this is about design as it applies to getting people to change their behavior for the better. And so there's a whole interesting area of exploration there. Yeah. AUDIENCE: Oh, hi. My name is [INAUDIBLE] and I work at the Planning Office at Harvard, the campus planning. So you're talking about the smart cities, so in your eyes, [INAUDIBLE], what is a smart campus? And what kind of data is smart data that students want to see? But considering that a lot of data is actually private, so you cannot expose it, so [INAUDIBLE]. MATT: Yeah, so the question is bringing it close to home, what makes a smart campus, as opposed to sort of a broader smart city? And I think the fact that you are from the Office of Planning, and the University has an office of planning, sort of underscores the idea that a campus is kind of like a miniature city in many ways. I'm sure you look at it under a similar lens. And so I would say that the areas for a smart campus, obviously, there are some standards that everyone on the campus can agree on, similar to the city. You want it to be a sustainable campus. 
You want it to be an efficient campus. And you want people to feel safe on campus, and so on. So I think really, you look at all the different areas, whether it's law enforcement, or it's sustainability, or transportation, everything where there's a similarity between a campus and the city, all of those areas are areas where you can help make the student body smarter. At the same time, there's also a workforce who has to get to the campus. So you're dealing with those transportation problems at the same time. And so how do you get people from the suburbs, or from Boston? And how do you get them in here in a way that doesn't create problems for everyone? And so I think, for a campus, you need to think a little bit more expansively, not just about the footprint of the campus, but about the whole network of how everyone is coming to and from the campus. Transit Screen, just to tell another short story, comes from a city called Arlington, Virginia. We actually did it within the city government as a pilot project before we launched the startup. And Arlington is sort of interesting, almost like a campus, in the way that it has only 150,000 people, but almost 250,000 people working in it. So there's a flood of people coming in every day, because the Pentagon is there, and a variety of other major employers. So there's a real need for solutions that take into account the whole footprint of the campus. That's kind of the direction I would go in. But I'd love to talk with you about that more. 00:51:33,800 --> 00:51:34,300 [INAUDIBLE] AUDIENCE: [INAUDIBLE] sensors, who has access to the information from the sensors? Can anyone request access? MATT: Yeah, so that's the challenge, is that a lot-- like, for instance, the gunshot detectors. That's the police's thing. They're not going to share that. But sometimes with sensors, you can get a certain data set, or you can get a historical data set that preserves some kind of privacy. 
So for instance, with New York City, you can get a data set that has all the taxi trips that were taken in the city. Or with bikeshare, you've got all the history of the bikeshare trips. And so some of those were collected with sensors. You can't get them in real time, but you can get them historically. And so I would focus on that kind of data. Or of course, you might even try to put your own sensor out somewhere. If you're interested in video, put a camera up somewhere, and start collecting that data. AUDIENCE: So what would incentivize those companies to release their real time data for everyone? Why would companies do that? MATT: Generally they don't. They don't share that kind of sensor data because-- AUDIENCE: But we are able to access an API, like the transit information. Is that just like the government wants to? MATT: So there are two things that make companies often share their data. One is if there's a public policy reason, like the government says, you should have open data because of transparency. Let's say the subway isn't running very well, or it's not running on time, maybe people should have the right to look at the data and say whether it's running on time or not. Subways don't always see it that way. But that might be one reason to release the data. The other reason people release data is often just related to marketing and promotion. So why do we get data from so many public and private transportation providers at Transit Screen? Because we promote their services on public screens. And so we're free advertising for these kinds of agencies. And so if you can identify a common interest where-- another thing is, maybe a company doesn't have enough people to do all the analysis they'd like. Maybe they don't have data scientists, but they have data. So you might be able to say, well, if you share this data with me, I can do some things with it that might provide insight or information to you that make you run your business better. 
So there are a variety of different pitches, starting from regulation. There's even something called the Freedom of Information Act, where you can request data from a lot of public agencies. It's pretty easy to use. There's a site called muckrock.com that's really good for it. But then, there's often some kind of common interest. And I think that, in many cases with private companies, is a better way to go. 00:54:30,220 --> 00:54:32,970 All right, thank you very much.