[MUSIC PLAYING]

DAVID MALAN: So today we're going to talk about challenges at this crucial intersection of law and technology. And the goal at the end of today is not to have provided you with more answers, but hopefully to have generated more questions about what this intersection is and where we're going to go from here. Because at this intersection lie a lot of really interesting and challenging problems that are at the forefront of what we're doing. And you, as a practitioner, may be someone who is asked to confront and contend with and provide resolutions for some of these problems.

This lecture is going to be divided roughly into two parts. In the first part, we're going to discuss trust: whether we can trust the software that we receive, and what implications that might have for software that's transmitted over the internet. And in the second part, we're going to talk about regulatory challenges. As new, emergent technologies come into play, how is the law prepared, or is the law prepared, to contend with those challenges?

But let's start by talking about this idea of a trust model, a trust model being a computational term for, basically: do we trust something that we're receiving over the internet? Do we trust that software is what it says it is? Do we trust that a provider is providing a service in the way they describe, or are they doing other things behind the scenes?

Now, as part of this lecture, there are a lot of supplementary reading materials that we've incorporated and that we're going to draw on quite a bit throughout the course of today. The first of those is a paper called "Reflections on Trusting Trust." This is arguably one of the most famous papers in computer science. It was written in 1984 by Ken Thompson. Ken Thompson was one of the inventors of the Unix operating system, which inspired Linux and on which macOS is also based. So he's quite a well-known figure in the computer science community. And he wrote this paper to accept the Turing Award, again, one of the most famous awards in computer science. In it, he's trying to highlight the problem of trust in software.
He begins by discussing a computer program that can reproduce itself. We typically call this a quine in computer science. The idea is: can you write a simple program that reproduces itself? We won't go through that exercise in full here, but Thompson shows us that, yes, it is actually relatively trivial to write programs that do this.

But what does this then lead to? The next step of the process that Thompson discusses, stage two in his paper, is: how do you teach a computer to teach itself something? And he uses the idea of a compiler. Recall that we use compilers in some programming languages to turn source code, the human-readable syntax that we understand (languages like C, for example, are written as source code), into zeros and ones, machine code, because computers only understand those zeros and ones. They don't understand the human-readable syntax that we're familiar with as programmers when we're writing our code.

What Thompson suggests we can do is teach the compiler, the program that actually takes the source code and transforms it into zeros and ones, to compile itself. He starts by introducing a new character for the compiler to understand. The analogy is drawn to the newline character, which we type when we reach the end of a line: we want to go down and back to the beginning of a new one, so we enter the newline character. There are other characters that were not initially envisioned as part of the C compiler. One of those is the vertical tab, which basically allows you to jump down several lines without resetting back to the beginning of the line as a newline would. And so Thompson goes through the process, which I won't expound on here because it's covered in the paper, of how to teach the compiler what this new character, this vertical tab, means.
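To make these first two stages concrete, here are two short illustrations in the spirit of the paper, not code taken from it. First, stage one: a classic self-reproducing C program, a quine. The format string s contains the text of the whole program, with %c placeholders standing in for the newline (ASCII 10) and double-quote (ASCII 34) characters, so that the string can describe itself:

```c
#include <stdio.h>
char *s = "#include <stdio.h>%cchar *s = %c%s%c;%cint main(void) { printf(s, 10, 34, s, 34, 10, 10); return 0; }%c";
int main(void) { printf(s, 10, 34, s, 34, 10, 10); return 0; }
```

Compiling and running this prints its own source code exactly, which is all stage one requires. Second, stage two: a sketch of the kind of escape-sequence routine inside a C compiler that is being taught the vertical tab; next() here is a hypothetical helper that reads the next character of the source:

```c
/* Interpret the next character of a string literal, handling escapes. */
int next_char(void)
{
    int c = next();   /* hypothetical: read the next source character */
    if (c != '\\')
        return c;     /* an ordinary character, not an escape */
    c = next();
    if (c == '\\')
        return '\\';
    if (c == 'n')
        return '\n';
    if (c == 'v')
        return 11;    /* bootstrap step: the installed compiler doesn't know
                         '\v' yet, so we spell out its ASCII value; once this
                         version is compiled and installed, the source can be
                         rewritten to use the self-referential '\v' instead */
    return c;
}
```

The punchline of stage two is that, after one round of compilation, the knowledge that '\v' means 11 lives only in the compiler binary; the source can then say return '\v';, and the binary "remembers" what that means.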
He shows us that we can write code in the C programming language and then have the compiler compile that code into zeros and ones, creating something called a binary, a program that a computer can execute and understand. And then we can use that newly created compiler to compile other C programs. Which means that once we've taught the computer how to understand what this vertical tab character is, that knowledge can propagate into any other C program that we write. The computer has effectively learned a new thing to interpret, and it can then interpret it in every other program.

But then Thompson leads us into stage three, which is: what if that's not all the compiler does? What if, instead of just adding that vertical tab character, we also secretly, as part of the source code, insert a bug, such that whenever we compile code and encounter that backslash-v, the vertical tab character, we're not only teaching the computer to parse this \v, the character it never knew about before, but we've also surreptitiously hidden a bug in the code? And again, Thompson goes into great detail about exactly how that can be done and exactly what steps we can then take to make it look like the bug was never there. We can change the source code, modify it, and make it look like we never had a bug in there, even though it is now propagating into all of the code we ever compile going forward. We've created a way to surreptitiously hide bugs in our code.

And the conclusion that Thompson draws is: is it ever possible to trust software that was written by anyone else? In this course we've talked about some of the tools available to programmers that would allow them to go back in time (for example, we've discussed GitHub on several occasions) and see prior versions of code. In the 1980s, when this paper was written, that wasn't necessarily possible. It was relatively easy to hide source code changes so that the untrained eye wouldn't know about them.
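To make stage three concrete, here is a sketch of what such a compromised compiler can look like. This is my illustration of the idea, not Thompson's actual code, and match(), emit(), and the other helpers are hypothetical:

```c
/* A trojaned compiler, sketched: it compiles almost everything
   faithfully, but special-cases two programs it recognizes. */
void compile(const char *src)
{
    if (match(src, "login"))          /* compiling the login program? */
    {
        emit_with_backdoor(src);      /* quietly accept a master password */
        return;
    }
    if (match(src, "compiler"))       /* compiling the compiler itself? */
    {
        emit_with_these_checks(src);  /* re-insert both of these checks, so
                                         the trojan survives even after every
                                         trace of it is deleted from the
                                         compiler's source code */
        return;
    }
    emit(src);                        /* everything else compiles normally */
}
```

The second check is the devious part: once a compiler binary containing this logic exists, you can delete the trojan from the compiler's source entirely, recompile, and the backdoor re-inserts itself. Reading the source code tells you nothing.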
Code was not shared via the internet; it was shared via floppy disks or hard disks that were passed between the people who needed them. And so there was no easy way to verify that code written by somebody else was actually trustworthy.

Now, again, this paper came out 35-plus years ago. And it came out around the time that the Computer Fraud and Abuse Act, which we've also previously discussed, was being drafted and run through Congress. Did lawmakers heed the advice of Ken Thompson? Do we still, today, trust that the programs we receive or write are free of bugs? Is there a way for us to verify that? What should happen if code is found to be buggy? What if it's unintentionally buggy? What if it's maliciously buggy? Do we have a way to challenge things like that? Do we have a way to prosecute those kinds of cases if the bug creates some sort of catastrophic failure in some business? Not exactly. The challenge of figuring out whether or not we should trust software is something we have to contend with every day. And there's no bright-line answer for exactly how to do so.

Now let's turn to perhaps a more modern interpretation of this idea and take a look at the Samsung smart TV policy. This was a bit of news a few years ago: Samsung was capturing voice commands so that people could use their televisions without needing a remote. You could say something like, "Television, please turn the volume up," or, "Television, change the channel." But it turned out that when Samsung collected this information, it was transmitting it to a third party, a third-party language processor, which would ostensibly take the commands it heard and feed them into its own database to improve the quality of its understanding of those commands. So, let's say thousands of people use this brand of television.
It would take those thousands of people's voices all making the same command, feed them into its algorithm for processing that command, and hopefully come up with a better, more comprehensive understanding of what the command meant, to avoid the mistake where I say one thing and the TV does something else because it misinterprets me.

If you take a look at Samsung's policy, it says things like: the device will collect IP addresses, cookies, your hardware and software configuration (the settings you have put onto your television), and your browser information. Some of these smart TVs have web browsers built into them, and so you may also be sharing information about your browsing history and so on.

Is this necessarily a bad thing? When it became a news story, it was mildly scandalous in the tech world because it was unexpected; no one thought that was something a television should be doing. But is it really all that different from what happens when you use your browser anyway? We've seen in this course that whenever we connect to a website, we need to provide our IP address so that the site we're requesting, the server, knows where to send our data back. And in addition, as part of those HTTP headers, we not only send our IP address; we're usually sending information about what operating system we're running and what browser we're currently using, and our IP address itself suggests roughly where we're geographically located. Are we leaking as much information when we use the internet to make a request as we are when our television is interpreting a command? Why is it that this particular action, this interpretation of sound, feels like so much more of a privacy violation than just accessing something on the internet, where we're voluntarily revealing much the same information? Are we not voluntarily relinquishing the same information to a company like Samsung, whose smart TVs precipitated this story?
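To make that comparison concrete, here is roughly what the headers of an ordinary browser request look like. This is an illustrative example with placeholder values, not a capture from any real device:

```http
GET /search?q=smart+tv+privacy HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14) AppleWebKit/537.36
Accept: text/html
Accept-Language: en-US,en;q=0.9
Cookie: sessionid=abc123
```

Every line after the first is metadata we volunteer on essentially every request: the site we want, the operating system and browser we're running, our preferred language, and whatever cookies the site has previously set.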
Moreover, is it technologically feasible for Samsung not to collect all of the sounds that it hears? One of the big concerns that came up with these smart TVs is: when does the recording and transmitting start? For those of you who have seen old versions of Star Trek, you may recall that in order to activate the computer on that television show, someone would just say "computer." The computer would spring to life, and then they could have a normal English-language interaction with it. There was no need to program specific commands or click anything or have any interaction other than voice. How would we technologically accomplish that now? How would a device know whether or not it should be listening, unless it's listening for a specific word? Is there a way for the device to listen to everything that comes in but only start sending information when it hears a command? Or is it impossible for it not to capture all of the audio it's hearing and transmit it somewhere else, encrypted or not? It's an interesting question.

Samsung also allows not only voice controls but gesture controls. This may help people who are visually impaired, or people who are unable to use a remote control device: they can wave or make certain gestures. And in so doing, the television may capture your face as part of the gesture, or certain movements that you're making, or even, depending on the quality of the camera built into the television, aspects of the room around you. Is this necessarily problematic? Is this something that we, as users of this software, need to accept as just part of the deal, that in order to use this feature, we have to allow it? Is there a necessary compromise? Is there a way to ensure that Samsung is handling our data properly? Should there be a way for us to verify this? Or is the way it handles that data proprietary to Samsung?

Again, these are all questions we really want to know the answers to. We want to know whether what these companies say they're doing is actually secure and private. And we can read the policies of the organizations that provide these tools. But is that enough?
Do we have a way to verify? Is there anything we can do other than just trust that these companies are doing what they say they're doing, or that services and programmers are providing tools that do exactly what they say they do? Without some really advanced knowledge and skill in tech, the answer is no. And even if you have that advanced skill or knowledge, it's really hard to take a look at a binary, the zeros and ones of the actual executable program running on these devices, and say, "Yes, I think that matches the source code they provided to me, so I can feel reasonably confident that I trust this particular piece of software."

As we've discussed in the context of security, trust is something we have to deal with. We're constantly torn between not trusting other people, and so encrypting everything, and needing to trust people in order for some things to work at all. It's a very delicate balancing act that we have to contend with every day.

And again, I don't mean to pick on Samsung here; this is just one of many examples that have come up in popular culture. Let's consider another one: a piece of hardware called the Intel Management Engine. Or hardware, firmware, or software, depending on what it is, because one of the open questions is: what exactly is the Intel Management Engine? What we do know about it is that it is usually part of the CPU itself. It's unclear, because it has not been publicly disclosed, whether it's built into the CPU or perhaps into the CMOS or the BIOS, different low-level parts of the motherboard. But it is a chip, or some software that runs on a computer, whose intended purpose is to help network administrators in the event that something has gone wrong with a computer. Recall that we previously discussed the idea that it's possible to encrypt your hard drive, and that there can be ramifications if you encrypt your hard drive and then forget exactly how to decrypt it.
What the Intel Management Engine would allow, as one of its several features, is for a network administrator, perhaps if you're in an enterprise setting, your IT professional or head of IT, to access your computer remotely by issuing commands, because the computer is able to listen on a specific port. It's a port number around 16,000; I don't remember it exactly, and it's discussed in the article provided. But it allows the computer to listen for a specific kind of request, one that should only be coming from an administrator's computer, to enable remote access to another machine.

The concern is that, because it's listening on a specific port, how is it possible to ensure that the requests it receives on that port, or via that IP address, are legitimate? Because Intel has not disclosed the actual code that comprises this module of the IME. And then the question becomes: is that a problem? Should Intel be required to reveal that code?

Some will certainly argue yes: it's really important for us as end users to understand what software is running on our devices. We have a right to know what programs are running on our computers. Others will say no, we don't have such a right. This is Intel's intellectual property. It may contain trade-secret information that allows its chips to work better. We don't, for example, argue that Coca-Cola should be required to reveal its secret formula to us because it may implicate certain allergies, or that Kentucky Fried Chicken needs to disclose its secret recipe. So why should Intel be required to tell us about the lines of code that comprise this part of its hardware or software or firmware, again depending on exactly what this tool is?

So the question again is: is Intel required to provide some degree of transparency? Do we have a right to know? Or should we just trust that this software is indeed only being used to allow remote access to authorized individuals?
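One thing a technically inclined user can do, even without Intel's cooperation, is check whether anything on a machine is answering on a given port. Here is a minimal sketch in C; the host address is a placeholder, and the port number is only an assumption on my part about where such a service might listen, to be checked against the article:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Return 1 if a TCP connection to ip:port succeeds, else 0.
   (A real tool would also set a connect timeout.) */
int port_open(const char *ip, int port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 0;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, ip, &addr.sin_addr);

    int ok = (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0);
    close(fd);
    return ok;
}

int main(void)
{
    const char *host = "192.168.1.10";  /* placeholder address */
    int port = 16992;                   /* assumed management port */
    printf("%s:%d is %s\n", host, port, port_open(host, port) ? "open" : "closed");
    return 0;
}
```

Of course, finding the port closed today proves little about what the firmware could do tomorrow; that's exactly the trust problem.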
If Intel were to provide a tool to tell us whether our computer is vulnerable to attack from outside computers accessing our own personal machines, outside of the enterprise context, should we trust the result of software that Intel itself provided? As it turns out, Intel does provide such software, which tells you whether or not your IME is activated in such a way that yes, you are subject to potential remote access, or no, you are not. Does saying that you are or you aren't reveal potential trade-secret information about Intel? Should we be concerned that Intel is the one providing us this information, rather than a third party? Of course, Intel is the only organization that really can tell us whether we're vulnerable, because they're the only ones who know what is in this software.

So again, I'm not picking on any individual company here, just drawing from case studies that exist in popular culture and in tech circles about the kinds of questions that we need to start considering and wrestling with. Are companies going to be required to disclose this information? Should Samsung be revealing what sorts of data it's collecting and how it's collecting it? Do we trust that our compilers, as Ken Thompson alluded to, actually compile our code the way they say they do? This healthy skepticism is always at the forefront of our minds when we're considering programming- and technology-related questions. But how do we press on these issues further in a legal context? That's still to be determined. And it's going to be something that we're going to be grappling with for quite some time, I think.

Another key issue likely to be faced by technologists, and the lawyers who represent them, particularly startups working in a small environment with limited numbers of programmers who may be relying on material that's been open-sourced online, is open source software and licensing. The licensing scheme that exists out there is quite complicated. There are many, many different licenses with many, many different provisions associated with them.
And each one will have a different combination of behaviors that are permitted or not, with potential ramifications for using it. We're going to discuss three of the most popular licenses, particularly in the context of open source software generally released on GitHub.

The first of these is GPL version 3, the GPL being the GNU General Public License. One of the things the GPL often gets criticized for is that it is what's known as a copyleft license. Copyleft is designed to be the inverse of how copyright protection is usually thought of. Copyright protections give the owner of the copyright, not necessarily the creator, but the person who owns the copyright, the ability to restrict certain behaviors associated with that work or material. The GPL does sort of the opposite. Instead of restricting the rights of others, it compels others who use code licensed under the GPL to avoid imposing any restrictions at all, such that others can also benefit from using and modifying that same source code.

The catch with the GPL is that any code that incorporates GPL-licensed code is itself transformed. Say you incorporate some module written by somebody else, or your client incorporates something they found on GitHub or elsewhere on the internet into their own project. If that code is licensed under the GPL, one of the side effects of what you or your client have just done is that you have transformed your entire work into something that is GPL-licensed, which means you are then also required to make the source code available to anybody, to make the binary available to anybody, and to allow anybody the same rights of modification and redistribution that you had.

So think about some of the dangers that might introduce for a company that relies extensively on GPL-licensed code. They may not be able to profit as much from that code as they thought they would. Perhaps they thought they had an amazing, disruptive idea that was going to transform the market.
And this particular piece of GPL code that they found online was the final piece of the puzzle that they needed. When they included it in their own source code, they transformed their entire project, according to the terms of the GPL, into something that was also GPL-licensed. They could still sell it, but their profitability may be diminished because the source code is freely available to anybody.

Now, some people find this particularly restrictive. In fact, it's sometimes pejoratively referred to as the GPL virus, the GNU General Public License virus, because it propagates so extensively. As soon as you use code that is GPL-licensed, suddenly everything it touches is also GPL-licensed. So depending on your perspective on open source licensing, it's either a great thing, because it's making more material available, or a bad thing, because it deters people from using open source material to create further developments when they don't want to license the changes or modifications they've made.

The Lesser General Public License, or LGPL, is basically the same idea, but it applies only to library code. If code is LGPL-licensed, any modifications that you make to that code itself also need to be released under the LGPL. But the other, ancillary things you do in your program that incorporates this library code do not need to be LGPL-licensed. So it would be possible to license your program under other terms, including terms that are not open source at all. Changes that you make to the library need to be propagated down the line, so that other people can benefit from your library-specific changes, but that requirement does not reach back into your own code; you don't have to make your own code publicly available. So the LGPL is considered slightly lesser in terms of its ability to propagate, and also lesser in terms of its ability to grant rights to others.

Then you have, at the other end of the spectrum, the MIT license.
The MIT license is considered one of the most permissive licenses available. It says: here's the software; do whatever you want with it. You can make changes to it. You don't have to re-license those changes to others. You can take this code and profit from it. You can take this code and re-license it under some other scheme if you want. So this is the other extreme.

Is this license copyleft? No, it's not, because it doesn't require others to adhere to the same licensing terms. Again, you can do with it whatever you'd like. Most of the code actually found on GitHub is MIT-licensed. So in that sense, using code that you find online is not necessarily problematic for an entrepreneur or a budding developer who wants to profit from some larger program that incorporates MIT-licensed code, whereas it might be an issue for those incorporating GPL-licensed code.

What sorts of considerations, then, would go into deciding which license to use? And again, these are just three of the many, many licenses that pertain to software development. Then, of course, there are open source licenses that are not tied to software at all. For example, a lot of the material that we produce for CS50, the course at Harvard College on which this is based, is licensed under a Creative Commons license, which is similar in spirit to the GPL, inasmuch as it oftentimes requires people to re-license the changes they make to that material under Creative Commons as well. It will generally include a non-commercial provision: it is not possible to profit from any changes that you make, and so on. And that's not a software license; it's more of a general media-related license. So open source licenses exist in both contexts.

But what sorts of considerations might go into choosing a license? Well, it really does depend on the organization itself. And that's why understanding a bit about these licenses certainly comes into play.
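Mechanically, by the way, declaring a license is usually as simple as including the license text in the repository and a short notice at the top of each source file. Here is an illustrative header, with a made-up author, using the SPDX convention for naming licenses:

```c
/* SPDX-License-Identifier: MIT */
/* Copyright (c) 2019 Jane Developer (hypothetical author) */
```

The hard part isn't the mechanics; it's the choice, which comes down to questions like the ones that follow.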
Do you want your changes to propagate and get out into the market more easily? That might be a reason to use the MIT license, which is very permissive. Do you feel compelled to share code with others, and do you want to insist that others share that code as well? Then you might want to use the GPL. Do you want to use open source code but not freely release your own code, the changes you make to interact with that code? That might be cause for relying on the LGPL for the library code that you import and use, while licensing your own changes and modifications under some other scheme. Again, it's a very complex and open field that's going to require a lot of research from anyone who's going to be helping clients who work in software development and who care about what they can do with that code going forward.

So let's turn our attention now from issues that have existed for a while and have been bubbling underneath the surface, issues of trust and issues of software licensing, and start to contend with new technologies and how the law keeps up with them. You'll hear these referred to as emergent technologies or new technologies. You'll sometimes see them called disruptive technologies, because they are poised to materially affect the way that we interact with technology, particularly in terms of purchasing things through commerce, as in the case of our first topic: 3D printing.

So how does 3D printing work? That's a good question to ask at the outset. It's similar in spirit to a 2D printer. With a 2D printer, you have a print head that deposits ink, typically from some sort of toner cartridge. It moves left to right across a piece of paper, and the paper is also fed through some sort of feeder. So the left-to-right movement of the print head is the x-axis movement, and the paper rolling underneath provides the y-axis movement, such that when we're done, we get a piece of paper that has ink placed across it, left to right, top to bottom.
3D printers work in much the same way, except that their medium, instead of being ink or toner, is typically some sort of filament, which conventionally, at least at the time of this recording, has generally been plastic-based. What basically happens is that the plastic is heated to just above its melting point and then deposited onto some surface. That surface is moved over by a similar print head, basically a nozzle, an eyedropper of plastic, which can move across a flat surface, similar to what a 2D printer would do. But instead of printing only on a flat plane, the arm can also move up and down, and on some models of 3D printers the table can move up and down, allowing the printer to print not only on the xy-plane but also along the z-axis. So it can print in space and create three-dimensional objects; hence, 3D printing. Typically the material is melted to just above its melting point, so that by the time it's deposited onto the surface, or onto other existing plastic, it has already cooled enough to harden again. The idea is to melt it just enough that, by the time it's placed onto some other surface, it re-hardens and becomes a rigid material once again.

Now, 3D printing is usually considered a disruptive technology because it allows people to create items they may not otherwise have access to. And of course, the controversial example that is often spoken about, in terms of whether we need to ban certain 3D printers or certain 3D printing technologies, is guns. It's actually possible, using technology that exists right now, to 3D print a plastic gun that would evade the sort of metal detection usually used for detecting guns and that is fully functional: it can fire bullets, plastic ones or real metal ones.

The article recommended alongside this part of the discussion proposes several different ways that the law might be able to keep up with 3D printing technologies. Because, again, the law typically lags behind technology, so is there a way for the law to contend with this?
There are a couple of options it proposes that I think are worthy of discussion. The first is to allow permissionless innovation. Should we just allow people to do whatever they want with 3D printing technology, and decide after the fact that this particular thing you just did is not OK, the rest of it is fine, and disallow that type of thing going forward? This approach is interesting because it allows people to be creative, and it potentially allows things to be revealed about 3D printing technology that were not possible to forecast in advance. But is that reactive approach better? Or should we be proactive in trying to prevent the production of certain things that we don't want produced?

Moreover, although plastic filament tends to be the most popular and common medium for 3D printing right now, 3D printers are being developed that are much more advanced. We are not necessarily restricted to plastic-based printing; we may have metal-based printing. And you may have even seen that there are 3D printers that can produce organic materials: they use human cells, basically, to create things like organs. Do we want people to be able to create these things? Is this the kind of thing that should be regulated beforehand, rather than after we've already printed and exchanged copyrighted designs for what to build and construct? Is it too late, by the time we have regulated it, to prevent it from being reproduced in the future?

Another thought this article proposes is immunizing intermediaries. Should we allow people to do whatever they want with 3D printing, or perhaps not allow that, but in either case not punish the manufacturers of 3D printers, and not punish the designers of the CAD files, the computer-aided design files, that generally go into 3D printing? Is this a reasonable policy approach? It's not an unheard-of one. It's the approach we have typically used with respect to gun manufacturers, for example: gun manufacturers generally are not subject to prosecution for crimes that are committed using those guns.
Should we apply something similar to 3D printers, for example, when a printer is used to manufacture a gun? Who should be punished in that case: the person who designed the gun model, the person who actually printed the gun, the 3D printer manufacturer itself, any of those people? Again, it's an unanswered question that the law is going to have to contend with going forward.

Another potential solution is to rely on existing common law. But the problem that typically arises there is that there is no federal common law, and so this would potentially result in 50 different jurisdictions handling the same problem in different ways. Whether that's a good thing or a bad thing depends, again, on how quickly these things move. Common law, as we've seen, is certainly capable of adapting to new technologies. Does it do so quickly enough for us?

Finally, another proposal is that we could just allow the 3D printing industry to self-regulate. After all, we as attorneys self-regulate, and that seems to work just fine. Now, granted, this may be because we are in an adversarial system, and so there are advantages and extra incentives for adversaries to insist that we adhere to our ethical principles and do the right thing. There's also the overhanging threat of outside regulation if we do not self-regulate. So adapting the lawyers' model to 3D printing may work, because it seems to be working well for attorneys. But then consider that social media companies are also self-regulating, with respect to data protection and data privacy, and as we've seen, that's maybe not going so well. So how do we handle the regulation of 3D printing? Does it fall into the self-regulation category that succeeds? Does it fall into the self-regulation category that doesn't? Does it require preemptive regulation to deal with? Now, 3D printing also has some other potential concerns.
Very easily, by the nature of the technology itself, it's quite capable of violating copyrights, patents, trademarks, and potentially more, by virtue of the fact that you can create things that may be copyrighted or patented or trademarked. And there's also prior case law that sort of informs the potential consequences of using 3D printers: the Napster case from several years ago. Napster's technology allowed peer-to-peer sharing of digital music files. Basically, that service was deemed to exist entirely for the purpose of violating copyright, and so Napster was essentially shut down. Will 3D printers suffer the same fate? You could argue that 3D printers are generally used to recreate things that may be patented or subject to copyright. Or will it fall more into a category like Sony's, which many years ago was part of a lawsuit involving VCRs and tape-delaying copyrighted material? Is that going to be more of a precedent for 3D printing, or is the Napster case? Again, we don't really know. It's up to the future practitioners of technology law, who will be forced to grapple with the challenges presented by 3D printing, to nudge us in one direction or the other.

To dive a bit more deeply into this topic of 3D printing, I do recommend you take a look at the article "Guns, Limbs, and Toys: What Future for 3D Printing?" And if you're particularly interested in 3D printing, some of its ramifications, and its technological underpinnings, I also encourage you to take a look at "The Law and 3D Printing," a law review article from 2015 that is periodically updated online. It's a wonderful bibliography of all the different things that 3D printing touches, and it will presumably continue to be updated as cases and laws come into play that interact with 3D printing and start to define this relatively ambiguous space.

Another particularly innovative space that really pushes the boundaries of what the law is capable of handling is augmented reality and virtual reality. And we'll consider them in that order. Let's define what augmented reality is.
668 00:35:55,532 --> 00:35:58,240 And the most common example of this that you may be familiar with 669 00:35:58,240 --> 00:36:00,820 is a phenomenon from several years ago called Pokemon Go. 670 00:36:00,820 --> 00:36:03,920 It was a game that you played on your mobile phone. 671 00:36:03,920 --> 00:36:06,770 And you would hold up your phone, and you 672 00:36:06,770 --> 00:36:08,690 would see, as if you 673 00:36:08,690 --> 00:36:14,240 were taking a picture, the real world through the lens of the camera. 674 00:36:14,240 --> 00:36:18,080 But superimposed onto that would be digital avatars 675 00:36:18,080 --> 00:36:21,080 of Pokemon, which is part of this game of collectible creatures 676 00:36:21,080 --> 00:36:25,040 that you're trying to walk around and find and capture, basically. 677 00:36:25,040 --> 00:36:30,200 So you would try and throw some fake ball at them to capture them. 678 00:36:30,200 --> 00:36:35,570 So augmented reality is some sort of digital graphical overlay 679 00:36:35,570 --> 00:36:36,740 over the real world. 680 00:36:36,740 --> 00:36:40,220 Contrast this with virtual reality, in which one typically 681 00:36:40,220 --> 00:36:42,410 wears a headset of some sort. 682 00:36:42,410 --> 00:36:45,433 It's usually proprietary. 683 00:36:45,433 --> 00:36:47,600 It's not generally available as an app, for example, 684 00:36:47,600 --> 00:36:50,630 like the augmented-reality game Pokemon Go was. 685 00:36:50,630 --> 00:36:53,720 It's usually tied to a specific brand of headset, 686 00:36:53,720 --> 00:36:56,990 like Oculus being one type of headset, for example. 687 00:36:56,990 --> 00:37:00,110 And it is an immersive alternate reality, basically. 688 00:37:00,110 --> 00:37:04,820 When you put the headset on, you don't see the world around you. 689 00:37:04,820 --> 00:37:06,650 You are transported into another space. 690 00:37:06,650 --> 00:37:09,740 And to make the experience even more immersive, 691 00:37:09,740 --> 00:37:13,760 there's the potential to wear headphones, for example, 692 00:37:13,760 --> 00:37:16,940 so that you are not only immersed in a visual space, 693 00:37:16,940 --> 00:37:20,740 but also immersed in a soundscape. 694 00:37:20,740 --> 00:37:24,070 Now, something that's particularly strange about these environments 695 00:37:24,070 --> 00:37:26,330 is that they are still interactive. 696 00:37:26,330 --> 00:37:28,630 It is still possible for multiple people, scattered 697 00:37:28,630 --> 00:37:34,090 in different parts of the world, to be involved in the same virtual reality 698 00:37:34,090 --> 00:37:36,320 experience, or the same augmented-reality experience. 699 00:37:36,320 --> 00:37:38,620 Let's now consider virtual reality experiences, where 700 00:37:38,620 --> 00:37:42,400 you are taken away from the real world. 701 00:37:42,400 --> 00:37:47,910 What should happen if someone were to commit a crime in a virtual reality 702 00:37:47,910 --> 00:37:48,960 space? 703 00:37:48,960 --> 00:37:53,130 Studies have shown that being immersed in a virtual reality 704 00:37:53,130 --> 00:37:57,180 experience can have serious ramifications for people. 705 00:37:57,180 --> 00:38:00,240 They can have real feelings that last for a long time 706 00:38:00,240 --> 00:38:02,380 based on their experiences in them.
707 00:38:02,380 --> 00:38:06,450 For example, there's been a study where people put on a virtual reality 708 00:38:06,450 --> 00:38:09,600 headset, and they were then immersed in this space where 709 00:38:09,600 --> 00:38:12,310 they were standing on a plank. 710 00:38:12,310 --> 00:38:15,040 And they were asked to step off the plank. 711 00:38:15,040 --> 00:38:18,430 Now, in the real world, this would be just like this room. 712 00:38:18,430 --> 00:38:20,830 I can see that everything around me is a carpet. 713 00:38:20,830 --> 00:38:23,590 There's no giant pit for me to fall into. 714 00:38:23,590 --> 00:38:28,780 But when I have this headset on, I'm completely taken away from reality 715 00:38:28,780 --> 00:38:30,370 as we see it here. 716 00:38:30,370 --> 00:38:33,790 The experience is so pervasive for some people 717 00:38:33,790 --> 00:38:37,330 that they walk to the edge of the plank, and they freeze in fear. 718 00:38:37,330 --> 00:38:38,170 They can't move. 719 00:38:38,170 --> 00:38:41,980 There's a real physical manifestation in the real world 720 00:38:41,980 --> 00:38:43,720 of what they feel in the virtual reality. 721 00:38:43,720 --> 00:38:46,900 And for those brave people who are able to take the step off the edge, 722 00:38:46,900 --> 00:38:50,527 many of them lean forward and try and fall into the space. 723 00:38:50,527 --> 00:38:52,360 And some of them may even get the experience 724 00:38:52,360 --> 00:38:54,760 like when you're on a roller coaster, and you feel that tingle in your spine 725 00:38:54,760 --> 00:38:56,080 as you're falling. 726 00:38:56,080 --> 00:38:58,970 The sense that that actually is happening to you 727 00:38:58,970 --> 00:39:05,100 is so real in the virtual reality space that you can feel it. 728 00:39:05,100 --> 00:39:09,610 So what would be the case, then, if you are in a virtual reality space, 729 00:39:09,610 --> 00:39:13,290 and someone were to pull a virtual gun on you? 730 00:39:13,290 --> 00:39:15,830 Is that assault? 731 00:39:15,830 --> 00:39:20,150 Assault is a crime where your perception of harm is a material element. 732 00:39:20,150 --> 00:39:21,380 It's not actual harm. 733 00:39:21,380 --> 00:39:23,060 It's your perception of it. 734 00:39:23,060 --> 00:39:26,060 You can perceive in the real world, when somebody points a gun at you, 735 00:39:26,060 --> 00:39:28,580 this fear of imminent bodily harm. 736 00:39:28,580 --> 00:39:33,770 Can you feel that same fear of imminent bodily harm in a virtual world? 737 00:39:33,770 --> 00:39:36,260 That's not a question that's really been answered. Moreover, 738 00:39:36,260 --> 00:39:40,550 who has jurisdiction over a crime that is committed in virtual reality? 739 00:39:40,550 --> 00:39:42,950 It's possible that I, here in the United States, 740 00:39:42,950 --> 00:39:45,680 might be interacting with someone in France, 741 00:39:45,680 --> 00:39:50,180 who is maybe the perpetrator of this virtual assault that I'm describing. 742 00:39:50,180 --> 00:39:53,030 Is the crime committed in the United States? 743 00:39:53,030 --> 00:39:54,680 Is the crime committed in France? 744 00:39:54,680 --> 00:39:57,090 Do we have jurisdiction over the potential perpetrator, 745 00:39:57,090 --> 00:39:59,060 even though all I'm experiencing or seeing 746 00:39:59,060 --> 00:40:03,117 is that person's avatar as opposed to their real persona? 747 00:40:03,117 --> 00:40:04,700 Does anyone have jurisdiction over it? 748 00:40:04,700 --> 00:40:08,360 Does the jurisdiction only exist in the virtual world?
749 00:40:08,360 --> 00:40:12,260 Virtual reality introduces a lot of really interesting questions 750 00:40:12,260 --> 00:40:16,550 that are poised to redefine the way we think about jurisdiction 751 00:40:16,550 --> 00:40:21,800 in defining crimes and the prosecutability of crimes 752 00:40:21,800 --> 00:40:24,860 in a virtual space. 753 00:40:24,860 --> 00:40:27,890 Some other terms just to bring up as well, which sort of tangentially 754 00:40:27,890 --> 00:40:30,170 relate to virtual and augmented reality, so that you're 755 00:40:30,170 --> 00:40:34,040 familiar with them, are the very technologically driven 756 00:40:34,040 --> 00:40:37,580 real-world crimes of doxing and swatting. 757 00:40:37,580 --> 00:40:40,430 Doxing, if unfamiliar, is a crime involving 758 00:40:40,430 --> 00:40:43,760 revealing or exposing the personal information of someone 759 00:40:43,760 --> 00:40:48,620 on the internet with the intent to harass or embarrass or do 760 00:40:48,620 --> 00:40:51,307 some harm to them by having that exposed, so, for example, 761 00:40:51,307 --> 00:40:53,390 revealing somebody's phone number such that it can 762 00:40:53,390 --> 00:40:56,600 be called incessantly by other people. 763 00:40:56,600 --> 00:41:03,230 As well as swatting, which is a, well, pretty horrible crime, whereby 764 00:41:03,230 --> 00:41:07,513 an individual calls the police and says, John Smith 765 00:41:07,513 --> 00:41:09,680 is committing a crime at this address, is holding me 766 00:41:09,680 --> 00:41:12,170 hostage, or something like that, with the intention 767 00:41:12,170 --> 00:41:15,320 that the police would then go to that location 768 00:41:15,320 --> 00:41:19,130 and a SWAT team would go, hence the term swatting, 769 00:41:19,130 --> 00:41:24,780 and potentially cause serious injury or harm to the ostensibly innocent John 770 00:41:24,780 --> 00:41:26,780 Smith, who's just sitting at home doing nothing. 771 00:41:26,780 --> 00:41:28,700 These two crimes are generally interrelated. 772 00:41:28,700 --> 00:41:32,593 And they oftentimes come up in the technological context, 773 00:41:32,593 --> 00:41:34,760 usually as part of the same conversation, when we're 774 00:41:34,760 --> 00:41:38,120 thinking about virtual reality crimes. 775 00:41:38,120 --> 00:41:40,940 One of the potential upsides, though, if you 776 00:41:40,940 --> 00:41:44,270 want to think about it like that, of crimes that are committed 777 00:41:44,270 --> 00:41:46,992 in virtual or augmented reality is-- 778 00:41:46,992 --> 00:41:48,200 well, there are actually a few. 779 00:41:48,200 --> 00:41:51,620 First, because it is happening in a virtual space, 780 00:41:51,620 --> 00:41:55,023 and because generally in the virtual space all of our movements are tracked, 781 00:41:55,023 --> 00:41:57,440 and the identities of everybody who's entering and leaving 782 00:41:57,440 --> 00:42:00,830 that space are tracked by way of IP addresses, 783 00:42:00,830 --> 00:42:05,660 it may be easier for investigators to figure out who 784 00:42:05,660 --> 00:42:07,640 the perpetrators of those crimes are. 785 00:42:07,640 --> 00:42:11,300 You know exactly the IP address of the person who apparently initiated 786 00:42:11,300 --> 00:42:15,350 this threat against you in the virtual space, which may perhaps make it easier 787 00:42:15,350 --> 00:42:18,920 to go and find that person in reality and question them 788 00:42:18,920 --> 00:42:22,228 about their involvement in this alleged crime.
789 00:42:22,228 --> 00:42:25,020 The other fortunate thing about these crimes, 790 00:42:25,020 --> 00:42:27,770 and this is not to mitigate the effect that these crimes can have, 791 00:42:27,770 --> 00:42:31,550 is that usually you can kind of mute them. 792 00:42:31,550 --> 00:42:34,670 If somebody is in a virtual space, and they're just screaming constantly, 793 00:42:34,670 --> 00:42:37,670 such that you might consider that to be disturbing the peace when you're 794 00:42:37,670 --> 00:42:41,315 in a virtual space trying to have some sort of pleasant experience ordinarily, 795 00:42:41,315 --> 00:42:43,400 you usually have the capability of muting them. 796 00:42:43,400 --> 00:42:45,400 This is not a benefit that we have in real life. 797 00:42:45,400 --> 00:42:48,640 We generally can't stop crimes by just pretending they're not happening. 798 00:42:48,640 --> 00:42:50,630 But in a virtual space, we do have that luxury. 799 00:42:50,630 --> 00:42:54,380 That's, again, not to mitigate some of the very unpleasant and unfortunate 800 00:42:54,380 --> 00:42:58,820 things that can happen in virtual reality that are just inappropriate. 801 00:42:58,820 --> 00:43:02,180 But being in that space does allow people 802 00:43:02,180 --> 00:43:08,150 the option to get away from the crime in a way that the confines of reality 803 00:43:08,150 --> 00:43:08,810 may not allow. 804 00:43:08,810 --> 00:43:11,900 But again, this is a very challenging area 805 00:43:11,900 --> 00:43:15,950 because the law is not really equipped right now 806 00:43:15,950 --> 00:43:19,460 to handle what happens in an alternate reality, which effectively 807 00:43:19,460 --> 00:43:20,700 virtual reality is. 808 00:43:20,700 --> 00:43:24,080 And so, again, if you're considering trying to figure out the best 809 00:43:24,080 --> 00:43:27,110 way to prosecute these issues or deal with these issues, 810 00:43:27,110 --> 00:43:30,973 you may be at the forefront of trying to define how crimes 811 00:43:30,973 --> 00:43:32,390 are dealt with in a virtual space. 812 00:43:32,390 --> 00:43:36,080 Or, if you're working with augmented reality, 813 00:43:36,080 --> 00:43:40,130 how do you prosecute crimes where malicious code is put up in front of you 814 00:43:40,130 --> 00:43:43,190 to simulate something that might be happening in the real world? 815 00:43:43,190 --> 00:43:46,910 You may be, for example, 816 00:43:46,910 --> 00:43:49,280 using a GPS program that is designed to navigate you 817 00:43:49,280 --> 00:43:51,590 in one direction versus the other through the set of glasses 818 00:43:51,590 --> 00:43:54,230 that you're wearing, so you don't have to keep looking at your phone to make sure 819 00:43:54,230 --> 00:43:55,940 that you're going the right way. 820 00:43:55,940 --> 00:44:00,830 What if somebody maliciously programs that augmented-reality program to route 821 00:44:00,830 --> 00:44:03,448 you off a cliff somewhere, right? 822 00:44:03,448 --> 00:44:04,490 How do we deal with that? 823 00:44:04,490 --> 00:44:07,880 Right now, again, augmented reality and virtual reality 824 00:44:07,880 --> 00:44:12,207 are a relatively untested space for lawyers and the law.
825 00:44:12,207 --> 00:44:14,040 In the second part of today's lecture, we're 826 00:44:14,040 --> 00:44:17,220 going to take a look at some potential regulatory challenges going forward, 827 00:44:17,220 --> 00:44:21,730 some issues at the forefront of law and technology generally related to privacy 828 00:44:21,730 --> 00:44:24,360 and how the law is ill-equipped, or hopefully 829 00:44:24,360 --> 00:44:27,810 soon to be equipped, to handle the challenge that these issues present. 830 00:44:27,810 --> 00:44:30,150 And the first of these is your digital privacy, 831 00:44:30,150 --> 00:44:34,180 in particular, the abilities of organizations, companies, 832 00:44:34,180 --> 00:44:38,940 and mobile device manufacturers to track your whereabouts, whether that's 833 00:44:38,940 --> 00:44:42,540 your digital whereabouts, where you go on the internet, 834 00:44:42,540 --> 00:44:44,115 or your physical whereabouts. 835 00:44:44,115 --> 00:44:48,538 We'll start with the former, your digital whereabouts. 836 00:44:48,538 --> 00:44:51,330 So there's an article we provided on digital tracking technologies. 837 00:44:51,330 --> 00:44:54,038 This is designed to be a primer for the different types of things 838 00:44:54,038 --> 00:44:56,850 that companies, in particular their marketing teams, 839 00:44:56,850 --> 00:45:00,390 may do to track individuals online with, again, 840 00:45:00,390 --> 00:45:02,940 relatively little recourse for the individuals 841 00:45:02,940 --> 00:45:06,180 to know what sorts of information are being gathered 842 00:45:06,180 --> 00:45:08,925 about them, at least in the US. 843 00:45:08,925 --> 00:45:10,800 Now, of course, we're familiar with this idea 844 00:45:10,800 --> 00:45:15,030 of a cookie from our discussion of interacting with websites. 845 00:45:15,030 --> 00:45:18,450 It's our shorthand way to bypass the login credentials 846 00:45:18,450 --> 00:45:23,728 and show sort of a virtual hand stamp saying, yes, I am who I say I am. 847 00:45:23,728 --> 00:45:25,770 I've already previously logged into your service. 848 00:45:25,770 --> 00:45:28,140 Cookies are certainly one way that a site 849 00:45:28,140 --> 00:45:33,500 can track a recurrent user who comes to the site over and over and over. 850 00:45:33,500 --> 00:45:36,240 Now, this article posits that most consumers have just 851 00:45:36,240 --> 00:45:38,420 come to accept that they're being tracked, 852 00:45:38,420 --> 00:45:42,210 like that's just part of the deal with the internet. 853 00:45:42,210 --> 00:45:47,250 Do you think that using cookies and being tracked 854 00:45:47,250 --> 00:45:51,468 is an essential requirement of what it means to use the internet today? 855 00:45:51,468 --> 00:45:53,760 And if you do think that, is that the way it should be? 856 00:45:53,760 --> 00:45:57,780 And if you don't think that, should it be that way? 857 00:45:57,780 --> 00:46:02,680 Or should we at least be questioning the fact that tracking is happening, 858 00:46:02,680 --> 00:46:08,190 rather than accepting it as an essential part of what it means to use the internet? 859 00:46:08,190 --> 00:46:10,380 We also need to be concerned about the types of data 860 00:46:10,380 --> 00:46:13,440 that companies are using or collecting about us. 861 00:46:13,440 --> 00:46:17,460 Certainly cookies are one way to identify who we are. 862 00:46:17,460 --> 00:46:23,340 But it's also possible for a cookie to be identified with what types of data 863 00:46:23,340 --> 00:46:27,000 an individual accesses while visiting a particular site.
864 00:46:27,000 --> 00:46:30,120 So for example, if I am on Facebook, and I'm using my cookie, 865 00:46:30,120 --> 00:46:32,252 and I'm looking up lots of pictures on Facebook-- 866 00:46:32,252 --> 00:46:33,960 I'm just searching for all my friends' 867 00:46:33,960 --> 00:46:37,020 profiles and clicking on all the ones that have cats in them-- 868 00:46:37,020 --> 00:46:41,580 that might then give Facebook, or the administrator of that site, 869 00:46:41,580 --> 00:46:47,445 the ability to pair that cookie with a particular trend of things 870 00:46:47,445 --> 00:46:49,250 that that cookie likes. 871 00:46:49,250 --> 00:46:52,650 So in this case, it knows, OK, maybe the person who 872 00:46:52,650 --> 00:46:54,630 owns this cookie likes cats. 873 00:46:54,630 --> 00:46:56,990 And as such, it may then start to serve up 874 00:46:56,990 --> 00:47:01,320 advertisements related to cats to me. 875 00:47:01,320 --> 00:47:06,055 And then when I log into a site, it's going 876 00:47:06,055 --> 00:47:07,680 to get information about my IP address. 877 00:47:07,680 --> 00:47:12,960 And if I use that cookie, it has now mapped my IP address to the fact 878 00:47:12,960 --> 00:47:15,640 that I like cats. 879 00:47:15,640 --> 00:47:21,660 And then it could sell the information about me, this particular IP address-- 880 00:47:21,660 --> 00:47:25,590 I guess it's not necessarily me, because one IP address usually covers a house, 881 00:47:25,590 --> 00:47:27,690 but it gets you pretty close-- 882 00:47:27,690 --> 00:47:32,720 mapping this particular IP address to somebody who likes cats. 883 00:47:32,720 --> 00:47:34,620 So they may sell that to some other service. 884 00:47:34,620 --> 00:47:36,662 Now, it turns out that IP addresses are generally 885 00:47:36,662 --> 00:47:40,290 allocated in geographic blocks, which means that, again, just by virtue 886 00:47:40,290 --> 00:47:42,390 of the fact that I log into a particular site, 887 00:47:42,390 --> 00:47:47,250 that site can make a rough guess about where I physically am. 888 00:47:47,250 --> 00:47:49,763 They may not be able to geographically isolate it precisely. 889 00:47:49,763 --> 00:47:52,680 But again, depending on how populated the area you are currently living in 890 00:47:52,680 --> 00:47:58,920 is, they can possibly narrow it down to a city block: someone in this city block 891 00:47:58,920 --> 00:47:59,970 really likes cats. 892 00:47:59,970 --> 00:48:04,020 And then this company may be involved in targeted actual physical mail 893 00:48:04,020 --> 00:48:07,440 advertising, snail mail advertising, where 894 00:48:07,440 --> 00:48:11,070 some company that sells cat products, like a pet store or something, 895 00:48:11,070 --> 00:48:15,390 might target that particular block with advertising, in the hopes that, because 896 00:48:15,390 --> 00:48:19,140 of this data that has been collected about this particular cookie, which then 897 00:48:19,140 --> 00:48:21,450 logged in with a particular IP address, which 898 00:48:21,450 --> 00:48:25,310 we've narrowed to a particular geographic location-- 899 00:48:25,310 --> 00:48:27,960 it's kind of feeling a little unsettling, right? 900 00:48:27,960 --> 00:48:31,240 Suddenly something that we do online is having a manifestation, again, 901 00:48:31,240 --> 00:48:35,850 in the real world, where we're getting targeted advertising not just 902 00:48:35,850 --> 00:48:40,170 on sites that we visit, but also in our mailbox at home.
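To make the pipeline just described concrete, here's a minimal Python sketch of the kind of bookkeeping an ad service could do. Everything in it is hypothetical: the function names, the cookie ID, the interest labels, and the tiny IP-prefix-to-neighborhood table are all invented for illustration, not drawn from any real service.

```python
# A toy sketch of cookie-based interest tracking and coarse IP
# geolocation, as described above. All names and data are hypothetical.

from collections import Counter, defaultdict

# Each cookie ID maps to a tally of topics that cookie has clicked on.
interests_by_cookie = defaultdict(Counter)

# Each cookie ID maps to the IP address it last logged in from.
last_ip_by_cookie = {}

# A toy geolocation table: IP prefix -> rough neighborhood.
# Real geolocation databases map allocated IP blocks to regions.
GEO_BY_PREFIX = {
    "203.0.113": "Elm Street block",    # TEST-NET addresses, purely illustrative
    "198.51.100": "Main Street block",
}

def record_click(cookie_id, topic):
    """Called each time the cookie clicks content tagged with a topic."""
    interests_by_cookie[cookie_id][topic] += 1

def record_login(cookie_id, ip_address):
    """Called when the cookie is presented during a login."""
    last_ip_by_cookie[cookie_id] = ip_address

def neighborhood_interest(cookie_id):
    """Join the cookie's top interest with a rough location from its IP."""
    if cookie_id not in last_ip_by_cookie:
        return None
    prefix = ".".join(last_ip_by_cookie[cookie_id].split(".")[:3])
    area = GEO_BY_PREFIX.get(prefix, "unknown area")
    counts = interests_by_cookie[cookie_id]
    top_topic = counts.most_common(1)[0][0] if counts else "unknown"
    return (area, top_topic)

# A few clicks on cat pictures, then one login, and suddenly
# "someone near Elm Street really likes cats" is sellable data.
record_click("cookie-abc123", "cats")
record_click("cookie-abc123", "cats")
record_login("cookie-abc123", "203.0.113.57")
print(neighborhood_interest("cookie-abc123"))  # ('Elm Street block', 'cats')
```

The point of the sketch is how little machinery is needed: two dictionaries and a prefix lookup are enough to turn clicks into a coarse physical address.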
903 00:48:40,170 --> 00:48:43,050 It's a little bit discomfiting. 904 00:48:43,050 --> 00:48:45,990 Should IP addresses be allocated in this way? 905 00:48:45,990 --> 00:48:49,560 Is this the kind of thing that technologically can be changed? 906 00:48:49,560 --> 00:48:51,810 The latter answer is yes, it is possible to allocate 907 00:48:51,810 --> 00:48:54,210 IP addresses in a different way than we typically do. 908 00:48:54,210 --> 00:48:58,050 Should we allocate IP addresses in a different way than we typically do? 909 00:48:58,050 --> 00:49:04,410 Is the potential threat of receiving real-life advertisements 910 00:49:04,410 --> 00:49:07,800 related to your online activities enough to justify that? 911 00:49:07,800 --> 00:49:12,950 What would be enough to justify that kind of change? 912 00:49:12,950 --> 00:49:16,310 Then, of course, there's the question of tracking not in the digital world, 913 00:49:16,310 --> 00:49:17,570 but in the real world. 914 00:49:17,570 --> 00:49:20,275 This is usually done through mobile phone tracking. 915 00:49:20,275 --> 00:49:23,150 And so we provide an article from the Electronic Frontier Foundation. 916 00:49:23,150 --> 00:49:26,120 And full disclosure, some of the articles we've presented here 917 00:49:26,120 --> 00:49:27,620 do have a certain bias in them. 918 00:49:27,620 --> 00:49:31,670 The Electronic Frontier Foundation is well-known as a rights advocacy 919 00:49:31,670 --> 00:49:33,010 group for privacy. 920 00:49:33,010 --> 00:49:36,590 And so they're naturally going to be disinclined toward things that 921 00:49:36,590 --> 00:49:38,180 involve tracking of data and so on. 922 00:49:38,180 --> 00:49:40,490 So just bear that in mind, some additional context, 923 00:49:40,490 --> 00:49:42,050 when you're considering this article. 924 00:49:42,050 --> 00:49:44,420 But it does contain a lot of factual information and not 925 00:49:44,420 --> 00:49:47,345 necessarily just purely opinion about things that should be changed, 926 00:49:47,345 --> 00:49:50,240 although it does advocate for certain policy changes. 927 00:49:50,240 --> 00:49:52,670 Now, why is it that tracking on a mobile device 928 00:49:52,670 --> 00:49:57,352 is oftentimes perceived as much worse than tracking on a laptop or desktop? 929 00:49:57,352 --> 00:49:59,435 Well, first of all, your mobile device 930 00:49:59,435 --> 00:50:02,510 is generally with you at all times. 931 00:50:02,510 --> 00:50:05,720 We've reached the point where our phones are generally carried in our pockets 932 00:50:05,720 --> 00:50:11,240 and with us wherever we go, which means that it's very easy to use data 933 00:50:11,240 --> 00:50:14,648 that's collected from the mobile phone-- 934 00:50:14,648 --> 00:50:16,690 information that's given out by the mobile phone, 935 00:50:16,690 --> 00:50:19,690 whether that's to cell phone towers or GPS data and so on-- 936 00:50:19,690 --> 00:50:22,870 to pinpoint that to us. 937 00:50:22,870 --> 00:50:26,230 The other concern is that mobile phones are very, very quick 938 00:50:26,230 --> 00:50:28,030 to become obsolete.
939 00:50:28,030 --> 00:50:31,660 Oftentimes, within one or two new versions 940 00:50:31,660 --> 00:50:34,810 of a phone, whether it's a new Android phone release or software 941 00:50:34,810 --> 00:50:39,400 release or a new iPhone or so on, the version that came out two years ago 942 00:50:39,400 --> 00:50:43,930 is generally obsolete, which means it is no longer subject to firmware patches 943 00:50:43,930 --> 00:50:47,920 provided by the manufacturer or the software 944 00:50:47,920 --> 00:50:50,170 developers of the operating systems that are 945 00:50:50,170 --> 00:50:54,160 run on those phones, which could also mean that they are much more 946 00:50:54,160 --> 00:50:58,300 susceptible to people figuring out how to break into those phones 947 00:50:58,300 --> 00:51:00,950 and use that tracking information against you. 948 00:51:00,950 --> 00:51:03,760 So laptops and desktops generally don't move that much. 949 00:51:03,760 --> 00:51:05,960 You may carry your laptop to and from, but generally 950 00:51:05,960 --> 00:51:07,420 just to a couple of locations. 951 00:51:07,420 --> 00:51:09,565 It's usually set at a desk somewhere in between. 952 00:51:09,565 --> 00:51:11,440 Your desktop, of course, doesn't move at all. 953 00:51:11,440 --> 00:51:14,710 So the tracking potential there is pretty minimal. 954 00:51:14,710 --> 00:51:17,710 And also those devices tend to last quite a long time, 955 00:51:17,710 --> 00:51:23,710 and the lifecycle support for service and keeping those operating systems 956 00:51:23,710 --> 00:51:26,410 up to date is quite a bit longer versus the mobile phone, 957 00:51:26,410 --> 00:51:29,800 where that window is much, much shorter. 958 00:51:29,800 --> 00:51:33,340 Now, contrary to most people's opinions of this, 959 00:51:33,340 --> 00:51:38,140 phones do not actually track your information based on GPS data. 960 00:51:38,140 --> 00:51:41,350 The way GPS works is your phone just listens for signals 961 00:51:41,350 --> 00:51:44,740 broadcast by GPS satellites and uses them to triangulate 962 00:51:44,740 --> 00:51:47,170 where exactly you are in space. 963 00:51:47,170 --> 00:51:50,320 But no information about which device received those signals is sent anywhere. 964 00:51:50,320 --> 00:51:54,580 And generally that data's not stored on the phone or in the GPS satellite 965 00:51:54,580 --> 00:51:55,160 in any way. 966 00:51:55,160 --> 00:51:58,360 It's a one-way broadcast rather than a two-way conversation. 967 00:51:58,360 --> 00:52:04,030 The real threat vector for phone tracking, if this is the kind of thing 968 00:52:04,030 --> 00:52:06,820 that you're concerned about, is actually through cell phone towers, 969 00:52:06,820 --> 00:52:09,420 because cell phone towers do track this information. 970 00:52:09,420 --> 00:52:11,620 Different companies own different towers. 971 00:52:11,620 --> 00:52:14,470 They would like to know who is using each tower, 972 00:52:14,470 --> 00:52:18,220 whether or not this may also involve charging the other carrier-- 973 00:52:18,220 --> 00:52:20,230 say I'm using a Verizon phone, and I happen 974 00:52:20,230 --> 00:52:22,480 to be connected to an AT&T tower. 975 00:52:22,480 --> 00:52:28,542 AT&T may wish to know that this tower is mostly being used by Verizon customers. 976 00:52:28,542 --> 00:52:30,250 And the only way they really know that is 977 00:52:30,250 --> 00:52:33,120 by mapping the individual device to the phone number, 978 00:52:33,120 --> 00:52:35,890 then checking that against Verizon's records.
979 00:52:35,890 --> 00:52:38,350 And so they are collecting all this information 980 00:52:38,350 --> 00:52:41,225 about every phone that connects to their tower, so they could potentially 981 00:52:41,225 --> 00:52:43,540 bill Verizon for the portion of their customers 982 00:52:43,540 --> 00:52:47,080 who were using their infrastructure. 983 00:52:47,080 --> 00:52:48,970 So these towers do track information. 984 00:52:48,970 --> 00:52:52,120 And towers also can be used to triangulate your location. 985 00:52:52,120 --> 00:52:56,400 If I'm standing in the middle of an open field, for example, 986 00:52:56,400 --> 00:53:00,940 and there's a tower over there and a tower maybe just beside me, 987 00:53:00,940 --> 00:53:03,250 my phone 988 00:53:03,250 --> 00:53:04,730 is emitting a signal constantly, sort of radially in all directions. 989 00:53:04,730 --> 00:53:06,605 If the signal going in that direction 990 00:53:06,605 --> 00:53:10,840 is received by that far tower fairly weakly, 991 00:53:10,840 --> 00:53:13,370 and the tower 992 00:53:13,370 --> 00:53:15,162 right next to me is 993 00:53:15,162 --> 00:53:17,320 picking it up very strongly, I can 994 00:53:17,320 --> 00:53:20,490 use that information, sort of extrapolating from these two points: 995 00:53:20,490 --> 00:53:22,180 I'm most likely here. 996 00:53:22,180 --> 00:53:26,590 So even without having GPS turned on, just by trying to make a phone call 997 00:53:26,590 --> 00:53:31,420 or use a 2G, 3G, or 4G network, it's pretty easy 998 00:53:31,420 --> 00:53:34,060 to figure out where you are in space. 999 00:53:34,060 --> 00:53:36,043 And this is potentially a concern. 1000 00:53:36,043 --> 00:53:37,960 This concern comes up sometimes in the context 1001 00:53:37,960 --> 00:53:43,270 of whether the companies who provide operating systems for phones 1002 00:53:43,270 --> 00:53:48,790 or firmware for phones are at the behest of government agencies, who 1003 00:53:48,790 --> 00:53:52,030 may request backdoors into the devices so that they can then 1004 00:53:52,030 --> 00:53:54,250 spy on individuals. 1005 00:53:54,250 --> 00:53:56,020 And certainly this might be something that 1006 00:53:56,020 --> 00:53:58,390 comes up in a FISA court or the like, where 1007 00:53:58,390 --> 00:53:59,890 they're trying to get phone records. 1008 00:53:59,890 --> 00:54:03,190 And there's always this sort of unknown. 1009 00:54:03,190 --> 00:54:06,280 Is it happening to all of our devices all the time? 1010 00:54:06,280 --> 00:54:10,810 Is it happening right now to the phone in my pocket? 1011 00:54:10,810 --> 00:54:13,060 Is sound being captured in such a way 1012 00:54:13,060 --> 00:54:15,378 that it can be transmitted, 1013 00:54:15,378 --> 00:54:17,670 just because there happens to be a backdoor in the operating 1014 00:54:17,670 --> 00:54:19,590 system or a backdoor in the firmware that 1015 00:54:19,590 --> 00:54:22,200 allows anybody to listen to it, even if they're not 1016 00:54:22,200 --> 00:54:25,470 supposed to be listening to it? 1017 00:54:25,470 --> 00:54:30,750 It's really hard to pretend to be somebody that you're not with a phone. 1018 00:54:30,750 --> 00:54:33,000 As you saw, it's pretty easy to pretend to be somebody 1019 00:54:33,000 --> 00:54:37,260 that you're not with a computer: you can use a service like a VPN, which 1020 00:54:37,260 --> 00:54:40,920 makes you appear to come from a different IP address.
1021 00:54:40,920 --> 00:54:41,890 You connect to the VPN. 1022 00:54:41,890 --> 00:54:46,890 And as long as you trust the VPN, the VPN ostensibly protects your identity. 1023 00:54:46,890 --> 00:54:50,670 With mobile phones, every device has a unique ID. 1024 00:54:50,670 --> 00:54:53,430 And it's really hard to change that ID. 1025 00:54:53,430 --> 00:54:55,650 So one way around this is to use what are 1026 00:54:55,650 --> 00:55:00,240 called burner phones, devices that are used once, twice, 1027 00:55:00,240 --> 00:55:01,830 and then they're thrown away. 1028 00:55:01,830 --> 00:55:06,840 Now, this again comes down to, how concerned are you about your privacy? 1029 00:55:06,840 --> 00:55:08,845 How concerned should you be about your privacy? 1030 00:55:08,845 --> 00:55:11,970 Are you concerned enough that you're willing to purchase these devices that 1031 00:55:11,970 --> 00:55:15,270 are one-time, two-time use devices, which you then 1032 00:55:15,270 --> 00:55:18,360 throw away, and to do that constantly? 1033 00:55:18,360 --> 00:55:21,090 And moreover, it's actually kind of interesting to know 1034 00:55:21,090 --> 00:55:22,920 that burner phones aren't actually 1035 00:55:22,920 --> 00:55:27,390 shown to do anything to protect one's identity or privacy, 1036 00:55:27,390 --> 00:55:30,810 because it tends to be the case that we call the same people, 1037 00:55:30,810 --> 00:55:32,490 even if we're using different phones. 1038 00:55:32,490 --> 00:55:36,090 And so by virtue of the fact that this number seems 1039 00:55:36,090 --> 00:55:38,850 to be calling this number and this number all the time, 1040 00:55:38,850 --> 00:55:43,732 like maybe my work line and my family's home number-- 1041 00:55:43,732 --> 00:55:46,440 if I'm always calling those two numbers, even if the phone number 1042 00:55:46,440 --> 00:55:50,947 changes, a pattern can still be established with the device IDs of all 1043 00:55:50,947 --> 00:55:54,030 of the other phones, maybe my regular phone plus all the burners that I've 1044 00:55:54,030 --> 00:56:00,028 had, where you can still craft a picture of who I am, 1045 00:56:00,028 --> 00:56:02,820 even though I'm using different devices, based on the call patterns 1046 00:56:02,820 --> 00:56:03,487 that I'm making. 1047 00:56:03,487 --> 00:56:05,580 As usual, humans are the vulnerability here. 1048 00:56:05,580 --> 00:56:08,580 Humans are going to call the same people 1049 00:56:08,580 --> 00:56:11,370 and talk to the same people on their phones all the time. 1050 00:56:11,370 --> 00:56:18,930 And so it's relatively easy for our locations to be tracked through our mobile devices. 1051 00:56:18,930 --> 00:56:21,180 Again, every device has a unique ID. 1052 00:56:21,180 --> 00:56:22,830 You can't hide that ID. 1053 00:56:22,830 --> 00:56:26,670 That ID is part of something that gets transmitted to cell towers. 1054 00:56:26,670 --> 00:56:29,770 And potentially the threat exists that if somebody 1055 00:56:29,770 --> 00:56:31,770 is able to break into that phone, whether that's 1056 00:56:31,770 --> 00:56:34,967 because of old, outdated firmware that's not been updated 1057 00:56:34,967 --> 00:56:37,800 or because of the potential that there is some sort of backdoor that 1058 00:56:37,800 --> 00:56:43,020 would allow an agent, authorized or not, to access it, again, 1059 00:56:43,020 --> 00:56:44,340 this vulnerability exists. 1060 00:56:44,340 --> 00:56:49,770 And how does the law deal with this? Do you own the information that is being tracked?
1061 00:56:49,770 --> 00:56:53,200 Do you want that information to be available to other people? 1062 00:56:53,200 --> 00:56:56,200 It's an open question. 1063 00:56:56,200 --> 00:56:58,390 Another issue at the forefront of where we're going, 1064 00:56:58,390 --> 00:57:01,510 especially when it comes to legal technology and law firms themselves 1065 00:57:01,510 --> 00:57:06,070 availing themselves of technology, is artificial intelligence and machine 1066 00:57:06,070 --> 00:57:06,910 learning. 1067 00:57:06,910 --> 00:57:10,060 Both of these techniques are potentially incredibly useful 1068 00:57:10,060 --> 00:57:13,210 to law firms that are trying to process large amounts of data 1069 00:57:13,210 --> 00:57:15,220 relatively quickly, the type of work that's 1070 00:57:15,220 --> 00:57:19,600 generally been outsourced to contract attorneys or first-year associates 1071 00:57:19,600 --> 00:57:20,830 or the like. 1072 00:57:20,830 --> 00:57:23,440 First of all, we need to define what it means when 1073 00:57:23,440 --> 00:57:25,927 we talk about artificial intelligence. 1074 00:57:25,927 --> 00:57:27,760 Generally when we think about that, it means 1075 00:57:27,760 --> 00:57:29,230 something like pattern recognition. 1076 00:57:29,230 --> 00:57:31,870 Can we teach a computer to recognize specific patterns? 1077 00:57:31,870 --> 00:57:34,410 In the case of a law firm, for example, that might be: can 1078 00:57:34,410 --> 00:57:39,440 it realize that something looks like a clause in a contract, a valid clause 1079 00:57:39,440 --> 00:57:41,440 that we might want to see, or a clause that we're 1080 00:57:41,440 --> 00:57:42,898 hoping not to see in our contracts? 1081 00:57:42,898 --> 00:57:45,820 We might want to flag that for further human review. 1082 00:57:45,820 --> 00:57:48,940 Can the machine make a decision about something? 1083 00:57:48,940 --> 00:57:50,870 Should it, in fact, flag that for review? 1084 00:57:50,870 --> 00:57:54,520 Or is it just highlighting things that might be alarming or not? 1085 00:57:54,520 --> 00:57:57,820 Can it mimic the operations of the human mind? 1086 00:57:57,820 --> 00:58:00,100 If we can teach a computer to do those things-- 1087 00:58:00,100 --> 00:58:02,058 we've already seen that we can teach a computer 1088 00:58:02,058 --> 00:58:04,360 to teach itself how to reproduce bugs. 1089 00:58:04,360 --> 00:58:06,880 We saw that in Ken Thompson's compiler example. 1090 00:58:06,880 --> 00:58:09,970 If we can teach a computer to mimic the types of things 1091 00:58:09,970 --> 00:58:11,980 that we would do as humans, that's when we've 1092 00:58:11,980 --> 00:58:14,380 created an artificial intelligence. 1093 00:58:14,380 --> 00:58:18,220 There are a lot of potential uses for artificial intelligences 1094 00:58:18,220 --> 00:58:22,330 in the legal profession, like I said, document review being 1095 00:58:22,330 --> 00:58:24,790 one potential avenue for that. 1096 00:58:24,790 --> 00:58:28,900 And there are a few different ways that artificial intelligences can 1097 00:58:28,900 --> 00:58:29,870 learn. 1098 00:58:29,870 --> 00:58:33,110 There are actually two prevailing major ways. 1099 00:58:33,110 --> 00:58:36,490 The first is for humans to supply some sort of data 1100 00:58:36,490 --> 00:58:41,710 and also supply the rules that map the data to some outcome. 1101 00:58:41,710 --> 00:58:42,890 That's one way.
1102 00:58:42,890 --> 00:58:46,220 The other way is something called neuroevolution, 1103 00:58:46,220 --> 00:58:49,330 which is generally best exemplified by way of a genetic algorithm. 1104 00:58:49,330 --> 00:58:52,300 In a moment, we'll take a look at a genetic algorithm literally written 1105 00:58:52,300 --> 00:58:55,612 in Python, where a machine learns over time to try and generate 1106 00:58:55,612 --> 00:58:56,320 the right result. 1107 00:58:56,320 --> 00:59:00,190 In this model, we give the computer a target, something 1108 00:59:00,190 --> 00:59:02,380 that it should try and achieve, and request 1109 00:59:02,380 --> 00:59:05,230 that it generate data until it can match 1110 00:59:05,230 --> 00:59:09,280 the target that we are looking for. 1111 00:59:09,280 --> 00:59:11,110 So by way of example, let's see if we can 1112 00:59:11,110 --> 00:59:13,960 teach a computer to write Shakespeare. 1113 00:59:13,960 --> 00:59:17,140 After all, there's a theory that, given an infinite amount of time, 1114 00:59:17,140 --> 00:59:18,940 enough monkeys could write Shakespeare. 1115 00:59:18,940 --> 00:59:21,940 Can we teach a computer to do the same? 1116 00:59:21,940 --> 00:59:23,840 Let's have a look. 1117 00:59:23,840 --> 00:59:26,990 So it might be a big ask to get a computer to write all of Shakespeare. 1118 00:59:26,990 --> 00:59:29,450 Let's see if we can get this computer to eventually realize 1119 00:59:29,450 --> 00:59:33,448 the following line, the target, so to speak: "a rose by any other name." 1120 00:59:33,448 --> 00:59:35,240 So we're going to try and teach a computer. 1121 00:59:35,240 --> 00:59:37,370 We want a computer to eventually, on its own, 1122 00:59:37,370 --> 00:59:39,890 arrive at this phrase using some sort of algorithm. 1123 00:59:39,890 --> 00:59:43,430 The algorithm we're going to use to do it is called the genetic algorithm. 1124 00:59:43,430 --> 00:59:47,150 Now, the genetic algorithm is so called based on the theory of genetics: 1125 00:59:47,150 --> 00:59:51,890 that the best traits, or good traits, will propagate down and become 1126 00:59:51,890 --> 00:59:55,070 part of the defined set of traits we usually encounter. 1127 00:59:55,070 --> 00:59:58,380 And bad traits, things that we don't necessarily want, 1128 00:59:58,380 --> 01:00:00,980 will be weeded out of the population. 1129 01:00:00,980 --> 01:00:05,360 And over successive generations, hopefully only the good traits 1130 01:00:05,360 --> 01:00:06,770 will prevail. 1131 01:00:06,770 --> 01:00:08,853 Now, just like any other genetic system, 1132 01:00:08,853 --> 01:00:10,270 we need to account for mutation. 1133 01:00:10,270 --> 01:00:12,210 We need to allow things to change. 1134 01:00:12,210 --> 01:00:14,930 Otherwise we may end up in a situation where all we 1135 01:00:14,930 --> 01:00:17,270 have is a population stuck with a bad trait. 1136 01:00:17,270 --> 01:00:21,470 We might need something random to happen to eliminate that bad trait. 1137 01:00:21,470 --> 01:00:22,910 We have no other way to do it. 1138 01:00:22,910 --> 01:00:26,660 So we do have to mutate some of our strings from time to time. 1139 01:00:26,660 --> 01:00:28,910 How are we going to teach the computer to do this? 1140 01:00:28,910 --> 01:00:31,320 We're not providing it with any data set to start with. 1141 01:00:31,320 --> 01:00:37,340 The computer's going to generate its own data set, trying to get at this target. 1142 01:00:37,340 --> 01:00:41,237 The way we're going to do this is to create a bunch of DNA objects.
1143 01:00:41,237 --> 01:00:44,570 DNA objects, in this example, we're just going to refer to as different strings. 1144 01:00:44,570 --> 01:00:46,330 And the strings are just a random-- 1145 01:00:46,330 --> 01:00:51,470 as exemplified here in this code, a random set of characters. 1146 01:00:51,470 --> 01:00:53,720 We're going to have it randomly pick characters. 1147 01:00:53,720 --> 01:00:56,450 I believe the string that we're trying to have it match 1148 01:00:56,450 --> 01:00:58,600 is about 23 characters long. 1149 01:00:58,600 --> 01:01:01,490 So it's going to randomly pick 23 characters, 1150 01:01:01,490 --> 01:01:05,060 uppercase letters, lowercase letters, numbers, punctuation marks, 1151 01:01:05,060 --> 01:01:08,930 doesn't matter, any legitimate ASCII character, 1152 01:01:08,930 --> 01:01:13,280 and just add itself to the list of potential candidates 1153 01:01:13,280 --> 01:01:14,360 for the correct phrase. 1154 01:01:14,360 --> 01:01:18,200 So randomly slam on your keyboard and hit 23 keys. 1155 01:01:18,200 --> 01:01:21,890 The computer has about 1,000 of those to get started. 1156 01:01:21,890 --> 01:01:25,910 Every one of those strings, every one of those DNA items, 1157 01:01:25,910 --> 01:01:29,000 also has the ability to determine how fit it is. 1158 01:01:29,000 --> 01:01:32,610 Fitness being: is it more likely to go on to the next generation? 1159 01:01:32,610 --> 01:01:37,730 Does it have characteristics that we might want to propagate down the line? 1160 01:01:37,730 --> 01:01:41,330 So for example, the way we're going to, in a rudimentary way, 1161 01:01:41,330 --> 01:01:45,860 assess the fitness of a string, how close it is basically to the target, 1162 01:01:45,860 --> 01:01:49,310 is to go over every single character of it and compare: 1163 01:01:49,310 --> 01:01:51,630 does this match what we expect in this spot? 1164 01:01:51,630 --> 01:01:53,090 So if it starts with a T-- 1165 01:01:53,090 --> 01:01:56,000 or excuse me, starts with an A, "a rose by any other name," 1166 01:01:56,000 --> 01:02:00,500 if it starts with an A, then that's one point of fitness. 1167 01:02:00,500 --> 01:02:04,340 If the next character is a space, then that's one point of fitness. 1168 01:02:04,340 --> 01:02:08,150 So a perfect string will have all of the characters in the correct spots. 1169 01:02:08,150 --> 01:02:11,690 But as long as it has even just one character in the correct spot, 1170 01:02:11,690 --> 01:02:12,950 then it has at least some fitness. 1171 01:02:12,950 --> 01:02:15,410 And so we iterate over all of the characters in the string 1172 01:02:15,410 --> 01:02:17,480 to see how fit it is. 1173 01:02:17,480 --> 01:02:21,890 Now, to get multiple generations, we need the ability to create new strings 1174 01:02:21,890 --> 01:02:23,970 from the population that we had before. 1175 01:02:23,970 --> 01:02:26,210 And so this is the idea of crossover. 1176 01:02:26,210 --> 01:02:27,350 We take two strings. 1177 01:02:27,350 --> 01:02:30,140 And again, we're just going to arbitrarily decide 1178 01:02:30,140 --> 01:02:32,480 how to take two strings and mash them together. 1179 01:02:32,480 --> 01:02:36,590 We're going to say the first half comes from the mother string, 1180 01:02:36,590 --> 01:02:39,080 and the second half comes from the father string.
1181 01:02:39,080 --> 01:02:43,818 And that will produce a child, which may have some positive characteristics 1182 01:02:43,818 --> 01:02:45,860 from the mother and some positive characteristics 1183 01:02:45,860 --> 01:02:50,180 from the father, which may then bring us a little bit closer towards this idea 1184 01:02:50,180 --> 01:02:51,650 of having the perfect string. 1185 01:02:51,650 --> 01:02:56,030 Again, the idea here is for the computer to evolve itself 1186 01:02:56,030 --> 01:02:59,720 into the correct string rather than us just giving it a set of data 1187 01:02:59,720 --> 01:03:00,590 and saying, do this. 1188 01:03:00,590 --> 01:03:03,450 We want to let it figure it out on its own. 1189 01:03:03,450 --> 01:03:05,400 That's the idea of the genetic algorithm. 1190 01:03:05,400 --> 01:03:08,930 So we're going to arbitrarily split the string in half. 1191 01:03:08,930 --> 01:03:13,460 Half the characters, or genes, of the string come from the mother. 1192 01:03:13,460 --> 01:03:14,960 The other half come from the father. 1193 01:03:14,960 --> 01:03:16,220 They get slammed together. 1194 01:03:16,220 --> 01:03:19,520 That is the new DNA sequence of the child. 1195 01:03:19,520 --> 01:03:22,010 And then again, to account for mutation, 1196 01:03:22,010 --> 01:03:26,630 some random percent of the time-- in this case, we're saying less than 1% of 1197 01:03:26,630 --> 01:03:30,410 the time-- we would like one of those characters to randomly change. 1198 01:03:30,410 --> 01:03:33,380 So it doesn't come from the mother or the father string. 1199 01:03:33,380 --> 01:03:36,020 It just randomly changes into something else, in the hopes 1200 01:03:36,020 --> 01:03:40,850 that maybe that mutation will be beneficial somewhere down the line. 1201 01:03:40,850 --> 01:03:43,610 Now, in this other Python file, script.py, 1202 01:03:43,610 --> 01:03:47,150 we're actually taking those strings that we were just randomly creating-- 1203 01:03:47,150 --> 01:03:50,120 those are the DNA objects from the previous file-- 1204 01:03:50,120 --> 01:03:53,100 and starting to actually evolve them over time. 1205 01:03:53,100 --> 01:03:56,270 So we're going to start out with 1,000 of these random strings. 1206 01:03:56,270 --> 01:03:58,790 And the best score so far, the closest score we have, 1207 01:03:58,790 --> 01:04:02,660 the best match to "a rose by any other name," is currently zero. 1208 01:04:02,660 --> 01:04:04,700 No string is currently there. 1209 01:04:04,700 --> 01:04:06,890 We may randomly get it on the first generation. 1210 01:04:06,890 --> 01:04:08,480 That would be a wonderful success. 1211 01:04:08,480 --> 01:04:09,925 It's pretty unlikely. 1212 01:04:09,925 --> 01:04:11,300 Population here is just an array. 1213 01:04:11,300 --> 01:04:15,440 It's going to allow us to store all of these 1,000 strings. 1214 01:04:15,440 --> 01:04:19,160 And then, as long as we have not yet found the perfect string, 1215 01:04:19,160 --> 01:04:23,710 the one that has 100% fitness or a score of exactly 1, 1216 01:04:23,710 --> 01:04:26,740 we would like to do the following: calculate the fitness score 1217 01:04:26,740 --> 01:04:30,780 for every one of those random 1,000 strings that we generated.
1218 01:04:30,780 --> 01:04:35,562 Then, if what we just found is better than anything we've seen before-- 1219 01:04:35,562 --> 01:04:37,270 and at the beginning, we start with zero, 1220 01:04:37,270 --> 01:04:40,020 so everything is better than what we've seen before, as long as it 1221 01:04:40,020 --> 01:04:42,490 matches at least one character-- 1222 01:04:42,490 --> 01:04:44,320 then print out that string. 1223 01:04:44,320 --> 01:04:46,210 So this gives a sense of progression. 1224 01:04:46,210 --> 01:04:50,380 Over time we're going to see the strings get better and better and better. 1225 01:04:50,380 --> 01:04:52,680 Then we're going to create what's called a mating pool. 1226 01:04:52,680 --> 01:04:56,650 Again, this is the idea of two strings sort of crossing over. 1227 01:04:56,650 --> 01:05:01,480 They're sort of breeding to try and create a better subsequent string. 1228 01:05:01,480 --> 01:05:04,390 Depending on how good a string is, we may 1229 01:05:04,390 --> 01:05:07,780 want it to be in the next population more times. 1230 01:05:07,780 --> 01:05:13,720 If a string is a 20% match, that's pretty good, especially 1231 01:05:13,720 --> 01:05:15,140 if it's an early generation. 1232 01:05:15,140 --> 01:05:19,320 So we may want that string to appear in the mating pool, the next generation, 1233 01:05:19,320 --> 01:05:21,190 20% of the time. 1234 01:05:21,190 --> 01:05:25,690 It has a better likelihood of being closer to the right answer 1235 01:05:25,690 --> 01:05:28,002 than a string that matches 5% of the characters. 1236 01:05:28,002 --> 01:05:29,710 So a string that barely matches anything, 1237 01:05:29,710 --> 01:05:31,002 sure, it should be in the pool. 1238 01:05:31,002 --> 01:05:33,550 Maybe it has the one character that we're looking for. 1239 01:05:33,550 --> 01:05:35,470 But we only want it in the pool 5% of the time, 1240 01:05:35,470 --> 01:05:38,380 versus the string that matches 50% of the characters. 1241 01:05:38,380 --> 01:05:41,320 We probably want that in the pool 50% of the time. 1242 01:05:41,320 --> 01:05:45,640 The idea is, again, taking the best representatives of each generation 1243 01:05:45,640 --> 01:05:50,410 and trying to have the computer learn and understand that those are good, 1244 01:05:50,410 --> 01:05:54,400 and see if it can build better and better strings from those better 1245 01:05:54,400 --> 01:05:57,220 and better representatives of the population that 1246 01:05:57,220 --> 01:06:00,550 are closer to the target string that we're looking 1247 01:06:00,550 --> 01:06:03,650 for, "a rose by any other name." 1248 01:06:03,650 --> 01:06:07,400 Then in here, all we're doing is picking two random items 1249 01:06:07,400 --> 01:06:10,940 from that pool we've just created of the best possible candidates 1250 01:06:10,940 --> 01:06:12,950 and mating those two together, and continuing 1251 01:06:12,950 --> 01:06:17,210 this process of hopefully getting better and better approximations 1252 01:06:17,210 --> 01:06:19,020 of this string that we're looking for. 1253 01:06:19,020 --> 01:06:22,280 And what's going to happen there is they're going to create a crossover. 1254 01:06:22,280 --> 01:06:26,718 That crossover child DNA string may then mutate into some other new string. 1255 01:06:26,718 --> 01:06:29,760 And we'll add that to the population to be considered for the next round.
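The two course files themselves aren't reproduced in this transcript, so here is a minimal, self-contained sketch in the spirit of what's described above: random DNA strings, a character-matching fitness score, a fitness-proportional mating pool, half-and-half crossover, and rare mutation. The function names and the exact mutation rate are illustrative assumptions; the actual course files may differ in detail.

```python
# A minimal genetic-algorithm sketch in the spirit of the DNA objects
# and evolution loop described above. Names and constants are
# illustrative; the course's actual files may differ in detail.

import random
import string

TARGET = "a rose by any other name"
POPULATION_SIZE = 1000
MUTATION_RATE = 0.01  # roughly the "less than 1% of the time" mutation

# Any "legitimate ASCII character": letters, digits, punctuation, space.
CHARSET = string.ascii_letters + string.digits + string.punctuation + " "

def random_dna():
    """A random candidate string the same length as the target."""
    return "".join(random.choice(CHARSET) for _ in range(len(TARGET)))

def fitness(dna):
    """Fraction of characters that match the target in the right spot."""
    matches = sum(1 for a, b in zip(dna, TARGET) if a == b)
    return matches / len(TARGET)

def crossover(mother, father):
    """First half from the mother string, second half from the father."""
    mid = len(TARGET) // 2
    return mother[:mid] + father[mid:]

def mutate(dna):
    """Rarely replace a character with a random one."""
    return "".join(
        random.choice(CHARSET) if random.random() < MUTATION_RATE else c
        for c in dna
    )

population = [random_dna() for _ in range(POPULATION_SIZE)]
best_score = 0.0

while best_score < 1.0:
    # Score everyone; print whenever we find a new best string.
    for dna in population:
        score = fitness(dna)
        if score > best_score:
            best_score = score
            print(f"{score:.3f}  {dna}")

    # Mating pool: each string appears in proportion to its fitness,
    # so a 20% match claims roughly 20 of every 100 slots it could.
    pool = []
    for dna in population:
        pool.extend([dna] * int(fitness(dna) * 100))
    if not pool:  # early generations may match nothing at all
        pool = population

    # Next generation: crossover of two random parents, then mutation.
    population = [
        mutate(crossover(random.choice(pool), random.choice(pool)))
        for _ in range(POPULATION_SIZE)
    ]
```

Run long enough, the printed lines trace exactly the kind of progression the demo shows, from a jumbled mess through partial matches up to the full target.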
1256 01:06:29,760 --> 01:06:32,090 So we're just going to keep going over and over and over, 1257 01:06:32,090 --> 01:06:34,845 generating hopefully better and better strings. 1258 01:06:34,845 --> 01:06:36,470 So that's how these two files interact. 1259 01:06:36,470 --> 01:06:40,250 The first file that we took a look at defines the properties of a string 1260 01:06:40,250 --> 01:06:42,950 and how it can score itself, basically. 1261 01:06:42,950 --> 01:06:45,080 And this process here in script.py-- 1262 01:06:45,080 --> 01:06:48,830 and these two files are based on a Medium post, which 1263 01:06:48,830 --> 01:06:52,340 we've described in the course materials, as well as an exam question that we've 1264 01:06:52,340 --> 01:06:56,220 previously asked in the college version of CS50, 1265 01:06:56,220 --> 01:06:59,580 for students to implement and solve on their own. 1266 01:06:59,580 --> 01:07:02,460 Hopefully these two files taken together 1267 01:07:02,460 --> 01:07:07,680 will actually go through the process of creating generation after generation. 1268 01:07:07,680 --> 01:07:08,890 So let's see this in action. 1269 01:07:08,890 --> 01:07:11,970 Let's see how in each successive generation 1270 01:07:11,970 --> 01:07:16,370 we see strings get closer and closer and closer to the target string. 1271 01:07:16,370 --> 01:07:18,570 Again, we never told the computer-- we never 1272 01:07:18,570 --> 01:07:23,130 gave the computer a set of starting data to work with, only an end goal. 1273 01:07:23,130 --> 01:07:25,620 The computer needs to learn how to get closer 1274 01:07:25,620 --> 01:07:27,600 and closer to finding the right string. 1275 01:07:27,600 --> 01:07:30,950 And that's what we do here. 1276 01:07:30,950 --> 01:07:34,160 So let's run our program and see if we've actually taught the computer how 1277 01:07:34,160 --> 01:07:36,992 to genetically evolve itself to figure out this target string 1278 01:07:36,992 --> 01:07:37,950 that we're looking for. 1279 01:07:37,950 --> 01:07:41,180 So we're going to run script.py, which is the Python file where 1280 01:07:41,180 --> 01:07:43,300 we described the process happening. 1281 01:07:43,300 --> 01:07:46,010 And let's just see how the generations evolve over time. 1282 01:07:46,010 --> 01:07:49,010 So we get started, and we have some pretty quick results. 1283 01:07:49,010 --> 01:07:54,830 This first string here has a matching score of 0.042, so 4%, which I believe 1284 01:07:54,830 --> 01:07:55,760 is one character. 1285 01:07:55,760 --> 01:07:58,970 So if we scroll through and try to find "a rose by any other name," 1286 01:07:58,970 --> 01:08:00,300 I don't know exactly which character it is here. 1287 01:08:00,300 --> 01:08:01,675 But this is basically saying one 1288 01:08:01,675 --> 01:08:04,400 of these characters matches. 1289 01:08:04,400 --> 01:08:07,640 It's 4.2% of what we're hoping for. 1290 01:08:07,640 --> 01:08:11,270 That means that in the next pool, the next iteration, 1291 01:08:11,270 --> 01:08:14,350 this string will be included 4.2% of the time. 1292 01:08:14,350 --> 01:08:16,850 And there may also be other strings that also match. 1293 01:08:16,850 --> 01:08:20,420 Remember, we're only printing out when we have a better string. 1294 01:08:20,420 --> 01:08:23,120 So this is only going to get included 4.2% of the time.
1295 01:08:23,120 --> 01:08:25,120 But there are going to be plenty of other things 1296 01:08:25,120 --> 01:08:28,580 that are also 4.2% matches-- probably each one of them 1297 01:08:28,580 --> 01:08:30,170 matches one different character. 1298 01:08:30,170 --> 01:08:32,239 So those will comprise part of the pool. 1299 01:08:32,239 --> 01:08:33,739 Then we're going to cross-pollinate. 1300 01:08:33,739 --> 01:08:35,447 We're going to take each of those strings 1301 01:08:35,447 --> 01:08:40,010 that each had a one-character match and mash them together. 1302 01:08:40,010 --> 01:08:42,140 Now, if the first string that we're considering 1303 01:08:42,140 --> 01:08:46,100 has its character match in the first half, 1304 01:08:46,100 --> 01:08:49,160 and the second string has a character match in the second half, 1305 01:08:49,160 --> 01:08:52,660 now we've created a new string that has two matches, right? 1306 01:08:52,660 --> 01:08:54,410 We know one of them was in the first half. 1307 01:08:54,410 --> 01:08:55,946 That came from the mother string. 1308 01:08:55,946 --> 01:08:59,029 We have one of them in the second half that came from the father string. 1309 01:08:59,029 --> 01:09:02,053 And so, unless one of those characters 1310 01:09:02,053 --> 01:09:04,220 happens to get mutated out, which is a possibility-- 1311 01:09:04,220 --> 01:09:07,200 we might actually take a good thing and turn it into a bad character-- 1312 01:09:07,200 --> 01:09:08,950 the combined string should be twice as good. 1313 01:09:08,950 --> 01:09:11,140 It should be an 8.3% or 8.4% match. 1314 01:09:11,140 --> 01:09:12,390 And that's exactly what it is. 1315 01:09:12,390 --> 01:09:14,750 So this next string has two matches. 1316 01:09:14,750 --> 01:09:17,380 And the next ones have three and four. 1317 01:09:17,380 --> 01:09:20,630 And as we kind of scroll down, we see some patterns like this: 1318 01:09:20,630 --> 01:09:27,229 A, question mark, Q, Y. That's obviously not part of the correct answer. 1319 01:09:27,229 --> 01:09:30,800 But it suggests that there's a parent in here containing this substring that 1320 01:09:30,800 --> 01:09:32,899 tends to have really good fitness. 1321 01:09:32,899 --> 01:09:37,729 Like, this string probably has many other characters outside of this box here 1322 01:09:37,729 --> 01:09:38,750 that match. 1323 01:09:38,750 --> 01:09:41,300 And so that parent propagates down the line for a while, 1324 01:09:41,300 --> 01:09:45,500 until eventually those characteristics, in about the ninth generation or so, 1325 01:09:45,500 --> 01:09:47,087 get kind of wiped out. 1326 01:09:47,087 --> 01:09:48,920 And as we can see over time, what starts out 1327 01:09:48,920 --> 01:09:51,740 as a jumbled mess gets closer and closer to something 1328 01:09:51,740 --> 01:09:56,240 that is starting to look, even at 58%, like we're getting pretty close to 1329 01:09:56,240 --> 01:09:57,440 "a rose by any other name." 1330 01:09:57,440 --> 01:10:00,620 And as we go on and on, again, the likelihood gets better and better. 1331 01:10:00,620 --> 01:10:03,800 So that by the time we're here, at this line here, 1332 01:10:03,800 --> 01:10:08,150 this string is going to appear in 87 and 1/2% 1333 01:10:08,150 --> 01:10:10,190 of the next generation's population.
1334 01:10:10,190 --> 01:10:13,310 So a lot of the characteristics of this string that's close but not 1335 01:10:13,310 --> 01:10:16,790 exactly right will keep appearing, which makes it more and more likely 1336 01:10:16,790 --> 01:10:21,050 that it will eventually pair up with another string that 1337 01:10:21,050 --> 01:10:22,590 is a little bit better. 1338 01:10:22,590 --> 01:10:26,210 And as you probably saw, towards the end, this process got slower, right? 1339 01:10:26,210 --> 01:10:30,640 If all the strings are so good, it might just 1340 01:10:30,640 --> 01:10:35,055 take a while to find one where the match is better than the parents. 1341 01:10:35,055 --> 01:10:37,180 It might be the case that we are creating 1342 01:10:37,180 --> 01:10:38,620 combinations that are worse again. 1343 01:10:38,620 --> 01:10:40,060 We want to filter those back out. 1344 01:10:40,060 --> 01:10:42,820 And so it takes a while to find exactly what we're looking for. 1345 01:10:42,820 --> 01:10:46,690 But again, from this random string at the very beginning, over time, 1346 01:10:46,690 --> 01:10:48,730 the computer learns what parts are good. 1347 01:10:48,730 --> 01:10:51,213 So here's "rose," right, as part of the string. 1348 01:10:51,213 --> 01:10:52,380 This was eventually correct. 1349 01:10:52,380 --> 01:10:54,172 This got rooted out in the next generation. 1350 01:10:54,172 --> 01:10:56,050 It got mutated out by accident. 1351 01:10:56,050 --> 01:10:58,990 But mathematically, what it found was a little bit better. 1352 01:10:58,990 --> 01:11:01,990 There are more characters in this string that are correct than in this one, 1353 01:11:01,990 --> 01:11:04,900 even if there are some recognizable patterns in the former. 1354 01:11:04,900 --> 01:11:07,870 But the computer has learned, evolved over time, what it 1355 01:11:07,870 --> 01:11:10,420 means to match that particular string. 1356 01:11:10,420 --> 01:11:13,600 This is the idea of neuroevolution, teaching a computer 1357 01:11:13,600 --> 01:11:17,260 to recognize patterns without necessarily telling it 1358 01:11:17,260 --> 01:11:22,890 what those patterns are, just what the target should be. 1359 01:11:22,890 --> 01:11:26,040 So that genetic algorithm is kind of a fun programming activity. 1360 01:11:26,040 --> 01:11:30,140 But the principles that underpin it still apply in a legal context. 1361 01:11:30,140 --> 01:11:36,420 If you teach a computer to recognize certain patterns in a contract, 1362 01:11:36,420 --> 01:11:38,490 you can teach a computer to write contracts 1363 01:11:38,490 --> 01:11:40,170 potentially that match those patterns. 1364 01:11:40,170 --> 01:11:42,810 You can teach a computer to recognize those patterns 1365 01:11:42,810 --> 01:11:44,430 and make decisions based on them. 1366 01:11:44,430 --> 01:11:48,340 So we were using neuroevolution to build or construct something. 1367 01:11:48,340 --> 01:11:52,620 But you can also use neuroevolution to isolate correct sets of words 1368 01:11:52,620 --> 01:11:55,910 or correct sets of phrases that you're hoping to see in a contract 1369 01:11:55,910 --> 01:11:58,950 or that you might want to require for a particular use.
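For the curious, here is a minimal sketch of the kind of genetic algorithm just demonstrated, in the spirit of the Medium post and script.py described above. The names and parameters (TARGET, CHARSET, POOL_SIZE, MUTATION_RATE) are illustrative assumptions, not the actual course code:

    import random
    import string

    TARGET = "a rose by any other name"
    CHARSET = string.ascii_lowercase + " "
    POOL_SIZE = 200
    MUTATION_RATE = 0.01  # per-character chance of a random swap

    def random_string():
        return "".join(random.choice(CHARSET) for _ in range(len(TARGET)))

    def fitness(s):
        # Fraction of characters matching the target; one hit out of 24 is ~0.042.
        return sum(a == b for a, b in zip(s, TARGET)) / len(TARGET)

    def parents(pool):
        # Fitness-proportional ("roulette wheel") selection: a string's chance
        # of being picked is proportional to its score, so a 4.2% match is
        # drawn 4.2% as often as a perfect match would be.
        weights = [fitness(s) + 1e-6 for s in pool]  # epsilon avoids all-zero weights
        return random.choices(pool, weights=weights, k=2)

    def crossover(mother, father):
        # First half from one parent, second half from the other.
        mid = len(TARGET) // 2
        return mother[:mid] + father[mid:]

    def mutate(s):
        # Occasionally swap a character; this can fix a miss or ruin a match.
        return "".join(
            random.choice(CHARSET) if random.random() < MUTATION_RATE else c
            for c in s
        )

    pool = [random_string() for _ in range(POOL_SIZE)]
    best = max(pool, key=fitness)
    generation = 0
    while best != TARGET:
        pool = [mutate(crossover(*parents(pool))) for _ in range(POOL_SIZE)]
        generation += 1
        champion = max(pool, key=fitness)
        if fitness(champion) > fitness(best):  # print only on improvement
            best = champion
            print(generation, best, round(fitness(best), 3))

Note that nothing guarantees each generation improves; as in the run above, progress slows as the pool gets good, because at that point most mutations hurt more than they help.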
1370 01:11:58,950 --> 01:12:02,700 So again, the types of legal work that this can be used to help automate 1371 01:12:02,700 --> 01:12:06,750 are things like collation, analysis, doing large document review, 1372 01:12:06,750 --> 01:12:09,210 and predicting the potential outcome of litigation 1373 01:12:09,210 --> 01:12:13,380 based on having it review case precedents and outcomes 1374 01:12:13,380 --> 01:12:19,350 and seeing if there are any trends that appear-- cases X, Y, and Z all 1375 01:12:19,350 --> 01:12:20,130 had this outcome. 1376 01:12:20,130 --> 01:12:22,500 Is there some other common thread in cases 1377 01:12:22,500 --> 01:12:26,520 X, Y, and Z that might also apply to the case that we're about to try? 1378 01:12:26,520 --> 01:12:29,910 Or potentially we need to settle because we see that the outcome is 1379 01:12:29,910 --> 01:12:33,820 going to be unfavorable to us. 1380 01:12:33,820 --> 01:12:38,650 But does this digital lawyering potentially make you uncomfortable? 1381 01:12:38,650 --> 01:12:42,850 Is it OK for legal decisions to be made by a computer? 1382 01:12:42,850 --> 01:12:46,570 Is it more OK if those decisions are made because we've trained them 1383 01:12:46,570 --> 01:12:48,542 with our own human instincts? 1384 01:12:48,542 --> 01:12:49,750 There are services out there. 1385 01:12:49,750 --> 01:12:56,680 There's a famous example of a parking ticket clearing service called 1386 01:12:56,680 --> 01:13:00,040 DoNotPay from several years ago, where a 19- or 20-year-old computer 1387 01:13:00,040 --> 01:13:04,030 programmer basically taught a computer how 1388 01:13:04,030 --> 01:13:05,980 to argue parking tickets on people's behalf 1389 01:13:05,980 --> 01:13:08,230 so that they wouldn't have to hire attorneys to do so. 1390 01:13:08,230 --> 01:13:09,772 He wasn't a trained attorney himself. 1391 01:13:09,772 --> 01:13:12,383 He just talked to people and recognized 1392 01:13:12,383 --> 01:13:14,800 some of the things that 1393 01:13:14,800 --> 01:13:17,890 are common threads for people who successfully challenge 1394 01:13:17,890 --> 01:13:20,740 parking tickets versus those who don't, 1395 01:13:20,740 --> 01:13:23,920 taught a computer to mimic those patterns, 1396 01:13:23,920 --> 01:13:29,215 and had the computer send out notices and the like to defend parking 1397 01:13:29,215 --> 01:13:29,840 ticket recipients. 1398 01:13:29,840 --> 01:13:30,673 And he was able to-- 1399 01:13:30,673 --> 01:13:34,840 I think it was several hundred thousand dollars in potential legal fees saved 1400 01:13:34,840 --> 01:13:37,210 and several hundred thousand parking tickets that 1401 01:13:37,210 --> 01:13:38,950 were challenged successfully. 1402 01:13:38,950 --> 01:13:41,530 The cases were dropped, and there was no payment required. 1403 01:13:41,530 --> 01:13:46,240 So is it OK for computers to be making these decisions if humans teach them? 1404 01:13:46,240 --> 01:13:48,610 Is it only OK for computers to make those decisions 1405 01:13:48,610 --> 01:13:53,070 if the humans teaching them have legal training 1406 01:13:53,070 --> 01:13:54,070 at the outset? 1407 01:13:54,070 --> 01:13:59,290 Or can we trust programmers to write these kinds of programs for us as well? 1408 01:13:59,290 --> 01:14:02,743 Does lawyering rely on a gut instinct?
1409 01:14:02,743 --> 01:14:04,660 I'm sure that sometimes, in cases you've experienced 1410 01:14:04,660 --> 01:14:07,810 in your own practice, the decision that you 1411 01:14:07,810 --> 01:14:12,430 make might be contrary to what you think might be the right thing 1412 01:14:12,430 --> 01:14:16,055 to do, because you just feel like if I do this other thing, 1413 01:14:16,055 --> 01:14:17,680 it's going to work better in this case. 1414 01:14:17,680 --> 01:14:21,970 And I'm sure that for many of you, this has paid off successfully. 1415 01:14:21,970 --> 01:14:26,425 Doing something that is in contravention of the accepted norm 1416 01:14:26,425 --> 01:14:28,300 is something that you 1417 01:14:28,300 --> 01:14:30,350 may not be able to train a computer to do. 1418 01:14:30,350 --> 01:14:34,060 You may not be able to train gut instinct to challenge the rules, 1419 01:14:34,060 --> 01:14:37,420 when this whole idea of neuroevolution and machine 1420 01:14:37,420 --> 01:14:43,360 learning and AI is designed to have computers learn and enforce rules. 1421 01:14:43,360 --> 01:14:47,320 Will the use of AI affect attorneys' bottom lines? 1422 01:14:47,320 --> 01:14:50,410 Hypothetically, it should make legal work cheaper. 1423 01:14:50,410 --> 01:14:54,790 But this would then potentially reduce firm profits 1424 01:14:54,790 --> 01:14:58,330 by not having attorneys, humans, reviewing this material. 1425 01:14:58,330 --> 01:15:01,000 This is, in some ways, a good thing. 1426 01:15:01,000 --> 01:15:03,465 It makes things more affordable for our clients. 1427 01:15:03,465 --> 01:15:04,840 This is, in some ways, a bad thing. 1428 01:15:04,840 --> 01:15:11,050 We have entrenched expenses that we need to pay, expenses based on certain monies 1429 01:15:11,050 --> 01:15:14,290 coming in because of the hourly rates of our associates and our partners. 1430 01:15:14,290 --> 01:15:16,060 Does this change that up? 1431 01:15:16,060 --> 01:15:19,180 And if it does change that up, is that problematic? 1432 01:15:19,180 --> 01:15:22,870 Is it better for us to provide the most competent representation that we can, 1433 01:15:22,870 --> 01:15:26,980 even if that competent representation is actually from a computer? 1434 01:15:26,980 --> 01:15:30,160 Remember that as attorneys, we have an ethical obligation to stay on top of 1435 01:15:30,160 --> 01:15:32,530 and understand technology. 1436 01:15:32,530 --> 01:15:36,880 Sometimes that may create a situation where using that technology 1437 01:15:36,880 --> 01:15:39,460 and working with that technology really forces 1438 01:15:39,460 --> 01:15:41,947 us to do something we might not want to do, 1439 01:15:41,947 --> 01:15:43,780 because it doesn't feel like the right thing 1440 01:15:43,780 --> 01:15:45,760 to do from a business perspective. 1441 01:15:45,760 --> 01:15:52,380 Nevertheless, our ethical obligations compel us to potentially do that thing. 1442 01:15:52,380 --> 01:15:55,260 So we've seen some of the good things that machine learning can do. 1443 01:15:55,260 --> 01:15:59,100 But certainly there are also some bad things that machine learning can do.
1444 01:15:59,100 --> 01:16:02,310 There's an article that we provided about machine bias and a computer 1445 01:16:02,310 --> 01:16:07,440 program that is ostensibly supposed to be used by prosecutors and judges 1446 01:16:07,440 --> 01:16:11,310 when they are considering releasing somebody on bail 1447 01:16:11,310 --> 01:16:14,490 or setting the conditions for parole-- assessing whether or not 1448 01:16:14,490 --> 01:16:16,988 they're likely to commit future crimes. 1449 01:16:16,988 --> 01:16:18,780 Like, what is their likely recidivism rate? 1450 01:16:18,780 --> 01:16:23,232 What kind of additional support might they need upon their release? 1451 01:16:23,232 --> 01:16:26,190 But it turns out that the data that we're feeding into these algorithms 1452 01:16:26,190 --> 01:16:27,630 is provided by humans. 1453 01:16:27,630 --> 01:16:30,060 And unfortunately, these programs that are 1454 01:16:30,060 --> 01:16:34,920 supposed to help judges make better decisions have a racial bias in them. 1455 01:16:34,920 --> 01:16:38,010 The questions that get asked as part of figuring out 1456 01:16:38,010 --> 01:16:41,640 whether this person is more likely or not to commit a future crime 1457 01:16:41,640 --> 01:16:45,390 never outright ask the question, what is your race, 1458 01:16:45,390 --> 01:16:47,070 and base a score on that. 1459 01:16:47,070 --> 01:16:52,110 But they're asking other questions that are hints or indicators of what 1460 01:16:52,110 --> 01:16:53,580 someone's race might be. 1461 01:16:53,580 --> 01:16:56,670 For example, they're asking questions about socioeconomic status 1462 01:16:56,670 --> 01:17:01,080 and languages spoken and whether or not parents have ever 1463 01:17:01,080 --> 01:17:02,370 been imprisoned and so on. 1464 01:17:02,370 --> 01:17:09,270 And these programs sort of stereotype people, in ways that we 1465 01:17:09,270 --> 01:17:13,530 might not deem to be OK in any way, to make decisions. 1466 01:17:13,530 --> 01:17:17,010 And these stereotypes are created by humans. 1467 01:17:17,010 --> 01:17:22,080 And so we're actually teaching the computer bias in this way. 1468 01:17:22,080 --> 01:17:24,060 We're supplying data. 1469 01:17:24,060 --> 01:17:25,590 We, as humans, are providing it. 1470 01:17:25,590 --> 01:17:28,200 We're imparting our bias into the program. 1471 01:17:28,200 --> 01:17:30,870 And the program is really just implementing 1472 01:17:30,870 --> 01:17:32,370 exactly what we're telling it to do. 1473 01:17:32,370 --> 01:17:34,320 Computers, yes, they are intelligent. 1474 01:17:34,320 --> 01:17:37,630 We can teach them to learn things about themselves. 1475 01:17:37,630 --> 01:17:40,680 But at the end of the day, that knowledge comes from us. 1476 01:17:40,680 --> 01:17:45,000 We are either telling them to hit some target or providing data to them 1477 01:17:45,000 --> 01:17:47,790 and telling them these are the rules to match. 1478 01:17:47,790 --> 01:17:52,050 So computers are only as intelligent as the humans who create and program 1479 01:17:52,050 --> 01:17:52,720 them. 1480 01:17:52,720 --> 01:17:56,280 And unfortunately, that means they're also as affected by bias 1481 01:17:56,280 --> 01:17:58,530 as the humans who create and program them. 1482 01:17:58,530 --> 01:18:01,680 These programs have been found to be accurate only 20% 1483 01:18:01,680 --> 01:18:06,600 of the time in predicting future violent crimes.
1484 01:18:06,600 --> 01:18:09,630 They are only 60% of the time accurate in predicting 1485 01:18:09,630 --> 01:18:12,920 any sort of future crime, misdemeanors and so on-- 1486 01:18:12,920 --> 01:18:16,500 so a little bit better than a 50/50 shot at getting it right, 1487 01:18:16,500 --> 01:18:19,590 based on these predictive questions that they're asking people 1488 01:18:19,590 --> 01:18:22,600 during the intake process. 1489 01:18:22,600 --> 01:18:26,740 Proponents of these scoring metrics say that they provide useful data. 1490 01:18:26,740 --> 01:18:29,170 Opponents say that the data is being misused. 1491 01:18:29,170 --> 01:18:31,360 It's being used as part of sentencing determinations 1492 01:18:31,360 --> 01:18:33,610 rather than for its ostensible purpose, which 1493 01:18:33,610 --> 01:18:36,700 is to set conditions for bail and set conditions 1494 01:18:36,700 --> 01:18:41,110 for release, any sort of parole conditions that might come into play. 1495 01:18:41,110 --> 01:18:43,510 These calculations are also done by companies 1496 01:18:43,510 --> 01:18:45,790 that generally are for-profit entities. 1497 01:18:45,790 --> 01:18:51,040 They sell these programs to states and localities, typically for a fixed rate 1498 01:18:51,040 --> 01:18:52,480 per year. 1499 01:18:52,480 --> 01:18:56,462 Does that mean that there's a financial incentive to make certain decisions? 1500 01:18:56,462 --> 01:18:58,420 Would you feel differently about these programs 1501 01:18:58,420 --> 01:19:01,690 if they were free rather than paid? 1502 01:19:01,690 --> 01:19:05,530 Should computers be involved in making these decisions that humans 1503 01:19:05,530 --> 01:19:07,030 would otherwise make anyway? 1504 01:19:07,030 --> 01:19:12,220 Like, given a questionnaire, would a human being 1505 01:19:12,220 --> 01:19:13,930 potentially reach the same conclusion? 1506 01:19:13,930 --> 01:19:15,670 Ideally, that is what the program should do. 1507 01:19:15,670 --> 01:19:19,690 It should be mimicking the human decision-making process. 1508 01:19:19,690 --> 01:19:24,680 Is it somehow less slimy feeling, for lack of a better phrase, 1509 01:19:24,680 --> 01:19:28,610 if a human being, a judge or a court clerk, 1510 01:19:28,610 --> 01:19:31,580 is making these determinations rather than a computer? 1511 01:19:31,580 --> 01:19:33,830 Now, granted, the judge is still making the final call. 1512 01:19:33,830 --> 01:19:37,820 But the computer is printing out likely recidivism scores 1513 01:19:37,820 --> 01:19:40,160 and printing out all this data about somebody 1514 01:19:40,160 --> 01:19:42,860 that surely is going to influence the judge's decision-- 1515 01:19:42,860 --> 01:19:46,910 and in some localities, perhaps over-influence the judge's decision, 1516 01:19:46,910 --> 01:19:49,280 taking the human element out of it entirely. 1517 01:19:49,280 --> 01:19:53,300 Does it feel better if the computer is out of that equation entirely? 1518 01:19:53,300 --> 01:19:55,820 Or is it better to have a computer make these decisions 1519 01:19:55,820 --> 01:20:02,630 and potentially prevent mistakes from happening, draw attention 1520 01:20:02,630 --> 01:20:06,740 to things that might otherwise be missed, or minimize things that might otherwise 1521 01:20:06,740 --> 01:20:08,780 have too much attention drawn to them? 1522 01:20:08,780 --> 01:20:11,330 Again, a difficult question to answer: how much do we 1523 01:20:11,330 --> 01:20:15,560 want technology to be involved in the legal decision-making process?
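To make that proxy-variable concern concrete, here is a toy simulation-- emphatically not any vendor's actual model; the groups, rates, and "neighborhood" feature are all invented for illustration. Race is never an input, yet a naive score keyed to a correlated feature still treats the two groups very differently:

    import random

    random.seed(0)

    def make_person(group):
        # Invented numbers: group A lives in neighborhood 1 80% of the time,
        # group B only 20% of the time; actual reoffense is 30% in BOTH groups.
        neighborhood = 1 if random.random() < (0.8 if group == "A" else 0.2) else 0
        reoffends = random.random() < 0.3
        return group, neighborhood, reoffends

    people = [make_person(g) for g in ("A", "B") for _ in range(5000)]

    # A naive rule that flags everyone from neighborhood 1 as "high risk"
    # flags group A about four times as often as group B, even though the
    # two groups reoffend at exactly the same rate.
    for group in ("A", "B"):
        rows = [(n, r) for g, n, r in people if g == group]
        flagged = sum(n for n, _ in rows) / len(rows)
        actual = sum(r for _, r in rows) / len(rows)
        print(f"group {group}: flagged {flagged:.0%}, actually reoffends {actual:.0%}")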
1524 01:20:15,560 --> 01:20:18,080 But as we go forward, it's almost undoubtedly true 1525 01:20:18,080 --> 01:20:21,950 that more and more decisions in a legal context 1526 01:20:21,950 --> 01:20:24,950 are going to be made by computers at the outset, 1527 01:20:24,950 --> 01:20:28,820 with humans sort of falling into the verification category rather 1528 01:20:28,820 --> 01:20:31,458 than the active decision-maker category. 1529 01:20:31,458 --> 01:20:32,000 Is this good? 1530 01:20:32,000 --> 01:20:33,550 Is this bad? 1531 01:20:33,550 --> 01:20:36,650 It's the future. 1532 01:20:36,650 --> 01:20:39,800 For entities based in the United States, or those who 1533 01:20:39,800 --> 01:20:42,050 solely have customers in the United States, 1534 01:20:42,050 --> 01:20:45,680 this next area may not be a concern now, but it's very likely 1535 01:20:45,680 --> 01:20:47,450 to become one in the future. 1536 01:20:47,450 --> 01:20:51,200 And that is what to do with the GDPR, the General Data Protection 1537 01:20:51,200 --> 01:20:54,200 Regulation, a comprehensive data privacy regulation 1538 01:20:54,200 --> 01:20:56,390 that was promulgated by the European Union 1539 01:20:56,390 --> 01:21:00,010 and came into effect in May of 2018. 1540 01:21:00,010 --> 01:21:04,680 This basically defines the right for people to know what kind of data 1541 01:21:04,680 --> 01:21:05,930 is being collected about them. 1542 01:21:05,930 --> 01:21:08,555 This is not a right that currently exists in the United States. 1543 01:21:08,555 --> 01:21:11,750 And it'll be really interesting to see whether the EU 1544 01:21:11,750 --> 01:21:15,140 experiment of revealing this kind of data, which has never 1545 01:21:15,140 --> 01:21:18,838 been available to individuals before, will become something 1546 01:21:18,838 --> 01:21:21,380 that exists in the United States and is going to be something 1547 01:21:21,380 --> 01:21:22,790 that we have to deal with. 1548 01:21:22,790 --> 01:21:26,630 If you're based in the United States, and you do have customers in Europe, 1549 01:21:26,630 --> 01:21:29,960 you may be subject to the GDPR. 1550 01:21:29,960 --> 01:21:32,450 For example, at CS50 we have students 1551 01:21:32,450 --> 01:21:38,670 who take the class through edX, or HarvardX, the online MOOC platform. 1552 01:21:38,670 --> 01:21:43,850 And when the GDPR took effect in May of 2018, we spoke to Harvard 1553 01:21:43,850 --> 01:21:47,030 and figured out ways that we needed to potentially interact 1554 01:21:47,030 --> 01:21:49,910 with European users of our platform, despite the fact that we're 1555 01:21:49,910 --> 01:21:53,060 based in the United States, and what sort of data implications 1556 01:21:53,060 --> 01:21:54,047 that might have. 1557 01:21:54,047 --> 01:21:57,380 That may have been out of an abundance of caution, to make sure 1558 01:21:57,380 --> 01:21:59,380 we're on the right side of it even if we're not 1559 01:21:59,380 --> 01:22:01,850 necessarily subject to the GDPR, but it is certainly 1560 01:22:01,850 --> 01:22:05,870 an area of evolving concern for international companies. 1561 01:22:05,870 --> 01:22:09,170 The GDPR allows individuals to get their personal data.
1562 01:22:09,170 --> 01:22:12,080 1563 01:22:12,080 --> 01:22:15,350 That means data that either does identify an individual, something 1564 01:22:15,350 --> 01:22:18,440 like what we discussed earlier in terms of cookies and tracking 1565 01:22:18,440 --> 01:22:22,160 and the kinds of things that you search being tied to your IP address, which 1566 01:22:22,160 --> 01:22:24,860 then might be tied to your actual address and so on, 1567 01:22:24,860 --> 01:22:27,320 or data that could identify an individual 1568 01:22:27,320 --> 01:22:32,750 but doesn't necessarily identify somebody just yet. 1569 01:22:32,750 --> 01:22:36,370 The regulation itself imposes requirements 1570 01:22:36,370 --> 01:22:38,120 on the controller-- 1571 01:22:38,120 --> 01:22:41,780 the person who is providing a service 1572 01:22:41,780 --> 01:22:44,570 or is holding all of that data-- and basically 1573 01:22:44,570 --> 01:22:47,420 says what the controller's responsibilities are 1574 01:22:47,420 --> 01:22:52,850 for processing that data and what they have to reveal to users who request it. 1575 01:22:52,850 --> 01:22:56,150 So for example, on request by a user of a service, 1576 01:22:56,150 --> 01:23:00,110 when that user and the controller are subject to the GDPR, 1577 01:23:00,110 --> 01:23:03,890 the controller must identify themselves, who they are, 1578 01:23:03,890 --> 01:23:08,390 what the best way is to contact them, tell the user what data they have 1579 01:23:08,390 --> 01:23:11,960 about them, how that data is being processed, 1580 01:23:11,960 --> 01:23:14,600 and why they are processing that data-- what sorts of things 1581 01:23:14,600 --> 01:23:15,850 are they trying to do with it? 1582 01:23:15,850 --> 01:23:21,590 Are they trying to make longitudinal connections between different people? 1583 01:23:21,590 --> 01:23:26,282 Are they trying to collect it to sell it to marketers and so on? 1584 01:23:26,282 --> 01:23:29,490 They need to tell them if that data is going to be transferred to a third party, 1585 01:23:29,490 --> 01:23:33,890 again, whether that's selling the data or using a third-party service to help 1586 01:23:33,890 --> 01:23:34,850 interpret that data. 1587 01:23:34,850 --> 01:23:37,790 So again, for example, in the case of Samsung, 1588 01:23:37,790 --> 01:23:40,040 that might be Samsung collecting your voice data. 1589 01:23:40,040 --> 01:23:41,750 But they may be sharing all the data they 1590 01:23:41,750 --> 01:23:46,550 get with a third party whose programming focus 1591 01:23:46,550 --> 01:23:51,320 is processing that data and trying to find better voice 1592 01:23:51,320 --> 01:23:53,897 commands by collecting the voices of hundreds of thousands 1593 01:23:53,897 --> 01:23:55,730 of different people, so they can get a better 1594 01:23:55,730 --> 01:24:01,640 synthesis of a particular thing they hear, translating that into a command. 1595 01:24:01,640 --> 01:24:05,690 These same restrictions will apply whether the data 1596 01:24:05,690 --> 01:24:10,510 is collected or provided by the user, or is just inferred about the user 1597 01:24:10,510 --> 01:24:11,010 as well. 1598 01:24:11,010 --> 01:24:14,600 So the controller would also need to reveal information 1599 01:24:14,600 --> 01:24:17,600 that was gleaned about somebody without necessarily having 1600 01:24:17,600 --> 01:24:23,670 been given it directly by the person providing that personal data.
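One way to visualize those disclosure duties is to model a subject-access response in code. This is purely a hypothetical shape-- the GDPR mandates what must be disclosed, not any particular data format, and every field name and value below is invented:

    from dataclasses import dataclass, field

    @dataclass
    class AccessRequestResponse:
        controller_name: str               # who the controller is
        contact: str                       # the best way to reach them
        data_held: dict                    # what data they hold about the requester
        processing_purposes: list          # why the data is being processed
        third_parties: list = field(default_factory=list)  # recipients, if any
        inferred_data: dict = field(default_factory=dict)  # gleaned, not provided

    # Illustrative values only:
    response = AccessRequestResponse(
        controller_name="ExampleCo Ltd.",
        contact="privacy@example.com",
        data_held={"email": "user@example.com", "searches": ["cat pictures"]},
        processing_purposes=["service personalization", "analytics"],
        third_parties=["speech-processing vendor"],
        inferred_data={"spending_profile": "frequent shopper"},
    )
    print(response)

Note that the inferred_data field is there deliberately: as just discussed, the controller's duties cover information gleaned about somebody, not only data the person handed over.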
1601 01:24:23,670 --> 01:24:29,630 The owner can also compel the controller to change data about them that is 1602 01:24:29,630 --> 01:24:33,110 inaccurate, once they get this report about what data the controller holds-- 1603 01:24:33,110 --> 01:24:35,990 which brings up a really interesting question: what if something 1604 01:24:35,990 --> 01:24:38,780 is accurate, but you don't like it, and you are 1605 01:24:38,780 --> 01:24:40,610 the person who's providing personal data? 1606 01:24:40,610 --> 01:24:42,867 Can you challenge it as inaccurate? 1607 01:24:42,867 --> 01:24:45,200 This is, again, something that has not been answered yet 1608 01:24:45,200 --> 01:24:48,500 but is very likely to be answered at some point by somebody. 1609 01:24:48,500 --> 01:24:50,810 What does it mean for data to be inaccurate? 1610 01:24:50,810 --> 01:24:55,580 Moreover, is it a good thing to delete data about somebody? 1611 01:24:55,580 --> 01:24:59,810 There are exceptions that exist in the GDPR for preserving data, or not 1612 01:24:59,810 --> 01:25:04,260 allowing it to be deleted, if it serves the public interest. 1613 01:25:04,260 --> 01:25:07,460 And so the argument that is sometimes made in favor of the GDPR 1614 01:25:07,460 --> 01:25:11,450 is that someone who commits a minor crime, for example, 1615 01:25:11,450 --> 01:25:15,840 might be haunted by this one mark on their record for years and years 1616 01:25:15,840 --> 01:25:16,340 and years. 1617 01:25:16,340 --> 01:25:17,630 They can never shake it. 1618 01:25:17,630 --> 01:25:20,330 And it's a minor crime. 1619 01:25:20,330 --> 01:25:22,700 There was no recidivism. 1620 01:25:22,700 --> 01:25:23,960 It wasn't violent in any way. 1621 01:25:23,960 --> 01:25:26,932 It just has now hampered-- it's impacted their life. 1622 01:25:26,932 --> 01:25:29,390 They can't get the kind of job that they want, for example. 1623 01:25:29,390 --> 01:25:31,910 They can't get the kind of apartment that they want. 1624 01:25:31,910 --> 01:25:34,880 Shouldn't they be able to eliminate that data? 1625 01:25:34,880 --> 01:25:41,330 Some people would argue yes, that the individual's already paid the price. 1626 01:25:41,330 --> 01:25:46,260 Society is not harmed by this crime or this past event any longer. 1627 01:25:46,260 --> 01:25:48,290 And so sure, delete that data. 1628 01:25:48,290 --> 01:25:50,600 Others would argue no, it's a part of history. 1629 01:25:50,600 --> 01:25:53,840 We don't have a policy of erasing history. 1630 01:25:53,840 --> 01:25:54,890 That's not what we do. 1631 01:25:54,890 --> 01:25:58,070 And so even though it's annoying perhaps to that individual, 1632 01:25:58,070 --> 01:26:01,040 or it's had a non-trivial impact on their life, 1633 01:26:01,040 --> 01:26:03,890 we can't just get rid of data that we don't like. 1634 01:26:03,890 --> 01:26:07,410 So for data that might be deemed inaccurate personally-- 1635 01:26:07,410 --> 01:26:09,950 like if a company gets a lot of information about me 1636 01:26:09,950 --> 01:26:12,500 because I'm doing a lot of online shopping, and they say 1637 01:26:12,500 --> 01:26:16,140 I'm a compulsive spender, and that's part of their processed data-- 1638 01:26:16,140 --> 01:26:18,068 can I challenge that as inaccurate, because I 1639 01:26:18,068 --> 01:26:19,610 don't think I'm a compulsive spender? 1640 01:26:19,610 --> 01:26:23,630 I feel like I earn enough money and can spend this money how I want, 1641 01:26:23,630 --> 01:26:25,760 and it has a negative impact on my life.
1642 01:26:25,760 --> 01:26:31,130 But they think, well, you've spent $20,000 on pictures of cats. 1643 01:26:31,130 --> 01:26:33,458 Maybe you are kind of a compulsive spender. 1644 01:26:33,458 --> 01:26:35,750 And that's something that we've gleaned from this data, 1645 01:26:35,750 --> 01:26:37,042 and that's part of your record. 1646 01:26:37,042 --> 01:26:38,300 Can I challenge that? 1647 01:26:38,300 --> 01:26:40,802 Open question. 1648 01:26:40,802 --> 01:26:44,010 For those of you who may be contending with the GDPR in your future practice, 1649 01:26:44,010 --> 01:26:46,590 we've excerpted some parts of it that are particularly relevant, 1650 01:26:46,590 --> 01:26:48,465 that deal with the technological implications 1651 01:26:48,465 --> 01:26:51,300 of what we've just discussed, as part of the recommended 1652 01:26:51,300 --> 01:26:54,660 reading for this module. 1653 01:26:54,660 --> 01:26:57,300 The last subject that we'd like to consider in this course 1654 01:26:57,300 --> 01:27:01,470 is what is kind of a political hot potato right now in the United States. 1655 01:27:01,470 --> 01:27:03,600 And that is this idea of net neutrality. 1656 01:27:03,600 --> 01:27:06,700 And before we get into the back and forth of it, 1657 01:27:06,700 --> 01:27:08,700 I think it's important for us to properly define 1658 01:27:08,700 --> 01:27:12,350 what exactly net neutrality is. 1659 01:27:12,350 --> 01:27:16,680 At its fundamental core, the idea is that all traffic on the internet 1660 01:27:16,680 --> 01:27:17,940 should be treated equally. 1661 01:27:17,940 --> 01:27:21,240 We shouldn't prioritize some packets over others. 1662 01:27:21,240 --> 01:27:24,630 So whether your service is Google, Facebook, Netflix, 1663 01:27:24,630 --> 01:27:28,620 some huge data provider, or you are some mom-and-pop shop 1664 01:27:28,620 --> 01:27:32,670 in Kansas somewhere that has a few customers 1665 01:27:32,670 --> 01:27:36,360 but still has a website and a web presence, 1666 01:27:36,360 --> 01:27:40,110 the web traffic from either location, the small shop 1667 01:27:40,110 --> 01:27:43,230 or the big data provider, should be treated equally. 1668 01:27:43,230 --> 01:27:45,120 One should not be prioritized over the other. 1669 01:27:45,120 --> 01:27:48,870 That is the basic idea that underpins it: when you hear net neutrality, 1670 01:27:48,870 --> 01:27:52,650 it means all traffic on the web should be treated equally. 1671 01:27:52,650 --> 01:27:57,040 The hot potato, of course, is: is that the right thing to do? 1672 01:27:57,040 --> 01:28:01,110 Let's try and visualize one way of thinking 1673 01:28:01,110 --> 01:28:05,980 about net neutrality that kind of shows you how both sides might perceive this. 1674 01:28:05,980 --> 01:28:09,100 It may help to think about net neutrality in terms of a road. 1675 01:28:09,100 --> 01:28:11,920 Much like a road has cars flowing over it, 1676 01:28:11,920 --> 01:28:14,140 the internet has information flowing over it. 1677 01:28:14,140 --> 01:28:18,370 So we can think about this like we have a road. 1678 01:28:18,370 --> 01:28:20,860 And proponents of net neutrality will say, well, 1679 01:28:20,860 --> 01:28:26,790 wait a minute-- if we built a second road that was parallel to the first road, 1680 01:28:26,790 --> 01:28:31,230 went to the same place, but this road was maybe better maintained, 1681 01:28:31,230 --> 01:28:35,250 and you had to pay a toll to use it-- they would say, hey, wait, 1682 01:28:35,250 --> 01:28:36,450 this is unfair.
1683 01:28:36,450 --> 01:28:38,580 All this traffic needs to use this main road 1684 01:28:38,580 --> 01:28:40,650 that we've been using for a long time. 1685 01:28:40,650 --> 01:28:45,450 But for people who can afford to go onto this new road, where 1686 01:28:45,450 --> 01:28:48,450 traffic moves faster, but you have to pay the toll, well, then 1687 01:28:48,450 --> 01:28:50,810 their traffic's going to be prioritized. 1688 01:28:50,810 --> 01:28:53,130 Their packets are going to get there faster. 1689 01:28:53,130 --> 01:28:54,840 This is not fundamentally fair. 1690 01:28:54,840 --> 01:28:57,090 This is not the way the internet was designed, 1691 01:28:57,090 --> 01:29:00,060 where the free flow of information is sort of the priority, 1692 01:29:00,060 --> 01:29:01,690 and every packet is treated equally. 1693 01:29:01,690 --> 01:29:06,690 So proponents of net neutrality will say this arrangement is unfair. 1694 01:29:06,690 --> 01:29:08,610 Opponents of net neutrality, people who feel 1695 01:29:08,610 --> 01:29:11,970 like you should be able to have traffic that goes faster 1696 01:29:11,970 --> 01:29:15,000 on some roads than others, will say, no, no, no, this 1697 01:29:15,000 --> 01:29:16,560 is the free market talking. 1698 01:29:16,560 --> 01:29:18,390 The free market is saying, hey, if I really 1699 01:29:18,390 --> 01:29:22,932 want to make sure that my service gets to people faster, 1700 01:29:22,932 --> 01:29:24,390 I should have the right to do that. 1701 01:29:24,390 --> 01:29:27,960 After all, that's how the market works for just about everything else. 1702 01:29:27,960 --> 01:29:31,625 Why should the internet be any different? 1703 01:29:31,625 --> 01:29:33,000 And that's really the basic idea. 1704 01:29:33,000 --> 01:29:36,480 Should everybody use the same road, 1705 01:29:36,480 --> 01:29:42,450 or should people who can afford to use a different road be permitted to do so? 1706 01:29:42,450 --> 01:29:44,310 Proponents will say no. 1707 01:29:44,310 --> 01:29:45,960 Opponents will say yes. 1708 01:29:45,960 --> 01:29:49,935 That's the way the free market works. 1709 01:29:49,935 --> 01:29:52,560 From a theoretical perspective or from a technical perspective, 1710 01:29:52,560 --> 01:29:53,940 how would we implement this? 1711 01:29:53,940 --> 01:29:57,420 It's relatively easy if the service that we're trying to target 1712 01:29:57,420 --> 01:30:00,540 has paid for premium service. 1713 01:30:00,540 --> 01:30:02,710 Their IP addresses are associated with their business. 1714 01:30:02,710 --> 01:30:05,010 And so the internet service provider, the people 1715 01:30:05,010 --> 01:30:08,940 who own the infrastructure on which the internet operates-- they literally 1716 01:30:08,940 --> 01:30:11,730 own the fiber optic cables along which the data travels-- 1717 01:30:11,730 --> 01:30:16,700 can just say, well, any data that's going to this IP address, 1718 01:30:16,700 --> 01:30:19,010 we'll just prioritize it over other traffic. 1719 01:30:19,010 --> 01:30:22,440 There might be real reasons to actually want to prioritize some traffic. 1720 01:30:22,440 --> 01:30:26,910 So for example, if you are sending an email to somebody 1721 01:30:26,910 --> 01:30:30,410 or trying to access a website, there's a lot of redundancy built in here. 1722 01:30:30,410 --> 01:30:35,280 We've talked about TCP, for example, the Transmission Control Protocol, 1723 01:30:35,280 --> 01:30:37,140 and how it has redundancy built in.
1724 01:30:37,140 --> 01:30:39,480 If a packet is dropped-- if there's so much network 1725 01:30:39,480 --> 01:30:42,850 congestion, because everybody's flowing along that same road, 1726 01:30:42,850 --> 01:30:45,860 that the packet gets dropped-- 1727 01:30:45,860 --> 01:30:48,040 TCP will re-send that packet. 1728 01:30:48,040 --> 01:30:52,740 So for services that are low impact, like accessing a website for some company 1729 01:30:52,740 --> 01:30:58,647 or sending an email to somebody, there's no real worry here. 1730 01:30:58,647 --> 01:31:00,480 But now imagine a service like you're trying 1731 01:31:00,480 --> 01:31:03,930 to make an international business video call 1732 01:31:03,930 --> 01:31:06,660 using Skype or using Google Hangouts, or you're 1733 01:31:06,660 --> 01:31:12,000 trying to stream a movie on Netflix or some other internet video streaming 1734 01:31:12,000 --> 01:31:13,230 provider. 1735 01:31:13,230 --> 01:31:16,168 Generally, those packets are not sent using TCP. 1736 01:31:16,168 --> 01:31:18,210 They're usually sent using a different protocol called 1737 01:31:18,210 --> 01:31:22,135 UDP, whose purpose in life is really just to get information there as quickly 1738 01:31:22,135 --> 01:31:23,760 as possible, but with no redundancy. 1739 01:31:23,760 --> 01:31:27,990 If a packet gets dropped, that packet gets dropped; so be it. 1740 01:31:27,990 --> 01:31:30,840 Now, imagine if you're having an international business call. 1741 01:31:30,840 --> 01:31:34,588 There are a lot of packets moving, especially if you're 1742 01:31:34,588 --> 01:31:36,130 having a call with Asia, for example. 1743 01:31:36,130 --> 01:31:39,463 Between the United States and Asia, that traffic has to travel along the Pacific cable. 1744 01:31:39,463 --> 01:31:42,420 There's a lot of traffic that has to use that Pacific cable. 1745 01:31:42,420 --> 01:31:47,370 Wouldn't it be nice, advocates against net neutrality would say, 1746 01:31:47,370 --> 01:31:49,500 if the company that's providing that service 1747 01:31:49,500 --> 01:31:53,280 was able to pay to ensure that its packets had priority, thus 1748 01:31:53,280 --> 01:31:56,910 reducing the likelihood of those packets being dropped, 1749 01:31:56,910 --> 01:32:01,200 thus improving the quality of the video call, thus generally providing, 1750 01:32:01,200 --> 01:32:06,060 theoretically again, a better service for the people who use it? 1751 01:32:06,060 --> 01:32:10,220 So it might be the case that some services just need prioritization. 1752 01:32:10,220 --> 01:32:11,970 And the internet is designed in such a way 1753 01:32:11,970 --> 01:32:15,660 that we can't guarantee or give them that prioritization. 1754 01:32:15,660 --> 01:32:19,930 Isn't that a reason in favor of repealing net neutrality, 1755 01:32:19,930 --> 01:32:24,330 making it so that people could pay for certain services that 1756 01:32:24,330 --> 01:32:27,990 don't work with redundancy and just need to get there quickly, 1757 01:32:27,990 --> 01:32:31,730 guaranteed, over other traffic?
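Here is a toy sketch of that paid-prioritization idea at the packet level-- not real ISP software, and the addresses are documentation placeholders. Traffic bound for a paying service's address jumps the queue, which is exactly the behavior a strict net neutrality rule would forbid:

    import heapq
    from itertools import count

    PAID_DESTINATIONS = {"203.0.113.7"}  # hypothetical streaming provider's address
    order = count()                      # tie-breaker keeps arrival order within a class

    queue = []

    def enqueue(dest, payload):
        priority = 0 if dest in PAID_DESTINATIONS else 1  # lower number goes first
        heapq.heappush(queue, (priority, next(order), dest, payload))

    enqueue("198.51.100.2", "mom-and-pop shop page")  # best-effort traffic, arrives first
    enqueue("203.0.113.7", "video frame 1")           # prioritized traffic
    enqueue("203.0.113.7", "video frame 2")

    while queue:
        _, _, dest, payload = heapq.heappop(queue)
        print(f"forwarding to {dest}: {payload}")

    # The video frames go out ahead of the shop's page, even though the page's
    # packet arrived first; under strict net neutrality this would be plain FIFO.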
1758 01:32:31,730 --> 01:32:37,550 In 2015, under the Obama administration, when the Federal Communications Commission 1759 01:32:37,550 --> 01:32:42,680 was Democratically controlled, the FCC voted in favor of net neutrality, 1760 01:32:42,680 --> 01:32:47,270 reclassifying the internet as a Title II communications service. 1761 01:32:47,270 --> 01:32:50,120 That meant it could be much more tightly regulated by the FCC, 1762 01:32:50,120 --> 01:32:52,790 imposing this net neutrality requirement. 1763 01:32:52,790 --> 01:32:56,520 Two years later, when the Trump administration came into office, 1764 01:32:56,520 --> 01:33:00,650 President Trump appointed Ajit Pai, the current chairman of the FCC, 1765 01:33:00,650 --> 01:33:05,180 who basically said he was going to repeal the net neutrality rules that 1766 01:33:05,180 --> 01:33:07,410 had been set in place by the Obama administration. 1767 01:33:07,410 --> 01:33:08,030 And he did. 1768 01:33:08,030 --> 01:33:11,240 The repeal took effect in the summer of 2018. 1769 01:33:11,240 --> 01:33:15,800 So we're now back in these wild lands, where net neutrality 1770 01:33:15,800 --> 01:33:17,530 is on the books in some places. 1771 01:33:17,530 --> 01:33:20,360 There are even states now that have state laws 1772 01:33:20,360 --> 01:33:25,580 designed to enforce this idea, this theory of net neutrality, 1773 01:33:25,580 --> 01:33:28,540 laws that are now running into conflict with federal law. 1774 01:33:28,540 --> 01:33:32,210 So there's now this question of who wins out here. 1775 01:33:32,210 --> 01:33:33,650 Has Congress claimed this domain? 1776 01:33:33,650 --> 01:33:39,440 Can states set rules different from those set by Congress and by regulators 1777 01:33:39,440 --> 01:33:44,450 appointed by, or delegated responsibility by, Congress to make these decisions? 1778 01:33:44,450 --> 01:33:46,820 Can states do something different than that? 1779 01:33:46,820 --> 01:33:49,040 1780 01:33:49,040 --> 01:33:53,990 It is probably one of the most hot-button, hot-potato issues 1781 01:33:53,990 --> 01:33:56,000 in technology and the law right now. 1782 01:33:56,000 --> 01:33:59,120 What is going to happen with respect to net neutrality? 1783 01:33:59,120 --> 01:34:00,140 Is it a good thing? 1784 01:34:00,140 --> 01:34:01,490 Is it a bad thing? 1785 01:34:01,490 --> 01:34:06,460 Is it the right thing to do for the internet? 1786 01:34:06,460 --> 01:34:08,240 To learn a bit more about net neutrality, 1787 01:34:08,240 --> 01:34:11,980 we've supplied as an additional reading a con take on net neutrality. 1788 01:34:11,980 --> 01:34:14,740 Generally you'd see pro takes about this in tech blogs. 1789 01:34:14,740 --> 01:34:18,730 But we've explicitly included a con take on why net neutrality should not 1790 01:34:18,730 --> 01:34:21,400 be the norm, which we really do encourage you to take a look at 1791 01:34:21,400 --> 01:34:25,500 and consider as you dive into this topic. 1792 01:34:25,500 --> 01:34:27,750 But those are just some of the challenges 1793 01:34:27,750 --> 01:34:29,978 that lie at the intersection of law and technology. 1794 01:34:29,978 --> 01:34:31,770 We've certainly barely skimmed the surface. 1795 01:34:31,770 --> 01:34:35,850 And my hope is that I've created far more questions than answers, 1796 01:34:35,850 --> 01:34:38,610 because those are the kinds of questions that you 1797 01:34:38,610 --> 01:34:41,400 are going to have to answer for us.
1798 01:34:41,400 --> 01:34:43,740 Ultimately it is you, as practitioners, who 1799 01:34:43,740 --> 01:34:46,710 will go out and face these challenges and figure out 1800 01:34:46,710 --> 01:34:49,290 how we're going to deal with data breaches, how we're 1801 01:34:49,290 --> 01:34:51,353 going to deal with AI in the law, how we're 1802 01:34:51,353 --> 01:34:54,270 going to deal with net neutrality, how we're going to deal with issues 1803 01:34:54,270 --> 01:34:56,130 of software and trust. 1804 01:34:56,130 --> 01:34:59,430 Those are the questions for the future that lie at this intersection. 1805 01:34:59,430 --> 01:35:00,900 And the future is in your hands. 1806 01:35:00,900 --> 01:35:03,530 So help lead us in the right direction. 1807 01:35:03,530 --> 01:35:04,445