1
00:00:00,000 --> 00:00:00,500

2
00:00:00,500 --> 00:00:02,450
BRIAN YU: Let's take a look at similarities.

3
00:00:02,450 --> 00:00:06,130
In the similarities problem, your task is going to be to take two files

4
00:00:06,130 --> 00:00:08,870
and to figure out how similar they are to each other.

5
00:00:08,870 --> 00:00:11,240
But what does similar actually mean?

6
00:00:11,240 --> 00:00:14,510
Two files could be similar because they have a lot of lines in common,

7
00:00:14,510 --> 00:00:15,620
for example.

8
00:00:15,620 --> 00:00:19,210
Or two files could be similar because they have a lot of sentences in common.

9
00:00:19,210 --> 00:00:20,960
Or two files could be similar because they

10
00:00:20,960 --> 00:00:23,180
have a lot of substrings in common.

11
00:00:23,180 --> 00:00:26,420
And your task in this problem is going to be to explore all three

12
00:00:26,420 --> 00:00:28,110
of those different methods.

13
00:00:28,110 --> 00:00:30,710
So here's what you'll have to do.

14
00:00:30,710 --> 00:00:32,630
First, you'll implement a function called

15
00:00:32,630 --> 00:00:36,230
lines, which will compare two files based on the number of lines

16
00:00:36,230 --> 00:00:37,580
they have in common.

17
00:00:37,580 --> 00:00:40,425
Then, you'll implement a function called sentences,

18
00:00:40,425 --> 00:00:42,800
which will compare files based on the number of sentences

19
00:00:42,800 --> 00:00:44,220
that they have in common.

20
00:00:44,220 --> 00:00:46,610
And then, you'll implement a function called substrings,

21
00:00:46,610 --> 00:00:50,720
which will compare files based on the number of substrings of length n

22
00:00:50,720 --> 00:00:53,510
they have in common for any length n.
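The three comparisons described above can be sketched roughly as follows. This is only an illustrative sketch, not the course's reference solution: the function names come from the narration, but the signatures (plain strings in, sorted lists of shared items out), the de-duplication, and the naive regex-based sentence splitting are all assumptions made here for illustration. The substring sketch assumes n is at least 1.

```python
import re


def lines(a: str, b: str) -> list[str]:
    """Return the lines that appear in both a and b, without duplicates."""
    return sorted(set(a.splitlines()) & set(b.splitlines()))


def sentences(a: str, b: str) -> list[str]:
    """Return the sentences that appear in both a and b, without duplicates."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    # Real text (abbreviations, quotes) needs a proper sentence tokenizer.
    def split(text: str) -> set[str]:
        parts = re.split(r"(?<=[.!?])\s+", text.strip())
        return {p for p in parts if p}

    return sorted(split(a) & split(b))


def substrings(a: str, b: str, n: int) -> list[str]:
    """Return the substrings of length n that appear in both a and b."""
    # Slide a window of width n across each string and collect every slice.
    def windows(text: str) -> set[str]:
        return {text[i:i + n] for i in range(len(text) - n + 1)}

    return sorted(windows(a) & windows(b))
```

Using sets keeps each shared item only once, which seems to match the idea of counting what two files have in common; a version that needs to count repeated lines would have to track multiplicity instead.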
23
00:00:53,510 --> 00:00:56,120
And then finally, we'll take these functions to the web,

24
00:00:56,120 --> 00:01:00,950
writing an HTML file called index.html, which will display a web form where

25
00:01:00,950 --> 00:01:04,340
the user can select two files and how they want to compare them--

26
00:01:04,340 --> 00:01:08,099
either by lines, or sentences, or substrings-- and then submit that form,

27
00:01:08,099 --> 00:01:09,890
at which point your web page will display--

28
00:01:09,890 --> 00:01:14,450
side-by-side-- those two files with any similarities between them highlighted.

29
00:01:14,450 --> 00:01:16,330
Let's get started.

30
00:01:16,330 --> 00:01:17,390