1 00:00:00,000 --> 00:00:00,325 2 00:00:00,325 --> 00:00:02,200 SPEAKER 1: Let's take a look at similarities. 3 00:00:02,200 --> 00:00:05,590 In this problem, your job is going to be to take two strings 4 00:00:05,590 --> 00:00:09,070 and to figure out how similar those two strings are to each other. 5 00:00:09,070 --> 00:00:12,370 And we'll define similarity in terms of edit distance. 6 00:00:12,370 --> 00:00:15,910 In particular, edit distance means how many steps or operations 7 00:00:15,910 --> 00:00:19,570 it would take to convert one string into the other string. 8 00:00:19,570 --> 00:00:22,810 If you can convert one string into the other one in very few steps 9 00:00:22,810 --> 00:00:26,330 or operations, then the two strings are probably pretty similar. 10 00:00:26,330 --> 00:00:28,390 But if it takes a lot of steps or operations 11 00:00:28,390 --> 00:00:31,090 to convert one string into the other, than those two strings 12 00:00:31,090 --> 00:00:32,840 are probably pretty different. 13 00:00:32,840 --> 00:00:34,360 Here's what you'll have to do. 14 00:00:34,360 --> 00:00:36,250 First, you'll implement a function called 15 00:00:36,250 --> 00:00:40,240 distances which will return a 2D list or a matrix, 16 00:00:40,240 --> 00:00:44,620 where matrix in row i and column j is going to be a tuple. 17 00:00:44,620 --> 00:00:48,670 Just an encapsulation of multiple values where the first value in the tuple 18 00:00:48,670 --> 00:00:52,330 is the edit distance between the first i characters 19 00:00:52,330 --> 00:00:55,930 of your first string which we'll call a, the first j 20 00:00:55,930 --> 00:00:59,320 characters of your second string which we'll call b. 21 00:00:59,320 --> 00:01:02,350 And so that edit distance is just going to be a number representing 22 00:01:02,350 --> 00:01:05,230 the number of steps it would take to convert from the first i 23 00:01:05,230 --> 00:01:09,490 characters of a into the first j characters of b. 24 00:01:09,490 --> 00:01:11,620 Then the second element in the tuple is going 25 00:01:11,620 --> 00:01:14,560 to be the last operation that you had to take 26 00:01:14,560 --> 00:01:19,120 in the optimal sequence of operations to get from one string to the other. 27 00:01:19,120 --> 00:01:22,780 That operation might be an insert or a deletion or a substitution, 28 00:01:22,780 --> 00:01:25,060 but more on that in a moment. 29 00:01:25,060 --> 00:01:27,820 Then you'll implement an HTML, a file called 30 00:01:27,820 --> 00:01:30,760 index.html which will display a form where 31 00:01:30,760 --> 00:01:35,290 the user can input two strings in order to submit them to be compared, 32 00:01:35,290 --> 00:01:37,570 based on the edit distance algorithm. 33 00:01:37,570 --> 00:01:41,260 And then finally, you'll implement a file called matrix.html 34 00:01:41,260 --> 00:01:43,870 which will display an HTML table representing 35 00:01:43,870 --> 00:01:47,530 all of the possible costs that are returned by distances. 36 00:01:47,530 --> 00:01:50,920 Giving you all of the different costs for converting the first i 37 00:01:50,920 --> 00:01:54,490 characters of string a into the first j characters of string 38 00:01:54,490 --> 00:01:58,150 b for all possible i and j values. 39 00:01:58,150 --> 00:01:59,970 Let's get started. 40 00:01:59,970 --> 00:02:01,160