1 00:00:00,000 --> 00:00:08,364 2 00:00:08,364 --> 00:00:08,870 >> LUCAS FREITAS: Hey. 3 00:00:08,870 --> 00:00:09,980 Welcome, everyone. 4 00:00:09,980 --> 00:00:11,216 My name is Lucas Freitas. 5 00:00:11,216 --> 00:00:15,220 I'm a junior at [inaudible] studying computer science with a focus in 6 00:00:15,220 --> 00:00:16,410 computational linguistics. 7 00:00:16,410 --> 00:00:19,310 So my secondary is in language and linguistic theory. 8 00:00:19,310 --> 00:00:21,870 I'm really excited to teach you guys a little bit about the field. 9 00:00:21,870 --> 00:00:24,300 It's a very exciting area to study. 10 00:00:24,300 --> 00:00:27,260 Also with a lot of potential for the future. 11 00:00:27,260 --> 00:00:30,160 So, I'm really excited that you guys are considering projects in 12 00:00:30,160 --> 00:00:31,160 computational linguistics. 13 00:00:31,160 --> 00:00:35,460 And I'll be more than happy to advise any of you if you decide to 14 00:00:35,460 --> 00:00:37,090 pursue one of those. 15 00:00:37,090 --> 00:00:40,010 >> So first of all, what is computational linguistics? 16 00:00:40,010 --> 00:00:44,630 So computational linguistics is the intersection between linguistics and 17 00:00:44,630 --> 00:00:46,390 computer science. 18 00:00:46,390 --> 00:00:47,415 So, what is linguistics? 19 00:00:47,415 --> 00:00:48,490 What is computer science? 20 00:00:48,490 --> 00:00:51,580 Well, from linguistics, what we take are the languages. 21 00:00:51,580 --> 00:00:54,960 So linguistics is actually the study of natural language in general. 22 00:00:54,960 --> 00:00:58,330 So natural language - we're talking about the languages that we actually use to 23 00:00:58,330 --> 00:00:59,770 communicate with each other. 24 00:00:59,770 --> 00:01:02,200 So we're not exactly talking about C or Java. 25 00:01:02,200 --> 00:01:05,900 We're talking more about English and Chinese and other languages that we 26 00:01:05,900 --> 00:01:07,780 use to communicate with each other. 
27 00:01:07,780 --> 00:01:12,470 >> The challenging thing about that is that right now we have almost 7,000 28 00:01:12,470 --> 00:01:14,260 languages in the world. 29 00:01:14,260 --> 00:01:19,520 So there's quite a high variety of languages that we can study. 30 00:01:19,520 --> 00:01:22,600 And then you think that it's probably very hard to do, for example, 31 00:01:22,600 --> 00:01:26,960 translation from one language to another, considering that you have 32 00:01:26,960 --> 00:01:28,240 almost 7,000 of them. 33 00:01:28,240 --> 00:01:31,450 So, if you think of doing translation from one language to another, you 34 00:01:31,450 --> 00:01:35,840 have more than a million different combinations that you can 35 00:01:35,840 --> 00:01:37,330 have from language to language. 36 00:01:37,330 --> 00:01:40,820 So it's really challenging to build, for example, some kind of translation system for 37 00:01:40,820 --> 00:01:43,540 every single language. 38 00:01:43,540 --> 00:01:47,120 >> So, linguistics deals with syntax, semantics, pragmatics. 39 00:01:47,120 --> 00:01:49,550 You guys don't exactly need to know what they are. 40 00:01:49,550 --> 00:01:55,090 But the very interesting thing is that as a native speaker, when you learn 41 00:01:55,090 --> 00:01:59,010 a language as a child, you actually learn all of those things - syntax, semantics, 42 00:01:59,010 --> 00:02:00,500 and pragmatics - 43 00:02:00,500 --> 00:02:01,430 by yourself. 44 00:02:01,430 --> 00:02:04,820 And no one has to teach you syntax for you to understand how sentences are 45 00:02:04,820 --> 00:02:05,290 structured. 46 00:02:05,290 --> 00:02:07,980 So, it's really interesting because it's something that comes very 47 00:02:07,980 --> 00:02:10,389 intuitively. 48 00:02:10,389 --> 00:02:13,190 >> And what are you taking from computer science? 49 00:02:13,190 --> 00:02:16,700 Well, the most important thing that we have in computer science is, first of 50 00:02:16,700 --> 00:02:19,340 all, artificial intelligence and machine learning. 
51 00:02:19,340 --> 00:02:22,610 So, what we're trying to do in computational linguistics is teach 52 00:02:22,610 --> 00:02:26,990 your computer how to do something with language. 53 00:02:26,990 --> 00:02:28,630 >> So, for example, in machine translation. 54 00:02:28,630 --> 00:02:32,490 I'm trying to teach my computer how to go from one 55 00:02:32,490 --> 00:02:33,310 language to another. 56 00:02:33,310 --> 00:02:35,790 So, it's basically like teaching a computer two languages. 57 00:02:35,790 --> 00:02:38,870 If I do natural language processing, which is the case, for example, of 58 00:02:38,870 --> 00:02:41,810 Facebook's Graph Search, you teach your computer how to understand 59 00:02:41,810 --> 00:02:42,730 queries well. 60 00:02:42,730 --> 00:02:48,130 >> So, if you say "photos of my friends," Facebook doesn't treat that 61 00:02:48,130 --> 00:02:51,130 as a whole string that just has a bunch of words. 62 00:02:51,130 --> 00:02:56,020 It actually understands the relation between "photos" and "my friends" and 63 00:02:56,020 --> 00:02:59,620 understands that "photos" are a property of "my friends." 64 00:02:59,620 --> 00:03:02,350 >> So, that's part of, for example, natural language processing. 65 00:03:02,350 --> 00:03:04,790 It's trying to understand what the relation is between 66 00:03:04,790 --> 00:03:07,520 the words in a sentence. 67 00:03:07,520 --> 00:03:11,170 And the big question is, can you teach a computer how to speak 68 00:03:11,170 --> 00:03:12,650 a language in general? 69 00:03:12,650 --> 00:03:17,810 Which is a very interesting question to think about, as maybe in the future, 70 00:03:17,810 --> 00:03:19,930 you're going to be able to talk to your cell phone. 71 00:03:19,930 --> 00:03:23,290 Kind of like what you do with Siri, but something more like, you can actually 72 00:03:23,290 --> 00:03:25,690 say whatever you want and the phone is going to understand everything. 
73 00:03:25,690 --> 00:03:28,350 And it can have follow-up questions and keep talking. 74 00:03:28,350 --> 00:03:30,880 That's something really exciting, in my opinion. 75 00:03:30,880 --> 00:03:33,070 >> So, something about natural languages. 76 00:03:33,070 --> 00:03:36,220 Something really interesting about natural languages is that, and this is 77 00:03:36,220 --> 00:03:38,470 credit to my linguistics professor, Maria Polinsky. 78 00:03:38,470 --> 00:03:40,830 She gives an example and I think it's really interesting. 79 00:03:40,830 --> 00:03:47,060 Because we learn language from when we're born and then our native 80 00:03:47,060 --> 00:03:49,170 language kind of grows on us. 81 00:03:49,170 --> 00:03:52,570 >> And basically you learn language from minimal input, right? 82 00:03:52,570 --> 00:03:56,700 You're just getting input from your parents of what your language sounds 83 00:03:56,700 --> 00:03:58,770 like and you just learn it. 84 00:03:58,770 --> 00:04:02,240 So, it's interesting because if you look at these sentences, for example. 85 00:04:02,240 --> 00:04:06,980 You look at, "Mary puts on a coat every time she leaves the house." 86 00:04:06,980 --> 00:04:10,650 >> In this case, it's possible to have the word "she" refer to Mary, right? 87 00:04:10,650 --> 00:04:13,500 You can say "Mary puts on a coat every time Mary leaves 88 00:04:13,500 --> 00:04:14,960 the house." So that's fine. 89 00:04:14,960 --> 00:04:19,370 But then if you look at the sentence "She puts on a coat every time Mary 90 00:04:19,370 --> 00:04:22,850 leaves the house," you know it's impossible to say that "she" is 91 00:04:22,850 --> 00:04:24,260 referring to Mary. 92 00:04:24,260 --> 00:04:27,070 >> There's no way of saying that it means "Mary puts on a coat every time Mary leaves 93 00:04:27,070 --> 00:04:30,790 the house." So it's interesting because this is the kind of intuition 94 00:04:30,790 --> 00:04:32,890 that every native speaker has. 
95 00:04:32,890 --> 00:04:36,370 And no one was taught that this is the way that the syntax works. 96 00:04:36,370 --> 00:04:41,930 And that you can only have this "she" referring to Mary in this first case, 97 00:04:41,930 --> 00:04:44,260 and actually in this other one too, but not in this one. 98 00:04:44,260 --> 00:04:46,500 But everyone kind of gets to the same answer. 99 00:04:46,500 --> 00:04:48,580 Everyone agrees on that. 100 00:04:48,580 --> 00:04:53,280 So it's really interesting how, although you don't know all the rules 101 00:04:53,280 --> 00:04:55,575 in your language, you kind of understand how the language works. 102 00:04:55,575 --> 00:04:59,020 103 00:04:59,020 --> 00:05:01,530 >> So the interesting thing about natural language is that you don't have to 104 00:05:01,530 --> 00:05:06,970 know any syntax to know if a sentence is grammatical or ungrammatical in 105 00:05:06,970 --> 00:05:08,810 most cases. 106 00:05:08,810 --> 00:05:13,220 Which makes you think that maybe what happens is that through your life, you 107 00:05:13,220 --> 00:05:17,410 just keep getting more and more sentences told to you. 108 00:05:17,410 --> 00:05:19,800 And then you keep memorizing all of the sentences. 109 00:05:19,800 --> 00:05:24,230 And then when someone tells you something, you hear that sentence and 110 00:05:24,230 --> 00:05:27,040 you look at your vocabulary of sentences and see if 111 00:05:27,040 --> 00:05:28,270 that sentence is there. 112 00:05:28,270 --> 00:05:29,830 And if it's there, you say it's grammatical. 113 00:05:29,830 --> 00:05:31,740 If it's not, you say it's ungrammatical. 114 00:05:31,740 --> 00:05:35,150 >> So, in that case, you would say, oh, so you have a huge list of all 115 00:05:35,150 --> 00:05:36,140 possible sentences. 116 00:05:36,140 --> 00:05:38,240 And then when you hear a sentence, you know if it's grammatical or 117 00:05:38,240 --> 00:05:39,450 not based on that. 
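The memorization hypothesis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `heard_sentences` and `is_grammatical_by_lookup` are hypothetical names, and the stored sentences are a made-up toy memory rather than anything from the talk's slides.

```python
# Hypothetical sketch of the "memorize every sentence" hypothesis:
# a sentence counts as grammatical only if it was heard before.

# Assumed toy memory of previously heard sentences (illustrative only).
heard_sentences = {
    "mary puts on a coat every time she leaves the house.",
    "i am hungry.",
}


def is_grammatical_by_lookup(sentence: str, memory: set) -> bool:
    """Return True only when the exact sentence is already in memory."""
    return sentence.strip().lower() in memory


# A remembered sentence is accepted.
known = is_grammatical_by_lookup("I am hungry.", heard_sentences)

# A well-formed sentence that was never heard is rejected,
# which is the weak point of pure memorization.
novel = is_grammatical_by_lookup("The octopus is hungry.", heard_sentences)

print(known, novel)
```

Under this model every sentence that was never heard is rejected, no matter how well formed it is.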
118 00:05:39,450 --> 00:05:42,360 The thing is that if you look at a sentence, for example, "The 119 00:05:42,360 --> 00:05:47,540 five-headed CS50 TFs cooked the blind octopus using a DAPA mug." It's 120 00:05:47,540 --> 00:05:49,630 definitely not a sentence that you've heard before. 121 00:05:49,630 --> 00:05:52,380 But at the same time you know it's pretty much grammatical, right? 122 00:05:52,380 --> 00:05:55,570 There are no grammatical mistakes and you can say that 123 00:05:55,570 --> 00:05:57,020 it's a possible sentence. 124 00:05:57,020 --> 00:06:01,300 >> So it makes us think that actually the way that we learn language is not just 125 00:06:01,300 --> 00:06:07,090 by having a huge database of possible words or sentences, but more by 126 00:06:07,090 --> 00:06:11,490 understanding the relation between words in those sentences. 127 00:06:11,490 --> 00:06:14,570 Does that make sense? 128 00:06:14,570 --> 00:06:19,370 So, then the question is, can computers learn languages? 129 00:06:19,370 --> 00:06:21,490 Can we teach language to computers? 130 00:06:21,490 --> 00:06:24,230 >> So, let's think of the difference between a native speaker of a language 131 00:06:24,230 --> 00:06:25,460 and a computer. 132 00:06:25,460 --> 00:06:27,340 So, what happens to the speaker? 133 00:06:27,340 --> 00:06:30,430 Well, the native speaker learns a language from exposure to it. 134 00:06:30,430 --> 00:06:34,200 Usually in the early childhood years. 135 00:06:34,200 --> 00:06:38,570 So, basically, you just have a baby, and you keep talking to it, and 136 00:06:38,570 --> 00:06:40,540 it just learns how to speak the language, right? 137 00:06:40,540 --> 00:06:42,660 So, you're basically giving input to the baby. 138 00:06:42,660 --> 00:06:45,200 So, then you can argue that a computer can do the same thing, right? 139 00:06:45,200 --> 00:06:49,510 You can just give language as input to the computer. 
140 00:06:49,510 --> 00:06:53,410 >> For example, a bunch of files that have books in English. 141 00:06:53,410 --> 00:06:56,190 Maybe that's one way that you could possibly teach a 142 00:06:56,190 --> 00:06:57,850 computer English, right? 143 00:06:57,850 --> 00:07:01,000 And in fact, if you think about it, it takes you maybe a couple of 144 00:07:01,000 --> 00:07:02,680 days to read a book. 145 00:07:02,680 --> 00:07:05,760 For a computer, it takes a second to look at all the words in a book. 146 00:07:05,760 --> 00:07:10,810 So you can think that maybe just this argument of input from around you, 147 00:07:10,810 --> 00:07:15,440 that's not enough to say that that's something that only humans can do. 148 00:07:15,440 --> 00:07:17,680 You can think computers also can get input. 149 00:07:17,680 --> 00:07:21,170 >> The second thing is that native speakers also have a brain that has 150 00:07:21,170 --> 00:07:23,870 language learning capability. 151 00:07:23,870 --> 00:07:27,020 But if you think about it, a brain is a solid thing. 152 00:07:27,020 --> 00:07:30,450 When you're born, it's already set - 153 00:07:30,450 --> 00:07:31,320 this is your brain. 154 00:07:31,320 --> 00:07:34,660 And as you grow up, you just get more input of language and maybe nutrients 155 00:07:34,660 --> 00:07:35,960 and other stuff. 156 00:07:35,960 --> 00:07:38,170 But pretty much your brain is a solid thing. 157 00:07:38,170 --> 00:07:41,290 >> So you can say, well, maybe you can build a computer that has a bunch of 158 00:07:41,290 --> 00:07:45,890 functions and methods that just mimic language learning capability. 159 00:07:45,890 --> 00:07:49,630 So in that sense, you could say, well, I can have a computer that has all 160 00:07:49,630 --> 00:07:52,270 the things I need to learn language. 
161 00:07:52,270 --> 00:07:56,200 And the last thing is that a native speaker learns from trial and error. 162 00:07:56,200 --> 00:08:01,090 So basically another important thing in language learning is that you kind 163 00:08:01,090 --> 00:08:05,340 of learn things by making generalizations of what you hear. 164 00:08:05,340 --> 00:08:10,280 >> So as you're growing up, you learn that some words are more like nouns, 165 00:08:10,280 --> 00:08:11,820 some other ones are adjectives. 166 00:08:11,820 --> 00:08:14,250 And you don't have to have any knowledge of linguistics 167 00:08:14,250 --> 00:08:15,040 to understand that. 168 00:08:15,040 --> 00:08:18,560 But you just know there are some words that are positioned in some part of the 169 00:08:18,560 --> 00:08:22,570 sentence and some others in other parts of the sentence. 170 00:08:22,570 --> 00:08:26,110 >> And that when you do something that is like a sentence that's not correct - 171 00:08:26,110 --> 00:08:28,770 maybe because of an over-generalization, for example. 172 00:08:28,770 --> 00:08:32,210 Maybe when you're growing up, you notice that the plural is usually 173 00:08:32,210 --> 00:08:35,809 formed by putting an S at the end of the word. 174 00:08:35,809 --> 00:08:40,042 And then you try to form the plural of "deer" as "deers" or "tooth" as 175 00:08:40,042 --> 00:08:44,780 "tooths." So then your parents or someone corrects you and says, no, the 176 00:08:44,780 --> 00:08:49,020 plural of "deer" is "deer," and the plural of "tooth" is "teeth." And then 177 00:08:49,020 --> 00:08:50,060 you learn those things. 178 00:08:50,060 --> 00:08:51,520 So you learn from trial and error. 179 00:08:51,520 --> 00:08:53,100 >> But you can also do that with a computer. 180 00:08:53,100 --> 00:08:55,310 You can have something called reinforcement learning. 
181 00:08:55,310 --> 00:08:58,560 Which is basically like giving the computer a reward whenever it does 182 00:08:58,560 --> 00:08:59,410 something correctly. 183 00:08:59,410 --> 00:09:04,710 And giving it the opposite of a reward whenever it does something wrong. 184 00:09:04,710 --> 00:09:07,410 You can actually see that if you go to Google Translate and you try to 185 00:09:07,410 --> 00:09:10,220 translate a sentence, it asks you for feedback. 186 00:09:10,220 --> 00:09:13,240 So you can say, oh, there's a better translation for this sentence. 187 00:09:13,240 --> 00:09:18,140 You can type it in, and then if a lot of people keep saying that it's a better 188 00:09:18,140 --> 00:09:21,560 translation, it just learns that it should use that translation instead of 189 00:09:21,560 --> 00:09:22,960 the one it was giving. 190 00:09:22,960 --> 00:09:28,830 >> So, it's a very philosophical question to see if computers are going to be 191 00:09:28,830 --> 00:09:30,340 able to talk or not in the future. 192 00:09:30,340 --> 00:09:34,440 But I have high hopes that they can, just based on those arguments. 193 00:09:34,440 --> 00:09:38,570 But it's just more of a philosophical question. 194 00:09:38,570 --> 00:09:43,460 >> So while computers still cannot talk, what are the things that we can do? 195 00:09:43,460 --> 00:09:47,070 One really cool thing is data classification. 196 00:09:47,070 --> 00:09:53,210 So, for example, you guys know that email services do, for 197 00:09:53,210 --> 00:09:55,580 example, spam filtering. 198 00:09:55,580 --> 00:09:59,070 So whenever you receive spam, it tries to filter it to another box. 199 00:09:59,070 --> 00:10:00,270 So how does it do that? 200 00:10:00,270 --> 00:10:06,080 It's not like the computer just knows which email addresses are sending spam. 201 00:10:06,080 --> 00:10:09,130 So it's more based on the content of the message, or maybe the title, or 202 00:10:09,130 --> 00:10:11,310 maybe some pattern that you have. 
203 00:10:11,310 --> 00:10:15,690 >> So, basically, what you can do is get a lot of data of emails that are spam, 204 00:10:15,690 --> 00:10:19,980 emails that are not spam, and learn what kinds of patterns you have in 205 00:10:19,980 --> 00:10:21,000 the ones that are spam. 206 00:10:21,000 --> 00:10:23,260 And this is part of computational linguistics. 207 00:10:23,260 --> 00:10:24,720 It's called data classification. 208 00:10:24,720 --> 00:10:28,100 And we're actually going to see an example of that in the next slides. 209 00:10:28,100 --> 00:10:32,910 >> The second thing is natural language processing, which is the thing that 210 00:10:32,910 --> 00:10:36,580 Graph Search is doing, letting you write a sentence. 211 00:10:36,580 --> 00:10:38,690 And it tries to understand what the meaning is and gives 212 00:10:38,690 --> 00:10:39,940 you a better result. 213 00:10:39,940 --> 00:10:43,880 Actually, if you go to Google or Bing and you search for something like Lady 214 00:10:43,880 --> 00:10:47,060 Gaga height, you're actually going to get 5'1" instead of information 215 00:10:47,060 --> 00:10:50,170 about her, because it actually understands what you're talking about. 216 00:10:50,170 --> 00:10:52,140 So that's part of natural language processing. 217 00:10:52,140 --> 00:10:57,000 >> Or also when you're using Siri, first you have an algorithm that tries to 218 00:10:57,000 --> 00:11:01,130 translate what you're saying into words, into text. 219 00:11:01,130 --> 00:11:03,690 And then it tries to translate that into meaning. 220 00:11:03,690 --> 00:11:06,570 So that's all part of natural language processing. 221 00:11:06,570 --> 00:11:08,320 >> Then you have machine translation - 222 00:11:08,320 --> 00:11:10,300 which is actually one of my favorites - 223 00:11:10,300 --> 00:11:14,060 which is just translating from one language to another. 224 00:11:14,060 --> 00:11:17,950 So you can think that when you're doing machine translation, you have 225 00:11:17,950 --> 00:11:19,750 infinite possibilities of sentences. 
226 00:11:19,750 --> 00:11:22,960 So there's no way of just storing every single translation. 227 00:11:22,960 --> 00:11:27,440 So you have to come up with interesting algorithms to be able to 228 00:11:27,440 --> 00:11:30,110 translate every single sentence in some way. 229 00:11:30,110 --> 00:11:32,483 >> Do you guys have any questions so far? 230 00:11:32,483 --> 00:11:34,450 No? 231 00:11:34,450 --> 00:11:34,830 OK. 232 00:11:34,830 --> 00:11:36,900 >> So what are we going to see today? 233 00:11:36,900 --> 00:11:39,300 First of all, I'm going to talk about the classification problem. 234 00:11:39,300 --> 00:11:41,440 So the one that I was talking about with spam. 235 00:11:41,440 --> 00:11:46,820 What I'm going to do is, given lyrics to a song, can you try to figure out 236 00:11:46,820 --> 00:11:49,810 with high probability who the singer is? 237 00:11:49,810 --> 00:11:53,590 Let's say that I have songs from Lady Gaga and Katy Perry. If I give you 238 00:11:53,590 --> 00:11:58,130 a new song, can you figure out if it's Katy Perry or Lady Gaga? 239 00:11:58,130 --> 00:12:01,490 >> For the second one, I'm just going to talk about the segmentation problem. 240 00:12:01,490 --> 00:12:05,780 So I don't know if you guys know, but Chinese, Japanese, other East Asian 241 00:12:05,780 --> 00:12:08,090 languages, and other languages in general, don't have 242 00:12:08,090 --> 00:12:09,830 spaces between words. 243 00:12:09,830 --> 00:12:13,540 And then if you think about the way that your computer kind of tries to 244 00:12:13,540 --> 00:12:18,600 understand natural language, it looks at the words and 245 00:12:18,600 --> 00:12:21,500 tries to understand the relations between them, right? 246 00:12:21,500 --> 00:12:25,440 But then if you have Chinese, and you have zero spaces, it's really hard to 247 00:12:25,440 --> 00:12:28,360 find out what the relation between words is, because they don't have any 248 00:12:28,360 --> 00:12:29,530 words in the first place. 
249 00:12:29,530 --> 00:12:32,600 So you have to do something called segmentation, which just means putting 250 00:12:32,600 --> 00:12:36,490 spaces between what we would call words in those languages. 251 00:12:36,490 --> 00:12:37,740 Make sense? 252 00:12:37,740 --> 00:12:39,680 253 00:12:39,680 --> 00:12:41,540 >> And then we're going to talk about syntax. 254 00:12:41,540 --> 00:12:44,050 So just a little bit about natural language processing. 255 00:12:44,050 --> 00:12:45,420 It's going to be just an overview. 256 00:12:45,420 --> 00:12:50,700 So today, basically what I want to do is give you guys a little bit of an 257 00:12:50,700 --> 00:12:53,930 inside look at the possibilities of what you can do with computational 258 00:12:53,930 --> 00:12:54,960 linguistics. 259 00:12:54,960 --> 00:13:00,410 And then you can see what you think is cool among those things. 260 00:13:00,410 --> 00:13:02,270 And maybe you can think of a project and come talk to me. 261 00:13:02,270 --> 00:13:05,260 And I can give you advice on how to implement it. 262 00:13:05,260 --> 00:13:09,060 >> So syntax is going to be a little bit about Graph Search and machine 263 00:13:09,060 --> 00:13:09,670 translation. 264 00:13:09,670 --> 00:13:13,650 I'm just going to give an example of how you could, for example, translate 265 00:13:13,650 --> 00:13:16,020 something from Portuguese to English. 266 00:13:16,020 --> 00:13:17,830 Sounds good? 267 00:13:17,830 --> 00:13:19,293 >> So first, the classification problem. 268 00:13:19,293 --> 00:13:23,590 I'll say that this part of the seminar is going to be the most challenging 269 00:13:23,590 --> 00:13:27,560 one just because there's going to be some coding. 270 00:13:27,560 --> 00:13:29,470 But it's going to be Python. 271 00:13:29,470 --> 00:13:34,380 I know you guys don't know Python, so I'm just going to explain at a high 272 00:13:34,380 --> 00:13:35,750 level what I'm doing. 
273 00:13:35,750 --> 00:13:40,900 And you don't have to really care too much about syntax because that's 274 00:13:40,900 --> 00:13:42,140 something you guys can learn. 275 00:13:42,140 --> 00:13:42,540 OK? 276 00:13:42,540 --> 00:13:43,580 Sounds good. 277 00:13:43,580 --> 00:13:46,020 >> So what is the classification problem? 278 00:13:46,020 --> 00:13:49,140 So you're given some lyrics to a song and you want to guess 279 00:13:49,140 --> 00:13:50,620 who is singing it. 280 00:13:50,620 --> 00:13:54,045 And this can apply to any kind of other problem. 281 00:13:54,045 --> 00:13:59,980 So it can be, for example, you have a presidential campaign and you have a 282 00:13:59,980 --> 00:14:02,610 speech, and you want to find out if it was, for example, 283 00:14:02,610 --> 00:14:04,470 Obama or Mitt Romney. 284 00:14:04,470 --> 00:14:07,700 Or you can have a bunch of emails and you want to figure out if they are 285 00:14:07,700 --> 00:14:08,890 spam or not. 286 00:14:08,890 --> 00:14:11,440 So it's just classifying some data based on the words 287 00:14:11,440 --> 00:14:13,790 that you have there. 288 00:14:13,790 --> 00:14:16,295 >> So to do that, you have to make some assumptions. 289 00:14:16,295 --> 00:14:20,570 So a lot of computational linguistics is making assumptions, 290 00:14:20,570 --> 00:14:24,100 usually smart assumptions, so that you can get good results. 291 00:14:24,100 --> 00:14:26,670 You try to create a model for it. 292 00:14:26,670 --> 00:14:31,290 And then you try it out and see if it works, if it gives you good precision. 293 00:14:31,290 --> 00:14:33,940 And if it does, then you try to improve it. 294 00:14:33,940 --> 00:14:37,640 If it doesn't, you're like, OK, maybe I should make a different assumption. 
295 00:14:37,640 --> 00:14:44,030 >> So the assumption that we're going to make is that an artist usually sings 296 00:14:44,030 --> 00:14:49,220 about a topic multiple times, and maybe uses words multiple times just 297 00:14:49,220 --> 00:14:50,270 because they're used to it. 298 00:14:50,270 --> 00:14:51,890 You can just think of your friend. 299 00:14:51,890 --> 00:14:57,350 I'm sure you guys all have friends that say their signature phrase, 300 00:14:57,350 --> 00:14:59,260 literally in every single sentence - 301 00:14:59,260 --> 00:15:02,660 like some specific word or some specific phrase that they say in 302 00:15:02,660 --> 00:15:04,020 every single sentence. 303 00:15:04,020 --> 00:15:07,920 >> And what you can say is that if you see a sentence that has a signature 304 00:15:07,920 --> 00:15:11,450 phrase, you can guess that probably your friend is the 305 00:15:11,450 --> 00:15:13,310 one saying it, right? 306 00:15:13,310 --> 00:15:18,410 So you make that assumption and then that's how you create a model. 307 00:15:18,410 --> 00:15:24,440 >> The example that I'm going to give is on how Lady Gaga, for example, people 308 00:15:24,440 --> 00:15:27,430 say that she uses "baby" in all her number one songs. 309 00:15:27,430 --> 00:15:32,270 And actually this is a video that shows her saying the word "baby" in 310 00:15:32,270 --> 00:15:33,410 different songs. 311 00:15:33,410 --> 00:15:33,860 >> [VIDEO PLAYBACK] 312 00:15:33,860 --> 00:15:34,310 >> - (SINGING) Baby. 313 00:15:34,310 --> 00:15:36,220 Baby. 314 00:15:36,220 --> 00:15:37,086 Baby. 315 00:15:37,086 --> 00:15:37,520 Baby. 316 00:15:37,520 --> 00:15:37,770 Baby. 317 00:15:37,770 --> 00:15:38,822 Babe. 318 00:15:38,822 --> 00:15:39,243 Baby. 319 00:15:39,243 --> 00:15:40,085 Baby. 320 00:15:40,085 --> 00:15:40,510 Baby. 321 00:15:40,510 --> 00:15:40,850 Baby. 
322 00:15:40,850 --> 00:15:41,090 >> [END VIDEO PLAYBACK] 323 00:15:41,090 --> 00:15:44,020 >> LUCAS FREITAS: So there are, I think, 40 songs here in which she says 324 00:15:44,020 --> 00:15:48,690 the word "baby." So you can basically guess that if you see a song that has 325 00:15:48,690 --> 00:15:52,180 the word "baby," there's some high probability that it's Lady Gaga. 326 00:15:52,180 --> 00:15:56,450 But let's try to develop this further, more formally. 327 00:15:56,450 --> 00:16:00,470 >> So these are lyrics to songs by Lady Gaga and Katy Perry. 328 00:16:00,470 --> 00:16:04,120 So you look at Lady Gaga, and you see there are a lot of occurrences of "baby," a 329 00:16:04,120 --> 00:16:07,710 lot of occurrences of "way." And then Katy Perry has a lot of occurrences of 330 00:16:07,710 --> 00:16:10,360 "the," a lot of occurrences of "fire." 331 00:16:10,360 --> 00:16:14,560 >> So basically what we want to do is, you get a lyric. 332 00:16:14,560 --> 00:16:20,480 Let's say that you get a lyric for a song that is "baby," just "baby." If 333 00:16:20,480 --> 00:16:24,750 you just get the word "baby," and this is all the data that you have from 334 00:16:24,750 --> 00:16:27,880 Lady Gaga and Katy Perry, who would you guess is the person 335 00:16:27,880 --> 00:16:29,370 who sings the song? 336 00:16:29,370 --> 00:16:32,360 Lady Gaga or Katy Perry? 337 00:16:32,360 --> 00:16:33,150 Lady Gaga, right? 338 00:16:33,150 --> 00:16:37,400 Because she's the only one who says "baby." This sounds silly, right? 339 00:16:37,400 --> 00:16:38,760 OK, this is really easy. 340 00:16:38,760 --> 00:16:41,860 I'm just looking at the two songs and of course, she's the only one who has 341 00:16:41,860 --> 00:16:42,660 "baby." 342 00:16:42,660 --> 00:16:44,740 >> But what if you have a bunch of words? 343 00:16:44,740 --> 00:16:50,900 If you have an actual lyric, something like, "baby, I just 344 00:16:50,900 --> 00:16:51,610 went to see a [? CFT ?] 
345 00:16:51,610 --> 00:16:54,020 lecture," or something like that, and then you actually have to figure out - 346 00:16:54,020 --> 00:16:55,780 based on all those words - 347 00:16:55,780 --> 00:16:58,350 who is the artist who probably sang this song? 348 00:16:58,350 --> 00:17:01,860 So let's try to develop this a little further. 349 00:17:01,860 --> 00:17:05,630 >> OK, so based just on the data that we got, it seems that Gaga is probably 350 00:17:05,630 --> 00:17:06,260 the singer. 351 00:17:06,260 --> 00:17:07,904 But how can we write this more formally? 352 00:17:07,904 --> 00:17:10,579 353 00:17:10,579 --> 00:17:13,140 And there's going to be a little bit of statistics. 354 00:17:13,140 --> 00:17:15,880 So if you get lost, just try to understand the concept. 355 00:17:15,880 --> 00:17:18,700 It doesn't matter if you understand the equations perfectly well. 356 00:17:18,700 --> 00:17:22,150 This is all going to be online. 357 00:17:22,150 --> 00:17:25,490 >> So basically what I'm calculating is the probability that this song is by 358 00:17:25,490 --> 00:17:28,040 Lady Gaga given that - 359 00:17:28,040 --> 00:17:30,660 so this bar means given that - 360 00:17:30,660 --> 00:17:33,680 I saw the word "baby." Does that make sense? 361 00:17:33,680 --> 00:17:35,540 So I'm trying to calculate that probability. 362 00:17:35,540 --> 00:17:38,540 >> So there's this theorem called Bayes' theorem that says that the 363 00:17:38,540 --> 00:17:43,330 probability of A given B is the probability of B given A, times the 364 00:17:43,330 --> 00:17:47,660 probability of A, over the probability of B. This is a long equation. 365 00:17:47,660 --> 00:17:51,970 But what you have to understand from that is that this is what I want to 366 00:17:51,970 --> 00:17:52,830 calculate, right? 367 00:17:52,830 --> 00:17:56,570 So the probability that the song is by Lady Gaga given that I saw the word 368 00:17:56,570 --> 00:17:58,230 "baby." 
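Written out, the theorem quoted above reads as follows, with A standing for the class (the song is by Lady Gaga) and B for the evidence (the lyric contains the word "baby"). The symbols A and B come from the talk; the labels in the second line are shorthand assumed here for readability.

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
\]
\[
P(\text{Gaga} \mid \text{``baby''}) =
  \frac{P(\text{``baby''} \mid \text{Gaga})\; P(\text{Gaga})}{P(\text{``baby''})}
\]
```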
369 00:17:58,230 --> 00:18:02,960 >> And now what I'm getting is the probability of the word "baby" given 370 00:18:02,960 --> 00:18:04,390 that I have Lady Gaga. 371 00:18:04,390 --> 00:18:07,220 And what is that basically? 372 00:18:07,220 --> 00:18:10,500 What that means is, what is the probability of seeing the word "baby" 373 00:18:10,500 --> 00:18:12,130 in Gaga lyrics? 374 00:18:12,130 --> 00:18:16,240 If I want to calculate that in a very simple way, it's just the number of 375 00:18:16,240 --> 00:18:23,640 times I see "baby" over the total number of words in Gaga lyrics, right? 376 00:18:23,640 --> 00:18:27,600 What is the frequency with which I see that word in Gaga's work? 377 00:18:27,600 --> 00:18:30,530 Make sense? 378 00:18:30,530 --> 00:18:33,420 >> The second term is the probability of Gaga. 379 00:18:33,420 --> 00:18:34,360 What does that mean? 380 00:18:34,360 --> 00:18:38,550 That basically means, what is the probability of classifying 381 00:18:38,550 --> 00:18:40,690 some lyrics as Gaga? 382 00:18:40,690 --> 00:18:45,320 And that's kind of weird, but let's think of an example. 383 00:18:45,320 --> 00:18:49,230 So let's say that the probability of having "baby" in a song is the same 384 00:18:49,230 --> 00:18:51,760 for Gaga and Britney Spears. 385 00:18:51,760 --> 00:18:54,950 But Britney Spears has twice as many songs as Lady Gaga. 386 00:18:54,950 --> 00:19:00,570 So if someone just randomly gives you lyrics of "baby," the first thing you 387 00:19:00,570 --> 00:19:04,710 look at is, what is the probability of having "baby" in a Gaga song, "baby" 388 00:19:04,710 --> 00:19:05,410 in a Britney song? 389 00:19:05,410 --> 00:19:06,460 And it's the same thing. 390 00:19:06,460 --> 00:19:10,040 >> So the second thing that you'll look at is, well, what is the probability of 391 00:19:10,040 --> 00:19:13,770 this lyric by itself being a Gaga lyric, and what is the probability of 392 00:19:13,770 --> 00:19:15,380 it being a Britney lyric? 
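The two terms described above can be sketched in Python. This is a minimal illustration with made-up counts: the likelihood is a relative word frequency and the prior is an artist's share of all songs. The function names and every number below are hypothetical; only the setup (equal "baby" frequency, Britney having twice as many songs) follows the example in the talk.

```python
from fractions import Fraction


def likelihood(word_count: int, total_words: int) -> Fraction:
    """P(word | artist), estimated as occurrences / total words in the lyrics."""
    return Fraction(word_count, total_words)


def prior(artist_songs: int, all_songs: int) -> Fraction:
    """P(artist), estimated as the artist's share of all songs."""
    return Fraction(artist_songs, all_songs)


# Hypothetical counts: both artists say "baby" equally often ...
p_baby_given_gaga = likelihood(5, 100)
p_baby_given_britney = likelihood(5, 100)

# ... but Britney has twice as many songs (assumed 20 versus 10).
p_gaga = prior(10, 30)
p_britney = prior(20, 30)

# Numerators of Bayes' theorem. The shared P("baby") is left out
# because it is the same for both artists.
score_gaga = p_baby_given_gaga * p_gaga
score_britney = p_baby_given_britney * p_britney

best = "Britney" if score_britney > score_gaga else "Gaga"
print(best, score_gaga, score_britney)
```

With equal likelihoods, the prior alone decides the comparison, so the artist with more songs gets the higher score.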
393 00:19:15,380 --> 00:19:18,950 So since Britney has many more lyrics than Gaga, you would probably 394 00:19:18,950 --> 00:19:21,470 say, well, this is probably a Britney lyric. 395 00:19:21,470 --> 00:19:23,340 So that's why we have this term right here. 396 00:19:23,340 --> 00:19:24,670 Probability of Gaga. 397 00:19:24,670 --> 00:19:26,950 Makes sense? 398 00:19:26,950 --> 00:19:28,660 Does it? 399 00:19:28,660 --> 00:19:29,370 OK. 400 00:19:29,370 --> 00:19:33,500 >> And the last one is just the probability of "baby," which does not 401 00:19:33,500 --> 00:19:34,810 really matter that much. 402 00:19:34,810 --> 00:19:39,940 But it's the probability of seeing "baby" in English. 403 00:19:39,940 --> 00:19:42,725 We usually don't care that much about that term. 404 00:19:42,725 --> 00:19:44,490 Does that make sense? 405 00:19:44,490 --> 00:19:48,110 So the probability of Gaga is called the prior probability 406 00:19:48,110 --> 00:19:49,530 of the class Gaga. 407 00:19:49,530 --> 00:19:53,840 Because it just means, what is the probability of having that class - 408 00:19:53,840 --> 00:19:55,520 which is Gaga - 409 00:19:55,520 --> 00:19:59,350 just in general, just without conditions. 410 00:19:59,350 --> 00:20:02,560 >> And then when I have the probability of Gaga given "baby," we call it the 411 00:20:02,560 --> 00:20:06,160 posterior probability because it's the probability of having 412 00:20:06,160 --> 00:20:08,300 Gaga given some evidence. 413 00:20:08,300 --> 00:20:11,050 So I'm giving you the evidence that I saw the word "baby" in 414 00:20:11,050 --> 00:20:12,690 the song. Make sense? 415 00:20:12,690 --> 00:20:15,960 416 00:20:15,960 --> 00:20:16,410 OK. 417 00:20:16,410 --> 00:20:22,400 >> So if I calculate that for each of the songs for Lady Gaga, 418 00:20:22,400 --> 00:20:25,916 what that would be - 419 00:20:25,916 --> 00:20:27,730 apparently, I can't move this.
420 00:20:27,730 --> 00:20:31,850 421 00:20:31,850 --> 00:20:36,920 The probability of Gaga will be something like 2 over 24, times 1/2, 422 00:20:36,920 --> 00:20:38,260 over 2 over 53. 423 00:20:38,260 --> 00:20:40,640 It doesn't matter if you know where these numbers are coming from. 424 00:20:40,640 --> 00:20:44,750 But it's just a number that is going to be more than 0, right? 425 00:20:44,750 --> 00:20:48,610 >> And then when I do Katy Perry, the probability of "baby" given Katy is 426 00:20:48,610 --> 00:20:49,830 already 0, right? 427 00:20:49,830 --> 00:20:52,820 Because there's no "baby" in Katy Perry. 428 00:20:52,820 --> 00:20:56,360 So then this becomes 0, and Gaga wins, which means that Gaga is 429 00:20:56,360 --> 00:20:57,310 probably the singer. 430 00:20:57,310 --> 00:20:58,560 Does that make sense? 431 00:20:58,560 --> 00:21:00,700 432 00:21:00,700 --> 00:21:01,950 OK. 433 00:21:01,950 --> 00:21:04,160 434 00:21:04,160 --> 00:21:11,750 >> So if I want to make this more formal, I can actually do a model 435 00:21:11,750 --> 00:21:12,700 for multiple words. 436 00:21:12,700 --> 00:21:14,610 So let's say that I have something like, "baby, I am 437 00:21:14,610 --> 00:21:16,030 on fire," or something. 438 00:21:16,030 --> 00:21:17,760 So it has multiple words. 439 00:21:17,760 --> 00:21:20,880 And in this case, you can see that "baby" is in Gaga, 440 00:21:20,880 --> 00:21:21,710 but it's not in Katy. 441 00:21:21,710 --> 00:21:24,940 And "fire" is in Katy, but it's not in Gaga, right? 442 00:21:24,940 --> 00:21:27,200 So it's getting trickier, right? 443 00:21:27,200 --> 00:21:31,440 Because it seems that you almost have a tie between the two. 444 00:21:31,440 --> 00:21:36,980 >> So what you have to do is assume independence between the words.
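Going back to the single-word numbers quoted a moment ago, the arithmetic can be checked with exact fractions. This is only a sketch: reading 24 as Gaga's word count, 53 as the total word count, and 1/2 as "one of two songs" is an assumption, since the talk says not to worry about where the numbers come from.

```python
from fractions import Fraction

p_baby_given_gaga = Fraction(2, 24)  # "baby" seen twice among Gaga's 24 words
p_baby_given_katy = Fraction(0)      # no "baby" in the Katy Perry lyrics
p_artist = Fraction(1, 2)            # prior quoted in the talk for each artist
p_baby = Fraction(2, 53)             # assumed: "baby" twice among 53 words overall

gaga_score = p_baby_given_gaga * p_artist / p_baby
katy_score = p_baby_given_katy * p_artist / p_baby

print(gaga_score, katy_score)
```

With these inputs the Gaga value comes out as 53/48, slightly above 1, most likely because the quoted prior counts songs while the other two terms count words. It is safer to read the result as a score for comparing artists than as a calibrated probability; the comparison itself (Gaga above 0, Katy exactly 0) is what the talk relies on.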
445 00:21:36,980 --> 00:21:41,210 So basically what that means is that I'm just calculating what is the 446 00:21:41,210 --> 00:21:44,330 probability of seeing "baby," what is the probability of seeing "I," and 447 00:21:44,330 --> 00:21:46,670 "am," and "on," and "fire," all separately. 448 00:21:46,670 --> 00:21:48,670 Then I'm multiplying all of them. 449 00:21:48,670 --> 00:21:52,420 And I'm seeing what is the probability of seeing the whole sentence. 450 00:21:52,420 --> 00:21:55,210 Make sense? 451 00:21:55,210 --> 00:22:00,270 >> So basically, if I have just one word, what I want to find is the arg max, 452 00:22:00,270 --> 00:22:05,385 which means, what is the class that is giving me the highest probability? 453 00:22:05,385 --> 00:22:10,010 So what is the class that is giving me the highest probability for 454 00:22:10,010 --> 00:22:11,940 the probability of class given word. 455 00:22:11,940 --> 00:22:17,610 So in this case, Gaga given "baby." Or Katy given "baby." Make sense? 456 00:22:17,610 --> 00:22:21,040 >> And just from Bayes, that equation that I showed, 457 00:22:21,040 --> 00:22:24,780 we create this fraction. 458 00:22:24,780 --> 00:22:28,750 The only thing is that you see that the probability of the word given the 459 00:22:28,750 --> 00:22:31,370 class changes depending on the class, right? 460 00:22:31,370 --> 00:22:34,260 The number of "baby"s that I have in Gaga is different from Katy. 461 00:22:34,260 --> 00:22:37,640 The probability of the class also changes because it's just the number 462 00:22:37,640 --> 00:22:39,740 of songs each of them has. 463 00:22:39,740 --> 00:22:43,980 >> But the probability of the word itself is going to be the same for all the 464 00:22:43,980 --> 00:22:44,740 artists, right? 465 00:22:44,740 --> 00:22:47,150 So the probability of the word is just, what is the probability of 466 00:22:47,150 --> 00:22:49,820 seeing that word in the English language? 467 00:22:49,820 --> 00:22:51,420 So it's the same for all of them.
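The single-word decision rule described above can be written compactly. The symbols are chosen for this note and are not from the slides: c ranges over the artists (classes) and w is the observed word. The last step holds because P(w) does not depend on c, so it cannot change which class wins.

```latex
\hat{c} \;=\; \arg\max_{c}\, P(c \mid w)
        \;=\; \arg\max_{c}\, \frac{P(w \mid c)\, P(c)}{P(w)}
        \;=\; \arg\max_{c}\, P(w \mid c)\, P(c)
```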
468 00:22:51,420 --> 00:22:55,790 So since this is constant, we can just drop it and not care about it. 469 00:22:55,790 --> 00:23:00,230 So this will actually be the equation we are looking for. 470 00:23:00,230 --> 00:23:03,360 >> And if I have multiple words, I'm still going to have the prior 471 00:23:03,360 --> 00:23:04,610 probability here. 472 00:23:04,610 --> 00:23:06,980 The only thing is that I'm multiplying the probability of 473 00:23:06,980 --> 00:23:08,490 all the other words. 474 00:23:08,490 --> 00:23:10,110 So I'm multiplying all of them. 475 00:23:10,110 --> 00:23:12,610 Make sense? 476 00:23:12,610 --> 00:23:18,440 It looks weird but it basically means, calculate the prior of the class, and 477 00:23:18,440 --> 00:23:22,100 then multiply by the probability of each of the words being in that class. 478 00:23:22,100 --> 00:23:24,620 479 00:23:24,620 --> 00:23:29,150 >> And you know that the probability of a word given a class is going to be 480 00:23:29,150 --> 00:23:34,520 the number of times you see that word in that class, divided by the number of 481 00:23:34,520 --> 00:23:37,020 words you have in that class in general. 482 00:23:37,020 --> 00:23:37,990 Make sense? 483 00:23:37,990 --> 00:23:41,680 It's just how "baby" was 2 over the number of words that 484 00:23:41,680 --> 00:23:43,020 I had in the lyrics. 485 00:23:43,020 --> 00:23:45,130 So just the frequency. 486 00:23:45,130 --> 00:23:46,260 >> But there is one thing. 487 00:23:46,260 --> 00:23:51,250 Remember how I was showing that the probability of "baby" being lyrics 488 00:23:51,250 --> 00:23:56,350 from Katy Perry was 0 just because Katy Perry didn't have "baby" at all? 489 00:23:56,350 --> 00:24:04,900 But it sounds a little harsh to just simply say that lyrics cannot be from 490 00:24:04,900 --> 00:24:10,040 an artist just because they don't have that particular word at any time.
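A minimal sketch of the multi-word rule just described (the prior times the product of per-word frequencies), which also shows the "goes to 0" problem. The function name `score` and the lyrics are invented for illustration.

```python
from collections import Counter


def score(words, artist_lyrics, prior):
    """prior * product of count(word) / total words, with no smoothing."""
    tokens = artist_lyrics.split()
    counts = Counter(tokens)
    result = prior
    for word in words:
        result *= counts[word] / len(tokens)
    return result


katy = "fire fire fire fire"  # made-up lyrics, four words in total

print(score(["fire", "fire"], katy, 0.5))          # every word was seen
print(score(["fire", "fire", "baby"], katy, 0.5))  # one unseen word zeroes the product
```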
491 00:24:10,040 --> 00:24:13,330 >> So you could just say, well, if you don't have this word, I'm going to 492 00:24:13,330 --> 00:24:15,640 give you a lower probability, but I'm just not going to 493 00:24:15,640 --> 00:24:17,420 give you 0 right away. 494 00:24:17,420 --> 00:24:21,040 Because maybe it was something like, "Fire, fire, fire, fire," which is 495 00:24:21,040 --> 00:24:21,990 totally Katy Perry. 496 00:24:21,990 --> 00:24:26,060 And then "baby," and it just goes to 0 right away because there was one 497 00:24:26,060 --> 00:24:27,250 "baby." 498 00:24:27,250 --> 00:24:31,440 >> So basically what we do is something called Laplace smoothing. 499 00:24:31,440 --> 00:24:36,260 And this just means that I'm giving some probability even to the words 500 00:24:36,260 --> 00:24:37,850 that don't exist. 501 00:24:37,850 --> 00:24:43,170 So what I do is that when I'm calculating this, I always add 1 to the 502 00:24:43,170 --> 00:24:44,180 numerator. 503 00:24:44,180 --> 00:24:48,060 So even if the word doesn't exist, in this case, if this is 0, I'm still 504 00:24:48,060 --> 00:24:51,250 calculating this as 1 over the total number of words. 505 00:24:51,250 --> 00:24:55,060 Otherwise, I get how many times I have the word and I add 1. 506 00:24:55,060 --> 00:24:58,300 So I'm accounting for both cases. 507 00:24:58,300 --> 00:25:00,430 Make sense? 508 00:25:00,430 --> 00:25:03,060 >> So now let's do some coding. 509 00:25:03,060 --> 00:25:06,440 I'm going to have to do it pretty fast, but it's just important that you 510 00:25:06,440 --> 00:25:08,600 guys understand the concepts. 511 00:25:08,600 --> 00:25:13,450 So what we are trying to do is exactly implement this 512 00:25:13,450 --> 00:25:14,330 thing that I just said - 513 00:25:14,330 --> 00:25:19,110 I'm going to put in lyrics from Lady Gaga and Katy Perry.
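The add-one rule described above can be sketched as follows. `smoothed` follows the talk's description (add 1 to the numerator only). `smoothed_textbook` is the more common textbook form of Laplace smoothing, which also adds the vocabulary size to the denominator; it is included here as an assumption for comparison and is not something the talk shows. The lyrics are made up.

```python
from collections import Counter


def smoothed(word, tokens):
    """(count + 1) / total, as described in the talk."""
    return (Counter(tokens)[word] + 1) / len(tokens)


def smoothed_textbook(word, tokens, vocab_size):
    """Textbook add-one variant: (count + 1) / (total + vocab_size)."""
    return (Counter(tokens)[word] + 1) / (len(tokens) + vocab_size)


katy = "fire fire hot night".split()  # made-up lyrics, four words

print(smoothed("baby", katy))              # unseen word: 1 / 4
print(smoothed("fire", katy))              # seen twice: (2 + 1) / 4
print(smoothed_textbook("baby", katy, 4))  # vocabulary assumed: fire, hot, night, baby
```

Because the first version leaves the denominator unchanged, its values no longer sum to 1 across the vocabulary; the textbook variant restores that property, at the cost of needing a vocabulary size.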
514 00:25:19,110 --> 00:25:22,980 And the program is going to be able to say if these new lyrics are from Gaga 515 00:25:22,980 --> 00:25:24,170 or Katy Perry. 516 00:25:24,170 --> 00:25:25,800 Make sense? 517 00:25:25,800 --> 00:25:27,530 OK. 518 00:25:27,530 --> 00:25:30,710 >> So I have this program I'm going to call classify.py. 519 00:25:30,710 --> 00:25:31,970 So this is Python. 520 00:25:31,970 --> 00:25:34,210 It's a new programming language. 521 00:25:34,210 --> 00:25:38,020 It's very similar in some ways to C and PHP. 522 00:25:38,020 --> 00:25:43,180 It's similar because if you want to learn Python after knowing C, it's 523 00:25:43,180 --> 00:25:46,270 really not that much of a challenge, just because Python is much easier 524 00:25:46,270 --> 00:25:47,520 than C, first of all. 525 00:25:47,520 --> 00:25:49,370 And a lot of things are already implemented for you. 526 00:25:49,370 --> 00:25:56,820 So just like how PHP has functions that sort a list, or append something to 527 00:25:56,820 --> 00:25:58,780 an array, or blah, blah, blah. 528 00:25:58,780 --> 00:26:00,690 Python has all of those as well. 529 00:26:00,690 --> 00:26:05,960 >> So I'm just going to explain quickly how we could do the classification 530 00:26:05,960 --> 00:26:07,860 problem here. 531 00:26:07,860 --> 00:26:13,230 So let's say that in this case, I have lyrics from Gaga and Katy Perry. 532 00:26:13,230 --> 00:26:21,880 The way that I have those lyrics is that the first word of the lyrics is 533 00:26:21,880 --> 00:26:25,250 the name of the artist, and the rest is the lyrics. 534 00:26:25,250 --> 00:26:29,470 So let's say that I have this list in which the first one is lyrics by Gaga. 535 00:26:29,470 --> 00:26:31,930 So here, "I'm on the right track." 536 00:26:31,930 --> 00:26:35,270 And the next one is Katy, and it also has the lyrics. 537 00:26:35,270 --> 00:26:38,040 >> So this is how you declare a variable in Python.
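The data layout described above (a list of strings, each starting with the artist's name) might look like the sketch below. The lyrics are placeholders written for this note, not the ones used in the talk.

```python
# Placeholder lyrics; the first word of each string is the artist label.
lyrics = [
    "Gaga baby baby dance with me tonight",
    "Katy fire fire burning like a firework",
]

for line in lyrics:
    artist, *words = line.split()  # label first, lyric words after it
    print(artist, len(words))
```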
538 00:26:38,040 --> 00:26:40,200 You don't have to give the data type. 539 00:26:40,200 --> 00:26:43,150 You just write "lyrics," kind of like in PHP. 540 00:26:43,150 --> 00:26:44,890 Make sense? 541 00:26:44,890 --> 00:26:47,770 >> So what are the things that I have to calculate to be able to calculate the 542 00:26:47,770 --> 00:26:49,360 probabilities? 543 00:26:49,360 --> 00:26:55,110 I have to calculate the "priors" of each of the different 544 00:26:55,110 --> 00:26:56,710 classes that I have. 545 00:26:56,710 --> 00:27:06,680 I have to calculate the "posteriors," or pretty much the probabilities of 546 00:27:06,680 --> 00:27:12,150 each of the different words that I can have for each artist. 547 00:27:12,150 --> 00:27:17,210 So within Gaga, for example, I'm going to have a list of how many times I see 548 00:27:17,210 --> 00:27:19,250 each of the words. 549 00:27:19,250 --> 00:27:20,760 Make sense? 550 00:27:20,760 --> 00:27:25,370 >> And finally, I'm just going to have a list called "words" that is just going 551 00:27:25,370 --> 00:27:29,780 to have how many words I have for each artist. 552 00:27:29,780 --> 00:27:33,760 So for Gaga, for example, when I look at the lyrics, I had, I think, 24 553 00:27:33,760 --> 00:27:34,750 words in total. 554 00:27:34,750 --> 00:27:38,970 So this list is just going to have Gaga 24, and Katy another number. 555 00:27:38,970 --> 00:27:40,130 Make sense? 556 00:27:40,130 --> 00:27:40,560 OK. 557 00:27:40,560 --> 00:27:42,530 >> So now, actually, let's go to the coding. 558 00:27:42,530 --> 00:27:45,270 So in Python, you can actually return a bunch of different 559 00:27:45,270 --> 00:27:46,630 things from a function. 560 00:27:46,630 --> 00:27:50,810 So I'm going to create this function called "conditional," which is going 561 00:27:50,810 --> 00:27:53,890 to return all of those things, "priors," "probabilities," and 562 00:27:53,890 --> 00:28:05,690 "words."
So, "conditional," and it's going to be called on "lyrics." 563 00:28:05,690 --> 00:28:11,510 >> So now I actually have to write this function. 564 00:28:11,510 --> 00:28:17,750 So the way that I can write this function is I just define this 565 00:28:17,750 --> 00:28:20,620 function with "def." So I do "def conditional," and it's taking 566 00:28:20,620 --> 00:28:28,700 "lyrics." And what this is going to do is, first of all, I have my priors 567 00:28:28,700 --> 00:28:31,030 that I want to calculate. 568 00:28:31,030 --> 00:28:34,330 >> So the way that I can do this is create a dictionary in Python, which 569 00:28:34,330 --> 00:28:37,320 is pretty much the same thing as a hash table, or it's like an associative 570 00:28:37,320 --> 00:28:40,480 array in PHP. 571 00:28:40,480 --> 00:28:44,150 This is how I declare a dictionary. 572 00:28:44,150 --> 00:28:53,580 And basically what this means is that priors of Gaga is 0.5, for example, if 573 00:28:53,580 --> 00:28:57,200 50% of the lyrics are from Gaga, 50% are from Katy. 574 00:28:57,200 --> 00:28:58,450 Make sense? 575 00:28:58,450 --> 00:29:00,680 576 00:29:00,680 --> 00:29:03,680 So I have to figure out how to calculate the priors. 577 00:29:03,680 --> 00:29:07,120 >> The next ones that I have to do, also, are the probabilities and the words. 578 00:29:07,120 --> 00:29:17,100 So the probabilities of Gaga is the list of all the probabilities that I 579 00:29:17,100 --> 00:29:19,160 have for each of the words for Gaga. 580 00:29:19,160 --> 00:29:23,880 So if I go to probabilities of Gaga "baby," for example, it'll give me 581 00:29:23,880 --> 00:29:28,750 something like 2 over 24 in that case. 582 00:29:28,750 --> 00:29:30,070 Make sense? 583 00:29:30,070 --> 00:29:36,120 So I go to "probabilities," go to the "Gaga" bucket that has a list of all 584 00:29:36,120 --> 00:29:40,550 the Gaga words, then I go to "baby," and I see the probability.
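To make the three structures concrete, here is a sketch of what they could hold. All values are hypothetical: 0.5 and 24 echo numbers mentioned in the talk, while the word counts and Katy's total of 29 are assumptions made only so the example is complete. The inner dictionaries store raw counts, and dividing by the artist's total gives the "2 over 24" style probability.

```python
# Hypothetical contents, shaped like the dictionaries described above.
priors = {"Gaga": 0.5, "Katy": 0.5}          # share of songs per artist
probabilities = {                            # per-artist word counts
    "Gaga": {"baby": 2, "dance": 1},
    "Katy": {"fire": 3},
}
words = {"Gaga": 24, "Katy": 29}             # total lyric words per artist

# Go to the "Gaga" bucket, then to "baby", then divide by Gaga's total.
p_baby_given_gaga = probabilities["Gaga"]["baby"] / words["Gaga"]
print(p_baby_given_gaga)
```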
585 00:29:40,550 --> 00:29:45,940 >> And finally I have this "words" dictionary. 586 00:29:45,940 --> 00:29:53,620 So here, "probabilities." And then "words." So if I do "words," "Gaga," 587 00:29:53,620 --> 00:29:58,330 what's going to happen is that it's going to give me 24, saying that I 588 00:29:58,330 --> 00:30:01,990 have 24 words within the lyrics from Gaga. 589 00:30:01,990 --> 00:30:04,110 Makes sense? 590 00:30:04,110 --> 00:30:07,070 So here, "words" equals dah-dah-dah. 591 00:30:07,070 --> 00:30:07,620 OK 592 00:30:07,620 --> 00:30:12,210 >> So what I'm going to do is I'm going to iterate over each of the lyrics, so 593 00:30:12,210 --> 00:30:14,490 each of the strings that I have in the list. 594 00:30:14,490 --> 00:30:18,040 And I'm going to calculate those things for each of the candidates. 595 00:30:18,040 --> 00:30:19,950 Makes sense? 596 00:30:19,950 --> 00:30:21,700 So I have to do a for loop. 597 00:30:21,700 --> 00:30:26,300 >> So in Python what I can do is "for line in lyrics." The same thing as a 598 00:30:26,300 --> 00:30:28,000 "for each" statement in PHP. 599 00:30:28,000 --> 00:30:33,420 Remember how if it was PHP I could say "for each lyrics as 600 00:30:33,420 --> 00:30:35,220 line." Makes sense? 601 00:30:35,220 --> 00:30:38,900 So I'm taking each of the lines, in this case, this string and the next 602 00:30:38,900 --> 00:30:44,540 string, so for each of the lines what I'm going to do is first, I'm going to 603 00:30:44,540 --> 00:30:49,150 split this line into a list of words separated by spaces. 604 00:30:49,150 --> 00:30:53,730 >> So the cool thing about Python is that you could just Google like "how do I 605 00:30:53,730 --> 00:30:58,220 split a string into words?" And it's going to tell you how to do it.
606 00:30:58,220 --> 00:31:04,890 And the way to do it, it's just "line = line.split()" and it's basically 607 00:31:04,890 --> 00:31:08,640 going to give you a list with each of the words here. 608 00:31:08,640 --> 00:31:09,620 Makes sense? 609 00:31:09,620 --> 00:31:15,870 So now that I did that I want to know who is the singer of that song. 610 00:31:15,870 --> 00:31:20,130 And to do that I just have to get the first element of the array, right? 611 00:31:20,130 --> 00:31:26,390 So I can just say "singer = line[0]". Makes sense? 612 00:31:26,390 --> 00:31:32,010 >> And then what I need to do is, first of all, I'm going to update how many 613 00:31:32,010 --> 00:31:36,130 words I have under "Gaga." So I'm just going to calculate how many words I 614 00:31:36,130 --> 00:31:38,690 have in this list, right? 615 00:31:38,690 --> 00:31:41,910 Because this is how many words I have in the lyrics and I'm just going to 616 00:31:41,910 --> 00:31:44,120 add it to the "Gaga" array. 617 00:31:44,120 --> 00:31:47,090 Does that make sense? 618 00:31:47,090 --> 00:31:49,010 Don't focus too much on the syntax. 619 00:31:49,010 --> 00:31:50,430 Think more about the concepts. 620 00:31:50,430 --> 00:31:52,400 That's the most important part. 621 00:31:52,400 --> 00:31:52,720 OK. 622 00:31:52,720 --> 00:32:00,260 >> So what I can do is if "Gaga" is already in that list, so "if singer in 623 00:32:00,260 --> 00:32:03,190 words" which means that I already have words by Gaga. 624 00:32:03,190 --> 00:32:06,640 I just want to add the additional words to that. 625 00:32:06,640 --> 00:32:15,810 So what I do is "words[singer] += len(line) - 1". 626 00:32:15,810 --> 00:32:18,250 And then I can just do the length of the line. 627 00:32:18,250 --> 00:32:21,860 So how many elements I have in the array.
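The splitting step described above, as a small runnable sketch. The example line is made up; `word_count` is a name introduced here for illustration.

```python
line = "Gaga baby baby dance with me tonight"  # made-up example line

line = line.split()          # list of the space-separated words
singer = line[0]             # first element is the artist label
word_count = len(line) - 1   # everything after the label counts as lyrics

print(singer, word_count)
```

Note that Python indexes lists with square brackets, so the first element is `line[0]`.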
628 00:32:21,860 --> 00:32:27,060 And I have to do minus 1 just because the first element of the array is just 629 00:32:27,060 --> 00:32:29,180 the singer and that is not lyrics. 630 00:32:29,180 --> 00:32:31,420 Makes sense? 631 00:32:31,420 --> 00:32:32,780 OK. 632 00:32:32,780 --> 00:32:35,820 >> "Else," it means that I want to actually insert Gaga into the list. 633 00:32:35,820 --> 00:32:45,990 So I just do "words[singer] = len(line) - 1," sorry. 634 00:32:45,990 --> 00:32:49,200 So the only difference between the two lines is that this one, it doesn't 635 00:32:49,200 --> 00:32:51,080 exist yet, so I'm just initializing it. 636 00:32:51,080 --> 00:32:53,820 This one I'm actually adding. 637 00:32:53,820 --> 00:32:55,570 OK. 638 00:32:55,570 --> 00:32:59,480 So this was adding to words. 639 00:32:59,480 --> 00:33:03,040 >> Now I want to add to the priors. 640 00:33:03,040 --> 00:33:05,480 So how do I calculate the priors? 641 00:33:05,480 --> 00:33:11,580 The priors can be calculated by how many times. 642 00:33:11,580 --> 00:33:15,340 So how many times you see that singer among all the singers that you 643 00:33:15,340 --> 00:33:16,380 have, right? 644 00:33:16,380 --> 00:33:18,810 So for Gaga and Katy Perry, in this case, I see Gaga 645 00:33:18,810 --> 00:33:20,570 once, Katy Perry once. 646 00:33:20,570 --> 00:33:23,320 >> So basically the priors for Gaga and for Katy Perry would 647 00:33:23,320 --> 00:33:24,390 just be one, right? 648 00:33:24,390 --> 00:33:26,500 You just count how many times I see the artist. 649 00:33:26,500 --> 00:33:28,740 So this is very easy to calculate. 650 00:33:28,740 --> 00:33:34,100 I can just do something similar, like "if singer in priors," I'm just going 651 00:33:34,100 --> 00:33:38,970 to add 1 to their priors box.
652 00:33:38,970 --> 00:33:51,000 So, "priors[singer] += 1" and then "else" I'm going to do "priors[singer] 653 00:33:51,000 --> 00:33:55,000 = 1." Makes sense? 654 00:33:55,000 --> 00:34:00,080 >> So if it doesn't exist I just set it to 1, otherwise I just add 1. 655 00:34:00,080 --> 00:34:11,280 OK, so now all that I have left to do is also add each of the words to the 656 00:34:11,280 --> 00:34:12,290 probabilities. 657 00:34:12,290 --> 00:34:14,889 So I have to count how many times I see each of the words. 658 00:34:14,889 --> 00:34:18,780 So I just have to do another for loop in the line. 659 00:34:18,780 --> 00:34:25,190 >> So the first thing that I'm going to do is check if the singer already has a 660 00:34:25,190 --> 00:34:26,969 probabilities array. 661 00:34:26,969 --> 00:34:31,739 So I'm checking if the singer doesn't have a probabilities array, I'm just 662 00:34:31,739 --> 00:34:34,480 going to initialize one for them. 663 00:34:34,480 --> 00:34:36,400 It's not even an array, sorry, it's a dictionary. 664 00:34:36,400 --> 00:34:43,080 So the probabilities of the singer is going to be an empty dictionary, so I'm 665 00:34:43,080 --> 00:34:45,830 just initializing a dictionary for it. 666 00:34:45,830 --> 00:34:46,820 OK? 667 00:34:46,820 --> 00:34:58,330 >> And now I can actually do a for loop to calculate each of the words' 668 00:34:58,330 --> 00:35:00,604 probabilities. 669 00:35:00,604 --> 00:35:01,540 OK. 670 00:35:01,540 --> 00:35:04,160 So what I can do is a for loop. 671 00:35:04,160 --> 00:35:06,590 So I'm just going to iterate over the array. 672 00:35:06,590 --> 00:35:15,320 So the way that I can do that in Python is "for i in range." From 1 673 00:35:15,320 --> 00:35:19,200 because I want to start at the second element because the first one is the 674 00:35:19,200 --> 00:35:20,260 singer name.
675 00:35:20,260 --> 00:35:24,990 So from one up to the length of the line. 676 00:35:24,990 --> 00:35:29,760 And when I do range it actually goes, like here, from 1 to len of the 677 00:35:29,760 --> 00:35:30,740 line minus 1. 678 00:35:30,740 --> 00:35:33,810 So it already does that thing of doing n minus 1 for arrays, which is very 679 00:35:33,810 --> 00:35:35,500 convenient. 680 00:35:35,500 --> 00:35:37,850 Makes sense? 681 00:35:37,850 --> 00:35:42,770 >> So for each of these, what I'm going to do is, just like in the other one, 682 00:35:42,770 --> 00:35:50,320 I'm going to check if the word in this position in the line is already in 683 00:35:50,320 --> 00:35:51,570 probabilities. 684 00:35:51,570 --> 00:35:53,400 685 00:35:53,400 --> 00:35:57,260 And then as I said here, probabilities words, as in I put 686 00:35:57,260 --> 00:35:58,400 "probabilities[singer]". 687 00:35:58,400 --> 00:35:59,390 And the name of the singer. 688 00:35:59,390 --> 00:36:03,450 So if it's already in "probabilities[singer]", it means that I 689 00:36:03,450 --> 00:36:11,960 want to add 1 to it, so I'm going to do "probabilities[singer]", and 690 00:36:11,960 --> 00:36:14,100 the word is called "line[i]". 691 00:36:14,100 --> 00:36:22,630 I'm going to add 1 and "else" I'm just going to initialize it to 1. 692 00:36:22,630 --> 00:36:23,880 "line[i]". 693 00:36:23,880 --> 00:36:26,920 694 00:36:26,920 --> 00:36:28,420 Makes sense? 695 00:36:28,420 --> 00:36:30,180 >> So, I calculated all of the arrays. 696 00:36:30,180 --> 00:36:36,580 So, now all that I have to do for this one is just "return priors, 697 00:36:36,580 --> 00:36:43,230 probabilities and words." Let's see if there are any, OK. 698 00:36:43,230 --> 00:36:45,690 It seems everything is working so far. 699 00:36:45,690 --> 00:36:46,900 So, that makes sense?
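Putting the steps narrated so far together, the training function might look roughly like the sketch below. It is a reconstruction from the spoken description, not the code posted with the talk. One deliberate difference: the if/else pairs are collapsed with `dict.get` and `dict.setdefault`, which behave the same way for this purpose. The sample lyrics are placeholders.

```python
def conditional(lyrics):
    """Collect per-artist song counts, word counts, and word totals."""
    priors, probabilities, words = {}, {}, {}
    for line in lyrics:
        line = line.split()
        singer = line[0]
        # Total number of lyric words for this singer (label excluded).
        words[singer] = words.get(singer, 0) + len(line) - 1
        # Number of songs seen for this singer.
        priors[singer] = priors.get(singer, 0) + 1
        # Per-word counts, starting at index 1 to skip the singer name.
        counts = probabilities.setdefault(singer, {})
        for i in range(1, len(line)):
            counts[line[i]] = counts.get(line[i], 0) + 1
    return priors, probabilities, words


# Placeholder lyrics; the first word of each string is the artist label.
lyrics = [
    "Gaga baby baby dance with me tonight",
    "Katy fire fire burning like a firework",
]
priors, probabilities, words = conditional(lyrics)
print(priors, words, probabilities["Gaga"]["baby"])
```

As in the talk, the priors here are raw song counts rather than fractions; with one song per artist that does not affect which artist scores higher.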
700 00:36:46,900 --> 00:36:47,750 In some way? 701 00:36:47,750 --> 00:36:49,280 OK. 702 00:36:49,280 --> 00:36:51,980 So now I have all the probabilities. 703 00:36:51,980 --> 00:36:55,100 So now the only thing I have left is just to have that thing that 704 00:36:55,100 --> 00:36:58,650 calculates the product of all the probabilities when I get the lyrics. 705 00:36:58,650 --> 00:37:06,270 >> So let's say that I want to now call this function "classify()" and 706 00:37:06,270 --> 00:37:08,880 the thing that function takes is just an argument. 707 00:37:08,880 --> 00:37:13,170 Let's say "Baby, I am on fire" and it's going to figure out what is the 708 00:37:13,170 --> 00:37:14,490 probability that this is Gaga? 709 00:37:14,490 --> 00:37:16,405 What is the probability that this is Katy? 710 00:37:16,405 --> 00:37:19,690 Sounds good? 711 00:37:19,690 --> 00:37:25,750 So I'm just going to have to create a new function called "classify()" and 712 00:37:25,750 --> 00:37:29,180 it's going to take some lyrics as well. 713 00:37:29,180 --> 00:37:31,790 714 00:37:31,790 --> 00:37:36,160 And besides the lyrics I also have to send the priors, the 715 00:37:36,160 --> 00:37:37,700 probabilities and the words. 716 00:37:37,700 --> 00:37:44,000 So I'm going to send lyrics, priors, probabilities, words. 717 00:37:44,000 --> 00:37:51,840 >> So this is taking lyrics, priors, probabilities, words. 718 00:37:51,840 --> 00:37:53,530 So, what is it doing? 719 00:37:53,530 --> 00:37:57,180 It's basically going to go through all the possible candidates that you 720 00:37:57,180 --> 00:37:58,510 have as a singer. 721 00:37:58,510 --> 00:37:59,425 And where are those candidates? 722 00:37:59,425 --> 00:38:01,020 They're in the priors, right? 723 00:38:01,020 --> 00:38:02,710 So I have all of those there. 724 00:38:02,710 --> 00:38:07,870 So I'm going to have a dictionary of all the possible candidates.
725 00:38:07,870 --> 00:38:14,220 And then for each candidate in the priors, so that means that it's going 726 00:38:14,220 --> 00:38:17,740 to be Gaga, Katy; if I had more it would be more. 727 00:38:17,740 --> 00:38:20,410 I'm going to start calculating this probability. 728 00:38:20,410 --> 00:38:28,310 The probability as we saw in the PowerPoint is the prior times the 729 00:38:28,310 --> 00:38:30,800 product of each of the other probabilities. 730 00:38:30,800 --> 00:38:32,520 >> So I can do the same here. 731 00:38:32,520 --> 00:38:36,330 I can just do probability is initially just the prior. 732 00:38:36,330 --> 00:38:40,340 So priors of the candidate. 733 00:38:40,340 --> 00:38:40,870 Right? 734 00:38:40,870 --> 00:38:45,360 And now I have to iterate over all the words that I have in the lyrics to be 735 00:38:45,360 --> 00:38:48,820 able to add the probability for each of them, OK? 736 00:38:48,820 --> 00:38:57,900 So, "for word in lyrics" what I'm going to do is, if the word is in 737 00:38:57,900 --> 00:39:01,640 "probabilities[candidate]", which means that it's a word that the 738 00:39:01,640 --> 00:39:03,640 candidate has in their lyrics - 739 00:39:03,640 --> 00:39:05,940 for example, "baby" for Gaga - 740 00:39:05,940 --> 00:39:11,710 what I'm going to do is that the probability is going to be multiplied 741 00:39:11,710 --> 00:39:22,420 by 1 plus the probabilities of the candidate for that word. 742 00:39:22,420 --> 00:39:25,710 And it's called "word". 743 00:39:25,710 --> 00:39:32,440 This divided by the number of words that I have for that candidate. 744 00:39:32,440 --> 00:39:37,450 The total number of words that I have for the singer that I'm looking at. 745 00:39:37,450 --> 00:39:40,290 >> "Else." it means it's a new word, so it would be like, for example, 746 00:39:40,290 --> 00:39:41,860 "fire" for Lady Gaga. 747 00:39:41,860 --> 00:39:45,760 So I just want to do 1 over "words[candidate]". 748 00:39:45,760 --> 00:39:47,710 So I don't want to put this term here.
749 00:39:47,710 --> 00:39:50,010 >> So it's going to be basically copying and pasting this. 750 00:39:50,010 --> 00:39:54,380 751 00:39:54,380 --> 00:39:56,000 But I'm going to delete this part. 752 00:39:56,000 --> 00:39:57,610 So it's just going to be 1 over that. 753 00:39:57,610 --> 00:40:00,900 754 00:40:00,900 --> 00:40:02,150 Sounds good? 755 00:40:02,150 --> 00:40:03,980 756 00:40:03,980 --> 00:40:09,700 And now at the end, I'm just going to print the name of the candidate and 757 00:40:09,700 --> 00:40:15,750 the probability that you have of these being their lyrics. 758 00:40:15,750 --> 00:40:16,200 Makes sense? 759 00:40:16,200 --> 00:40:18,390 And I actually don't even need this dictionary. 760 00:40:18,390 --> 00:40:19,510 Makes sense? 761 00:40:19,510 --> 00:40:21,810 >> So, let's see if this actually works. 762 00:40:21,810 --> 00:40:24,880 So if I run this, it didn't work. 763 00:40:24,880 --> 00:40:26,130 Wait one second. 764 00:40:26,130 --> 00:40:28,870 765 00:40:28,870 --> 00:40:31,720 "words[candidate]", "words[candidate]", that's 766 00:40:31,720 --> 00:40:33,750 the name of the array. 767 00:40:33,750 --> 00:40:41,435 OK So, it says there is some bug for candidate in priors. 768 00:40:41,435 --> 00:40:46,300 769 00:40:46,300 --> 00:40:48,760 Let me just chill a little. 770 00:40:48,760 --> 00:40:50,360 OK. 771 00:40:50,360 --> 00:40:51,305 Let's try. 772 00:40:51,305 --> 00:40:51,720 OK. 773 00:40:51,720 --> 00:40:58,710 >> So it gives Katy Perry has this probability of this times 10 to the 774 00:40:58,710 --> 00:41:02,200 minus 7, and Gaga has this times 10 to the minus 6. 775 00:41:02,200 --> 00:41:05,610 So you see it shows that Gaga has a higher probability. 776 00:41:05,610 --> 00:41:09,260 So "Baby, I'm on Fire" is probably a Gaga song. 777 00:41:09,260 --> 00:41:10,580 Makes sense? 778 00:41:10,580 --> 00:41:12,030 So this is what we did.
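A sketch of the classification step as narrated: start from the prior, then multiply by (1 + count) over the candidate's word total for each word, which reduces to 1 over the total for unseen words. This is a reconstruction, not the posted code. Two choices here are assumptions: the function returns the scores instead of printing them, and it splits the input string into words itself. The counts below are hypothetical stand-ins for the output of the training step.

```python
def classify(lyrics, priors, probabilities, words):
    """Return {candidate: score}; a higher score means a more likely singer."""
    scores = {}
    for candidate in priors:
        probability = priors[candidate]
        for word in lyrics.split():
            # Count is 0 for unseen words, so the factor becomes 1 / total.
            count = probabilities[candidate].get(word, 0)
            probability *= (1 + count) / words[candidate]
        scores[candidate] = probability
    return scores


# Hypothetical counts standing in for what the training step would return.
priors = {"Gaga": 1, "Katy": 1}
probabilities = {
    "Gaga": {"baby": 2, "dance": 1},
    "Katy": {"fire": 1, "burning": 1},
}
words = {"Gaga": 6, "Katy": 6}

scores = classify("baby baby fire", priors, probabilities, words)
print(max(scores, key=scores.get))
```

For longer lyrics the product of many small factors gets very small, which matches the 10 to the minus 6 and minus 7 magnitudes seen in the demo.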
779 00:41:12,030 --> 00:41:16,010 >> This code is going to be posted online, so you guys can check it out. 780 00:41:16,010 --> 00:41:20,720 Maybe use some of it if you want to do a project or something similar. 781 00:41:20,720 --> 00:41:22,150 OK. 782 00:41:22,150 --> 00:41:25,930 This was just to show what computational 783 00:41:25,930 --> 00:41:27,230 linguistics code looks like. 784 00:41:27,230 --> 00:41:33,040 But now let's go to more high level stuff. 785 00:41:33,040 --> 00:41:33,340 OK. 786 00:41:33,340 --> 00:41:35,150 >> So the other problems I was talking about - 787 00:41:35,150 --> 00:41:37,550 the segmentation problem is the first of them. 788 00:41:37,550 --> 00:41:40,820 So you have here Japanese. 789 00:41:40,820 --> 00:41:43,420 And then you see that there are no spaces. 790 00:41:43,420 --> 00:41:49,110 So this basically means that it's the top of the chair, right? 791 00:41:49,110 --> 00:41:50,550 You speak Japanese? 792 00:41:50,550 --> 00:41:52,840 It's the top of the chair, right? 793 00:41:52,840 --> 00:41:54,480 >> STUDENT: I don't know what the [? kanji ?] over there is. 794 00:41:54,480 --> 00:41:57,010 >> LUCAS Freitas: It's [SPEAKING JAPANESE] 795 00:41:57,010 --> 00:41:57,950 OK. 796 00:41:57,950 --> 00:42:00,960 So it basically means chair of top. 797 00:42:00,960 --> 00:42:03,620 So if you had to put a space it would be here. 798 00:42:03,620 --> 00:42:05,970 And then you have [? Ueda-san. ?] 799 00:42:05,970 --> 00:42:09,040 Which basically means Mr. Ueda. 800 00:42:09,040 --> 00:42:13,180 And you see that "Ueda" and you have a space and then "san." So you see that 801 00:42:13,180 --> 00:42:15,470 here "Ue" is like by itself. 802 00:42:15,470 --> 00:42:17,750 And here it has a character next to it. 803 00:42:17,750 --> 00:42:21,720 >> So it's not like in those languages each character means a word so you 804 00:42:21,720 --> 00:42:23,980 just put a lot of spaces.
805 00:42:23,980 --> 00:42:25,500 Characters relate to each other. 806 00:42:25,500 --> 00:42:28,680 And they can be together in groups of two, three, or one. 807 00:42:28,680 --> 00:42:34,520 So you actually have to create some kind of way of putting in those spaces. 808 00:42:34,520 --> 00:42:38,850 >> And the thing is that whenever you get data from those Asian languages, 809 00:42:38,850 --> 00:42:40,580 everything comes unsegmented. 810 00:42:40,580 --> 00:42:45,940 Because no one who writes Japanese or Chinese writes with spaces. 811 00:42:45,940 --> 00:42:48,200 Whenever you're writing Chinese or Japanese, you just write everything 812 00:42:48,200 --> 00:42:48,710 with no spaces. 813 00:42:48,710 --> 00:42:52,060 It doesn't even make sense to put in spaces. 814 00:42:52,060 --> 00:42:57,960 So then when you get data from some East Asian language, if you want to 815 00:42:57,960 --> 00:43:00,760 actually do something with it, you have to segment it first. 816 00:43:00,760 --> 00:43:05,130 >> Think of doing the lyrics example without spaces. 817 00:43:05,130 --> 00:43:07,950 So the only units that you have would be sentences, right? 818 00:43:07,950 --> 00:43:09,470 Separated by periods. 819 00:43:09,470 --> 00:43:13,930 But then having just the sentences will not really help with telling you 820 00:43:13,930 --> 00:43:17,760 who those lyrics are by. 821 00:43:17,760 --> 00:43:18,120 Right? 822 00:43:18,120 --> 00:43:20,010 So you have to put in the spaces first. 823 00:43:20,010 --> 00:43:21,990 So how can you do that? 824 00:43:21,990 --> 00:43:24,920 >> So then comes the idea of a language model, which is something really 825 00:43:24,920 --> 00:43:26,870 important for computational linguistics. 826 00:43:26,870 --> 00:43:32,790 So a language model is basically a table of probabilities that shows, 827 00:43:32,790 --> 00:43:36,260 first of all, what is the probability of having a given word in a language? 828 00:43:36,260 --> 00:43:39,590 So it shows how frequent a word is.
829 00:43:39,590 --> 00:43:43,130 And then it also shows the relation between the words in a sentence. 830 00:43:43,130 --> 00:43:51,500 >> So the main idea is, if a stranger came to you and said a sentence to 831 00:43:51,500 --> 00:43:55,600 you, what is the probability that, for example, "this is my sister [? GTF ?]" 832 00:43:55,600 --> 00:43:57,480 was the sentence that the person said? 833 00:43:57,480 --> 00:44:00,380 So obviously some sentences are more common than others. 834 00:44:00,380 --> 00:44:04,450 For example, "good morning," or "good night," or "hey there," is much more 835 00:44:04,450 --> 00:44:08,260 common than most sentences that we have in English. 836 00:44:08,260 --> 00:44:11,060 So why are those sentences more frequent? 837 00:44:11,060 --> 00:44:14,060 >> First of all, it's because you have words that are more frequent. 838 00:44:14,060 --> 00:44:20,180 So, for example, if you take "the dog is big" and "the dog is gigantic," you 839 00:44:20,180 --> 00:44:23,880 probably hear "the dog is big" more often, because "big" is more 840 00:44:23,880 --> 00:44:27,260 frequent in English than "gigantic." So one of the 841 00:44:27,260 --> 00:44:30,100 things is word frequency. 842 00:44:30,100 --> 00:44:34,490 >> The second thing that is really important is just 843 00:44:34,490 --> 00:44:35,490 the order of the words. 844 00:44:35,490 --> 00:44:39,500 So it's common to say "the cat is inside the box," but you don't usually 845 00:44:39,500 --> 00:44:44,250 see "the box inside is cat." So you see that there is some importance 846 00:44:44,250 --> 00:44:46,030 in the order of the words. 847 00:44:46,030 --> 00:44:50,160 You can't just say that those two sentences have the same probability 848 00:44:50,160 --> 00:44:53,010 just because they have the same words. 849 00:44:53,010 --> 00:44:55,550 You actually have to care about the order as well. 850 00:44:55,550 --> 00:44:57,650 Make sense?
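[Editor's sketch, not part of the talk.] The two ingredients named above, word frequency and word order, can be illustrated with a model that only knows frequencies. The corpus below is invented for illustration, and the scrambled sentence was chosen to reuse exactly the same words as the ordered one. A frequency-only (unigram) model prefers "big" over "gigantic," but it cannot tell the ordered sentence from the scrambled one, which is the gap that word-order information has to fill.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration only.
corpus = [
    "the dog is big",
    "the cat is inside the box",
    "the dog is inside the box",
    "the cat is big",
    "the dog is gigantic",
]

tokens = [word for sentence in corpus for word in sentence.split()]
counts = Counter(tokens)
total = len(tokens)


def unigram_probability(sentence):
    """Multiply the relative frequency of each word, ignoring word order."""
    probability = 1.0
    for word in sentence.split():
        probability *= counts[word] / total
    return probability


p_big = unigram_probability("the dog is big")
p_gigantic = unigram_probability("the dog is gigantic")

# Same words, different order: a frequency-only model cannot tell them apart.
p_ordered = unigram_probability("the cat is inside the box")
p_scrambled = unigram_probability("box inside the is cat the")
same_score = math.isclose(p_ordered, p_scrambled)
```

Here `p_big` is larger than `p_gigantic` because "big" occurs more often in the toy corpus, while `same_score` is true because the model never looks at neighbouring words.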
851 00:44:57,650 --> 00:44:59,490 >> So what do we do? 852 00:44:59,490 --> 00:45:01,550 So what am I trying to get you to? 853 00:45:01,550 --> 00:45:04,400 I'm trying to get you to what we call n-gram models. 854 00:45:04,400 --> 00:45:09,095 So n-gram models basically assume that, for each word that 855 00:45:09,095 --> 00:45:10,960 you have in a sentence, 856 00:45:10,960 --> 00:45:15,020 the probability of having that word there depends not only on the 857 00:45:15,020 --> 00:45:18,395 frequency of that word in the language, but also on the words that 858 00:45:18,395 --> 00:45:19,860 are surrounding it. 859 00:45:19,860 --> 00:45:25,810 >> So for example, usually when you see something like "on" or "at," you're 860 00:45:25,810 --> 00:45:28,040 probably going to see a noun after it, right? 861 00:45:28,040 --> 00:45:31,750 Because when you have a preposition, it usually takes a noun after it. 862 00:45:31,750 --> 00:45:35,540 Or if you have a verb that is transitive, you're usually going to 863 00:45:35,540 --> 00:45:36,630 have a noun phrase. 864 00:45:36,630 --> 00:45:38,780 So it's going to have a noun somewhere around it. 865 00:45:38,780 --> 00:45:44,950 >> So, basically, what it does is that it considers the probability of having 866 00:45:44,950 --> 00:45:47,960 words next to each other when you're calculating 867 00:45:47,960 --> 00:45:49,050 the probability of a sentence. 868 00:45:49,050 --> 00:45:50,960 And that's basically what a language model is. 869 00:45:50,960 --> 00:45:54,620 It just says what is the probability of having a specific 870 00:45:54,620 --> 00:45:57,120 sentence in a language? 871 00:45:57,120 --> 00:45:59,110 So why is that useful, basically? 872 00:45:59,110 --> 00:46:02,390 And first of all, what is an n-gram model, then? 873 00:46:02,390 --> 00:46:08,850 >> So an n-gram model means that each word depends on the 874 00:46:08,850 --> 00:46:12,700 next N minus 1 words.
875 00:46:12,700 --> 00:46:18,150 So, basically, it means that if I look, for example, at "the CS50 TF," when 876 00:46:18,150 --> 00:46:21,500 I'm calculating the probability of the sentence, you'll have something like the 877 00:46:21,500 --> 00:46:25,280 probability of having the word "the" times the probability of having "the 878 00:46:25,280 --> 00:46:31,720 CS50" times the probability of having "the CS50 TF." So, basically, I count 879 00:46:31,720 --> 00:46:35,720 all the possible ways of stretching it. 880 00:46:35,720 --> 00:46:41,870 >> And then usually when you're doing this, like in a project, you set N to be 881 00:46:41,870 --> 00:46:42,600 a low value. 882 00:46:42,600 --> 00:46:45,930 So you usually have bigrams or trigrams. 883 00:46:45,930 --> 00:46:51,090 So that you just count two words, a group of two words, or three words, 884 00:46:51,090 --> 00:46:52,620 just for performance reasons. 885 00:46:52,620 --> 00:46:56,395 And also because maybe if you have something like "the CS50 TF," when 886 00:46:56,395 --> 00:47:00,510 you have "TF," it's very important that "CS50" is next to it, right? 887 00:47:00,510 --> 00:47:04,050 Those two things are usually next to each other. 888 00:47:04,050 --> 00:47:06,410 >> If you think of "TF," it's probably going to come with what 889 00:47:06,410 --> 00:47:07,890 class it's TF'ing for. 890 00:47:07,890 --> 00:47:11,330 Also "the" is really important for "CS50 TF." 891 00:47:11,330 --> 00:47:14,570 But if you have something like "The CS50 TF went to class and gave their 892 00:47:14,570 --> 00:47:20,060 students some candy," "candy" and "the" have no real relation, right? 893 00:47:20,060 --> 00:47:23,670 They're so distant from each other that it doesn't really matter what 894 00:47:23,670 --> 00:47:25,050 words you have. 895 00:47:25,050 --> 00:47:31,210 >> So by doing a bigram or a trigram, it just means that you're limiting 896 00:47:31,210 --> 00:47:33,430 yourself to some words that are nearby. 897 00:47:33,430 --> 00:47:35,810 Make sense?
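[Editor's sketch, not part of the talk.] A bigram model (N = 2) can be sketched as follows. It assumes the common convention that each word is conditioned on the word just before it, and it uses add-one smoothing so that unseen word pairs get a small non-zero probability. The toy corpus, the `<s>` start marker, and the function names are all illustrative assumptions, not the speaker's code.

```python
from collections import Counter

# Toy corpus, invented for illustration only.
corpus = [
    "the cat is inside the box",
    "the dog is inside the box",
    "the cat is big",
    "the dog is big",
]

START = "<s>"  # hypothetical marker for the beginning of a sentence

context_counts = Counter()  # how often each word appears as the first of a pair
bigram_counts = Counter()   # how often each pair of adjacent words appears
vocabulary = set()
for sentence in corpus:
    words = [START] + sentence.split()
    vocabulary.update(words[1:])
    context_counts.update(words[:-1])
    bigram_counts.update(zip(words, words[1:]))


def bigram_probability(previous, word):
    """Estimate P(word | previous) with add-one smoothing."""
    numerator = bigram_counts[(previous, word)] + 1
    denominator = context_counts[previous] + len(vocabulary)
    return numerator / denominator


def sentence_probability(sentence):
    """Multiply the probability of each word given the word before it."""
    words = [START] + sentence.split()
    probability = 1.0
    for previous, word in zip(words, words[1:]):
        probability *= bigram_probability(previous, word)
    return probability


p_ordered = sentence_probability("the cat is inside the box")
p_scrambled = sentence_probability("box inside the is cat the")
```

Unlike a frequency-only model, this one gives the ordered sentence a higher probability than the scrambled one, because pairs such as "the cat" and "is inside" were seen in the corpus while pairs such as "is cat" were not. Keeping N small keeps the tables small, which is the performance point made above.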
898 00:47:35,810 --> 00:47:40,630 So when you want to do segmentation, basically, what you want to do is see 899 00:47:40,630 --> 00:47:44,850 all the possible ways that you can segment the sentence. 900 00:47:44,850 --> 00:47:49,090 >> And then you see what is the probability of each of those sentences 901 00:47:49,090 --> 00:47:50,880 existing in the language? 902 00:47:50,880 --> 00:47:53,410 So what you do is like, well, let me try to put a space here. 903 00:47:53,410 --> 00:47:55,570 So you put a space there and you see what is the 904 00:47:55,570 --> 00:47:57,590 probability of that sentence? 905 00:47:57,590 --> 00:48:00,240 Then you're like, OK, maybe that wasn't that good. 906 00:48:00,240 --> 00:48:03,420 So I put a space there and a space there, and you calculate the 907 00:48:03,420 --> 00:48:06,240 probability now, and you see that it's a higher probability. 908 00:48:06,240 --> 00:48:12,160 >> So this is an algorithm called the TANGO segmentation algorithm, which is 909 00:48:12,160 --> 00:48:14,990 actually something that would be really cool for a project, which 910 00:48:14,990 --> 00:48:20,860 basically takes unsegmented text, which can be Japanese or Chinese or maybe 911 00:48:20,860 --> 00:48:26,080 English without spaces, and tries to put spaces between words, and it does 912 00:48:26,080 --> 00:48:29,120 that by using a language model and trying to see what is the highest 913 00:48:29,120 --> 00:48:31,270 probability you can get. 914 00:48:31,270 --> 00:48:32,230 OK. 915 00:48:32,230 --> 00:48:33,800 So this is segmentation. 916 00:48:33,800 --> 00:48:35,450 >> Now syntax. 917 00:48:35,450 --> 00:48:40,940 So, syntax is being used for so many things right now. 918 00:48:40,940 --> 00:48:44,880 So for Graph Search, for Siri, for pretty much any kind of natural 919 00:48:44,880 --> 00:48:46,490 language processing you have. 920 00:48:46,490 --> 00:48:49,140 So what are the important things about syntax?
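[Editor's sketch, not part of the talk.] Before the syntax discussion continues, the "try a space here, score it, try somewhere else" idea from the segmentation passage can be sketched with dynamic programming. This is a generic maximum-probability segmenter over a unigram table and is not claimed to be the TANGO algorithm itself; the word probabilities are invented for illustration, and English without spaces stands in for Japanese or Chinese text.

```python
import math

# Hypothetical unigram probabilities, invented for illustration only.
WORD_PROBABILITIES = {
    "the": 0.30, "cat": 0.10, "is": 0.20, "in": 0.10,
    "box": 0.10, "a": 0.10, "t": 0.01, "he": 0.05, "at": 0.04,
}
MAX_WORD_LENGTH = max(len(word) for word in WORD_PROBABILITIES)


def segment(text):
    """Return the most probable split of text into known words, or None."""
    # best[i] holds (log probability, words) for the best split of text[:i].
    best = [None] * (len(text) + 1)
    best[0] = (0.0, [])
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - MAX_WORD_LENGTH), end):
            word = text[start:end]
            if best[start] is None or word not in WORD_PROBABILITIES:
                continue
            # Log probabilities are added to avoid underflow on long inputs.
            score = best[start][0] + math.log(WORD_PROBABILITIES[word])
            if best[end] is None or score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return None if best[-1] is None else best[-1][1]


words = segment("thecatisinthebox")
```

With this table, splitting "the" as "t" + "he" is possible but scores lower than keeping "the" whole, so `words` comes out as the intended six words. Swapping the unigram table for a bigram model would let the neighbouring words influence where the spaces go.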
921 00:48:49,140 --> 00:48:52,390 So, sentences in general have what we call constituents. 922 00:48:52,390 --> 00:48:57,080 Which are kind of like groups of words that have a function in the sentence. 923 00:48:57,080 --> 00:49:02,220 And they can't really be apart from each other. 924 00:49:02,220 --> 00:49:07,380 >> So, if I say, for example, "Lauren loves Milo," I know that "Lauren" is 925 00:49:07,380 --> 00:49:10,180 a constituent and then "loves Milo" is also another one. 926 00:49:10,180 --> 00:49:16,860 Because you can't say "Lauren Milo loves" and have the same meaning. 927 00:49:16,860 --> 00:49:18,020 It's not going to have the same meaning. 928 00:49:18,020 --> 00:49:22,500 Or I can't say "Milo Lauren loves." Not everything has the same 929 00:49:22,500 --> 00:49:25,890 meaning when you do that. 930 00:49:25,890 --> 00:49:31,940 >> So the two most important things about syntax are, first, the lexical types, which 931 00:49:31,940 --> 00:49:35,390 are basically the functions that words have by themselves. 932 00:49:35,390 --> 00:49:39,180 So you have to know that "Lauren" and "Milo" are nouns. 933 00:49:39,180 --> 00:49:41,040 "Love" is a verb. 934 00:49:41,040 --> 00:49:45,660 And the second important thing is the phrasal types. 935 00:49:45,660 --> 00:49:48,990 So you know that "loves Milo" is actually a verb phrase. 936 00:49:48,990 --> 00:49:52,390 So when I say "Lauren," I know that Lauren is doing something. 937 00:49:52,390 --> 00:49:53,620 What is she doing? 938 00:49:53,620 --> 00:49:54,570 She's loving Milo. 939 00:49:54,570 --> 00:49:56,440 So it's a whole thing. 940 00:49:56,440 --> 00:50:01,640 But its components are a noun and a verb. 941 00:50:01,640 --> 00:50:04,210 But together, they make a verb phrase. 942 00:50:04,210 --> 00:50:08,680 >> So, what can we actually do with computational linguistics? 943 00:50:08,680 --> 00:50:13,810 So, if I have something like, for example, "friends of Allison."
I see that if I just 944 00:50:13,810 --> 00:50:17,440 did a syntactic tree, I would know that "friends" is a noun phrase, it's a 945 00:50:17,440 --> 00:50:21,480 noun, and then "of Allison" is a prepositional phrase in which "of" is 946 00:50:21,480 --> 00:50:24,810 a preposition and "Allison" is a noun. 947 00:50:24,810 --> 00:50:30,910 What I can do is teach my computer that when I have a noun phrase one and 948 00:50:30,910 --> 00:50:33,080 then a prepositional phrase, 949 00:50:33,080 --> 00:50:39,020 so in this case, "friends" and then "of Milo," I know that this means that 950 00:50:39,020 --> 00:50:43,110 NP2, the second one, owns NP1. 951 00:50:43,110 --> 00:50:47,680 >> So I can create some kind of relation, some kind of function for it. 952 00:50:47,680 --> 00:50:52,370 So whenever I see this structure, which matches exactly with "friends of 953 00:50:52,370 --> 00:50:56,030 Allison," I know that Allison owns the friends. 954 00:50:56,030 --> 00:50:58,830 So the friends are something that Allison has. 955 00:50:58,830 --> 00:50:59,610 Make sense? 956 00:50:59,610 --> 00:51:01,770 So this is basically what Graph Search does. 957 00:51:01,770 --> 00:51:04,360 It just creates rules for a lot of things. 958 00:51:04,360 --> 00:51:08,190 So "friends of Allison," "my friends who live in Cambridge," "my friends 959 00:51:08,190 --> 00:51:12,970 who go to Harvard." It creates rules for all of those things. 960 00:51:12,970 --> 00:51:14,930 >> Now machine translation. 961 00:51:14,930 --> 00:51:18,850 So, machine translation is also something statistical. 962 00:51:18,850 --> 00:51:21,340 And actually, if you get involved in computational linguistics, a lot of 963 00:51:21,340 --> 00:51:23,580 your stuff is going to be statistics.
964 00:51:23,580 --> 00:51:26,670 So as I was doing in the example, with a lot of probabilities that I was 965 00:51:26,670 --> 00:51:30,540 calculating, you get to this very small number that's the final 966 00:51:30,540 --> 00:51:33,180 probability, and that's what gives you the answer. 967 00:51:33,180 --> 00:51:37,540 Machine translation also uses a statistical model. 968 00:51:37,540 --> 00:51:44,790 And if you want to think of machine translation in the simplest possible 969 00:51:44,790 --> 00:51:48,970 way, what you can think of is just translating word by word, right? 970 00:51:48,970 --> 00:51:52,150 >> When you're learning a language for the first time, that's usually what 971 00:51:52,150 --> 00:51:52,910 you do, right? 972 00:51:52,910 --> 00:51:57,050 If you want to translate a sentence in your language to the language 973 00:51:57,050 --> 00:52:00,060 you're learning, usually first, you translate each of the words 974 00:52:00,060 --> 00:52:03,180 individually, and then you try to put the words into place. 975 00:52:03,180 --> 00:52:07,100 >> So if I wanted to translate this, [SPEAKING PORTUGUESE] 976 00:52:07,100 --> 00:52:10,430 which means "the white cat ran away." If I wanted to translate it from 977 00:52:10,430 --> 00:52:13,650 Portuguese to English, what I could do is, first, I just 978 00:52:13,650 --> 00:52:14,800 translate word by word. 979 00:52:14,800 --> 00:52:20,570 So "o" is "the," "gato," "cat," "branco," "white," and then "fugiu" is 980 00:52:20,570 --> 00:52:21,650 "ran away." 981 00:52:21,650 --> 00:52:26,130 >> So then I have all the words here, but they're not in order. 982 00:52:26,130 --> 00:52:29,590 It's like "the cat white ran away," which is ungrammatical. 983 00:52:29,590 --> 00:52:34,490 So then I can have a second step, which is going to be finding the ideal 984 00:52:34,490 --> 00:52:36,610 position for each of the words.
985 00:52:36,610 --> 00:52:40,240 So I know that I actually want to have "white cat" instead of "cat white." So 986 00:52:40,240 --> 00:52:46,050 what I can do is, the most naive way would be to create all the 987 00:52:46,050 --> 00:52:49,720 possible permutations of the words, of the positions. 988 00:52:49,720 --> 00:52:53,300 And then see which one has the highest probability according 989 00:52:53,300 --> 00:52:54,970 to my language model. 990 00:52:54,970 --> 00:52:58,390 And then when I find the one that has the highest probability, which is 991 00:52:58,390 --> 00:53:01,910 probably "the white cat ran away," that's my translation. 992 00:53:01,910 --> 00:53:06,710 >> And this is a simple way of explaining how a lot of machine translation 993 00:53:06,710 --> 00:53:07,910 algorithms work. 994 00:53:07,910 --> 00:53:08,920 Does that make sense? 995 00:53:08,920 --> 00:53:12,735 This is also something really exciting that you guys can maybe explore for a 996 00:53:12,735 --> 00:53:13,901 final project, yeah? 997 00:53:13,901 --> 00:53:15,549 >> STUDENT: Well, you said it was the naive way, so what's 998 00:53:15,549 --> 00:53:17,200 the non-naive way? 999 00:53:17,200 --> 00:53:18,400 >> LUCAS Freitas: The non-naive way? 1000 00:53:18,400 --> 00:53:19,050 OK. 1001 00:53:19,050 --> 00:53:22,860 So the first thing that's bad about this method is that I just translated 1002 00:53:22,860 --> 00:53:24,330 words, word by word. 1003 00:53:24,330 --> 00:53:30,570 But sometimes you have words that can have multiple translations. 1004 00:53:30,570 --> 00:53:32,210 I'm going to try to think of something. 1005 00:53:32,210 --> 00:53:37,270 For example, "manga" in Portuguese can either be "mango" or "sleeve." So 1006 00:53:37,270 --> 00:53:40,450 when you're trying to translate word by word, it might be giving you 1007 00:53:40,450 --> 00:53:42,050 something that makes no sense.
1008 00:53:42,050 --> 00:53:45,770 >> So you actually want to look at all the possible translations of the 1009 00:53:45,770 --> 00:53:49,840 words and see, first of all, what is the order. 1010 00:53:49,840 --> 00:53:52,000 We were talking about permuting things? 1011 00:53:52,000 --> 00:53:54,150 To see all the possible orders and choose the one with the highest 1012 00:53:54,150 --> 00:53:54,990 probability? 1013 00:53:54,990 --> 00:53:57,860 You can also choose all the possible translations for each 1014 00:53:57,860 --> 00:54:00,510 word and then see - 1015 00:54:00,510 --> 00:54:01,950 combined with the permutations - 1016 00:54:01,950 --> 00:54:03,710 which one has the highest probability. 1017 00:54:03,710 --> 00:54:08,590 >> Plus, you can also look at not just words but phrases. 1018 00:54:08,590 --> 00:54:11,700 So you can analyze the relations between the words and then get a 1019 00:54:11,700 --> 00:54:13,210 better translation. 1020 00:54:13,210 --> 00:54:16,690 Also something else, so this semester I'm actually doing research in 1021 00:54:16,690 --> 00:54:19,430 Chinese-English machine translation, so translating from 1022 00:54:19,430 --> 00:54:20,940 Chinese to English. 1023 00:54:20,940 --> 00:54:26,760 >> And something we do is, besides using a statistical model, which is just 1024 00:54:26,760 --> 00:54:30,570 seeing the probabilities of seeing some position in a sentence, I'm 1025 00:54:30,570 --> 00:54:35,360 actually also adding some syntax to my model, saying, oh, if I see this kind 1026 00:54:35,360 --> 00:54:39,420 of construction, this is what I want to change it to when I translate. 1027 00:54:39,420 --> 00:54:43,880 So you can also add some kind of element of syntax to make the 1028 00:54:43,880 --> 00:54:47,970 translation more efficient and more precise. 1029 00:54:47,970 --> 00:54:48,550 OK.
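[Editor's sketch, not part of the talk.] The translation procedure described in this section (translate word by word, try every candidate translation and every ordering, keep the highest-scoring result) can be sketched as below. The dictionary, the bigram scores standing in for a trained language model, the `<s>` start marker, and the second example sentence "a manga doce" are all invented for illustration.

```python
from itertools import permutations, product

# Hypothetical word-by-word dictionary; "manga" has two candidate translations.
TRANSLATIONS = {
    "o": ["the"],
    "a": ["the"],
    "gato": ["cat"],
    "branco": ["white"],
    "fugiu": ["ran away"],
    "manga": ["mango", "sleeve"],
    "doce": ["sweet"],
}

# Hypothetical bigram scores standing in for a trained language model.
BIGRAM_SCORES = {
    ("<s>", "the"): 0.5,
    ("the", "white"): 0.2,
    ("white", "cat"): 0.3,
    ("the", "cat"): 0.2,
    ("cat", "ran"): 0.2,
    ("ran", "away"): 0.6,
    ("the", "sweet"): 0.1,
    ("sweet", "mango"): 0.3,
}
UNSEEN_SCORE = 0.001  # small score for word pairs missing from the table


def score(words):
    """Multiply bigram scores along the sentence, starting from a start marker."""
    total = 1.0
    for pair in zip(["<s>"] + words, words):
        total *= BIGRAM_SCORES.get(pair, UNSEEN_SCORE)
    return total


def translate(sentence):
    """Try every translation choice and every word order; keep the best one."""
    options = [TRANSLATIONS[word] for word in sentence.split()]
    best_words, best_score = None, -1.0
    for choice in product(*options):
        for order in permutations(choice):
            candidate = " ".join(order).split()
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best_words, best_score = candidate, candidate_score
    return " ".join(best_words)


cat_sentence = translate("o gato branco fugiu")
mango_sentence = translate("a manga doce")
```

The number of orderings grows factorially with sentence length, so this brute-force search is only practical for toy sentences, which is one reason the talk calls it the naive way.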
1030 00:54:48,550 --> 00:54:51,010 >> So how can you get started, if you want to do something in computational 1031 00:54:51,010 --> 00:54:51,980 linguistics? 1032 00:54:51,980 --> 00:54:54,560 >> First, you choose a project that involves languages. 1033 00:54:54,560 --> 00:54:56,310 So, there are so many out there. 1034 00:54:56,310 --> 00:54:58,420 There are so many things you can do. 1035 00:54:58,420 --> 00:55:00,510 And then you can think of a model that you can use. 1036 00:55:00,510 --> 00:55:04,710 Usually that means thinking of assumptions, like, oh, when I was 1037 00:55:04,710 --> 00:55:05,770 thinking about the lyrics. 1038 00:55:05,770 --> 00:55:09,510 I was like, well, if I want to figure out who wrote this, I probably want to 1039 00:55:09,510 --> 00:55:15,400 look at the words the person used and see who uses that word very often. 1040 00:55:15,400 --> 00:55:18,470 So try to make assumptions and try to think of models. 1041 00:55:18,470 --> 00:55:21,395 And then you can also search online for the kind of problem that you have, 1042 00:55:21,395 --> 00:55:24,260 and it's going to suggest to you models that maybe 1043 00:55:24,260 --> 00:55:26,560 model that thing well. 1044 00:55:26,560 --> 00:55:29,080 >> And also you can always email me. 1045 00:55:29,080 --> 00:55:31,140 me@lfreitas.com. 1046 00:55:31,140 --> 00:55:34,940 And I can just answer your questions. 1047 00:55:34,940 --> 00:55:38,600 We might even meet up so I can give suggestions on ways of 1048 00:55:38,600 --> 00:55:41,490 implementing your project. 1049 00:55:41,490 --> 00:55:45,610 And I mean, if you get involved with computational linguistics, it's going 1050 00:55:45,610 --> 00:55:46,790 to be great. 1051 00:55:46,790 --> 00:55:48,370 You're going to see there is so much potential. 1052 00:55:48,370 --> 00:55:52,060 And the industry wants to hire you so badly because of that.
1053 00:55:52,060 --> 00:55:54,720 So I hope you guys enjoyed this. 1054 00:55:54,720 --> 00:55:57,030 If you guys have any questions, you can ask me after this. 1055 00:55:57,030 --> 00:55:58,280 But thank you. 1056 00:55:58,280 --> 00:56:00,150