1 00:00:00,000 --> 00:00:00,750 2 00:00:00,750 --> 00:00:09,800 >> [MUSIC PLAYING] 3 00:00:09,800 --> 00:00:13,014 4 00:00:13,014 --> 00:00:13,680 DUSTIN TRAN: Hi. 5 00:00:13,680 --> 00:00:14,980 My name's Dustin. 6 00:00:14,980 --> 00:00:18,419 So I'll be presenting Data Analysis in R. 7 00:00:18,419 --> 00:00:19,710 Just a little about myself. 8 00:00:19,710 --> 00:00:24,320 I'm currently a graduate student in Engineering and Sciences. 9 00:00:24,320 --> 00:00:28,330 I study the intersection of machine learning and statistics, 10 00:00:28,330 --> 00:00:31,375 so Data Analysis in R is really fundamental to what 11 00:00:31,375 --> 00:00:33,790 I do on a daily basis. 12 00:00:33,790 --> 00:00:35,710 >> And R is especially good for data analysis 13 00:00:35,710 --> 00:00:39,310 because it's very good for prototyping. 14 00:00:39,310 --> 00:00:43,590 And usually, when you're doing some sort of data analysis, a lot of the problems 15 00:00:43,590 --> 00:00:44,920 are going to be cognitive. 16 00:00:44,920 --> 00:00:48,700 And so you just want to have some sort of good language that 17 00:00:48,700 --> 00:00:53,770 is just good for doing built-in functions, as opposed 18 00:00:53,770 --> 00:00:57,430 to having to deal with low-level things. 19 00:00:57,430 --> 00:01:01,040 So in the beginning, I'm just going to introduce what R is, why 20 00:01:01,040 --> 00:01:04,540 you'd want to use it, and then go on to some demo, 21 00:01:04,540 --> 00:01:07,060 and just go on from there. 22 00:01:07,060 --> 00:01:08,150 >> So what is R? 23 00:01:08,150 --> 00:01:11,180 R is just a language developed for statistical computing 24 00:01:11,180 --> 00:01:12,450 and visualization. 25 00:01:12,450 --> 00:01:16,000 So what this means is that it's a very excellent language 26 00:01:16,000 --> 00:01:22,400 for any sort of thing that deals with uncertainty or data visualization. 
27 00:01:22,400 --> 00:01:24,850 So you have all these probability distributions. 28 00:01:24,850 --> 00:01:27,140 There are going to be built-in functions. 29 00:01:27,140 --> 00:01:31,650 You also have excellent plotting packages. 30 00:01:31,650 --> 00:01:34,110 >> Python is another competing language for data. 31 00:01:34,110 --> 00:01:40,020 And one thing that I find that R is much better at is visualization. 32 00:01:40,020 --> 00:01:45,200 So what you'll see in the demo as well is a very intuitive language 33 00:01:45,200 --> 00:01:48,050 that just works very well. 34 00:01:48,050 --> 00:01:53,140 It's also free and open source, as is any other good language I guess. 35 00:01:53,140 --> 00:01:55,440 >> And here's just a bunch of keywords thrown at you. 36 00:01:55,440 --> 00:02:00,450 It's dynamic, meaning if you have a particular type assigned to an object 37 00:02:00,450 --> 00:02:02,025 then it'll just change it on the fly. 38 00:02:02,025 --> 00:02:05,670 It's lazy so it's smart about how it calculates things. 39 00:02:05,670 --> 00:02:12,250 Functional meaning it can really operate based off functions so anything-- 40 00:02:12,250 --> 00:02:16,910 any sort of manipulation you're doing, it will be based off functions. 41 00:02:16,910 --> 00:02:20,162 >> So binary operators, for example, are just inherently functions. 42 00:02:20,162 --> 00:02:21,870 And everything that you're going to be doing is 43 00:02:21,870 --> 00:02:24,690 going to be run off functions themselves. 44 00:02:24,690 --> 00:02:27,140 And then object-oriented as well. 45 00:02:27,140 --> 00:02:30,930 >> So here's an XKCD plot. 
46 00:02:30,930 --> 00:02:34,350 Not only because I feel like XKCD is fundamental in any sort 47 00:02:34,350 --> 00:02:37,770 of presentation, but because I feel like this really 48 00:02:37,770 --> 00:02:42,160 hammers the point that a lot of the time when you're doing some sort of data 49 00:02:42,160 --> 00:02:46,570 analysis, the problem isn't so much how fast it runs, 50 00:02:46,570 --> 00:02:49,850 but how long it's going to take you to program the task. 51 00:02:49,850 --> 00:02:54,112 So here it's just analyzing whether strategy A or B is more efficient. 52 00:02:54,112 --> 00:02:55,820 This is going to be something that you're 53 00:02:55,820 --> 00:02:58,290 going to deal with a lot in sort of low-level languages 54 00:02:58,290 --> 00:03:03,440 where you're dealing with seg faults, memory allocation, initializations, 55 00:03:03,440 --> 00:03:05,270 even making the built-in functions. 56 00:03:05,270 --> 00:03:09,920 And all this stuff is handled very, very elegantly in R. 57 00:03:09,920 --> 00:03:12,839 >> So just to hammer this point, the biggest bottleneck 58 00:03:12,839 --> 00:03:13,880 is going to be cognitive. 59 00:03:13,880 --> 00:03:17,341 So data analysis is a very hard problem. 60 00:03:17,341 --> 00:03:19,340 Whether you're doing machine learning or you're 61 00:03:19,340 --> 00:03:22,550 doing just some sort of basic data exploration, 62 00:03:22,550 --> 00:03:25,290 you don't want to have to take a document 63 00:03:25,290 --> 00:03:27,440 and then compile something every time you 64 00:03:27,440 --> 00:03:31,010 want to see what a column looks like, what particular entries in a matrix 65 00:03:31,010 --> 00:03:32,195 look like. 66 00:03:32,195 --> 00:03:34,320 So you just want to have some nice interface that 67 00:03:34,320 --> 00:03:37,740 you can run a simple function that indexes to whatever 68 00:03:37,740 --> 00:03:41,870 you'd like and just run it from there. 
69 00:03:41,870 --> 00:03:44,190 And you need domain-specific languages for this. 70 00:03:44,190 --> 00:03:51,750 And R really helps you define the problem and solve it in this manner. 71 00:03:51,750 --> 00:03:58,690 >> So here's a plot showing the programming popularity of R as it's gone over time. 72 00:03:58,690 --> 00:04:04,060 So as you can see, like 2013 or so it just blew up tremendously. 73 00:04:04,060 --> 00:04:09,570 And this has been just because of the huge trend in the tech industry 74 00:04:09,570 --> 00:04:10,590 about big data. 75 00:04:10,590 --> 00:04:13,010 Also, not just the tech industry, but really 76 00:04:13,010 --> 00:04:16,490 any industry that-- because a lot of industries 77 00:04:16,490 --> 00:04:20,589 are sort of fundamental in trying to solve these problems. 78 00:04:20,589 --> 00:04:24,590 And usually, you can have some good way of measuring these problems 79 00:04:24,590 --> 00:04:29,720 or even defining them or solving them using data. 80 00:04:29,720 --> 00:04:35,430 So I think right now R is the 11th most popular language on TIOBE 81 00:04:35,430 --> 00:04:38,200 and it's been growing since. 82 00:04:38,200 --> 00:04:40,740 83 00:04:40,740 --> 00:04:43,080 >> So here are some more features of R. It has 84 00:04:43,080 --> 00:04:46,900 a huge number of packages for all these different things. 85 00:04:46,900 --> 00:04:52,470 So any time you have some problem, most 86 00:04:52,470 --> 00:04:55,060 of the time R will have that function for you. 87 00:04:55,060 --> 00:04:58,520 So whether you want to build some sort of machine 88 00:04:58,520 --> 00:05:02,770 learning algorithm called Random Forest or Decision Trees, 89 00:05:02,770 --> 00:05:07,530 or even trying to take the mean of a function or any of this stuff, 90 00:05:07,530 --> 00:05:10,000 R will have that. 
91 00:05:10,000 --> 00:05:14,190 >> And if you do care about optimization, one thing that's common 92 00:05:14,190 --> 00:05:17,430 is that after you've finished prototyping in some sort of high-level language, 93 00:05:17,430 --> 00:05:19,810 you'll throw that in-- you'll just port that over 94 00:05:19,810 --> 00:05:21,550 to some low-level language. 95 00:05:21,550 --> 00:05:26,090 What's good about R is that once you're done prototyping it, you can run C++, 96 00:05:26,090 --> 00:05:29,510 or Fortran, or any of these lower-level ones directly in R. 97 00:05:29,510 --> 00:05:32,320 So that's one really cool feature about R, 98 00:05:32,320 --> 00:05:35,930 if you really care about optimization at that point. 99 00:05:35,930 --> 00:05:39,490 >> And it's also really good for web visualizations. 100 00:05:39,490 --> 00:05:43,530 So D3.js, for example, is I think another seminar 101 00:05:43,530 --> 00:05:45,130 that we presented today. 102 00:05:45,130 --> 00:05:48,510 And this is really awesome for doing interactive visualizations. 103 00:05:48,510 --> 00:05:54,460 And D3.js assumes that you have some sort of data to be plotted 104 00:05:54,460 --> 00:05:58,080 and R is a great way of being able to do data analysis before you export it 105 00:05:58,080 --> 00:06:04,220 over to D3.js or even just run D3.js commands in R itself, 106 00:06:04,220 --> 00:06:08,240 as well as all these other libraries as well. 107 00:06:08,240 --> 00:06:13,041 >> So that was just the introduction of what R is and why you might use it. 108 00:06:13,041 --> 00:06:14,790 So hopefully, I've convinced you something 109 00:06:14,790 --> 00:06:18,460 about just trying it and seeing what it's like. 110 00:06:18,460 --> 00:06:23,930 So I'm going to go ahead and go through some fundamentals about R objects 111 00:06:23,930 --> 00:06:26,150 and what you can really do. 
112 00:06:26,150 --> 00:06:29,690 >> So here's just a bunch of math commands. 113 00:06:29,690 --> 00:06:35,000 So say you're-- you want to build the language yourself and you just want 114 00:06:35,000 --> 00:06:38,080 to have a bunch of different tools. 115 00:06:38,080 --> 00:06:42,520 Any sort of operation you think you'd want is pretty much going to be in R. 116 00:06:42,520 --> 00:06:44,150 >> So here's 2 plus 2. 117 00:06:44,150 --> 00:06:46,090 Here's 2 times pi. 118 00:06:46,090 --> 00:06:51,870 R has a bunch of built-in constants that you'll often use like pi, e. 119 00:06:51,870 --> 00:06:56,230 >> And then, here's 7 plus runif, so runif of 1. 120 00:06:56,230 --> 00:07:02,450 This is a function that generates one random uniform from 0 to 1. 121 00:07:02,450 --> 00:07:04,400 And then there's 3 to the power of 4. 122 00:07:04,400 --> 00:07:06,430 There's square roots. 123 00:07:06,430 --> 00:07:07,270 >> There's log. 124 00:07:07,270 --> 00:07:14,500 So log will do base exponential by itself. 125 00:07:14,500 --> 00:07:18,337 And then, if you specify a base, then you can do whatever base you want. 126 00:07:18,337 --> 00:07:19,920 And then here are some other commands. 127 00:07:19,920 --> 00:07:22,180 So you have 23 mod 2. 128 00:07:22,180 --> 00:07:24,910 Then you have the remainder. 129 00:07:24,910 --> 00:07:27,110 Then you have scientific notation if you also 130 00:07:27,110 --> 00:07:34,060 want to do just more and more complicated things. 131 00:07:34,060 --> 00:07:37,320 >> So here's assignment. 132 00:07:37,320 --> 00:07:40,830 So typical assignment in R is done with an arrow 133 00:07:40,830 --> 00:07:43,440 so it's less than and then a hyphen. 134 00:07:43,440 --> 00:07:47,250 So here I'm just assigning 3 to the val variable. 135 00:07:47,250 --> 00:07:50,160 >> And then I'm printing out val and then it prints out three. 
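A minimal sketch of the math and assignment commands just walked through, assuming a plain R session. Only `val` is a name from the talk; the other names and the arguments to `sqrt` and `log` are illustrative choices, not what was on the slide.

```r
# Arithmetic at the prompt; results are named here only so they can be inspected
four   <- 2 + 2                 # 4
twoPi  <- 2 * pi                # pi is a built-in constant
u      <- 7 + runif(1)          # 7 plus one uniform draw between 0 and 1
pow    <- 3^4                   # 81
root   <- sqrt(16)              # 4 (16 is an illustrative argument)
natLog <- log(exp(2))           # log() is the natural log unless a base is given
log10v <- log(1000, base = 10)  # any other base via the base argument
rem    <- 23 %% 2               # modulo: 1
quot   <- 23 %/% 2              # integer division: 11
big    <- 1.5e3                 # scientific notation for 1500

# Assignment with the arrow operator, then auto-printing at the prompt
val <- 3
val
```

Typing a bare name such as `val` at the interpreter prints its value, which is why no explicit `print` call is needed.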
136 00:07:50,160 --> 00:07:53,920 By default in the R interpreter, it prints things out for you 137 00:07:53,920 --> 00:07:57,280 so you don't have to specify print val any time you want to print something. 138 00:07:57,280 --> 00:08:00,200 You can just do val and then it'll do it for you. 139 00:08:00,200 --> 00:08:04,380 >> Also, you can technically use equals as an assignment operator. 140 00:08:04,380 --> 00:08:07,190 There are slight subtleties between using the arrow 141 00:08:07,190 --> 00:08:10,730 operator and the equals operator for assignments. 142 00:08:10,730 --> 00:08:15,470 Mostly by convention, everyone will just use the arrow operator. 143 00:08:15,470 --> 00:08:21,850 >> And here, I'm assigning this oblique notation called 1 colon 6. 144 00:08:21,850 --> 00:08:26,010 This generates a vector from 1 to 6. 145 00:08:26,010 --> 00:08:29,350 And this is nice because then you just assign the vector to val 146 00:08:29,350 --> 00:08:34,270 and that works by itself. 147 00:08:34,270 --> 00:08:37,799 >> So this is already going from a single-- a very intuitive data 148 00:08:37,799 --> 00:08:41,070 structure of just a double of some sort of type into a vector 149 00:08:41,070 --> 00:08:45,670 and that'll collect all the scalar values for you. 150 00:08:45,670 --> 00:08:50,770 So after going from the scalar, you have R objects and this is a vector. 151 00:08:50,770 --> 00:08:55,610 A vector is any sort of collection of the same type. 152 00:08:55,610 --> 00:08:58,150 So here's a bunch of vectors. 153 00:08:58,150 --> 00:08:59,800 >> So this is numeric. 154 00:08:59,800 --> 00:09:02,440 Numeric is R's way of saying double. 155 00:09:02,440 --> 00:09:07,390 And so by default, any numbers are going to be double. 156 00:09:07,390 --> 00:09:13,150 >> So if you have c of 1.1, 3, negative 5.7, c is a function. 157 00:09:13,150 --> 00:09:16,760 This concatenates all three numbers into a vector. 
158 00:09:16,760 --> 00:09:19,619 And this will be-- so notice 3 by itself, 159 00:09:19,619 --> 00:09:21,910 normally you would assume that this is like an integer, 160 00:09:21,910 --> 00:09:25,050 but because vectors are all of the same type, 161 00:09:25,050 --> 00:09:28,660 this is a vector of doubles or numeric in this case. 162 00:09:28,660 --> 00:09:34,920 >> rnorm is a function that generates standard normal variables-- 163 00:09:34,920 --> 00:09:36,700 or standard normal values. 164 00:09:36,700 --> 00:09:38,360 And I'm specifying two of them. 165 00:09:38,360 --> 00:09:43,840 So I'm doing rnorm 2, assigning that to devs, and then I'm printing out devs. 166 00:09:43,840 --> 00:09:47,350 So those are just two random normal values. 167 00:09:47,350 --> 00:09:50,060 >> And then ints if you do care about integers. 168 00:09:50,060 --> 00:09:54,650 So this is just about memory allocation and saving memory usually. 169 00:09:54,650 --> 00:10:01,460 So you would append your numbers with a capital L. 170 00:10:01,460 --> 00:10:04,170 >> In general, this is R's historical notation 171 00:10:04,170 --> 00:10:06,940 for something called the long integer. 172 00:10:06,940 --> 00:10:09,880 So most of the time, you'll be dealing with doubles. 173 00:10:09,880 --> 00:10:15,180 And if you ever want to optimize your code later on, 174 00:10:15,180 --> 00:10:18,110 you can just add these L's afterwards or during it 175 00:10:18,110 --> 00:10:22,280 if you're like precognitive about what you're going to be doing with these variables. 176 00:10:22,280 --> 00:10:25,340 177 00:10:25,340 --> 00:10:26,890 >> So here's a character vector. 178 00:10:26,890 --> 00:10:31,440 So, again, I'm concatenating three strings this time. 179 00:10:31,440 --> 00:10:36,230 Notice that double-quoted strings and single-quoted strings are the same in R. 
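The vector types described so far can be sketched roughly as follows. The names `devs` and `ints` come from the talk; `nums` and the integer values are assumptions made for illustration.

```r
# Numeric (double) vector: the bare 3 is stored as a double like its neighbours
nums <- c(1.1, 3, -5.7)
class(nums)               # "numeric"

# Two standard normal draws (the values differ on every run)
devs <- rnorm(2)

# Integer vector: the L suffix asks for integers (illustrative values)
ints <- c(1L, 2L, 3L)
class(ints)               # "integer"

# Single and double quotes produce the same character value
identical("arthur", 'arthur')   # TRUE
```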
180 00:10:36,230 --> 00:10:41,000 So I have arthur and marvin and so when I'm printing it out, they both 181 00:10:41,000 --> 00:10:43,210 are going to show up as double-quoted strings. 182 00:10:43,210 --> 00:10:45,880 And if you also want to include a double or single quote 183 00:10:45,880 --> 00:10:50,070 in your characters, then you can either alternate your quotes. 184 00:10:50,070 --> 00:10:53,540 >> So marvin for the second part, this is 185 00:10:53,540 --> 00:10:56,380 going to show-- you just have double quotes 186 00:10:56,380 --> 00:10:59,050 and then a single quote so this is alternating. 187 00:10:59,050 --> 00:11:04,040 Otherwise, if you want to use the double quote inside a double-quoted string 188 00:11:04,040 --> 00:11:07,090 when you're declaring it, then you just use the escape operator. 189 00:11:07,090 --> 00:11:10,600 So you do backslash double quote. 190 00:11:10,600 --> 00:11:13,330 >> And finally, we also have logical vectors. 191 00:11:13,330 --> 00:11:15,890 So logical-- so TRUE and FALSE, and they're 192 00:11:15,890 --> 00:11:18,880 going to be all capital letters. 193 00:11:18,880 --> 00:11:22,370 And then, again, I'm concatenating them and then assigning them to bools. 194 00:11:22,370 --> 00:11:24,590 So bools is going to show you TRUE, FALSE, and TRUE. 195 00:11:24,590 --> 00:11:28,280 196 00:11:28,280 --> 00:11:31,620 >> So here's vectorized indexing. 197 00:11:31,620 --> 00:11:34,870 So in the beginning, I'm taking a function-- 198 00:11:34,870 --> 00:11:39,230 this is called sequence-- a sequence from 2 to 12. 199 00:11:39,230 --> 00:11:42,490 And I'm taking the sequence by 2. 200 00:11:42,490 --> 00:11:46,660 So it's going to do 2, 4, 6, 8, 10, and 12. 201 00:11:46,660 --> 00:11:50,080 And then, I'm indexing to get the third element. 202 00:11:50,080 --> 00:11:55,770 >> So one thing to keep in mind is that R indexes starting from 1. 
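A short sketch of the quoting rules, logical vectors, and the first indexing step described above. The names `bools` and `vals` follow the talk; `alt`, `esc`, and the string contents are hypothetical.

```r
# Two ways to put a double quote inside a string (the text is illustrative)
alt <- 'say "hi"'      # alternate: single quotes outside, double quotes inside
esc <- "say \"hi\""    # or escape the inner double quotes with a backslash
identical(alt, esc)    # TRUE

# Logical values are written in all capitals
bools <- c(TRUE, FALSE, TRUE)

# Sequence from 2 to 12 in steps of 2, then 1-based indexing
vals <- seq(2, 12, by = 2)   # 2 4 6 8 10 12
vals[3]                      # 6, the third element
```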
203 00:11:55,770 --> 00:12:00,550 So vals 3 is going to give you the third element. 204 00:12:00,550 --> 00:12:04,580 This is sort of different from other languages where it starts from zero. 205 00:12:04,580 --> 00:12:09,780 So in C or C++, for example, you're going to get the fourth element. 206 00:12:09,780 --> 00:12:13,280 >> And here's vals 3 to 5. 207 00:12:13,280 --> 00:12:16,030 So one thing that's really cool is that you 208 00:12:16,030 --> 00:12:20,410 can generate temporary variables inside and then just use them on the fly. 209 00:12:20,410 --> 00:12:21,960 So here's 3 to 5. 210 00:12:21,960 --> 00:12:25,070 So I'm generating a vector 3, 4, and 5 and then 211 00:12:25,070 --> 00:12:29,700 I'm indexing to get the third, fourth, and fifth elements. 212 00:12:29,700 --> 00:12:32,280 >> So similarly, you can abstract this to just do 213 00:12:32,280 --> 00:12:35,280 any sort of vector that gives you indexing. 214 00:12:35,280 --> 00:12:40,050 So here's vals and then the first, third, and sixth elements. 215 00:12:40,050 --> 00:12:42,800 And then, if you want to do a complement, 216 00:12:42,800 --> 00:12:45,210 so you just do the minus afterward and that'll 217 00:12:45,210 --> 00:12:48,600 give you everything that's not the first, third, or sixth element. 218 00:12:48,600 --> 00:12:51,590 So this will be 4, 8, and 10. 219 00:12:51,590 --> 00:12:54,380 >> And if you want to get even more advanced, 220 00:12:54,380 --> 00:12:57,610 you can concatenate Boolean vectors. 221 00:12:57,610 --> 00:13:05,210 So this index is going to give you this Boolean vector of length 6. 222 00:13:05,210 --> 00:13:07,280 So rep TRUE comma 3. 223 00:13:07,280 --> 00:13:09,680 This will repeat TRUE three times. 224 00:13:09,680 --> 00:13:12,900 So this will give you a vector TRUE, TRUE, TRUE. 225 00:13:12,900 --> 00:13:17,470 >> rep FALSE 4-- this is going to give you a vector of FALSE, FALSE, FALSE, FALSE. 
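The range, vector, and complement indexing just described can be sketched like this, reusing the `vals` sequence from the talk. The names `middle`, `picked`, and `rest` are introduced here for illustration only.

```r
vals <- seq(2, 12, by = 2)    # 2 4 6 8 10 12

# Index with a range built on the fly: third, fourth, and fifth elements
middle <- vals[3:5]           # 6 8 10

# Index with any vector of positions: first, third, and sixth elements
picked <- vals[c(1, 3, 6)]    # 2 6 12

# A minus sign on the index vector gives the complement
rest <- vals[-c(1, 3, 6)]     # 4 8 10
```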
226 00:13:17,470 --> 00:13:21,280 And then c is going to concatenate those two Booleans together. 227 00:13:21,280 --> 00:13:24,090 So you're going to get three TRUEs and then four FALSEs. 228 00:13:24,090 --> 00:13:28,460 >> So that when you index vals, you're going to get TRUE, TRUE, TRUE. 229 00:13:28,460 --> 00:13:31,420 So that's going to say yes, I want these three elements. 230 00:13:31,420 --> 00:13:33,520 And then FALSE, FALSE, FALSE, FALSE is going 231 00:13:33,520 --> 00:13:37,140 to say no, I don't want those elements so it's not going to return them. 232 00:13:37,140 --> 00:13:41,490 >> And I guess there's actually a typo here because this is saying repeat TRUE 3 233 00:13:41,490 --> 00:13:47,990 and repeat FALSE 4, and technically, you only have six elements so repeat FALSE, 234 00:13:47,990 --> 00:13:50,470 it should be repeat FALSE 3. 235 00:13:50,470 --> 00:13:55,260 I think R is also smart enough such that if you just specify 4 here, then 236 00:13:55,260 --> 00:13:56,630 it won't even error out. 237 00:13:56,630 --> 00:13:58,480 It'll just give you this value. 238 00:13:58,480 --> 00:14:00,970 So it'll just ignore that fourth FALSE. 239 00:14:00,970 --> 00:14:05,310 240 00:14:05,310 --> 00:14:09,270 >> So here's vectorized assignment. 241 00:14:09,270 --> 00:14:15,480 So set.seed-- this just sets the seed for the pseudorandom numbers. 242 00:14:15,480 --> 00:14:20,110 So I'm setting the seed to 42, meaning that if I generate 243 00:14:20,110 --> 00:14:22,950 three random normal values, and then if you 244 00:14:22,950 --> 00:14:27,400 run set.seed on your own computer using the same value 42, 245 00:14:27,400 --> 00:14:30,990 then you'll also get the same three random normals. 246 00:14:30,990 --> 00:14:33,411 >> So this is good for reproducibility. 
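A rough sketch of logical indexing and seeding as described above. It uses the corrected mask of three TRUEs and three FALSEs that the speaker says the slide should have had; the names `keep`, `first`, and `second` are assumptions for illustration.

```r
vals <- seq(2, 12, by = 2)               # 2 4 6 8 10 12

# Logical mask: keep the first three elements, drop the last three
keep <- c(rep(TRUE, 3), rep(FALSE, 3))
vals[keep]                               # 2 4 6

# Re-seeding with the same value reproduces the same pseudorandom draws
set.seed(42)
first <- rnorm(3)
set.seed(42)
second <- rnorm(3)
identical(first, second)                 # TRUE
```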
247 00:14:33,411 --> 00:14:35,910 Usually, when you're doing some sort of scientific analysis, 248 00:14:35,910 --> 00:14:37,230 you'd want to set the seed. 249 00:14:37,230 --> 00:14:41,270 That way other scientists can just reproduce the exact same code you've 250 00:14:41,270 --> 00:14:44,790 done because they'll have the exact same random variables that-- or random 251 00:14:44,790 --> 00:14:47,270 values that you've taken out as well. 252 00:14:47,270 --> 00:14:49,870 253 00:14:49,870 --> 00:14:53,910 >> And so the vectorized assignment here is showing vals 1 to 2. 254 00:14:53,910 --> 00:14:59,290 So it takes the first two elements of vals and then sets them to 0. 255 00:14:59,290 --> 00:15:03,940 And then, you can also just do the similar thing with Booleans. 256 00:15:03,940 --> 00:15:09,340 >> So vals is not equal to 0-- this will give you a vector FALSE, FALSE, TRUE 257 00:15:09,340 --> 00:15:10,350 in this case. 258 00:15:10,350 --> 00:15:13,770 And then, it's going to say any of those indices that were TRUE, 259 00:15:13,770 --> 00:15:15,270 then it's going to set that to 5. 260 00:15:15,270 --> 00:15:18,790 So it takes the third element here and then sets it to 5. 261 00:15:18,790 --> 00:15:22,300 >> And this is really nice compared to low-level languages 262 00:15:22,300 --> 00:15:25,560 where you have to use for loops to do all this vectorized stuff 263 00:15:25,560 --> 00:15:30,281 because it's just very intuitive and it's a single one-liner. 264 00:15:30,281 --> 00:15:32,030 And what's great about vectorized notation 265 00:15:32,030 --> 00:15:37,020 is that in R, these are sort of built in so that they're almost as fast 266 00:15:37,020 --> 00:15:42,490 as doing it in a low-level language as opposed to doing a for loop in R 267 00:15:42,490 --> 00:15:46,317 and then having it do the dynamic indexing itself. 
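The vectorized assignment just described, sketched under the assumption that `vals` holds the three seeded normal draws mentioned earlier. The name `mask` is introduced here for illustration.

```r
set.seed(42)
vals <- rnorm(3)        # three seeded standard normal draws

# Set the first two elements to 0 in one statement
vals[1:2] <- 0

# A logical mask picks out whatever is still non-zero (here the third element)
mask <- vals != 0       # FALSE FALSE TRUE
vals[mask] <- 5

vals                    # 0 0 5
```

Both assignments replace what would be an explicit loop over positions in a lower-level language.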
268 00:15:46,317 --> 00:15:48,900 And that'll be slower than doing this sort of vectorized thing 269 00:15:48,900 --> 00:15:55,950 where it can do it in parallel, where it's doing it in threading essentially. 270 00:15:55,950 --> 00:15:58,650 >> So here are vectorized operations. 271 00:15:58,650 --> 00:16:04,920 So I'm generating the values 1 to 3, assigning that to vec1, 3 to 5, vec2, 272 00:16:04,920 --> 00:16:05,950 adding them together. 273 00:16:05,950 --> 00:16:11,490 It adds them component-wise so it's 1 plus 3, 2 plus 4, and so on. 274 00:16:11,490 --> 00:16:13,330 >> vec1 times vec2. 275 00:16:13,330 --> 00:16:16,110 This multiplies the two values component-wise. 276 00:16:16,110 --> 00:16:21,830 So it's 1 times 3, 2 times 4, and then 3 times 5. 277 00:16:21,830 --> 00:16:28,250 >> And then, similarly you can also do comparisons-- logical comparisons. 278 00:16:28,250 --> 00:16:33,640 So it's FALSE FALSE TRUE in this case because 1 is not greater than 3, 279 00:16:33,640 --> 00:16:35,920 2 is not greater than 4. 280 00:16:35,920 --> 00:16:41,160 This is, I guess, another typo, 3 is definitely not greater than 5. 281 00:16:41,160 --> 00:16:41,660 Yeah. 282 00:16:41,660 --> 00:16:45,770 And so you can just do all these simple operations 283 00:16:45,770 --> 00:16:48,350 because they inherit from the classes themselves. 284 00:16:48,350 --> 00:16:51,110 285 00:16:51,110 --> 00:16:52,580 >> So that was just the vector. 286 00:16:52,580 --> 00:16:56,530 And that's sort of the most fundamental R object because given a vector, 287 00:16:56,530 --> 00:16:59,170 you can construct more advanced objects. 288 00:16:59,170 --> 00:17:00,560 >> So here's a matrix. 289 00:17:00,560 --> 00:17:05,030 This is essentially the abstraction of what a matrix is itself. 
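Looking back at the vectorized operations on `vec1` and `vec2` from a moment ago, a minimal sketch follows. The result names are illustrative, and the comparison shows the corrected output the speaker points to rather than the slide's typo.

```r
vec1 <- 1:3
vec2 <- 3:5

# Component-wise arithmetic
total   <- vec1 + vec2    # 4 6 8
product <- vec1 * vec2    # 3 8 15

# Component-wise comparison: none of 1, 2, 3 exceeds 3, 4, 5
bigger <- vec1 > vec2     # FALSE FALSE FALSE
```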
290 00:17:05,030 --> 00:17:10,099 So in this case, it's three different vectors, where each one is a column, 291 00:17:10,099 --> 00:17:12,710 or you can think of it as each one is a row. 292 00:17:12,710 --> 00:17:18,250 >> So I'm storing a matrix from 1 to 9 and then I'm specifying 3 rows. 293 00:17:18,250 --> 00:17:23,364 So 1 to 9 will give you a vector 1, 2, 3, 4, 5, 6, and all the way to 9. 294 00:17:23,364 --> 00:17:29,250 >> One thing also to keep in mind is that R stores values in column-major format. 295 00:17:29,250 --> 00:17:34,160 So in other words, when you see 1 to 9, it's going to store them-- 296 00:17:34,160 --> 00:17:36,370 it's going to be 1, 2, 3 in the first column, 297 00:17:36,370 --> 00:17:38,510 and then it'll do 4, 5, 6 in the second column, 298 00:17:38,510 --> 00:17:41,440 and then 7, 8, 9 in the third column. 299 00:17:41,440 --> 00:17:45,570 >> And here are some other common functions you can use. 300 00:17:45,570 --> 00:17:49,650 So dim mat, this will give you the dimensions of the matrix. 301 00:17:49,650 --> 00:17:52,620 It's going to return you a vector of the dimensions. 302 00:17:52,620 --> 00:17:55,580 So in this case, because our matrix is 3 by 3, 303 00:17:55,580 --> 00:18:01,900 it's going to give you a numeric vector that's 3 3. 304 00:18:01,900 --> 00:18:05,270 >> And here it's just showing matrix multiplication. 305 00:18:05,270 --> 00:18:11,970 So usually, if you just do an asterisk-- so mat asterisk mat-- 306 00:18:11,970 --> 00:18:15,380 this is going to be a component-wise operation 307 00:18:15,380 --> 00:18:17,300 or what's called the Hadamard product. 308 00:18:17,300 --> 00:18:21,310 So it's going to do each element component-wise. 
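A small sketch of the matrix construction, column-major fill, `dim`, and the component-wise product described above. The name `mat` follows the talk; `hadamard` is introduced for illustration.

```r
# Fill 1..9 into 3 rows; values go down each column first
mat <- matrix(1:9, nrow = 3)
mat[, 1]               # first column: 1 2 3
mat[, 2]               # second column: 4 5 6

dim(mat)               # 3 3

# The asterisk multiplies entry by entry (Hadamard product)
hadamard <- mat * mat
hadamard[2, 2]         # 25, since mat[2, 2] is 5
```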
309 00:18:21,310 --> 00:18:23,610 However, if you want matrix multiplication-- 310 00:18:23,610 --> 00:18:29,380 so multiplying the first row times the second matrix's first column 311 00:18:29,380 --> 00:18:34,510 and so on-- you would use this percent operation. 312 00:18:34,510 --> 00:18:38,110 >> And t of mat is just an operation for the transpose. 313 00:18:38,110 --> 00:18:42,590 So I'm saying take the transpose of the matrix, multiply it by the matrix 314 00:18:42,590 --> 00:18:43,090 itself. 315 00:18:43,090 --> 00:18:45,006 And then it's going to return you another 3 316 00:18:45,006 --> 00:18:50,700 by 3 matrix showing the product that you'd want. 317 00:18:50,700 --> 00:18:53,750 >> And so that was the matrix. 318 00:18:53,750 --> 00:18:56,020 Here's what's called the data frame. 319 00:18:56,020 --> 00:19:00,780 A data frame you can think of as a matrix, but each column itself 320 00:19:00,780 --> 00:19:02,990 is going to be of a different type. 321 00:19:02,990 --> 00:19:07,320 >> So what's really cool about data frames is that in data analysis itself, 322 00:19:07,320 --> 00:19:11,260 you're going to have all this heterogeneous data and all these really 323 00:19:11,260 --> 00:19:15,640 messy things where each of the columns themselves can be of different types. 324 00:19:15,640 --> 00:19:21,460 So here I'm saying create a data frame, do ints from 1 to 3, 325 00:19:21,460 --> 00:19:24,750 and then also have a character vector. 326 00:19:24,750 --> 00:19:28,470 So I can index through each of these columns 327 00:19:28,470 --> 00:19:30,930 and then I'll get the values themselves. 328 00:19:30,930 --> 00:19:34,370 And you can also do some sort of operations on data frames. 
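A sketch of the matrix product with a transpose and of a small data frame with columns of different types. The `%*%` operator is R's matrix multiplication; the names `gram`, `df`, and `chars`, and the character values, are assumptions made for illustration.

```r
mat <- matrix(1:9, nrow = 3)

# Transpose of mat times mat: a true matrix product, giving another 3 by 3 matrix
gram <- t(mat) %*% mat
dim(gram)              # 3 3
gram[1, 1]             # 14, i.e. 1*1 + 2*2 + 3*3

# Data frame: one integer column and one character column (values illustrative)
df <- data.frame(ints  = 1:3,
                 chars = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
df$ints                # 1 2 3
df$chars               # "a" "b" "c"
```

`stringsAsFactors = FALSE` is spelled out so the character column stays character on older R versions as well.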
329 00:19:34,370 --> 00:19:38,040 And most of the time when you're doing data analysis or some sort 330 00:19:38,040 --> 00:19:42,042 of preprocessing, you'll be working with these data structures 331 00:19:42,042 --> 00:19:44,250 where each column is going to be of a different type. 332 00:19:44,250 --> 00:19:47,880 333 00:19:47,880 --> 00:19:52,970 >> Finally, so these are essentially the only four essential objects in R. Lists 334 00:19:52,970 --> 00:19:55,820 will just collect any other objects you want. 335 00:19:55,820 --> 00:20:00,130 So it'll store this in one variable that you can easily access. 336 00:20:00,130 --> 00:20:02,370 >> So here, I'm taking a list. 337 00:20:02,370 --> 00:20:04,460 I'm saying stuff equals 3. 338 00:20:04,460 --> 00:20:08,060 So I'm going to have one element in the list, and this is called stuff, 339 00:20:08,060 --> 00:20:10,570 and it's going to have the value 3. 340 00:20:10,570 --> 00:20:13,140 >> I can also create a matrix. 341 00:20:13,140 --> 00:20:17,970 So this is 1 to 4 and nrow equals 2, so a 2 by 2 matrix. 342 00:20:17,970 --> 00:20:20,270 It's also in the list and it's called mat. 343 00:20:20,270 --> 00:20:24,690 moreStuff, a character string, and even another list itself. 344 00:20:24,690 --> 00:20:27,710 >> So this is a list that's 5 and bear. 345 00:20:27,710 --> 00:20:30,990 So it has the value 5 and it has the character string bear 346 00:20:30,990 --> 00:20:32,710 and it's a list inside a list. 347 00:20:32,710 --> 00:20:35,965 So you can have these recursive things where 348 00:20:35,965 --> 00:20:38,230 you have another-- a type inside a type. 349 00:20:38,230 --> 00:20:41,420 So similarly, you can have a matrix inside another matrix and so on. 350 00:20:41,420 --> 00:20:44,264 And a list is just a good way of collecting and aggregating 351 00:20:44,264 --> 00:20:45,430 all these different objects. 
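The list described above might look roughly like this. The element names `stuff`, `mat`, and `moreStuff` follow the talk; the list name `things`, the `nrow` argument, and the text stored in `moreStuff` are assumptions, since the slide's exact contents are not spelled out.

```r
things <- list(
  stuff     = 3,
  mat       = matrix(1:4, nrow = 2),   # a 2 by 2 matrix (nrow is assumed)
  moreStuff = "some text",             # hypothetical character string
  list(5, "bear")                      # an unnamed list nested inside the list
)

things$stuff           # 3
dim(things$mat)        # 2 2
things[[4]][[2]]       # "bear", reached through the nested list
```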
352 00:20:45,430 --> 00:20:50,210 353 00:20:50,210 --> 00:20:57,150 >> And finally, here's just help in case this just went over too quickly. 354 00:20:57,150 --> 00:21:01,350 So any time you're confused about some sort of function, 355 00:21:01,350 --> 00:21:03,510 you can do help of that function. 356 00:21:03,510 --> 00:21:07,120 So you can do help matrix or question mark matrix. 357 00:21:07,120 --> 00:21:11,430 And help and question mark are just shorthand for the same thing 358 00:21:11,430 --> 00:21:13,040 so they're aliases. 359 00:21:13,040 --> 00:21:16,820 >> lm is a function that just does a linear model. 360 00:21:16,820 --> 00:21:20,340 But if you just have no idea how that works, you can just do help of lm 361 00:21:20,340 --> 00:21:24,610 and that'll give you some sort of documentation that 362 00:21:24,610 --> 00:21:27,960 looks sort of like a man page in Unix, where 363 00:21:27,960 --> 00:21:34,210 you have a short description of what it does, also what its arguments are, 364 00:21:34,210 --> 00:21:38,850 what it returns, and just tips on how to use it, and some examples as well. 365 00:21:38,850 --> 00:21:41,680 366 00:21:41,680 --> 00:21:52,890 >> So let me go ahead and show some demo of using R. OK. 367 00:21:52,890 --> 00:21:55,470 So I went over very quickly just data 368 00:21:55,470 --> 00:21:59,440 structures and some sort of op-- some of the operations. 369 00:21:59,440 --> 00:22:02,960 Here are some functions. 370 00:22:02,960 --> 00:22:06,750 >> So here I'm just going to define a function. 371 00:22:06,750 --> 00:22:09,970 So I'm also using the assignment operator here, 372 00:22:09,970 --> 00:22:12,610 and then I'm saying declare it as a function. 373 00:22:12,610 --> 00:22:14,140 And it takes the value x. 374 00:22:14,140 --> 00:22:18,210 So this is any value you want and I'm going to return x itself. 375 00:22:18,210 --> 00:22:20,840 So this is the identity function. 
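A minimal sketch of the help lookups and the identity function described above. The help calls are left as comments because they open a documentation viewer; the exact layout of the function in the demo is an assumption.

```r
# Documentation lookups (uncomment at an interactive prompt):
# help(matrix)
# ?matrix
# help(lm)

# Identity function: takes any value x and returns it unchanged
id <- function(x) {
  x
}

id("hello")        # "hello"
id(1:5)            # 1 2 3 4 5
class(id(1:5))     # "integer"
```

Because R is dynamically typed, the same definition works for a string, an integer vector, or any other object.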
376 00:22:20,840 --> 00:22:23,670 >> And what's cool about this as opposed to other languages 377 00:22:23,670 --> 00:22:26,330 and other low-level languages is that x 378 00:22:26,330 --> 00:22:29,350 can be of any type itself and it'll return that type. 379 00:22:29,350 --> 00:22:35,251 So you can imagine-- so let me just run this quickly. 380 00:22:35,251 --> 00:22:35,750 Sorry. 381 00:22:35,750 --> 00:22:40,300 >> So one thing I should also mention is that this editor I'm using 382 00:22:40,300 --> 00:22:41,380 is called RStudio. 383 00:22:41,380 --> 00:22:44,389 This is what's called an IDE. 384 00:22:44,389 --> 00:22:46,180 And one thing that's really nice about this 385 00:22:46,180 --> 00:22:51,500 is that it incorporates a lot of the things you want to do in R by itself 386 00:22:51,500 --> 00:22:53,180 just very intuitively. 387 00:22:53,180 --> 00:22:55,550 >> So here's an interpreter console. 388 00:22:55,550 --> 00:23:02,160 So similarly, you can also get this raw console just by doing a capital R. 389 00:23:02,160 --> 00:23:05,630 And this is exactly the same thing as the console. 390 00:23:05,630 --> 00:23:12,210 So I can just do id function x, x, x. 391 00:23:12,210 --> 00:23:16,130 And then-- and then that'll be fine itself. 392 00:23:16,130 --> 00:23:19,200 393 00:23:19,200 --> 00:23:21,740 >> So RStudio is great because it has the console. 394 00:23:21,740 --> 00:23:25,360 It also has the documents you'd like to run. 395 00:23:25,360 --> 00:23:28,629 And then it has some variables that you can see in the environment. 396 00:23:28,629 --> 00:23:30,420 And then, if you have to do plots, then you 397 00:23:30,420 --> 00:23:33,730 can just see it here, as opposed to managing all these different windows 398 00:23:33,730 --> 00:23:35,940 by themselves. 
399 00:23:35,940 --> 00:23:40,530 >> I actually personally use Vim, but I feel like RStudio is just better 400 00:23:40,530 --> 00:23:44,640 for getting a good idea of how to use R. Usually, 401 00:23:44,640 --> 00:23:47,040 when you're trying to learn some new task, 402 00:23:47,040 --> 00:23:49,590 you don't want to handle too many things at once. 403 00:23:49,590 --> 00:23:53,120 So RStudio is a very-- it's a very good way to learn R 404 00:23:53,120 --> 00:23:56,760 without having to deal with all these other things. 405 00:23:56,760 --> 00:23:58,600 >> So here I'm running id("hello"). 406 00:23:58,600 --> 00:24:00,090 This returns hello. 407 00:24:00,090 --> 00:24:01,740 id(123). 408 00:24:01,740 --> 00:24:04,610 Here is a vector of integers. 409 00:24:04,610 --> 00:24:08,620 So similarly, because it can take any kind of value, 410 00:24:08,620 --> 00:24:16,060 you can do id of x so it returns 1, 2, 3, 4, and 5. 411 00:24:16,060 --> 00:24:22,210 >> And let me just show that this really is an integer. 412 00:24:22,210 --> 00:24:28,800 And similarly, if you do class of id of x, it's going to be integer. 413 00:24:28,800 --> 00:24:34,170 And then, you can also compare the two and it's TRUE. 414 00:24:34,170 --> 00:24:38,350 So I'm checking whether id of x equals equals x, and notice 415 00:24:38,350 --> 00:24:39,760 that it gives you a vector of TRUEs. 416 00:24:39,760 --> 00:24:44,280 So this is not saying that the two objects are identical, 417 00:24:44,280 --> 00:24:46,845 but that each of the entries inside the vectors is identical. 418 00:24:46,845 --> 00:24:50,000 419 00:24:50,000 --> 00:24:52,090 >> Here is bounded.compare. 420 00:24:52,090 --> 00:24:58,470 So this is a little more complicated in that it has an if condition and an else 421 00:24:58,470 --> 00:25:00,960 and then it takes two arguments this time. 422 00:25:00,960 --> 00:25:02,640 So x is of any type.
423 00:25:02,640 --> 00:25:06,280 And I say this second argument is a. 424 00:25:06,280 --> 00:25:08,380 This can be anything as well. 425 00:25:08,380 --> 00:25:12,490 But by default, it's going to take 5 if you don't specify anything. 426 00:25:12,490 --> 00:25:16,730 >> So here I'm going to say if x is greater than a. 427 00:25:16,730 --> 00:25:19,220 So if I don't specify a, it says if x is greater than 5, 428 00:25:19,220 --> 00:25:20,470 then I'm going to return TRUE. 429 00:25:20,470 --> 00:25:23,230 Else, I'm going to return FALSE. 430 00:25:23,230 --> 00:25:24,870 So let me go ahead and define this. 431 00:25:24,870 --> 00:25:30,600 432 00:25:30,600 --> 00:25:34,550 >> And now I'm going to run bounded.compare 3. 433 00:25:34,550 --> 00:25:39,150 So it asks, is 3 less than-- is 3 greater than 5. 434 00:25:39,150 --> 00:25:41,830 No, it's not, so FALSE. 435 00:25:41,830 --> 00:25:46,550 >> And bounded.compare 3, and I'm going to compare it using a equals 2. 436 00:25:46,550 --> 00:25:50,700 So now I say yes, now I want a to be something else. 437 00:25:50,700 --> 00:25:52,750 So I'm going to say, a should be 2. 438 00:25:52,750 --> 00:25:56,640 >> I can either do this kind of notation or I can say a equals 2. 439 00:25:56,640 --> 00:25:58,720 This is more readable in that when you're 440 00:25:58,720 --> 00:26:01,450 looking at these really complicated functions that 441 00:26:01,450 --> 00:26:08,110 take multiple arguments-- and this can be dozens oftentimes-- just saying 442 00:26:08,110 --> 00:26:11,140 a equals 2 is more readable for you so that later in the future 443 00:26:11,140 --> 00:26:13,020 you know what you're doing. 444 00:26:13,020 --> 00:26:17,120 >> So in this case, I'm asking is 3 greater than 2. 445 00:26:17,120 --> 00:26:18,270 Yes it is.
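A minimal sketch of `bounded.compare` with a default argument, as described above. The names `bounded.compare`, `x`, and `a` and the default of 5 come from the talk; the exact body shown on screen is an assumption.

```r
# Compare x against a threshold a; a defaults to 5 when it is not supplied.
bounded.compare <- function(x, a = 5) {
  if (x > a) {
    TRUE
  } else {
    FALSE
  }
}

bounded.compare(3)          # a is not given, so 3 is compared with 5
bounded.compare(3, 2)       # positional form: a is 2
bounded.compare(3, a = 2)   # named form: same call, easier to read later
```

The named form costs a few characters but pays off for functions that take many arguments.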
446 00:26:18,270 --> 00:26:22,350 And similarly, I can just remove this and say, is 3 greater than 2 447 00:26:22,350 --> 00:26:23,440 where a is equal to 2. 448 00:26:23,440 --> 00:26:26,230 And that's also TRUE. 449 00:26:26,230 --> 00:26:26,730 Yes? 450 00:26:26,730 --> 00:26:29,670 >> AUDIENCE: Are you executing line by line? 451 00:26:29,670 --> 00:26:30,670 >> DUSTIN TRAN: Yes I am. 452 00:26:30,670 --> 00:26:33,900 So what I'm doing here is I'm taking this text document-- 453 00:26:33,900 --> 00:26:39,825 and what's great about RStudio is that I can just run a short-- a keyboard shortcut. 454 00:26:39,825 --> 00:26:41,820 So I'm doing Control-Enter. 455 00:26:41,820 --> 00:26:44,850 >> And then, I'm taking the line in the text document 456 00:26:44,850 --> 00:26:46,710 and then putting it in the console. 457 00:26:46,710 --> 00:26:50,800 So here I say, bounded.compare and I'm doing Control-X. 458 00:26:50,800 --> 00:26:52,540 So I can just run it here too. 459 00:26:52,540 --> 00:26:54,920 And then that will take the line and then put it here. 460 00:26:54,920 --> 00:26:57,900 And then similarly, I can do run here. 461 00:26:57,900 --> 00:27:04,630 And then it will just keep evaluating the lines in the console like that. 462 00:27:04,630 --> 00:27:10,690 >> And if you also notice, the curly braces are there just like in C syntax. 463 00:27:10,690 --> 00:27:13,910 If-- the if condition is also going to use parentheses and then 464 00:27:13,910 --> 00:27:15,350 you can use else. 465 00:27:15,350 --> 00:27:17,496 Another one is else if. 466 00:27:17,496 --> 00:27:21,440 So this is going to be x equals equals a, for example. 467 00:27:21,440 --> 00:27:24,190 468 00:27:24,190 --> 00:27:26,350 And then I'm going to return something here. 469 00:27:26,350 --> 00:27:29,490 >> Notice that there are two different things going on here.
470 00:27:29,490 --> 00:27:34,360 One is that here I'm specifying return of the value TRUE. 471 00:27:34,360 --> 00:27:35,950 Here I just say x. 472 00:27:35,950 --> 00:27:39,970 So R will usually by default take the last arguments-- 473 00:27:39,970 --> 00:27:43,510 or take the last line of code, and that will be what is returned. 474 00:27:43,510 --> 00:27:46,920 So here this is the same thing as doing return x. 475 00:27:46,920 --> 00:27:49,450 476 00:27:49,450 --> 00:27:50,540 >> And just to show you. 477 00:27:50,540 --> 00:27:54,000 478 00:27:54,000 --> 00:27:57,052 And then, it will work just like that. 479 00:27:57,052 --> 00:27:58,260 So let me continue with this. 480 00:27:58,260 --> 00:28:00,630 >> So else if. 481 00:28:00,630 --> 00:28:04,060 And actually, I can return anything I'd like. 482 00:28:04,060 --> 00:28:06,680 So I don't even have to return Booleans at all, 483 00:28:06,680 --> 00:28:08,410 I can just return something else. 484 00:28:08,410 --> 00:28:10,670 So I can do return bear. 485 00:28:10,670 --> 00:28:12,989 >> So if x equals equals a, it's going to return bear. 486 00:28:12,989 --> 00:28:14,530 Otherwise, it's going to return TRUE. 487 00:28:14,530 --> 00:28:19,310 I can also return a vector or really anything. 488 00:28:19,310 --> 00:28:22,210 >> And usually in statically typed languages, 489 00:28:22,210 --> 00:28:23,840 you'd have to specify a type here. 490 00:28:23,840 --> 00:28:25,750 And notice that it can just be anything. 491 00:28:25,750 --> 00:28:32,400 And R is smart enough that you just do this and it will work fine. 492 00:28:32,400 --> 00:28:33,620 >> So let me define this. 493 00:28:33,620 --> 00:28:39,460 494 00:28:39,460 --> 00:28:41,230 Unexpected-- oh sorry. 495 00:28:41,230 --> 00:28:44,336 There should be a curly brace here. 496 00:28:44,336 --> 00:28:44,836 OK. 497 00:28:44,836 --> 00:28:45,336 Cool.
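A sketch of the else-if version described above, showing that an explicit return() and a bare last expression behave the same, and that branches may return different types. The string in the middle branch and the FALSE in the final branch are assumptions; the on-screen code is not recoverable from the transcript.

```r
bounded.compare <- function(x, a = 5) {
  if (x > a) {
    return(TRUE)   # explicit return
  } else if (x == a) {
    "bear"         # no return(): the last evaluated expression is the result
  } else {
    FALSE          # assumed value for the final branch
  }
}

bounded.compare(7)          # first branch
bounded.compare(3, a = 3)   # middle branch returns a character string
bounded.compare(1)          # final branch
```

Nothing in the definition declares a return type, so one call can give a logical and another a string.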
498 00:28:45,336 --> 00:28:52,580 499 00:28:52,580 --> 00:28:54,530 All right. 500 00:28:54,530 --> 00:28:58,250 So now let's compare 3 with a equals 3. 501 00:28:58,250 --> 00:29:01,860 So it should return-- yeah-- the value bear. 502 00:29:01,860 --> 00:29:06,740 >> So now the more general question is, what about other data structures. 503 00:29:06,740 --> 00:29:09,110 So you have this function. 504 00:29:09,110 --> 00:29:15,360 This is going to work on any kind of value like 3 or any numeric, 505 00:29:15,360 --> 00:29:17,500 in other words, a double. 506 00:29:17,500 --> 00:29:19,330 >> But what about something like a vector. 507 00:29:19,330 --> 00:29:27,750 So what happens if you do-- so I'm going to assign val to, say, 4 to 6. 508 00:29:27,750 --> 00:29:31,640 So if I return this, it's the vector 4, 5, 6. 509 00:29:31,640 --> 00:29:34,935 >> Now let's see what happens if I do bounded.compare val. 510 00:29:34,935 --> 00:29:37,680 511 00:29:37,680 --> 00:29:42,450 So this is going to give you 15 1251. 512 00:29:42,450 --> 00:29:46,440 So in other words, it's saying if you look at this condition 513 00:29:46,440 --> 00:29:50,040 so it says x is less than a or something. 514 00:29:50,040 --> 00:29:51,880 So this is a little confusing because now 515 00:29:51,880 --> 00:29:53,379 you just don't know what's going on. 516 00:29:53,379 --> 00:29:58,690 So I guess one thing that's really nice when you're just trying to debug 517 00:29:58,690 --> 00:30:04,600 is that you can just do val is greater than a and see what happens there. 518 00:30:04,600 --> 00:30:09,720 >> So val-- a is by default 5 so let's just do val greater than 5. 519 00:30:09,720 --> 00:30:14,280 So this is the vector FALSE FALSE TRUE. 520 00:30:14,280 --> 00:30:17,206 So now when you're looking at this, it's going to say if, 521 00:30:17,206 --> 00:30:20,080 and then you're going to give it this vector of FALSE FALSE TRUE.
522 00:30:20,080 --> 00:30:23,450 >> So when you pass this into R, R has no idea what you're doing. 523 00:30:23,450 --> 00:30:26,650 Because it expects one single value, which is a Boolean, and now 524 00:30:26,650 --> 00:30:29,420 you're giving it a vector of Booleans. 525 00:30:29,420 --> 00:30:31,970 So by default, R is going to say what the heck, 526 00:30:31,970 --> 00:30:35,440 I'm going to assume that you're going to take the first element here. 527 00:30:35,440 --> 00:30:38,320 So I'm going to say-- I'm going to assume that this is FALSE. 528 00:30:38,320 --> 00:30:40,890 So it's going to say no, this is not right. 529 00:30:40,890 --> 00:30:45,246 >> Similarly, it's going to be val equals equals a. 530 00:30:45,246 --> 00:30:47,244 No, sorry 5. 531 00:30:47,244 --> 00:30:48,910 And it's also going to be FALSE as well. 532 00:30:48,910 --> 00:30:52,410 So it's going to say no, that's not true as well so it's 533 00:30:52,410 --> 00:30:53,680 going to return this last one. 534 00:30:53,680 --> 00:30:56,420 535 00:30:56,420 --> 00:31:01,360 >> So this is either a good thing or a bad thing, depending on how you look at it. 536 00:31:01,360 --> 00:31:05,104 Because when you're building these functions, 537 00:31:05,104 --> 00:31:06,770 you don't really know what's going on. 538 00:31:06,770 --> 00:31:10,210 So sometimes you'd want an error, or maybe you just want a warning. 539 00:31:10,210 --> 00:31:12,160 In this case, R doesn't do that. 540 00:31:12,160 --> 00:31:14,300 So it's really up to you based off of what 541 00:31:14,300 --> 00:31:17,310 you think the language should do in this case 542 00:31:17,310 --> 00:31:22,920 if you pass in a vector of Booleans when you're doing an if condition. 543 00:31:22,920 --> 00:31:31,733 >> So let's say that you originally had the one with if else return TRUE and you're 544 00:31:31,733 --> 00:31:34,190 going to return FALSE.
545 00:31:34,190 --> 00:31:39,300 So one way of abstracting this is to say I 546 00:31:39,300 --> 00:31:41,530 don't even need this conditional thing. 547 00:31:41,530 --> 00:31:47,220 Another thing I can do is just return the values themselves. 548 00:31:47,220 --> 00:31:53,240 So if you notice, if you do val is greater than 5, 549 00:31:53,240 --> 00:31:56,350 this is going to return the vector FALSE FALSE TRUE. 550 00:31:56,350 --> 00:31:58,850 >> Maybe this is what you want for bounded.compare. 551 00:31:58,850 --> 00:32:02,940 You want to return a vector of Booleans where it compares each of the values 552 00:32:02,940 --> 00:32:04,190 by themselves. 553 00:32:04,190 --> 00:32:11,165 So you can just do bounded.compare, function x, a equals 5. 554 00:32:11,165 --> 00:32:13,322 555 00:32:13,322 --> 00:32:15,363 And then instead of doing this if else condition, 556 00:32:15,363 --> 00:32:21,430 I'm just going to return x is greater than 5. 557 00:32:21,430 --> 00:32:23,620 So if it's true, then it's going to return TRUE. 558 00:32:23,620 --> 00:32:26,830 And then if not, it's going to return FALSE. 559 00:32:26,830 --> 00:32:30,880 >> And this works for any of these structures. 560 00:32:30,880 --> 00:32:41,450 So I can do bounded.compare of c 1, 6, 9 and then I'm going to say a equals 6, 561 00:32:41,450 --> 00:32:42,799 for example. 562 00:32:42,799 --> 00:32:44,840 And then it's going to give you the right Boolean 563 00:32:44,840 --> 00:32:48,240 vector that you designed it for. 564 00:32:48,240 --> 00:32:50,660 >> So those are just functions and now let me just 565 00:32:50,660 --> 00:32:54,980 show some interactive visuals. 566 00:32:54,980 --> 00:32:59,700 I don't think I really have Wi-Fi here so let me just go ahead 567 00:32:59,700 --> 00:33:01,970 and skip this one I guess.
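A minimal sketch of the vectorized rewrite described above: dropping the if/else and returning the comparison itself. Using `a` rather than a literal 5 in the body is an assumption, made so that the `a = 6` call in the talk has an effect. Note that the talk predates R 4.2.0; from that release on, an if condition of length greater than one is an error rather than silently using the first element, which is one more reason to prefer this form.

```r
# Element-wise comparison: works for single values and for whole vectors.
bounded.compare <- function(x, a = 5) {
  x > a
}

val <- 4:6
bounded.compare(val)                 # one logical per element of val
bounded.compare(c(1, 6, 9), a = 6)   # compares each of 1, 6, 9 with 6
bounded.compare(3)                   # still fine for a single value
```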
568 00:33:01,970 --> 00:33:05,260 >> But one thing that's cool though is that if you just 569 00:33:05,260 --> 00:33:09,600 want to test a bunch of different data commands, 570 00:33:09,600 --> 00:33:13,320 there's a bunch of different datasets that are already preloaded in R. 571 00:33:13,320 --> 00:33:15,770 So one of them is called the iris dataset. 572 00:33:15,770 --> 00:33:18,910 This is one of the most well-known ones in machine learning. 573 00:33:18,910 --> 00:33:23,350 You'll usually just use it as some kind of test case to see if your code runs. 574 00:33:23,350 --> 00:33:27,520 So let's just check what iris is. 575 00:33:27,520 --> 00:33:33,130 >> So this thing is going to be a data frame. 576 00:33:33,130 --> 00:33:36,000 And it's kind of long because I just printed out iris. 577 00:33:36,000 --> 00:33:38,810 It's printing out the whole thing. 578 00:33:38,810 --> 00:33:42,830 So it has all these different names. 579 00:33:42,830 --> 00:33:45,505 So iris is a collection of different flowers. 580 00:33:45,505 --> 00:33:48,830 In this case, it's telling you the species of each, 581 00:33:48,830 --> 00:33:54,760 and all these different widths and lengths of the sepal and petal. 582 00:33:54,760 --> 00:33:58,880 >> And so usually, if you want to print iris, 583 00:33:58,880 --> 00:34:03,680 for example, you don't want it to print all of this because that can take over 584 00:34:03,680 --> 00:34:05,190 your entire console. 585 00:34:05,190 --> 00:34:09,280 So one thing that's really nice is the head function. 586 00:34:09,280 --> 00:34:12,929 So if you just do head iris, this will give you 587 00:34:12,929 --> 00:34:17,389 the first five rows, or six I guess. 588 00:34:17,389 --> 00:34:19,909 And then also, you can just specify it here. 589 00:34:19,909 --> 00:34:22,914 So 20-- this will give you the first 20 rows.
590 00:34:22,914 --> 00:34:24,830 And I was actually kind of surprised that this 591 00:34:24,830 --> 00:34:28,770 gave me six so let me go ahead and check iris-- or head, sorry. 592 00:34:28,770 --> 00:34:31,699 593 00:34:31,699 --> 00:34:34,960 And here it will give you the documentation 594 00:34:34,960 --> 00:34:37,960 of what head does. 595 00:34:37,960 --> 00:34:40,839 So it returns the first or last parts of an object. 596 00:34:40,839 --> 00:34:42,630 And then I'm going to look at the defaults. 597 00:34:42,630 --> 00:34:47,340 And then it says the default method is head of x with n equals 6L. 598 00:34:47,340 --> 00:34:50,620 So this returns the first six elements. 599 00:34:50,620 --> 00:34:55,050 And similarly if you notice here, I didn't have to specify n equals 6. 600 00:34:55,050 --> 00:34:56,840 By default it uses six, I guess. 601 00:34:56,840 --> 00:35:00,130 And then, if I want to specify some value, then I can do that as well. 602 00:35:00,130 --> 00:35:02,970 603 00:35:02,970 --> 00:35:10,592 >> So those are some simple commands and here's another one which just-- well, 604 00:35:10,592 --> 00:35:12,550 I can-- this is actually a little more complicated, 605 00:35:12,550 --> 00:35:17,130 but this will just take the class of each column of the iris dataset. 606 00:35:17,130 --> 00:35:20,910 So this will show you what each of these columns is in terms of their types. 607 00:35:20,910 --> 00:35:23,665 So sepal length is numeric, sepal width is numeric. 608 00:35:23,665 --> 00:35:26,540 All these values are just numeric because you can tell from this data 609 00:35:26,540 --> 00:35:29,440 structure that these are all going to be numeric. 610 00:35:29,440 --> 00:35:34,310 >> And the Species column is going to be a factor. 611 00:35:34,310 --> 00:35:37,270 So usually, you would think that this is like a character string.
612 00:35:37,270 --> 00:35:48,830 But if you just do iris$Species, and then I'm going to do head 5, 613 00:35:48,830 --> 00:35:51,820 this is going to print the first five values. 614 00:35:51,820 --> 00:35:54,150 >> And then notice this Levels. 615 00:35:54,150 --> 00:35:58,870 So this is saying-- this is R's way of having categorical variables. 616 00:35:58,870 --> 00:36:03,765 So instead of just having character strings, 617 00:36:03,765 --> 00:36:06,740 it has levels specifying which of these things it is. 618 00:36:06,740 --> 00:36:12,450 >> So let's say iris$Species[1]. 619 00:36:12,450 --> 00:36:17,690 So what I'm doing here is I'm subsetting to this Species column. 620 00:36:17,690 --> 00:36:21,480 So this takes the Species column and then 621 00:36:21,480 --> 00:36:23,820 it indexes to get the first element. 622 00:36:23,820 --> 00:36:27,140 So this should give you setosa. 623 00:36:27,140 --> 00:36:28,710 And it also gives you the levels here. 624 00:36:28,710 --> 00:36:32,812 >> So you can also compare this to the character setosa 625 00:36:32,812 --> 00:36:34,645 and this is not going to be TRUE because one 626 00:36:34,645 --> 00:36:37,940 is of a different type than the other. 627 00:36:37,940 --> 00:36:40,590 Or I guess it is TRUE because R is smarter than that. 628 00:36:40,590 --> 00:36:45,420 And it looks at this and then says, maybe this is what you want. 629 00:36:45,420 --> 00:36:51,860 So it's going to say the character string setosa is the same as this one. 630 00:36:51,860 --> 00:37:01,290 And then similarly, you can also just grab these as well. 631 00:37:01,290 --> 00:37:05,580 >> So those are just some quick commands on the dataset. 632 00:37:05,580 --> 00:37:08,030 So here is some data exploration. 633 00:37:08,030 --> 00:37:11,360 So this is a little more involved with data analysis. 634 00:37:11,360 --> 00:37:18,340 And this is taken from some bootcamp in R from Berkeley.
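The iris commands walked through above can be sketched as follows. `iris`, `head`, and the `$` operator are named in the talk; using `sapply(iris, class)` for the per-column classes is an assumption, since the exact call on screen is not in the transcript.

```r
head(iris)            # first six rows, because the default is n = 6L
head(iris, n = 20)    # first 20 rows

sapply(iris, class)   # class of each column: four numeric, one factor

head(iris$Species, 5)      # first five values, printed together with the levels
first <- iris$Species[1]   # subset to the column, then index the first element
levels(iris$Species)       # the categories the factor can take

first == "setosa"     # a factor value can be compared with a character string
```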
635 00:37:18,340 --> 00:37:20,790 >> So library foreign. 636 00:37:20,790 --> 00:37:24,880 So I'm going to load in this library called foreign. 637 00:37:24,880 --> 00:37:32,460 So this is going to give me read.dta, so assume that I have this dataset. 638 00:37:32,460 --> 00:37:39,000 This is stored in the current working directory of my console. 639 00:37:39,000 --> 00:37:42,190 So let's just see what the working directory is. 640 00:37:42,190 --> 00:37:44,620 >> So here's my working directory. 641 00:37:44,620 --> 00:37:50,040 And read.dta, this thing, is saying this file 642 00:37:50,040 --> 00:37:54,650 is located in the data folder of this current working directory. 643 00:37:54,650 --> 00:38:00,520 And this read.dta is not a default command. 644 00:38:00,520 --> 00:38:02,760 I think I loaded it in already. 645 00:38:02,760 --> 00:38:04,750 I-- I think I loaded this in already. 646 00:38:04,750 --> 00:38:08,115 >> But so read.dta is not going to be a default command. 647 00:38:08,115 --> 00:38:11,550 And that's why you're going to have to load in this library package-- 648 00:38:11,550 --> 00:38:14,500 this package called foreign. 649 00:38:14,500 --> 00:38:16,690 And if you don't have the package, I think 650 00:38:16,690 --> 00:38:19,180 foreign is one of the built-in ones. 651 00:38:19,180 --> 00:38:31,150 Otherwise, you can also do install.packages 652 00:38:31,150 --> 00:38:33,180 and this will install the package. 653 00:38:33,180 --> 00:38:36,878 And this will give you R-- uh, no. 654 00:38:36,878 --> 00:38:39,830 655 00:38:39,830 --> 00:38:43,140 And then I'm just going to stop this because I already have it. 656 00:38:43,140 --> 00:38:46,920 >> But what's nice about R is that the package management 657 00:38:46,920 --> 00:38:48,510 system is very elegant. 658 00:38:48,510 --> 00:38:52,470 Because it will store everything really nicely for you.
659 00:38:52,470 --> 00:38:59,780 So in this case, it's going to store it in, I believe, this library here. 660 00:38:59,780 --> 00:39:02,390 >> So anytime you want to install new packages, 661 00:39:02,390 --> 00:39:04,980 it's just as simple as doing install.packages 662 00:39:04,980 --> 00:39:07,500 and R will manage all the packages for you. 663 00:39:07,500 --> 00:39:12,900 So you don't have to do something like in Python, where you have external package 664 00:39:12,900 --> 00:39:15,330 managers like pip or Anaconda where you're 665 00:39:15,330 --> 00:39:18,310 doing-- installing packages outside of Python 666 00:39:18,310 --> 00:39:20,940 and then trying to run them yourself. 667 00:39:20,940 --> 00:39:22,210 So this is a really nice way. 668 00:39:22,210 --> 00:39:25,590 >> And install.packages requires the internet. 669 00:39:25,590 --> 00:39:31,950 It takes it from a server, and the repository that 670 00:39:31,950 --> 00:39:33,960 collects all the packages is called CRAN. 671 00:39:33,960 --> 00:39:40,690 And you can specify which mirror you want to download the packages from. 672 00:39:40,690 --> 00:39:43,420 >> So here I am taking this dataset. 673 00:39:43,420 --> 00:39:46,240 I'm reading it in using this function. 674 00:39:46,240 --> 00:39:49,360 So let me go ahead and do that. 675 00:39:49,360 --> 00:39:52,900 >> So let's assume that you have this dataset 676 00:39:52,900 --> 00:39:55,550 and you have absolutely no idea what it is. 677 00:39:55,550 --> 00:39:58,560 And this actually comes up fairly often in industry 678 00:39:58,560 --> 00:40:00,910 where you just have these tons and tons of messy things 679 00:40:00,910 --> 00:40:02,890 and they're incredibly unlabeled. 680 00:40:02,890 --> 00:40:06,380 So here I have this dataset and I don't know 681 00:40:06,380 --> 00:40:08,400 what it is so I'm just going to check it out. 682 00:40:08,400 --> 00:40:10,620 >> So I'm going to do head first.
683 00:40:10,620 --> 00:40:14,190 So I look at the first six rows of what this dataset is. 684 00:40:14,190 --> 00:40:21,730 So this is state, pres04, and then all these different kinds of columns. 685 00:40:21,730 --> 00:40:25,612 And what's interesting here, I guess, is that 686 00:40:25,612 --> 00:40:27,945 you would assume that this looks like some kind of election. 687 00:40:27,945 --> 00:40:30,482 688 00:40:30,482 --> 00:40:32,190 And I guess just from looking at the file 689 00:40:32,190 --> 00:40:41,070 name this is some kind of collection of data on the candidates or the voters 690 00:40:41,070 --> 00:40:44,920 who voted for specific presidents or presidential candidates 691 00:40:44,920 --> 00:40:46,550 in the 2004 election. 692 00:40:46,550 --> 00:40:52,920 >> So here are the values 1, 2, so one way of storing 693 00:40:52,920 --> 00:40:56,540 presidential candidates is by their names. 694 00:40:56,540 --> 00:40:59,780 In this case, it looks like they're just integer values. 695 00:40:59,780 --> 00:41:04,030 So in 2004, it was Bush versus Kerry I believe. 696 00:41:04,030 --> 00:41:09,010 And now, let's say you just don't know whether 1 corresponds to Bush or 2 697 00:41:09,010 --> 00:41:11,703 corresponds to Kerry or so on and so on, right? 698 00:41:11,703 --> 00:41:15,860 >> And this is, just to me, a fairly common problem. 699 00:41:15,860 --> 00:41:18,230 So what can you do in this case? 700 00:41:18,230 --> 00:41:20,000 So let's check all these other things. 701 00:41:20,000 --> 00:41:22,790 >> state, I'm assuming this comes from different states. 702 00:41:22,790 --> 00:41:25,100 partyid, income. 703 00:41:25,100 --> 00:41:27,710 Let's look at partyid. 704 00:41:27,710 --> 00:41:32,800 So maybe one thing you can do is look at each of the observations 705 00:41:32,800 --> 00:41:36,250 that have a partyid of Republican or Democrat or something. 706 00:41:36,250 --> 00:41:38,170 So let's just see what partyid is.
707 00:41:38,170 --> 00:41:41,946 >> So I'm going to take dat and then I'm going to 708 00:41:41,946 --> 00:41:47,960 do this dollar sign operator that I did earlier 709 00:41:47,960 --> 00:41:50,770 and this is going to subset to that column. 710 00:41:50,770 --> 00:41:57,760 And then I'm going to head this by 20, just to see what this looks like. 711 00:41:57,760 --> 00:42:00,170 >> So this is just a bunch of NAs. 712 00:42:00,170 --> 00:42:02,800 So in other words, you have missing data on these guys. 713 00:42:02,800 --> 00:42:08,100 But you also notice this dat$partyid is a factor 714 00:42:08,100 --> 00:42:10,030 so this gives you different categories. 715 00:42:10,030 --> 00:42:14,170 So in other words, partyid can take Democrat, Republican, Independent, 716 00:42:14,170 --> 00:42:16,640 or something else. 717 00:42:16,640 --> 00:42:23,940 >> So let's go ahead and let's see which of these is-- oh, OK. 718 00:42:23,940 --> 00:42:28,480 So I'm going to subset to partyid and then 719 00:42:28,480 --> 00:42:32,780 look at which ones are Democrat, for example. 720 00:42:32,780 --> 00:42:37,150 This is going to give you a Boolean, a huge Boolean vector of TRUEs and FALSEs. 721 00:42:37,150 --> 00:42:41,630 >> And now, let's say I want to subset to these guys. 722 00:42:41,630 --> 00:42:47,260 So this is going to take my dat and subset to whichever observations 723 00:42:47,260 --> 00:42:48,910 have partyid equals equals Democrat. 724 00:42:48,910 --> 00:42:52,830 725 00:42:52,830 --> 00:42:55,180 And this is quite long because there are so many of them. 726 00:42:55,180 --> 00:42:59,060 So now, I'm going to head this by 20. 727 00:42:59,060 --> 00:43:05,690 728 00:43:05,690 --> 00:43:11,270 >> And as you notice, equals equals is interesting in that you're 729 00:43:11,270 --> 00:43:13,250 already-- you're also including the NAs.
730 00:43:13,250 --> 00:43:19,010 So in this case, you still can't get any information because now you have NAs 731 00:43:19,010 --> 00:43:22,650 and you just want to see which of the observations correspond to Democrat 732 00:43:22,650 --> 00:43:24,670 and not these missing values themselves. 733 00:43:24,670 --> 00:43:27,680 So how would you get rid of these NAs? 734 00:43:27,680 --> 00:43:36,410 >> So here I'm just using the up key and then moving my cursor around. 735 00:43:36,410 --> 00:43:39,778 And then here I'm just going to say is.na dat$partyid. 736 00:43:39,778 --> 00:43:48,970 737 00:43:48,970 --> 00:43:52,720 So this and-and takes two different Boolean vectors 738 00:43:52,720 --> 00:43:57,160 and says what TRUE and FALSE is going to be, for example. 739 00:43:57,160 --> 00:43:59,190 So it's going to do this component-wise. 740 00:43:59,190 --> 00:44:02,910 So here I say take the data frame, subset 741 00:44:02,910 --> 00:44:10,170 to those that correspond to Democrat, and remove any of them that are NA. 742 00:44:10,170 --> 00:44:13,540 >> So this will-- this should give you something. 743 00:44:13,540 --> 00:44:16,540 744 00:44:16,540 --> 00:44:17,600 Let's check is.na. 745 00:44:17,600 --> 00:44:24,670 746 00:44:24,670 --> 00:44:27,690 Let's try is.na dat$partyid. 747 00:44:27,690 --> 00:44:36,290 748 00:44:36,290 --> 00:44:45,290 And this should give you-- sorry-- just a Boolean vector. 749 00:44:45,290 --> 00:44:49,260 And then, because it's long, I'm going to subset to 20. 750 00:44:49,260 --> 00:44:49,760 OK. 751 00:44:49,760 --> 00:44:51,570 So this should work. 752 00:44:51,570 --> 00:44:54,700 >> And this one should also be TRUEs. 753 00:44:54,700 --> 00:45:01,830 Ah, so my mistake here is that I'm-- I use C++ and R interchangeably so I make 754 00:45:01,830 --> 00:45:03,590 this mistake all the time. 755 00:45:03,590 --> 00:45:05,807 The single and operator is actually the one you want.
756 00:45:05,807 --> 00:45:08,140 You don't want to use two ampersands, just one. 757 00:45:08,140 --> 00:45:14,970 758 00:45:14,970 --> 00:45:17,010 OK. 759 00:45:17,010 --> 00:45:18,140 >> So let's see. 760 00:45:18,140 --> 00:45:20,930 761 00:45:20,930 --> 00:45:23,920 So we subsetted to partyid where they're Democrat 762 00:45:23,920 --> 00:45:25,300 and not missing values. 763 00:45:25,300 --> 00:45:27,690 And now let's look at which ones they voted for. 764 00:45:27,690 --> 00:45:31,530 So it looks like most of them voted for 1. 765 00:45:31,530 --> 00:45:36,090 So I'm going to go ahead and say that's Kerry. 766 00:45:36,090 --> 00:45:39,507 >> And similarly, you can also go to Republican 767 00:45:39,507 --> 00:45:41,090 and hopefully, this should give you 2. 768 00:45:41,090 --> 00:45:49,730 769 00:45:49,730 --> 00:45:51,770 It's just a bunch of different columns. 770 00:45:51,770 --> 00:45:53,070 And indeed, it's 2. 771 00:45:53,070 --> 00:45:55,750 So with partyid all Republican, most of them are voting for 2. 772 00:45:55,750 --> 00:45:58,390 >> So it looks like, just by looking at this, 773 00:45:58,390 --> 00:46:00,600 Republican is going to be a very-- or partyid 774 00:46:00,600 --> 00:46:02,790 is going to be a very significant factor in determining 775 00:46:02,790 --> 00:46:05,420 which candidate they're going to vote for. 776 00:46:05,420 --> 00:46:07,120 And this is obviously true in general. 777 00:46:07,120 --> 00:46:10,139 And this matches your intuition, of course. 778 00:46:10,139 --> 00:46:11,930 So it looks like I'm running out of time so 779 00:46:11,930 --> 00:46:17,040 let me just go ahead and show some quick pictures. 780 00:46:17,040 --> 00:46:21,120 So here is something that's a little more complicated with visualization.
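A small sketch of the NA-aware subsetting described above. The data frame here is a hypothetical five-row stand-in for the election dataset, whose file is not available; the column names `partyid` and `pres04` come from the talk, but the values and the spelling of the factor levels are assumptions.

```r
# Hypothetical stand-in for the election data (values invented for illustration).
dat <- data.frame(
  partyid = factor(c("democrat", NA, "republican", "democrat", NA)),
  pres04  = c(1, 2, 2, 1, 1)
)

is.dem <- dat$partyid == "democrat"   # comparing with NA gives NA, not FALSE
dat[is.dem, ]                         # NA entries come through as all-NA rows

# Element-wise "and" uses a single &; && is meant for single values only.
keep <- is.dem & !is.na(dat$partyid)
dat[keep, ]                           # only the non-missing Democrat rows
table(dat$pres04[keep])               # how that group voted
```

Since `NA & FALSE` is `FALSE`, combining the comparison with `!is.na(...)` leaves no NA in the index.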
781 00:46:21,120 --> 00:46:26,450 So in this case, this was a very simple analysis of just checking what 782 00:46:26,450 --> 00:46:28,500 pres04 is. 783 00:46:28,500 --> 00:46:33,920 >> So in this case, let's say you wanted to answer this question. 784 00:46:33,920 --> 00:46:38,540 So suppose we wanted to know the voting behavior in the 2004 presidential election 785 00:46:38,540 --> 00:46:41,170 and how that varies by race. 786 00:46:41,170 --> 00:46:44,380 So not only do you want to see the voting behavior, 787 00:46:44,380 --> 00:46:47,860 but you want to subset by each race and kind of summarize that. 788 00:46:47,860 --> 00:46:50,770 And you can just tell by this complicated notation 789 00:46:50,770 --> 00:46:52,580 that this is kind of getting hazy. 790 00:46:52,580 --> 00:46:56,390 >> So one of R's more advanced packages that's also kind of recent 791 00:46:56,390 --> 00:47:00,070 is called dplyr. 792 00:47:00,070 --> 00:47:03,060 So it's this one right here. 793 00:47:03,060 --> 00:47:08,080 And ggg-- ggplot2 is just a nice way of making better visualizations 794 00:47:08,080 --> 00:47:09,400 than the built-in one. 795 00:47:09,400 --> 00:47:11,108 >> So I'm going to load these two libraries. 796 00:47:11,108 --> 00:47:13,200 797 00:47:13,200 --> 00:47:16,950 And then, I'm going to go ahead and run this command. 798 00:47:16,950 --> 00:47:19,050 You can just treat this as a black box. 799 00:47:19,050 --> 00:47:23,460 >> What's happening is that this pipe operator is passing this argument 800 00:47:23,460 --> 00:47:24,110 into here. 801 00:47:24,110 --> 00:47:28,070 So I'm saying group dat by race and then pres04. 802 00:47:28,070 --> 00:47:31,530 And then, all these other commands are filtering and then summarizing 803 00:47:31,530 --> 00:47:34,081 where I'm doing counts and then I'm plotting here. 804 00:47:34,081 --> 00:47:39,980 805 00:47:39,980 --> 00:47:42,500 OK cool.
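The dplyr pipeline itself is not visible in the transcript, so the sketch below swaps in base R's `table()` to show the same idea of counting votes within each race group. The data frame is a hypothetical stand-in with invented values; only the column names `race` and `pres04` come from the talk.

```r
# Hypothetical stand-in for the election data (values invented for illustration).
dat <- data.frame(
  race   = c("white", "white", "black", "black", "hispanic", "white"),
  pres04 = c(2, 1, 1, 1, 2, NA)
)

# Cross-tabulate race by vote; rows with a missing vote are dropped by default.
counts <- as.data.frame(table(race = dat$race, pres04 = dat$pres04))
counts   # one row per (race, pres04) combination, with its count in Freq
```

A long table of this shape (one row per group, one count column) is also the form that plotting functions for grouped bar charts typically expect.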
806 00:47:42,500 --> 00:47:44,620 So let's go ahead and see what this looks like. 807 00:47:44,620 --> 00:47:52,280 808 00:47:52,280 --> 00:47:57,290 >> So what's happening here is that I just plotted each of the races and then 809 00:47:57,290 --> 00:47:59,670 which one they voted for. 810 00:47:59,670 --> 00:48:03,492 And these two different values correspond to 2 and 1. 811 00:48:03,492 --> 00:48:05,325 If you want to be more elegant, you can also 812 00:48:05,325 --> 00:48:11,770 just specify that 2 is Kerry-- or 2 is Bush, and then 1 is Kerry. 813 00:48:11,770 --> 00:48:13,700 And you can also have that in your legend. 814 00:48:13,700 --> 00:48:17,410 >> And you can also split these bar graphs. 815 00:48:17,410 --> 00:48:19,480 Because one thing is that, as you notice, 816 00:48:19,480 --> 00:48:24,560 it's not very easy to tell which of these two values is larger. 817 00:48:24,560 --> 00:48:27,920 So one thing you'd want to do is take this blue area 818 00:48:27,920 --> 00:48:31,855 and just move it over here so you can compare these two side by side. 819 00:48:31,855 --> 00:48:34,480 And I guess that's something I don't have time to do right now, 820 00:48:34,480 --> 00:48:36,660 but that's also very easy to do. 821 00:48:36,660 --> 00:48:40,310 You can just look at the man pages of ggplot. 822 00:48:40,310 --> 00:48:47,170 So you can just pull up ggplot like this and read through this man page. 823 00:48:47,170 --> 00:48:51,920 >> So let me just quickly show some cool things. 824 00:48:51,920 --> 00:48:57,610 Let's go ahead and go to-- just an application of machine learning. 825 00:48:57,610 --> 00:49:02,450 So let's say we have these three packages, so I'm going to load these in. 826 00:49:02,450 --> 00:49:05,500 827 00:49:05,500 --> 00:49:09,170 So this just prints out some information after I load something in.
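The two refinements mentioned for the bar chart (naming the candidates in the legend and putting the bars side by side) were not shown in the demo. One possible way to do both in ggplot2 is sketched below; the counts are invented and the column names are assumptions.

```r
library(ggplot2)

# Hypothetical counts per race and candidate code (made-up numbers).
counts <- data.frame(
  race   = rep(c("black", "hispanic", "white"), each = 2),
  pres04 = rep(c(1, 2), times = 3),
  count  = c(30, 5, 12, 9, 40, 55)
)

# Turn the numeric codes into labelled factor levels so the legend reads
# "Kerry" / "Bush" instead of 1 / 2 (the coding stated in the talk).
counts$candidate <- factor(counts$pres04, levels = c(1, 2),
                           labels = c("Kerry", "Bush"))

# position = "dodge" draws the bars for each race next to each other
# instead of stacking them, which makes the two values easy to compare.
p <- ggplot(counts, aes(x = race, y = count, fill = candidate)) +
  geom_col(position = "dodge") +
  labs(x = "Race", y = "Number of respondents", fill = "Voted for")

print(p)
```

Dropping the `position` argument gives the default stacked bars for comparison.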
828 00:49:09,170 --> 00:49:15,220 So I'm saying read.csv on this, this CSV, and now 829 00:49:15,220 --> 00:49:18,940 I'm going to go ahead and look and see what's in this CSV. 830 00:49:18,940 --> 00:49:22,080 >> So the first 20 observations. 831 00:49:22,080 --> 00:49:27,190 So I just have X1, X2, and Y. So it looks like a bunch of these values 832 00:49:27,190 --> 00:49:31,640 are ranging from maybe 20 to 80 or so. 833 00:49:31,640 --> 00:49:37,700 And then similarly for X2, and then this Y appears to be labels 0 and 1. 834 00:49:37,700 --> 00:49:49,500 >> To verify this, I can just do summary on data X1. 835 00:49:49,500 --> 00:49:51,660 And then similarly for all these other columns. 836 00:49:51,660 --> 00:49:55,300 So summary is a quick way of just showing you quick values. 837 00:49:55,300 --> 00:49:56,330 Oh, sorry. 838 00:49:56,330 --> 00:49:58,440 This one should be Y. 839 00:49:58,440 --> 00:50:03,420 >> So in this case, it gives the quantiles, medians, maxes as well. 840 00:50:03,420 --> 00:50:07,130 In this case, data Y, you can see that it's only going to be 0 and 1. 841 00:50:07,130 --> 00:50:10,100 Also, the mean is saying 0.6, which just means that it 842 00:50:10,100 --> 00:50:13,380 looks like I have more 1s than 0s. 843 00:50:13,380 --> 00:50:16,160 >> So let me go ahead and show what this looks like. 844 00:50:16,160 --> 00:50:17,470 So I'm just going to plot this. 845 00:50:17,470 --> 00:50:22,852 846 00:50:22,852 --> 00:50:24,636 Let's see how to clear this. 847 00:50:24,636 --> 00:50:30,492 848 00:50:30,492 --> 00:50:31,468 Oh, OK. 849 00:50:31,468 --> 00:50:35,840 850 00:50:35,840 --> 00:50:36,340 OK. 851 00:50:36,340 --> 00:50:37,590 >> So this is what it looks like. 852 00:50:37,590 --> 00:50:46,310 So it looks like yellow I specified as 0, and then red I specified as 1s.
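The inspection steps above (first rows, `summary`, reading the mean of a 0/1 column as a proportion, then a coloured scatter plot) can be reproduced on synthetic data. The sketch below builds its own data frame rather than reading the file used in the demo; the values are invented, and only the column names X1, X2 and Y come from the talk.

```r
set.seed(1)

# Synthetic stand-in: two numeric columns in roughly the 20-80 range and a
# 0/1 label column with 40 zeros and 60 ones.
data <- data.frame(
  X1 = runif(100, min = 20, max = 80),
  X2 = runif(100, min = 20, max = 80),
  Y  = rep(c(0, 1), times = c(40, 60))
)

head(data, 20)     # first 20 observations
summary(data$X1)   # min, quartiles, median, mean, max
summary(data$Y)

# For a 0/1 column the mean is the share of 1s, so 0.6 means 60% of the
# rows are labelled 1.
share_of_ones <- mean(data$Y)

# Scatter plot with one colour per label: yellow for 0, red for 1.
plot(data$X1, data$X2,
     col = ifelse(data$Y == 1, "red", "yellow"), pch = 19,
     xlab = "X1", ylab = "X2")
```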
853 00:50:46,310 --> 00:50:52,190 So here it looks like the points are labeled, and 854 00:50:52,190 --> 00:50:56,410 it looks like you'd just want some sort of clustering on this. 855 00:50:56,410 --> 00:51:01,020 >> And let me just go ahead and show some of these built-in functions. 856 00:51:01,020 --> 00:51:03,580 So here's lm. 857 00:51:03,580 --> 00:51:06,060 So this is just trying to fit a line to this. 858 00:51:06,060 --> 00:51:08,640 So what's the best way that I can fit a line such 859 00:51:08,640 --> 00:51:14,020 that it will best separate this type of clustering? 860 00:51:14,020 --> 00:51:21,790 And ideally, you can just see that I just run all these commands 861 00:51:21,790 --> 00:51:25,450 and then, I'm going to go ahead and add the line. 862 00:51:25,450 --> 00:51:28,970 >> So this looks like the best guess. 863 00:51:28,970 --> 00:51:34,150 It's taking the best one that minimizes the error in trying to fit this line. 864 00:51:34,150 --> 00:51:40,000 Obviously, this looks kind of good, but it's not the best. 865 00:51:40,000 --> 00:51:43,130 And linear models, in general, are going to be 866 00:51:43,130 --> 00:51:46,811 really great for theory and just sort of building the foundations of machine 867 00:51:46,811 --> 00:51:47,310 learning. 868 00:51:47,310 --> 00:51:50,330 But in practice, you're going to want to do something more general. 869 00:51:50,330 --> 00:51:54,280 >> So you can just try running something called a neural network. 870 00:51:54,280 --> 00:51:57,110 These things are increasingly more common. 871 00:51:57,110 --> 00:52:00,530 And they just work fantastically on large datasets. 872 00:52:00,530 --> 00:52:07,080 So in this case, we only have-- let's see-- we have nrow. 873 00:52:07,080 --> 00:52:09,010 So nrow is just saying the number of rows. 874 00:52:09,010 --> 00:52:11,790 So in this case, I have 100 observations.
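The transcript does not show the exact commands behind the fitted line, so the following is only one plausible reading: regress the 0/1 label on X1 and X2 with `lm`, then draw the line where the fitted value equals 0.5 as the boundary between the two groups. The data is synthetic and the 0.5 threshold is an assumption.

```r
set.seed(42)

# Synthetic two-cluster data standing in for the demo file.
n  <- 100
Y  <- rep(c(0, 1), each = n / 2)
X1 <- rnorm(n, mean = ifelse(Y == 1, 60, 40), sd = 8)
X2 <- rnorm(n, mean = ifelse(Y == 1, 60, 40), sd = 8)
data <- data.frame(X1, X2, Y)

nrow(data)  # number of rows, i.e. observations

# Least-squares fit of the label on both predictors.
fit <- lm(Y ~ X1 + X2, data = data)
b <- coef(fit)

plot(data$X1, data$X2,
     col = ifelse(data$Y == 1, "red", "yellow"), pch = 19,
     xlab = "X1", ylab = "X2")

# Points where the fitted value is exactly 0.5 satisfy
#   b0 + b1 * X1 + b2 * X2 = 0.5,
# which rearranges to a straight line in the (X1, X2) plane.
abline(a = (0.5 - b[["(Intercept)"]]) / b[["X2"]],
       b = -b[["X1"]] / b[["X2"]])

# Share of training points that fall on the correct side of the line.
predicted <- as.numeric(fitted(fit) > 0.5)
accuracy  <- mean(predicted == data$Y)
```

A straight line can only separate groups that are roughly linearly separable, which is the limitation the talk points to before moving on to neural networks.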
875 00:52:11,790 --> 00:52:15,010 >> So let me go ahead and make the neural network. 876 00:52:15,010 --> 00:52:18,620 So this is really nice because I can just say nnet 877 00:52:18,620 --> 00:52:21,767 and then I'm regressing Y. So Y is that column. 878 00:52:21,767 --> 00:52:23,850 And then regressing on the other two variables. 879 00:52:23,850 --> 00:52:27,360 So this is shorthand notation for X1 and X2. 880 00:52:27,360 --> 00:52:29,741 >> So let's go ahead and run this. 881 00:52:29,741 --> 00:52:30,240 Oh, sorry. 882 00:52:30,240 --> 00:52:32,260 I need to run all of these things. 883 00:52:32,260 --> 00:52:37,500 And this is just printing output on how quickly or not quickly it has 884 00:52:37,500 --> 00:52:38,460 converged. 885 00:52:38,460 --> 00:52:41,420 So it looks like it didn't converge. 886 00:52:41,420 --> 00:52:44,970 So let me go ahead and print what this looks like. 887 00:52:44,970 --> 00:52:51,260 >> Here's the picture, and here's the contour showing how well it fits. 888 00:52:51,260 --> 00:52:56,380 And this is just-- you can see that this is very, very good. 889 00:52:56,380 --> 00:52:59,400 It might even be overfitting, but you can also 890 00:52:59,400 --> 00:53:03,390 account for this with other techniques like cross-validation. 891 00:53:03,390 --> 00:53:06,180 And these are also built into R. 892 00:53:06,180 --> 00:53:09,170 >> And let me just show a support vector machine. 893 00:53:09,170 --> 00:53:12,470 This is another really common technique in machine learning. 894 00:53:12,470 --> 00:53:18,550 It's similar to linear models, but it uses what's called a kernel method. 895 00:53:18,550 --> 00:53:22,790 And let's see how well that does. 896 00:53:22,790 --> 00:53:26,430 So this one is similar to how well the neural network does, 897 00:53:26,430 --> 00:53:27,900 but it's much smoother.
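The neural-network step described earlier in this part of the demo relies on the formula shorthand `Y ~ .`, which means "Y against every other column". A minimal sketch with the nnet package is below. The data is synthetic, the hidden-layer size and iteration limit are arbitrary choices, and standardising the inputs is an addition of this sketch that the talk does not mention.

```r
library(nnet)

set.seed(42)

# Synthetic two-cluster data standing in for the demo file.
n <- 100
Y <- rep(c(0, 1), each = n / 2)
data <- data.frame(
  X1 = rnorm(n, mean = ifelse(Y == 1, 60, 40), sd = 8),
  X2 = rnorm(n, mean = ifelse(Y == 1, 60, 40), sd = 8),
  Y  = factor(Y)          # a factor response makes nnet fit a classifier
)

# Standardise the inputs (assumption of this sketch): small networks tend to
# train more reliably when predictors are centred and on a similar scale.
data$X1 <- as.numeric(scale(data$X1))
data$X2 <- as.numeric(scale(data$X2))

# Y ~ . is shorthand for Y ~ X1 + X2 here. While fitting, nnet prints the
# objective value every few iterations and whether it converged.
fit <- nnet(Y ~ ., data = data, size = 3, maxit = 200)

# Predicted class labels for the training rows.
predicted <- predict(fit, newdata = data, type = "class")
train_accuracy <- mean(predicted == data$Y)
```

Training accuracy alone says nothing about overfitting; holding out part of the data or cross-validating, as the talk suggests, is the usual check.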
898 00:53:27,900 --> 00:53:35,740 And this is based off of what-- how SVMs work. 899 00:53:35,740 --> 00:53:40,250 >> So this is just a very quick overview of some 900 00:53:40,250 --> 00:53:43,822 of the built-in functions you can use, and some data exploration as well. 901 00:53:43,822 --> 00:53:45,905 So let me just go ahead and go back to the slides. 902 00:53:45,905 --> 00:53:50,290 903 00:53:50,290 --> 00:53:53,670 >> So obviously, this is not very comprehensive. 904 00:53:53,670 --> 00:53:57,140 And this is really just a teaser to show what you can actually do in R. 905 00:53:57,140 --> 00:53:59,100 So if you'd just like to learn more, here 906 00:53:59,100 --> 00:54:01,210 is a bunch of different resources. 907 00:54:01,210 --> 00:54:06,890 >> So if you're used to textbooks or you're just used to reading things online, 908 00:54:06,890 --> 00:54:09,670 then this is a wonderful one by Hadley Wickham, 909 00:54:09,670 --> 00:54:13,010 who also created all of these really cool packages. 910 00:54:13,010 --> 00:54:17,420 If you're fond of videos, then Berkeley has an awesome bootcamp 911 00:54:17,420 --> 00:54:21,060 that is several-- that's kind of long. 912 00:54:21,060 --> 00:54:24,210 And it will teach you almost everything you'd want to know about R. 913 00:54:24,210 --> 00:54:27,770 >> And similarly, there's Codecademy and all these other kinds of 914 00:54:27,770 --> 00:54:29,414 interactive websites. 915 00:54:29,414 --> 00:54:31,580 They're also getting common-- more and more common. 916 00:54:31,580 --> 00:54:33,749 So this one is similar to Codecademy. 917 00:54:33,749 --> 00:54:35,790 And finally, if you just want community and help, 918 00:54:35,790 --> 00:54:38,800 these are a bunch of places you can go. 919 00:54:38,800 --> 00:54:40,880 Obviously, we still use mailing lists, just 920 00:54:40,880 --> 00:54:44,860 like almost every other programming language community.
921 00:54:44,860 --> 00:54:47,880 And #rstats, this is our Twitter community. 922 00:54:47,880 --> 00:54:49,580 That's actually quite common. 923 00:54:49,580 --> 00:54:50,850 And then useR! 924 00:54:50,850 --> 00:54:52,340 is just our conference. 925 00:54:52,340 --> 00:54:55,390 >> And then, of course, you can use all these other Q&A things, 926 00:54:55,390 --> 00:54:57,680 like Stack Overflow, Google, and then GitHub. 927 00:54:57,680 --> 00:55:00,490 Because most of these packages and a lot of the community 928 00:55:00,490 --> 00:55:03,420 will be centered around developing code, because it's open source. 929 00:55:03,420 --> 00:55:05,856 And it's really just nice on GitHub. 930 00:55:05,856 --> 00:55:08,730 And finally, you can contact me if you just have any quick questions. 931 00:55:08,730 --> 00:55:13,530 So you can find me on Twitter here, on my website, and just by email. 932 00:55:13,530 --> 00:55:17,840 So hopefully, that was something-- just a short teaser 933 00:55:17,840 --> 00:55:20,900 of what R is really capable of doing. 934 00:55:20,900 --> 00:55:23,990 And hopefully, you'll just check out these three links 935 00:55:23,990 --> 00:55:25,760 and see what more you can do. 936 00:55:25,760 --> 00:55:28,130 And I guess that's just about it. 937 00:55:28,130 --> 00:55:28,630 Thanks. 938 00:55:28,630 --> 00:55:30,780 >> [APPLAUSE] 939 00:55:30,780 --> 00:55:31,968