CS50 Live first looks today at GitLab. GitLab is a popular source code hosting site, much like GitHub.com, that developers can use to store their code centrally, version control it, keep multiple copies of it, and share it with other users. Unfortunately, GitLab ran into a bit of an issue very recently.

The whole incident started when they saw this. GitLab supports a feature known as Snippets, much like GitHub Gist, whereby users can upload small snippets of code to share them with other people. Unfortunately, having some 1.5 million snippets of code created over the course of just a few days? Not normal. In fact, this seemed to be the result of spamming behavior by some adversarial folks online. Moreover, GitLab also noticed that one or more spammers seemed to be using GitLab inappropriately as a content delivery network, or CDN, whereby they were serving up files in ways that they shouldn't.

Now unfortunately, these kinds of attacks had a bit of a ripple effect on their back-end databases. In particular, GitLab posted the following: "We are experiencing issues with our production database and are working to recover." And just minutes later they posted, "We accidentally deleted production data and might have to restore from backup."

Now what exactly happened? Well, it's quite common for databases to be replicated from one to another, so that you have a primary and a secondary, the latter of which is a backup of the former in real time. As part of diagnosing why that replication was slowing down, one of GitLab's system administrators very deliberately executed a command quite like this.
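The video doesn't reproduce the exact command or its target, so the sketch below only shows the general shape of such a command; the data-directory path is a hypothetical stand-in for whatever was actually targeted.

    # Sketch only -- do NOT run this. The real target path isn't shown in this video;
    # /var/lib/db/data is a hypothetical stand-in for a database's data directory.
    sudo rm -rf /var/lib/db/data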
Now what is this command? Well, at the front of it is "sudo," which says execute the following command with administrative, or root, privileges. And the command to be executed? rm -rf, apparently. rm, you might know, removes files or folders from a system. The -r means recursively: delete the given thing and anything inside of it, including any directories. And unfortunately, the f in -rf means forcibly, which means don't even prompt the human to confirm or deny that he or she wants to do this.

Now, the system administrator meant to execute this command deliberately on their secondary database, db2.cluster.gitlab.com, so that they could then resume replication from their primary to their secondary database. Unfortunately, it appears to have been late at night, and this was a stressful situation, and darn it if this command weren't executed on db1.cluster.gitlab.com, the actual primary database.

Now, no big deal, surely we have backups all over the place, so we can just restore from backup, and our customers will be perfectly happy and on their way. Unfortunately, out of five backup or replication techniques deployed, GitLab reported that "none are working reliably or set up in the first place." Indeed, if you'd like to read their whole post-mortem, in which they discuss exactly what went wrong and how, you can check out this URL here.

But the moral of the story, for our purposes, is please, please beware the rm -rf, especially if what you're deleting isn't just some directory of your own but potentially your customers' data as well.
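One way to internalize that lesson: before doing anything destructive, make the machine prove it's the one you think it is. The short script below is a purely hypothetical safety sketch, not anything from GitLab's post-mortem; it borrows the hostnames from this incident and only echoes where the destructive step would go.

    #!/usr/bin/env bash
    # Hypothetical safety sketch (not from GitLab's post-mortem): refuse to do
    # anything destructive unless this machine is the one we intended.
    set -euo pipefail

    expected_host="db2.cluster.gitlab.com"   # the secondary, where deletion was intended
    actual_host="$(hostname -f)"

    if [ "$actual_host" != "$expected_host" ]; then
        echo "Refusing to continue: this is $actual_host, not $expected_host." >&2
        exit 1
    fi

    read -r -p "About to delete data on $actual_host. Type the hostname to confirm: " answer
    if [ "$answer" != "$actual_host" ]; then
        echo "Confirmation failed; nothing was deleted." >&2
        exit 1
    fi

    # The destructive step would go here (path is a hypothetical placeholder):
    # sudo rm -rf /var/lib/db/data
    echo "Confirmed on $actual_host; destructive step would run here."

The point isn't this particular script; it's that a forced pause and an explicit hostname check are cheap insurance against running the right command on the wrong machine, especially late at night under stress.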