BRIAN YU: In this lab, your task is going to be to write a program in Python to simulate the results of a sports tournament. In a sports tournament, like the FIFA World Cup, oftentimes teams end up in a single-elimination bracket: you have a bunch of teams that play each other, the winners move on to the next round and play each other, and so on, until finally the last two teams play each other, and whichever team wins between those last two teams is ultimately declared the winner of the tournament.

How might we simulate this type of tournament? Well, in order to do so, we need some idea of how good each of these teams actually is, so that we can compare two teams and make some prediction about who is likely to win a game between those two teams. So oftentimes teams or players will have ratings, some number that represents how good that particular team or player is, and we can compare two ratings to estimate who might win a game between any two teams. Ultimately, your program is going to use this kind of information, a listing of teams and what their ratings are, to simulate a tournament many times and estimate the probability that any particular team is going to win that tournament.

In order to do so, you'll need access to some data, so we'll give you some data formatted as a CSV file, comma-separated values, where every line corresponds to a team and has two values. First is the name of the team, in other words, what country that team is from, followed by a comma, and then the rating for that team, some number representing the strength of that team, where a higher rating means that team is better and is therefore more likely to win a game against a lower-rated team.
The bigger the difference between the ratings of those two teams, the more likely it is that the team with the higher rating is going to win that game. With this information stored inside of a CSV file, your program is going to work as follows: you'll run python tournament.py followed by a CSV file (the one we have here is the 2018 men's FIFA World Cup teams), and your program is going to simulate a whole bunch of tournaments, maybe 1,000 different tournaments with these teams, and then print out, based on those results, what the program thinks the probability is that any particular country will be the eventual winner of the entire tournament.

How are you going to do that? Well, let's start by taking a look at the distribution code that we give to you as part of this lab. For this lab, we give you a couple of files. We give you some CSV files, each of which contains a listing of teams as well as what rating each of those teams has, and we give you that for a couple of different tournaments. But tournament.py is where all of the logic is. This is the Python file that you're going to use to actually simulate one of these sports tournaments.

We start here by defining a variable n, which is equal to the number of simulations to run, and by default we're going to simulate 1,000 different tournaments with these teams. Inside the main function, we check to make sure that the program is being used correctly, with a file name provided as an argument. Then we define a variable called teams, which initially is just going to be an empty list, since there are no teams we know of yet. The first thing you'll want to do is read all of those teams from the CSV file and store each team inside of this list of teams, representing each team with a dictionary that stores values for both the name of the team and the rating for that team. After that, we define another dictionary called count.
And count is going to be a dictionary that maps keys to values, as all dictionaries do, where in this case the keys are going to be the names of the teams and the values are going to be how many tournaments each team has won. Ultimately we're going to simulate n tournaments, where by default n is going to be 1,000, and we want to keep track of how many times any given team wins a tournament. So if a team wins the tournament 100 times, then that team's name is going to be the key and 100 is going to be the value, so that we can remember for any given team how many tournaments they won according to our simulation. And based on that simulation, we've already written code for you that goes through each of those teams and prints out what probability we expect them to have of winning the entire tournament.

We've also given you a couple of other functions. We've given you a simulate_game function that accepts two teams as input, and what it's going to do is return True if, based on the simulation, team 1 wins and False otherwise. This function utilizes some randomness: it's not always going to return the same result every time, just as when the same two teams play each other more than once, the same team most likely isn't going to win every time. What the function does is look at the ratings for both of those teams, rating 1 and rating 2, and use that information to calculate the probability that team 1 wins the game. And then, randomly using that probability, it returns True if team 1 wins and False otherwise.

We've also given you a function called simulate_round, which does the same thing, not just for one game but for an entire round of games between many different teams.
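Before turning to simulate_round in detail, here is a rough sketch of how a function like simulate_game might work. The walkthrough doesn't spell out the formula, so the Elo-style curve and the scale constant of 600 below are assumptions for illustration, and win_rate is a hypothetical helper added only to show the randomness averaging out.

```python
import random


def simulate_game(team1, team2):
    """Return True if team1 wins the simulated game, False otherwise."""
    rating1 = team1["rating"]
    rating2 = team2["rating"]
    # Assumed Elo-style formula: equal ratings give 0.5, and the
    # probability approaches 1 as rating1 pulls ahead of rating2.
    probability = 1 / (1 + 10 ** ((rating2 - rating1) / 600))
    # random.random() is uniform on [0, 1), so this is True with
    # the computed probability.
    return random.random() < probability


def win_rate(team1, team2, trials=10000):
    """Hypothetical helper: fraction of simulated games that team1 wins."""
    wins = sum(simulate_game(team1, team2) for _ in range(trials))
    return wins / trials
```

With a formula like this, two evenly rated teams each win about half the time, and a large rating gap makes the stronger team win almost every game, which matches the behavior described above.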
The simulate_round function accepts as input a list of teams, and what it will do is consider those teams in pairs, one pair at a time, teams 0 and 1, then 2 and 3, then 4 and 5, and simulate the game between each pair, returning to you a list of the winners of that round. So if you give simulate_round a list of eight teams, for example, then simulate_round will return to you a list of the four winners from pairing up teams 0 and 1, 2 and 3, 4 and 5, and 6 and 7.

Finally, here is the simulate_tournament function. This function ultimately should simulate the entire tournament, starting out with all of the teams, where you can assume the number of teams will be some power of 2, like 16 teams for example, and then repeatedly simulating rounds until we're down to just one winner of the entire tournament. It's going to be left up to you to complete that function.

So let's recap what you'll need to do in tournament.py. First, you should complete the main function. Using csv.DictReader, you can read teams from the CSV file one at a time, treating each team as a dictionary, where there's a key called team that represents the team's name as well as a key called rating that represents the team's rating. Now, by default, when you read a CSV file, everything is going to be treated as a string, and because the rating is a number, you'll want to make sure that you actually convert that rating to an integer first. Once you do, you're going to store each team as a dictionary inside that list of teams. So teams ends up being a list of dictionaries, one dictionary per team.

And once you have that list of teams, you can then simulate n tournaments, where n by default is 1,000, by calling your simulate_tournament function. After each of those tournaments, which you might imagine running in some sort of loop that repeatedly simulates one tournament after another, you'll want to keep track of the win count inside of your count dictionary.
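As a minimal sketch of those two steps in main, the code below reads teams with csv.DictReader and then tallies wins over n simulations. The helper names load_teams and count_wins are hypothetical, the sample teams and ratings are made up, and the sketch assumes the CSV file starts with a header row naming the team and rating columns.

```python
import csv
import io

N = 1000  # number of tournaments to simulate (n in the walkthrough)


def load_teams(csv_file):
    """Hypothetical helper: read an open CSV file into a list of dicts."""
    teams = []
    reader = csv.DictReader(csv_file)
    for row in reader:
        # DictReader yields every value as a string, so the rating
        # has to be converted to an integer explicitly.
        teams.append({"team": row["team"], "rating": int(row["rating"])})
    return teams


def count_wins(teams, simulate_tournament, n=N):
    """Hypothetical helper: run n tournaments and count wins per team name."""
    count = {}
    for _ in range(n):
        winner = simulate_tournament(teams)
        # Start at 0 the first time a team wins, then add 1.
        count[winner] = count.get(winner, 0) + 1
    return count


# Made-up sample data standing in for one of the provided CSV files.
sample = io.StringIO("team,rating\nAlpha,1500\nBeta,1400\n")
teams = load_teams(sample)
```

Using count.get(winner, 0) is one way to handle a team's first win without checking whether the key already exists; an explicit if/else on membership would work just as well.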
That way, for any given team, you know how many times that team has won one of your simulated tournaments.

You'll also want to complete the simulate_tournament function. The simulate_tournament function, again, should simulate an entire tournament, accepting a list of teams and producing the winner of the simulated tournament. In doing so, you'll probably want to call the simulate_round function, which we've already written for you, which accepts a list of teams and returns a list of the winners from that round. And likely you'll want to call this function multiple times, repeatedly simulating rounds until only one team is left. If you start off with a tournament with 16 teams and you pass those teams into simulate_round, you'll get back a list of eight winners. If you simulate a round with those eight winners, you'll get back a list of four winners, and then two, all the way down until one team is left in this tournament. And once you're down to just one winning team, you're going to return the name of that winning team, so that you can use that name as a key in your count dictionary to record who won that simulated tournament.

After you've done all of that, you should be able to run tournament.py on a CSV file that contains teams and their ratings and figure out the approximate probability that any given team is going to win the tournament. My name is Brian and this was World Cup.
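The round-by-round loop described for simulate_tournament can be sketched as follows. This is one possible approach, not the official solution. The simulate_game and simulate_round functions here are simplified stand-ins for the provided ones, included only so the example runs on its own, and the win-probability formula inside simulate_game is an assumption.

```python
import random


def simulate_game(team1, team2):
    """Stand-in for the provided function (assumed Elo-style formula)."""
    gap = team2["rating"] - team1["rating"]
    return random.random() < 1 / (1 + 10 ** (gap / 600))


def simulate_round(teams):
    """Stand-in for the provided function: pair teams 0 and 1, 2 and 3, ..."""
    winners = []
    for i in range(0, len(teams), 2):
        if simulate_game(teams[i], teams[i + 1]):
            winners.append(teams[i])
        else:
            winners.append(teams[i + 1])
    return winners


def simulate_tournament(teams):
    """Simulate rounds until one team is left; return that team's name."""
    # Each round halves the field: 16 -> 8 -> 4 -> 2 -> 1.
    while len(teams) > 1:
        teams = simulate_round(teams)
    return teams[0]["team"]
```

Because the loop only rebinds the local name teams, the list passed in by the caller is left unchanged, so the same list can be reused for every one of the n simulated tournaments.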