BRIAN YU: Let's dive into readability. In readability, your task is going to be to write a program in C that takes as input some text and outputs the approximate US grade level that would be the appropriate reading level for that text. For example, you might run your readability program by calling ./readability. Your program will then prompt you to type in some text, or you could type in a couple of sentences. And then your program would analyze that text and conclude, for example, that these sentences are at a third grade reading level. Or if you typed in something a little more complicated, you might get that it's a fifth grade reading level, or something else. 

How do you actually calculate this reading level? Well, first, we can make a couple of observations about what makes something easier or harder to read. One thing to notice is that longer words tend to mean a higher reading level, and another thing that you might notice is that more words per sentence, in other words, longer sentences, might also mean that a particular text is at a higher reading level. We can take that information and actually plug it into a readability test, a formula that takes a text and computes what grade level it's appropriate for. 

One such example is the Coleman-Liau index, which takes the number of letters and words and sentences in a text and is able to conclude what US grade level it approximately corresponds to. The formula looks like this. The Coleman-Liau index value is equal to 0.0588 times L minus 0.296 times S minus 15.8, where here L is the average number of letters per 100 words in the text and S is the average number of sentences per 100 words in the text. 

So to compute the Coleman-Liau index value for a particular text, you'll first need to count up how many letters, words, and sentences there are in that particular text, plug them into the formula, and use the result to determine which US grade reading level is appropriate for this particular text. Let's start by trying to count up the number of letters in a particular text, for example. 

In order to do that, you'll want to keep track of the number of both uppercase and lowercase letters that appear in the text, which isn't going to be every character. You should ignore spaces and punctuation, for example. But how are you going to do that? Well, remember that a string of text you can think of as really just an array of characters you can iterate over one at a time. So in order to do so, you'll probably want to keep some sort of variable that's going to keep track of how many letters you've encountered so far. That variable probably initially will be set to 0 because before you start looking through the string, you haven't seen any letters. 

But if you loop through the string one character at a time, you might start with the character n and realize that it is, in fact, an alphabetic character. It's a letter, so you should increment your letter count from 0 to 1. And you can do so for every subsequent character, checking if it's a letter and increasing the letter count if so. But as soon as you encounter a character that isn't a letter, you'll want to be careful not to increase the letter count, and in fact, to leave it the same. And then as you get to the next character, if it is a letter, then you can continue to increase the count. 

And you'll continue to repeat that for each of the characters in the string so that by the end of it, you have an accurate count of how many letters there are in this text. How do you determine whether a character is a letter? Well, there are ways to do this using ASCII, remembering that every character has a numeric value. But you also might find it helpful to take a look at a C header file called ctype.h, which includes several functions that are helpful for determining the type of a particular character. That might help you to figure out how many letters there are in the text. 
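The letter-counting loop described above can be sketched roughly as follows. This is one possible approach, not the only one: the function name `count_letters` is a hypothetical choice, and it takes a plain `const char *` rather than any particular string type. The `isalpha` function is one of the ctype.h helpers.

```c
#include <ctype.h>

// Hypothetical helper: counts the alphabetic characters in a
// null-terminated string, ignoring spaces, digits, and punctuation.
int count_letters(const char *text)
{
    int letters = 0; // before looking at the string, no letters seen yet

    for (int i = 0; text[i] != '\0'; i++)
    {
        // isalpha is true for uppercase and lowercase letters alike.
        // The cast to unsigned char avoids undefined behavior for
        // negative char values.
        if (isalpha((unsigned char) text[i]))
        {
            letters++;
        }
    }
    return letters;
}
```

Calling `count_letters("Hi, there!")` would give 7, since the comma, the space, and the exclamation point are skipped.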

After you've calculated how many letters are in the text, the next step is to figure out how many words are in that text. And what really is a word? Well, for the purposes of this program, you're going to count the number of words in the text by assuming that any sequence of characters separated by one or more spaces is going to count as a word. So let's take a look at an example. 

Here we again have a string, an array of characters representing some text. And we have a variable called words which is going to keep track of how many words we've encountered. As soon as we hit the first alphabetical character, the letter A, the fact that we've hit this first non-space character at the start of the string indicates to us that this is, in fact, the start of the first word, and we've now found one word. When we encounter other alphabetical characters, we're not going to increment the word count just yet because words have to be separated by spaces. 

As soon as we do hit a space, though, that marks the end of a word. It means the next word is coming if we ever encounter another non-space character. And as soon as we get to the next character, we do, in fact, encounter an alphabetic character, so we can increment the word count from 1 to 2. The non-space character here marks the start of a new word. 

And we can keep going. When we detect the space again, that means a new word is coming so that when we hit another alphabetical character-- in this case, W-- we increment the word count from 2 to 3. Notice that the punctuation after the word "by" here doesn't mean there's a new word yet. It's still part of the existing word. The space means that a new word is coming. But imagine, for example, there are two spaces in a row in the string. What happens then? 

Well, multiple spaces in a row shouldn't count as a new word yet. We've still only seen four words. We haven't yet seen five words. So you want to wait until we get to the next alphabetic character. Once we get to the letter T, which is, in fact, a non-space character, that should indicate to us that we've found another word, and we can increment the word count from four to five, for example. 
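The word-counting walk-through above can be sketched like this. It is a minimal sketch under the stated assumption that words are separated only by space characters; the function name `count_words` and the flag `in_word` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

// Hypothetical helper: counts words, where a word is any run of
// non-space characters. Punctuation stays part of its word, and
// several spaces in a row do not start extra words.
int count_words(const char *text)
{
    int words = 0;
    bool in_word = false; // are we currently inside a word?

    for (int i = 0; text[i] != '\0'; i++)
    {
        if (text[i] == ' ')
        {
            // A space ends the current word, if there was one.
            in_word = false;
        }
        else if (!in_word)
        {
            // First non-space character after a space (or at the
            // start of the string) marks the start of a new word.
            in_word = true;
            words++;
        }
    }
    return words;
}
```

Tracking whether we are inside a word, rather than just counting spaces, is what keeps two spaces in a row from being counted as an extra word.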

We can continue to do that for the rest of the string so that we can conclude that in this string, there are, in fact, five words. So that's how we might go about counting words. But after we've counted letters and words, the last piece of information we need to plug into that Coleman-Liau index is the number of sentences that are present in the string. And this is, in fact, a little bit tricky. But for the purpose of this problem, we'll let you assume that any period, exclamation point, or question mark that appears in the string indicates a sentence. 

In reality, this might not be the case. Consider, for example, Mr., where you might see a period that doesn't actually indicate the end of a sentence. But for simplicity, it's safe to assume that generally, periods and exclamation points and question marks are going to mark sentence boundaries. So in a string like this, for example, if we look for all the periods and question marks and exclamation points, we find two of them. So we can conclude that this string has two sentences in it. 
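Under that simplifying assumption, counting sentences might look something like the sketch below. The function name `count_sentences` is hypothetical, and the code deliberately does nothing clever about abbreviations such as "Mr.".

```c
// Hypothetical helper: counts sentences by treating every period,
// exclamation point, or question mark as the end of a sentence.
int count_sentences(const char *text)
{
    int sentences = 0;

    for (int i = 0; text[i] != '\0'; i++)
    {
        char c = text[i];
        if (c == '.' || c == '!' || c == '?')
        {
            sentences++;
        }
    }
    return sentences;
}
```

With this rule, `count_sentences("Mr. Smith left.")` reports 2 rather than 1, which is exactly the inaccuracy the simplification accepts.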

After we've done all of these steps, you should now have accurate counts of the number of letters, words, and sentences that appear inside of the text. And the last step is to calculate the value of the Coleman-Liau index. So how are you going to do that? Well, once you have these three values, letters, words, and sentences, you can plug them into the formula to compute what the index value should be. Remember that the index value is based on L, the average number of letters per 100 words, and S, the average number of sentences per 100 words. But now that you have a count of the number of words, the number of letters, and the number of sentences, you should be able to calculate L and S and plug that information into the Coleman-Liau index formula to figure out what the reading level should be. 
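As a rough sketch of that calculation, the function below turns the three counts into L and S and applies the formula, using the published Coleman-Liau coefficients 0.0588, 0.296, and 15.8. The name `coleman_liau` is hypothetical, and the sketch assumes the text contains at least one word, so the division is safe.

```c
// Hypothetical helper: computes the Coleman-Liau index from raw counts.
// Assumes words > 0; a caller would need to handle empty input itself.
double coleman_liau(int letters, int words, int sentences)
{
    // Cast to double first so the division is not truncated
    // the way integer division would be.
    double L = (double) letters / words * 100.0;   // letters per 100 words
    double S = (double) sentences / words * 100.0; // sentences per 100 words

    return 0.0588 * L - 0.296 * S - 15.8;
}
```

For example, with made-up counts of 65 letters, 14 words, and 4 sentences, L is about 464.29 and S is about 28.57, so the index comes out to about 3.04. The cast matters: with plain integer division, 65 / 14 would be 4 and the result would be far off.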

What should your program then output? Well, your formula might give you a decimal number, so you'll want to be sure to round the score to the nearest whole number first because you want to approximate a US grade level. You want your program to output something like "Grade X", where X is the grade level appropriate for this particular text. Of course, what happens if the number is remarkably low or especially high? 

Well, if the rounded number is less than 1, then you should instead output "Before Grade 1" to indicate that the reading level is earlier than grade 1. Meanwhile, if the number is 16 or higher, approximately the level of a college senior or beyond, you should just output "Grade 16+" to indicate that it's the highest reading level that we'll keep track of for the purpose of this program. Once you've done that, you should be able to run your readability program, type in some text, and see as output the approximate reading level that would be appropriate for this particular text. 
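The output rules above might be sketched as follows. The function name `format_grade` is hypothetical, and the exact capitalization of the messages is an assumption. The rounding is done by hand here to keep the sketch free of extra link flags; the `round` function from math.h does the same job.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: rounds the index to the nearest whole grade and
// writes the message for that grade into out (at most size bytes).
void format_grade(double index, char *out, size_t size)
{
    // Round half away from zero, as round() from math.h would.
    int grade = (int) (index >= 0 ? index + 0.5 : index - 0.5);

    if (grade < 1)
    {
        snprintf(out, size, "Before Grade 1");
    }
    else if (grade >= 16)
    {
        snprintf(out, size, "Grade 16+");
    }
    else
    {
        snprintf(out, size, "Grade %i", grade);
    }
}
```

A caller would declare a small buffer, say `char message[32];`, call `format_grade(index, message, sizeof message);`, and then print the buffer with `printf("%s\n", message);`. Rounding before comparing matters: an index of 15.5 rounds to 16 and so falls into the "Grade 16+" case.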

My name is Brian, and this was readability.