We’ve looked at arrays in both Scratch and C. In the context of the actual computer, an array is simply a collection of contiguous chunks of memory. Last Wednesday, we saw that an array could be useful for storing quiz grades. If we had instead used individual variables for each quiz grade and there had been 10, 20, or 30 quizzes, the number of variables we would need to keep track of would be unwieldy. Instead, we were able to store all of our quiz grades in a single variable using this syntax:
float grades[QUIZZES]
Recall that QUIZZES
is actually just a number that we defined as a constant. So, the above is equivalent to:
float grades[2]
We use the square brackets to declare an array, but also to assign values to it:
grades[i] = GetFloat();
float
from the user in the i
th location of the array. In the case of the declaration, the line begins with a data type, e.g. float
above.grades[2]
when the array is only of size 2? In an array of size 2, grades[0]
and grades[1]
are both valid locations, but grades[2]
is one off the end. In some other languages, arrays can be dynamic in size, but in C, they are fixed in size. Trying to access a location beyond the end of the array can cause serious problems, as we’ll see later.In array.c
, we use the following line to compute the average of the quiz grades:
int average = (int) round(sum / QUIZZES);
Writing (int)
is a way of casting, or converting, another data type (in this case, the return value of round
which is a float
) to an int
.
\0
at the end of the array? This only appears at the ends of strings, i.e. arrays of char
, not arrays of other data types.Last week, we briefly touched upon the idea that strings are actually just arrays of characters. Given this knowledge, we can iterate through the characters in strings just as we can the elements of any array:
/****************************************************************************
* string1.c
*
* David J. Malan
* malan@harvard.edu
*
* Prints a given string one character per line.
*
* Demonstrates strings as arrays of chars and use of strlen.
***************************************************************************/
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
// get line of text
string s = GetString();
// print string, one character per line
for (int i = 0; i < strlen(s); i++)
{
char c = s[i];
printf("%c\n", c);
}
return 0;
}
strlen(s)
, the length of the string. In each iteration of the loop, we access s[i]
, the i
th character of string s
, and store it in a temporary variable c
. If our string s
is “hello,” then s[0]
is “h,” s[1]
is “e,” s[2]
is “l,” and so on. Each of these characters is printed on its own line by this program.One optimization we can make here is to pass s[i]
directly to printf
rather than using the temporary variable c
:
printf("%c\n", s[i]);
%c
, not %s
, because we’re printing a char
, not a string
.Another optimization we can make is to avoid calling the strlen
function on every iteration of the loop:
/****************************************************************************
* string2.c
*
* David J. Malan
* malan@harvard.edu
*
* Prints a given string one character per line.
*
* Demonstrates strings as arrays of chars with slight optimization.
***************************************************************************/
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
// get line of text
string s = GetString();
// print string, one character per line
for (int i = 0, n = strlen(s); i < n; i++)
{
printf("%c\n", s[i]);
}
return 0;
}
n
. This speed optimization is not without a memory cost, however. We have to store an extra 32 bits now. In this case, it’s probably worth it. In other cases, you’ll need to make a design decision as to which is more important.Ralphie taught us that the text “Or fher gb qevax lbhe Binygvar” actually means “Be sure to drink your Ovaltine” if you decrypt it using the ROT-13 cipher in which every character is rotated around the alphabet halfway. In Problem Set 2, you’ll play around with this kind of encryption, but to do so, you’ll need to know one important fact: a char
in C can be treated as an int
and vice versa. We take advantage of this in ascii.c
:
/****************************************************************************
* ascii.c
*
* David J. Malan
* malan@harvard.edu
*
* Displays the mapping between alphabetical ASCII characters and
* their decimal equivalents using one column.
*
* Demonstrates casting from int to char.
***************************************************************************/
#include <stdio.h>
int main(void)
{
// display mapping for uppercase letters
for (int i = 65; i < 65 + 26; i++)
printf("%c: %d\n", (char) i, i);
// separate uppercase from lowercase
printf("\n");
// display mapping for lowercase letters
for (int i = 97; i < 97 + 26; i++)
printf("%c: %d\n", (char) i, i);
return 0;
}
(char) i
into a %c
placeholder. It seems that simply by casting an int
to a char
, we get the ASCII letter that the number stands for. Actually, we don’t even need to explicitly cast the int
. We can substitute i
into %c
and printf
will implicitly cast the number for us.To get “O” from “B” as we did in Ralphie’s encrypted text, we take 13 steps from “B” to “C” to “D” and so on. In ASCII, “B” is represented by the number 66. Adding 13 to this gives us 79, which sure enough represents “O” in ASCII. What is the letter “Z” encrypted to by ROT-13? In ASCII, “Z” is represented by the number 91. Adding 13 to this gives us 104, which does not represent an uppercase letter in ASCII. Thus, when we encrypt letters closer to the end of the alphabeth than the front, we want to wrap around to the front of the alphabet. The modulo operator (%
) will accomplish this for us:
int x = y % z;
y
by z
and store the remainder in x
.” If y
is 3 and z
is 2, then the value 1 will be stored in x
.int
? To oversimplify things for now, just remember that the leftmost bit is a 1 if it’s negative and 0 if it’s positive. In reality, the representation is accomplished via two’s complement and is a little more complicated.clang
, you passed in the name of your source code file as the first argument. The value of this argument influences the behavior of clang
.So far, most of our programs have started with the following:
int main(void)
However, another legitimate declaration of main
is as follows:
int main(int argc, string argv[])
argc
is an argument to main
which stands for “argument count,” that is, how many command-line arguments the program was passed. If you ran clang mario.c
, argc
would take the value of 2 because the name of the program is counted as an argument. argv
is an array of strings, each of which is a command-line argument that was passed to the program. Note that the square brackets have no number in them since the function main
can take an array of any size.GetString
to accept it while the program is running. Imagine if you had to run clang
, hit Enter, then type the name of the program you want to compile, then hit Enter again. Would be more annoying than e-mails from David!As it turns out, string
is a shorthand we granted you for char *
, so the more proper way to declare main
is:
int main(int argc, char* argv[])
Take a look at this program which simply prints out the command-line arguments it’s been passed:
/****************************************************************************
* argv1.c
*
* David J. Malan
* malan@harvard.edu
*
* Prints command-line arguments, one per line.
*
* Demonstrates use of argv.
***************************************************************************/
#include <cs50.h>
#include <stdio.h>
int main(int argc, string argv[])
{
// print arguments
printf("\n");
for (int i = 0; i < argc; i++)
printf("%s\n", argv[i]);
printf("\n");
return 0;
}
argv[i]
gives us the i
th argument that was passed to argv1
.If argv
is an array of strings, and strings are arrays of characters, then argv
is an array of arrays of characters. Interesting. If we use two loops, then, we can iterate over every character in every argument passed to our program:
/****************************************************************************
* argv2.c
*
* David J. Malan
* malan@harvard.edu
*
* Prints command-line arguments, one character per line.
*
* Demonstrates argv as a two-dimensional array.
***************************************************************************/
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(int argc, string argv[])
{
// print arguments
printf("\n");
for (int i = 0; i < argc; i++)
{
for (int j = 0, n = strlen(argv[i]); j < n; j++)
printf("%c\n", argv[i][j]);
printf("\n");
}
return 0;
}
argv[i][j]
gives us the j
th character of the i
th command-line argument.strlen
function, we have to include the string.h
library. Because we’re using the keyword string
, we have to include the CS50 Library.GetString
returns user input? If the user types “hello,” GetString
has to ask the operating system for at least 5 bytes of memory (1 per character) in which to store this string. But let’s say that we call GetString
again and the user types “bye” this time. If “bye” is stored directly next to “hello” in memory, how do we know where one string ends and the other begins? We need a marker to tell us that we’re at the end of a string. This marker is called the null terminator and is represented as \0
. The null terminator is literally the number 0 (not the ASCII character “0”).Even with the null terminator in place, strlen
returns 5 for the string “hello.” However, the operating system is using 6 bytes, not 5, to store this 5-character string. If you had to implement strlen
yourself, you might try something like the following:
#include <cs50.h>
#include <stdio.h>
int strlen(string input)
{
int i = 0;
int length = 0;
while (input[i] != '\0')
length++;
return length;
}
int main(void)
{
printf("Give me a string: ");
string s = GetString();
printf("The length is %d\n", strlen(s));
return 0;
}
\0
, we have to use single quotes, not double quotes.Here, we’ve defined our own function named strlen
; we’re not including string.h
at all. But, oops, it turns out that this program won’t compile because clang
already knows there’s a function named strlen
. Let’s rename our function to length
:
#include <cs50.h>
#include <stdio.h>
int length(string input)
{
int i = 0;
int length = 0;
while (input[i] != '\0')
length++;
return length;
}
int main(void)
{
printf("Give me a string: ");
string s = GetString();
printf("The length is %d\n", length(s));
return 0;
}
When we run this program and enter a string at the prompt, it appears to hang. Why? Our while loop is infinite because we’re never updating i
. Know that Ctrl + C will kill your program when it’s hung.
#include <cs50.h>
#include <stdio.h>
int length(string input)
{
int i = 0;
int length = 0;
while (input[i] != '\0')
{
length++;
i++;
}
return length;
}
int main(void)
{
printf("Give me a string: ");
string s = GetString();
printf("The length is %d\n", length(s));
return 0;
}
Woohoo, it works! But let’s optimize it:
#include <cs50.h>
#include <stdio.h>
int length(string input)
{
int length = 0;
while (input[length] != '\0')
length++;
return length;
}
int main(void)
{
printf("Give me a string: ");
string s = GetString();
printf("The length is %d\n", length(s));
return 0;
}
i
.strlen
on every iteration of a loop, this might not have been necessary. Compilers are often smart enough to know that the return value of a function like this won’t change, so they’ll optimize it for us by only calling it once and storing the result. Still, for the sake of your problem sets, it’s best not to assume!n
, or some arbitrary number.