tokenscanner.h
Types
TokenType | This enumerated type defines the values of the getTokenType method.
TokenScanner | This abstract type divides a string into individual tokens.
Functions
newTokenScanner() | Creates a new TokenScanner with an empty token stream.
freeTokenScanner(scanner) | Frees the storage associated with the TokenScanner.
setInputString(scanner, str) | Sets the token stream for this scanner to the specified string.
setInputFile(scanner, infile) | Sets the token stream for this scanner to the specified file, which must be open for input.
hasMoreTokens(scanner) | Returns true if there are additional tokens for this scanner to read.
nextToken(scanner) | Returns the next token from this scanner. If nextToken is called when no tokens are available, it returns the empty string.
saveToken(scanner, token) | Pushes the specified token back into this scanner's input stream.
getPosition(scanner) | Returns the current position of the scanner in the input stream.
ignoreWhitespace(scanner) | Tells the scanner to ignore whitespace characters.
ignoreComments(scanner) | Tells the scanner to ignore comments.
scanNumbers(scanner) | Controls how the scanner treats tokens that begin with a digit.
scanStrings(scanner) | Controls how the scanner treats tokens enclosed in quotation marks.
addWordCharacters(scanner, str) | Adds the characters in str to the set of characters legal in a WORD token.
isWordCharacter(scanner, ch) | Returns true if the character is valid in a word.
addOperator(scanner, op) | Defines a new multicharacter operator.
verifyToken(scanner, expected) | Reads the next token and makes sure it matches the string expected.
getTokenType(scanner, token) | Returns the type of this token.
getStringValue(token) | Returns the string value of a token.
typedef enum { SEPARATOR, WORD, NUMBER, STRING, OPERATOR } TokenType;
This enumerated type defines the values of the getTokenType method.
typedef struct TokenScannerCDT *TokenScanner;
This abstract type divides a string into individual tokens. The typical
use of the TokenScanner ADT is illustrated by the following pattern,
which reads the tokens in the string variable input:

   string token;
   TokenScanner scanner;

   scanner = newTokenScanner();
   setInputString(scanner, input);
   while (hasMoreTokens(scanner)) {
      token = nextToken(scanner);
      . . . process the token . . .
      freeBlock(token);
   }
   freeTokenScanner(scanner);

The TokenScanner ADT exports several additional methods that give
clients more control over its behavior. Those methods are described
individually in the documentation.
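To make the scanning loop concrete, here is a minimal standalone sketch of what a single scanning step might do. It does not use or reproduce the library: the name sketchNextToken and its buffer-based interface are hypothetical, and the tokenization rule (runs of letters and digits form one token, every other character is a one-character token) is an assumption based on the default behavior described for nextToken.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of tokenscanner.h.
 * Copies the token starting at input[*pos] into buf and advances *pos.
 * Returns the token length, or 0 at end of input (the empty token).
 * Assumed default rule: a run of letters and digits forms one token;
 * any other character, whitespace included, is a one-character token.
 */
size_t sketchNextToken(const char *input, size_t *pos, char *buf, size_t bufSize) {
    size_t start = *pos;
    size_t len = 0;
    if (bufSize == 0) return 0;
    if (input[start] != '\0') {
        len = 1;
        if (isalnum((unsigned char) input[start])) {
            while (isalnum((unsigned char) input[start + len])) len++;
        }
    }
    if (len >= bufSize) len = bufSize - 1;   /* truncate to fit the buffer */
    memcpy(buf, input + start, len);
    buf[len] = '\0';
    *pos = start + len;
    return len;
}
```

Calling sketchNextToken repeatedly with the same pos variable until it returns 0 plays the role of the hasMoreTokens and nextToken loop, with a caller-supplied buffer standing in for the heap-allocated token that the pattern releases with freeBlock.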
TokenScanner newTokenScanner(void);
Creates a new TokenScanner with an empty token stream.
Before using the scanner, an input stream must be set by calling
either setInputString or setInputFile.
Usage:
scanner = newTokenScanner();
void freeTokenScanner(TokenScanner scanner);
Frees the storage associated with the TokenScanner.
Usage:
freeTokenScanner(scanner);
void setInputString(TokenScanner scanner, string str);
Sets the token stream for this scanner to the specified string.
Usage:
setInputString(scanner, str);
void setInputFile(TokenScanner scanner, FILE *infile);
Sets the token stream for this scanner to the specified file, which must be open for input.
Usage:
setInputFile(scanner, infile);
bool hasMoreTokens(TokenScanner scanner);
Returns true if there are additional tokens for this scanner to read.
Usage:
if (hasMoreTokens(scanner)) . . .
string nextToken(TokenScanner scanner);
Returns the next token from this scanner. If nextToken is called when
no tokens are available, it returns the empty string.
Usage:
token = nextToken(scanner);
void saveToken(TokenScanner scanner, string token);
Pushes the specified token back into this scanner's input stream.
On the next call to nextToken, the scanner will return the saved token
without reading any additional characters from the token stream.
Usage:
saveToken(scanner, token);
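One way to picture the push-back behavior is a small stack of saved tokens that is consulted before any input is read. The sketch below is standalone and hypothetical: SavedTokens, sketchSaveToken, and sketchTakeSaved are illustrative names, and the last-in, first-out order for several saved tokens is an assumption that the text above does not state.

```c
#include <stddef.h>

#define SKETCH_MAX_SAVED 8

/* Hypothetical structure, not the library's TokenScannerCDT. */
typedef struct {
    const char *saved[SKETCH_MAX_SAVED];  /* most recently saved token is last */
    int count;
} SavedTokens;

/* Pushes a token back; returns 0 on success, -1 if the stack is full. */
int sketchSaveToken(SavedTokens *st, const char *token) {
    if (st->count >= SKETCH_MAX_SAVED) return -1;
    st->saved[st->count++] = token;
    return 0;
}

/* Returns the most recently saved token, or NULL if none is saved,
 * in which case a real scanner would read from its input instead. */
const char *sketchTakeSaved(SavedTokens *st) {
    if (st->count == 0) return NULL;
    return st->saved[--st->count];
}
```

A scanner built this way would call sketchTakeSaved at the start of each scanning step and touch the underlying stream only when it returns NULL.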
int getPosition(TokenScanner scanner);
Returns the current position of the scanner in the input stream.
If saveToken has been called, this position corresponds to the
beginning of the saved token. If saveToken is called more than once,
getPosition returns -1.
Usage:
pos = getPosition(scanner);
void ignoreWhitespace(TokenScanner scanner);
Tells the scanner to ignore whitespace characters. By default, the
nextToken method treats whitespace characters (typically spaces and
tabs) just like any other punctuation mark and returns them as
single-character tokens. Calling

   ignoreWhitespace(scanner);

changes this behavior so that the scanner ignores whitespace characters.
Usage:
ignoreWhitespace(scanner);
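The effect of this option can be illustrated by counting tokens with and without whitespace skipping. The following standalone sketch does not call the library; sketchCountTokens and its skipSpace flag are hypothetical, and the rule that runs of letters and digits form a single token is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of tokenscanner.h.
 * Counts the tokens in input under the assumed default rule (runs of
 * letters and digits are one token, every other character is its own
 * token). When skipSpace is true, whitespace characters are skipped
 * instead of being counted as tokens.
 */
size_t sketchCountTokens(const char *input, bool skipSpace) {
    size_t count = 0;
    size_t i = 0;
    while (input[i] != '\0') {
        unsigned char ch = (unsigned char) input[i];
        if (isalnum(ch)) {
            while (isalnum((unsigned char) input[i])) i++;
            count++;
        } else {
            i++;
            if (!(skipSpace && isspace(ch))) count++;
        }
    }
    return count;
}
```

For the input "a + b", this sketch counts five tokens when whitespace is kept and three when it is skipped.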
void ignoreComments(TokenScanner scanner);
Tells the scanner to ignore comments. Calling

   ignoreComments(scanner);

sets the scanner to skip comments instead of returning their contents as tokens.
Usage:
ignoreComments(scanner);
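As a rough illustration of comment skipping, the standalone sketch below advances past a comment that begins at a given position. The comment syntax is an assumption (C-style block comments and line comments), since the text above does not say which formats the scanner recognizes, and sketchSkipComment is a hypothetical name.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of tokenscanner.h.
 * If a comment starts at s[pos], returns the index just past it;
 * otherwise returns pos unchanged. Assumes C-style comments:
 * slash-star through star-slash, and slash-slash through end of line.
 */
size_t sketchSkipComment(const char *s, size_t pos) {
    if (s[pos] != '/') return pos;
    if (s[pos + 1] == '/') {
        pos += 2;
        while (s[pos] != '\0' && s[pos] != '\n') pos++;
        return pos;                       /* newline itself is not consumed */
    }
    if (s[pos + 1] == '*') {
        pos += 2;
        while (s[pos] != '\0') {
            if (s[pos] == '*' && s[pos + 1] == '/') return pos + 2;
            pos++;
        }
        return pos;                       /* unterminated: stop at end of input */
    }
    return pos;
}
```

A lone slash, as in a division expression, is left untouched so that it can still be returned as an ordinary token.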
void scanNumbers(TokenScanner scanner);
Controls how the scanner treats tokens that begin with a digit.
By default, the nextToken method treats numbers and letters
identically and therefore does not provide any special processing for
numbers. Calling

   scanNumbers(scanner);

changes this behavior so that nextToken returns the longest substring
that can be interpreted as a real number.
Usage:
scanNumbers(scanner);
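The "longest substring" rule can be sketched as a small hand-written recognizer. This is a standalone illustration, not the library's code: sketchNumberLength is a hypothetical name, and the number grammar it accepts (digits, an optional fraction, an optional exponent) is an assumption about what counts as a real number.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of tokenscanner.h.
 * Returns the length of the longest prefix of s that reads as an
 * unsigned real number, or 0 if s does not start with a digit.
 * Assumed grammar: digits, optional '.' and fraction digits, optional
 * exponent ('e' or 'E', optional sign, at least one digit).
 */
size_t sketchNumberLength(const char *s) {
    size_t i = 0;
    if (!isdigit((unsigned char) s[0])) return 0;
    while (isdigit((unsigned char) s[i])) i++;
    if (s[i] == '.') {
        i++;
        while (isdigit((unsigned char) s[i])) i++;
    }
    if (s[i] == 'e' || s[i] == 'E') {
        size_t j = i + 1;
        if (s[j] == '+' || s[j] == '-') j++;
        if (isdigit((unsigned char) s[j])) {
            while (isdigit((unsigned char) s[j])) j++;
            i = j;                        /* accept the exponent only if complete */
        }
    }
    return i;
}
```

The exponent is consumed only when at least one digit follows it, so an input such as "2e+x" yields the one-character number "2" and leaves "e+x" for later tokens.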
void scanStrings(TokenScanner scanner);
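Controls how the scanner treats tokens enclosed in quotation marks.
By default, quotation marks (either single or double) are treated just
like any other punctuation character. Calling

   scanStrings(scanner);

changes this assumption so that nextToken returns a single token
consisting of all characters through the matching quotation mark.
The quotation marks are returned as part of the scanned token so that
clients can differentiate strings from other token types.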
Usage:
scanStrings(scanner);
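Scanning through the matching quotation mark might look like the standalone sketch below. It is not the library's implementation: sketchStringLength is a hypothetical name, and both the backslash-escape handling and the choice to return 0 for an unterminated string are assumptions.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of tokenscanner.h.
 * If s begins with a single or double quotation mark, returns the length
 * of the string token including both quotation marks; returns 0 if s
 * does not begin with a quotation mark or the closing mark is missing.
 * Assumption: a backslash escapes the character that follows it.
 */
size_t sketchStringLength(const char *s) {
    char quote = s[0];
    size_t i = 1;
    if (quote != '"' && quote != '\'') return 0;
    while (s[i] != '\0') {
        if (s[i] == '\\' && s[i + 1] != '\0') {
            i += 2;                       /* skip the escaped character */
        } else if (s[i] == quote) {
            return i + 1;                 /* include the closing quotation mark */
        } else {
            i++;
        }
    }
    return 0;
}
```

Because the returned length covers both quotation marks, a caller can still recognize the token as a string by inspecting its first character.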
void addWordCharacters(TokenScanner scanner, string str);
Adds the characters in str to the set of characters legal in a WORD
token. For example, calling addWordCharacters(scanner, "_") adds the
underscore to the set of characters that are accepted as part of a word.
Usage:
addWordCharacters(scanner, str);
bool isWordCharacter(TokenScanner scanner, char ch);
Returns true if the character is valid in a word.
Usage:
if (isWordCharacter(scanner, ch)) . . .
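The pairing of addWordCharacters and isWordCharacter suggests a per-scanner character set. A minimal standalone sketch of that idea follows; WordChars and the sketch functions are hypothetical names, and treating letters and digits as word characters by default is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical structure, not the library's TokenScannerCDT. */
typedef struct {
    bool extra[256];   /* extra[c] is true if c was added as a word character */
} WordChars;

/* Starts with no extra word characters. */
void sketchInitWordChars(WordChars *wc) {
    memset(wc->extra, 0, sizeof wc->extra);
}

/* Marks every character of str as legal in a word. */
void sketchAddWordCharacters(WordChars *wc, const char *str) {
    for (; *str != '\0'; str++) {
        wc->extra[(unsigned char) *str] = true;
    }
}

/* Letters and digits are assumed to be word characters by default. */
bool sketchIsWordCharacter(const WordChars *wc, char ch) {
    unsigned char c = (unsigned char) ch;
    return isalnum(c) || wc->extra[c];
}
```

A lookup table indexed by character code keeps the membership test constant-time, which matters because a scanner performs it for every input character.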
void addOperator(TokenScanner scanner, string op);
Defines a new multicharacter operator. Whenever you call nextToken
when the input stream contains operator characters, the scanner
returns the longest possible operator string that can be read at that
point.
Usage:
addOperator(scanner, op);
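The longest-match rule can be sketched as a scan over the registered operators. The code below is a standalone illustration rather than the library's method: sketchOperatorLength is a hypothetical name, the operators are passed in as a plain array, and falling back to a one-character token when nothing matches is an assumption.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of tokenscanner.h.
 * Returns the length of the longest operator in ops (an array of nOps
 * strings) that is a prefix of s. If none matches, returns 1 when s is
 * nonempty (a single-character token) and 0 when s is empty.
 */
size_t sketchOperatorLength(const char *s, const char *const ops[], size_t nOps) {
    size_t best = (s[0] != '\0') ? 1 : 0;
    for (size_t k = 0; k < nOps; k++) {
        size_t len = strlen(ops[k]);
        if (len > best && strncmp(s, ops[k], len) == 0) best = len;
    }
    return best;
}
```

With the operators "<=", "<<", and "<<=" registered, the input "<<=1" yields a three-character operator, while "<x" falls back to the single character "<".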
void verifyToken(TokenScanner scanner, string expected);
Reads the next token and makes sure it matches the string expected.
If it does not, verifyToken throws an error.
Usage:
verifyToken(scanner, expected);
TokenType getTokenType(TokenScanner scanner, string token);
Returns the type of this token. This type will match one of the
following enumerated type constants: EOF, SEPARATOR, WORD, NUMBER,
STRING, or OPERATOR.
Usage:
type = getTokenType(scanner, token);
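A token's type can plausibly be decided from its first character. The standalone sketch below shows one such classification; the SketchTokenType enum and the sketchTokenType function are hypothetical, and the mapping from first character to type is an assumption, not the library's documented rule.

```c
#include <ctype.h>

/* Hypothetical enum mirroring the constants listed above; the SK_ prefix
 * avoids clashing with the library's TokenType and with stdio's EOF. */
typedef enum {
    SK_EOF, SK_SEPARATOR, SK_WORD, SK_NUMBER, SK_STRING, SK_OPERATOR
} SketchTokenType;

/*
 * Hypothetical helper, not part of tokenscanner.h.
 * Classifies a token by its first character. The mapping is an
 * assumption: empty token, whitespace, quotation mark, digit, letter,
 * and anything else, checked in that order.
 */
SketchTokenType sketchTokenType(const char *token) {
    unsigned char ch = (unsigned char) token[0];
    if (ch == '\0') return SK_EOF;
    if (isspace(ch)) return SK_SEPARATOR;
    if (ch == '"' || ch == '\'') return SK_STRING;
    if (isdigit(ch)) return SK_NUMBER;
    if (isalpha(ch)) return SK_WORD;
    return SK_OPERATOR;
}
```

Unlike the real function, this sketch takes no scanner argument, so it cannot account for characters registered through addWordCharacters.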
string getStringValue(string token);
Returns the string value of a token.
Usage:
str = getStringValue(token);