30 October 2012

Part 7 - Counting newlines and other characters

Topics covered:

Counting newlines
Undefined values
Numeric values of characters



Watch video >>


Downloadable files

countchars.c


Exercises

1. Modify the program countchars.c to count the number of newlines. A newline has the character value '\n'. Notice that '\n' is a single character.

2. Write a program charvalue.c which will show the numeric value of every character read from the standard input. For example, if the characters "abcdefghijklmnopqrstuvwxyz" are read from the standard input, the program should output the numeric value of each Latin small letter.

Questions

1. What is the value of a local variable in C if it has not yet been assigned a value? Such values are sometimes called garbage or junk values. What can you expect to happen if you use an undefined value in a calculation?

The particular way that characters are mapped to integers is called the character encoding. A popular encoding is ASCII, which assigns a 7-bit integer to each character. A 7-bit integer can represent only the values 0 to 127, so ASCII cannot handle languages that have large numbers of characters, such as Chinese, Japanese and Korean. These three languages are collectively known as CJK when discussing character encodings.

2. Suppose c is a character from some alphabet ('A', 'B', etc.) in some character encoding. Then the expression (c - 'A' + 1) will give the position of c within its alphabet, counting from 1, where 'A' is the first letter of that alphabet. For example, because the Latin alphabet of capital letters begins with 'A', the expression (c - 'A' + 1) will produce the integer value 26 when c == 'Z'. Will this expression also work with other alphabets and character encodings? What must be true of the underlying character encoding for the expression to be valid?
