C character classification

C character classification is a group of operations in the C standard library that test a character for membership in a particular class of characters, such as alphabetic or control characters. Both single-byte and wide characters are supported.[1]

History

Early C programmers working on the Unix operating system developed programming idioms for classifying characters. For example, the following code evaluates as true for an ASCII letter character c:

('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z')

Eventually, the interface to common character classification functionality was codified in the C standard library file ctype.h.

Implementation

For performance, the standard character classification functions are usually implemented as macros rather than functions. However, because of the limitations of macro evaluation, they are generally no longer implemented the way they were in early versions of Linux, for example:

#define isdigit(c) ((c) >= '0' && (c) <= '9')

This can lead to an error when the macro argument is an expression with a side effect, for example isdigit(x++). If isdigit were a function, x would be incremented only once. With this macro definition, the argument appears twice in the expansion, so x is incremented twice whenever the first comparison succeeds.
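The double evaluation can be observed directly. The following is a minimal sketch, not code from any particular library: the names NAIVE_ISDIGIT, isdigit_func, increments_with_macro and increments_with_function are hypothetical and chosen only for illustration. It reports how far a variable advances when x++ is passed to the macro and to an equivalent function.

```c
/* Naive macro in the style quoted above. The name NAIVE_ISDIGIT is
   hypothetical, chosen to avoid clashing with the real isdigit. */
#define NAIVE_ISDIGIT(c) ((c) >= '0' && (c) <= '9')

/* Function with the same logic: its argument is evaluated exactly once. */
static int isdigit_func(int c)
{
    return c >= '0' && c <= '9';
}

/* Returns how many times x was incremented when x++ is passed to the macro.
   The && operator introduces a sequence point, so the result is well defined:
   2 if the first comparison is true, 1 if it short-circuits. */
int increments_with_macro(int start)
{
    int x = start;
    (void)NAIVE_ISDIGIT(x++);
    return x - start;
}

/* Same experiment with the function: always exactly one increment. */
int increments_with_function(int start)
{
    int x = start;
    (void)isdigit_func(x++);
    return x - start;
}
```

Under these assumptions, increments_with_macro('5') reports two increments, while increments_with_macro(' ') reports one because the first comparison fails and the second is never evaluated. The function version reports one increment in both cases.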

To eliminate this problem, a common implementation has the macro use a table lookup. For example, the standard library provides an array of 256 integers, one for each character value, each of which contains a bit flag for each supported classification. A macro indexes the array by character value and tests the associated bit. For example, if the low bit indicates whether the character is a digit, then the isdigit macro could be written as:

#define isdigit(c) (TABLE[c] & 1)

The macro argument, c, is referenced only once, so is evaluated only once.
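A table-driven macro of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the layout of any real C library: the names CT_DIGIT, ct_table, ct_init, MY_ISDIGIT, table_isdigit and increments_with_table_macro are all hypothetical, only the digit flag is populated, and EOF is not handled.

```c
#include <limits.h>

/* Hypothetical flag bit; real libraries define their own set of flags. */
#define CT_DIGIT 0x01u

/* One entry per unsigned char value (256 on typical platforms). */
static unsigned char ct_table[UCHAR_MAX + 1];

/* Fill the table once. The C standard guarantees that '0'..'9' are contiguous. */
void ct_init(void)
{
    for (int ch = '0'; ch <= '9'; ch++)
        ct_table[ch] = CT_DIGIT;
}

/* The argument appears exactly once in the expansion, so it is evaluated once.
   The cast to unsigned char keeps the index in range for negative char values. */
#define MY_ISDIGIT(c) (ct_table[(unsigned char)(c)] & CT_DIGIT)

/* Wrapper that normalizes the result to 0 or 1. Call ct_init() first. */
int table_isdigit(int c)
{
    return MY_ISDIGIT(c) != 0;
}

/* Returns how many times x was incremented when x++ is passed to the macro. */
int increments_with_table_macro(int start)
{
    int x = start;
    (void)MY_ISDIGIT(x++);
    return x - start;
}
```

After ct_init() has run, table_isdigit('7') is 1 and table_isdigit('a') is 0, and increments_with_table_macro reports a single increment for any starting value. Adding further classifications would mean assigning more flag bits and filling the corresponding table entries.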

Overview of functions

The functions that operate on single-byte characters are declared in the ctype.h header file (cctype in C++). The functions that operate on wide characters are declared in the wctype.h header file (cwctype in C++).
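As a brief illustration of the single-byte interface, the sketch below uses isdigit and toupper from ctype.h. The helper names count_digits and upcase are hypothetical. The functions take an int whose value must be representable as an unsigned char or equal EOF, so the code casts each char to unsigned char before the call.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the decimal digits in a NUL-terminated string. */
size_t count_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: passing a negative value other than EOF
           to a ctype.h function is undefined behavior. */
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}

/* Converts a string to uppercase in place using toupper. */
void upcase(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}
```

For example, count_digits("a1b22c") counts three digits, and calling upcase on a writable buffer holding "abc1" leaves "ABC1" in it. Characters that have no uppercase form are returned unchanged by toupper.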

The classification is evaluated according to the effective locale.

Byte character   Wide character   Description
isalnum          iswalnum         checks whether the operand is alphanumeric
isalpha          iswalpha         checks whether the operand is alphabetic
islower          iswlower         checks whether the operand is lowercase
isupper          iswupper         checks whether the operand is uppercase
isdigit          iswdigit         checks whether the operand is a digit
isxdigit         iswxdigit        checks whether the operand is a hexadecimal digit
iscntrl          iswcntrl         checks whether the operand is a control character
isgraph          iswgraph         checks whether the operand is a graphical character
isspace          iswspace         checks whether the operand is a whitespace character
isblank          iswblank         checks whether the operand is a blank character
isprint          iswprint         checks whether the operand is a printable character
ispunct          iswpunct         checks whether the operand is punctuation
tolower          towlower         converts the operand to lowercase
toupper          towupper         converts the operand to uppercase
-                iswctype         checks whether the operand falls into a specific class
-                towctrans        converts the operand using a specific mapping
-                wctype           returns a wide character class to be used with iswctype
-                wctrans          returns a transformation mapping to be used with towctrans
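The last four wide-character functions work in pairs: wctype and wctrans look up a descriptor by name, and iswctype and towctrans apply it. The sketch below shows one way to combine them. The helper names in_class and map_char are hypothetical, and the behavior described assumes the standard class names ("alpha", "digit") and mapping names ("toupper", "tolower") in the default "C" locale.

```c
#include <wchar.h>
#include <wctype.h>

/* Returns 1 if wc belongs to the named class (for example "alpha" or "digit"),
   and 0 otherwise, including when the name is not valid in the current locale. */
int in_class(wint_t wc, const char *class_name)
{
    wctype_t desc = wctype(class_name);   /* zero means the name is unknown */
    if (desc == (wctype_t)0)
        return 0;
    return iswctype(wc, desc) != 0;
}

/* Applies the named mapping (for example "toupper") to wc.
   Returns wc unchanged if the mapping name is not valid. */
wint_t map_char(wint_t wc, const char *mapping_name)
{
    wctrans_t desc = wctrans(mapping_name);   /* zero means the name is unknown */
    if (desc == (wctrans_t)0)
        return wc;
    return towctrans(wc, desc);
}
```

With these helpers, in_class(L'a', "alpha") is 1, in_class(L'7', "alpha") is 0, and map_char(L'a', "toupper") yields L'A'. Looking descriptors up by name lets a program select a class at run time, which the fixed functions such as iswalpha do not allow.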

References

  1. Script error: No such module "citation/CS1".