The Small-C Handbook




















For makers looking to use the smallest microcontrollers or to wring the highest performance out of larger ones, the C language is still the best option.

This practical book provides a solid grounding in C basics for anyone who tinkers with programming microcontrollers. You'll explore the many ways C enables developers and makers to get big results out of tiny devices. Author Marc Loy shows you how to write clean, maintainable C code from scratch. By understanding C syntax and its quirks, you'll gain an enduring computer language literacy that will help you pick up new languages and styles more easily.

There is no significance associated with any particular character positions within a line, and both multistatement lines and multiline statements are allowed. The only exceptions are the preprocessor commands (see Chapter 15), each of which is written on a line by itself.

So far, C programs have been described in terms of a single source file, except that code from other files may be included into programs at designated points. But programs may also consist of several source files which are compiled separately. The Small-C compiler provides for these subprograms to be combined either at assembly time or at load time, but not both.

See Chapter If the compiler is configured to support load-time linking, each global variable referenced in one source file but residing in another must be declared external in the file making the reference.

See Chapter 8. Functions which do not exist in a source file that contains calls to them are assumed to be external. The compiler automatically declares each global entity (variable, array, pointer, or function) as an entry point. However, if the compiler is configured to support assembly-time combining of program parts, none of these is true, since the assembler will see all of the parts as a single program.

To summarize, a Small-C program consists of one or more source files. Three mechanisms are provided for bringing together the parts of a program. First, source code from one file may be included into another file by means of the #include statement. Second, the parts of a program may be compiled separately and then assembled together. This approach requires that the assembler support an include feature similar to the one in C compilers. Finally, the parts may be compiled and assembled separately, and then linked together by means of a linking loader or a linkage editor.

Each source file consists of preprocessor commands and a list of global declarations for variables, arrays, pointers, and functions. Each function in turn consists of a declarator and a body.

The declarator names the function and gives local names to the arguments it receives. The body includes type declarations for the arguments and a compound statement consisting of local-variable declarations, executable statements, and other compound statements. These compound statements in turn have their own local variable declarations, statements, and compound statements; and so on. Small-C programs begin execution with a function called main, and other functions receive control only when they are specifically called.

A function is called by writing its name followed by a list of zero or more arguments enclosed in parentheses.

Chapter 6 Small-C Language Elements

Perhaps the first thing that catches your eye about the program in Listing 5-1 is that, for the most part, it is written in lower-case letters.

All keywords such as int, while, and if must be in lower case. User-defined names, however, may be in either upper or lower case. It is customary to use lower case everywhere except for symbols defined with the #define statement. Making these upper case calls attention to the fact that they are not variable names, but usually constants. Symbols (data names, etc.) are significant only in their first eight characters. Thus, the names nameindex1 and nameindex2 will both be seen by the compiler as nameinde.
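The truncation effect can be checked with a few lines of C. This is only a sketch: the eight-character limit is taken from the nameindex example above, and the helper name names_collide is made up for illustration.

```c
#include <string.h>

/* Assumption: only the first 8 characters of a symbol are significant,
   as the nameindex1/nameindex2 example implies. */
#define SIGNIFICANT 8

/* Hypothetical helper: returns 1 when two names look identical to a
   compiler that keeps only the first SIGNIFICANT characters, else 0. */
int names_collide(const char *a, const char *b)
{
    return strncmp(a, b, SIGNIFICANT) == 0;
}
```

Calling names_collide("nameindex1", "nameindex2") reports a collision, while shorter names such as "index1" and "index2" remain distinct.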

Symbols must begin with a letter, and the remaining characters must be either letters or digits. The underscore character (_) may be used as a letter, however. Every global name defined to the Small-C compiler generates an assembly language label of the same name. Some assemblers restrict labels to six characters and disallow certain special characters and reserved symbols. CPU register names and assembler directives, for example, would likely be reserved. So, it is best to stay away from such names at the global level and to choose names in which the first six characters are unique.

Also, it is best to avoid names beginning with the underscore character or the letters cc, since they are used by low-level library functions. There are no problems of this sort with local names because they are allocated on the stack and are referenced relative to the top of the stack rather than by name.
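The basic spelling rule for symbols (a letter first, then letters or digits, with the underscore counting as a letter) is easy to express as a small check. The function name is_valid_symbol is an invention for this sketch, and the check deliberately ignores assembler-specific restrictions such as reserved register names.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if s follows the symbol rule described
   in the text, 0 otherwise. The underscore is treated as a letter. */
int is_valid_symbol(const char *s)
{
    /* First character: a letter (or underscore). An empty string fails here. */
    if (!(isalpha((unsigned char)*s) || *s == '_'))
        return 0;

    /* Remaining characters: letters, digits, or underscores. */
    for (s++; *s != '\0'; s++)
        if (!(isalnum((unsigned char)*s) || *s == '_'))
            return 0;

    return 1;
}
```

A name such as "_tmp" passes this check even though the text advises against leading underscores at the global level; that advice is a convention, not a syntax rule.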

Semicolons are used primarily as statement terminators. A semicolon is placed at the end of every elementary (noncompound) statement. Quotation marks (double quotes), by analogy, surround strings of characters representing arrays of character values. Braces enclose compound statements, that is, blocks of statements that are executed together as though they were a single statement.

Brackets enclose array dimensions in declarations and subscripts in expressions. Parentheses enclose argument lists associated with function declarations and calls. They are also used to group expressions into subexpressions for controlling the order of evaluation. As Table 13-1 illustrates, a number of special characters are used as expression operators. In many cases, a pair of characters constitutes a single operator.

Comments are ignored by the compiler, but appear in the listing that it produces if requested. These are the elements of the C language. How they are used will become clearer in the following chapters.

Chapter 7 Constants

Small-C recognizes two types of constants: integers and characters. Integer constants are written as a string of decimal digits. Negative values are written with a leading minus sign. Positive values have no sign or perhaps a leading plus sign.

Most implementations of Small-C represent integer constants internally as signed 16-bit words. This limits their range to the positive values 0 through 32,767 and the negative values -32,768 through -1. It is noteworthy that the negative range has the same binary representations as the unsigned values 32,768 through 65,535. The Small-C compiler accepts these unsigned values and yields their negative counterparts.
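The correspondence between the upper unsigned values and the negative range can be sketched in portable C. The code below assumes 16-bit two's-complement words, as the text describes, and simulates them with long arithmetic so that it behaves the same on any modern compiler; the function names are made up for this illustration.

```c
/* Read a 16-bit pattern, given as 0..65535, as a signed 16-bit value. */
long as_signed16(long pattern)
{
    return (pattern >= 32768L) ? pattern - 65536L : pattern;
}

/* Compare two 16-bit patterns under a signed reading. */
int less_signed(long a, long b)
{
    return as_signed16(a) < as_signed16(b);
}

/* Compare the same two patterns under an unsigned reading. */
int less_unsigned(long a, long b)
{
    return a < b;
}
```

For the patterns 40000 and 100, the signed comparison says the first is smaller (it reads as -25536), while the unsigned comparison says it is larger. That difference is why the choice of comparison matters for large positive values.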

When these negative values pass through the assembler, however, they become the same binary patterns as their unsigned counterparts. Therefore, one may write all of the unsigned values 0 through 65,535. But care must be taken to ensure that if large positive values are compared with other operands, an unsigned comparison is performed. This is the case if at least one of the operands being compared is an address. Small-C always takes integer constants as decimal values. Full C, however, also recognizes octal and hexadecimal constants.

In full C, if a string of digits begins with 0 (zero) it is taken as octal; if it begins with 0x or 0X, it is taken as hexadecimal. Small-C recognizes neither of these.

The former case is mistaken for a decimal number, and the latter produces an error message. It is important, therefore, when writing Small-C programs, to avoid placing leading zeroes on numeric constants, since that would cause problems if the programs were ever compiled by a full-C compiler.

It might seem odd that a character constant could have two characters in it, but it makes sense when you consider that, like variables, constants (even character constants) are loaded as full-length words.

Notice that there is no sign extension here, as there is for character variables. Sometimes it is desirable to code certain unprintable characters in a program.

This can be done by using an escape sequence: a sequence of two or more characters in which the first (escape) character changes the meaning of the following character(s). The entire sequence generates only one character. Directed to a CRT, it would place the cursor at the first column of the next line. Written to an output device or a character-stream file, the newline character becomes a sequence of two characters: carriage return and line feed (not necessarily in that order).

Conversely, on input, a carriage return and line feed pair is reduced to a single newline character. Some implementations of C use the carriage return for newline, and others use the line feed. The compiler takes each digit in the range 0-7 following the backslash until three digits have been found or a nonoctal character is found.

Full C differs slightly here. It also takes the digits 8 and 9 to represent the octal values 10 and 11, respectively. There is one other type of escape sequence: anything undefined. If the backslash is followed by any character other than the ones just described, it is ignored, and the following character is taken as the constant value of itself.
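The digit-collecting rule for octal escapes (stop after three digits or at the first nonoctal character) can be sketched as follows. This is an illustration of the rule as stated, not the Small-C compiler's own code; parse_octal_escape and its parameters are hypothetical names.

```c
/* Hypothetical helper: s points just past the backslash.
   Collects up to three digits in the range 0-7, stores the number of
   digits consumed in *used, and returns the accumulated value. */
int parse_octal_escape(const char *s, int *used)
{
    int value = 0;
    int n = 0;

    while (n < 3 && s[n] >= '0' && s[n] <= '7') {
        value = value * 8 + (s[n] - '0');   /* shift in one octal digit */
        n++;
    }
    *used = n;
    return value;
}
```

Given the text "101", the function consumes three digits and yields 65; given "7x", it stops after one digit and yields 7.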

Strictly speaking, C does not recognize character strings, but it does recognize arrays of characters and provides a way to write arrays of character constants, which are called strings. By surrounding a character string with quotation marks (double quotes), you set up an array of the enclosed characters and generate the address of the array.

This last point bears repeating: at the position in the program where it appears, a character string generates the address of an array of character constants which itself is located elsewhere. This is very important to remember. Notice that it differs from a character constant, which generates the value of the constant directly. Since it is a convention in C to identify the end of a character string with a null (zero) byte, C compilers automatically suffix character strings with such terminators.
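Both points, that a string yields an address and that a zero byte marks its end, show up in even the smallest string routine. The sketch below is written in standard C rather than Small-C, and string_length is a name chosen for this example.

```c
/* Receives the address generated by a string and counts characters
   until the terminating zero byte that the compiler appended. */
int string_length(const char *s)
{
    int n = 0;

    while (s[n] != 0)   /* the terminator itself is not counted */
        n++;
    return n;
}
```

In standard C, string_length("abc") counts three characters, while sizeof("abc") reports four bytes because the stored array includes the terminator.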

As with character constants, the backslash escape sequences may be used. Thus, a quotation mark may be written as \". Since strings may contain as few as one or two characters, they provide an alternative way of writing character constants in situations where an address, rather than a character value, is needed. Note that character and string constants must be written entirely on one line.

Chapter 8 Variables

There are only two types of variables in Small-C: integers and characters. Integers occupy a word, and characters a byte, in memory. An important thing to remember about character variables is that whenever one is fetched from memory, it is converted to an integer. The byte itself goes into the low-order position of a register.

The leftmost bit is considered a sign bit, and its value is placed into every bit of the high-order byte. In other words, character variables become integers by extending the sign bit through the high-order byte. Not all C compilers perform sign extension on character variables, so it is important to note that Small-C does, and to consider the possible effect on programs being moved from Small-C to a compiler that does not or vice versa.
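Sign extension can be simulated in a way that does not depend on whether the host compiler's char type is signed. The sketch below works on plain int values holding an 8-bit pattern; the function names are invented for illustration.

```c
/* Widen an 8-bit pattern (0..255) by copying its sign bit upward,
   which is the behavior the text attributes to Small-C. */
int sign_extend(int byte)
{
    byte &= 0xFF;                             /* keep the low-order byte */
    return (byte & 0x80) ? byte - 256 : byte; /* sign bit set: negative  */
}

/* The alternative some compilers use: fill the high-order byte with zeros. */
int zero_extend(int byte)
{
    return byte & 0xFF;
}
```

The pattern 0xFF becomes -1 under sign extension and 255 under zero extension, which is exactly the kind of difference that surfaces when code moves between compilers.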

For example, consider a comparison involving ch, where ch is a character variable holding values from 0 through 255. The problem here is that all values of ch higher than 127 have the sign bit set, making them effectively less than zero. The only case to yield false is when ch is This requires that an integer be reduced to a character by truncating the high-order byte.

A variable is an operand at some location in memory. It is very important to distinguish between the operand itself and its address.

You refer to the operand by writing its name, var for instance. It helps to read the address operator (&) as though it stood for the words address of. Describing a variable implies two operations: declaring its type (integer or character) and defining it in memory (reserving a place for it).

Although both of these are usually involved, variable descriptions are called declarations. External declarations only assign types to variable names; they do not actually define them. The full definition exists in another source file. The examples in Table 8-1 illustrate variable declarations.

Notice that an extern declaration with an unspecified type defaults to type int. The same basic syntax is used to declare pointers, arrays, and external functions. See Chapters 9 and 10.

Table 8-1. Variable Declarations

Declaration    Comment
int i;         Defines i and declares it to be an integer.

Variables declared at the global level are called static variables because they always exist and never lose their values regardless of the flow of control through the program.

The scope of a global variable includes all of the program below the declaration. That is, the variable is known to all of the following functions in the program.

Global names must be unique to the entire program. Local variables, on the other hand, are called automatic variables because they do not exist until the flow of control passes into the block (compound statement) in which they are declared.

They are automatic in the sense that they automatically appear when needed and vanish when no longer needed. The scope of a local variable includes only the block in which it is declared and subordinate blocks. Thus, a reference can be made to local variables that are declared in the block in which the reference occurs, or in superior blocks, but not in subordinate blocks. A local variable name has to be unique only in the block in which it is declared.

A given name may be declared in every block of a program. Each instance defines a different variable. A reference to the name will see the one declared in the lowest block that is identical or superior to the block containing the reference. If no such local variable is found, a global variable by that name is sought by the compiler.

In other words, local declarations mask out higher-level declarations. This is an advantage, since it allows you to declare local variables for temporary use without regard for other uses of the same names elsewhere in the program.
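The masking rule can be seen in a short example. It is written in standard C, which follows the same scoping rule; the variable and function names are arbitrary.

```c
int level = 0;                  /* global */

int see_global(void)
{
    return level;               /* no local declared: the global is found */
}

int see_outer_local(void)
{
    int level = 1;              /* masks the global */
    {
        return level;           /* inner block sees the superior block's level */
    }
}

int see_inner_local(void)
{
    int level = 1;
    int result;
    {
        int level = 2;          /* masks the local declared above */
        result = level;         /* refers to the innermost declaration */
    }
    return result * 10 + level; /* outside the block, level is 1 again */
}
```

Here see_inner_local combines the two readings into one number: the tens digit is the value seen inside the block and the units digit is the value seen after it.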

As it relates to the C language, the word object refers to any area of memory which can be altered. It is a generic term for anything which can be referenced and manipulated. All variables are objects, but not all objects are variables. The next two chapters discuss pointers and arrays, both of which are objects. Constants, on the other hand, are not objects since they cannot be changed.

Chapter 9 Pointers

One feature of C which makes it suitable for systems programming is its capability of working with addresses.

This capability adds a great deal of flexibility to the language and opens up the entire scope of memory for access to C programs. Addresses which are stored in memory like ordinary variables are called pointers. They have names, occupy one computer word each, and are treated much like integer variables. When they are compared to other things, however, both are considered to be unsigned positive integers.

Thus, it makes no sense to compare a pointer with anything but another address. In fact, to maintain compatibility between C compilers, only addresses within an array should be compared. Any other address comparisons would involve assumptions about how some particular compiler organizes program memory. Pointers are typed according to whether they point to integers or characters.

In full C, pointers may refer to other objects, but in Small-C there are only these two. The type of a pointer is important because different objects have different sizes. When pointers are manipulated, it is in terms of the objects they address, not necessarily bytes. Adding one to a character pointer should direct it to the next character, whereas adding one to an integer pointer should direct it to the next integer.
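That pointer arithmetic counts objects rather than bytes can be measured directly. The sketch below is standard C; on a typical modern compiler an int occupies four bytes rather than the two bytes of a Small-C word, so the result is expressed with sizeof instead of a fixed number. The function names are made up for this example.

```c
#include <stddef.h>

/* Number of bytes an int pointer moves when one is added to it. */
size_t int_step(void)
{
    int a[2] = {0, 0};

    return (size_t)((char *)(a + 1) - (char *)a);
}

/* Number of bytes a char pointer moves when one is added to it. */
size_t char_step(void)
{
    char c[2] = {0, 0};

    return (size_t)((c + 1) - c);
}
```

int_step returns sizeof(int) and char_step returns 1, whatever the word size of the machine.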

It follows that any value added to or subtracted from an integer pointer must be scaled by the compiler to account for the fact that integers occupy more than one byte in memory. Some addresses are not pointers, either because they do not have names or because they cannot be modified.

This chapter deals only with addresses that are pointers. The syntax for declaring pointers is the same as that for variables (see Chapter 8), except that pointer names are prefixed with an asterisk. In fact, pointers and variables may be mixed in the same declaration. The asterisk is read as though it stood for the words object at. The idea is that both i and the object at (pointed to by) ip are integers.

Notice that cp takes up a full word since it is a pointer, not a character. If a pointer appears to the left of an assignment operator, or next to an increment or decrement operator, its value is changed. If it appears elsewhere in an expression, its value (normally an address) is fetched and used as is. Thus, pointers can be manipulated like ordinary variables. The only differences are that pointer comparisons always assume unsigned positive values, and pointer increments and decrements are scaled according to the type of the pointer.

See Chapter 10 for more about address arithmetic. More specifically, the pointer is first loaded into a register, and the register then supplies the address for the load or store operation.

That is why the asterisk is called the indirection operator: because the object is referenced indirectly by first obtaining its address. An example may help put all of this together.

Consider the program fragment in the listing. This code adds five characters to corresponding integers. First, the pointer cend is set five characters beyond whatever address is in cp. The while statement then repeatedly tests whether cp is less than cend. If so, the compound statement is performed; if not, control passes to whatever follows. With each execution of the compound statement, the object at cp (a character) is added to the object at ip (an integer).
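The fragment itself is not reproduced in this text. A sketch consistent with the walkthrough might look like the following; the names cp, ip, and cend come from the description, while the enclosing function add_chars and its parameter list are assumptions made so the fragment can stand alone.

```c
/* Hypothetical wrapper around the fragment: adds five characters,
   starting at cp, to the five integers starting at ip. */
void add_chars(char *cp, int *ip)
{
    char *cend;

    cend = cp + 5;              /* five characters beyond cp */
    while (cp < cend) {
        *ip = *ip + *cp;        /* object at cp added to object at ip */
        cp++;                   /* advance to the next character */
        ip++;                   /* advance to the next integer (scaled) */
    }
}
```

Called with a five-element character array and a five-element integer array, the function leaves the characters unchanged and updates each integer in place.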

Then, both cp and ip are incremented to the next objects. Since ip is an integer pointer, each increment advances it by two bytes instead of one. The procedure executes five times.

Arrays are organized as contiguous collections of integer or character variables called elements.

Appending a constant expression in square brackets to a name in a declaration identifies an array with the number of elements indicated by the expression. The examples in Table 10-1 are valid array declarations. SZ must be defined as a constant or a constant expression. You may be troubled by the example of an undimensioned array. How can the compiler work with indefinite array sizes? This points up an important fact about the C language.

Hence, it is not necessary for the compiler to know the sizes of arrays which appear as external declarations or function arguments because they are defined elsewhere. The name of an array stands for the address of the first element in the array. Since arrays are fixed in memory, their addresses are invariant. So while it is valid to use array names as addresses, it is invalid to assign new values to array names.

Array elements, however, can be altered. An array element is referenced by writing the name of the array, followed by a subscript expression in square brackets. Any valid expression (see Chapter 13) may be used as an array subscript. Zero refers to the first element, one to the second, and so on. Thus, the first and last elements of array ca above would be written ca[0] and ca[7], respectively.

It may also be used to get the address of an array element. To refer to an array element, the Small-C compiler adds the subscript (scaled in the case of integer arrays) to the address of the array.

The result points to the object to be fetched or stored. This operation suggests an alternative way for programmers to refer to array elements: by adding subscripts to array names and applying the indirection operator to the result. And indeed they may. Pointers may be subscripted, and array names may be used as addresses. As you have seen, addresses (pointers or array names) may be used freely in expressions. Only two operations make sense, however: displacing an address by some amount, and taking the difference of two addresses.

All other possible operations yield meaningless results. Displacing an array name or a pointer in the positive direction makes sense. Displacing an array name in the negative direction, however, makes no sense because it yields an address outside of the array. Pointers, on the other hand, are not tied to the beginning of an array, so a negative displacement could be useful.
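The equivalence of subscripting and indirection, and the displacement of a pointer in either direction, can be illustrated in standard C. The function names below are invented for this sketch.

```c
/* Fetch element i by subscript. */
int via_subscript(int *a, int i)
{
    return a[i];
}

/* Fetch the same element by adding i to the address and applying
   the indirection operator to the result. */
int via_indirection(int *a, int i)
{
    return *(a + i);
}

/* Displace an address by n objects; n may be negative as long as the
   result still lies within the same array. */
int *displace(int *p, int n)
{
    return p + n;
}
```

Both fetch functions return the same element for the same arguments, and a pointer to the last element of an array can be displaced by a negative amount to reach an earlier one.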

Taking the difference of two addresses yields the number of objects lying between the two addresses. For example, the expression p1 - p2 (where p1 and p2 are integer pointers) produces the number of integers between these addresses.

The compiler generates code to subtract p2 from p1 and then divide the result by the number of bytes per integer. Had these been character pointers, there would have been no division. It should be clear that certain nonsensical expressions of this type might be written too: for example, taking the difference of a character address and an integer address, taking the difference of addresses which are not in the same array, and subtracting a larger address from a smaller one.
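The subtract-then-divide behavior can be observed by comparing the difference in objects with the difference in bytes. This sketch uses standard C and the ptrdiff_t type, which Small-C does not have; the function names are hypothetical.

```c
#include <stddef.h>

/* Difference of two int addresses, counted in integers. */
ptrdiff_t ints_between(const int *lo, const int *hi)
{
    return hi - lo;
}

/* The same difference counted in bytes, before any division. */
ptrdiff_t bytes_between(const int *lo, const int *hi)
{
    return (const char *)hi - (const char *)lo;
}
```

For two elements of the same array, bytes_between equals ints_between multiplied by sizeof(int), which is the division the text describes seen from the other side.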

The compiler will accept these, but the results will not be useful. One last point should be made with regard to address arithmetic: Small-C does not support the unsigned-integer data type.

You may declare a character pointer, however, and use it as though it were an unsigned integer. Nothing requires that the value of a pointer actually be a memory address; it could stand for anything.

However, you must be careful to use character pointers, since the automatic scaling of values added to or subtracted from integer addresses would produce undesirable effects.

The earlier pointer example may be rewritten using array notation. Listing 10-1 shows the result. First, the integer i is set to zero. Then the while statement controls repeated execution of a compound statement which does the work. As long as i is less than five, corresponding elements of the arrays ca and ia are added, with the result going into ia.
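The listing is not reproduced in this text. A sketch that matches the description might look like this; the names i, ca, and ia come from the walkthrough, while the enclosing function add_chars_indexed is an assumption made so the loop can be compiled on its own.

```c
/* Hypothetical wrapper around the loop: adds the first five elements
   of ca to the corresponding elements of ia, using subscripts. */
void add_chars_indexed(char ca[], int ia[])
{
    int i;

    i = 0;
    while (i < 5) {
        ia[i] = ia[i] + ca[i];  /* result goes into the integer array */
        i++;
    }
}
```

The loop does the same work as the pointer version; only the way each element is addressed differs.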

With each iteration, i is incremented by one. When i becomes five, control leaves the while and goes to whatever follows.

Listing 10-1. Example of the Use of Arrays

Chapter 11 Initial Values

The full C language has ways of assigning preliminary values to both global and local objects, but Small-C initializes global objects only.

Local objects always start with unpredictable values. Globals always have predictable initial values: if specific values are not given, then zero is assumed. The advantages of using initial values are somewhat limited, and there is one disadvantage which should be considered.

First, the advantages. By assigning initial values to global objects, you avoid writing assignment statements to do the job. For variables and pointers, the amount of writing saved is only slight.

For arrays, however, the difference is more significant since either an iterative statement or a list of statements must be written. Object-program sizes are a bit smaller when initial values are used because assignment statements are avoided. The speed of execution, however, is not significantly affected since preliminary assignment statements are executed only once. On the negative side, using initial values may result in a loss of serial reusability—the ability to reexecute programs without reloading them.

Some operating systems allow you to reuse programs after previous execution. This is a handy feature with floppy-disk computers since load time is noticeable, especially if the program resides on a floppy diskette which is not currently mounted in a disk drive.

This means that the initial values of its variables must be the same with each execution. If serial reusability is important, you should use assignment statements to give initial values to variables which are changed by program execution. Character constants with backslash escape sequences are permitted.

If array elements are being initialized, a list of constant expressions separated by commas and enclosed in braces is written. If the size of the array is not specified, it is determined by the number of initializers. If the size of the array exceeds the number of initializers, leading elements are initialized and trailing elements default to zero.

If the size of an array is given and there are too many initializers, the compiler generates an error message. Character arrays and character pointers may be initialized with a character string enclosed in quotation marks; a terminating zero byte is automatically generated. The fourth element contains zero.

If the size of an array is not given, it will be set to the size of the string plus one. If the size is given and the string is shorter, trailing elements default to zero. If the string is longer, the array size is increased to match. When a character pointer is initialized, it is set to the address of a string of characters containing the initial values. Table 11—1 shows the permissible combinations of object types and initializers. Full C provides more options for initializing objects, but these are upward-compatible with it.
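Several of these initializer rules carry over unchanged to standard C and can be checked there. The declarations below are illustrative only, and the object names are arbitrary. The Small-C behavior of enlarging an array to fit a longer string is not shown, because standard C does not share it.

```c
int  counts[5] = {1, 2, 3};     /* fewer initializers than elements:
                                   trailing elements default to zero */
int  primes[]  = {2, 3, 5, 7};  /* size taken from the initializers: 4 */
char word[]    = "abc";         /* string length plus one: 4 bytes */
char padded[8] = "hi";          /* shorter string: trailing zeros */
const char *greeting = "hello"; /* pointer set to the string's address */
int  untouched;                 /* no initializer: zero is assumed */
```

The sizes can be confirmed with sizeof: primes holds four integers and word holds four bytes, the last of which is the terminating zero.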

At any point in an expression, a function may be called by writing its name and, in parentheses, a list of zero or more arguments to be passed to the function. A function call always yields a value which may be used in the subsequent evaluation of an expression.

In the example above, the value returned by func is added to 1 to produce the final value of the expression.
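The example this passage refers to is not reproduced in the text. A sketch of the kind of expression described might look like the following, where the body of func and the wrapper compute are invented purely for illustration.

```c
/* Hypothetical function: its body is an assumption for this sketch. */
int func(int x)
{
    return x * 2;
}

/* The call yields a value that takes part in the rest of the expression. */
int compute(int x)
{
    return func(x) + 1;
}
```

With this definition, compute(3) evaluates func(3) first and then adds 1 to the value it returns.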




