Chapter 7. Carefully Call Out to Other Resources

 

Do not put your trust in princes, in mortal men, who cannot save.

 Psalms 146:3 (NIV)
Table of Contents
7.1. Limit Call-outs to Valid Values
7.2. Check All System Call Returns

7.1. Limit Call-outs to Valid Values

Ensure that any call out to another program only permits valid and expected values for every parameter. This is more difficult than it sounds, because there are many library calls or commands call lower-level routines in potentially surprising ways. For example, several system calls, such as popen(3) and system(3), are implemented by calling the command shell, meaning that they will be affected by shell metacharacters. Similarly, execlp(3) and execvp(3) may cause the shell to be called. Many guidelines suggest avoiding popen(3), system(3), execlp(3), and execvp(3) entirely and use execve(3) directly in C when trying to spawn a process [Galvin 1998b]. At the least, avoid using system(3) when you can use the execve(3); since system(3) uses the shell to expand characters, there is more opportunity for mischief in system(3). In a similar manner the Perl and shell backtick (`) also call a command shell; see the section on Perl.

One of the nastiest examples of this problem are shell metacharacters. The standard Unix-like command shell (stored in /bin/sh) interprets a number of characters specially. If these characters are sent to the shell, then their special interpretation will be used unless escaped; this fact can be used to break programs. According to the WWW Security FAQ [Stein 1999, Q37], these metacharacters are:
& ; ` ' \ " | * ? ~ < > ^ ( ) [ ] { } $ \n \r

Unfortunately, in real life this isn't a complete list. Here are some other characters that can be problematic:

Forgetting one of these characters can be disastrous, for example, many programs omit backslash as a metacharacter [rfp 1999]. As discussed in the section on validating input, a recommended approach by some is to immediately escape at least all of these characters when they are input. But again, by far and away the best approach is to identify which characters you wish to permit, and only use those characters.

A number of programs have ``escape'' codes that perform ``extra'' activities; make sure that these can't be included (unless you intend for them to be in the message). For example, many line-oriented mail programs (such as mail or mailx) use tilde (~) as an escape character, which can then be used to send a number of commands. As a result, apparantly-innocent commands such as ``mail admin < file-from-user'' can be used to execute arbitrary programs. Interactive programs such as vi and emacs have ``escape'' mechanisms that normally allow users to run arbitrary shell commands from their session. Always examine the documentation of programs you call to search for escape mechanisms.

The issue of avoiding escape codes even goes down to low-level hardware components and emulators of them. Most modems implement the so-called ``Hayes'' command set, in which the sequence ``+++'', a delay, and then ``+++'' again forces the modem to switch modes (and interpret following text as commands to it). This can be used to implement denial-of-service attacks or even forcing a user to connect to someone else. Many ``terminal'' interfaces implement the escape codes of ancient, long-gone physical terminals like the VT100. These codes can be useful, for example, for changing font color in a terminal interface. However, do not allow arbitrary untrusted data to be sent directly to a terminal screen, because some of those codes can cause serious problems. On some systems you can remap keys; on a few you can even send codes to clear the screen, display a set of commands you'd like the victim to run, and then send that set ``back'' (forcing the victim to run the commands of the attacker's choosing).

A related problem is that the NIL character (character 0) can have surprising effects. Most C and C++ functions assume that this character marks the end of a string, but string-handling routines in other languages (such as Perl and Ada95) can handle strings containing NIL. Since many libraries and kernel calls use the C convention, the result is that what is checked is not what is actually used [rfp 1999].

When calling another program or referring to a file always specify its full path (e.g, /usr/bin/sort). For program calls, this will eliminate possible errors in calling the ``wrong'' command, even if the PATH value is incorrectly set. For other file referents, this reduces problems from ``bad'' starting directories.