In the course of reading Advanced Linux Programming, I’ve come across a variety of functions: some system-related, some for parsing command-line arguments, and others for mapping memory. In Chapter 8 of ALP, I take a closer look at Linux system calls and library calls. The most difficult part of this chapter is slogging through the API it presents. After all, that’s what this entire chapter is about.
Broadly speaking, functions fall into two general categories:
- library functions are ordinary functions that reside in a library external to your program. The most common example is the Standard C library,
- system calls are implemented in the Linux kernel. A system call isn’t like an ordinary function call, and a special procedure is required to transfer control to the kernel. Fortunately, the GNU C library wraps Linux system calls with functions so that you can call them easily. Examples include
For reference, a list of system calls is available in
The strace command traces the execution of another program, listing any system calls the program makes and any signals it receives. Each line in the output corresponds to a single system call. Be aware that strace will not show ordinary function calls. While strace is of limited use to application programmers, some understanding is useful for debugging purposes.
Testing File Permissions
The access system call determines whether the calling process has access permissions to a file. The access call takes two arguments: the path of the file to check, and a bitwise
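As a rough sketch of how such a check might look, here is a small helper I put together. The name can_read_write is my own invention, not something from ALP; the flags R_OK, W_OK and F_OK and the errno values are the standard ones from unistd.h and errno.h.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the calling process may both read and
   write PATH, and 0 otherwise.  */
int can_read_write (const char *path)
{
  /* The permission flags are combined with bitwise OR.  F_OK on its own
     would only test that the file exists.  */
  if (access (path, R_OK | W_OK) == 0)
    return 1;

  if (errno == ENOENT)
    fprintf (stderr, "%s does not exist\n", path);
  else if (errno == EACCES)
    fprintf (stderr, "%s is not accessible\n", path);
  return 0;
}
```

Keep in mind that the answer can be stale by the time you open the file, so a check like this is a hint rather than a guarantee.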
The fcntl system call is the access point for several advanced operations on file descriptors. Its arguments are simple enough: an open file descriptor, and an operation to be performed.
The fcntl system call allows a program to place a read lock or write lock on a file, similar to mutex locks. Several processes may hold a read lock at the same time, but only one process may hold a write lock. Note that placing a lock does not actually prevent other processes from opening a file, unless they also acquire a lock with
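To make the locking part concrete, here is a minimal sketch. The helper names lock_whole_file and unlock_whole_file are hypothetical; struct flock, F_WRLCK, F_UNLCK, F_SETLK and F_SETLKW are the standard pieces from fcntl.h.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: place a write lock on the whole file, blocking
   until the lock is granted.  Returns 0 on success, -1 on error.  */
int lock_whole_file (int fd)
{
  struct flock lock;
  memset (&lock, 0, sizeof (lock));
  lock.l_type = F_WRLCK;      /* F_RDLCK would request a read lock.  */
  lock.l_whence = SEEK_SET;   /* l_start is relative to the file start.  */
  lock.l_start = 0;
  lock.l_len = 0;             /* A length of 0 means "to end of file".  */
  return fcntl (fd, F_SETLKW, &lock);
}

/* Hypothetical helper: release any lock this process holds on the file.  */
int unlock_whole_file (int fd)
{
  struct flock lock;
  memset (&lock, 0, sizeof (lock));
  lock.l_type = F_UNLCK;
  lock.l_whence = SEEK_SET;
  return fcntl (fd, F_SETLK, &lock);
}
```

The descriptor has to be open for writing to take a write lock. Using F_SETLK instead of F_SETLKW in the first helper would make it fail immediately rather than wait when another process holds a conflicting lock.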
Flushing Disk Buffers
When you write data to a file, the data is not immediately written to disk. To improve performance, the operating system caches the written data in a memory buffer. When the buffer fills or some other condition occurs, the system writes the cached data to disk all at one time. However, this behavior can be undesirable for programs that rely on the integrity of disk-based records, such as a transaction processing system.
This is where the fsync system call comes in handy. It takes one argument, a writable file descriptor, and flushes to disk any data written to this file. The fsync call does not return until the data has been physically written. The fdatasync system call does essentially the same thing, with the sole difference being that fdatasync does not guarantee that the file’s modification time will be updated. In theory, fdatasync can execute faster than fsync. In practice, according to ALP, fsync and fdatasync do the same thing, at least in the Linux versions the book covers.
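Here is a small sketch of the kind of journal write a transaction system might do. The function name write_journal_entry and the file mode are my own choices for illustration; open, write, fsync and close are the ordinary calls from fcntl.h and unistd.h.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append one line to a journal file and do not return
   until the data has been flushed.  Returns 0 on success, -1 on error.  */
int write_journal_entry (const char *path, const char *entry)
{
  int fd = open (path, O_WRONLY | O_CREAT | O_APPEND, 0660);
  if (fd == -1)
    return -1;

  size_t len = strlen (entry);
  int ok = write (fd, entry, len) == (ssize_t) len
           && write (fd, "\n", 1) == 1
           && fsync (fd) == 0;   /* fdatasync (fd) could be used here too.  */

  if (close (fd) != 0)
    ok = 0;
  return ok ? 0 : -1;
}
```

Calling write_journal_entry ("journal.log", "begin transaction") would only return once the line has been handed to the disk, which is exactly the cost you pay for the durability.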
The getrlimit and setrlimit system calls allow a process to read and set limits on the system resources that it can consume. This is similar to the ulimit shell command, except that these system calls let a program do it programmatically. Any process may change its soft limit up to the hard limit, but only processes with superuser privilege may raise hard limits. Some useful limits include
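As a quick illustration, the sketch below lowers the soft CPU-time limit of the current process. The helper name limit_cpu_seconds is hypothetical, and I picked RLIMIT_CPU only as an example; struct rlimit, getrlimit and setrlimit come from sys/resource.h.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper: cap the CPU time of the calling process at SECONDS
   by changing the soft limit.  Returns 0 on success, -1 on error.  */
int limit_cpu_seconds (rlim_t seconds)
{
  struct rlimit rl;

  /* Read the current limits first so the hard limit is left untouched.  */
  if (getrlimit (RLIMIT_CPU, &rl) != 0)
    return -1;

  /* The soft limit may never exceed the hard limit.  */
  if (seconds > rl.rlim_max)
    seconds = rl.rlim_max;

  rl.rlim_cur = seconds;
  return setrlimit (RLIMIT_CPU, &rl);
}
```

Once the process has used that much CPU time, the kernel sends it SIGXCPU, which terminates it by default.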
The gettimeofday system call takes a pointer to a struct timeval variable. The structure represents the time, in seconds, split into two fields: tv_sec, the whole number of seconds, and tv_usec, the additional microseconds. The time is measured as the number of seconds elapsed since the start of the UNIX epoch, midnight UTC on January 1, 1970. A raw count of seconds is not very handy to work with directly. Consequently, Linux provides the functions
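Here is a minimal sketch that turns the raw struct timeval into something readable. The helper name format_now and the output format are my own choices; gettimeofday, localtime and strftime are the standard library routines.

```c
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: write the current local time into BUF in the form
   "YYYY-MM-DD HH:MM:SS.mmm".  Returns 0 on success, -1 on error or if BUF
   is too small.  */
int format_now (char *buf, size_t size)
{
  struct timeval tv;
  char stamp[32];

  /* The second argument is an obsolete time zone pointer; pass NULL.  */
  if (gettimeofday (&tv, NULL) != 0)
    return -1;

  /* Break the seconds down into calendar fields in the local time zone.  */
  struct tm *local = localtime (&tv.tv_sec);
  if (local == NULL)
    return -1;
  if (strftime (stamp, sizeof (stamp), "%Y-%m-%d %H:%M:%S", local) == 0)
    return -1;

  /* Convert the microseconds to milliseconds for display.  */
  long ms = (long) tv.tv_usec / 1000;
  int n = snprintf (buf, size, "%s.%03ld", stamp, ms);
  return (n > 0 && (size_t) n < size) ? 0 : -1;
}
```

Note that localtime returns a pointer to static storage, so this sketch is not safe to call from several threads at once.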
Locking Physical Memory
The mlock family of system calls allows a program to lock some or all of its address space into physical memory. This prevents Linux from paging this memory to swap space, even if the program hasn’t accessed it for a while. Such a technique is commonly used in high-security applications, like gnupg.
Locking a region of memory is simple. Specify a pointer to the start of the region and the length of the region. Because Linux locks whole pages at a time, every page that contains part of the region is locked. Thus, to allocate and lock 32MB of address space you would use the following code:
const size_t alloc_size = 32 * 1024 * 1024;  /* 32MB */
char *memory = malloc (alloc_size);
if (memory == NULL || mlock (memory, alloc_size) != 0)
  perror ("cannot lock memory");
Note, however, that simply allocating memory and locking it with mlock does not reserve physical memory because the pages may be copy-on-write.
Reading Symbolic Links
The readlink system call retrieves the target of a symbolic link. Unusually, readlink does not NUL-terminate the target path that it writes into the buffer. It does, however, return the number of characters in the path, so terminating the string yourself is trivial.
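A minimal sketch of that termination step follows. The wrapper name read_link_target is hypothetical; readlink itself is declared in unistd.h.

```c
#include <unistd.h>

/* Hypothetical helper: copy the target of the symbolic link LINK_PATH into
   BUF and NUL-terminate it.  Returns the length of the target, or -1 on
   error (for example, if LINK_PATH is not a symbolic link).  A target
   longer than SIZE - 1 bytes is silently truncated.  */
ssize_t read_link_target (const char *link_path, char *buf, size_t size)
{
  if (size == 0)
    return -1;

  /* Ask for one byte less than the buffer holds, leaving room for the
     terminator that readlink does not write.  */
  ssize_t len = readlink (link_path, buf, size - 1);
  if (len == -1)
    return -1;

  buf[len] = '\0';
  return len;
}
```

If the returned length equals size - 1, the path may have been cut short, so a careful caller would retry with a larger buffer.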
Fast Data Transfers
The sendfile system call provides an efficient mechanism for copying data from one file descriptor to another. The conventional technique is to allocate a buffer, copy from one file descriptor into the buffer, write the buffer out to the other descriptor, and repeat until all the data has been copied. This process is inefficient: it requires additional memory, plus an extra copy to put data into the buffer. With sendfile, the intermediate buffer is eliminated.
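Here is a sketch of a file copy built on sendfile. The function name copy_file is my own. One assumption worth flagging: writing to a regular file as the destination works on the kernel ALP describes and on Linux 2.6.33 and later, but some kernels in between required the destination to be a socket.

```c
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy the file SRC to DST without a user-space
   buffer.  Returns 0 on success, -1 on error.  */
int copy_file (const char *src, const char *dst)
{
  int in_fd = open (src, O_RDONLY);
  if (in_fd == -1)
    return -1;

  /* The size tells sendfile how many bytes to transfer.  */
  struct stat st;
  if (fstat (in_fd, &st) != 0)
    {
      close (in_fd);
      return -1;
    }

  int out_fd = open (dst, O_WRONLY | O_CREAT | O_TRUNC, st.st_mode & 0777);
  if (out_fd == -1)
    {
      close (in_fd);
      return -1;
    }

  /* sendfile may transfer fewer bytes than requested, so loop.  It
     advances OFFSET by the number of bytes it copied.  */
  off_t offset = 0;
  int status = 0;
  while (offset < st.st_size)
    {
      ssize_t sent = sendfile (out_fd, in_fd, &offset, st.st_size - offset);
      if (sent <= 0)
        {
          status = -1;
          break;
        }
    }

  close (in_fd);
  if (close (out_fd) != 0)
    status = -1;
  return status;
}
```

Compared with a read and write loop, there is no buffer to allocate and the data never has to pass through the program's own memory.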
Setting Interval Timers
I’ve seen the setitimer system call used in autograding software but never quite understood what it did until now. The setitimer system call is a generalization of the alarm call. It schedules the delivery of a signal at some point in the future after a fixed amount of time has elapsed. There are three types of timers available:
And that, my friends, concludes Chapter 8.