Flashback

Uncategorized — Titus Barik on November 30, 2004 at 12:00 am

It’s been a long week. I returned to Atlanta on Tuesday, bearing foodstuffs and cookies from home. This past Sunday I had lunch with Sarah at the new Moe’s Southwest Grill, and we had a chance to catch up on things. She is doing well. I met with Ryan a few times this past week, was introduced to Sid Meier’s Pirates and Vampire: The Masquerade on the computer gaming front, and knocked out Red Dragon, Kill Bill 2, and Kevin Smith’s classic, Mallrats, somewhere along the way.

On Monday we met with Rtech on Lakeside Drive, a small home automation company that specializes in audio and video equipment, to get some ideas for our new house. Smarthome has some X10 protocol automation tools that might be worth investigating. Finally, made a quick stop at Spectronics in a vain attempt to resurrect a few dead APC systems.

Vonage

Uncategorized — Titus Barik on November 27, 2004 at 8:16 pm

I made the plunge and subscribed to Vonage, a broadband phone company that provides VoIP service. This decision is partly due to the outrageous phone bills that I’ve been having to pay lately, as well as the generally poor voice quality associated with cellular networks. While I was at it, I also went ahead and added the phone number to the National Do Not Call Registry. The equipment should arrive within the next week and when it does, I’ll probably reorganize the network using their alternate multiple computer setup. I envision some QoS problems along the way, but those should be addressed through the use of the latest Linksys firmware, and if not, with the Sveasoft third-party firmware. I hope that this will be a good learning experience as well.

GeekOS

Uncategorized — Titus Barik on November 26, 2004 at 4:24 pm

When teaching operating system concepts, there are basically two prevalent approaches. The first is abstract, and uses a virtual machine or some other user space tool to simulate an operating system environment with the aid of a high-level language. The second approach is concrete, and uses a microkernel or toy operating system that can run on actual hardware.

Having recently graduated from Georgia Tech, I believe that I’m not alone in saying that I come from the first camp of non-happy campers, though allow me to digress for just a moment.

The first half of our operating systems and architecture course, ECE3055, is admittedly excellent. The course provides a spectacular introduction to hardware and the MIPS architecture through the use of Hennessy and Patterson. And unlike many other colleges, we use a combination of VHDL and Altera tools to work with actual, honest-to-god, FPGA processors. Finally, the SPIM simulator is used to experiment with assembly level programming for the MIPS.

Sadly, these qualities don’t carry through to the second half of the semester. Here, Georgia Tech uses higher-level languages in an ill-conceived attempt to introduce students to genuine operating systems concepts such as file systems, memory allocation, and scheduling. In my opinion, that’s not the way to go. The use of these languages results in a very ethereal experience. And there’s just something distant and not quite satisfying about implementing a FAT virtual file system in a simulation environment on top of Java. In fact, I would go as far as to say that such a pedagogy actually turns away students who might have otherwise taken an interest in operating system design.

As a result, it’s difficult to see how such an implementation might translate to a real-world operating system. Disappointed, students like myself graduate with a very brittle and fragmented idea of what an operating system is all about.

This time, I’d like to do it right. The obvious choice appears to be MINIX. It’s a fully complete operating system with excellent documentation and a modular design that seems well suited for an undergraduate curriculum. But at the same time, maybe it’s just a little too much to absorb at once.

That’s where GeekOS comes in. I recently learned about this peculiar operating system and found that GeekOS follows the concrete methodology, with a tiny operating system kernel for x86 PCs. It’s also about as bare bones as you can get. The operating system is currently being used at the University of Maryland with great success. Additionally, David Hovemeyer has some useful information on hacking the GeekOS. I think that it might be worthwhile to look at GeekOS while simultaneously working with MINIX.

Inline Assembly Code

Uncategorized — Titus Barik on November 25, 2004 at 6:55 pm

At a mere eight pages, Chapter 9 of ALP is the shortest yet. But in my opinion, it’s also one of the most interesting. In this chapter I explore the use of inline assembly with the popular gcc compiler on the x86 architecture. As a rare exception, I’ll deviate from the book and take a top-down approach. I’ll also skip the pros and cons of assembly language programming as most people are already familiar with the core issues. This article is intentionally brief.

Source and Destination

GCC’s inline assembly uses AT&T syntax, in which the source operand comes first and the destination follows. This differs from Intel syntax, where the source comes after the destination. Consequently, to transfer the contents of eax to ebx:

mov %eax, %ebx

GCC Syntax

The basic GCC syntax skeleton follows, and will be clarified through the use of examples:

asm ( assembler template
      : output operands
      : input operands
      : list of clobbered registers       
     );

In a manner of speaking, the asm statement is little more than a glorified preprocessor. GCC produces assembly instructions to deal with the asm’s operands, and it replaces the asm statement with the instruction that you specify. It does not analyze the instruction in any way.

Simple Inline Assembly

Let’s begin by looking at an example to shift a value 8 bits to the right:

asm ("shrl $8, %0"
     : "=r" (answer)
     : "0" (operand)
     : "cc");

Here, it’s clear that we’re using the shrl instruction. $8 indicates an immediate constant, or in our case, the shift amount. Output operands are indicated through the use of the = sign. Thus, the result of the operation will be stored in answer. The input comes from the variable operand; its "0" constraint tells GCC to place it in the same register as operand 0, which matters because shrl shifts that register in place. The last parameter indicates that the instruction changes the value in the condition code cc register.

Optimizations

The golden rule of writing programs is to never try to outwit the compiler. It’s smarter than you are. And that’s generally true. Here, however, let’s look at an example where the use of inline assembly can actually benefit us.

For example, the bsrl assembly instruction computes the position of the most significant set bit. Thus, we could use this asm statement:

asm ("bsrl %1, %0" 
     : "=r" (position) 
     : "r" (number));

Re-writing the code in conventional C requires quite a bit more effort:

long i;
for (i = (number >> 1), 
        position = 0; i != 0; ++position)
  i >>= 1;
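
For a side-by-side comparison, here is a rough sketch that wraps both versions in functions. The names msb_position_loop and msb_position_asm are made up for illustration and do not come from ALP, and the preprocessor guard assumes a GNU-compatible compiler targeting x86. Note that bsrl leaves its result undefined when the input is zero, so neither the sketch nor the statement above should be trusted for that case.

```c
/* The conventional C loop from above, wrapped in a function. */
unsigned msb_position_loop(unsigned number)
{
    unsigned position = 0;
    unsigned i;

    for (i = number >> 1; i != 0; ++position)
        i >>= 1;
    return position;
}

/* The bsrl version. Falls back to the loop when not built by a
   GNU-compatible compiler for x86. The result is undefined for
   number == 0, because bsrl does not define it. */
unsigned msb_position_asm(unsigned number)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned position;

    __asm__ ("bsrl %1, %0"
             : "=r" (position)
             : "r" (number)
             : "cc");
    return position;
#else
    return msb_position_loop(number);
#endif
}
```

Both functions should agree for any nonzero input, which makes the loop a handy reference when testing the assembly.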

Conclusion

There’s a lot more that you can do with inline assembly. See the resources below for extended examples. Oh, and have a Happy Turkey Day.

Resources

Linux System Calls

Uncategorized — Titus Barik on November 24, 2004 at 12:00 am

In the course of reading Advanced Linux Programming, I’ve come across a variety of functions: some system-related, some for parsing command-line arguments, and others for mapping memory. In Chapter 8 of ALP, I take a closer look at Linux system calls and library calls. The most difficult part of this chapter is trudging through the API presented. After all, that’s what this entire chapter is all about.

Function Categories

Broadly speaking, functions fall into two general categories:

  1. library functions are ordinary functions that reside in a library external to your program. The most common example is the Standard C library, libc.
  2. system calls are implemented in the Linux kernel. A system call isn’t like an ordinary function call, and a special procedure is required to transfer control to the kernel. Fortunately, the GNU C library wraps Linux system calls with functions so that you can call them easily. Examples include open and read.

For reference, a list of system calls is available in /usr/include/asm/unistd.h.

strace

The strace command traces the execution of another program, listing any system calls the program makes and any signals it receives. Each line in the output corresponds to a single system call. Be aware that strace will not show ordinary function calls. While strace is of limited use to application programmers, some understanding is useful for debugging purposes.

Testing File Permissions

The access system call determines whether the calling process has access permissions to a file. The access call takes two arguments: the path of the file to check, and a bitwise OR of R_OK, W_OK, and X_OK.
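
As a small illustration, the sketch below checks read permission on a path. The helper name can_read and its return convention are assumptions made for this example, not part of ALP or the C library.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the calling process may read path, 0 if permission
   is denied or the file does not exist, and -1 on any other error. */
int can_read(const char *path)
{
    if (access(path, R_OK) == 0)
        return 1;
    if (errno == EACCES || errno == ENOENT)
        return 0;
    perror("access");
    return -1;
}
```

To test several permissions at once, OR the flags together, as in access (path, R_OK | W_OK).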

File Locking

The fcntl system call is the access point for several advanced operations on file descriptors. Its arguments are simple enough: an open file descriptor, and an operation to be performed.

The fcntl system call allows a program to place a read lock or write lock on a file, similar to mutex locks. Only one process may hold a write lock, but note that placing a lock does not actually prevent other processes from opening a file, unless they also acquire a lock with fcntl.
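
A minimal sketch of the locking calls follows. The helper names try_write_lock and release_lock are hypothetical. The sketch uses F_SETLK, which fails immediately if another process holds a conflicting lock; F_SETLKW would wait instead.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Tries to place a write lock on the whole file without blocking.
   Returns 0 on success, -1 if the lock is unavailable or on error. */
int try_write_lock(int fd)
{
    struct flock lock;

    memset(&lock, 0, sizeof lock);
    lock.l_type = F_WRLCK;      /* F_RDLCK would request a read lock */
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;             /* 0 means "through the end of the file" */
    return fcntl(fd, F_SETLK, &lock);
}

/* Releases any lock this process holds on the file. */
int release_lock(int fd)
{
    struct flock lock;

    memset(&lock, 0, sizeof lock);
    lock.l_type = F_UNLCK;
    lock.l_whence = SEEK_SET;
    return fcntl(fd, F_SETLK, &lock);
}
```

The descriptor must be open for writing before a write lock can be placed on it.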

Flushing Disk Buffers

When you write data to a file, the data is not immediately written to disk. To improve performance, the operating system caches the written data in a memory buffer. When the buffer fills or some other condition occurs, the system writes the cached data to disk all at one time. However, this behavior can be undesirable for programs that rely on the integrity of disk-based records, such as a transaction processing system.

This is where the fsync system call comes in handy. It takes one argument, a writable file descriptor, and flushes to disk any data written to this file. The fsync call does not return until the data has been physically written. The system call fdatasync does essentially the same thing, with the sole difference being that fdatasync does not guarantee that the file’s modification time will be updated. In theory, fdatasync can execute faster than fsync. In practice, fsync and fdatasync do the same thing, at least in Linux.
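
The sketch below shows the pattern for a journal-style file: append a record, then call fsync before reporting success. The function name append_record_durably and the file permissions are assumptions chosen for the example.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Appends one record to the file at path and forces it to disk
   before returning. Returns 0 on success, -1 on error. */
int append_record_durably(const char *path, const char *record)
{
    size_t length = strlen(record);
    int result = -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0660);

    if (fd < 0)
        return -1;
    if (write(fd, record, length) == (ssize_t) length)
        result = fsync(fd);     /* does not return until data is on disk */
    close(fd);
    return result;
}
```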

Resource Limits

The getrlimit and setrlimit system calls allow a process to read and set limits on the system resources that it can consume. They do programmatically what the ulimit shell command does interactively. Only processes with superuser privilege may raise hard limits. Some useful limits include RLIMIT_CPU, RLIMIT_DATA, RLIMIT_NPROC, and RLIMIT_NOFILE.
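
As a hedged example, the helper below changes only the soft limit for a resource and leaves the hard limit alone. The name set_soft_limit is invented for this sketch.

```c
#include <sys/resource.h>

/* Sets the soft limit for resource (for example RLIMIT_NOFILE)
   without touching the hard limit. Returns 0 on success, -1 on
   error or if soft would exceed the current hard limit. */
int set_soft_limit(int resource, rlim_t soft)
{
    struct rlimit limit;

    if (getrlimit(resource, &limit) != 0)
        return -1;
    if (soft > limit.rlim_max)
        return -1;
    limit.rlim_cur = soft;
    return setrlimit(resource, &limit);
}
```

Calling set_soft_limit (RLIMIT_CPU, 1) would, for instance, cause the kernel to send SIGXCPU once the process has used one second of CPU time.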

Wall-Clock Time

The gettimeofday system call takes a pointer to a struct timeval variable. The structure represents the time elapsed since the start of the UNIX epoch, midnight UTC on January 1, 1970, split into two fields: tv_sec holds the whole seconds and tv_usec holds the additional microseconds. Using the wall-clock time directly is not very handy. Consequently, Linux provides the functions localtime and strftime.
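
Putting the three calls together might look like the following sketch. The function name format_wall_clock and the output format are choices made for illustration.

```c
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Writes the current local time into buf as
   "YYYY-MM-DD HH:MM:SS.mmm". Returns the number of characters
   written, or 0 on failure or if buf is too small. */
size_t format_wall_clock(char *buf, size_t size)
{
    struct timeval tv;
    struct tm *local;
    size_t used;
    int extra;

    if (gettimeofday(&tv, NULL) != 0)
        return 0;
    local = localtime(&tv.tv_sec);
    if (local == NULL)
        return 0;
    used = strftime(buf, size, "%Y-%m-%d %H:%M:%S", local);
    if (used == 0)
        return 0;
    /* Append the milliseconds derived from the microsecond field. */
    extra = snprintf(buf + used, size - used, ".%03ld",
                     (long) tv.tv_usec / 1000);
    if (extra < 0 || (size_t) extra >= size - used)
        return 0;
    return used + (size_t) extra;
}
```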

Locking Physical Memory

The mlock family of system calls allows a program to lock some or all of its address space into physical memory. This prevents Linux from paging this memory to swap space, even if the program hasn’t accessed it for a while. Such a technique is commonly used in high-security applications, like gnupg.

Locking a region of memory is simple. Specify a pointer to the start of the region, and the amount of memory to lock as a multiple of the page size. Thus, to allocate 32MB of address space you would use the following code:

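/* Requires <stdlib.h> and <sys/mman.h>. */
const int alloc_size = 32 * 1024 * 1024;
char *memory = malloc (alloc_size);
/* Pin the region in RAM; mlock returns -1 on failure. */
mlock (memory, alloc_size);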

Note, however, that simply allocating memory and locking it with mlock does not reserve physical memory because the pages may be copy-on-write. ALP recommends also writing a dummy value to each page.

Reading Symbolic Links

The readlink system call retrieves the target of a symbolic link. Unusually, readlink does not NUL-terminate the target path that it fills into the buffer. It does, however, return the number of characters in the path, so terminating the string yourself is trivial.
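
A minimal wrapper that does the termination could look like this. The name read_link_target is an assumption for the example.

```c
#include <unistd.h>

/* Copies the target of the symbolic link at path into buf and
   NUL-terminates it. Returns the target length, or -1 on error.
   A return value of size - 1 may mean the target was truncated. */
ssize_t read_link_target(const char *path, char *buf, size_t size)
{
    ssize_t length;

    if (size == 0)
        return -1;
    /* Leave room for the terminator that readlink does not write. */
    length = readlink(path, buf, size - 1);
    if (length < 0)
        return -1;
    buf[length] = '\0';
    return length;
}
```

Pointing it at /proc/self/exe is one way for a program to find its own executable.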

Fast Data Transfers

The sendfile system call provides an efficient mechanism for copying data from one file descriptor to another. The conventional technique is to allocate a buffer, copy from one file descriptor into the buffer, write the buffer out to the other descriptor, and repeat until all the data has been copied. This process is inefficient. It requires additional memory, and an extra copy to put data into the buffer. Using sendfile, the intermediate buffer is eliminated.
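
Here is a sketch of a file copy built on sendfile. The function name copy_file is hypothetical, and the sketch assumes a kernel that accepts a regular file as the destination descriptor; older kernels may require a socket there.

```c
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copies src to dst without a user-space buffer.
   Returns 0 on success, -1 on error. */
int copy_file(const char *src, const char *dst)
{
    struct stat info;
    off_t offset = 0;
    int in_fd, out_fd;

    in_fd = open(src, O_RDONLY);
    if (in_fd < 0)
        return -1;
    if (fstat(in_fd, &info) != 0) {
        close(in_fd);
        return -1;
    }
    out_fd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, info.st_mode & 0777);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }
    /* sendfile may transfer fewer bytes than requested, so loop.
       The kernel advances offset as data is sent. */
    while (offset < info.st_size) {
        ssize_t sent = sendfile(out_fd, in_fd, &offset,
                                (size_t) (info.st_size - offset));
        if (sent <= 0)
            break;
    }
    close(in_fd);
    close(out_fd);
    return offset == info.st_size ? 0 : -1;
}
```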

Setting Interval Times

I’ve seen the setitimer system call used in autograding software but never quite understood what it did until now. The setitimer system call is a generalization of the alarm call. It schedules the delivery of a signal at some point in the future after a fixed amount of time has elapsed. There are three types of timers available: ITIMER_REAL, ITIMER_VIRTUAL, and ITIMER_PROF.

And that, my friends, concludes Chapter 8.

Less is More

Uncategorized — Titus Barik on November 23, 2004 at 10:13 pm

In a fit of sober introspection, it seems that I always end up going full circle. In particular, I spent the evening trying to simplify the layout of the weblog in an effort to strike a balance between content and presentation. It’s always an uphill battle, and I’ve yet to perform an exhaustive browser test, so let me know what you think.

On a side note, I’m in Mobile for about a week to spend Turkey Day, which is just around the corner, with my family. I suppose that it could be worse. After all, I could be forced to watch the Gilmore Girls.

The /proc File System

Uncategorized — Titus Barik on November 22, 2004 at 12:00 am

Just a few days ago I summarized Linux devices. This provided an excellent segue to discuss Chapter 7 of ALP, on the /proc virtual file system, a window into the running Linux kernel. This article is a little heavy on definitional items, and I apologize because it reads a little bit like a dictionary and reference manual.

Extracting Information from /proc

Most entries in /proc provide information formatted to be human readable. Still, the formats are simple enough to be easily parsed. A simple technique to extract a value from /proc files is to use sscanf. Be aware, however, that the names, semantics, and output formats in the /proc file system might change from kernel version to version. Make sure that your program supports graceful degradation.
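
The sscanf technique can be sketched as follows, loosely in the spirit of ALP’s clock speed example. The function names are assumptions, and the "cpu MHz" field is itself an example of something that is not present on every architecture, which is why the code returns a sentinel value instead of failing hard.

```c
#include <stdio.h>
#include <string.h>

/* Extracts the first "cpu MHz" value from text in /proc/cpuinfo
   format. Returns -1.0 when the field is missing or malformed. */
double parse_cpu_mhz(const char *cpuinfo)
{
    const char *match = strstr(cpuinfo, "cpu MHz");
    double mhz;

    if (match == NULL)
        return -1.0;
    /* Whitespace in the format matches any run of spaces or tabs. */
    if (sscanf(match, "cpu MHz : %lf", &mhz) != 1)
        return -1.0;
    return mhz;
}

/* Reads /proc/cpuinfo and returns the clock speed, or -1.0 if the
   file or the field is unavailable. */
double get_cpu_mhz(void)
{
    char buffer[4096];
    size_t bytes;
    FILE *fp = fopen("/proc/cpuinfo", "r");

    if (fp == NULL)
        return -1.0;
    bytes = fread(buffer, 1, sizeof buffer - 1, fp);
    fclose(fp);
    buffer[bytes] = '\0';
    return parse_cpu_mhz(buffer);
}
```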

Process Entries

The /proc file system contains a directory for each running process on the Linux system. And each of these process directories contains the following entries:

  • cmdline contains the argument list for a process.
  • cwd is a symbolic link that points to the current working directory of the process.
  • environ contains the process’s environment.
  • exe is a symbolic link that points to the executable image running in the process.
  • fd is a subdirectory that contains entries for the file descriptors opened by the process.
  • maps displays information about files mapped into the process’s address space.
  • root is a symbolic link to the root directory of the process. This can be changed using the chroot call.

For security reasons, the permissions of some entries are set so that only the user who owns the process can access them. ALP provides a blow-by-blow description of each of these entries, if you’re looking for something more comprehensive.

Process Memory Statistics

The statm entry contains a list of seven numbers, separated by spaces. Each number is a count of memory pages used by the process. The columns, in the order they appear, represent:

  1. the total process size
  2. the size of the process resident in physical memory
  3. the memory shared with other applications; mapped memory
  4. the text size of the process
  5. the size of shared libraries mapped into this process
  6. the memory used by this process for its stack
  7. the number of dirty pages
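
The seven columns map naturally onto a struct. In the sketch below, the struct statm_pages type, its field names, and both function names are made up for illustration and follow the column order listed above.

```c
#include <stdio.h>

/* One field per statm column, in order. All values are page counts. */
struct statm_pages {
    unsigned long total;
    unsigned long resident;
    unsigned long shared;
    unsigned long text;
    unsigned long libraries;
    unsigned long stack;
    unsigned long dirty;
};

/* Parses one statm line. Returns 0 on success, -1 if fewer than
   seven numbers were found. */
int parse_statm(const char *line, struct statm_pages *out)
{
    int fields = sscanf(line, "%lu %lu %lu %lu %lu %lu %lu",
                        &out->total, &out->resident, &out->shared,
                        &out->text, &out->libraries, &out->stack,
                        &out->dirty);
    return fields == 7 ? 0 : -1;
}

/* Reads the statm entry of the calling process. */
int read_own_statm(struct statm_pages *out)
{
    char line[256];
    FILE *fp = fopen("/proc/self/statm", "r");
    int result = -1;

    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof line, fp) != NULL)
        result = parse_statm(line, out);
    fclose(fp);
    return result;
}
```

Multiplying a page count by the system page size, available from sysconf (_SC_PAGESIZE), converts it to bytes.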

Hardware Information

Several entries in the /proc file system give us access to hardware and kernel information. I present some of the more useful ones here:

  • For CPU information, see /proc/cpuinfo. The processor field lists the processor number, which is zero for single-processor systems. There are also fields for the vendor, CPU family, model, and stepping. Most of this information is obtained from the cpuid x86 assembly instruction.
  • The entry /proc/version contains the OS name and kernel version, as well as the revision.
  • The /proc/sys/kernel/hostname and /proc/sys/kernel/domainname entries contain the computer’s hostname and domain name, respectively. This information is the same as that returned by the uname system call.
  • The /proc/meminfo entry contains information about the system’s memory usage.

File Systems

The /proc/filesystems entry displays the file system types known to the kernel. Unfortunately, it’s not too useful because the list is not complete. The contents of /proc/filesystems list only file system types that either are statically linked into the kernel or are currently loaded.

For information on the drives and partitions themselves, see /proc/ide and /proc/scsi.

Mounts

The /proc/mounts entry provides a summary of the mounted file systems. Each line corresponds to a single mount descriptor and lists the mounted device, the mount point, and other information. This entry contains the same information as the ordinary file /etc/mtab.

A short description of the columns of the mount descriptor follows, in the order that they appear:

  1. The first element is the mounted device. For special file systems, such as /proc, this is none.
  2. The second element is the mount point. For the root file system, the mount point is listed as /. For swap drives, the mount point is listed as swap.
  3. The third element is the file system type.
  4. The fourth element lists mount flags. These are options that were specified when the mount was added.

The last two elements are always 0 and have no meaning.

Locks

The /proc/locks entry describes all file locks currently outstanding in the system. Each row in the output corresponds to one lock. POSIX locks are set using the fcntl system call.

System Statistics

The /proc/loadavg file contains information about the system load. The /proc/uptime file contains the length of time since the system was booted and the amount of time the system has been idle. The uptime command can also be used to obtain the system’s uptime.

Mayflower

Uncategorized — Titus Barik on November 20, 2004 at 12:00 am

Roslyn and I had lunch at the Cosmic Cantina before departing our separate ways. She lent me a copy of Dave Eggers to take home with me. Stacie lives about ten minutes away, and I spent tonight at her place. She has three, very cute, very friendly kittens. We had dinner at the Mayflower Seafood Restaurant and finished up the night by watching Woody Allen. I’ll be heading back to Atlanta tomorrow, and then it’s back to work.

Confessions

Uncategorized — Titus Barik on November 19, 2004 at 12:00 am

I’m in North Carolina for the weekend. And being the untalented individual that I am, Roslyn, her friend Bethye, and I saw the Achordants perform a cappella at Gerrard Hall at UNC. We concluded the night by attending a party at the AEPi fraternity house. Looks like those crazy Jews are at it again. I also met Roslyn’s roommate Ashley. And it’s a surprising coincidence that she graduated from ASMS, just a year after me. It’s a small world after all.

Linux Devices

Uncategorized — Titus Barik on November 18, 2004 at 8:14 pm

This week I summarize Chapter 6 of ALP, which introduces us to Linux devices. This also brings us to part II of the Advanced Linux Programming text.

Introduction to Devices

Linux, like most operating systems, interacts with hardware devices through modular software components called device drivers. This boundary abstracts the peculiarities of a hardware device’s communication protocol and allows the system to act through a standardized interface.

Device drivers are part of the kernel and are not directly accessible to user processes. However, Linux provides a mechanism to interface with the driver indirectly through file-like objects. Linux also provides special file-like devices that communicate with the kernel but aren’t linked with any particular hardware device.

I’ll begin by describing these file-like objects, and conclude with an examination of special devices and their usage.

Device Types

Though device files appear as ordinary files, they aren’t. Data read from or written to a device file is communicated to the corresponding device driver, and from there to the underlying device. There are two types of device files:

  1. A character device represents a hardware device that reads or writes a serial stream of data bytes. Examples of character devices include serial ports, parallel ports, tape drives, terminal devices, and sound cards.
  2. A block device represents a hardware device that reads or writes data in fixed-size blocks. Unlike a character device, a block device provides random access to data stored on the device. A good example is a disk drive. Typically, applications never use block devices directly.

The following code illustrates a typical write operation to a character device, in this case the first parallel port:

int fd = open("/dev/lp0", O_WRONLY);
write (fd, buffer, buffer_length);
close (fd);

Device Numbers

Linux identifies devices using two numbers: the major device number and the minor device number.

The major device number specifies which driver the device corresponds to. The correspondence between the major device numbers to drivers is fixed and is part of the Linux kernel sources. The minor device numbers distinguish individual devices controlled by a single driver. Thus, the meaning of a minor device number depends on the device driver.

Take a look at /proc/devices for a list of major device numbers corresponding to active drivers currently loaded in the kernel.

Device Entries

A device entry is in many ways the same as a regular file. Indeed, typical Linux commands can operate on them:

  • You can move it using mv, or delete it with rm.
  • If you copy a device entry with cp, you will read bytes from the device and write them to a destination file.
  • If you try to overwrite a device entry, you’ll write bytes to the corresponding device instead.

To create a device entry in the file system, use the mknod command. And you can get more information about a device entry by using the stat system call. To remove a device entry, simply use rm.
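
As an example of using stat on a device entry, the sketch below reports the device type along with its major and minor numbers. The function name device_numbers and its return convention are assumptions; the major and minor macros come from glibc’s <sys/sysmacros.h>.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Looks up the device entry at path. Returns 'c' for a character
   device or 'b' for a block device and fills in both numbers.
   Returns 0 if path is not a device entry or cannot be examined. */
int device_numbers(const char *path, unsigned *major_number,
                   unsigned *minor_number)
{
    struct stat info;

    if (stat(path, &info) != 0)
        return 0;
    if (!S_ISCHR(info.st_mode) && !S_ISBLK(info.st_mode))
        return 0;
    /* st_rdev identifies the device that the entry refers to. */
    *major_number = major(info.st_rdev);
    *minor_number = minor(info.st_rdev);
    return S_ISCHR(info.st_mode) ? 'c' : 'b';
}
```

On a typical Linux system, /dev/null should come back as a character device with major number 1 and minor number 3.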

Special Devices

Linux provides some special devices that don’t correspond to hardware devices. These entries all use the major device number 1, which is associated with the Linux kernel’s memory device instead of a device driver. The most important of these are /dev/null, /dev/zero, /dev/full, /dev/random, and /dev/urandom. Now let’s discuss these in detail:

  • Linux discards any data written to /dev/null. Reading from this device always results in an EOF. This device is useful for discarding irrelevant data.
  • The device entry /dev/zero simply behaves as if it were an infinitely long file filled with zero bytes. Memory mapping /dev/zero is an advanced technique for allocating memory.
  • The device entry /dev/full behaves as if it were a file in a file system that has no more room. A write to /dev/full sets errno to ENOSPC. This device entry is useful for testing how a program behaves if it runs out of disk space while writing to a file.
  • The special devices /dev/random and /dev/urandom provide access to the Linux kernel’s built-in random number-generation facility. While most software functions for generating random numbers are actually pseudorandom, Linux draws on an external source of randomness: the user. By measuring the time delay between user input actions, Linux is capable of generating high-quality random numbers. The difference between /dev/random and /dev/urandom is that the former blocks when the kernel runs out of collected randomness, while the latter never blocks.
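
Reading from these devices works like reading from any other file. The sketch below, loosely modeled on the kind of helper ALP describes, pulls an integer from /dev/urandom; the name random_number and the -1 error convention are assumptions, and the modulo step introduces a slight bias that is acceptable only for casual use.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns a random integer between min and max, inclusive, using
   bytes from /dev/urandom. Assumes 0 <= min <= max. Returns -1 if
   the device cannot be read. */
int random_number(int min, int max)
{
    unsigned int value;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd < 0)
        return -1;
    if (read(fd, &value, sizeof value) != (ssize_t) sizeof value) {
        close(fd);
        return -1;
    }
    close(fd);
    return min + (int) (value % (unsigned int) (max - min + 1));
}
```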

Loopback Devices

A loopback device enables you to simulate a block device using an ordinary file. Loopback devices are named /dev/loop0, /dev/loop1, and so on. Only the superuser can set up a loopback device. Loopback devices are particularly useful for virtual file systems and mounting disk images. For example, one might create a virtual floppy disk image by executing the following:

dd if=/dev/zero of=foo.img \
   bs=1024 count=1440
/sbin/mke2fs foo.img

PTYs

The special file system devpts is mounted on /dev/pts. The file system is a magic file system because it is not associated with any hardware device. The directory is created dynamically by the kernel, and the contents of the directory vary with time and reflect the state of the running system. The entries in /dev/pts correspond to pseudo-terminals. Linux creates a PTY for every new terminal window you open and displays a corresponding entry in /dev/pts.

ioctl

The ioctl system call is an all-purpose interface for controlling hardware devices. The first argument is a file descriptor opened to the device that you want to control. The second argument is a request code that indicates the operation that you want to perform. Here’s an example of how you might use ioctl:

int fd = open (argv[1], O_RDONLY);
ioctl (fd, CDROMEJECT);
close (fd);

The above code will eject the CD-ROM from the drive whose device entry is given in argv[1].

titus@barik.net | The Weblog of Titus Barik