About The Pentium
Backwards Compatibility and the Pentium Architecture
Backwards compatibility is generally a good thing, but in the Pentium family it has been taken to a cancerous extreme: programs written for DOS in 1985 stand an excellent chance of running on a modern Pentium processor with minimal change. 20 years of backwards compatibility definitely has its merit: the result has been enormously successful in the marketplace.
The other side of the coin is that the Pentium is by far the most complicated processor currently in existence. Currently deployed Pentium IV processors provide backward support for the 286 (16-bit protected mode), the 386 (32-bit protected mode, and Virtual 8086 compatibility mode), the 486, the 586 (a.k.a. Pentium), the Pentium II (with its Physical Address Extension mechanism), and the Pentium III (with Page Size Extensions). Pentium variants that are now starting to ship are finally catching up with the AMD-64. These parts provide 64-bit support, which entails yet another set of address translation extensions, more registers, and a bunch of new operating mode details.
To the operating system builder, all of this adds up to overwhelming complexity. The bad news is that we have to deal with it. The good news is that the ten people in the world who can keep the entire Pentium architecture in their head will never be unemployed. :-)
Because the Pentium is such a hairy beast, it is an unfortunate choice for a class setting. We are using this architecture in the class because:
-
All of you have (or can get) one.
-
There is a reasonably good, widely available emulator for it (QEMU) that provides some degree of debugging support.
-
We can disable most of the backwards compatibility stuff, or set up an initial machine configuration that lets us simply ignore it.
-
Most of our work will be done in C, and the virtual addressing mechanism of the Pentium is actually pretty standard.
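To give a sense of what "pretty standard" means here, the following is a minimal sketch of how a 32-bit virtual address is split for translation. It assumes plain 32-bit paging with 4 KB pages (no Physical Address Extension, no Page Size Extensions). The function names are hypothetical and are not taken from the course source tree.

```c
#include <stdint.h>

/* Assumption: 32-bit paging, 4 KB pages, two-level page tables.
 * A virtual address is divided into three fields:
 *   bits 31..22  index into the page directory (1024 entries)
 *   bits 21..12  index into the selected page table (1024 entries)
 *   bits 11..0   byte offset within the 4 KB page
 * All names below are hypothetical, for illustration only. */

static inline uint32_t pde_index(uint32_t va)
{
    return va >> 22;                /* top 10 bits */
}

static inline uint32_t pte_index(uint32_t va)
{
    return (va >> 12) & 0x3FFu;     /* middle 10 bits */
}

static inline uint32_t page_offset(uint32_t va)
{
    return va & 0xFFFu;             /* low 12 bits */
}
```

For example, the address 0xC0101234 selects page directory entry 0x300, page table entry 0x101, and byte offset 0x234 within the page.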
Where to Start
In order to jump-start this class, I'm going to have us reading some code very very early. If you have never seen assembly code, this is going to come as a bit of a shock. Don't worry about it. We really aren't going to look at a lot of assembly code during this course. In fact my goal is to be done with it by the end of week two. The optimized version of the interrupt.S code (which, note, is no longer in the source tree) took months to write. My goal in having you look at this code is to give you a sense of how this part of a microkernel works.
That being said, we are going to do a lot of things that require general knowledge of the Pentium processor and specific knowledge of the virtual address translation mechanism. If you aren't familiar with the Pentium (and perhaps even if you are), you should read the following in order:
-
Volume 1, Chapters 1,2,3.
-
Volume 3, Chapters 1,2.
In earlier versions of the Pentium architecture manual, the content of these chapters was gathered together in one place. Later, it was split into ``user'' bits and ``supervisor'' bits. This reduces the amount of stuff that the application programmer needs to know, but it means that you will have some flipping back and forth to do.
-
Volume 3, Chapters 3,4.
These are the chapters that talk about address translation and segmentation.
-
Volume 3, Chapter 5.
This section talks about exception handling.
I strongly recommend that you start by reading Volume 1, Chapters 1,2,3, followed by Volume 3, Chapters 1,2, followed by Volume 3, Chapters 3,4. You should probably do all of this before you tackle Volume 3, Chapter 5.
Simplifying Assumptions
To make all of this tractable, there are certain features that we will ``configure out of the way:''
-
We will not support the ``task switch'' instruction, nor will we use ``call gates'' or ``jump gates.'' All of these are either unnecessary, slow enough that nobody uses them, or have the property that equivalent function can be accomplished faster using less hardware-dependent means.
We will need to preconfigure the machine to effectively avoid these features.
-
We will not support segmentation. More precisely, we will use the segment registers because we must, but we will configure all segments to have a base of zero and a limit of 4G. This effectively removes the segment registers from the architecture (with one irritating exception), and is common practice on all modern Pentium-based operating systems.
-
We will completely ignore ``Virtual 8086'' mode. Strictly speaking this is a mistake, because applications can enable this mode without kernel consent, and can crash the kernel if the kernel is not expecting this behavior. In the interest of reduced complexity, we will ignore this mode entirely.
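The flat segment configuration described above (base of zero, limit of 4G) can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the code we will actually use in class: the helper make_descriptor and the FLAT_* constants are hypothetical names. The access bytes 0x9A and 0x92 are the conventional encodings for a present, ring-0 code segment and data segment respectively. With the granularity bit set, the 20-bit limit 0xFFFFF is counted in 4 KB units, which covers the full 4G.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define FLAT_LIMIT   0xFFFFFu  /* 20-bit limit; with G=1 this spans 4G   */
#define FLAT_FLAGS   0xCu      /* G=1 (4 KB units), D/B=1 (32-bit)       */
#define ACCESS_CODE0 0x9Au     /* present, DPL 0, code, readable         */
#define ACCESS_DATA0 0x92u     /* present, DPL 0, data, writable         */

/* Pack base, limit, access byte, and flags nibble into the 8-byte
 * legacy segment descriptor layout. The fields are scattered for
 * historical reasons, so each piece is shifted into place separately. */
static uint64_t make_descriptor(uint32_t base, uint32_t limit20,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    assert(limit20 <= 0xFFFFFu);   /* limit field is only 20 bits wide */

    d |= (uint64_t)(limit20 & 0xFFFFu);             /* limit 15..0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;        /* base 23..0   */
    d |= (uint64_t)access << 40;                    /* access byte  */
    d |= (uint64_t)((limit20 >> 16) & 0xFu) << 48;  /* limit 19..16 */
    d |= (uint64_t)(flags & 0xFu) << 52;            /* G, D/B, L, AVL */
    d |= (uint64_t)(base >> 24) << 56;              /* base 31..24  */
    return d;
}

/* Fill a three-entry table: the mandatory null descriptor, then a flat
 * kernel code segment and a flat kernel data segment. Loading the table
 * into the processor (LGDT) requires assembly and is not shown. */
static void build_flat_gdt(uint64_t gdt[3])
{
    gdt[0] = 0;
    gdt[1] = make_descriptor(0, FLAT_LIMIT, ACCESS_CODE0, FLAT_FLAGS);
    gdt[2] = make_descriptor(0, FLAT_LIMIT, ACCESS_DATA0, FLAT_FLAGS);
}
```

Because every segment starts at zero and spans the whole address space, the address a program computes is the address the paging hardware sees, and segmentation drops out of the picture.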