Anatomy of a PC BIOS

Either use a “live” or “dead” image. A “dead” image may have a header that must be removed before disassembly because the BIOS code itself uses absolute offsets. A “live” image may be misleading because the BIOS code can self-modify, overwrite components or map them out using system hardware, and any static data reflects the state of the system after boot, not before boot. Also, the 64K segment of F0000-FFFFF is not necessarily large enough to map the entire BIOS ROM chip, meaning that there are parts of the BIOS ROM which are hidden from the system after initialization.

BIOS components

POST (Power On Self Test) routine

This is executed after initial CPU, cache, and chipset/memory setup.
The POST routine, since it is only used at boot, may be mapped into the option ROM region and then mapped out by the boot code and chipset after it is finished. This means it can be located outside the 64K BIOS segment in the BIOS ROM (which is larger than 64K in this case).
This routine will make external notification of its status by writes to port 0x80.

BIOS interrupt vectors

The real mode Interrupt Vector Table (IVT) contains pointers into ISR (Interrupt Service Routine) methods in BIOS code to implement various software interrupts. For analysis, you should also dump the IVT, and preferably from a protected mode operating system such as Linux which has not installed its own real-mode interrupt handlers. It is usually located at the start of memory (a protected mode operating system can relocate it, but Linux does not), and is of length $400 (256 DWORD vectors in segment:offset format – little endian!). Usually only handlers for INT0 through INT$77 are initialized. Handlers that are not initialized point to 0:0 which is also the INT0 handler. This handler may enter a debugger or simply crash the system. Most initialized interrupt vectors will point to an 'iret' instruction, making them a no-op. For installed vectors, if the OS has not installed its own handlers, the segment should always be a BIOS segment (i.e. the system BIOS F000 or the video or HDD controller BIOS). Then in your code viewer, disassemble the vector's offset in the BIOS image to see what the handler does. It will check the contents of various registers, commonly ax. Then the BIOS will either call its own functions or simply exit depending on the register contents. You can use Ralf Brown's Interrupt List to figure out what the BIOS is doing here based on the register contents passed to it. The NMI vector is INT $2 and the CPU exception vectors are scattered from INT $0-$11. Hardware IRQ handlers for PIC interrupts 0-7 are mapped to INT $8-$F, and for interrupts 8-15 are mapped to INT $70-77 (IVT offset 0x1A0).

You will notice that hardware IRQ handlers work slightly differently. When the PIC receives an interrupt, it asserts INT#. The CPU finishes what it is doing and asserts INTA#. The PIC then deasserts INT# and sends the interrupt number (as a byte) to the CPU, and, in real mode, the CPU jumps to the code the IVT vector points to. At the end of an IRQ handler, the BIOS will write an EOI (End Of Interrupt) code to the 8259 PIC (Programmable Interrupt Controller). The EOI code is $20, and the hardware port of the master PIC is $20 and the slave PIC is $A0. Sending EOI to the PIC allows the PIC to handle new IRQ events that are equal or lower priority. Also, an ISR will always be exited with an 'iret' instruction. This gets the previous EFLAGS, CS, and EIP off the stack. Interrupts are enabled while executing an interrupt handler, thus the PIC could reassert INT# in response to a higher priority IRQ. If the programmer disables interrupts in an ISR (using 'cli'), he must remember to re-enable them (using 'sti') before 'iret'. And receiving a NMI or SMI during an ISR is possible even if interrupts are disabled. However, if a NMI or SMI is executing, further interrupts of any kind are disabled until iret or RSM.

BIOS32

This is a generic entry point for BIOS services that can be used by 32-bit callers. It, and thus the installed services, can be called directly from a protected mode program (using CALL FAR). To detect its presence, scan for the string _32_ on a paragraph (16-byte) boundary from $E0000-$FFFFF AND validate the checksum of the structure present there.

Reference:

Standard BIOS 32-bit Service Directory Proposal, v0.4, Phoenix Technologies

Legacy configuration tables

Various programs may rely on the presence of these lists, tables, and interrupts, so they should be present and accurately reflect the state of the system hardware and firmware.

  • BIOS equipment list (Call INT $11 to obtain as value in AX), also stored at 0:410
  • Video Parameter Table (VPT), address stored in INT $1D (not a vector!), located at $FF0A4 in 100% PC compatibles
  • Diskette Parameter Table (DPT), address stored in INT $1E (not a vector!), located at $FEFC7 in 100% PC compatibles
  • Video Graphics Character Table (VGCT), address stored in INT $1F (not a vector!), located at $FFA6E in 100% PC compatibles
  • Fixed Disk Parameter Table (FDPT) for first hard drive, address stored in INT $41 (not a vector!), located at $FE401 in 100% PC compatibles
  • Fixed Disk Parameter Table (FDPT) for second hard drive, address stored in INT $46 (not a vector!)
  • EGA video graphics character table, address stored in INT $43 (not a vector!)
  • PCjr compatibles should also have INT $44, INT $48, and INT $49 locations pointing to valid tables

The RTC (CMOS) RAM (described below) may also contain system configuration information that has been stored by the BIOS or a configuration utility.

The BIOS Data Area (BDA, or BIOS Communication Area), 0:400 to 0:4AC, contains volatile BIOS state as well as hardware configuration data. It is left over from when the BIOS was always contained in a read only ROM. It too must reflect valid state. PC Convertible compatibles must also fill in the locations between 0:4AC and 0:4F0. 0:4F0 is a 16 byte scratch area for inter-application communication. 0:500-0:5FF is a scratch area for ROM POST and ROM BASIC.

References:

Ralf Brown's Interrupt List

http://www.frontiernet.net/~fys/rombios.htm

Extended BIOS Data Area (XBDA/EBDA)

This is a 1K data area for data such as user defined drive parameters (Type 47) which do not fit in the BIOS communication area. The BIOS has to store these somewhere besides inside its own image, because it has traditionally been fixed in a memory mapped ROM which cannot be written to. (The other option was to store these two 16-byte lists at 0:31D and 0:32D, but doing so precludes the use of software interrupts corresponding to the overwritten vectors.) A pointer to the XBDA is stored in 0:40E and is usually equal to $9FC0 (0:9FC0), which is the top 1K of the 640K region. The XBDA eventually evolved to include information about pointing devices and peripheral ports among other configuration information.

With systems that contain more than 640K of memory (at least 1MB low memory), and the advent of chipsets that support shadow RAM (being able to switch between hardware memory/ROM and underlying RAM in $A0000-FFFFF on demand), the System BIOS could then be copied to RAM and the chipset used to switch from the ROM (mapped at boot) to the copy in the underlying RAM which will be used at runtime. The BIOS copy in Shadow RAM obsoletes the XBDA, as well as the BIOS communication area, because any volatile data can now be written to the BIOS image in RAM.

Even if the XBDA is empty (because the BIOS image is shadowed and the information is stored internally), it still exists and consumes 1K of conventional memory, i.e. the BIOS memory size at 0:413 reflects the size of conventional memory minus the size of this area.

SMBIOS (System Management BIOS, previously DMI [Desktop Management Interface])

SMBIOS is a standard that supercedes the legacy equipment list, parameter tables, and XBDA. Search for the string “_SM_” at paragraph boundaries in the $F0000-FFFFF region. It should be followed by “_DMI_” at the beginning of the next paragraph. Verify the checksum of this structure. Then, The DWORD 8 bytes following “_DMI_” is a pointer to the DMI structure ($000Fxxxx). This structure should be filled out and have a correct checksum so that client software can use it.

Reference: System Management BIOS (SMBIOS) Reference Specification, Version 2.4, DMTF

PnP (Plug & Play) BIOS

Real mode and 16-bit protected mode entry points, will usually call the same internal function.

PCI

PCI BIOS supports 16-bit real and protected-mode callers through INT $1A, AH=$B1. Protected mode callers use the BIOS32 interface with $PCI or ICP$ service id.

Reference:

PCI BIOS Specification, 2.0, PCI SIG

Accesses to $CF8-$CFF are PCI controller accesses. The BIOS will use $CF8 to select a configuration register on a particular PCI target, and $CFC to read or write that configuration data. A DWORD write to $CF8 contains the following:

bus = bits 16-23
device = bits 11-15
function = bits 8-10
register = bits 2-7

For a configuration cycle, the high bit should always be set ($8xxxxxxx). After setting the target correctly, write or read the desired byte, word, or dword from $CFC.

A configuration structure being written to $CF8 can be decoded with this program:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
        unsigned char bus, device, function, reg;

        int num = strtoul(argv[1], NULL, 16);

        num = num >> 2;
        reg = num & 0x3f;
        num = num >> 6;
        function = num & 7;
        num = num >> 3;
        device = num & 0x1f;
        num = num >> 5;
        bus = num & 0xff;

        printf("Bus %x  Device %x  Function %x  Reg %x\n", bus, device, function, reg);
        exit(EXIT_SUCCESS);
}

APM

SMM

SMRAM is located at 38000-3FFFF (SMBASE=$30000) or A8000-AFFFF (SMBASE=$A0000) when in SMM. SMRAM is initialized and cloaked by the system firmware and chipset. The chipset snoops SMIACT# (asserted by CPU in response to a SMI) and maps SMRAM in or out of these regions accordingly. The CPU then flushes the pipeline, dumps its state to the bottom of SMRAM (SMM state save map @ SMBASE+$7E00 to $7FFF), invalidates write back cache, and executes the code at SMBASE+$8000. When it encounters an RSM instruction, it reads its state back in from the state save map, and returns to the next program instruction as if nothing ever happened. The only visible evidence of an SMM invocation from the perspective of the operating system is missed timer ticks.

We are interested in analyzing the SMM code. The SMM code should be contained in the “dead” BIOS image somewhere. SMBASE can only be changed while in SMM, so, at boot time, the system firmware installs a dummy SMI handler at $38000 which changes SMBASE to $A0000, and then invokes SMM. Afterwards, the real SMI handler is installed at $A0000. $A0000 is the logical location for SMBASE anyway since this would otherwise be wasted RAM (hidden underneath mapped video card memory). Note: Upon entering SMM, SMI handler's CS is always $3000 even if SMBASE has been relocated, so correct handler code needs to check the current value of SMBASE and update CS accordingly.

VBE

NVRAM (RTC, ESCD, System configuration)

The original PC/AT RTC chip was a Motorola MC146818. The RTC cells in later machines are compatible with it. It has 64 bytes of RAM (commonly referred to as CMOS memory) and a battery-backed real-time clock. Later chips have 128 bytes of RAM. You can recognize RTC RAM accesses by the use of port $70 (index) and $71 (data). The user region of the RTC RAM (indices $0E-$3F, or -$7F on 128-byte chips) is used to store user-defined BIOS settings as well as internal firmware state. The BIOS code will usually read from interesting RTC RAM locations while starting up, for example to get user-defined CPU/FSB settings, or to see if the system needs to be resumed from a BIOS suspend. It will also eventually perform some sort of checksum over the region, to ensure that low supply voltage has not allowed the stored data to become corrupt.

When reading or writing RTC RAM, there should be a delay in between the write of $70 and the subsequent access of $71. However, this delay should not be too long because the RTC chip can become confused. A good rule of thumb is to always read from $71 after setting $70, even if the value is not interesting.

When the RTC IRQ handler is installed, the RTC index port is usually set to $0D (so that the status register can quickly be read). If you play with the RTC yourself, remember to disable interrupts first and restore $0D to the index before re-enabling interrupts, or the IRQ8 handler may become confused. Another good reason for this to be left to $0D is because if it is set to a RAM location, that location could be corrupted when power is removed from the system.

The system NMI mask bit is bit 7 of port $70. Yes, the Non-Maskable Interrupt is actually maskable, through the 8042 keyboard controller – the high bit of that port is latched by the 8042. This was overlapped with the index port so that the index could be written simultaneously with masking NMI (so that an NMI would not occur while RTC RAM was being updated – a potentially hazardous situation since the RTC RAM is preserved across reset even if it is in a corrupt state). However, it also means that being able to index more than 128 locations is impossible. It also means that if it is possible for the NMI to be masked when you go to perform your RTC access, you must first read the index port and preserve the high bit when writing your index value, otherwise you may inadvertently unmask the NMI.

Some MC146818-compatible RTC chips have even more RAM stored in extra banks, but the access method is implementation defined. Some use registers within $70 to access the extra banks, others (such as Intel PIIX) implement $72 and $73 as another 128-byte bank with identical access strategy to $70 and $71.

Reference: http://www.maxim-ic.com/appnotes.cfm/an_pk/77

MPS
APIC
IRQ routing table
ACPI
Boot flag (RSDT: “BOOT”)
Reference: Simple Boot Flag Specification, 2.1, Microsoft
System setup program

Code Trace analysis

Execution starts at $FFF0 and the code will jump to the real start routine from there. The system BIOS is not relocatable so use $F000 as the segment for disassembly. Other BIOS images must be relocatable either by dip switch (ISA) or by PCI configuration, so they will either employ relative addressing or have relocation fixups at the beginning of the code.

Accesses to $60-$68 are KBC accesses (i.e. a self test, or checking for the key combination to enter setup or to invoke a BIOS upgrade)

cr0 reads or writes are CPU configuration (enable/disable cache or FPU)

2 Responses to “Anatomy of a PC BIOS”

  1. Rex says:

    Hi, Where did you gather all this information from ? The Intel manuals for x86 ?

    Thanks

  2. nemesis says:

    No, where a source was not cited it as just from disassembly and experimentation. It is more a collection of notes regarding analysis I was doing at that time, so sorry about the disorganization.

Leave a Reply