Archive for November, 2006

Get rid of Arctic Silver heatsink grease

Thursday, November 30th, 2006

Replacing a heatsink on your NVIDIA monster? The compound that’s on there might be a mess. Also, it might have migrated around/underneath memory chips, wreaking havoc along the way. Wiping the stuff just gets it all over your fingers and messes up a good microfiber cloth (since paper towels tear and other cloths leave lint).

Acetone and rubbing alcohol mitigate that a bit, but leave something to be desired.
The real solution is Goo Gone or Goof Off or a similar product, and a toothbrush.

Pour the Goo Gone over the RAM chips and/or GeForce core, and use the toothbrush to scrub them clear of the heatsink goop. Rinse with water, making sure not to allow the goo to drip onto the other side of the card, and allow the card to dry thoroughly (24 hours) before reinstalling the heatsink.
And for deity’s sake, use less goop next time! It does NOT take much.

You can use the same technique to remove goop from the cooler’s heatsink. I recommend removing the heatsink from the cooler first, because when you wash the solvent + goo off the heatsink, it has an uncanny attraction to plastic and will end up coating the plastic housing of the cooler with a sticky, cloudy layer. It doesn’t affect cooling but looks terrible.

Implementing a software modem

Tuesday, November 28th, 2006

To test the protocol, use two sound cards, line out on one to line in of the other. Use symbol rate between 2400 and 3200 baud to approximate telephone line conditions.

Modulation protocols:
Bell 103 (300bps)
v.21 (300bps)
v.22 (1200bps)
v.22bis (2400bps) – Patents on v.21/v.22/v.22bis expire 2008
v.23 (1200bps/75bps half duplex)
v.32 (9600bps)
v.32bis (14.4kbps)
HST (9600bps, 14.4kbps, 16.8kbps, 21kbps, 24kbps, proprietary)
v.32terbo (19.2kbps/12.0kbps, proprietary), 19.2ZYX (19.2kbps, proprietary)
v.FC (28.8kbps, proprietary)
v.34 (28.8kbps)
v.34bis/v.34plus (33.6kbps)
K56Flex (56kbps PCM downstream, v.34+ upstream)
X2 (56kbps PCM downstream, v.34+ upstream)
v.90 (56kbps PCM downstream, v.34+ upstream)
v.92 (56kbps, PCM downstream/upstream)
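
As a starting point for the sound card loopback test, the simplest protocol in this list, Bell 103, can be sketched as a continuous-phase FSK modulator. This is only a rough sketch under stated assumptions: the 48 kHz sample rate, the 16-entry sine table, the signed 8-bit sample format and all function names are hypothetical choices of mine. The tone pair is the originate side of Bell 103 (1270 Hz mark, 1070 Hz space).

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_RATE 48000u /* assumed sound card sample rate */
#define BAUD        300u   /* Bell 103 symbol rate, one bit per symbol */
#define MARK_HZ     1270u  /* originate side, binary 1 */
#define SPACE_HZ    1070u  /* originate side, binary 0 */

/* One coarse sine cycle, amplitude 127 (hypothetical table size). */
static const int8_t sine16[16] = {
    0, 49, 90, 117, 127, 117, 90, 49,
    0, -49, -90, -117, -127, -117, -90, -49
};

/* Phase increment per sample for a 32-bit phase accumulator. */
static uint32_t phase_step(uint32_t hz)
{
    return (uint32_t)(((uint64_t)hz << 32) / SAMPLE_RATE);
}

/* Modulate nbits bits (one per array element, 0 or 1) into out[].
 * The phase accumulator is never reset between bits, so the tone
 * changes frequency without a jump in the waveform.
 * Returns the number of samples written (at most out_cap). */
size_t fsk_modulate(const uint8_t *bits, size_t nbits,
                    int8_t *out, size_t out_cap)
{
    const size_t per_bit = SAMPLE_RATE / BAUD; /* 160 samples per bit */
    uint32_t phase = 0;
    size_t n = 0;

    for (size_t b = 0; b < nbits; b++) {
        uint32_t step = phase_step(bits[b] ? MARK_HZ : SPACE_HZ);
        for (size_t s = 0; s < per_bit && n < out_cap; s++) {
            out[n++] = sine16[phase >> 28]; /* top 4 bits index the table */
            phase += step;
        }
    }
    return n;
}
```

The samples would then be written to the line out of the first card and captured on the line in of the second. A real implementation would want a finer sine table and async framing (start/stop bits) around each byte.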

Error correction protocols:
MNP (2-4, 10)
v.42 (LAPM)

Data compression protocols:
MNP5 (RLE, Huffman), MNP7
v.42bis (BTLZ)
v.44 (LZJH)

Handshaking protocols:
MNP6, MNP9 (Universal Link Detection)

OpenGL profiling

Thursday, November 16th, 2006

– Find software paths by comparing performance of some commands and extensions to the equivalent done through the software driver. Allow the application to query whether a software path will be used or not.
– Find the amount of fast texture RAM. Since the amount of local memory on the card varies and texture compression may be enabled, treat this as “cached” RAM compared to having to go to AGP or system memory. Also find the size of the AGP aperture and benchmark the AGP transfer rate and system memory speed. The application can decide how to manage its textures accordingly.
– Find fill rate and triangle rate. Application can decide level of detail.
– Find maximum multitexturing level. (Fill rate should decrease drastically when this is exceeded.)
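
The last item can be illustrated with a small helper that takes fill rates already measured for 1, 2, 3, ... textures per pixel and reports where the drastic drop occurs. This is a sketch, not part of any OpenGL API: the function name, the array layout and the drop threshold are assumptions, and the measurements themselves would come from timing real draw calls.

```c
#include <stddef.h>

/* rates[i] is the measured fill rate (pixels per second) with i+1
 * textures applied per pixel. drop_ratio is the fraction of the
 * previous rate below which a step counts as a "drastic" drop
 * (for example 0.75). Returns the highest texture count reached
 * before the first such drop, or 0 if there are no measurements. */
size_t max_fast_texture_units(const double *rates, size_t n,
                              double drop_ratio)
{
    size_t level = (n > 0) ? 1 : 0;

    for (size_t i = 1; i < n; i++) {
        if (rates[i] < rates[i - 1] * drop_ratio)
            break; /* hardware likely fell back to multipass or software */
        level = i + 1;
    }
    return level;
}
```

An application could run this once at startup and cap the number of texture stages it uses per pass at the returned level.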

This may violate OpenGL's transparency, but it is already violated when hardware rendered graphics do not match software rendered graphics pixel for pixel.

PC stability and performance issues

Thursday, November 16th, 2006

These issues are related to:
– CPU bugs
– Motherboard electrical bugs
– AGP/PCI Chipset hardware bugs
– Video hardware bugs
– Device driver bugs

On VIA and ATI AGP 3.0 chipsets, AGP cycles to system memory do not work properly and cause memory corruption.

GA-6BX board has a 5A regulator instead of the 6A required by AGP specification, resulting in low voltage on the 3.3V power supply, crashing video cards which do not have their own power circuits.

AMD Irongate (AMD751) must be used in 1X mode with certain NVIDIA video cards. It is also possible for a Savage4 card to hang the AMD751 under specific circumstances unless .

ALI 1541 and 1647 are incompatible with the G200's 2X AGP and must be run in 1X. NVIDIA disables AGP for ALI chipsets.

If an Aureal A3D chip does not get a bus grant within a 4us (16 PCI clocks) window on an MMIO read, the system locks up.

Windows SBLive drivers constantly transfer data to the sound card when MIDI is enabled, even when no sound is being played, reducing the available bandwidth.

Athlon north bridges can be programmed to disconnect the CPU when Stop Grant is asserted (a result of the HLT instruction called during the OS idle loop, or of the south bridge asserting STPCLK as a result of ACPI), resulting in very good power saving and cooling. The BIOS vendors do not seem to program the appropriate bits in the north bridge, so the user can do it himself. There can be problems with this, though. Disconnecting and reconnecting the processor bus adds latency and reduces bandwidth. It may also fail on Athlon Model 4 (Thunderbird) due to two errata involving the CPU multiplier and the CLK_CTRL MSR. Some AMD751 chipsets seem to have a problem with incomplete processor disconnects that cause hangs. Doing this behind the BIOS's back may also conflict with the BIOS ACPI implementation. The local APIC timer may also have problems.

Athlon Model 1/2 (slot) resume from suspend may fail due to L2 cache corruption on specific processors (errata). BIOS needs to check for this condition and leave L2 cache enabled at suspend.

VIA bugs

VIA chipsets are frequently relabeled. Some relabeled VIA chipsets are AMD-640 (VP2/97), Soyo ETEQ (MVP3), VIAgra (MVP4), etc.

A chipset such as the VIA Apollo VPX is a set of chips sold under one product name (VT82C580VPX): a north bridge (CPU to AGP/PCI), a south bridge (PCI to ISA, some with integrated super I/O) and data buffers.

Name                  Product name  North bridge      NB PCI ID  South bridge                   SB PCI ID          IDE controller PCI ID
Apollo VP (VX Pro)    VT82C580VP    VT82C585VP        0x585      VT82C586                       0x586              0x571
Apollo VPX (VX Pro+)  VT82C580VPX   VT82C585VPX       0x585      VT82C586B                      0x586              0x571
Apollo VP3            VT82C597      VT82C597          0x597      VT82C586B                      0x586              0x571
Apollo MVP3           VT82C598      VT82C598[AT|MVP]  0x598      VT82C586B/VT82C596B/VT82C686A  0x586/0x596/0x686  0x571
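
For detection code, the north bridge column of this table could be turned into a lookup keyed by PCI device ID. This is a minimal sketch: the struct and function names are hypothetical, and the vendor ID 0x1106 (VIA) is not from the table above. Because several VIA parts share an ID, the lookup can only report a chipset family, not an exact chip.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCI_VENDOR_VIA 0x1106u /* VIA's PCI vendor ID (not from the table) */

struct via_nb {
    uint16_t device_id;
    const char *family;
};

/* North bridge device IDs taken from the table above. */
static const struct via_nb via_north_bridges[] = {
    { 0x0585, "Apollo VP/VPX" }, /* VT82C585VP and VT82C585VPX share an ID */
    { 0x0597, "Apollo VP3" },
    { 0x0598, "Apollo MVP3" },
};

/* Returns the chipset family name, or NULL if the device is unknown. */
const char *via_nb_family(uint16_t vendor, uint16_t device)
{
    size_t i;

    if (vendor != PCI_VENDOR_VIA)
        return NULL;
    for (i = 0; i < sizeof via_north_bridges / sizeof via_north_bridges[0]; i++) {
        if (via_north_bridges[i].device_id == device)
            return via_north_bridges[i].family;
    }
    return NULL;
}
```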

VT82C598MVP revision 'CD' (made as late as 9821) seems to have some compatibility problems with certain AGP video cards (like Intel i740). Revision 'CE' (made as early as 9825) seems to fix this, and FIC also produced a BIOS update that claimed to fix it.

VT82C597 only supports AGP 1.0.

586A does not support ACPI, 586B does.

The combination of SBLive, PCI 2.1 Delayed Transaction, and VIA 686B seems to cause data corruption when transferring between the 686B IDE channels in DMA mode.

VIA 691 (Apollo Pro), 693 (Apollo Pro Plus), 693A (Apollo Pro 133) and 694X (Apollo Pro 133A) all share the same PCI ID. 694X supports AGP 4X similar to KX133, but NVIDIA uses 2X AGP on 694X and KX133.

Many VIA PCI-ISA (and VLB-ISA) bridges have a bug that causes frequent ISA DMA transactions to hang the board. Intel 430FX (Triton) may suffer from the same bug.

Some instability problems were reported with Western Digital hard disk drives on VIA VPX/97 and VP2/97-based boards using UDMA mode. WD agreed to work around the incompatibility in their firmware in future releases of their products.
VIA will also address this issue in the next release of the BusMaster driver (v. 2.13 has not implemented this fix; for v. 2.19, not approved yet, it will be known after lab tests are finished).
Temporary solutions:
a) disable UDMA, or
b) disable the 586B ISA refresh option (set RX41 bit 0 to 1; this disables ISA refresh, but Port 61 still toggles)
WD releases Tucson CCC:C2 and Sedona CCC:B2 and later have updated firmware that solves the problem. There have been no reports of problems for disks shipped after November 1997. If you suspect your HDD is affected, check its firmware version.

PCI bridge performance

Several generic performance options can be set in order to speed up PCI operations. The locations differ per PCI chipset, and they should be set up by the BIOS, but on many systems are not. Setting the L2 cache to write-back instead of write-through, enabling PCI posting (from CPU to memory, CPU to PCI, and PCI to memory), and enabling PCI burst transfers are all helpful. Linux used to set these up in pci/quirks.c but no longer does.

Dealing with some OpenAFS issues

Tuesday, November 14th, 2006

Some OpenAFS versions earlier than 1.4.1 have a bug in the fileserver that causes tokens to be randomly discarded. The fix is to make sure the fileserver is at least 1.4.1.

Some previous OpenAFS versions allowed a principal with a ‘.’ (period) in the name. This no longer works. The symptom of this will be that you are able to obtain a token, but any operation on the filesystem that requires authentication will result in a short hang and the token is then discarded. The fix is to rename all such principals with a different character such as an underscore to replace the periods.

OpenAFS and Linux 2.6 still have some issues with PAG support. In my case, when an authenticated shell process forks and calls another shell process, the child process has no tokens. Strangely, a shell process that runs other system programs that are not shells, such as ‘mv’ or ‘rm’, succeeds. To work around this, convert your maintenance shell scripts that run authenticated to “source” each other with a . (dot) rather than call each other. This requires some changes in error/exit handling but completely worked around the PAG issue here.

Sometimes you might notice your OpenAFS fileserver periodically restarting, with no cron job to explain it. The culprit is the restart schedule that the bos setrestart command controls. By default, the fileserver restarts at 4:00am every Sunday, and at 5:00am every day if new binaries are detected.
To disable the periodic restart, issue the following command, substituting your own server name:
# bos setrestart -server <fileserver> -time never

Geforce2 passive cooling replacement

Tuesday, November 14th, 2006

If you have an older Geforce 2 GTS or MX card that came with a fan on it, chances are the fan is dead by now. The sleeve bearing fans they typically ship on video cards do not last. What to do?

If you can replace just the fan and avoid removing the heatsink, do that. Unfortunately, many chip coolers are sold exactly as that, so buying only the fan as a replacement is impossible. And the chip cooler itself is rarely sold at a reasonable price considering the age of the video card. Otherwise, a solution with more longevity is passive cooling. You can buy a part number HS325-ND heatsink from Digi-Key.

Remove the existing chip cooler by CAREFULLY prying between the chip and heatsink on the side where there is the least amount of glue. You have to be careful here for two reasons. There are small surface mount components and traces on the circuit board that could be damaged if you are not careful. Also, the chip package itself has tiny traces that lead from the chip core to the BGA contact points. If even one of these is damaged, the card is ruined.

Use a medium grit sandpaper and sand off the thermal glue that remains on the chip. You don't have to get rid of all of it, but you do need to get rid of as much as possible so that the chip with sanded glue is as flat as possible. Here again, avoid accidentally sanding the BGA traces on the outer diameter of the chip.

Wipe the chip down and attach the heatsink using the provided thermal tape, or glue a new chip cooler onto it. If you go with the thermal tape or pad, it would be advisable to secure the heat sink with a zip tie or something similar, because the tape or sticky pad has about a 50/50 chance of letting go at some point, which will then lead to a fried GPU.

Monitor the chip temperature for a while to make sure the cooling solution is effective. On a GF2, I have no problems with normal desktop use with passive cooling.

Check those capacitors for bulging and/or leaking while you are at it!

The Schwag on Google Video/YouTube

Wednesday, November 8th, 2006

Parallel ports

Monday, November 6th, 2006

Parallel switchboxes
Don't use a parallel switchbox if you want to do high speed transfers. I have not yet found one, manual or automatic type, that is reliable at high byte rates.

PS/2 (or Extended) mode takes the standard parallel port (SPP) and introduces an output latch and direction control for bidirectional port operation. Contrary to ECP/EPP mode, the behavior of the nACK interrupt is also changed so that the IRQ becomes active on the trailing edge instead of mirroring the pin; also, bit 2 of the status register reflects the status of the ACK interrupt (latched when IRQ generated, cleared on read) – a violation of ECP spec.

The EPP 1.7 and EPP 1.9 settings in a system BIOS differ by one trivial change in the EPP handshake. EPP 1.9 is equivalent to IEEE 1284. The EPP 1.7 setting exists only for EPP devices built before IEEE 1284 that malfunction with the IEEE 1284/EPP 1.9 handshake. Note: IEEE 1284 defines the electrical characteristics and handshake protocols of an EPP port, not the register definition.

An ECP-capable port is a functional superset of IEEE 1284. An ECP-capable port in ECP mode is incompatible with non-ECP devices, however.

Many ECP ports do not implement the full ECP specification. Common elements to leave out are:
– nFault IRQ generation (full/empty FIFO can still generate IRQ though!)
– Hardware RLE compression (not required by spec)
– DMA (PCI cards cannot implement ECP DMA and generally do not need to since PCI write buffers provide sufficient speed)
– IRQ/DMA resource configuration (PCI cards cannot implement this)

Handling a parallel port interrupt
By definition, when sharing interrupts it is necessary for your device driver to be able to determine whether your device is the source of the interrupt or not (so you can pass the interrupt on unclaimed to other drivers if it is not). This is exceedingly difficult to do in a generic fashion for PCI parallel cards. Whether or not an interrupt is delivered in a particular operating mode, and where the status of that interrupt is reflected, is highly implementation dependent.

There are four places where an interrupt can be enabled:
– Control register bit 4 (~ACK interrupt)
– ECP Extended Control register (ECR) bit 4 (~ERR interrupt)
– ECP Extended Control register (ECR) bit 3 (DMA interrupt)
– ECP Extended Control register (ECR) bit 2 (FIFO interrupts)

There are six places where an interrupt can be generated:
– ~ACK transition
– ECP ~ERR transition
– ECP DMA completion
– ECP read FIFO filling
– ECP write FIFO emptying
– Some devices (NS) generate an interrupt on an unexpected EPP read

There are at least three places where an interrupt can be detected:
– Status register bit 2 (latched after ~ACK transition)
– ECP Config B register bit 6 (follows interrupt pin on bus)
– ECP Extended Control register (ECR) bit 2 (check for 0->1 transition)

PCI multifunction cards usually also have a global control register, which has some location outside of the usual parallel port register set that reflects the status of a parallel interrupt.

We don't really care what in particular caused the interrupt, but we do need to find some proof somewhere in the registers that this card was the one responsible for the interrupt, or things will go horribly wrong.

– ~ACK transition is only latched to Status[2] in PS/2 mode by many cards. In SPP and other modes, it either reads 1 or follows the IRQ pin. Since a spec-conforming PCI card will use a level triggered interrupt, we can in theory use this to test for the interrupt (but only on PCI cards!)
– ECP Config B register can be used, but first the port has to be switched into Test mode to read it, which means the ECP FIFOs must be flushed and current ECP transaction terminated, possibly too high a cost for interrupt handling.
– There is no way to determine whether a ~ERR transition caused the interrupt or not. On an ISA card or one without a shared interrupt, it can be determined by a process of elimination (since a spec-conforming driver disables the ~ACK interrupt when in ECP mode), but on a shared interrupt it is impossible.
– ECR bit 2 is only useful if in ECP mode and FIFOs are being used.

Basically, the most useful parallel interrupts (those generated by external events) give us no reliable way to determine which card owns the interrupt. The ~ACK interrupt could be probed, had the PC parallel port's designers thought to put in a loop-back test, but they did not.

The best thing you can do to handle PCI parallel interrupt sharing in a generic fashion is to:
– Disable the ~ERR interrupt.
– The DMA interrupt is not an issue on PCI cards since they don't support it anyway.
– Keep track of the state of the ECR bit 2 when you set it to 0 (unmasks the ECP FIFO interrupt) so that you can check if it changed in your interrupt handler (meaning we generated an interrupt).
– Ensure that your card cannot both have the ~ACK interrupt enabled AND be in a mode that will not latch that interrupt in Status[2] (reflecting the pin state is not enough!). Then you can assume that a Status[2]==0 event means that we generated the interrupt. Note: On most/all PCI cards, the status register must be read in order to clear the level-triggered interrupt.
– Assure yourself to whatever degree of confidence required that your card will not produce ANY other type of interrupt (vendor's logic equation for IRQ event helps)!
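
The two checks that remain after these steps can be sketched as a pure decision function, so the logic is separate from the port I/O. This is only an illustration of the rules above: the struct, field and function names are hypothetical, and the caller is assumed to have read the status register and ECR once at the top of its interrupt handler.

```c
#include <stdint.h>

#define STATUS_IRQ_LATCH 0x04 /* status register bit 2, 0 = ~ACK IRQ latched */
#define ECR_SERVICE_INTR 0x04 /* ECR bit 2, reads 1 after a FIFO interrupt */

/* What the driver itself has enabled (tracked in software). */
struct lpt_irq_state {
    int ack_irq_latched_mode; /* ~ACK IRQ enabled AND port is in a mode
                                 that latches it in Status[2] */
    int fifo_irq_unmasked;    /* we last wrote ECR bit 2 = 0 */
};

/* Returns 1 if the register snapshot proves this port raised the
 * interrupt, 0 if the interrupt should be passed on unclaimed. */
int lpt_irq_is_ours(const struct lpt_irq_state *st,
                    uint8_t status, uint8_t ecr)
{
    /* ~ACK interrupt: Status[2] == 0 means the latch is set */
    if (st->ack_irq_latched_mode && (status & STATUS_IRQ_LATCH) == 0)
        return 1;
    /* FIFO interrupt: ECR bit 2 went from the 0 we wrote to 1 */
    if (st->fifo_irq_unmasked && (ecr & ECR_SERVICE_INTR) != 0)
        return 1;
    return 0;
}
```

Keeping the decision free of I/O also makes it easy to test against register values captured from a misbehaving card.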

If you are lucky enough to have a global interrupt flag for the parallel port on your PCI card, USE THAT INSTEAD! Then you can use ~ERR and ~ACK as external interrupt sources without worries, and you can also handle spurious interrupts with a high degree of confidence! Only use the above “generic” mechanism as a last resort. If someone would look into using the ECP Config B register to check for the interrupt and see how well that works, that may be an even better “generic” solution for PCI parallel cards.

Simple PC parallel port detection in DOS

#include <assert.h>
#include <conio.h>  /* inp(), outp() */
#include <dos.h>    /* MK_FP(), getvect(), setvect(), enable(), disable() */
#include <stdio.h>
#include <stdlib.h>

#ifdef __cplusplus
#define __CPPARGS ...
#else
#define __CPPARGS
#endif

/* 8259 PIC ports: _0 is the command register, _1 is the mask register */
#define PICA_0 0x20
#define PICA_1 0x21
#define PICB_0 0xA0
#define PICB_1 0xA1
#define EOI 0x20
#define LPT_STROBE 0x01 /* control register bit 0 */

unsigned short lpt_base;
signed char lpt_irq = -1; /* -1 = not known yet */
unsigned char lpt_vector;
unsigned char lpt_pic; /* 0 = pic1, 1 = pic2 */
unsigned char lpt_mask; /* bit in PIC OCW to unmask/mask */
unsigned char received; /* The last byte received */
char is_ecp;
void interrupt (*old_lpt_irqhandler)(__CPPARGS);
void interrupt lpt_irqhandler(__CPPARGS);

// The following code should be inserted into a setup function, and allow
// user to override base address and IRQ
    // setup parallel port
    if (lpt_base == 0) {
        // Use BDA to find base address of system's first parallel port
        unsigned short far *bda_lpt = (unsigned short far*)MK_FP(0x40, 8);

        lpt_base = *bda_lpt;
        //printf("lpt_base %0.4x", *bda_lpt);
        assert(lpt_base == 0x3bc || lpt_base == 0x378 || lpt_base == 0x278);
    }

    if (lpt_base == 0x3bc) {
        // We can assume a port at 0x3BC has IRQ 7 unless we find otherwise
        lpt_irq = 7;
    }

    // Detect ECP port according to ECP spec p.31
    // ECR is at lpt_base + 0x402
    unsigned char test = inp(lpt_base+0x402);
    if ((test & 1) /* fifo empty */ && !(test & 2) /* fifo not full */) {
        // Attempt to write a read only bit (fifo empty) in ECR
        outp(lpt_base+0x402, 0x34);
        test = inp(lpt_base+0x402);
        if (test == 0x35)
            is_ecp = 1;
    }

    // If ECP port, read cnfgB to find parallel port IRQ number
    if (is_ecp) {
        // Put port into configuration mode
        test = inp(lpt_base+0x402);
        test |= 0xE0;
        outp(lpt_base+0x402, test);
        // Read cnfgB
        unsigned char irq = inp(lpt_base+0x401);
        irq &= 0x38;
        irq >>= 3;
        // irq0 means selected via jumper, user will have to hard code the irq
        switch (irq) {
            case 1: lpt_irq = 7; break;
            case 2: lpt_irq = 9; break;
            case 3: lpt_irq = 10; break;
            case 4: lpt_irq = 11; break;
            case 5: lpt_irq = 14; break;
            case 6: lpt_irq = 15; break;
            case 7: lpt_irq = 5; break;
            default: break;
        }
        // Set ECP port mode to PS2
        test = inp(lpt_base+0x402);
        test &= ~0xE0;
        test |= 0x20;
        outp(lpt_base+0x402, test);
    }

    if (lpt_irq == -1) {
        // The switch below will abort() since no IRQ is known
        fprintf(stderr, "Couldn't find interrupt for parallel port at 0x%x !\n", lpt_base);
    }

    // Convert IRQ number to interrupt vector
    switch (lpt_irq) {
        case 5: lpt_vector = 0x0d; lpt_mask = (1 << 5); break;
        case 7: lpt_vector = 0x0f; lpt_mask = (1 << 7); break;
        case 9: lpt_vector = 0x71; lpt_pic = 1; lpt_mask = (1 << 1); break;
        case 10: lpt_vector = 0x72; lpt_pic = 1; lpt_mask = (1 << 2); break;
        case 11: lpt_vector = 0x73; lpt_pic = 1; lpt_mask = (1 << 3); break;
        case 14: lpt_vector = 0x76; lpt_pic = 1; lpt_mask = (1 << 6); break;
        case 15: lpt_vector = 0x77; lpt_pic = 1; lpt_mask = (1 << 7); break;
        default: abort();
    }

    fprintf(stderr, "Parallel port at 0x%x, irq %d", lpt_base, lpt_irq);
    if (is_ecp)
        fprintf(stderr, ", ECP");
    fprintf(stderr, "\n");

    // set to data input mode using DCR
    outp(lpt_base+2, inp(lpt_base+2) | 0x20);

    // check that data lines are not driven by us: in input mode, at
    // least one written value should fail to read back
    int fail = 1;
    int i;

    for (i = 0; i < 5; i++) {
        outp(lpt_base, 0x5a+i);
        if (inp(lpt_base) != 0x5a+i) {
            fail = 0;
        }
    }
    if (fail) {
        fprintf(stderr, "Parallel port does not appear to be bidirectional!\n");
    }

    disable();  // cli()
    // save the old handler so the ISR can chain to it, then grab IRQ vector
    old_lpt_irqhandler = getvect(lpt_vector);
    setvect(lpt_vector, lpt_irqhandler);
    if (lpt_pic > 0) {
        // unmask our IRQ
        outp(PICB_1, inp(PICB_1) & ~lpt_mask);
        // then unmask IRQ2
        outp(PICA_1, inp(PICA_1) & ~0x04);
    }
    else {
        // unmask our IRQ
        outp(PICA_1, inp(PICA_1) & ~lpt_mask);
    }
    // enable parallel port interrupt via ACK line
    outp(lpt_base+2, inp(lpt_base+2) | 0x10);

    enable(); // sti()
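
The IRQ-to-vector switch in the setup code follows a regular pattern on PC hardware: IRQ 0-7 map to interrupt vectors 0x08-0x0F on the first PIC, and IRQ 8-15 map to 0x70-0x77 on the second. A portable sketch of that mapping, together with the cnfgB IRQ field decode, might look like the following. The struct and function names are hypothetical, and unlike the setup code this accepts any IRQ from 0 to 15.

```c
#include <stdint.h>

struct irq_route {
    uint8_t vector; /* CPU interrupt vector number */
    uint8_t pic;    /* 0 = first PIC, 1 = second PIC */
    uint8_t mask;   /* bit for this IRQ in that PIC's mask register */
};

/* Fills *r for the given IRQ line. Returns 0 on success, -1 if the
 * IRQ number is out of range. */
int irq_to_route(int irq, struct irq_route *r)
{
    if (irq < 0 || irq > 15)
        return -1;
    r->pic = (uint8_t)(irq >= 8);
    r->vector = (uint8_t)(irq < 8 ? 0x08 + irq : 0x70 + (irq - 8));
    r->mask = (uint8_t)(1u << (irq & 7));
    return 0;
}

/* Decodes bits 5:3 of the ECP cnfgB register into an IRQ number,
 * using the same table as the setup code. Returns -1 when the field
 * is 0, meaning the IRQ is selected by jumper. */
int cnfgb_to_irq(uint8_t cnfgb)
{
    static const int8_t map[8] = { -1, 7, 9, 10, 11, 14, 15, 5 };
    return map[(cnfgb >> 3) & 7];
}
```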

Simple bidirectional communication between two PCs with a standard parallel port cable
Swap the STROBE and nACK pins on one end of the parallel cable. Ensure that the parallel port nACK interrupt is enabled on both ends (DCR[4] := 1). Then the communication looks like the following:

// Parallel port ISR, Turbo C++ 3.1 DOS code
void interrupt lpt_irqhandler(__CPPARGS)
{
  received = inp(lpt_base);
  // Interrupt the sender, since STROBE on this end
  // is connected to ACK on the other end
  unsigned char tmp = inp(lpt_base+2);
  outp(lpt_base+2, tmp ^ LPT_STROBE);
  outp(lpt_base+2, tmp);

  old_lpt_irqhandler(); // chain old IRQ handler
  outp(PICA_0, EOI); // EOI
  if (lpt_pic > 0)
    outp(PICB_0, EOI); // also send EOI to PIC2
}


I have found this to be a sufficient quick & dirty way of transferring bytes from one PC to another in interrupt driven fashion.

coLinux adventures

Thursday, November 2nd, 2006

This is the best way to set up coLinux networking, because it has a high speed TAP interface for the local X server traffic, and only uses WinPCap for external traffic. It also does not require Internet Connection Sharing to be enabled, because WinPCap creates a virtual network adapter on top of the existing one.

Note that this will only work if your local X server can accept connections from 192.168.x.x; some commercial X servers have a license scheme limiting the IP range. If yours does, you can assign IP addresses within that range.

Automatic security updates with Debian

Thursday, November 2nd, 2006

The following /etc/dpkg/dpkg.cfg settings are useful when you have scripted apt-get to do automatic security updates:

# Choose the default action regarding conffiles
force-confdef
# If no default, keep the old conffile
force-confold

This will prevent an update from aborting because the local version of the conf file differed from the package maintainer's version and no user input was available.