Archive for the ‘Linux/UNIX/Open Source’ Category

PDC202xx_old driver is broken in Linux 2.6

Wednesday, August 1st, 2007

At some point, I started noticing my system (MSI BXMaster) would completely freeze under heavy disk load. After watching the logs, I would see something akin to the following:

Mar 11 17:27:19 dbz kernel: hdg: dma_timer_expiry: dma status == 0x60
Mar 11 17:27:19 dbz kernel: hdg: DMA timeout retry
Mar 11 17:27:19 dbz kernel: PDC202XX: Secondary channel reset.
Mar 11 17:27:19 dbz kernel: PDC202XX: Primary channel reset.
Mar 11 17:27:19 dbz kernel: hdg: timeout waiting for DMA
Mar 11 17:27:40 dbz kernel: hdg: dma_timer_expiry: dma status == 0x60
Mar 11 17:27:40 dbz kernel: hdg: DMA timeout retry
Mar 11 17:27:40 dbz kernel: PDC202XX: Secondary channel reset.
Mar 11 17:27:40 dbz kernel: PDC202XX: Primary channel reset.
Mar 11 17:27:40 dbz kernel: hdg: timeout waiting for DMA
Mar 11 17:28:02 dbz kernel: hdg: dma_timer_expiry: dma status == 0x60
Mar 11 17:28:02 dbz kernel: hdg: DMA timeout retry
Mar 11 17:28:02 dbz kernel: PDC202XX: Secondary channel reset.
Mar 11 17:28:02 dbz kernel: PDC202XX: Primary channel reset.
Mar 11 17:28:02 dbz kernel: hdg: timeout waiting for DMA
Mar 11 17:28:22 dbz kernel: hdg: dma_timer_expiry: dma status == 0x60
Mar 11 17:28:22 dbz kernel: hdg: DMA timeout retry
Mar 11 17:28:22 dbz kernel: PDC202XX: Secondary channel reset.
Mar 11 17:28:22 dbz kernel: PDC202XX: Primary channel reset.
Mar 11 17:28:22 dbz kernel: hdg: timeout waiting for DMA

I was unable to find a 2.6 kernel that worked reliably with my Promise chip (PDC20265), even after going back several versions.

The fix is to NOT use the pdc202xx_old driver, but instead use the libata driver for old Promise chips (CONFIG_PATA_PDC_OLD=y). You will probably need to recompile your kernel for this because distribution kernels are not using libata yet.

Also, this will change your disk devices from /dev/hdX to /dev/sdX so be sure to update your /etc/fstab correspondingly.
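To see which fstab entries are affected, something like the following could help. It is only a sketch: the function name is made up for illustration, and it deliberately does not guess the new /dev/sdX names, because the mapping depends on how libata enumerates your controllers.

```python
import re

# Matches legacy IDE device names such as /dev/hda1 or /dev/hdg.
OLD_IDE_DEVICE = re.compile(r"^/dev/hd[a-z]\d*$")


def find_old_ide_entries(fstab_text):
    """Return (line_number, device, mount_point) for entries that use /dev/hdX.

    Hypothetical helper: it only reports entries, it never rewrites them.
    """
    hits = []
    for number, line in enumerate(fstab_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) >= 2 and OLD_IDE_DEVICE.match(fields[0]):
            hits.append((number, fields[0], fields[1]))
    return hits
```

Feed it the contents of /etc/fstab (for example open("/etc/fstab").read()) and fix the reported lines by hand once you know which sdX each disk became.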

If your distribution kernel includes CONFIG_PATA_PDC_OLD=m, you can use this by adding it to the initrd image. For initramfs-tools, the file /etc/initramfs-tools/modules should exist. Edit it, and add “pata_pdc202xx_old” without the quotes on a new line. Run update-initramfs -k all -u and you should be all set. This preloads the libata driver during the initrd, so that the faulty pdc202xx_old driver cannot be loaded later.

Idea for blog spam

Wednesday, July 18th, 2007

This only works if blog spamming is a compute and/or bandwidth intensive activity.

Publish a standard file or URL like robots.txt that blogs can provide in order to notify a robot that comments are moderated.

By market forces, robots should check this so they do not waste their computing time and bandwidth spamming moderated blogs.

Alternatively, remember those copy protection schemes in computer games of the ’80s? “Please type in the third word of the second paragraph on the eighth page.”

This could be used in blog captchas too. Make the instructions of which word(s) to find just convoluted enough to avoid machine parsing, and you’re good to go. The instructions can then be generated on the fly.

Proof of concept to follow.
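In the meantime, here is a rough sketch of the word-lookup idea. It is not the promised proof of concept, and every name in it is invented for illustration. It assumes the post is already split into non-empty paragraphs, and it emits plain instructions; making them convoluted enough to resist machine parsing is the hard part and is left out.

```python
import random

ORDINALS = ["first", "second", "third", "fourth",
            "fifth", "sixth", "seventh", "eighth"]


def normalize(word):
    """Strip surrounding punctuation and ignore case when comparing answers."""
    return word.strip(".,;:!?\"'()").lower()


def make_challenge(paragraphs, rng=random):
    """Pick a word from the post and return (instruction, expected_answer).

    Assumes every paragraph contains at least one word. Only the first
    eight paragraphs and words are used so the ordinal names suffice.
    """
    p_index = rng.randrange(min(len(paragraphs), len(ORDINALS)))
    words = paragraphs[p_index].split()
    w_index = rng.randrange(min(len(words), len(ORDINALS)))
    instruction = (f"Please type in the {ORDINALS[w_index]} word "
                   f"of the {ORDINALS[p_index]} paragraph.")
    return instruction, normalize(words[w_index])


def check_answer(expected, submitted):
    """Compare the commenter's answer against the stored expected word."""
    return normalize(submitted) == expected
```

The blog would store the expected answer server-side with the comment form and call check_answer() when the comment is submitted.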

How to do an end run around the GPL with retail software

Tuesday, June 26th, 2007

1. Include a GPL-based program in your retail software version 1.0.
2. 3 months after 1.0 is released, release version 1.1 and cease distributing 1.0.
3. Delete the sources to 1.0.
4. Repeat ad infinitum.
5. ???
6. Profit!

This way you only have to satisfy source requests from customers who request it up front immediately after their purchase. Immediately after a direct purchase from you, that is — if the boxed software is sold by a distribution house after you have moved to a new version, well that’s no longer your problem, is it? And those pesky customers who have to maintain your software in the field will just have to do without the sources to any prior versions that you no longer distribute. Bonus points if they purchase upgrades to the new version because of this scheme.

For extra evil, you can refer customers who ask for source to a generic upstream mirror for the “sources”, as long as they don’t make enough noise about your violation of the GPL.

AFS authenticated daemons

Thursday, May 31st, 2007

If your web server serves out of an AFS space that is accessible to local users, you probably want to limit its access to files that you have already audited for copyright issues. When a user requests a page, Apache will respect the UNIX mode bits for “other” when determining whether or not an unauthenticated web user should be able to access the page. Intuitively, a new admin may think that the system:anyuser and system:authuser ACLs would control access both by off-site AFS users and by off-site web users. However, with AFS, Apache is running with tokens (perhaps for an httpd service principal), and so it will pass any system:authuser ACL!

The long and the short of it is that you need a different policy for controlling access from AFS-authenticated daemons than you do for controlling access by AFS clients. This means that you cannot simply use system:authuser to control access to material that may be private to your site, because authenticated daemons will happily serve up that information to external users.

I recommend creating a separate AFS group ‘authuser’ to control access to material that MAY be private, that authenticated users SHOULD be able to access, and that non-authenticated users or service daemon clients SHOULD NOT be able to access. Add all AFS accounts that represent a user to this group.

Then you have the problem of users starting daemons and serving authuser files to the world using their tokens. The only solution I see here so far is to disallow listening on network ports on user accounts, and create a separate user account for daemons which chroots the user to his home directory when he logs in. User applications not being able to listen on network ports may impact specific client applications such as FTP and IRC.

Idea for seamless VPN use

Tuesday, May 1st, 2007

A network can publish a VPNDB record containing the IPv4 address, IPv6 address, VPN protocol, and other information about the network’s VPN server in its DNS zone.

A modified TCP stack could query the VPNDB record and establish the VPN session with the target network on the user’s behalf, if not already established, before initiating the connection to the target machine as usual.
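The lookup-then-connect flow might look roughly like this. Everything here is hypothetical: no VPNDB record type exists, the “key=value” text form is made up for the sketch, and the DNS lookup and the VPN client are passed in as plain callables rather than tied to any real library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRecord:
    """Hypothetical contents of a VPNDB record; this field layout is invented."""
    ipv4: str
    ipv6: str
    protocol: str


def parse_vpndb(text):
    """Parse a made-up 'key=value key=value' text form of the record."""
    fields = dict(item.split("=", 1) for item in text.split())
    return VpnRecord(fields["ipv4"], fields["ipv6"], fields["protocol"])


def ensure_vpn(zone, lookup, sessions, start_session):
    """Make sure a VPN session to `zone` exists before connecting as usual.

    lookup(zone) returns the record text or None; start_session(record)
    returns some session handle. Both are stand-ins supplied by the caller.
    """
    if zone in sessions:
        return sessions[zone]          # already established, reuse it
    raw = lookup(zone)
    if raw is None:
        return None                    # no record published: connect directly
    sessions[zone] = start_session(parse_vpndb(raw))
    return sessions[zone]
```

The session table is what keeps the second and later connections to the same network from renegotiating the VPN.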

Running a public VPN server with this scheme would ensure that a user is not inconvenienced with special setup for your network in order to access your network services in a secure fashion. For example, an internal server that requires end-to-end stream security, but uses a protocol that is known to be insecure, can be configured to refuse connections that do not originate from the internal network or the public VPN server. Another potential use would be in encrypting by default as much traffic as possible in order to foil law enforcement data loggers.

Of course, with respect to firewalling, a public VPN server should be treated with the same caution that an open wireless access point would.

Hacking 3ware’s management utility for setuid programs

Tuesday, March 27th, 2007

Warning

If you do any of these hacks, be sure that you do NOT install the tw_cli program itself setuid root; use sudo, or another wrapper that filters user access to running tw_cli as root. If you do not take appropriate precautions, any user will be able to run tw_cli, bugs and all, and have all the powers of root while doing so!

Problem and solutions

3ware’s management utility for their RAID cards under Linux is called tw_cli. I have found that it may be desirable to script certain activities in tw_cli. One such instance required writing a setuid wrapper program so that a non-root user could invoke tw_cli as root (a sudo setup would be similar). But the tw_cli program unfortunately does a getuid() check against root (the precise system call according to strace(1) is getuid32()). Since in a setuid environment the effective user ID is root but the real user ID is non-root, this check fails and tw_cli refuses to run. Aside from getting 3ware to change this call to geteuid(), the user would be out of luck.

Actually, we are not totally out of luck. tw_cli is stripped, which makes binary analysis difficult, but it is statically linked. This aids analysis because all of the code is included in the binary. On IA-32, Linux system calls are invoked by moving the system call number into eax and executing int $80. The actual system call is performed by a macro in the C library which does exactly this; when statically linked, this code will reside in the binary image.

What I did was to search for a 32-bit move placing getuid32()’s system call number into the eax register, immediately followed by an int $80. getuid32()’s system call number can be found by checking the Linux kernel source code; the system call numbers are defined there as __NR_<name>. __NR_getuid32 is 199, which is $C7. The opcode for a 32-bit immediate move to eax is $B8. So, since IA-32 is little endian, this instruction is B8 C7 00 00 00. The INT instruction has an opcode of $CD and an 8-bit argument ($80 in this case). So the hex string to search for is B8 C7 00 00 00 CD 80. Well wouldn’t you know, there it is. And only one instance! It must be our culprit.
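The byte arithmetic above can be double-checked with a few lines of Python. This is just a sketch of the search, not the tool I actually used; the function names are made up.

```python
import struct

NR_GETUID32 = 199        # __NR_getuid32 on IA-32
MOV_EAX_IMM32 = 0xB8     # opcode: mov eax, imm32
INT_80 = bytes([0xCD, 0x80])


def syscall_signature(number):
    """Bytes for 'mov eax, <number>' immediately followed by 'int 0x80'."""
    # "<I" packs the number as a little-endian 32-bit value.
    return bytes([MOV_EAX_IMM32]) + struct.pack("<I", number) + INT_80


def find_all(data, pattern):
    """Return every offset at which pattern occurs in data."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets
```

Reading the binary with open(path, "rb").read() and passing it to find_all() together with syscall_signature(NR_GETUID32) lists the candidate call sites; a single hit is what you hope for.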

Now, what to change it to? We want to change this call to geteuid32(). Luckily, getuid32() and geteuid32() have the same arguments (none at all) and the same return type, so this hack is trivial. __NR_geteuid32 is 201 ($C9), so just change the move to B8 C9 00 00 00 and save the file. Now your tw_cli works as a setuid program.

A better way to do this might be to skip this call altogether. tw_cli operates on the 3ware device node, which has its own UNIX permissions, so… the tw_cli program does not really even need this check. Since the return value of the system call (the UID number) is placed in eax, we could make this hack just pass every time by changing the move to B8 00 00 00 00, and changing the CD 80 to 90 90 (nop nop). Then the program’s behavior will be controlled by device and file permissions as expected, instead of being controlled by a crude root check.
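Either patch is a same-length byte substitution, so it can be sketched as below. Treat this as an illustration under the assumptions stated in this post (exactly one call site, IA-32 binary); the names are invented, and you should only ever run something like it on a copy of the binary.

```python
GETUID_CALL = bytes.fromhex("b8c7000000cd80")   # mov eax, 199 ; int 0x80
GETEUID_CALL = bytes.fromhex("b8c9000000cd80")  # mov eax, 201 ; int 0x80
ALWAYS_ROOT = bytes.fromhex("b8000000009090")   # mov eax, 0 ; nop ; nop


def patch_once(image, replacement):
    """Replace the single getuid32() call site; refuse if it is not unique."""
    if len(replacement) != len(GETUID_CALL):
        # Changing the length would shift every later offset in the binary.
        raise ValueError("replacement must keep the instruction length")
    if image.count(GETUID_CALL) != 1:
        raise ValueError("expected exactly one getuid32() call site")
    return image.replace(GETUID_CALL, replacement)
```

Pass GETEUID_CALL for the first hack or ALWAYS_ROOT for the second, and write the result to a new file rather than over the original.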

Corrupted NTFS filesystem recovery

Monday, March 19th, 2007

The quick guide to recovering a corrupt Windows NTFS filesystem from a dead or dying hard drive:
1) If the drive does not power up or respond at all to host I/O, replace the drive controller board with a compatible one (e.g. from an identical drive purchased on eBay), unless it is a drive known to not work with a controller board swap. Don’t bother doing this if the drive responds but clicks when accessing certain files. If a controller swap doesn’t get the drive to at least respond to ID, the drive has serious problems and will require professional service (or a do-it-yourself head stack/preamp replacement, and possible reserved region rewrite…not for the faint of heart).
2) Put the hard drive in a Linux system with excess hard disk capacity.
3) Attempt to mount the partition. Recover any utterly irreplaceable files immediately, in order of necessity. You may not be able to get anything, and it may take several reboots if you “poke” the drive in the wrong place, but if you do get something, at least you know you have _that_.
4) Use dd_rescue, and dd_rhelp if necessary, to make a “clone” image of the drive. The clone image can be a file or it can be another blank hard disk. This may take several weeks and the drive may die while it is being cloned. Not much you can do if that happens but send it in to the recovery house like you would have had to do anyway.
5) Attempt to loop-mount the NTFS filesystem (mount -o loop /tmp/image.img /mnt). If it succeeds, try to copy the data you need out of /mnt that way. It is quite likely that the filesystem will not mount, or that it will mount but attempting to read certain files then crashes the kernel.
6) If you couldn’t get the files you need, copy the image to a sufficiently sized blank hard disk if you hadn’t already (dd if=/tmp/image.img of=/dev/hdd bs=10M), and then attach the cloned drive to a Windows XP machine. Do NOT allow Windows to “Chkdsk” the drive when it boots.
7) If Windows blue screens when it looks at the drive while booting up, wipe out the partition table in Linux (dd if=/dev/zero of=/dev/hdd bs=512 count=1). This will cause Windows to effectively ignore the drive.
8) Use EasyRecovery from Ontrack in “Advanced” mode to scan the disk for directory structure, and recover as necessary. The result can be copied to another disk or uploaded to a FTP server.

Hints for EasyRecovery:

  • Don’t bother with the Undelete tool because it does not deal with massive filesystem corruption.
  • The Format recovery tool will only work on an existing NTFS volume, which it won’t see because yours is corrupted.
  • The Raw scan should only be used as a last resort because it omits all file and directory names, resulting in a disorganized mess. However, it may find files that the Advanced scan does not, because they have been severed from the directory structure by corruption. If you know the contents of the file you are looking for, you can do a Raw recovery, and then “grep” through the files for a pattern that you know is in the interesting file.

If EasyRecovery cannot find your file, use a hex editor to search through the raw disk image for a piece of the file contents. You may get lucky and find it in the hex dump, and use the hex editor to save it to a file, or copy and paste from the hex editor to another program. If you don’t, well, time to decide if that file is worth $500+ for an attempted professional recovery…
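If you would rather script the search than scroll through a hex editor, a sketch along these lines reports the byte offsets where a known fragment appears in the image. The function name and chunk size are arbitrary choices for illustration; it reads the image in pieces so a multi-gigabyte file does not have to fit in memory.

```python
def find_in_image(path, needle, chunk_size=1024 * 1024):
    """Yield absolute byte offsets of needle in the file at path."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    tail = b""
    position = 0  # file offset of the first byte of tail + chunk
    with open(path, "rb") as image:
        while True:
            chunk = image.read(chunk_size)
            if not chunk:
                break
            window = tail + chunk
            start = window.find(needle)
            while start != -1:
                yield position + start
                start = window.find(needle, start + 1)
            # Keep the last len(needle) - 1 bytes so a match that
            # straddles two chunks is still found on the next pass.
            tail = window[-overlap:] if overlap else b""
            position += len(window) - len(tail)
```

With the offsets in hand you can jump straight to them in the hex editor, or carve out the surrounding region with dd.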

OpenAFS for Windows, Error: 3 (unknown authentication error 3)

Monday, February 26th, 2007

If you are getting this error “Error: 3 (unknown authentication error 3)” when you attempt to obtain tokens using OpenAFS for Windows, you forgot to install Kerberos for Windows, KfW is not configured correctly for your realm, or you do not currently have a Kerberos TGT for some other reason. The error is returned because the AFS client cannot obtain tokens if you do not already possess a Kerberos TGT.

Making Windows XP bearable

Wednesday, February 7th, 2007

Powertoys:
Cmd Here
Task Switch
TweakUI

Third party apps:
TXMouse
VirtuaWin
DAEMON Tools
Skype
Privoxy
Java
Flash

Replacements:
Internet Explorer -> Firefox, Opera
Word -> Abiword
Office -> OpenOffice
Outlook -> Thunderbird/Sunbird (unless using shared office calendar and address book)
Windows Media Player -> Media Player Classic, VLC, XP Codec Pack
Windows Messenger -> Gaim, Psi
Paint, Visio -> Inkscape, Dia, GIMP
Acrobat Reader -> Ghostscript & GSview
Notepad/Wordpad -> GVim
WinZip -> 7-Zip
GnuPG -> WinPT

Unixy additions:
MinGW/MSYS
Cygwin
GNUWin32
coLinux/andLinux
Strawberry Perl
PuTTY
WinSCP
Python

Unix interop:
Services for Unix (free from MS)
Xming (X.Org X server)
Cygwin (includes an X server)
mingw32/MSYS (Unix style build environment for Windows)
True X-Mouse (Unix style mouse focus and highlight-copy semantics)
VirtuaWin (Pseudo-virtual desktops)
WinSCP (SCP/SFTP client)
PuTTY (SSH client)
GNU-Win32 (Win32 ports of Unix utilities)
Strawberry Perl (Win32 “Official” Perl port)
coLinux (Linux as Win32 process)
Kerberos for Windows
OpenAFS client (to access AFS fileservers)
LyX (LaTeX technical writing GUI)
TightVNC (VNC client and server for platform independent remote framebuffer access)

Simplification:
gVim (text editor)
AbiWord/OpenOffice (office suites)
Sunbird (Calendar with WebDAV support)
Firefox/Opera (browsers)
ImageMagick (image conversion and viewers)
GIMP (image editing)
GTK for Windows (for Dia/GIMP/etc)
Inkscape/Dia (chart drawing)
Media Player Classic and XP Codec Pack (media player)
7-Zip (file archiver)
Ghostscript/GSview, Foxit Reader (PDF/PS viewer)
Powertoys (enhanced Alt-tab, command line here, TweakUI)
Icon Restore
Psi (Jabber IM client)
Gaim (Multi protocol IM client)
Process Explorer (Task manager replacement)
DAEMON Tools (virtual CD/DVD driver)
Java 2 SE 1.5 or greater (replace Microsoft JVM)
Privoxy (filtering HTTP/HTTPS proxy server)
TortoiseCVS/TortoiseSVN (shell integration for revision control systems)

UNIX malloc() debugging

Tuesday, January 30th, 2007

If you suspect memory leaks or corruption in your program due to the improper use of pointers, here are several things you can use, ordered by simplicity:

  • GNU libc’s malloc check. Set the MALLOC_CHECK_ environment variable to 1 to spam stderr when corruption is detected, or 2 to call abort() when corruption is detected.
  • Dmalloc. A debug malloc library that works as a drop-in replacement for the system malloc(), realloc(), and free() functions.
  • Electric Fence. This is a replacement malloc library similar to Dmalloc. Its manner of operation is to use the hardware memory protection of the processor to cause an immediate segmentation fault if the process attempts to write to memory preceding or following an allocated region.
  • Valgrind’s memcheck tool. Valgrind runs the program under an emulated CPU, so the execution is quite a bit slower. However, it catches all memory corruption and pointer errors at runtime and prints a backtrace. It also prints memory leak and other statistics at the program exit. Invoke with valgrind --tool=memcheck <program>.

Here is another list I ran across.
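For the first item in the list above, the variable only needs to be set in the environment of the program under test. A small wrapper sketch, with made-up function names, assuming a glibc that still honors MALLOC_CHECK_ without extra setup:

```python
import os
import subprocess


def malloc_check_env(level, base=None):
    """Return a copy of the environment with MALLOC_CHECK_ set to level."""
    if level not in (1, 2):
        raise ValueError("this sketch only covers the two levels described above")
    env = dict(os.environ if base is None else base)
    env["MALLOC_CHECK_"] = str(level)
    return env


def run_checked(argv, level=2):
    """Run argv with glibc's malloc checking enabled; return its exit status."""
    return subprocess.run(argv, env=malloc_check_env(level)).returncode
```

With level 2 a detected corruption ends in abort(), which subprocess reports as a negative return code on UNIX.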