Dealing with some OpenAFS issues

Some OpenAFS versions earlier than 1.4.1 have a fileserver bug that causes tokens to be randomly discarded. The fix is to upgrade the fileserver to at least 1.4.1.

Some earlier OpenAFS versions allowed a principal with a ‘.’ (period) in its name. This no longer works. The symptom is that you can obtain a token, but any filesystem operation that requires authentication hangs briefly and the token is then discarded. The fix is to rename all such principals, replacing the periods with a different character such as an underscore.
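To plan the renames, something like the following sketch may help. It assumes you already have the principal names in a plain list, one per line, from whatever tool you use to manage them. The function name suggest_renames is made up for illustration. It only prints old and new names and does not rename anything.

```shell
#!/bin/sh
# Hypothetical helper: read names on stdin, one per line, and print
# "old new" for every name that contains a period.
suggest_renames() {
  while IFS= read -r name; do
    case "$name" in
      *.*)
        # Replace every period with an underscore.
        new=$(printf '%s' "$name" | tr '.' '_')
        printf '%s %s\n' "$name" "$new"
        ;;
    esac
  done
}
```

For example, feeding it the names alice and bob.admin prints a single line, "bob.admin bob_admin". Check that the new name is not already taken before renaming.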

OpenAFS still has some issues with PAG support on Linux 2.6. In my case, an authenticated shell process that forked and ran another shell process left the child process with no tokens. Strangely, a shell process that runs other system programs that are not shells, such as ‘mv’ or ‘rm’, succeeds. To work around this, convert maintenance shell scripts that run authenticated to “source” each other with a . (dot) rather than call each other. This requires some changes to error/exit handling, but it completely worked around the PAG issue here.
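Here is a minimal sketch of the kind of error/exit change involved. The file name step.sh, the variables TARGET and STEP_RESULT, and the path are all made up for illustration. The main point is that a sourced script runs inside the caller's shell, so it should use "return" where it previously used "exit", otherwise it ends the calling script too.

```shell
#!/bin/sh
# Build a throwaway helper script so this example is self-contained.
tmpdir=$(mktemp -d)
cat > "$tmpdir/step.sh" <<'EOF'
# Written to be sourced: 'return' ends only this file,
# while 'exit' would terminate the caller as well.
[ -n "$TARGET" ] || return 2
STEP_RESULT="processed $TARGET"
return 0
EOF

# Hypothetical input for the helper.
TARGET=/afs/example

# Source the helper instead of executing it, so no child shell is forked.
if . "$tmpdir/step.sh"; then
  echo "step ok: $STEP_RESULT"
else
  echo "step failed with status $?" >&2
fi

rm -rf "$tmpdir"
```

Because everything runs in one shell, variables set by the helper (STEP_RESULT here) remain visible to the caller, so watch for name clashes between scripts.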

Sometimes you might notice your OpenAFS fileserver restarting periodically, with no cron job to explain it. The culprit is the restart schedule kept by the BOS server, which is set with the bos setrestart command. By default, the fileserver restarts at 4:00am every Sunday, and at 5:00am every day if new binaries are detected.
To disable the periodic restart, issue the following command as root:
# bos setrestart server.domain.com -time never
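As far as I can tell, bos setrestart changes only the weekly (general) restart time unless -newbinary is given, so the daily new-binary check needs its own call. The sketch below wraps both calls and then prints the resulting schedule with bos getrestart. The function name disable_restarts and the BOS variable are made up here so the commands can be dry-run, and you may need admin tokens or -localauth depending on your setup.

```shell
#!/bin/sh
# BOS can be overridden (e.g. BOS="echo bos") to preview the commands.
BOS=${BOS:-bos}

# Hypothetical wrapper: turn off both restart times on one server,
# then show the schedule so the change can be confirmed.
disable_restarts() {
  server=$1
  $BOS setrestart "$server" -time never -general
  $BOS setrestart "$server" -time never -newbinary
  $BOS getrestart "$server"
}
```

Run it as: disable_restarts server.domain.com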
