I recently had a problem booting CentOS 5.2 on a Supermicro H8QMi-2 motherboard with 4 physical CPUs and 32 GB of RAM. Using the PAE kernel image (yes, still 32-bit for now), the machine would start to boot and then, after initializing the CPUs, just hang. No kernel panic and no message. Booting the non-PAE kernel image worked fine, so the problem had something to do with the PAE kernel. The PAE kernel would also boot fine with only 1 CPU installed, but I needed all 4 CPUs in, and I needed the PAE kernel so I could address all of the memory on 32-bit Linux. Searching Google gave me nothing helpful.
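Before blaming the kernel, it is worth confirming that every CPU the OS sees actually advertises PAE. Here is a minimal sketch in Python, assuming an x86 Linux box where /proc/cpuinfo uses the usual "processor" and "flags" lines. The function name `pae_report` and the sample text are mine, just for illustration.

```python
def pae_report(cpuinfo_text):
    """Return (logical_cpu_count, all_have_pae) from /proc/cpuinfo-style text."""
    cpu_count = 0
    flag_sets = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        if key == "processor":
            cpu_count += 1
        elif key == "flags":
            flag_sets.append(set(value.split()))
    # True only if at least one flags line exists and all of them list "pae".
    all_pae = bool(flag_sets) and all("pae" in flags for flags in flag_sets)
    return cpu_count, all_pae


# Hypothetical two-CPU sample, trimmed to the lines that matter here.
sample = (
    "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae mce\n\n"
    "processor\t: 1\nflags\t\t: fpu vme de pse tsc msr pae mce\n"
)
print(pae_report(sample))

# Why PAE matters with 32 GB: plain 32-bit addressing reaches 2**32 bytes,
# while PAE widens physical addresses to 36 bits.
print(2**32 // 2**30, "GiB without PAE,", 2**36 // 2**30, "GiB with PAE")
```

On a real machine you would pass in the contents of /proc/cpuinfo instead of the sample string. Keep in mind that the file only describes CPUs the running kernel brought up, so it helps most when compared between the non-PAE boot and a single-CPU PAE boot.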
So I decided to check the manufacturer's website to see if they had any info on this problem. I found nothing, but while I was there I decided to see if the board had a newer BIOS with a possible fix. On the download page for the board's BIOS I found this warning: "Please do not download / upgrade the BIOS UNLESS your system has a BIOS-related issue". So I thought, how do I know if my issue is BIOS-related unless you tell me what you fixed in each release? Supermicro, in its infinite wisdom, does not publish a changelog for any of its BIOS releases, so you have no idea what was fixed or whether your issue is among the fixes. The newest BIOS had been released 2 weeks earlier and the BIOS on the board had not been updated in a year, so I decided to take my chances and download it.
After downloading it and putting the files on a bootable USB stick with DOS on it, I flashed the new BIOS to the board. The flash program said it completed successfully and dropped me back to a prompt. I hit the reset button and was rewarded with nothing. Nothing on screen and no beeps. I tried removing the power and plugging it back in, clearing the CMOS, removing the CMOS battery for 30 minutes, and so on. Nothing worked.
I decided to try the emergency BIOS recovery that Supermicro built into the board. This consists of hooking up a floppy drive and putting a BIOS file called super.rom on a bootable floppy. Then you hold down the Ctrl and Home keys while powering on the system (I found out that the Home key you need to hold down may be the one on the number pad). Doing this did nothing. There was none of the beeping that is supposed to happen. Swapping out cables and drives did not help either.
Finally, in desperation, I pulled the other 3 processors. The system booted right up. I put the other 3 processors back in and it booted right up again. Having only one processor installed while upgrading the BIOS on a quad-socket board was not mentioned anywhere. Not on the support site. Not in the BIOS manual. Not in the files downloaded with the BIOS. NOWHERE!
This is truly frustrating and an embarrassment for Supermicro. First they don't tell anyone what fixes are in each BIOS release, and then they don't document a primary step of upgrading the BIOS. Which reminds me of a third example of Supermicro's lack of documentation: if you only put one processor in this board, you can't use all of the card slots in the back. Another failure to document. I have already complained to my vendor about this and he has sent it up the chain to Supermicro. I doubt they will do anything, but it's worth a shot.
In the end the BIOS upgrade did fix the booting issue. I can now boot the latest CentOS 5.2 PAE kernel image fine.
If you have a Supermicro H8DA8 motherboard with the latest 5/22/06 BIOS, heed this warning. If you have the onboard SCSI controller turned on and you turn both serial ports off in the BIOS config, your machine will hang on the next boot. The hang shows up as a blank screen after POST with a blinking cursor in the upper left-hand corner. Supermicro has acknowledged this problem in my correspondence with them. They said it's a problem with the Adaptec controller on the board. The only way to avoid it is to leave the serial ports turned on.
If you do turn them off and get the blank screen, you will not be able to get back into the BIOS. To remedy this you will need to disable the SCSI chip with its jumper on the motherboard. The jumper for that is JPA1. Then boot, go into the BIOS and turn the serial ports back on. After that it's up to you to jumper the SCSI chip back on if you like.
I was getting the error "error loading operating system" after restoring a partition image with partimage. I thought the problem was the master boot record or the boot partition, so I booted the Windows rescue CD, went to a rescue prompt and typed "fixmbr" and "fixboot". That did not work; it just messed up the partition table. I even tried the old DOS boot disk and the "fdisk /mbr" trick (I've read that you should not use the fdisk /mbr command on WinXP, 2k and later machines). Still nothing, same error. Then I found out that the crap computer I was using had a BIOS that, when set to "auto detect", decided not to choose "LBA" for its block mode. Forcing this setting to "LBA", reinstalling the OS and then redoing the image worked: I made the partition again, put the image back on, and voilà, it works. So remember: take the BIOS off "Auto" and select "LBA" if you see this error. And don't forget to set the bootable flag after you restore the partition. If you forget, you'll get "No operating system found".
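If you want to check the bootable flag without trusting a partitioning tool, you can read it straight out of the first sector of the disk. This is a rough sketch in Python, assuming a classic MBR-partitioned disk (not GPT). The name `read_partition_table` is my own, and the layout used is the standard one: four 16-byte entries starting at byte 446, with a status byte of 0x80 marking the active partition.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446
ENTRY_SIZE = 16


def read_partition_table(mbr):
    """Parse the four primary partition entries from a 512-byte MBR sector."""
    if len(mbr) < SECTOR_SIZE:
        raise ValueError("need a full 512-byte sector")
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        raw = mbr[start:start + ENTRY_SIZE]
        # Fields: status, CHS start (3 bytes), type, CHS end (3 bytes),
        # starting LBA, sector count. All little-endian.
        status, _chs_first, ptype, _chs_last, lba_start, sectors = struct.unpack(
            "<B3sB3sII", raw
        )
        if ptype == 0:
            continue  # empty slot
        entries.append({
            "number": index + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return entries
```

To use it on a real disk you would read the first 512 bytes of the device as root, for example `open("/dev/sda", "rb").read(512)` (the device name is only an example), and pass them in. If no entry comes back with `bootable` set to True, the flag was lost in the restore.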