Re: SuSE 10.0 Something broke: /dev/hd* and friends no longer get created, boot fails
Ken Ryan wrote:
> Hello.
>
> Last night I put a new I/O board in my machine, which means I had to
> boot for the first time in about two months. Something happened where I
> can no longer boot.
>
> A little while ago I added a rule file to /etc/udev/rules.d (a
> 99-something which attempted to set permissions on /dev/ttyS0) but never
> tested it across a boot.
>
> First, rest assured I reversed the hardware and udev change, so my
> system should be the same as it was before. When getting ready to make
> the change I did a proper shutdown etc.
>
> I was negligent in three respects: I didn't keep up with my backups
> (most recent is a week or so ago), I never boot-tested my udev rules
> change, and I've been periodically running YOU updating everything it
> suggests including the kernel and whatnot but I hadn't been rebooting
> the machine to ensure all is well. So this problem could be caused by
> something that happened or something I did as long as two months ago and
> I didn't run across it until now.
>
> Here is my machine configuration:
>
> - 2.4(?) GHz P4, 1GB RAM, NVidia video, 10/100+USB1.1+1394 combo card,
> Audigy 2, USB2+FW combo card, Promise 20269-based IDE card, two hard
> disks (hda and hdg, both WD 120GB), LG dvd/cd writer, IDE zip drive
> (Dell Dimension 8200 with a couple peripheral changes)
> - Boot on hda1, swap on hda2 and hdg2, root on md0=hda3+hdg3, /home on
> md1=hda4+hdg4. The hdg1 partition is mounted on /altboot; I was going
> to rsync /boot onto it but I never got around to it.
> - All filesystems are ext3
> - Was running KDE with the NVidia driver
>
> When I shut down to add the new board one thing was a little odd - when
> I logged out of my user KDE session I was dropped to a console prompt
> rather than an xdm screen. I assumed that was simply because I had a
> YOU kernel update that hadn't been booted yet. I logged in as
> root at the console prompt and executed 'halt'. The system seemed to
> shut down OK at that point (I use the verbose boot, no splash screen).
>
> I made the hardware change, then powered on. The kernel booted, initrd
> loaded, / passed fsck and was mounted (it forced fsck due to being 63
> days since last fsck). It detected and assembled both raid1 volumes BUT
> fsck failed on hda1 and hdg1. At first I thought "great, disk error or
> something". It dropped me to single-user, and when I tried rerunning
> fsck I realized it failed because /dev did not contain any hd* devices.
>
> I rebooted into "failsafe" with the same results except this time md0
> and md1 got fsck forced because it claimed 49710 days elapsed since last
> fsck. That makes me uncomfortable, obviously, but the root partition
> (md0) at least seemed to be OK from within single-user.
>
> It was at this point that I reverted the hardware change and my
> /etc/udev/rules.d/99-foo file (by removing the 99-foo file).
>
> Right now whether I try to boot into failsafe or normal mode I end up
> with /dev/hd* missing (/dev/md* is there). If I reboot into the same
> mode fsck doesn't get forced; if I switch from normal to failsafe or
> vice versa I get that weird 49710-day fsck (always the same number). It
> also doesn't matter whether I reboot or halt/powerdown then boot.
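A side note on that number, since it never changes: 49710 days is what 2^32 seconds comes to in whole days. That would be consistent with an unsigned 32-bit time difference wrapping around, for instance if the clock at boot reads earlier than the recorded last-check time. That cause is only an assumption and is not confirmed anywhere in this thread; the arithmetic itself is easy to check. The function name below is made up for illustration.

```shell
# Whole days contained in 2^32 seconds. Only the arithmetic is shown here;
# that fsck's day count comes from a wrapped 32-bit difference is an
# assumption, not something verified against the fsck source.
days_in_u32_seconds() {
    echo $(( 4294967296 / 86400 ))
}
```

Calling `days_in_u32_seconds` prints 49710, the same figure fsck reports.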
>
> A few things I was able to find:
>
> - It appears that /etc/init.d/boot.udev did not get run. I haven't
> figured out yet when it is supposed to run, i.e. whether it's before or after
> boot.localfs (where I end up in single-user shell).
>
> - Sometimes udevd is running when I'm in single-user, sometimes not. I
> haven't figured out the pattern yet. As I write this, I booted failsafe
> and am in single-user with udevd running and /dev is missing the hd* files.
>
> - If I run boot.udev force-reload I get a properly populated /dev.
>
> - Note: While I'm concentrating on /dev/hd* (especially /dev/hda1)
> missing, I have not checked if that is the only thing missing. As I
> write this, /dev has some files such as tty*, lp*, parport*, ippp*,
> isdn*, console, and the misc devices (zero, mem, null, etc.).
>
> - /proc and /sys are mounted and appear to be OK. Particularly I
> checked that /sys/block is OK, including /sys/block/hda/hda1.
>
> - If I cd to /dev and run 'df' I see "-" as the device and "/dev" as the
> mount point (I don't know if that's normal or not).
>
> - Booting with the installation DVD (OpenSuSE Eval DVD for 10.0) comes
> up to the installation screens OK, but the repair options don't work
> because they can't figure out where my root is. It appears to find hda1
> OK.
>
> - I tried searching google and google-groups for anything related to
> this but the only clue I was able to find was to verify /sys/block. I
> was unable to come up with a search string that produced something
> useful (a common problem with me, unfortunately).
>
> I appreciate any suggestions of what to try or what to look at.
> Hopefully this afternoon I'll have another 10.0 installation on another
> machine I can compare against, at least so I can see what is right and
> what is broken. Obviously I'm most suspicious that my attempt to use
> udev rules to modify ttyS0 permissions royally screwed things up - I'd
> never tried writing a udev rule before. I've reverted the file change
> as I mentioned, but I'd guess if the saved udevdb got messed up maybe
> that's what's wrong. I haven't posted to the udev lists, though; I want
> to see if there might be another reason or suggestion.
>
> Thanks in advance!
>
> ken
>
Further investigation shows something really bizarre.
When I run udevinfo, e.g.
    udevinfo -q all -p /sys/block/hda/hda1
all the lines look OK *except* that the line
    N: ttyS0
appears for every device I query. The same name is also recorded in the
/dev/.udevdb files.
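To see how widespread the bad name is, something like the following could tally the names recorded in the database. It is a rough sketch: `udevdb_name_counts` is a made-up helper, and it assumes the entries under /dev/.udevdb are plain text files carrying the same `N:` line that udevinfo prints.

```shell
# Count how often each recorded node name ("N:" line) appears under a udev
# database directory. The "N:" prefix is taken from the udevinfo output;
# the exact on-disk layout of /dev/.udevdb is an assumption.
udevdb_name_counts() {
    dbdir=${1:-/dev/.udevdb}
    # -r: recurse, -h: suppress file names so only the matching lines remain
    grep -rh '^N:' "$dbdir" 2>/dev/null \
        | sed 's/^N:[[:space:]]*//' \
        | sort | uniq -c | sort -rn
}
```

Run as `udevdb_name_counts /dev/.udevdb`; on a healthy system each name should show up once, so a large count next to ttyS0 would confirm that the name has been stamped onto unrelated devices.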
I'm certain now that my attempt to write a rule for permissions on ttyS0
is the cause of this. The question is how do I fix it? I removed the
rule I wrote, but something is remembering it. I looked around with
find and grep but I don't know udev and the SuSE boot process well at
all, so I'm having no luck figuring out where the problem is.
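For anyone retracing the search, this is roughly the find-and-grep pass described above, wrapped up as a helper. It is only a sketch: `find_mentions` is a hypothetical name, and which directories are worth searching is a guess on my part.

```shell
# List every file under the given directories that mentions a string,
# with a long listing so the timestamps can be compared against the date
# the rule was added. The choice of directories is left to the caller.
find_mentions() {
    needle=$1
    shift
    # -r recurse, -l print only file names, -F treat needle as a fixed string
    grep -rlF -- "$needle" "$@" 2>/dev/null | while IFS= read -r f; do
        ls -ld -- "$f"
    done
}
```

For example, `find_mentions ttyS0 /etc/udev /dev/.udevdb` would list any leftover rule or database file that still carries the name.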
Again, any tips would be immensely appreciated!
Thanks...
ken