SuSE 10.0 Something broke: /dev/hd* and friends no longer get created, boot fails



06-08-2006
Ken Ryan

Hello.

Last night I put a new I/O board in my machine, which means I had to
boot for the first time in about two months. Something happened where I
can no longer boot.

A little while ago I added a rule file to /etc/udev/rules.d (a
99-something which attempted to set permissions on /dev/ttyS0) but never
tested it across a boot.

First, rest assured I reversed the hardware and udev change, so my
system should be the same as it was before. When getting ready to make
the change I did a proper shutdown etc.

I was negligent in three respects: I didn't keep up with my backups
(most recent is a week or so ago), I never boot-tested my udev rules
change, and I've been periodically running YOU (YaST Online Update) and
installing everything it suggests, including the kernel, without
rebooting the machine to check that all is well. So this problem could
have been caused by something that happened, or something I did, as long
as two months ago, and I only ran across it now.

Here is my machine configuration:

- 2.4(?) GHz P4, 1GB RAM, NVidia video, 10/100+USB1.1+1394 combo card,
Audigy 2, USB2+FW combo card, Promise 20269-based IDE card, two hard
disks (hda and hdg, both WD 120GB), LG dvd/cd writer, IDE zip drive
(Dell Dimension 8200 with a couple peripheral changes)
- Boot on hda1, swap on hda2 and hdg2, root on md0=hda3+hdg3, /home on
md1=hda4+hdg4. The hdg1 partition is mounted on /altboot; I was going
to rsync /boot onto it but I never got around to it.
- All filesystems are ext3
- Was running KDE with the NVidia driver

When I shut down to add the new board one thing was a little odd - when
I logged out of my user KDE session I was dropped to a console prompt
rather than an xdm screen. I assumed that was simply because I had a
YOU kernel update that hadn't been booted yet. I logged in as
root at the console prompt and executed 'halt'. The system seemed to
shut down OK at that point (I use the verbose boot, no splash screen).

I made the hardware change, then powered on. The kernel booted, initrd
loaded, / passed fsck and was mounted (it forced fsck due to being 63
days since last fsck). It detected and assembled both raid1 volumes BUT
fsck failed on hda1 and hdg1. At first I thought "great, disk error or
something". It dropped me to single-user, and when I tried rerunning
fsck I realized it failed because /dev did not contain any hd* devices.

I rebooted into "failsafe" with the same results except this time md0
and md1 got fsck forced because it claimed 49710 days elapsed since last
fsck. That makes me uncomfortable, obviously, but the root partition
(md0) at least seemed to be OK from within single-user.
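
A side note on that number, offered as a guess rather than a diagnosis: 49710 days is almost exactly 2^32 seconds. That would fit an unsigned 32-bit seconds counter wrapping around, for instance if the clock read at boot was earlier than the stored last-check time. The sketch below only demonstrates the arithmetic; the wraparound explanation, the function name and the timestamps are all assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def wrapped_days(now, last_check):
    """Whole days between two timestamps when the difference is held in an
    unsigned 32-bit counter (assumed behaviour, not confirmed for fsck here)."""
    return ((now - last_check) % 2**32) // SECONDS_PER_DAY

# Whole days contained in 2**32 seconds.
print(2**32 // SECONDS_PER_DAY)  # prints 49710

# Hypothetical timestamps: the clock reads one hour *before* the last check.
print(wrapped_days(now=1000000000, last_check=1000003600))  # prints 49710
```

If this reading is right, the figure says more about the clock seen at boot than about the state of the filesystems.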

It was at this point that I reverted the hardware change and my
/etc/udev/rules.d/99-foo file (by removing the 99-foo file).

Right now, whether I try to boot into failsafe or normal mode, I end up
with /dev/hd* missing (/dev/md* is there). If I reboot into the same
mode fsck doesn't get forced; if I switch from normal to failsafe or
vice versa I get that weird 49710-day fsck (always the same number). It
also doesn't matter whether I reboot or halt/powerdown then boot.

A few things I was able to find:

- It appears that /etc/init.d/boot.udev did not get run. I haven't
figured out yet whether it is supposed to run before or after
boot.localfs (which is where I end up in the single-user shell).

- Sometimes udevd is running when I'm in single-user, sometimes not. I
haven't figured out the pattern yet. As I write this, I have booted
failsafe and am in single-user with udevd running, and /dev is missing
the hd* files.

- If I run boot.udev force-reload I get a properly populated /dev.

- Note: While I'm concentrating on /dev/hd* (especially /dev/hda1)
missing, I have not checked if that is the only thing missing. As I
write this, /dev has some files such as tty*, lp*, parport*, ippp*,
isdn*, console, and the misc devices (zero, mem, null, etc.).

- /proc and /sys are mounted and appear to be OK. Particularly I
checked that /sys/block is OK, including /sys/block/hda/hda1.

- If I cd to /dev and run 'df' I see "-" as the device and "/dev" as the
mount point (I don't know if that's normal or not).

- Booting with the installation DVD (OpenSuSE Eval DVD for 10.0) comes
up to the installation screens OK, but the repair options don't work
because they can't figure out where my root is. It appears to find hda1 OK.

- I tried searching google and google-groups for anything related to
this but the only clue I was able to find was to verify /sys/block. I
was unable to come up with a search string that produced something
useful (a common problem with me, unfortunately).
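
The /sys/block versus /dev comparison from the list above could be automated to see exactly which nodes are missing. What follows is a rough sketch, not something that was run on the affected machine. It assumes a Python interpreter is usable from the single-user shell, that every block device and partition directory in sysfs contains a `dev` file, and that each node is expected under its kernel name (rules may legitimately rename nodes). The function name `missing_block_nodes` is made up for illustration.

```python
import os

def missing_block_nodes(sys_block="/sys/block", dev_dir="/dev"):
    """Return the sorted names of block devices and partitions that sysfs
    lists but that have no entry of the same name in dev_dir."""
    missing = []
    for disk in sorted(os.listdir(sys_block)):
        disk_path = os.path.join(sys_block, disk)
        # A block device directory in sysfs carries a 'dev' file (major:minor).
        if not os.path.isfile(os.path.join(disk_path, "dev")):
            continue
        names = [disk]
        # Partitions are subdirectories that carry their own 'dev' file.
        for entry in sorted(os.listdir(disk_path)):
            if os.path.isfile(os.path.join(disk_path, entry, "dev")):
                names.append(entry)
        for name in names:
            # lexists() also counts dangling symlinks as present.
            if not os.path.lexists(os.path.join(dev_dir, name)):
                missing.append(name)
    return sorted(missing)
```

Calling `missing_block_nodes()` with no arguments checks the live system; passing other directories makes it possible to try the function against a copy or a test tree first.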

I appreciate any suggestions of what to try or what to look at.
Hopefully this afternoon I'll have another 10.0 installation on another
machine I can compare against, at least so I can see what is right and
what is broken. Obviously I'm most suspicious that my attempt to use
udev rules to modify ttyS0 permissions royally screwed things up - I'd
never tried writing a udev rule before. I've reverted the file change
as I mentioned, but I'd guess if the saved udevdb got messed up maybe
that's what's wrong. I haven't posted to the udev lists, though; I want
to see if there might be another reason or suggestion.

Thanks in advance!

ken

06-08-2006
Ken Ryan

Further investigation shows something really bizarre.

When I run udevinfo, e.g.

udevinfo -q all -p /sys/block/hda/hda1

all the lines look OK *except* that the line

N: ttyS0

shows up for every device I query. The same line is also in the
/dev/.udevdb files.
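
Checking many devices by eye gets tedious, so the comparison could be scripted. The sketch below is illustrative only: it takes the text printed by the udevinfo command above, pulls out the `N:` line, and compares it with the last component of the sysfs path. It assumes the node is expected to carry the kernel name, which holds for hd* devices but not for every device class. The function names and the sample text are made up.

```python
def node_name(udevinfo_output):
    """Return the value of the first 'N:' line, or None if there is none."""
    for line in udevinfo_output.splitlines():
        if line.startswith("N:"):
            return line[2:].strip()
    return None

def name_mismatch(sysfs_path, udevinfo_output):
    """True when the recorded node name differs from the kernel name,
    taken here to be the last component of the sysfs path."""
    expected = sysfs_path.rstrip("/").split("/")[-1]
    return node_name(udevinfo_output) != expected

# Made-up sample resembling the symptom described above.
sample = "P: /block/hda/hda1\nN: ttyS0\n"
print(name_mismatch("/sys/block/hda/hda1", sample))  # prints True
```

Feeding it the output for each path under /sys/block would show whether every block device has picked up the wrong name or only some of them.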

I'm certain now that my attempt to write a rule for permissions on ttyS0
is the cause of this. The question is how do I fix it? I removed the
rule I wrote, but something is remembering it. I looked around with
find and grep but I don't know udev and the SuSE boot process well at
all, so I'm having no luck figuring out where the problem is.

Again, any tips would be immensely appreciated!

Thanks...

ken
