Misconfiguration of ethernet interfaces on reboot (Linux Networking forum, Linux Forums category)
Hello everybody,
Recently I have hit the same error on two different machines running RedHat Linux 9 (kernel 2.4.20-8). Each machine has 3 NICs: 2 of them are built into the motherboard (one network card with 2 NICs: one Intel e100 and one Intel e1000) and the other is an Intel e1000 fiber network card installed in a PCI-Express slot. The interface aliases are assigned like this (looking at /etc/modules.conf):

eth0: e100 NIC on the motherboard
eth1: e1000 NIC on the motherboard
eth2: e1000 NIC in the PCI slot

The whole system was working properly until the other day, when the machine rebooted (due to the softdog) and the interface configuration came up wrong: the system couldn't find the eth2 interface, and it also tried to assign the e1000 module to eth0, which failed, so the interface did not come up. In fact, the e100 module was not loaded at all (checked with lsmod).

When I tried to reconfigure the interfaces through the redhat-config-network assistant, a message box appeared telling me that I was wrongly trying to assign the e100 module to eth0 when it needed the e1000 module (which is false).

It seems as if something is saying that interface eth0 needs the e1000 module, but neither the /etc/modules.conf file nor the /etc/sysconfig/network-scripts/ifcfg-ethX files had changed. I looked at the /etc/sysconfig/hwconf file and it was the same as a few weeks ago. No changes. But I noticed that in the interface name field the COMPLETE interface name was not given. I mean, it was written just as eth, without 0, 1 or 2, instead of eth0, eth1, eth2.
The network part of the hwconf file looks like this:

class: NETWORK
bus: PCI
detached: 0
device: eth
driver: e100
desc: "Intel Corp.|82801BD PRO/100 VE (LOM) Ethernet Controller"
vendorId: 8086
deviceId: 1039
subVendorId: 8086
subDeviceId: 103a
pciType: 1
-
class: NETWORK
bus: PCI
detached: 0
device: eth
driver: e1000
desc: "Unknown vendor|Generic e1000 device"
vendorId: 8086
deviceId: 1076
subVendorId: 8086
subDeviceId: 1076
pciType: 1
-
class: NETWORK
bus: PCI
detached: 0
device: eth
driver: e1000
desc: "Unknown vendor|Generic e1000 device"
vendorId: 8086
deviceId: 1027
subVendorId: 8086
subDeviceId: 1027
pciType: 1

In other (newer) RedHat systems (kernel 2.6.9-55), the interface name in that file is complete (eth0, eth1, eth2).

The /etc/modules.conf file looks like this:

alias eth0 e100
alias eth1 e1000
alias eth2 e1000

Does anybody know if there is a bug in the hwconf file or something related to those files? I have spent several days looking for a clue about this, but I haven't found anything.

Thanks for any advice,
Fionn
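The symptom described in this post (a "device:" field that reads just "eth") can be checked mechanically. The following is only a minimal sketch in Python, assuming hwconf is a series of "key: value" records separated by a line holding a single "-", as in the excerpt. The function names parse_hwconf and incomplete_network_devices are hypothetical, the SAMPLE text is an abbreviated illustration rather than a real file, and nothing here is part of Red Hat's own tooling.

```python
import re

# Abbreviated, illustrative sample in the layout quoted in the post.
SAMPLE = """\
class: NETWORK
bus: PCI
detached: 0
device: eth
driver: e100
desc: "Intel Corp.|82801BD PRO/100 VE (LOM) Ethernet Controller"
vendorId: 8086
deviceId: 1039
-
class: NETWORK
bus: PCI
detached: 0
device: eth1
driver: e1000
desc: "Unknown vendor|Generic e1000 device"
vendorId: 8086
deviceId: 1076
"""


def parse_hwconf(text):
    """Split hwconf-style text into a list of dicts, one per '-'-separated record."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line == "-":
            # A lone dash closes the current record.
            if current:
                records.append(current)
            current = {}
        elif ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def incomplete_network_devices(records):
    """Return NETWORK records whose device name carries no trailing index."""
    return [
        rec for rec in records
        if rec.get("class") == "NETWORK"
        and not re.fullmatch(r"[a-z]+\d+", rec.get("device", ""))
    ]


if __name__ == "__main__":
    for rec in incomplete_network_devices(parse_hwconf(SAMPLE)):
        print("incomplete device name:", rec.get("device"), "driver:", rec.get("driver"))
```

To try it on a real machine, read a copy of /etc/sysconfig/hwconf into a string and pass it to parse_hwconf in place of SAMPLE.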
On Mon, 26 May 2008, in the Usenet newsgroup comp.os.linux.networking, in
article <52633167-2a60-4a3c-89c4-e80c4c64f04d@f36g2000hsa.googlegroups.com>, Fionn wrote:

NOTE: Posting from groups.google.com (or some web-forums) dramatically reduces the chance of your post being seen. Find a real news server.

>recently I have the same error in two different machines running
>RedHat Linux 9 (kernel 2.4.20-8).

That's the original kernel on an unmaintained 5 year old system. There were at least 9 kernel errata during the supported life, and three more backports - ending with 2.4.20-46.9.legacy in March 2006.

>All the system was working properly until the other day the machine
>made a reboot (due to the softdog)

Both systems suffered the same fault at the same time???

>and the interfaces' configuration became wrong: the system couldn't
>find the eth2 interface and also it tried to assign eth0 a e1000
>module, but it couldn't, so it didn't result in a good rising of the
>interface and failed to be up. Indeed, the e100 module was not loaded
>(consulting via lsmod)

That's usually a hardware or BIOS problem, where the e100 NIC isn't being found for some reason. Look at the boot messages in /var/log/messages relating to finding the cards. RH9 had a piece of crap application called 'kudzu' that was meant to reconfigure the system when hardware changed. I usually uninstalled that package as my hardware wasn't being changed every time the systems rebooted.

>driver: e1000
>desc: "Unknown vendor|Generic e1000 device"
>vendorId: 8086
>deviceId: 1076

That's another indication that the kernel is obsolete - the 1076 device should be identified as a 82541GI Gigabit Ethernet Controller or a PRO/1000 MT

>driver: e1000
>desc: "Unknown vendor|Generic e1000 device"
>vendorId: 8086
>deviceId: 1027

and that should be a 82545GM Gigabit Ethernet Controller or a PRO/1000 MF Server Adapter(LX). Nonetheless, the e1000 is the correct driver for both of these cards, and the e100 is correct for the 82801BD PRO/100 VE NIC.

Old guy
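The driver-to-card mapping given in this reply can be turned into a small consistency check against the modules.conf aliases. This is a rough sketch in Python under stated assumptions: the KNOWN_NICS table holds only the three PCI IDs named in this thread (it is not a real ID database), the function names are hypothetical, and the mapping of interface names to PCI IDs has to be supplied by hand.

```python
# PCI vendor/device IDs and names exactly as discussed in this thread.
# This is an illustrative table, not a complete PCI ID database.
KNOWN_NICS = {
    ("8086", "1039"): ("e100", "82801BD PRO/100 VE (LOM) Ethernet Controller"),
    ("8086", "1076"): ("e1000", "82541GI Gigabit Ethernet Controller"),
    ("8086", "1027"): ("e1000", "82545GM Gigabit Ethernet Controller"),
}


def parse_aliases(modules_conf_text):
    """Return {interface: module} from 'alias ethN module' lines."""
    aliases = {}
    for line in modules_conf_text.splitlines():
        # Drop trailing comments, then split into words.
        parts = line.split("#", 1)[0].split()
        if len(parts) == 3 and parts[0] == "alias" and parts[1].startswith("eth"):
            aliases[parts[1]] = parts[2]
    return aliases


def check_assignment(aliases, detected):
    """Compare aliases with the driver expected for each detected card.

    detected maps an interface name to a (vendorId, deviceId) pair.
    Returns a list of human-readable mismatch messages (empty if all agree).
    """
    problems = []
    for iface, ids in sorted(detected.items()):
        expected = KNOWN_NICS.get(ids)
        if expected is None:
            problems.append(f"{iface}: unknown device {ids[0]}:{ids[1]}")
        elif aliases.get(iface) != expected[0]:
            problems.append(f"{iface}: alias is {aliases.get(iface)}, expected {expected[0]}")
    return problems


if __name__ == "__main__":
    conf = "alias eth0 e100\nalias eth1 e1000\nalias eth2 e1000\n"
    cards = {"eth0": ("8086", "1039"), "eth1": ("8086", "1076"), "eth2": ("8086", "1027")}
    print(check_assignment(parse_aliases(conf), cards) or "aliases match the expected drivers")
```

With the aliases and IDs from the original post the check reports no mismatch, which agrees with the statement above that e100 and e1000 are the correct drivers for these cards.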
On 26 May, 22:41, ibupro...@painkiller.example.tld (Moe Trin) wrote:
> On Mon, 26 May 2008, in the Usenet newsgroup comp.os.linux.networking, in
> article <52633167-2a60-4a3c-89c4-e80c4c64f...@f36g2000hsa.googlegroups.com>,
> Fionn wrote:
>
> >recently I have the same error in two different machines running
> >RedHat Linux 9 (kernel 2.4.20-8).
>
> That's the original kernel on an unmaintained 5 year old system.
> There were at least 9 kernel errata during the supported life, and
> three more backports - ending with 2.4.20-46.9.legacy in March 2006.

Because of a hardware dependency (a MOXA multiport communication card), I must use that OS with that kernel.

> >All the system was working properly until the other day the machine
> >made a reboot (due to the softdog)
>
> Both systems suffered the same fault at the same time???

Yes, within a period of one week both machines suffered the same fault. And they had been working fine for several weeks. They have no connection to the internet (only a local network between my machines).

> >and the interfaces' configuration became wrong: the system couldn't
> >find the eth2 interface and also it tried to assign eth0 a e1000
> >module, but it couldn't, so it didn't result in a good rising of the
> >interface and failed to be up. Indeed, the e100 module was not loaded
> >(consulting via lsmod)
>
> That's usually a hardware or BIOS problem, where the e100 NIC isn't
> being found for some reason. Look at the boot messages in
> /var/log/messages relating to finding the cards.

The only thing that /var/log/messages told me about network devices is that the interface eth2 (fiber card) could not be found, and so it did not appear once the system had booted.

> RH9 had a piece of
> crap application called 'kudzu' that was meant to reconfigure the
> system when hardware changed. I usually uninstalled that package as
> my hardware wasn't being changed every time the systems reboot.

I tried to launch kudzu in order to recognize the lost NICs, but it didn't tell me anything. That makes sense, since we have seen that the NICs are correctly configured in the hwconf file.

I also do that. I always switch the kudzu service off at boot. But I noticed that, when the error appeared, kudzu was configured (on the first machine) to start at runlevel 5 (the one I use), although it hadn't found anything new and so there was no change in the hwconf file.

> >driver: e1000
> >desc: "Unknown vendor|Generic e1000 device"
> >vendorId: 8086
> >deviceId: 1076
>
> That's another indication that the kernel is obsolete - the 1076 device
> should be identified as a 82541GI Gigabit Ethernet Controller or a
> PRO/1000 MT
>
> >driver: e1000
> >desc: "Unknown vendor|Generic e1000 device"
> >vendorId: 8086
> >deviceId: 1027
>
> and that should be a 82545GM Gigabit Ethernet Controller or a PRO/1000
> MF Server Adapter(LX). None the less, the e1000 is the correct driver for
> both of these cards, and the e100 is correct for the 82801BD PRO/100 VE
> NIC.

Exactly, that's the fiber card (82545GM).

I have tried manually changing the hwconf file, as I've seen that on other computers the complete interface name was written (eth0, eth1, eth2). Now I have this in hwconf:

-------------------------------------------------
class: NETWORK
bus: PCI
detached: 0
device: eth0
driver: e100
desc: "Intel Corp.|82801BD PRO/100 VE (LOM) Ethernet Controller"
vendorId: 8086
deviceId: 1039
subVendorId: 8086
subDeviceId: 103a
pciType: 1
-
class: NETWORK
bus: PCI
detached: 0
device: eth1
driver: e1000
desc: "Unknown vendor|Generic e1000 device"
vendorId: 8086
deviceId: 1076
subVendorId: 8086
subDeviceId: 1076
pciType: 1
-
class: NETWORK
bus: PCI
detached: 0
device: eth2
driver: e1000
desc: "Unknown vendor|Generic e1000 device"
vendorId: 8086
deviceId: 1027
subVendorId: 8086
subDeviceId: 1027
pciType: 1
--------------------------------------------

When I rebooted the computer (not just restarted the network services), the interfaces were there and correctly configured. I could work with them.
The next test I will run is deleting the hwconf file (as I don't use kudzu to detect new hardware) to see if everything works fine without it.

Now I need to know whether there was a problem with kudzu on that kernel and a misconfiguration of hwconf, to make sure that if I change that file on all my computers as well, there will be no more problems.

> Old guy
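The manual edit described in this post (completing each bare "eth" to eth0, eth1, eth2) could be scripted if it has to be repeated on several machines. What follows is a hedged sketch in Python, not a tested fix: it assumes the records appear in the file in the same order as the intended interface numbering, which happened to hold for the file quoted here but is not guaranteed in general. The function name number_bare_eth is hypothetical, and any leading whitespace on a rewritten line is dropped.

```python
import re


def number_bare_eth(text):
    """Rewrite 'device: eth' lines to eth0, eth1, ... in order of appearance.

    Device names that already carry an index are left untouched, and their
    numbers are skipped so that no name is handed out twice.
    """
    # Collect names that are already complete, e.g. 'device: eth1'.
    used = set(re.findall(r"^device:\s*(eth\d+)\s*$", text, flags=re.M))
    out, next_index = [], 0
    for line in text.splitlines():
        if re.fullmatch(r"device:\s*eth", line.strip()):
            # Find the lowest index that is still free.
            while f"eth{next_index}" in used:
                next_index += 1
            name = f"eth{next_index}"
            used.add(name)
            line = f"device: {name}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    before = (
        "class: NETWORK\ndevice: eth\ndriver: e100\n-\n"
        "class: NETWORK\ndevice: eth\ndriver: e1000\n-\n"
        "class: NETWORK\ndevice: eth\ndriver: e1000\n"
    )
    print(number_bare_eth(before), end="")
```

It seems safest to run this on a copy of /etc/sysconfig/hwconf, compare the result with the aliases in /etc/modules.conf, and keep the original file until a reboot has confirmed the interfaces come up.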
On Tue, 27 May 2008, in the Usenet newsgroup comp.os.linux.networking, in
article <2d718185-ced4-45fa-9657-359b49b58a11@m36g2000hse.googlegroups.com>, Fionn wrote:

NOTE: Posting from groups.google.com (or some web-forums) dramatically reduces the chance of your post being seen. Find a real news server.

>(Moe Trin) wrote:
>> Fionn wrote:
>>>RedHat Linux 9 (kernel 2.4.20-8).
>> That's the original kernel on an unmaintained 5 year old system.
>For some hardware dependencies (a MOXA multiport communication card),
>I must use that O.S. with that kernel.

Oh, I hate those kinds of problems. kernel.org is still maintaining the 2.4.x kernel, and the latest version there is 2.4.36.4, released about 3 weeks ago.

>> Both systems suffered the same fault at the same time???
>
>Yep, in a period of one week both machines suffered the same fault.
>And they were working nice since several weeks.

Stretching credibility for it to be a hardware or BIOS fault, but that is the "normal" problem. Were there any other changes in hardware?

>They have no connection to internet (only local network between my
>machines).

That's good, as the kernel errata were for at least two security problems.

>> That's usually a hardware or BIOS problem, where the e100 NIC isn't
>> being found for some reason. Look at the boot messages in
>> /var/log/messages relating to finding the cards.
>
>The only thing that /var/log/messages told me about network devices is
>that the interface eth2 (fiber card) could not be found, and so it
>didn't appear when the system was already booted.

Are there any boot messages in the older log files? ('logrotate' is usually set to rotate /var/log/messages every Sunday ~04:00, and the default used to be to keep four weeks of such logs.)

>> RH9 had a piece of crap application called 'kudzu' that was meant to
>> reconfigure the system when hardware changed. I usually uninstalled
>> that package as my hardware wasn't being changed every time the
>> systems reboot.
>I also do that. I always switch the kudzu service off from the boot of
>the system.

I was never able to understand the rationale for that program.

>I have tried to manually change hwconf file, as I've seen that in
>other computers was written the complete interface name (eth0, eth1,
>eth2).
>Now, I have this in hwconf:
>When I reboot the computer (not just restart network services), the
>interfaces were just there and right configured. I could work with
>them.

OK!

>The next test I will make is deleting the hwconf (as I don't work with
>kudzu detecting new hardware) and see if everything works fine without
>that file.

I'd rename the file, but that's just my paranoia ;-)

>Now, I must know if there was a problem with kudzu in that kernel and
>a misconfiguration of the hwconf, for making sure that if I also
>change that file in all my computers, there will be no more problems.

[compton /net/johnstown/redhat/old]$ ls 9*
9-errata.05.01.04.gz  9-legacy.01.23.07.gz
[compton /net/johnstown/redhat/old]$ zgrep kudzu 9*
[compton /net/johnstown/redhat/old]$ zgrep kudzu rpms.9-i386.gz | cut -c30-
314267 Feb 25 23:20 kudzu-0.99.99-1.i386.rpm
112652 Feb 25 23:20 kudzu-devel-0.99.99-1.i386.rpm
[compton /net/johnstown/redhat/old]$

Near as I can tell, kudzu was never updated from the out-of-box version (the rpms.9-i386.gz is a directory listing from April 2003 when RH9 was released - the February date looks to be right after the release of the last 'phoebe' redhat-8.0.94 beta).
As mentioned, the kernel had quite a number of updates - the oldest one listed on the 9-errata.05.01.04.gz file was the sixth I have records of, but no dates:

redhat-9  07 Apr 03  shrike
  2.4.20-8 -> 2.4.20-13.9 -> 2.4.20-18.9 -> 2.4.20-19.9 ->
  2.4.20-24.9 -> 2.4.20-27.9 -> 2.4.20-28.9 -> 2.4.20-30.9 ->
  2.4.20-31.9 -> 2.4.20-42.9.legacy -> 2.4.20-43.9.legacy ->
  2.4.20-46.9.legacy

kernel-2.4.20-28.9.i386.rpm  24-Dec-2003 14:24

Unless you are running into some bizarre Y2K-like problem, it seems unlikely that this would be a problem in kudzu or the 2.4.20-8, as if it were going to happen, you'd think it would have happened before now. My best guess remains some hardware change, but that's only a guess.

Old guy
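The suggestion earlier in this reply to look through the older, rotated log files for boot messages about the cards can be sketched as a small filter. This is a minimal Python sketch under assumptions that are not confirmed by the thread: that the rotated logs sit next to /var/log/messages under names matching /var/log/messages*, and that they are stored uncompressed. The names NIC_PATTERN, nic_lines and scan_logs are hypothetical.

```python
import glob
import re

# Match an indexed interface name (eth0, eth1, ...) or either driver name.
NIC_PATTERN = re.compile(r"\b(eth\d+|e100|e1000)\b")


def nic_lines(lines):
    """Yield only the lines that mention an ethN interface or the e100/e1000 drivers."""
    for line in lines:
        if NIC_PATTERN.search(line):
            yield line.rstrip("\n")


def scan_logs(pattern="/var/log/messages*"):
    """Print matching lines from the current and rotated logs.

    The glob pattern is an assumption about how the rotated files are named.
    """
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            for line in nic_lines(handle):
                print(f"{path}: {line}")
```

Calling scan_logs() as root would list every surviving log line that names one of the interfaces or drivers, which makes it easier to compare a boot from before the fault with one from after it.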