This is a discussion on RAID 6 / Reiserfs problem within the Linux Administration forums, part of the Linux Forums category.
I'm having a severe problem whose root cause I cannot determine. I have a RAID 6 array managed by mdadm running on Debian "Lenny" with a 3.2GHz AMD Athlon 64 X2 processor and 8G of RAM. There are ten 1 Terabyte SATA drives, unpartitioned, fully allocated to the /dev/md0 device. The drives are served by 3 Silicon Image SATA port multipliers and a Silicon Image 4 port eSATA controller. The /dev/md0 device is also unpartitioned, and all 8T of active space is formatted as a single Reiserfs file system. The entire volume is mounted to /RAID. Various directories on the volume are shared using both NFS and SAMBA.

Performance of the RAID system is very good. The array can read and write at over 450 Mbps, and I don't know if the limit is the array itself or the network, but since the performance is more than adequate I really am not concerned which is the case.

The issue is that the entire array will occasionally pause completely for about 40 seconds when a file is created. This does not always happen, but the situation is easily reproducible. The frequency at which the symptom occurs seems to be related to the transfer load on the array. If no other transfers are in progress, the failure is somewhat rarer, perhaps accompanying fewer than 1 file creation in 10. During heavy file transfer activity, the system sometimes halts with every other file creation. Although I have observed many dozens of these events, I have never once observed one except when a file creation occurs. Reading and writing existing files never triggers the event, although any read or write occurring during the event is halted for the duration. (There is one cron job which runs every half-hour and creates a tiny file; this is the most common failure vector.) There are other drives formatted with other file systems on the machine, but the issue has never been seen on any of the other drives. When the array runs its regularly scheduled health check, the problem is much worse. Not only does it lock up with almost every single file creation, but the lock-up time is much longer, sometimes in excess of 2 minutes.

Transfers via Linux based utilities (ftp, NFS, cp, mv, rsync, etc.) all recover after the event, but SAMBA based transfers frequently fail, both reads and writes.

How can I troubleshoot and, more importantly, resolve this issue?
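Since the stall appears only on file creation, one way to put numbers on it is to create tiny files repeatedly and time each creation. The following is a minimal sketch, not a diagnosis: the function names, the probe file naming, and the 1 second threshold are all assumptions made for illustration, and only the /RAID mount point comes from the description above.

```python
import os
import time


def time_file_creations(directory, count=20, pause=0.5):
    """Create `count` tiny files in `directory` and return each creation's latency in seconds."""
    latencies = []
    for i in range(count):
        # Hypothetical probe file name; the PID keeps concurrent runs from colliding.
        path = os.path.join(directory, f".create_probe_{os.getpid()}_{i}")
        start = time.monotonic()
        # O_EXCL forces a genuine create instead of reopening an existing file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        os.close(fd)
        latencies.append(time.monotonic() - start)
        os.unlink(path)  # clean up so the probe leaves nothing behind
        time.sleep(pause)
    return latencies


def find_stalls(latencies, threshold=1.0):
    """Return (index, seconds) for every creation slower than `threshold` (an assumed cutoff)."""
    return [(i, t) for i, t in enumerate(latencies) if t > threshold]


# Example use (the path is taken from the post, the rest is illustrative):
#   stalls = find_stalls(time_file_creations("/RAID"))
#   for index, seconds in stalls:
#       print(f"creation {index} took {seconds:.1f} s")
```

Running the probe once while the array is idle, once under transfer load, and once during the scheduled health check would give comparable figures for the three situations described, and running it against one of the other file systems on the machine would serve as a control.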
lrhorer wrote:
> How can I troubleshoot and more importantly resolve this issue?

I may be way off, but I do remember reading something about a Reiserfs incompatibility with a kernel released in the Oct-Dec 2008 time frame. Sorry, I can't find it now, but I remember a kernel patch solved it. *Maybe* that will put you on the right path.

--
see caveat: http://tinyurl.com/6aagco
DenverD (Linux Counter 282315)
via Thunderbird 3.0.1-1.1, KDE 3.5.7, openSUSE Linux 10.3, 2.6.22.19-0.2-default #1 SMP i686 athlon
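To follow up on the kernel hint, the first step is knowing exactly which kernel the affected machine runs. The sketch below only reads and parses the running kernel's release string; the thread does not identify which release had the problem, so no version is hard-coded, and the function names are hypothetical.

```python
import platform
import re


def parse_kernel_release(release):
    """Return the leading numeric components of a kernel release string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def running_kernel():
    """Return the raw release string of the running kernel and its parsed numeric form."""
    release = platform.release()
    return release, parse_kernel_release(release)


# Example use: print the running kernel so it can be compared with changelogs
# for releases from the time frame mentioned in the reply.
#   raw, numeric = running_kernel()
#   print(raw, numeric)
```

The parsed tuple compares naturally in Python, so once a suspect release is identified from a changelog or bug report, a check such as `numeric >= (x, y, z)` can be added.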