Re: IXFR journal dump making 9.2.4 server non-responsive

Posted by Brian Widdas on 12-16-2004

In article <cpsk8r$22ej$1@sf1.isc.org>, Derek D. wrote:
> We subscribe to an e-mail DNS RBL that we zone transfer via IXFR and
> have noticed what we believe to be a correlation between BIND ceasing
> to answer queries and the dumping of the journal file to disk.
>
> The server is a Sun v120 with 2GB of RAM running Solaris 8 and Bind
> 9.2.4.
>
> I noticed the Bind 9 ARM mentions that the default time for dumping the
> journal file to disk is 15 minutes, but we seem to be seeing it at
> about 20 minutes. For example the end of transfer log entry for the
> zone is at 00:56:25 and all is well until 01:16:51 when log entries
> stopped. Then at 01:20:45 queries start getting logged again. During
> this outage the machine is running pretty close to 100% CPU and a truss
> shows that a new zone file is being dumped to disk. Normally the
> machine is running with a load average of about 0.2. A fresh start of
> BIND takes about 8 to 10 minutes to load this zone plus the others that
> it has. The RBL zone file is about 102MB.


Presumably you mean the zone file is being dumped, rather than the
journal - the journal is constantly updated as updates to the zone come
in.
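
For reference, here is the arithmetic on the timestamps you quoted, as a
small Python sketch. It assumes all three log times fall on the same day;
the function name is just something I made up for illustration.

```python
from datetime import datetime, timedelta


def elapsed(start: str, end: str) -> timedelta:
    """Return the time between two HH:MM:SS stamps on the same day."""
    fmt = "%H:%M:%S"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


if __name__ == "__main__":
    # Timestamps taken from the quoted log description.
    transfer_end = "00:56:25"
    logging_stopped = "01:16:51"
    logging_resumed = "01:20:45"

    print("transfer end -> outage start:", elapsed(transfer_end, logging_stopped))
    print("outage length:               ", elapsed(logging_stopped, logging_resumed))
```

That is a gap of 20 minutes 26 seconds before the stall, and a stall of
just under 4 minutes.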

I've seen this problem before on a large zone slaved using IXFR. The
problem appears to be that, 15 minutes after an update, BIND will write
out the zone file. While it's doing this, the in-memory copy is locked,
which prevents access to it. Any thread that tries to read this copy
blocks until it is unlocked, and while blocked it can't do any other
work. (Normally the zone file would be written out and unlocked in a
few milliseconds, so this wouldn't be an issue.)

If sufficient queries are made against the zone in question, all the
threads on your server will be taken up waiting for the zone to finish
writing, and you'll stop responding to all queries.
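
To illustrate the starvation, here is a toy model in Python. It is not
BIND's code, only a sketch of the behaviour as I understand it: a fixed
pool of worker threads, one lock per zone, and a "dump" that holds the big
zone's lock for a while. The zone names, the function name and the timings
are all invented for the example.

```python
import queue
import threading
import time


def simulate(num_workers: int, big_queries: int, dump_seconds: float = 0.5) -> dict:
    """Return how long each query waited, keyed by query name."""
    locks = {"big.example": threading.Lock(), "small.example": threading.Lock()}
    work: queue.Queue = queue.Queue()
    latency = {}
    results_lock = threading.Lock()

    def worker() -> None:
        while True:
            item = work.get()
            if item is None:  # sentinel: shut this worker down
                return
            name, zone, submitted = item
            with locks[zone]:  # blocks here while the zone is being dumped
                pass           # "answering" the query is instantaneous
            with results_lock:
                latency[name] = time.monotonic() - submitted

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    locks["big.example"].acquire()  # the dump starts: big zone is locked
    now = time.monotonic()
    for i in range(big_queries):
        work.put((f"big-{i}", "big.example", now))
    time.sleep(0.05)  # give the workers time to pick these up and block

    # A query for an unrelated, unlocked zone arrives during the dump.
    work.put(("small", "small.example", time.monotonic()))
    time.sleep(dump_seconds)
    locks["big.example"].release()  # the dump finishes

    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()
    return latency


if __name__ == "__main__":
    print("2 workers:", simulate(num_workers=2, big_queries=2)["small"])
    print("3 workers:", simulate(num_workers=3, big_queries=2)["small"])
```

With two workers and two queries for the locked zone, the query for the
unrelated zone waits for the whole dump. A third worker answers it at
once, but only until a third query for the big zone arrives.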

> Does the above make any sense?
> Would a dual CPU box help this?


Not really. You'll be able to have more threads, but if your server
is busy enough, they'll still all eventually block. Disk IO speed is
probably the real limiting factor.

> Any ideas or suggestions?


Increase the number of threads (beware of overloading the server if it's
busy, though), remove the "file" directive from the zone config (if you
can live with having to refetch the entire zone every time you start
the nameserver), or put the file into a memory filesystem, syncing it to
disk every 15 minutes or so, and putting it back after a reboot.
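
For the memory filesystem option, the periodic sync could look something
like the Python sketch below. The paths are hypothetical, and the script
makes no attempt to coordinate with named, so it could in principle copy
a file that is still being written; treat it as a starting point rather
than a tested procedure.

```python
import os
import shutil
import tempfile


def sync_zone_file(src: str, dst: str) -> None:
    """Copy src to dst so that dst is never left half-written."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Write to a temporary file in the destination directory first, so the
    # final rename stays on one filesystem and is atomic.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".zone-sync-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())  # make sure the data has reached the disk
        os.replace(tmp, dst)        # atomically swap in the new copy
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


if __name__ == "__main__":
    # Hypothetical locations: the live copy in a memory filesystem and a
    # persistent backup on disk.
    sync_zone_file("/tmp/named/rbl.zone", "/var/named/backup/rbl.zone")
```

Run it from cron every 15 minutes or so, and call the same function with
the arguments swapped from a boot script to put the file back before
named starts.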

None of these are ideal solutions. I wish I could tell you how I solved
the problem when I saw it, but I ended up not having to slave the
huge zone, so the issue went away.

Brian
