BIND 8.4.4 assertion failure on Tru64 (Bind Users forum, DNS and Related Forums category)
Hello,
I'm testing v8.4.4 on Tru64 4.0E, 4.0G, and 5.1A. On all of those, it compiles (using defaults from port/decunix/Makefile.set) with no errors, but when it is started, it frequently dies with this log entry:

    insist: critical: ns_main.c:4439: INSIST(evDo(ev, "handle_needs") != -1) No such file or directory failed.

This usually happens very quickly after it has loaded all zones and is listening for requests. It also frequently dies with the exact same error immediately after "ndc reload", "ndc reload <domain>", and "ndc reconfig". Occasionally it does keep running without the INSIST error.

There are no errors in named.conf, and making changes to named.conf (such as logging) has no effect on the issue. It does not seem to matter how many zones are defined in named.conf: with 2 or with over 2000, the behavior is the same.

Production servers are currently running v8.4.3 with no problems (we don't have ipv6 enabled on these servers yet, so the bug that caused v8.4.3 to be deprecated isn't really bothering us).

Briefly comparing the 8.4.3 code to 8.4.4, assertions.h has not changed (the comments have, but not the code), but the argument passed to evDo() inside INSIST_ERR() does seem to have changed. For example, ns_main.c around line 4439 has:

v8.4.3 -

    if (queued != 0) {
        INSIST_ERR(evDo(ev, (void *)handle_needs) != -1);
        return;
    }

v8.4.4 -

    if (queued != 0) {
        INSIST_ERR(evDo(ev, "handle_needs") != -1);
        return;
    }

A colleague attempted a workaround by trying to force CHECK_INSIST to zero. To include/isc/assertions.h he added

    #define CHECK_INSIST 0

just above

    #if CHECK_INSIST != 0
    #define INSIST(cond) \
        ((void) ((cond) || \
                 ((__assertion_failed)(__FILE__, __LINE__, assert_insist, \
                                       #cond, 0), 0)))

That was a mistake. On the test nameserver that was running >2,000 slave zones (a number of which were pointing to bad master servers), xfer-in seemed to get stuck: "ndc status" always showed 10 xfers in progress (the max by default), with hundreds queued, and zones were simply not getting updated.
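One possible explanation for the stuck transfers, sketched here under assumptions: the quoted code calls evDo() inside the macro argument, so if the disabled branch of assertions.h expands INSIST()/INSIST_ERR() to a no-op (that branch is not shown above, so this is a guess), the evDo() call itself is compiled out along with the check. The names fake_dispatch, report_failure, INSIST_ENABLED and INSIST_DISABLED below are hypothetical stand-ins, not BIND code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for evDo(): it has a side effect (it counts a
 * dispatch) and returns 0 on success, -1 on failure. */
static int dispatch_calls = 0;

static int fake_dispatch(void)
{
    dispatch_calls++;
    return 0;
}

/* Hypothetical stand-in for __assertion_failed(): just reports. */
static void report_failure(const char *file, int line, const char *cond)
{
    fprintf(stderr, "%s:%d: INSIST(%s) failed\n", file, line, cond);
}

/* Same shape as the enabled macro quoted in the post: the condition is
 * evaluated, and the failure handler runs only if it is false. */
#define INSIST_ENABLED(cond) \
    ((void) ((cond) || (report_failure(__FILE__, __LINE__, #cond), 0)))

/* ASSUMPTION: the disabled branch is a plain no-op, so the argument is
 * discarded by the preprocessor and never evaluated. */
#define INSIST_DISABLED(cond) ((void) 0)

/* Returns how many times the dispatch ran with checks enabled. */
int calls_with_checks_enabled(void)
{
    dispatch_calls = 0;
    INSIST_ENABLED(fake_dispatch() != -1);
    return dispatch_calls;
}

/* Returns how many times the dispatch ran with checks disabled. */
int calls_with_checks_disabled(void)
{
    dispatch_calls = 0;
    INSIST_DISABLED(fake_dispatch() != -1);
    return dispatch_calls;
}
```

If that assumption holds, any workaround would need to keep the evDo() call outside the macro and relax only the check on its return value.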
Apparently CHECK_INSIST is, um, necessary :)

I'm not familiar enough with BIND's code (yet) to trace it much further than that. In several years of maintaining ISC BIND servers, this is the first time a bug has bitten me in the rear, so I'm not very familiar with debugging it. But I'll be submitting a bug report as soon as I get full info from running in debug mode. This is just a heads up.

If you're running v8.4.4 on Tru64 and *not* seeing this problem, I'd sure like to hear about it. I'm also testing this on Solaris 8 - so far, no problems there.

Mark A Jones
Systems Administrator
netINS, Inc.
http://netins.net
(515) 830-0698
markjo@netins.net