03-26-2007
Clifford Kite
 
Re: Why does pppd (pppoe) go to 95% to 100% of CPU?

hazzmat <hazzmat@unitedstatesgovernmentbellsouth.net> wrote:
> On Mon, 19 Mar 2007 15:45:03 -0500, Clifford Kite wrote:
>>
>> There were many nameserver requests by 148.46 to different hosts with no
>> answer, all within approximately 20 seconds. I'm not sure what is being
>> requested, or why there are no replies, but suspect if replies were
>> forthcoming there would be no problem.


> What I can say about this is that the nameserver requests are to
> dyndns.org a dynamic address dns service. They were under DDOS attack
> starting on March 10. When that system gets a new ip it's supposed to
> update its record at dyndns.org. The DDOS attack has made that sometimes
> impossible, sometimes difficult.


Does the start of the attack coincide with the start of your problem?
Alternatively, is there anything else that does coincide with it?

> Another thing I know is involved is,
> when that system loses its ip address and reconnects, ntpd no longer is
> in sync. You get 'ntpd sendto i.pa.d.dr invalid argument messages' in the
> system log. So I made it happen that ntpd gets stopped when the link goes
> down and then is restarted again when the ppp0 link comes back up.
> Seemed like a workaround for ntpd's inability to maintain connection with
> a changing IP. This certainly adds to the CPU strain, particularly when
> the link is going up and down. The only captures I've been able to get
> off the system under high load from pppd show ntpd synchronization
> exchanges like the one you saw.


The one I saw was just that, one. In order for a CPU to be loaded to
95%+ by traffic through pppd there would have to be many at a high rate.
Even the rate of DNS requests did not seem high enough to me, but the
requests were both numerous and odd compared to the other traffic.

> One of the problems I have had with the
> dsl service is that the ISP is somewhat casual about LCP --evidently more
> so than the Linux box. Using the default values for LCP interval and
> failure, the Linux system will conclude the ppp link is not working


Responding to LCP echo-requests is an RFC requirement, but echo-requests
or echo-replies can be lost.

> anymore, take it down and try to reconnect. Sometimes however, the peer at
> the other end doesn't think the previous session is dead yet. So the link
> cannot be reestablished and there are "too many sessions for this host"
> messages back from the ISP in the log. pppoe tries and retries for a
> while. It sorts itself out eventually--most of the time. What I really
> want is a way to make the pppoe system WAIT longer before trying to
> reconnect. That way the peer should have caught on to the fact that the
> link is down, and clear the way for a new ppp session.


> I initially thought that pppoe-timeout was related to how quickly
> pppoe tries to reconnect, but I see that it is something else
> altogether. Do you think a holdoff statement in /etc/ppp/options
> might work?


You see something I don't, namely pppoe-timeout. A grep of rp-pppoe's
source directory for pppoe-timeout turned up empty. Ah, I see now from
man pppoe that you probably mean the pppoe -T option, which is akin to
the pppd idle option.

Using holdoff should introduce a delay before pppd tries to restart
the PPP link, but how will rp-pppoe know to delay the PADI requests?
Instead you could try increasing lcp-echo-interval or lcp-echo-failure to
delay termination of the link (and thus the beginning of PADI requests).
Or increasing the sleep time near the end of the pppoe-connect script
might work.
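
As a rough sketch of the lcp-echo and holdoff settings discussed above,
something like the fragment below could go in /etc/ppp/options. The
numbers are only illustrative assumptions, not values I have tested
against your ISP, and it is worth checking whether your connect script
already passes any of these options on the pppd command line.

```
# /etc/ppp/options (fragment) -- illustrative values, adjust to taste
lcp-echo-interval 30   # send an LCP echo-request every 30 seconds
lcp-echo-failure 6     # presume the peer dead after 6 unanswered requests
# With these numbers the link is declared dead after about 30 * 6 = 180 s.

persist                # holdoff only has an effect with persist or demand
holdoff 60             # wait 60 seconds before re-initiating the link
```

Raising the product of lcp-echo-interval and lcp-echo-failure makes pppd
more tolerant of lost echoes, at the cost of taking longer to notice a
link that really is dead.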

Just to be sure (even though you said it's pppd hogging the CPU): The
Ethernet interfaces for PPPoE should not be used for anything else.

--
Clifford Kite