Re: pfil2.1.11 performance issues - pfil_printmchain sprintf

03-04-2007
Ian Donaldson

>
> Ian Donaldson wrote:
> > I have a pair of Sun Fire X2100M2's connected via 100M eth switches
> > (yeah, crippling gig-E) and running pfil 2.1.11, ip_fil4.1.16 and
> > was noticing significant TCP throughput performance differences
> > for traffic between various ethernet interfaces on the two systems.
> >
> > (both systems running Solaris 10/x86 6/06 with 26 Feb recommended
> > patch cluster, NVIDIA add-on driver patch 122530-02 for nge)
> >
> > eg: system1 bge0 -> system2 bge0   1700KB/s
> >     system1 bge1 -> system2 bge1  11000KB/s
> >     system1 nge1 -> system2 nge1  11000KB/s
> >
> > With top I noticed a significant portion of system time being consumed
> > in the bge0 test (like 50%).
> >
> > Using
> >
> > lockstat -kIi997 sleep 10
> >
> > What is curious though is that this problem only manifests itself
> > on one of the 3 interfaces I have enabled in the system, suggesting
> > something else is broken, as I would have thought that all interface
> > traffic would pass thru the same code.
> > (yes I've verified pfil module is pushed on all interfaces)
> >
> > It doesn't manifest itself, though, on another X2100M2 system that
> > only has bge0 enabled.
> >

>
> Are you saying that where bge1 is used but not bge0, the problem doesn't
> arise? That would be strange! If it happened when either bge0 or bge1 was
> being used, I could understand that...kinda...it'll be because the bge
> driver is communicating with IP "differently" because pfil is there in
> between.
>


Yep, as stated. Traffic between bge1 and nge1 on both systems was fine;
only bge0 was affected.

Since then I've also discovered that this problem exists on some of our
Solaris 9 systems that run similar ipf/pfil versions
(i.e. pfil_2.1.9 and ip_fil4.1.13), but not in all combinations.

eg:
- Sun Fire V60x; no problems at all. Can't reproduce it on
either e1000g0 or e1000g1.
(Solaris 9/x86 2003/08 base with May 2005 recommended patch cluster)

lockstat doesn't even show pfil_printmchain being called at all.

- Sun Netra T1 105 sparc; it's 100% reproducible on both interfaces
(hme0 and hme1).
(Solaris 9 sparc 2003/12 base with May 2005 recommended patch cluster)
lockstat shows vsnprintf and pfil_printmchain at the top of usage.

Throughput is abysmal; 300KB/sec. Kernel CPU usage 97%.

- Sun Fire V100 sparc; no problems at all. Can't reproduce it on either
dmfe0 or dmfe1. pfil_printmchain showed only a handful of calls
in the trace.
(identical OS/patch base as for the Netra)
Tested two similar systems. Same results.

Note that the ipf/pfil on the sparc systems were absolutely identical;
installed from the same package I built.

So what other factors can control whether pfil_printmchain is called?
(couldn't spot anything in the code myself; and I hate an unsolved
mystery like this as it's probably related to another bug which could
be way more serious)
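
For a sense of scale, here is a rough sketch that expresses the throughput
figures quoted in this thread as a fraction of the raw line rate. It assumes
a 100 Mbit/s link for every test (stated for the X2100M2 pair, only assumed
for the Netra's hme interfaces), takes 1 KB as 1000 bytes, and ignores
Ethernet/IP/TCP overhead. The names LINK_MBIT, measured and link_fraction are
made up for this illustration.

```python
# Assumptions: 100 Mbit/s link, 1 KB = 1000 bytes, protocol overhead ignored.
LINK_MBIT = 100
LINK_KB_PER_S = LINK_MBIT * 1_000_000 / 8 / 1000  # raw line rate in KB/s

# Throughput figures quoted in this thread, in KB/s.
measured = {
    "X2100M2 bge0": 1700,
    "X2100M2 bge1": 11000,
    "X2100M2 nge1": 11000,
    "Netra T1 105 hme": 300,
}


def link_fraction(kb_per_s, link_kb_per_s=LINK_KB_PER_S):
    """Return measured throughput as a fraction of the raw line rate."""
    return kb_per_s / link_kb_per_s


for name, rate in measured.items():
    print(f"{name:18s} {rate:6d} KB/s  {link_fraction(rate):6.1%} of line rate")
```

Under those assumptions the good interfaces sit at about 88% of line rate,
bge0 at roughly 14%, and the Netra at about 2.4%.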

Ian D