TA # 116684   Date Created: 01/29/2002 03:38 PM   Date Updated: 10/22/2010 09:59 AM
OpenServer 5, How to Debug STREAMS failures.
Keywords
Openserver v5 osr5 5.0.0 5.0.2 5.0.4 5.0.5 5.0.6 troubleshooting debugging streams failures out of resources system hang hung NSTRPAGES exceeded allocb failed 5.0.7 507
Release
          SCO OpenServer Enterprise System Release 5.0.4, 5.0.5, 5.0.6, 5.0.7 
          SCO OpenServer Desktop System Release 5.0.4, 5.0.5, 5.0.6, 5.0.7 
Problem
          The system is failing to allocate streams resources. This is
          evidenced by one or more of the following:

              - Failures in the system logs or on the console that say "Out of
                streams resources" or "Out of streams memory (NSTRPAGES = XXXX
                exceeded)" or "allocb failed"

              - netstat -m shows non-zero numbers in the fail column

              - ndstat -l shows a non-zero number in "No STREAMS Buffers"
                column

              - crash -> strstat shows non-zero numbers in the FAIL column

              - Mysterious system hangs or lockups

          Note: A few occasional failures reported by netstat, ndstat, or crash
          do not necessarily indicate a serious problem and can most likely be
          ignored.
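
          The log check in the first symptom above can be sketched as a small
          shell function. This is only an illustration: the function name
          streams_log_hits is hypothetical, and the log file path in the usage
          line is an assumption that may differ on your system.

```shell
# Count log lines (read on standard input) that contain one of the
# STREAMS failure messages quoted in this article.
# The pattern accepts both "stream" and "streams", since both spellings
# appear in the messages shown here.
streams_log_hits() {
    awk '/Out of streams? (resources|memory)|allocb failed|allocb ENOSR/ { n++ }
         END { print n + 0 }'
}
```

          Possible usage, assuming the messages are logged to
          /usr/adm/messages: streams_log_hits < /usr/adm/messages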

Cause
          There can be many reasons why the system fails to allocate streams
          resources. The usual causes are:

              1. Improper kernel tuning

              2. Streams leak in the network card driver

              3. Streams leaks in 3rd party serial card drivers or management
                 drivers

              4. Failing hardware

              5. External network hardware misbehaving

              6. Extremely high network traffic

              7. Streams leak in a base operating system driver

              8. Improper synchronization of data transfer between the client
                 and server components of a network application

          Any of the above factors can lead to exhaustion of the configured
          amount of STREAMS memory available for use by kernel drivers and
          modules.

          Background information: The kernel tunable NSTRPAGES controls the
          number of pages of memory available for streams. However, not all of
          these pages are immediately available.  The streams daemon (strd)
          reserves a subset of NSTRPAGES in a few pools of memory for streams
          allocation at interrupt time.

          These pools are defined in /etc/conf/pack.d/str/space.c:

unsigned int    str_pool_size = 20;     /* size of interrupt pool in pages */

unsigned int    mblk_pool_size = 70;    /* size of mblk interrupt reserve */
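
          To get a feel for the sizes involved, page counts can be converted
          to kilobytes. The sketch below assumes a 4KB page size, which is not
          stated in this article, and the helper name pages_to_kb is
          hypothetical. Only str_pool_size is converted, because its comment
          gives the unit as pages.

```shell
# Convert a count of memory pages to kilobytes.
# PAGE_KB=4 is an assumption (4KB pages), not a value taken from this article.
# Written for a POSIX shell; older Bourne shells may need expr instead.
PAGE_KB=4
pages_to_kb() {
    echo $(( $1 * PAGE_KB ))
}

pages_to_kb 20      # str_pool_size as shown in space.c
pages_to_kb 8000    # an NSTRPAGES setting of 8000 pages
```

          Under that assumption, the interrupt pool is a very small part of
          the memory that NSTRPAGES can make available.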

          So, one important distinction to make when debugging streams failures
          is whether you have actually exceeded the maximum streams parameter,
          NSTRPAGES, or whether there were not enough resources in the
          available pools at interrupt time to satisfy the requests. A simple
          way to determine this is with netstat -m.

          Consider the following output:

streams allocation:
                         config    alloc     free    total    max     fail
stream                     7200      250     6950     6016    259        0
queues                     1248      552      696    14327    577        0
mblks                      5732     1442     4290  3560707    699        0
buffer headers             6458     6319      139   460399   6336        0
class  1,     64 bytes      128       67       61  1206187     99        0
class  2,    128 bytes       96        0       96   329540     91        0
class  3,    256 bytes      352       30      322    33563   1351        0
class  4,    512 bytes       16        6       10    10696     13        0
class  5,   1024 bytes       20        0       20    10258     18        0
class  6,   2048 bytes     5394     1032     4362   749789   5394    41784
class  7,   4096 bytes      123      123        0     2911    123        0
class  8,   8192 bytes        6        0        6     1172      6        0
class  9,  16384 bytes        0        0        0       31      3        0
class 10,  32768 bytes        0        0        0        0      0        0
class 11,  65536 bytes        0        0        0        0      0        0
class 12, 131072 bytes        0        0        0        0      0        0
class 13, 262144 bytes        0        0        0        0      0        0
class 14, 524288 bytes        0        0        0        0      0        0
total configured streams memory: 32000.00KB
streams memory in use:  2754.73KB
maximum streams memory used: 11792.11KB

          As you can see, 41784 attempts to allocate 2KB buffers failed, but
          the total configured streams memory (NSTRPAGES) was not exceeded:
          the maximum streams memory used (11792.11KB) is well below the
          32000.00KB configured. This could indicate that a large amount of
          data has been coming in from the network and that the resulting NIC
          interrupts are handled by allocating STREAMS messages at interrupt
          time to pass the data upstream.  These allocations commonly occur in
          2KB chunks.  Such failures, as explained above, could very likely be
          caused by an inadequate value of str_pool_size. In situations like
          this, you can increase str_pool_size and/or mblk_pool_size in
          /etc/conf/pack.d/str/space.c, then relink the kernel and reboot, to
          attempt to stop the failures.
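
          The reading of the netstat -m output described above can be sketched
          as a filter. This is a rough heuristic, not a definitive test: the
          function name streams_fail_summary is hypothetical, and comparing
          the "maximum streams memory used" line with the "total configured
          streams memory" line is an assumption about how to judge whether
          NSTRPAGES was reached.

```shell
# Summarise `netstat -m` output read on standard input:
#   - print every allocation row whose last column (fail) is non-zero
#   - compare peak STREAMS memory use with the configured total
streams_fail_summary() {
    awk '
        # "$NF + 0" keeps the leading number and drops the "KB" suffix.
        /^total configured streams memory:/ { conf = $NF + 0 }
        /^maximum streams memory used:/     { peak = $NF + 0 }
        /^(stream|queues|mblks|buffer headers|class)/ && $NF ~ /^[0-9]+$/ && $NF + 0 > 0 {
            print "FAIL: " $0
            fails++
        }
        END {
            if (!fails)            print "no allocation failures recorded"
            else if (conf == 0)    print "failures found, but memory totals are missing"
            else if (peak >= conf) print "peak usage reached configured memory: NSTRPAGES may be too low"
            else                   print "peak usage below configured memory: suspect interrupt-time pools"
        }
    '
}
```

          Possible usage: netstat -m | streams_fail_summary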

          Please Note: Often tuning streams resources upward (NSTRPAGES, 
          str_pool_size, mblk_pool_size, ...) will only delay or mask the 
          actual problem. 

          If the failures continue no matter how many resources you allocate
          to the streams subsystem, the problem is likely not tuning but one
          of the other causes listed above.


Solution
          1. Improper kernel tuning - Refer to the SCO OpenServer Performance
             Guide and Technical Article #107566, "How do I tune STREAMS
             resources under OpenServer 5 (TCP/IP 2.0.0)?" for information on
             tuning the various STREAMS parameters.

             Typical messages seen are:

Oct 19 08:42:58 mysvr WARNING: allocb failed - NSTRPAGES exceeded

Oct 19 17:16:42 mysvr WARNING: xdr_bytes: bad size FAILED

Oct 22 13:23:49 mysvr WARNING: svckclt_send: allocb ENOSR

Oct 22 13:23:49 mysvr lockd[491]: get_myaddress: ioctl (get interface 
configuration): Out of stream resources

Oct 22 13:23:49 mysvr mountd[479]: get_myaddress: ioctl (get interface 
configuration): Out of stream resources

Oct 22 13:23:49 mysvr nfsd[481]: get_myaddress: ioctl (get interface 
configuration): Out of stream resources

Oct 22 13:23:53 mysvr syslog: get_myaddress: ioctl (get interface 
configuration): Out of stream resources

             The relevant parameters in /etc/conf/cf.d/mtune are:

             Parameter       Default Min     Max
             --------------- ------- ------- -------
             NSTRPAGES       500     0       8000
             STRSPLITFRAC    80      50      100
             NSTREAM         64      1       32768

             They can be tuned by overriding the default values in:

             /etc/conf/cf.d/stune

             e.g., to set them to their maximum values:

             NSTRPAGES 8000
             STRSPLITFRAC 100
             NSTREAM 32768

             Setting the maximum values, as in the above example, increases
             the amount of memory the kernel reserves, so it may be advisable
             to raise parameters in smaller, controlled steps, e.g., by
             changing just one parameter:

             NSTRPAGES 1000

             Then relink and reboot the server with:

             # /etc/conf/cf.d/link_unix -y
             # init 6
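
             Before relinking, a proposed value can be checked against the
             Min and Max columns shown above. The following is a minimal
             sketch, assuming each mtune line has the form "name default min
             max"; the function name check_tunable is hypothetical.

```shell
# Check a proposed tunable value against the min/max columns of
# mtune-style lines (name default min max) read on standard input.
# Usage: check_tunable NAME VALUE < mtune-file
# Exit status: 0 in range, 1 out of range, 2 parameter not found.
check_tunable() {
    awk -v name="$1" -v val="$2" '
        $1 == name {
            found = 1
            if (val + 0 < $3 + 0 || val + 0 > $4 + 0) {
                print name " " val " is outside " $3 ".." $4
                bad = 1
            } else {
                print name " " val " is within " $3 ".." $4
            }
        }
        END {
            if (!found) { print name " not found"; exit 2 }
            exit bad + 0
        }
    '
}
```

             Possible usage:
             check_tunable NSTRPAGES 1000 < /etc/conf/cf.d/mtune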

          2. Streams leak in the network card driver:

             - Check ftp://ftp.sco.com/pub/openserver5/drivers/ and/or
               the network card manufacturer for an updated driver. If you are
               at the latest driver and still suspect it is the culprit, try
               swapping out the NIC for another brand that uses a different
               driver.

          3. Streams leaks in 3rd party serial card drivers or management
             drivers:

               Check with the card or driver manufacturer for updated drivers.
               Many of the older 3rd party serial drivers had these problems.
               Streams leaks have also been seen with older Compaq agent
               drivers.

          4. Failing hardware:

               Try running hardware diagnostics and swapping out the NIC and/or
               RAM to see if this eliminates the problem.

             
               NOTE: User-level hardware diagnostics often fail to uncover
               serious hardware problems. To be sure, you should have your
               hardware vendor diagnose the system onsite.

          5. External network hardware misbehaving:

               Every network packet that is destined for the machine will be
               processed by the network card and driver and streams resources
               will be allocated. It is therefore possible that some other
               machine and/or piece of network hardware can misbehave and
               cause a depletion of streams resources on the local system. You
               can use a network sniffer and/or switch diagnostics to monitor
               the traffic on your local network and locate the offending
               machine.

          6. Extremely high network traffic:

               Occasionally, on very highly traveled networks, there are too
               many packets to be processed at interrupt time.  Consider the
               following:

                  - Tune str_pool_size and mblk_pool_size to see if this
                    remedies the situation. You can start by doubling the pool
                    sizes (relink and reboot). If the problem persists and you
                    are sure that there is no other problem, try raising them
                    by another 50%. Also consider whether the load is simply
                    too great for your network infrastructure.

                  - Consider upgrading your network hardware to support higher
                    transfer rates or spreading the load across other servers
                    and networks.
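
               As a worked example of the two tuning steps, starting from the
               str_pool_size and mblk_pool_size values shown in space.c
               earlier in this article. The reading of "another 50%" as 50% of
               the already doubled value is an assumption, and the variable
               names are only illustrative.

```shell
# Starting values as shown in space.c earlier in this article.
str_pool_size=20
mblk_pool_size=70

# First step: double each pool.
str_step1=$(( str_pool_size * 2 ))
mblk_step1=$(( mblk_pool_size * 2 ))

# Second step, only if failures persist: raise by another 50%.
str_step2=$(( str_step1 * 3 / 2 ))
mblk_step2=$(( mblk_step1 * 3 / 2 ))

echo "str_pool_size:  $str_pool_size -> $str_step1 -> $str_step2"
echo "mblk_pool_size: $mblk_pool_size -> $mblk_step1 -> $mblk_step2"
```

               Each step still requires editing space.c, relinking the kernel,
               and rebooting before the new value takes effect.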

          7. Streams leak in a base operating system driver:

               All known streams leaks in base operating system drivers have
               been resolved by various support level supplements. Ensure that
               your system is properly patched so these fixes are incorporated.

               At a minimum, you will need the following:

                   5.0.4        Release Supplement rs504c
                                SLS oss469d

                   5.0.5        Release Supplement rs505a
                                SLS oss497c

                   5.0.6        Release Supplement rs506a

                   5.0.7        The latest Maintenance Pack available from:

                   http://www.sco.com/support/update/download/osr507list.html

          8. Improper synchronization of data transfer between the client and
             server components of a network application:

               Check the "Recv-Q" and "Send-Q" columns of "netstat -a" for any
               continuous, significantly non-zero values and examine the state
               of the client applications involved.
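
               The queue check can be sketched as a filter over netstat -a
               output. It assumes the usual column layout (Proto, Recv-Q,
               Send-Q, Local Address, Foreign Address, state); the function
               name queued_sockets and the threshold argument are
               hypothetical. A single snapshot cannot show that a value is
               continuous, so run it several times and look for connections
               that keep reappearing.

```shell
# Print connections from `netstat -a`-style output (read on stdin) whose
# Recv-Q or Send-Q column is at or above a threshold in bytes.
# Usage: queued_sockets [THRESHOLD]   (default threshold: 1)
queued_sockets() {
    awk -v thresh="${1:-1}" '
        $1 ~ /^(tcp|udp)/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ &&
        ($2 + 0 >= thresh + 0 || $3 + 0 >= thresh + 0) {
            print $0
        }
    '
}
```

               Possible usage: netstat -a | queued_sockets 1024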

SEE ALSO:
          Technical Article 107566, "How do I tune STREAMS resources under OpenServer 5
          (TCP/IP 2.0.0)?"

          Technical Article 105811, "What is the Release Supplement 504C for SCO OpenServer
          Release 5.0.4 (rs504c) and how can I obtain it?"

          Technical Article 109975, "What is SLS OSS469D, the Core OS Supplement for SCO
          OpenServer Release 5.0.4?"

          Technical Article 109909, "What is Release Supplement 505A for SCO OpenServer Release
          5.0.5?"

          Technical Article 110105, "What is SLS OSS497C, the Core OS Supplement for SCO
          OpenServer Release 5.0.5?"

          Technical Article 114142, "What is Release Supplement 5.0.6a and where can I get it?"

          Technical Article 101596, "Using Intel Pro/100B NIC: "WARNING: eeE: Allocb failure
          during initialization.""

          Technical Article 105041, "I get the error "WARNING: table_alloc - Failed to get
          address space for mnode table.""

          Technical Article 104851, "I get the message "LOGIN: ERROR- Failed to initialize
          policy manager.""

          Technical Article 115066, "Telnet logins fail with syslog error, "inetd[302]: accept:
          (for telnet) Out of stream resources.""

          SCO OpenServer Performance Guide