Networking


Configuration

Parameters

There are a number of parameters that are important when tuning the performance of the VxWorks network stack. By default, the stack is configured to be very conservative with memory, which makes it suitable only for network-based debugging or for the very simplest network devices: those that need neither many connections nor high throughput. Some simple tuning can improve both the performance and the number of simultaneous connections that the stack can support.

Stack Buffer Sizes

The network stack is allocated a fixed amount of memory at the time it is initialised. This memory is subdivided into clusters so that all allocations and deallocations are fast. The table below shows the parameters (macros for those configuring outside the Tornado project facility), and some recommended settings:

Parameter | Default Value | Recommended Value | Description
NUM_NET_MBLK | 400 | NUM_CL_BLKS * 2 | Number of mBlk structures to allocate for the network stack's data memory pool. There should be at least as many of these as there are clusters (i.e. NUM_NET_MBLK >= NUM_CL_BLKS).
NUM_64 | 100 | 256 | Number of 64 byte data clusters to allocate.
NUM_128 | 100 | 256 | Number of 128 byte data clusters to allocate.
NUM_256 | 40 | 256 | Number of 256 byte data clusters to allocate.
NUM_512 | 40 | 256 | Number of 512 byte data clusters to allocate.
NUM_1024 | 25 | 128 | Number of 1024 byte data clusters to allocate.
NUM_2048 | 25 | 128 | Number of 2048 byte data clusters to allocate.
NUM_CL_BLKS | Sum of NUM_XXX | Sum of NUM_XXX | Total number of data clusters being allocated. This should be left as the sum of the NUM_XXX values.
NUM_SYS_MBLK | NUM_SYS_CL_BLKS * 2 | NUM_SYS_CL_BLKS * 2 | Number of mBlk structures to allocate for the network stack's system memory pool. There should be at least as many of these as there are clusters (i.e. NUM_SYS_MBLK >= NUM_SYS_CL_BLKS).
NUM_SYS_64 | 40 | 256 | Number of 64 byte system pool clusters to allocate.
NUM_SYS_128 | 40 | 256 | Number of 128 byte system pool clusters to allocate.
NUM_SYS_256 | 40 | 256 | Number of 256 byte system pool clusters to allocate.
NUM_SYS_512 | 20 | 256 | Number of 512 byte system pool clusters to allocate.
NUM_SYS_CL_BLKS | Sum of NUM_SYS_XXX | Sum of NUM_SYS_XXX | Total number of system pool clusters being allocated. This should be left as the sum of the NUM_SYS_XXX values.
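For those configuring with macros, the recommended data pool settings might be written as in the sketch below. The helper dataPoolClusterBytes() is hypothetical and only adds up the cluster payload; it ignores the mBlk, clBlk and per-cluster overhead that the stack also allocates. In a real BSP these macros may already be defined elsewhere, in which case they would need an #undef first.

```c
/* Recommended data pool cluster counts, taken from the table above */
#define NUM_64      256
#define NUM_128     256
#define NUM_256     256
#define NUM_512     256
#define NUM_1024    128
#define NUM_2048    128

/* Derived values: keep these as formulas so they track the counts */
#define NUM_CL_BLKS  (NUM_64 + NUM_128 + NUM_256 + NUM_512 + \
                      NUM_1024 + NUM_2048)
#define NUM_NET_MBLK (NUM_CL_BLKS * 2)

/* Hypothetical helper: cluster payload bytes in the data pool.
 * Excludes mBlk, clBlk and per-cluster bookkeeping overhead.
 */
static unsigned long dataPoolClusterBytes (void)
{
    return 64UL   * NUM_64   + 128UL  * NUM_128  +
           256UL  * NUM_256  + 512UL  * NUM_512  +
           1024UL * NUM_1024 + 2048UL * NUM_2048;
}
```

With the recommended values this gives 1280 clusters, 2560 mBlks and 624KB of cluster payload, which is a useful sanity check against the memory available on the target.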

Driver Buffer Sizes

The buffers allocated to the driver are at least as important as the main network stack buffers. In particular, pay attention to the receive buffer setting. To change these, you will need to adjust a value in the END_LOAD_STRING for your driver. In some cases, this string is defined as a constant string in the configNet.h header file in your BSP. In others, the string is generated dynamically as part of the initialisation process; this code is normally found in either sysLib.c or sysDriverNameEnd.c (again in the BSP).

To discover where in the string the receive and transmit buffers are specified, check the reference manual page for the driver. In some cases, the appropriate string definition is in a comment in the code or header file as well, but it is advisable to confirm this format against the reference manual. Also note that not all drivers have this as an option. For example, many PCMCIA cards come with on-card memory that essentially fixes the number of buffers that the driver can use.

As with the defaults for the main network stack, the default values for the number of driver buffers tend to be very conservative (often just 32 buffers each for receive and transmit). Each buffer takes around 1.5KB, so allocate as many as you have space for. For really network-intensive applications, provide at least 256 receive buffers (at around 1.5KB each this is about 384KB, so less than 400KB of memory). Increase the transmit buffers according to your application's needs; some experimentation might be needed here to get the best performance.
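The memory cost of a given buffer count is easy to estimate. The sketch below assumes exactly 1536 bytes per buffer as a stand-in for "around 1.5KB"; the real figure depends on the driver's cluster size and any per-buffer overhead, so treat both the constant and the helper name as illustrative.

```c
/* Assumed size of one driver buffer, in bytes (about 1.5KB) */
#define DRV_BUF_BYTES 1536UL

/* Hypothetical helper: approximate memory needed for numBufs buffers */
static unsigned long driverBufferBytes (unsigned long numBufs)
{
    return numBufs * DRV_BUF_BYTES;
}
```

For example, driverBufferBytes (256) gives 393216 bytes (384KB), while the typical default of 32 buffers needs only 48KB.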

Driver Flags

Depending on the driver, there are a number of flags and parameters passed via the END_LOAD_STRING that could affect performance. Read the reference manual page for your driver to see which apply to your interface(s). The table below shows some of the settings that could affect performance:

Setting | Description
PHY Flags | Settings for the Ethernet PHY; in particular, pay attention to the speed and duplex settings. Many drivers will default to auto-negotiate, but some combinations of PHY and hub/switch will either fail to negotiate at all, or will negotiate a slower speed than they are capable of. These flags force the driver to use a fixed speed or duplex setting instead of auto-negotiating it.
Offset | On some CPUs, the VxWorks network stack needs incoming packets to be offset in the receive buffer so that the IP header is correctly aligned on a 32 bit word boundary. The value of the offset will be either 0 or 2. If your CPU needs the offset, there is nothing you can do about it, but if your CPU is capable of accessing unaligned words correctly, make sure that this is set to zero.
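The reason the offset is 2 rather than some other value comes from the 14 byte Ethernet header that precedes the IP header. The sketch below checks the alignment arithmetic; it assumes an untagged Ethernet frame and a receive buffer that itself starts on a 32 bit boundary, and the names are made up for illustration.

```c
#include <stddef.h>

/* Destination MAC (6) + source MAC (6) + EtherType (2) */
#define ETH_HEADER_BYTES 14

/* Returns 1 if the IP header lands on a 4 byte boundary when the
 * frame is stored 'offset' bytes into an aligned buffer, else 0.
 */
static int ipHeaderIsAligned (size_t offset)
{
    return ((offset + ETH_HEADER_BYTES) % 4) == 0;
}
```

With an offset of 2 the IP header starts at byte 16 and is aligned; with an offset of 0 it starts at byte 14 and is not.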

Finally, check the documentation for the driver to see if there are any performance or tuning hints. Some have sections describing how to improve the performance of the driver.

Related OS Parameters

There is one operating system configuration parameter that affects the number of connections that the network stack can make: NUM_FILES. This is because each socket is also an operating system file descriptor, so that it may be used in calls to functions like read() and ioctl(). As a result, the value of NUM_FILES limits the number of network connections that can be made.

Changing the value of NUM_FILES can have some other undesirable effects on the system though if certain limits are not observed. In VxWorks 5.4.x, and earlier, setting NUM_FILES greater than 256 could cause problems for code that used the select() function. The FD_SETSIZE was limited to 256 in these versions. In VxWorks AE 1.x, the value of FD_SETSIZE was increased to 512, allowing NUM_FILES to also be safely increased to 512. In VxWorks 5.5, this limitation was removed as the management of the FD sets for select() was changed to be more dynamic (the side effect of this is that the width parameter passed to select() should no longer be the constant FD_SETSIZE as that may not produce the desired behaviour).
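A common alternative to passing FD_SETSIZE as the width is to pass the highest descriptor in the set plus one. The sketch below shows that approach with two hypothetical helpers. It is written against the POSIX headers so that it can be tried on a host; on VxWorks, select() is declared in selectLib.h. It also assumes every descriptor is valid for the fd_set in use.

```c
#include <stddef.h>
#include <sys/select.h>

/* Fills 'set' with the given descriptors and returns the width to
 * pass to select(): the highest descriptor plus one.
 */
static int fdSetBuild (const int *fds, int count, fd_set *set)
{
    int i;
    int maxFd = -1;

    FD_ZERO (set);

    for (i = 0; i < count; i++)
    {
        FD_SET (fds[i], set);
        if (fds[i] > maxFd)
            maxFd = fds[i];
    }

    return maxFd + 1;
}

/* Waits until at least one descriptor is readable or the timeout
 * expires. On return, 'ready' holds the readable descriptors.
 */
static int waitReadable (const int *fds, int count,
                         struct timeval *timeout, fd_set *ready)
{
    int width = fdSetBuild (fds, count, ready);

    return select (width, ready, NULL, NULL, timeout);
}
```

For descriptors 3, 7 and 5, fdSetBuild() returns a width of 8, so select() only has to examine the first eight descriptors.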

Note: Due to the way that network connections must be managed by the stack, even when a socket is closed, the socket descriptor cannot be released until a fixed time has expired (2 * MSL). The MSL for VxWorks is defined as 30 seconds, and cannot be changed unless the source for the network stack is available. That means that 1 minute must pass from the time that a socket is closed until the time that its associated file descriptor is returned to the pool of available descriptors. How does this affect systems in practice? Let's look at an example. Assume that your system makes a new connection every second, closing the previous one each time. At any time after the first minute, there will be 61 file descriptors reserved (60 waiting on 2 * MSL timeouts, plus the one active one). So, the value chosen for NUM_FILES must be at least 61 plus other system requirements.

The default value for NUM_FILES is only 40, so without changes to the configuration, we would expect the system to stop working as required somewhere between 30 and 40 seconds of operation due to a lack of system file descriptors (the exact time will depend on the number of other active file descriptors). It will not return an error though. Instead, it will block until a file descriptor becomes available. That means that the observed behaviour will be bursts of correct operation, interspersed with pauses of several tens of seconds in length.
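The arithmetic in the note above generalises to other connection rates. The helper below is a hypothetical sketch of that calculation; it assumes a steady connection rate and that every closed socket holds its descriptor for the full 2 * MSL.

```c
/* Hypothetical helper: estimate the NUM_FILES value needed.
 *
 * connsPerSec - new connections opened (and old ones closed) per second
 * mslSeconds  - maximum segment lifetime (30 for VxWorks)
 * activeConns - connections open at any one time
 * otherFds    - descriptors needed by the rest of the system
 */
static int numFilesNeeded (int connsPerSec, int mslSeconds,
                           int activeConns, int otherFds)
{
    /* Descriptors still waiting out the 2 * MSL period */
    int waiting = connsPerSec * 2 * mslSeconds;

    return waiting + activeConns + otherFds;
}
```

For the example in the note, numFilesNeeded (1, 30, 1, 0) gives 61; doubling the connection rate to two per second raises this to 121.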

Measurement and Tuning

Perhaps the biggest problem with network performance tuning is determining where the bottleneck(s) are in the system. There are a number of possible places that could cause the overall performance to drop:

  • The application
  • The TCP/IP stack implementation
  • The driver
  • The network itself (e.g. sustained high traffic levels on the wire)
  • The remote system(s)

The send and receive speeds could also be affected by different things, so fixing something that is clearly a problem might not improve the overall performance as much as was expected.

Diagnostic Routines

There are a number of diagnostic routines that can be useful when monitoring network performance. Some are included in every VxWorks installation, and can be added by simply changing the configuration and rebuilding the kernel; others are external tools that will need to be downloaded.

Tool | Component or Location | Description
ifShow | INCLUDE_NET_SHOW | Displays information about the network interfaces in the system.
inetstatShow | INCLUDE_NET_SHOW, INCLUDE_TCP_SHOW | Displays the status of sockets in the system. Output is similar to that of the Unix netstat tool.
ipstatShow | INCLUDE_NET_SHOW | Displays statistics about the IP protocol. Pass TRUE as a parameter to reset the statistics (useful when benchmarking, for example, to measure only the statistics for a particular set of operations).
netStackDataPoolShow | INCLUDE_NET_SHOW | Displays statistics on the usage of mBlks and clusters in the network stack's data pool.
netStackSysPoolShow | INCLUDE_NET_SHOW | Displays statistics on the usage of mBlks and clusters in the network stack's system pool.
mbufShow | INCLUDE_NET_SHOW | Displays mbuf usage statistics. See the VxWorks Network Programmer's Guide for more information about mbufs.
tcpDebugShow | INCLUDE_TCP_SHOW, INCLUDE_TCP_DEBUG | Displays detailed debugging information from the TCP protocol. To enable collection of debug information, turn on the SO_DEBUG option for the socket(s) you are interested in monitoring.
tcpstatShow | INCLUDE_TCP_SHOW | Displays detailed statistics about the TCP protocol layer.
netPoolShow | INCLUDE_NET_SHOW | Lower level routine for displaying mBlk and cluster usage in a pool. Useful for monitoring the usage in the pools used by a particular device. See the devPoolShow() code below, which uses this routine to show the pool for a specified device.
udpstatShow | INCLUDE_NET_SHOW | Displays detailed statistics about the UDP protocol layer.
netperf | http://www.netperf.org | Benchmarking software for both TCP and UDP performance measurement. There have been ports to VxWorks done in the past, but these do not appear to be available online anymore. Since the VxWorks network stack is based on the BSD 4.4 stack though, new ports from the latest sources should not be difficult.

There is no routine included in VxWorks by default that can show the usage statistics for the mBlk and clusters in the driver pool associated with a particular network interface. The following simple example should achieve this:

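#include "vxWorks.h"
#include "ctype.h"
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include "end.h"
#include "muxLib.h"     /* endFindByName(); header may differ by release */
#include "netBufLib.h"  /* netPoolShow(); header may differ by release */

void devPoolShow (
        const char * devName)
{
        END_OBJ *    pEndObj;
        int          unit = 0;  /* Assume unit 0 if none specified */
        char         device [16];
        size_t       devLen;

        /* Reject NULL, empty or over-long names before copying */

        if (devName == NULL || (devLen = strlen (devName)) == 0 ||
            devLen >= sizeof (device))
        {
                printf ("Invalid device name\n");
                return;
        }

        /* Take a private copy to modify */

        strcpy (device, devName);

        /* The unit number here is assumed to be just a single
        * digit; this assumption should be OK for most people.
        */

        if (isdigit ((unsigned char) device [devLen - 1]))
        {
                unit = atoi (&device [devLen - 1]);

                /* Remove the number now */

                device [devLen - 1] = '\0';
        }

        if ((pEndObj = endFindByName (device, unit)) != NULL)
        {
                netPoolShow (pEndObj->pNetPool);
        }
        else
        {
                printf ("Could not find device %s%d\n",
                        device, unit);
        }
}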

To use this, pass it a device name of the form dc0 or fei. If no unit number is specified, a unit number of zero is assumed. To support unit numbers of more than one digit, some changes will be needed to the code (or it might be simpler to pass the device name and the unit number as separate parameters).
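One way to handle unit numbers of more than one digit is to walk back over all of the trailing digits before splitting the name. The helper below is a hypothetical sketch of that idea in plain C, so it can be tried on a host before being used on the target. Note that it would mis-split a driver whose own name ends in a digit.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits "fei12" into "fei" and 12.
 * The unit defaults to 0 when the name has no trailing digits.
 * Returns 0 on success, or -1 if an argument is NULL, the name has
 * no driver part, or the driver part does not fit in 'drv'.
 */
static int devNameSplit (const char *devName, char *drv,
                         size_t drvSize, int *unit)
{
    size_t len;
    size_t nameLen;

    if (devName == NULL || drv == NULL || unit == NULL)
        return -1;

    len = strlen (devName);
    nameLen = len;

    /* Walk back over any trailing digits */
    while (nameLen > 0 && isdigit ((unsigned char) devName[nameLen - 1]))
        nameLen--;

    if (nameLen == 0 || nameLen >= drvSize)
        return -1;

    memcpy (drv, devName, nameLen);
    drv[nameLen] = '\0';

    *unit = (nameLen < len) ? atoi (devName + nameLen) : 0;

    return 0;
}
```

The resulting driver name and unit number can then be passed straight to endFindByName() in place of the single-digit parsing.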

Socket Performance Tuning

There are a number of parameters that can be tuned on a per-socket basis to improve performance:

Receive and Transmit Windows:

The receive and transmit windows control how much information can be queued up in the network stack for a particular socket. These are controlled by the SO_RCVBUF and SO_SNDBUF socket options. The defaults for these values are shown in the table below:

Protocol | Receive Buffer (bytes) | Transmit Buffer (bytes)
TCP | 8192 | 8192
UDP | 41600 | 9216

Changing these values can be achieved using code like this example, which changes the TCP receive window:

int wndw = 16384;   /* New receive buffer size, in bytes */

/* sock is an open socket descriptor */

setsockopt (sock, SOL_SOCKET, SO_RCVBUF, (char *) &wndw, sizeof (wndw));
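It can be worth reading the value back after setting it, since a stack may adjust or reject the requested size. The helper below is a hypothetical sketch written against the POSIX socket headers; on VxWorks 5.x the last argument to getsockopt() is an int pointer rather than socklen_t, so the declaration of len would need to change.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: sets SO_RCVBUF on 'sock' and reads it back.
 * Returns the size reported by the stack, or -1 on failure.
 */
static int recvBufSet (int sock, int bytes)
{
    int       actual = 0;
    socklen_t len    = sizeof (actual);

    if (setsockopt (sock, SOL_SOCKET, SO_RCVBUF,
                    (char *) &bytes, sizeof (bytes)) != 0)
        return -1;

    if (getsockopt (sock, SOL_SOCKET, SO_RCVBUF,
                    (char *) &actual, &len) != 0)
        return -1;

    return actual;
}
```

The value returned is whatever the stack reports, which is not guaranteed to equal the value requested; some host operating systems report a larger figure than was asked for.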

There are also some undocumented global variables that can be changed to modify the defaults that are used for all sockets. Any new sockets created after the change will use the new values. As ever though, when using undocumented features be aware that your code may not work in future releases of the operating system. The table below shows the variables you can change:

Protocol | Receive Buffer | Transmit Buffer
TCP | tcp_recvspace | tcp_sendspace
UDP | udp_recvspace | udp_sendspace

Nagle Algorithm:

The Nagle algorithm was designed to reduce the number of small packets being sent on the network. It holds back small writes until either there is enough data in the send queue to fill a packet, or all of the data sent previously has been acknowledged. This is usually better for applications where throughput is important, but worse for applications where minimising the delay between the send and the receive is critical.

Luckily, the algorithm can be disabled on a per-socket basis using the TCP_NODELAY option. Here is an example of the code fragment you will need to disable the Nagle algorithm:

int opt = 1;        /* Non-zero disables the Nagle algorithm */

/* sock is an open TCP socket descriptor */

setsockopt (sock, IPPROTO_TCP, TCP_NODELAY, (char *) &opt, sizeof (opt));

Dynamically Changing Interface Settings

It is relatively simple to change the IP address and mask settings of a given interface in VxWorks on-the-fly, should it be necessary to do so. Here is a short function that encapsulates the necessary steps for changing the IP address and mask of a specific interface.

#include "vxWorks.h"
#include "stdio.h"
#include "string.h"
#include "ifLib.h"      /* ifRouteDelete(), ifMaskSet(), ifAddrSet() */
#include "ipProto.h"    /* ipAttach(), ipDetach() */

STATUS ifAddrAndMaskChange (
        char* drv,        /* Driver name, e.g. "dc" */
        int unit,         /* Unit number */
        char * addr,      /* New IP address, in string form */
        UINT32 mask)      /* New netmask, in numeric form */
{
        char dev [32];

        /* Make sure the device name will fit in the buffer */

        if (drv == NULL || addr == NULL || unit < 0 ||
            strlen (drv) >= 16)
        {
                return ERROR;
        }

        /* Delete any routes using this interface */

        if (ifRouteDelete (drv, unit) == ERROR)
        {
                return ERROR;
        }

        /* Detach the interface from the TCP/IP stack for now */

        if (ipDetach (unit, drv) == ERROR)
        {
                return ERROR;
        }

        /* Re-attach it to the IP stack as a fresh interface */

        if (ipAttach (unit, drv) == ERROR)
        {
                return ERROR;
        }

        /* Build a device name string from the driver name & unit */

        sprintf (dev, "%s%d", drv, unit);

        /* Set the netmask (must happen before the address) */

        if (ifMaskSet (dev, mask) == ERROR)
        {
                return ERROR;
        }

        /* Set the address */

        return ifAddrSet (dev, addr);
}

To use the routine, simply pass it the driver name, unit number and new settings. Here's an example, called from the shell, that changes the IP address of dc0 to 192.168.1.100 and uses a netmask value of 255.255.255.0. Notice that the mask is passed in hexadecimal notation and not in dot format as a string.


-> ifAddrAndMaskChange "dc", 0, "192.168.1.100", 0xffffff00
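If the mask is only available in dot format, it can be converted to the numeric form first. The helper below is a hypothetical sketch using inet_addr() and ntohl(), written against the POSIX headers; on VxWorks the equivalent declarations are in inetLib.h, and inet_addr() takes a non-const char pointer. Be aware that inet_addr() signals an invalid string with the same value as 255.255.255.255, so this sketch cannot tell the two apart.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: converts a dotted mask such as
 * "255.255.255.0" to its numeric form in host byte order.
 */
static unsigned long maskFromDotted (const char *dotted)
{
    return (unsigned long) ntohl (inet_addr (dotted));
}
```

For example, maskFromDotted ("255.255.255.0") gives 0xffffff00, which is the value passed in the shell example above.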

Socket Programming

Basic Client Sequence

  • Add example

Basic Server Sequence

  • Add example

Advanced Server Sequence

  • Multi-threaded server example

Socket Options

  • SO_REUSEADDR
  • SO_LINGER
  • SO_KEEPALIVE
  • TCP_NODELAY
  • Others?

select() vs Multiple Tasks

  • Using select()
  • Issues with selection
  • Using select() with sockets and pipes
  • Using I/O tasks & shared memory

Address Resolution Protocol (ARP)

What is ARP?

Using ARP

Changing the ARP Keep Time

By default, the VxWorks implementation of ARP will keep address mappings in its cache for 20 minutes. There is no API that allows this to be changed, but the variable that controls it is a global one, so its value could be changed.

Warning: Changing values like this is not portable to other operating systems, nor necessarily to future versions of VxWorks. User beware.

To change how long mappings stay alive in the cache, use the following code sequence:

#include "vxWorks.h"
#include "arpLib.h"

extern int arpt_keep;

void arpKeepTimeSet (
        int seconds)      /* New keep time */
{
        arpt_keep = seconds;
        arpFlush();
}

You will need to have the ARP API component, INCLUDE_ARP, configured into your kernel for this to work. Since the new time will only affect new entries added to the cache, the code above calls the arpFlush() function to purge pre-existing mappings from the cache immediately. When they are needed again, they will be re-inserted into the cache with the new keep time.

Basic Routing

  • Adding a Default Route
  • Others?

DNS

  • Setting up the resolver
  • Resolving a name

END Drivers

  • How to write one
  • Differences from a BSD driver

802.11 Wireless Networking

Stations

Access Points

Multicast

  • Using multicast with VxWorks
  • Others?

IPv6 Support

Client Protocols

DHCP

  • Obtaining target IP address using DHCP

FTP

  • Example using the ftpLib functions

TFTP

  • Transferring data using TFTP

SNTP

  • Example of setting the system clock using the SNTP client

Server Protocols

DHCP

  • Configuring VxWorks as a DHCP server
  • Maintaining lease information correctly over reboots

FTP

  • Configuring the VxWorks FTP server
  • Adding basic authentication

TFTP

  • Configuring the TFTP server

HTTP

  • Adding an HTTP server to VxWorks

SNTP

  • Configuring the SNTP server

Network Security Protocols

-- JohnGordon - 15 Apr 2003

 
 
© 2003-5 blueDonkey.org, except where otherwise noted. All rights reserved.