
Bandwidth testing of LACP bonding link in Linux with iperf

Validating our multi-channel ethernet teams on Debian Lenny & Ubuntu Lucid Lynx

Over the past two months my company (xtendx AG) moved our servers to a new data center at green.ch. One of the primary motivations for this move was to gain access to a multi-gigabit-per-second Internet link. Each of our production streaming servers has a 2- or 4-channel Ethernet bonding configuration with LACP. Once they were configured, I set out to test their capacity and validate the entire design and configuration.

To that end, I installed iperf on each of our servers. One box was configured as the server and two others as clients. Because of the way LACP splits up traffic, it is very difficult, if not impossible, to set up a single server-to-server test that uses more than 1Gbps. Generally, LACP achieves multi-link speeds by distributing traffic across the links per client-server address pair, so any one pair stays on a single link.

Our setup here is fairly simple: three servers, each with either a 2-channel or 4-channel LACP ethernet bonding setup connected to a lone HP ProCurve 2510G-24 switch. The switch was manually configured to place the ports into a dynamic LACP bond. The bonds themselves are configured with the kernel module parameters mode=4 miimon=100 max_bonds=4 xmit_hash_policy=1
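
Before load testing, it can help to confirm that each bond actually came up in 802.3ad mode with the expected number of member links. The following is a minimal sketch, assuming the usual /proc/net/bonding/bondN text layout of the Linux bonding driver; the function names are made up for illustration.

```shell
# Sketch only: function names are hypothetical, not part of any tool.

# Count the member interfaces listed in /proc/net/bonding-style text on stdin.
count_bond_slaves() {
    grep -c '^Slave Interface:' || true
}

# Succeed if the text on stdin reports 802.3ad (mode=4) as the bonding mode.
is_lacp_bond() {
    grep -q '^Bonding Mode: IEEE 802.3ad'
}

# Example use on a live machine (commented out; needs the bonding driver):
#   count_bond_slaves < /proc/net/bonding/bond0
#   is_lacp_bond < /proc/net/bonding/bond0 && echo "bond0 is in LACP mode"
```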

Running iperf in server mode. Note that iperf uses port 5001 by default, so adjust your firewalling solution if necessary.

iperf -s -i 2

Running iperf in client mode. This was done on two physically separate machines.

cat /dev/zero |  iperf -c svr.example.com -t 2400 -i 2

2 x 1Gbps link LACP bandwidth graph

Yup, looks good. There was a possibility that both clients would come in on the same link, because the decision about which channel to use is based on the source and destination addresses. That is by design, so don't fret! Simply using a different server for one of the clients would resolve the issue.
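
To illustrate why two clients can land on the same link, here is a rough sketch of an address-pair hash. It is modelled on the "layer2" formula described in the Linux bonding documentation (source MAC XOR destination MAC, modulo the link count) and simplified to the last octet of each address. The switch's own algorithm may well differ, and the MAC addresses and function name are invented for the example.

```shell
# Hypothetical helper: pick a link index for a source/destination MAC pair.
# Assumption: a simplified layer2-style hash using only the last octet.
link_for_pair() {
    src=${1##*:}    # last octet of the source MAC, as two hex digits
    dst=${2##*:}    # last octet of the destination MAC
    echo $(( (0x$src ^ 0x$dst) % $3 ))
}

server=00:11:22:33:44:02
link_for_pair 00:11:22:33:44:01 "$server" 2   # client A -> link 1
link_for_pair 00:11:22:33:44:03 "$server" 2   # client B -> link 1 (collision)
link_for_pair 00:11:22:33:44:04 "$server" 2   # client C -> link 0
```

The same pair always maps to the same link, which is why a single client-server pair cannot exceed one link's speed.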

Exploring DRBD: Notes for a Newbie

For your reference, and mine!

Over the past month I've been converting an existing pair of NFS/ext4 file servers (live & hot-spare with regular rsync-based synchronization) to a two-node protocol C DRBD high availability cluster. It's been a learning experience, to say the least. And while the DRBD documentation is excellent, there were some concepts and tasks that did not immediately sink in. The notes below are for my future reference, and for anyone else who is new to DRBD.

  • Making a DRBD resource mount on boot is not trivial. I've left my fstab entry set to noauto. Why? The thought of a node rebooting, becoming stale, and then automatically setting itself to primary scares me!

  • A DRBD resource in the secondary role (protocol C here) is not mountable. You'll get errors like:
    root@umhlanga:/home/stu# mount /mnt/data
    mount: block device /dev/drbd1 is write-protected, mounting read-only
    mount: Wrong medium type
  • If you want a DRBD resource to come up as primary from a cold boot, then setting up a heartbeat is a requirement. But you don't have to do this for a basic, first-step HA cluster.
  • Add the startup timeout options to the configuration file. If your kit is at a data center, a boot without them can sit there waiting until done, and you may end up making a site visit because of a hung boot. My /etc/drbd.conf snippet:
    common {
        protocol C;
        startup {
            # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb;
            wfc-timeout 600;
            degr-wfc-timeout 600;
            outdated-wfc-timeout 600;
        }
    }
  • A fundamental task to perform is querying the state of the drbd resource(s). There are two ways to do this. Example:
    stu@umhlanga:~$ cat /proc/drbd 
    version: 8.3.7 (api:88/proto:86-91)
    GIT-hash: ... build by root@umhlanga, 2010-07-28 11:28:28
     1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r----
        ns:0 nr:0 dw:0 dr:0 al:0 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:9765272500
    stu@umhlanga:~$ 
    stu@umhlanga:~$ sudo drbd-overview
    [sudo] password for stu: 
      1:r0  WFConnection Primary/Unknown UpToDate/DUnknown C r---- /mnt/data ext4 9.0T 1.5T 7.5T 17% 
    stu@umhlanga:~$ 
    
    

    This output contains lots of important information. The details of each method and its output can be found on DRBD's documentation page 'Checking DRBD Status'.

  • The high level steps required to bring up a DRBD resource:
    1. kernel starts
    2. network starts
    3. DRBD starts
    4. bring up DRBD resource
    5. make primary (if it is the primary)
    6. mount resource
    And presto! The DRBD resource is ready for use.

  • After creating a resource and rebooting, drbdadm up resource-name may produce the message below. Don't follow the instructions in the error message:
    drbdadm up r0 
    1: Failure: (124) Device is attached to a disk (use detach first)
    Command 'drbdsetup 1 disk /dev/sdb /dev/sdb internal --set-defaults 
        --create-device' terminated with exit code 10
    
  • If you, like me, are exporting the mounted file system with NFS, then be careful with the auto-mount configuration in /etc/fstab.
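
The status output and the note that a secondary is not mountable suggest a simple guard before any manual or scripted mount. This is a minimal sketch, assuming the /proc/drbd line format shown in the notes above; drbd_field and safe_to_mount are hypothetical helper names, not DRBD commands.

```shell
# Sketch only: helper names are hypothetical, not part of DRBD.

# Print the value of one status field (cs, ro or ds) for the first
# resource found in /proc/drbd-style text read from stdin.
drbd_field() {
    sed -n "s/.* $1:\([^ ]*\).*/\1/p" | head -n 1
}

# Succeed only if the local node is Primary and its disk is UpToDate.
# $1 = full status text (for example the contents of /proc/drbd).
safe_to_mount() {
    role=$(printf '%s\n' "$1" | drbd_field ro)
    disk=$(printf '%s\n' "$1" | drbd_field ds)
    [ "${role%%/*}" = "Primary" ] && [ "${disk%%/*}" = "UpToDate" ]
}

# Example use on a live node (commented out; needs DRBD loaded):
#   if safe_to_mount "$(cat /proc/drbd)"; then mount /mnt/data; fi
```

On a node that passes the check, the bring-up steps listed above correspond to drbdadm up r0, drbdadm primary r0 and then mount /mnt/data.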

Spooky mcelog messages: MCA: Generic CACHE Level-2 Generic Error

New error message haunting my fresh Debian Lenny installation

In mid-June I installed Debian Lenny on an old HP DL380 of ours. Since then there have been six of these Generic CACHE Level-2 Generic Error messages:

MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 6 BANK 3 TSC 1e600ed894af4
ADDR 1719edeb0 
MCG status:
MCi status:
Error enabled
MCi_ADDR register valid
MCA: Generic CACHE Level-2 Generic Error
STATUS 942000420001010a MCGSTATUS 0
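
Some of the flags can be read straight out of the STATUS value. This is a minimal sketch, assuming the register follows the IA32_MCi_STATUS layout in Intel's architecture manuals (bit 63 valid, bit 61 uncorrected, bit 60 error enabled, bit 58 address valid, bit 57 processor context corrupt). The helper name is made up for illustration.

```shell
# Hypothetical helper: print bit N (32..63) of a 16-hex-digit status value.
# The value is split as a string so no 64-bit shell arithmetic is needed.
mci_status_bit() {
    hi=${1%????????}                       # upper 32 bits (first 8 hex digits)
    echo $(( (0x$hi >> ($2 - 32)) & 1 ))
}

status=942000420001010a                    # STATUS from the log above
echo "VAL=$(mci_status_bit "$status" 63) UC=$(mci_status_bit "$status" 61)" \
     "EN=$(mci_status_bit "$status" 60) ADDRV=$(mci_status_bit "$status" 58)" \
     "PCC=$(mci_status_bit "$status" 57)"
# prints: VAL=1 UC=0 EN=1 ADDRV=1 PCC=0
```

Under that assumed layout, EN=1 and ADDRV=1 match the "Error enabled" and "MCi_ADDR register valid" lines in the log, and UC=0 with PCC=0 would point to an error the hardware corrected. It says nothing about whether more are on the way.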

Shortly after installing Debian, I also dropped in a second quad core CPU and doubled the RAM to 8GB. Coincidence? Probably not. Then again, I don't have ready access to the old RHEL4 system logs. (They are with our old hosting vendor.)

What exactly does this mean? Is it a recoverable condition? Is it a warning sign of future problems to come?

Spooky.

Simple linux bandwidth monitoring with bwm-ng

Where has this tool been all my life?

These past few days I've been running some load tests on new servers, configurations and our Internet link. While we have a Cacti installation graphing everything via SNMP, the 5 minute polling interval means waiting for what seems like forever.

I just want some quick and clean bandwidth statistics!

After some years of needing a simple tool, finally bwm-ng has entered my life. Where have you been all these years, bwm-ng? I assume that bwm-ng is short for "bandwidth monitor, next generation."

bwm-ng in action

My only complaint is that when monitoring network interfaces, Ethernet bonds are treated as equals of the real interfaces rather than as the virtual cumulative devices they are.

Other than that, this tool rocks. Some features that are most pleasing:

  • Displays different units: bps to Mbps, Bps to MBps
  • Network interfaces and hard drives
  • Simple, clean and easy to comprehend user interface
  • 100ms polling interval granularity
  • apt-get installation on Debian Lenny

#geekloveatfirstsight

Multiple bonds on Debian Lenny, and related No such device error

SIOCSIFADDR: No such device on Debian Lenny with NIC bonding

While setting up a few Debian Lenny machines this summer, I came across the error message SIOCSIFADDR: No such device a few times. All of these servers have NIC bonding configured, and a few of them have multiple Ethernet bonds. Here are a couple of potential causes for this error message:

  • The /etc/modprobe.d/arch/X86_64 file does not contain the bonded device name. For multiple bonded devices, the file must contain an alias entry for each. Here is an example for a system with two bonded devices named bond0 and bond1:
    
    alias bond0 bonding
    alias bond1 bonding
    
    
  • The bonding entry in /etc/modules (bonding mode=4 miimon=100 max_bonds=2) is configured for fewer bonds than the server has. Here, max_bonds=2 is the maximum number of bonding devices your system will have. The default is 1. If your machine has more, then SIOCSIFADDR: No such device will appear for the devices that did not come up.
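
Both causes come down to a mismatch between the number of bond aliases and the max_bonds value, which is easy to check mechanically. This is a minimal sketch, assuming the alias and module-option formats shown above; the function names are hypothetical.

```shell
# Sketch only: function names are hypothetical.

# Count "alias bondN bonding" lines in modprobe-style text on stdin.
count_bond_aliases() {
    grep -c '^alias bond[0-9][0-9]* bonding' || true
}

# Extract max_bonds from a line such as "bonding mode=4 miimon=100 max_bonds=2".
# Prints 1 (the default mentioned above) when the option is absent.
max_bonds_of() {
    value=$(printf '%s\n' "$1" | sed -n 's/.*max_bonds=\([0-9][0-9]*\).*/\1/p')
    echo "${value:-1}"
}

# Succeed if the alias count ($1) fits within max_bonds of the module line ($2).
bonds_fit() {
    [ "$1" -le "$(max_bonds_of "$2")" ]
}

# Example use (commented out; file locations depend on your system):
#   bonds_fit "$(count_bond_aliases < /path/to/alias/file)" \
#             "$(grep '^bonding' /etc/modules)" || echo "raise max_bonds"
```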