Sun Quad NICs and x86_64 kernels

After the last post, in which I built and installed my new Dynamips server along with a set of Sun Quad NIC cards (501-4366, also known as HME or Happy Meal), I started to run into some issues.

After building a simple topology with one router connected to my 3550, each device showed up in the other’s CDP table, which was good. It wasn’t until today, when I tried to lab something up, that I found two routers connected via that external 3550 (using 2 ports on the quad NIC) would not form a neighbourship. I checked everything, and sure enough they couldn’t even ping each other.

I checked the ARP table and the entries were all present and correct. When I ran a set of debugs between the routers, the packets were arriving, but something was wrong with them because the router was simply dropping them.

My next step was to rule out Dynamips and its Pcap wizardry and find out whether the card itself was at fault. After assigning an IP to the card and pinging the Vlan1 interface on the 3550, I was getting ping responses, but ping was complaining that the packet differed from what it expected (see below):

[email protected]:~> ping 10.1.2.10
PING 10.1.2.10 (10.1.2.10) 56(84) bytes of data.
64 bytes from 10.1.2.10: icmp_seq=1 ttl=64 time=2.20 ms
wrong data byte #54 should be 0x36 but was 0xba
16     10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 
48     30 31 32 33 34 35 ba cc 
64 bytes from 10.1.2.10: icmp_seq=2 ttl=64 time=0.223 ms
wrong data byte #54 should be 0x36 but was 0x64
16     10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 
48     30 31 32 33 34 35 64 d8

After further troubleshooting I could see that the byte it complained about was always 2 less than the size of the datagram (byte 54 in this case, for a 56-byte datagram).
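
The offset pattern described above can be checked across several payload sizes with a small helper. This is a minimal sketch, not the method used in this post: the function names `first_wrong_byte`, `two_from_end` and `probe` are made up for illustration, and it assumes a `ping` that prints `wrong data byte #N` lines like the output above and accepts the `-c` (count) and `-s` (payload size) options.

```shell
# first_wrong_byte: read ping output on stdin and print the offset from the
# first "wrong data byte #N" line, or nothing if no such line is present.
first_wrong_byte() {
    sed -n 's/^wrong data byte #\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# two_from_end SIZE OFFSET: succeed when OFFSET is exactly SIZE - 2.
two_from_end() {
    [ "$2" -eq $(( $1 - 2 )) ]
}

# probe TARGET SIZE...: send one echo request per payload size and report
# where the first bad byte was seen. Nothing runs until you call it.
probe() {
    target=$1
    shift
    for size in "$@"; do
        offset=$(ping -c 1 -s "$size" "$target" 2>/dev/null | first_wrong_byte)
        if [ -z "$offset" ]; then
            echo "size $size: no corruption reported"
        elif two_from_end "$size" "$offset"; then
            echo "size $size: first bad byte at $offset (size - 2)"
        else
            echo "size $size: first bad byte at $offset"
        fi
    done
}
```

Something like `probe 10.1.2.10 56 100 200` would then show whether the bad byte keeps tracking the end of the payload as the size changes.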

I spent all afternoon scouring reports of this issue and finally came to the conclusion that it is only present on systems that run a 64-bit kernel and have more than 2GB of RAM installed. I meet both of these criteria, and if I remove 6GB to take me down to 2GB the system works fine.
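
To see quickly whether a machine meets both conditions, a check along these lines may help. It is only a sketch: the function names are hypothetical, the 2GB threshold is an assumption taken from the observation above that the card behaved with 2GB installed, and it relies on `uname -m` reporting `x86_64` for a 64-bit kernel and on `/proc/meminfo` being available.

```shell
# mem_total_kb [FILE]: print the MemTotal value (in kB) from a meminfo-style
# file. FILE defaults to /proc/meminfo.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# likely_affected ARCH MEM_KB: succeed when the kernel is x86_64 and the
# machine has more than 2GB (2 * 1024 * 1024 kB) of RAM. The threshold is
# an assumption, not a value taken from the driver.
likely_affected() {
    [ "$1" = "x86_64" ] && [ "$2" -gt $((2 * 1024 * 1024)) ]
}

# report: run the check against the current system. Not called automatically.
report() {
    arch=$(uname -m)
    mem_kb=$(mem_total_kb)
    if likely_affected "$arch" "$mem_kb"; then
        echo "$arch kernel with ${mem_kb} kB RAM: matches the conditions"
    else
        echo "$arch kernel with ${mem_kb} kB RAM: does not match"
    fi
}
```

Calling `report` on the box holding the quad NIC gives a one-line answer before you commit to rebuilding a kernel.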

 

How to fix the issue

I am not going to guide you through compiling your own kernel, as it is different for pretty much every distro. Make a friend of Google and search for something along the lines of ‘compile new kernel ubuntu’.

Once you have your kernel source, navigate to the ‘drivers/net/ethernet/sun’ folder, where you should find a sunhme.c file. Download the patch from here and run it like this:

patch -p0 < sunhme.patch

Once you have done this it will ask you which file to patch. Tell it sunhme.c, and then you are ready to compile and reboot into your new custom kernel.
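
If you would rather know that the patch applies before anything is modified, a dry run first is one option. The sketch below assumes GNU `patch` (for `--dry-run` and the `patch ORIGINALFILE PATCHFILE` form); the function names `locate_driver` and `apply_checked` are hypothetical. Searching for sunhme.c is included because older source trees may keep it directly under drivers/net rather than drivers/net/ethernet/sun.

```shell
# locate_driver SRC_ROOT: print the path(s) to sunhme.c under a kernel
# source tree, wherever that kernel version keeps it.
locate_driver() {
    find "$1" -type f -name sunhme.c
}

# apply_checked PATCH TARGET: dry-run first, and only modify TARGET if every
# hunk applies. Naming TARGET explicitly avoids the "File to patch:" prompt.
apply_checked() {
    patchfile=$1
    target=$2
    if patch --dry-run "$target" "$patchfile" >/dev/null; then
        cp "$target" "$target.orig"    # keep an untouched copy
        patch "$target" "$patchfile"
    else
        echo "patch does not apply cleanly to $target; edit it by hand" >&2
        return 1
    fi
}
```

For example, `apply_checked sunhme.patch "$(locate_driver /usr/src/linux)"` would patch the driver only when the dry run succeeds; `/usr/src/linux` is a placeholder for wherever your kernel source lives.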

 

 

After I compiled and booted into my new kernel the card worked perfectly. I can now form neighbourships, ping without issues and do all the other fun L3 stuff you could possibly want to do 😛

I hope this makes the issue easier to resolve for anyone else who experiences it, as it took me quite some time to piece together what was wrong and the best way to go about fixing it.

thanks,

David

28 thoughts on “Sun Quad NICs and x86_64 kernels”

  1. hi. i have issue too.

    same issue.

    but i can’t fix the issue.

    i can’t find the sunhme.c file on my system. (there is only a sunhme.ko file in drivers/net) (i tried with sunhme.ko but it failed)

    please help me.

    (sorry, i can speak eng a little)

    • Hi Kwonsun, it sounds like you do not have the kernel source downloaded. .ko files are loadable kernel modules that have already been compiled to work with your kernel.

      If you search Google for assistance in recompiling your kernel I’m sure you will be able to find someone who has already documented what to do. Just remember to apply my patch before you compile the new kernel.

      thanks,

      David

  2. Hi,

    I stumbled on your post while troubleshooting my darn quad nic sun cards.

    I got the same issues as you: CDP works fine and ARP finds the MAC addresses, but I can’t ping.

    I’m using Debian squeeze x64.

    I want to try your patch; the problem is that the link is broken.

    So could you upload it somewhere else?

    Thank you in advance

    • Hi Nikos, I have updated the link, give it a try now.

      There is no reason the patch shouldn’t work; as far as I can tell there should be no difference between Ubuntu and Debian.

      thanks,

      David

  3. [[email protected] net]# patch -p1 -i sunhme.patch
    (Stripping trailing CRs from patch.)
    can't find file to patch at input line 3
    Perhaps you used the wrong -p or --strip option?
    The text leading up to this was:
    |--- sunhme.c.dist 2011-04-26 21:24:34.509574001 +0100
    |+++ sunhme.c 2011-04-26 16:03:15.429692214 +0100
    File to patch: sunhme.c
    patching file sunhme.c
    Hunk #1 FAILED at 1265.
    Hunk #2 FAILED at 2031.
    2 out of 2 hunks FAILED -- saving rejects to file sunhme.c.rej

  4. Pingback: My GNS3/Virtual Box/CCIE Lab Build « tweqouu

  5. [email protected]:/d1/development/kernel/maverick/source/drivers/net$ sudo patch -p1 -i sunhme.patch
    can't find file to patch at input line 3
    Perhaps you used the wrong -p or --strip option?
    The text leading up to this was:
    |--- sunhme.c.dist 2011-04-26 21:24:34.509574001 +0100
    |+++ sunhme.c 2011-04-26 16:03:15.429692214 +0100
    File to patch: sunhme.c
    patching file sunhme.c
    Hunk #1 FAILED at 1265.
    Hunk #2 FAILED at 2031.
    2 out of 2 hunks FAILED -- saving rejects to file sunhme.c.rej

    • If you are having issues getting the patch to work you can always carry out the changes yourself. If you look in the patch file, the lines marked with (-) are the ones you need to change, and after the change they should look like the lines prefixed with (+).

    • [email protected]:/d1/development/kernel/maverick/source/drivers/net$ sudo patch -p1 -i sunhme.patch
      can't find file to patch at input line 3
      Perhaps you used the wrong -p or --strip option?
      The text leading up to this was:
      |--- sunhme.c.dist 2011-04-26 21:24:34.509574001 +0100
      |+++ sunhme.c 2011-04-26 16:03:15.429692214 +0100
      File to patch: sunhme.c
      patching file sunhme.c
      Hunk #1 FAILED at 1265.
      Hunk #2 FAILED at 2031.
      2 out of 2 hunks FAILED -- saving rejects to file sunhme.c.rej

      • Read my comment… edit the sunhme.c file manually if you are having issues.

        Of course, if you don’t actually have the sunhme.c file then you have a bigger issue than my patch not working…

  6. Thanks David,

    I could compile a new kernel for Fedora 17 with your patch. Thank you very much. It works great. I patched it manually.

    Regards
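
For anyone taking the manual route described in the comments above, it can help to pull just the changed lines and the hunk positions out of the patch file before opening sunhme.c. The following is a rough sketch with hypothetical function names; it assumes a unified diff whose file headers look like `--- name` and `+++ name`, as in the output quoted in the comments.

```shell
# list_changes PATCH: print only the removed (-) and added (+) lines of a
# unified diff, skipping the "--- old" / "+++ new" file header lines.
# Caveat: a removed line whose own text starts with "-- " would be skipped too.
list_changes() {
    grep -E '^[-+]' "$1" | grep -vE '^(---|\+\+\+) '
}

# hunk_starts PATCH: print the line number in the original file at which
# each hunk begins, taken from the "@@ -N,M +N,M @@" markers.
hunk_starts() {
    sed -n 's/^@@ -\([0-9][0-9]*\).*/\1/p' "$1"
}
```

`hunk_starts sunhme.patch` tells you roughly where to look in the file, and `list_changes sunhme.patch` shows what each (-) line should become.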
