After the last post, in which I built and installed my new Dynamips server along with a set of Sun Quad NIC cards (501-4366, also known as HME or Happy Meal), I started to run into some issues.
After building a simple topology with one router connected to my 3550, each device showed up in the other’s CDP table, which was good. The problem only surfaced today, when I was trying to lab something up: two routers connected via that external 3550 (using 2 ports on the quad NIC) would not form a neighbourship. I checked everything, and sure enough they couldn’t even ping each other.
I checked the ARP tables and the entries were all present and correct. When I ran a set of debugs between the routers I could see the packets arriving, but something was wrong with them because the router was simply dropping them.
My next step was to take Dynamips and its pcap wizardry out of the picture to find out whether the card itself was at fault. After assigning an IP to the card and pinging across to the Vlan1 interface on the 3550 I was getting ping responses, however ping was complaining that the packet data was different from what it was expecting (see below):
[email protected]:~> ping 10.1.2.10
PING 10.1.2.10 (10.1.2.10) 56(84) bytes of data.
64 bytes from 10.1.2.10: icmp_seq=1 ttl=64 time=2.20 ms
wrong data byte #54 should be 0x36 but was 0xba
16 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
48 30 31 32 33 34 35 ba cc
64 bytes from 10.1.2.10: icmp_seq=2 ttl=64 time=0.223 ms
wrong data byte #54 should be 0x36 but was 0x64
16 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
48 30 31 32 33 34 35 64 d8
After further troubleshooting I could see that the byte it was complaining about was always 2 less than the size of the datagram (byte 54 in this case, for a 56-byte datagram).
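The "wrong data byte" messages are easier to read once you know that Linux ping normally fills its payload with a counting pattern, so byte N should hold the value N (which is why byte #54 "should be 0x36"). The following is only a rough sketch of that check, not ping's actual code; the function name `find_wrong_bytes` is made up for illustration, and the starting offset of 16 is taken from the dump in the post.

```python
def find_wrong_bytes(dump_hex, first_offset=16):
    """Return (offset, expected, actual) for every byte that breaks the pattern.

    dump_hex is the hex dump printed by ping, without the leading offset
    numbers. first_offset is the payload offset of the first dumped byte.
    """
    wrong = []
    for index, value in enumerate(bytes.fromhex(dump_hex)):
        offset = first_offset + index
        expected = offset & 0xFF  # assumed fill pattern: byte N holds N mod 256
        if value != expected:
            wrong.append((offset, expected, value))
    return wrong


# Payload bytes from the first reply in the post, starting at offset 16.
REPLY_1 = (
    "10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f "
    "20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f "
    "30 31 32 33 34 35 ba cc"
)

for offset, expected, actual in find_wrong_bytes(REPLY_1):
    print(f"byte #{offset}: expected 0x{expected:02x}, got 0x{actual:02x}")
```

Run against the first reply, this flags bytes #54 and #55, the last two bytes of the 56-byte payload. Ping itself only reports the first mismatch, which matches the "always 2 less than the datagram size" observation.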
I spent all afternoon scouring reports of this issue and finally came to the conclusion that it is only present on systems that run a 64-bit kernel and have more than 2GB of RAM installed. My server meets both of these criteria, and if I remove 6GB to take it down to 2GB the system works fine.
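If you want a quick way to see whether a box matches those two conditions, something along these lines should do. Treat it as a heuristic sketch: the function names are invented here, the 2 GiB threshold simply mirrors the conclusion above, and checking for `x86_64` specifically is an assumption (the post only says "64-bit kernel").

```python
import platform

TWO_GIB_IN_KIB = 2 * 1024 * 1024


def mem_total_kib(meminfo_text):
    """Extract the MemTotal value (in KiB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def matches_reported_conditions(machine, total_kib):
    """True for a 64-bit x86 kernel with more than 2 GiB of RAM visible."""
    # Assumption: "64-bit kernel" is taken to mean machine == "x86_64".
    return machine == "x86_64" and total_kib > TWO_GIB_IN_KIB


def check_this_machine():
    """Apply the check to the machine this script runs on (Linux only)."""
    with open("/proc/meminfo") as handle:
        total = mem_total_kib(handle.read())
    return matches_reported_conditions(platform.machine(), total)
```

Calling `check_this_machine()` on the server returns True when both conditions hold. MemTotal is always a little lower than the installed RAM, so a machine with exactly 2GB fitted comes out as not matching, in line with what was observed above.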
How to fix the issue
I am not going to guide you through compiling your own kernel as it is different for pretty much every distro. Make a friend of Google and search for something along the lines of ‘compile new kernel ubuntu’.
Once you have your kernel source, navigate to the ‘drivers/net/ethernet/sun’ folder, where you should find a sunhme.c file. Download the patch from here and run it like this:
patch -p0 < sunhme.patch
Once you have done this it will ask you which file to patch. Tell it sunhme.c, and then you are ready to compile and reboot into your new custom kernel.
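The -p option tells patch how many leading directory components to strip from the file names in the patch header before looking for the file. The sketch below models that rule as it is understood to work in GNU patch; it is an approximation for illustration and `strip_prefix` is a made-up name, not part of any tool.

```python
def strip_prefix(name, count):
    """Drop `count` leading path components from `name`.

    Returns None when the name has too few slashes to strip, which is
    roughly the situation where patch cannot work out the file by itself.
    """
    rest = name
    for _ in range(count):
        _head, slash, rest = rest.partition("/")
        if not slash:
            return None  # ran out of slashes before stripping enough
        rest = rest.lstrip("/")  # treat repeated slashes as one
    return rest or None


print(strip_prefix("sunhme.c", 0))
print(strip_prefix("sunhme.c", 1))
print(strip_prefix("drivers/net/ethernet/sun/sunhme.c", 4))
```

A header that names the bare file `sunhme.c` has nothing to strip, so -p0 leaves it usable while -p1 leaves no usable name, and patch would then be expected to prompt for the file.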
After I compiled and booted into my new kernel the card worked perfectly and I can now form neighbourships, ping without issues and all the other fun L3 stuff you could possibly want to do 😛
I hope this makes the issue easier to resolve for anyone else that experiences it as it took me quite some time to piece together what was wrong and the best way to go about fixing it.
thanks,
David
Hi. I have this issue too.
It is the same issue,
but I can’t fix it.
I can’t find the sunhme.c file on my system. (There is a sunhme.ko file in driver/net; I tried using sunhme.ko but it failed.)
Please help me.
(Sorry, I only speak a little English.)
Hi Kwonsun, it sounds like you do not have the kernel source downloaded. Files ending in .ko are loadable kernel modules that have already been compiled to work with your kernel, so a source patch cannot be applied to them.
If you search Google for assistance in recompiling your kernel I’m sure you will be able to find someone that has already documented what to do, just remember to apply my patch before you compile the new kernel.
thanks,
David
Thanks for your answer.
When you say ‘kernel upgrade’, which kernel version should I download?
Kwonsun
I am using kernel Linux 2.6.35-29-generic at the moment.
Hi,
I stumbled on your post while troubleshooting my darn quad nic sun cards.
I have the same issues as you: CDP works fine and ARP finds the MAC addresses, but I can’t ping.
I’m using Debian squeeze x64.
I want to try your patch, but the problem is that the link is broken.
So could you upload it somewhere else?
Thank you in advance
Hi Nikos, I have updated the link. Give it a try now.
There is no reason that the patch shouldn’t work; as far as I can tell there should be no difference between Ubuntu and Debian.
thanks,
David
David I don’t know how to thank you,you are a lifesaver.
Saved me a lot of banging my head against the wall + spending money + waiting time
Cheers 🙂
No problem dude. Glad I could help!
Hi, can you update the link please?
Hi Andy,
Sorry for the delay getting back, I didn’t see your comment come through.
The link should be live again, it should point to: http://networkbroadcast.co.uk/downloads/sunhme.patch
thanks,
David
Wow, I’ve been searching for a solution forever! The link to the patch is broken though, any chance you can post a new link?
Much appreciated!
Give that a go Shane. I have updated the link, as for some reason WP had put two http’s in the URL.
thanks,
David
Ah, I missed that. Thanks!
[[email protected] net]# patch -p1 -i sunhme.patch
(Stripping trailing CRs from patch.)
can't find file to patch at input line 3
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
|--- sunhme.c.dist 2011-04-26 21:24:34.509574001 +0100
|+++ sunhme.c 2011-04-26 16:03:15.429692214 +0100
File to patch: sunhme.c
patching file sunhme.c
Hunk #1 FAILED at 1265.
Hunk #2 FAILED at 2031.
2 out of 2 hunks FAILED -- saving rejects to file sunhme.c.rej
Linux mothership 2.6.32-220.7.1.el6.x86_64 #1 SMP Wed Mar 7 00:52:02 GMT 2012 x86_64 x86_64 x86_64 GNU/Linux
Could you post up a copy of your sunhme.c or email it to me (david [dot] rothera [at] gmail [dot] com)?
I am trying this patch first: https://bugzilla.novell.com/show_bug.cgi?id=376831
I am compiling my new kernel now. That patch applied successfully. I will post sunhme.c shortly.
This worked: https://bugzilla.kernel.org/show_bug.cgi?id=10790
Since patching, my system crashes once a day. I may have recompiled the kernel incorrectly, as it was 200MB+. I recompiled with debug logging disabled and now it is 30MB, but I have yet to test it. I will report back.
my sunhme.c
http://pastebin.com/aSCMH6Dt
[email protected]:/d1/development/kernel/maverick/source/drivers/net$ sudo patch -p1 -i sunhme.patch
can't find file to patch at input line 3
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
|--- sunhme.c.dist 2011-04-26 21:24:34.509574001 +0100
|+++ sunhme.c 2011-04-26 16:03:15.429692214 +0100
File to patch: sunhme.c
patching file sunhme.c
Hunk #1 FAILED at 1265.
Hunk #2 FAILED at 2031.
2 out of 2 hunks FAILED -- saving rejects to file sunhme.c.rej
If you are having issues getting the patch to work you can always carry out the changes yourself. If you look in the patch file, the lines marked with (-) are the lines you need to change, and after the change they should look like the lines prefixed with (+).
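For anyone editing by hand, a small helper can pull the (-) and (+) lines out of a patch so they are easier to compare against your own sunhme.c. This is a minimal sketch that assumes a single-file unified diff like sunhme.patch; the function name and the sample diff are invented for illustration and are not the contents of the real patch.

```python
def changes_in_patch(patch_text):
    """Group the removed (-) and added (+) lines of a unified diff by hunk.

    Assumes the patch covers a single file, so the only ---/+++ lines are
    the two header lines that come before the first @@ hunk marker.
    """
    hunks = []
    for line in patch_text.splitlines():
        if line.startswith("@@"):
            hunks.append({"header": line, "remove": [], "add": []})
        elif not hunks:
            continue  # still in the file header
        elif line.startswith("-"):
            hunks[-1]["remove"].append(line[1:])
        elif line.startswith("+"):
            hunks[-1]["add"].append(line[1:])
    return hunks


# Made-up example diff, NOT the real sunhme.patch.
SAMPLE = """\
--- example.c.dist
+++ example.c
@@ -10,3 +10,3 @@
 int setup(void) {
-    limit = 16;
+    limit = 32;
 }
"""

for hunk in changes_in_patch(SAMPLE):
    print(hunk["header"])
    print("  find:   ", hunk["remove"])
    print("  replace:", hunk["add"])
```

The line numbers in each `@@` header only tell you roughly where to look. If your kernel version differs from the one the patch was made against, search for the text of the (-) lines instead.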
[email protected]:/d1/development/kernel/maverick/source/drivers/net$ sudo patch -p1 -i sunhme.patch
can't find file to patch at input line 3
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
|--- sunhme.c.dist 2011-04-26 21:24:34.509574001 +0100
|+++ sunhme.c 2011-04-26 16:03:15.429692214 +0100
File to patch: sunhme.c
patching file sunhme.c
Hunk #1 FAILED at 1265.
Hunk #2 FAILED at 2031.
2 out of 2 hunks FAILED -- saving rejects to file sunhme.c.rej
Read my comment… edit the sunhme.c file manually if you are having issues.
Of course, if you don’t actually have the sunhme.c file then you have a bigger issue than my patch not working…
Thanks David,
I was able to compile a new kernel for Fedora 17 with your patch. Thank you very much. It works great. I patched it manually.
Regards
Good to hear the patch is still working well.
Wonder if this kind of patch will ever make it into the stock kernel…
I had the same problem. I update the kernel from time to time, so just patching my current kernel wouldn’t do. I created an akmod package for Fedora that includes this patch. It’s at https://github.com/neatbasis/sunhme2g. You should be able to update the kernel, and the driver will be rebuilt against the new kernel by akmod.