Live traffic on port via snmp and discrepancies - performance

I am trying to get data from HP switches and Juniper firewalls and their ports via SNMP.
I am looking for a way to analyze live traffic on a port so I can create a graph of port utilization like the ones in Solarwinds or Observium.
So far, the results I am getting come from the formula in "How to calculate traffic on cisco".
It works fine; however, every couple of readings I get abnormal speeds. For example, for a virtual interface on the firewall which is limited to 4MB, I get 20+ MB every now and then.
I have a cron job which polls the devices every 5 minutes, so the formula uses 300 seconds as the time delta.
So the question is, is it possible for a port to be showing these abnormalities or am I doing something wrong? Any insight would be amazing :-)

The problem is that you are using ifTable as defined in RFC1213. It is somewhat outdated because ifInOctets and ifOutOctets are defined as 32-bit counters, so on a busy port they overflow and wrap back to zero quickly, and you get abnormal results whenever that happens between two polls. I'd suggest switching to ifXTable (IF-MIB), where these counters are defined as 64-bit values (ifHCInOctets and ifHCOutOctets).
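To illustrate the wrap problem, here is a minimal sketch of a rate calculation that tolerates a single counter wrap between two polls. The function names and the sample values are hypothetical, and the sketch assumes the counter wrapped at most once during the interval; with more than one wrap the true delta cannot be recovered, which is the reason to prefer the 64-bit counters.

```python
COUNTER32_MODULUS = 2 ** 32  # ifInOctets / ifOutOctets wrap here
COUNTER64_MODULUS = 2 ** 64  # ifHCInOctets / ifHCOutOctets wrap here


def octet_rate(previous, current, interval_seconds, modulus=COUNTER32_MODULUS):
    """Average bytes per second between two counter samples.

    Modular subtraction yields the correct delta as long as the counter
    wrapped at most once between the two samples.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = (current - previous) % modulus
    return delta / interval_seconds


def seconds_until_wrap(bytes_per_second, modulus=COUNTER32_MODULUS):
    """How long a counter lasts at a steady rate before it wraps."""
    return modulus / bytes_per_second


if __name__ == "__main__":
    # Hypothetical samples taken 300 seconds apart, with a wrap in between.
    print(octet_rate(4_294_000_000, 1_000_000, 300))
    # At a steady 4,000,000 bytes/s a 32-bit counter wraps in about 18 minutes,
    # so with 5-minute polling a wrap lands in roughly every third or fourth interval.
    print(seconds_until_wrap(4_000_000))
```

A naive `current - previous` on the wrapped sample would produce a negative or wildly wrong value, which would match the occasional abnormal readings described in the question.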

Related

EC2 Instance, sudden network performance cap

Anyone else experienced a sudden network performance cap recently?
Our instances managed to go up to an average of 100,000,000 bytes, but all of a sudden we're down to 50,000,000 without warning. This happened two days ago at around Oct 16 11:40 UTC.
I'm using a c3.xlarge type instance with network performance moderate, did they lower the cap of the "moderate" performance?
It would be nice to hear if anyone else has this problem, since it's pretty weird that they would do that without warning. I can't find any information on this.
I've attached a screenshot as proof; the instance type was not changed at that time.
It's the same problem on both Network In and Network Out.
Graph:
http://i.stack.imgur.com/WQ9Sf.png
This is par for the course with shared tenancy. Most instances, except for the largest instance sizes, are on hardware shared with other instances. This means all resources are shared, including network bandwidth.
When no other instances are using the bandwidth available to your host, you can generally take advantage of most, if not all, of it. If other instances are attempting to saturate the host bandwidth, then the host will schedule your bandwidth based on your network priority.
Moderate does not mean you are guaranteed a certain amount of bandwidth, instead it gives you a certain priority in comparison with the other instances on the host.
What can you do about this? You could stop/start your instance until you get assigned to a host without any noisy neighbors. You could also scale horizontally to give yourself more available bandwidth.

Synchronizing a counter across a network

I have two computers that can talk to each other over a serial connection. The connection is made over a wireless network. There is a variable, changing delay in communications between the two systems. On both systems I have a counter runtime that increments by 1 every ms. They both start as soon as the applications start. Say each computer is started at a different time. How can I use the serial connection to synchronize the counters so that systemA.counter will equal systemB.counter and so that both counters increment at the same time (or as close as possible)?
Ideally, once synchronized, the counters would drift apart only slowly, so that I could re-synchronize once every 3 or 4 thousand increments.
I'm looking for good resources on the topic, example algorithms, example code (C/C++), anything to point me in the right direction.
Update
This is a closed system, no internet. For all intents and purposes there is no real protocol at all besides an open serial line over the wireless link. That link at the moment is Bluetooth, but I'm thinking of moving it to a ZigBee mesh. There are currently 2 nodes, but if I have 30 nodes all running this same application I would want them all to synchronize. There is no client/server designation, just a couple of devices running the same program with a counter. I don't have access to anything like time, just this counter that increments once a millisecond and whatever algorithm I can put in place.
Once I can get this working, I would like to put in place a positioning and mapping system, but to figure out distances between nodes, I need accurate timing synchronized on the devices.
If you use these counters to order events in a system, you should look at vector clocks or Lamport timestamps.
The obvious resource is NTP, which is documented for example at http://www.eecis.udel.edu/~mills/ntp.html and with links off there. Basically, this uses timestamps to adjust the frequency at which local clocks run. The protocol has been around for years and been the subject of continuous research - I can't see any pack of slides there which immediately makes it clear how it works. You might be better to see if there is already an NTP implementation available than to try and re-implement it yourself.
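As a rough illustration of the timestamp exchange that NTP-style protocols build on, here is a minimal sketch of the standard four-timestamp offset and round-trip delay estimate, applied to millisecond counters. The function names and sample values are hypothetical. It assumes the delay is roughly the same in both directions, so it is only a starting point for a link with variable delay; taking several samples and keeping the one with the smallest round trip is one common way to reduce the error.

```python
def estimate_offset_and_delay(t0, t1, t2, t3):
    """Estimate the remote counter's offset from the local one.

    t0: local counter when the request was sent
    t1: remote counter when the request arrived
    t2: remote counter when the reply was sent
    t3: local counter when the reply arrived

    Returns (offset, delay): offset is remote minus local, delay is the
    round-trip time excluding the remote processing time. The offset is
    exact only if both directions take equally long.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


def best_sample(samples):
    """Pick the (offset, delay) estimate with the smallest round-trip delay."""
    return min((estimate_offset_and_delay(*s) for s in samples),
               key=lambda result: result[1])


if __name__ == "__main__":
    # Hypothetical exchanges; the remote counter is really 500 ms ahead.
    samples = [
        (1000, 1520, 1525, 1045),  # 20 ms each way
        (2000, 2560, 2565, 2085),  # 60 ms out, 20 ms back (asymmetric)
    ]
    offset, delay = best_sample(samples)
    print(offset, delay)
```

The node that is behind would then add the estimated offset to its counter. With no client/server roles, one hypothetical convention would be for every node to adjust toward whichever counter is highest.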
It appears (e.g. from searching) that there is a small industry of people working on time synchronisation algorithms, especially in the context of wireless sensor networks. One jumping-off point, apart from searches, is the survey paper at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.85.2012 - Time synchronization in sensor networks: A survey (2004)

How to optimize ZeroMQ Performance On Windows (XP SP3)

I have two Windows XP SP3 machines between which I am trying to send 3k ZMQ messages. These are both fairly modern systems (Dual Quad Core Xeon with 5100 chipset and Dual Hex Core Xeon with 5500 chipset) with server grade Intel gigabit ethernet cards.
The two machines are connected point to point without a switch or router in between.
With pcttcp for performance comparison I am able to send 70MB/s (56% utilization) via TCP from one machine to the other. With ZMQ PUSH/PULL I am only able to get ~28MB/s between the two.
With the sender and receiver on the same machine (the slower machine of the two) I am able to achieve a rate of 97MB/s. (220MB/s in the dual hex core)
The PUSH/PULL channel has a HWM set on both ends. It performs marginally better if the HWM sizes are set low (~150 messages) rather than to a larger value like 1024.
I tried 6000 byte jumbo frames and it got worse. (pcttcp performed marginally better though, at 72MB/s.)
I tried setting TcpWindowSize to a larger value but it seemed to get worse as well. ZMQ liked the lower size; pcttcp did not change. TcpWindowSize is now set to 32K.
Other parameters:
TcpAckFrequency = 1 // would not work without this.
Tcp1323Opts = 1
Receive Side Scaling enabled
How should I approach finding the bottleneck? What should I expect to achieve with TCP and ZMQ performance? The ZeroMQ web site performance section details tests in which the throughput approaches that of TCP (95%+).
Any performance tips / wisdom (aside from use linux, ;-) ) would be greatly appreciated.
Thanks!!!
Another clue: if I set up multiple sender / receiver pairs between the two systems (same direction, different ports) I am able to achieve a higher aggregate rate. (a total of ~42MB/s with three)
A quick google pulled this up: http://comments.gmane.org/gmane.network.zeromq.devel/10089
The nugget out of that thread is TcpDelAckTicks. Quoting:
I got a huge increase of performance (2.4 seconds to 0.4 seconds) after setting TcpDelAckTicks registry value to the machine that does
"apr_socket_accept()" -call in the server code. Client just sends
request and waits for response in loop. There was no change in
performance.
The reason I got there was because I was looking for something around MTU, thinking that it might be network related.
And then I found this: http://lists.zeromq.org/pipermail/zeromq-dev/2010-November/007814.html, which has a number of performance tuning recommendations (though not specifically for XP). I won't summarise them here, as it would be an almost direct copy and paste (not sure I can be more succinct).
I'm not sure this'll be helpful, but you might not have spotted them.

Cannot achieve full speed on Symmetrical Internet Connection

We are using a business Ethernet connection (3Mbit upload, 3Mbit download) and trying to understand issues with our tested bandwidth speeds. When uploading a large file we sustain 340 KB/s; downloading we sustain 340 KB/s. However, when we run these transfers simultaneously the two transfer speeds rise and fall erratically, with an average speed for both of around 250 KB/s. We're using a Hatteras HN404 CPi and we've bypassed the router (plugged a machine directly into the Hatteras; set the NIC to full-duplex).
Is this expected? Should a max upload interfere with a max download on this type of Internet connection?
Are you sure the bottleneck is your connection?
Do you also see this behavior when the simultaneous upload and download are occurring on different systems, or only when one system is handling both the upload and download?
If the problem goes away when independent machines are doing the work, the bottleneck is likely closer to the hard drive.
This sounds expected from my experience with lower end lines. On a home line, I've found that traffic shaping and changing buffer sizes can be a huge help.
TCP/IP without any unusual traffic shaping will favor the most aggressive traffic at the expense of everything else. In your case, this means responses to the outgoing ACKs and such for the download will be delayed or maybe even dropped. See if your HN404 supports class based queuing or something similar and try it out.
Yes it is expected. This is symptomatic of any case in which you have a throttled or capped connection. If you saturate your uplink it will affect your downlink and vice versa.
This is because your connection's rate limiting delays the TCP acknowledgement packets (ACKs) and disrupts the normal "balance" of how these packets flow.
This is very thoroughly described on this page about Cable Modem Troubleshooting Tips, although it is not limited to cable modems:
If you saturate your cable modem's
upload cap with an upload, the ACK
packets of your download will have to
queue up waiting for a gap between the
congested upload data packets. So your
ACKs will be delayed getting back to
the remote download server, and it
will therefore believe you are on a
very slow link, and slow down the
transmission of further data to you.
So how do you avoid this? The best way is to implement some sort of traffic-shaping or QoS (Quality of Service) on individual sessions to limit them to a maximum throughput based on a percentage of your total available bandwidth.
For example, on my home network I have it set up so that no outbound connection can utilize more than 67% (2/3) of my 192Kbps uplink. That means any single outbound session can only utilize 128Kbps, which protects my downlink speed by preventing the uplink from becoming saturated.
In most cases you are able to perform this kind of traffic-shaping based on any available criteria such as source ip, destination ip, protocol, port, time of day, etc.
It appears that I was wrong about the simultaneous transfer speeds. The 250KB/s speeds up and down were miscalculated by the transfer program (seemed to have been showing a high average speed). Apparently the Business Ethernet (in this case it is an XO circuit provisioned by Speakeasy) only supports 3Mb total, not up AND down (for 6Mbit total). So if I am transferring up and down at the same time in theory I should only have 1.5Mbit up and down or 187.5KB/s at the maximum (if there was zero overhead).
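The arithmetic above can be checked with a short sketch. The function names are hypothetical, and it assumes decimal units (1 Mbit = 1,000,000 bits, 1 KB = 1,000 bytes) and zero protocol overhead, as in the paragraph above.

```python
def mbit_to_kbytes_per_second(megabits_per_second):
    """Convert a line rate in Mbit/s to KB/s using decimal units."""
    bits_per_second = megabits_per_second * 1_000_000
    return bits_per_second / 8 / 1_000


def per_direction_ceiling(total_mbit, directions=2):
    """Ceiling per direction when the provisioned rate is shared evenly."""
    return mbit_to_kbytes_per_second(total_mbit / directions)


if __name__ == "__main__":
    print(mbit_to_kbytes_per_second(3))  # ceiling for a one-way transfer
    print(per_direction_ceiling(3))      # ceiling per direction when sharing 3 Mbit
```

By the same conversion, the 340 KB/s sustained in a single direction is about 91% of the 375 KB/s ceiling for 3 Mbit.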

How can I estimate ethernet performance?

I need to think about performance limitations of 100 Mbps ethernet (including scenarios with up to ~100 endpoints on the same subnet) and I'm wondering how best to go about estimating the capacity of the network. Are there any rules of thumb for this?
The reason I ask is that I am working on some back-of-the-envelope level calculations about performance limitations, so it doesn't need to be incredibly accurate. I just haven't been through this exercise before and was hoping to gain some insight from those who have. Mark Brackett's answer (as of 1/26) is along the lines of what I am looking for.
If you're using switches (and, honestly, who isn't these days) - then I've found 80% of capacity a reasonable estimate. Usually, it's really about 90% because of TCP overhead - but 80% accounts for occasional retransmits.
If it's a single collision domain (hubs), then you'd probably be around 30% with moderate activity on those 100 nodes. But, it'd be pretty variable based on the traffic generated. And anyone putting 100 nodes in a single CD these days would no doubt be shot - so I don't think you'll actually run into those IRL.
Edit: Note that these numbers are for a relatively healthy network - one that is generally defined as working. Extremely excessive broadcasts or other anomalous traffic patterns have been known to bring a network to its knees.
Use WANem
WANem is a Wide Area Network Emulator,
meant to provide a real experience of
a Wide Area Network/Internet, during
application development / testing over
a LAN environment.
You can simulate any network scenario with it and then test how your application behaves under those conditions. It is open source and available on SourceForge.
Link : WANem - The Wide Area Network emulator
OPNET creates software for simulating network performance. I once used OPNET IT Guru Academic Edition. Maybe this application or some other software from OPNET may be of some help.
100 endpoints are not supposed to be an issue. If the network is properly configured (nothing special), the only issue is the bandwidth. Fast Ethernet (100 Mbps) should be able to transfer on the order of 10 MB (megabytes) per second; the raw line rate is 12.5 MB/s before overhead. It can deliver that to one client or split it across many. If you are using hubs instead of switches, or half-duplex instead of full-duplex, then you should change that (this is the rule of thumb).
Working from the title of your post, "How can I estimate Ethernet performance", see this wiki link: http://en.wikipedia.org/wiki/Ethernet_frame#Maximum_throughput
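For a back-of-the-envelope figure, the per-frame overhead described on that page can be turned into a small calculation. This is a sketch under stated assumptions: untagged frames (no 802.1Q), the standard 1500-byte maximum payload, IPv4 and TCP headers of 20 bytes each with no options, and no retransmissions. The function names are hypothetical.

```python
# Bytes on the wire around each frame's payload (untagged Ethernet).
PREAMBLE_AND_SFD = 8
ETHERNET_HEADER = 14
FRAME_CHECK_SEQUENCE = 4
INTERFRAME_GAP = 12
PER_FRAME_OVERHEAD = (PREAMBLE_AND_SFD + ETHERNET_HEADER
                      + FRAME_CHECK_SEQUENCE + INTERFRAME_GAP)  # 38 bytes

MAX_PAYLOAD = 1500   # standard maximum payload, no jumbo frames
IPV4_HEADER = 20     # assumes no IP options
TCP_HEADER = 20      # assumes no TCP options


def payload_efficiency(payload_bytes, inner_header_bytes=0):
    """Fraction of the line rate that carries useful data."""
    useful = payload_bytes - inner_header_bytes
    return useful / (payload_bytes + PER_FRAME_OVERHEAD)


def usable_megabytes_per_second(link_mbps, efficiency):
    """Usable throughput in MB/s for a link rate given in Mbit/s."""
    return link_mbps * efficiency / 8


if __name__ == "__main__":
    ethernet = payload_efficiency(MAX_PAYLOAD)
    tcp = payload_efficiency(MAX_PAYLOAD, IPV4_HEADER + TCP_HEADER)
    print(f"Ethernet payload efficiency: {ethernet:.2%}")
    print(f"TCP payload efficiency:      {tcp:.2%}")
    print(f"TCP over 100 Mbps:           {usable_megabytes_per_second(100, tcp):.2f} MB/s")
```

This gives a best-case ceiling for full-size frames on a switched, full-duplex link; the 80% rule of thumb mentioned earlier leaves extra room for retransmits and smaller frames.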

Resources