We've all had a complaint that the "network" is "slow" at some point: might be localized to one room (switch) or one computer, might just be Internet (DNS? Browser issue?), might be just one application (long-running SQL queries? AV scan running?).
When you've ruled out obvious system and/or application issues, how do you go about testing a network for slowness or erratic behavior? Do you work your way up the OSI layers? If so, how do you go about checking each layer? What do you do to make sure the physical network is OK in an unfamiliar environment? What about too many broadcasts or a broadcast storm? Layer 3 and up? traceroute? Any other tips, methods, ideas? Must-have features and tools (port mirroring, SNMP, monitoring, etc.) for all sizes of networks?
tcpdump and wireshark are your friends.
I find that watching packets on the wire of a 'slow' network vs a 'good' network is usually what pinpoints a problem.
There are many types of 'slow'.
You can track latency to local and internet sites using a tool like SmokePing. (SmokePing can be configured to track ICMP latency as well as the latency of TCP services.)
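If you just want a quick look at service latency before setting up SmokePing, something like the following rough sketch times TCP handshakes to a host and port. This is not how SmokePing itself works; the function names, sample count and interval are my own assumptions for illustration.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, timeout=2.0):
    """Return the time in milliseconds taken to complete one TCP handshake."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it straight away
    return (time.perf_counter() - start) * 1000.0


def sample_latency(host, port, count=5, interval=0.2):
    """Take several samples and summarise them; failed attempts count as loss."""
    samples = []
    lost = 0
    for i in range(count):
        try:
            samples.append(tcp_connect_latency_ms(host, port))
        except OSError:  # covers timeouts, refused connections, DNS failures
            lost += 1
        if i < count - 1:
            time.sleep(interval)
    return {
        "min_ms": min(samples) if samples else None,
        "median_ms": statistics.median(samples) if samples else None,
        "max_ms": max(samples) if samples else None,
        "loss_pct": 100.0 * lost / count,
    }
```

Calling `sample_latency("example.com", 443)` (a placeholder host) from a 'slow' machine and a 'good' machine gives you two sets of numbers to compare; a wide gap between the minimum and maximum usually tells you more than the median alone.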
Your switches should track broadcast packets vs unicast packets. Graph that ratio.
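As a rough sketch of the arithmetic behind that graph: take two snapshots of a port's packet counters and compute the broadcast share of the traffic in between. The dictionary keys below are hypothetical. On most managed switches the raw values would probably come from SNMP counters such as `ifHCInUcastPkts` and `ifHCInBroadcastPkts` in IF-MIB, and I am assuming 64-bit counters that can wrap.

```python
COUNTER_MAX = 2 ** 64  # assumption: 64-bit (HC) interface counters


def counter_delta(old, new, modulus=COUNTER_MAX):
    """Difference between two counter readings, tolerating a single wrap."""
    return (new - old) % modulus


def broadcast_ratio(before, after):
    """Fraction of packets between two snapshots that were broadcasts."""
    ucast = counter_delta(before["ucast"], after["ucast"])
    bcast = counter_delta(before["bcast"], after["bcast"])
    total = ucast + bcast
    return bcast / total if total else 0.0


# Two hypothetical polls of the same port, a few minutes apart.
poll_1 = {"ucast": 1_000, "bcast": 100}
poll_2 = {"ucast": 10_000, "bcast": 1_100}
print(f"broadcast share: {broadcast_ratio(poll_1, poll_2):.1%}")
```

What counts as "too high" depends on the network, so the useful thing is the trend: graph the ratio while things are healthy and you will know what abnormal looks like.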
I also like to monitor traceroutes (checking the domain names of the ISP hops between myself and 'important' sites).
I hope these comments help.
It is hard to give specific answers since 90% of this job is experience which teaches you where to look for which kind of problem, and the other 90% is knowing where to look on Google to get hints of where to start.
I usually try the paper-bag stuff like getting the customer to demonstrate the problem (mostly to rule out finger-problems and any issues the customer may have describing his problem), then trying to duplicate the problem on another computer. Doing that often gives you insight into where to look.
Don't forget the corrective power of a reboot, especially for Windows systems, even today. It used to be so effective that I would ask people "Have you rebooted? Well, try that and let me know if the problem persists" -- this fixed a very large percentage of the issues I was asked about.
There's frequently also low-hanging fruit in DNS resolution problems and basic connectivity (ACLs on routers, air-gaps in the network, pings/traceroutes/mtrs to remote sites, etc).
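For the DNS part, a quick way to see whether name resolution is what feels slow is to time a lookup through the system resolver. A minimal sketch, assuming Python is available on the affected machine (the function name is mine):

```python
import socket
import time


def time_dns_lookup(hostname):
    """Resolve hostname with the system resolver; return (elapsed_ms, addresses)."""
    start = time.perf_counter()
    try:
        results = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        results = []  # resolution failed; the elapsed time is still informative
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Each result is (family, type, proto, canonname, sockaddr); keep unique IPs.
    addresses = sorted({entry[4][0] for entry in results})
    return elapsed_ms, addresses
```

Run it against a few names the users actually complain about. Keep in mind the OS or a local resolver may cache answers, so a repeat lookup can be much faster than the first one; a failure that takes several seconds often points at an unreachable DNS server.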
For services you have direct control over, running nagios or something to ensure the service is actually running can frequently trigger you to fix problems before customers tell you about them. You probably also want to be running stats gathering, either directly through munin or something, or via SNMP to something like Cacti.
I usually try to have Cacti running against at least all my core switches and firewalls; where possible, I run Cacti against everything I can. In these cases I am usually looking for things like port error counts or excessive traffic. Firewall graphs from some devices can show you CPU usage and concurrent sessions; you'll get to learn at what thresholds your firewall device starts to have issues.
Your firewall may be able to log to a syslog device; if so, log everything you can and look through those for hints. This will be easier if you run something like syslog-ng or rsyslog or splunk that lets you divide your logs somewhat rather than dealing with one monolithic file.
I also try to run nfsen against at least the inside of my firewall, and the uplink to the internet provider where possible. This lets you go back in time to look at sessions to see who was doing what; this sometimes can catch interesting behaviors.
Here are a couple of useful tools for troubleshooting latency and other network issues:
If you're running a wireless network, one of the frequent causes of slowdowns is channel interference. A bunch of SSIDs in one area can really slow down network traffic. (Think: the demo of the iPhone 4 at WWDC '10).
Troubleshooting this problem is fairly easy with software that can show you the wireless traffic patterns in the area. There's a good free, web-based one at: http://meraki.com/tools/stumbler. (disclosure: I work for Meraki)
To reduce interference in the 2.4GHz band, it's best to be on channels 1, 6, or 11, since those do not overlap each other. Using 802.11n gear on the 5GHz band could also help.
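To illustrate why it's 1, 6 and 11: 2.4GHz channels are spaced 5MHz apart, but each transmission is roughly 20 to 22MHz wide, so neighbouring channels bleed into each other. Here is a small sketch that checks overlap and picks the least crowded of the three from a list of channels seen in a site survey. The 22MHz width is an assumption (the classic 802.11b figure), and the helper names are made up.

```python
def center_mhz(channel):
    """Center frequency of a 2.4GHz Wi-Fi channel (channels 1-13 only)."""
    if not 1 <= channel <= 13:
        raise ValueError("only 2.4GHz channels 1-13 are handled here")
    return 2407 + 5 * channel


def channels_overlap(a, b, width_mhz=22):
    """True if two channels are closer together than one channel width."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


def overlapping_ap_count(candidate, observed_channels):
    """How many observed access points would interfere with this channel."""
    return sum(1 for ch in observed_channels if channels_overlap(candidate, ch))


def least_congested(observed_channels, candidates=(1, 6, 11)):
    """Pick the candidate channel with the fewest overlapping neighbours."""
    return min(candidates, key=lambda c: overlapping_ap_count(c, observed_channels))


# Hypothetical survey: two APs on channel 1, one on 3, one on 6.
print(least_congested([1, 1, 3, 6]))
```

Note that the AP on channel 3 counts against both 1 and 6 here, which is exactly why someone picking an "in-between" channel makes things worse for everybody.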
I always start with monitoring the layer 2 stuff using Cacti. That will give you a good amount of data which you can use to look for patterns and you can compare your Cacti graphs when everything is working well vs when the users see slowness.
It probably isn't going to find the exact problem but it will give you a good starting place to help narrow down the problem.
I start at the outermost router and work my way down, and I measure performance in the most primitive way: use a bandwidth testing site, or a known external FTP site that will give you your upload/download speed, and keep going down until you find the level where the problem resides.
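If you would rather script that primitive measurement than click through a speed-test site, a sketch along these lines times a download and converts it to megabits per second. The URL you pass in is up to you (I am not assuming any particular test server), and the function names are only for illustration.

```python
import time
import urllib.request


def throughput_mbps(num_bytes, seconds):
    """Convert a byte count and a duration into megabits per second."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return num_bytes * 8 / seconds / 1_000_000


def measure_download(url, chunk_size=64 * 1024, timeout=10):
    """Download url, discarding the data, and return the observed Mbit/s."""
    start = time.perf_counter()
    total = 0
    with urllib.request.urlopen(url, timeout=timeout) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return throughput_mbps(total, time.perf_counter() - start)
```

Run the same download from a machine at each level (behind the router, behind the core switch, at the user's desk) and compare. Use a file big enough that the transfer takes at least several seconds, otherwise connection setup dominates the result.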
Once you know where the problem is, deploy your fancy tools and monitors. But don't waste time doing that stuff on every layer. It'll take forever.
You also need to know your servers and desktop/client environment, rather than simply assuming the user is correct when they say "the network is slow." You need to methodically troubleshoot each issue - as others have said, you should first be able to view and ideally reproduce the error, and then work from there in a way that makes sense for the scenario.
Having good management and monitoring on the network and servers can save you a lot of time, however, because you're not trying to come up with instrumentation on the fly while possibly also trying to mitigate or fix the symptoms, and deal with complaining users/customers.
The answers recommending tcpdump and wireshark aren't wrong; those can be vital pieces of your toolkit. But unless you're dead certain that it's actually the network, they shouldn't be the first thing you reach for.
A slow network is a common problem, and it can have any number of causes. Troubleshooting it is one of the most common and most troublesome tasks in day-to-day network management.
According to analysis, the major reasons for a slow network are:
How can you quickly find the cause of a slow network? It's a good idea to capture and analyze packets with a network analyzer (Ax3soft Unicorn, wireshark and so on).
You can also read the article "Find Reasons for Slow Network" at this URL: http://www.ids-sax2.com//Unicorn/Tutorials/Find-Reasons-for-Slow-Network-with-Ax3soft-Unicorn.htm