My company is developing a web-based data viewer application that needs a fair amount of bandwidth to function well. However, we have recently been changing a lot of things. For example, we changed our internal network infrastructure so that data can be hosted on separate machines connected by Gigabit Ethernet. On top of that, the application itself keeps getting new versions, since we are still in alpha and beta testing.
Recently we made some changes that are causing poorer performance, and we want to try to identify where the problem is before we start tearing things apart. It is a very small network, and I have limited experience as an IT admin. I have a few ideas for where to start, but I would like to harvest a little wisdom from the pros first: How do you tackle/avoid similar problems? What are the most useful (Windows) tools you have used?
I always follow this approach: Try to test one thing at a time.
The trusty scientific method works really well for troubleshooting:
For a webapp this might mean:
Also, running basic benchmarks to test CPU, memory, and disk speed can help rule one of those things out before you go any further.
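As one illustration of a quick disk benchmark, here is a minimal Python sketch that times a sequential write and read of a temp file. It is only a rough indicator (OS caching and other activity skew the numbers), but it is often enough to spot a server whose disks are dramatically slower than expected:

```python
import os
import tempfile
import time

def disk_throughput_mb_s(size_mb=64, chunk_mb=4):
    """Rough sequential write/read throughput in MB/s using a temp file.
    Results are indicative only; OS page caching inflates the read number."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Sequential write, fsync'd so the data actually reaches the disk
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(size_mb // chunk_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        write_s = time.perf_counter() - start

        # Sequential read back
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(chunk_mb * 1024 * 1024):
                pass
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)
    return size_mb / write_s, size_mb / read_s

if __name__ == "__main__":
    w, r = disk_throughput_mb_s()
    print(f"write: {w:.1f} MB/s, read: {r:.1f} MB/s")
```

Run it on the old and new servers and compare; you care about the ratio between machines more than the absolute numbers.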
I see things like this all the time:
But no one did a basic disk benchmark to find out that the older server had twice as many spindles as the new server does... or a network benchmark to find out that the new server's Gigabit Ethernet had only negotiated at 100 Mbit/s.
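A crude network benchmark along those lines can be done with a plain TCP stream: run a receiver on one host and a sender on the other, and measure MB/s. The sketch below runs both ends on loopback purely to show the mechanics (split it across two machines to test a real link); as a rule of thumb, a healthy Gigabit link sustains roughly 100-110 MB/s, so seeing ~11 MB/s suggests the link negotiated at 100 Mbit/s:

```python
import socket
import threading
import time

def loopback_throughput_mb_s(total_mb=32, chunk_kb=64):
    """Stream bytes over a TCP socket and return the send rate in MB/s.
    Both ends run on loopback here for illustration; for a real test,
    run the receiver on the remote host instead."""
    chunk = b"\x00" * (chunk_kb * 1024)
    total = total_mb * 1024 * 1024

    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def receiver():
        conn, _ = server.accept()
        while conn.recv(65536):  # drain until the sender closes
            pass
        conn.close()

    t = threading.Thread(target=receiver)
    t.start()

    sender = socket.create_connection(("127.0.0.1", port))
    start = time.perf_counter()
    sent = 0
    while sent < total:
        sender.sendall(chunk)
        sent += len(chunk)
    sender.close()
    t.join()
    server.close()
    return total_mb / (time.perf_counter() - start)

if __name__ == "__main__":
    print(f"{loopback_throughput_mb_s():.0f} MB/s")
```

Established tools like iperf do this better, but a twenty-line script you can run anywhere is often enough to catch a link running an order of magnitude slow.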
All that said, if this is a custom web application, the framework you are using almost certainly has a way to dump performance information to a log file, but that is more of a question for Stack Overflow.
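As one illustration of that kind of logging (the details depend entirely on your framework), a minimal request-timing middleware for a Python WSGI app might look like this:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("perf")

def timing_middleware(app):
    """Wrap a WSGI app and log how long each request takes."""
    def wrapper(environ, start_response):
        start = time.perf_counter()
        try:
            return app(environ, start_response)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            log.info("%s %s took %.1f ms",
                     environ.get("REQUEST_METHOD"),
                     environ.get("PATH_INFO"), elapsed_ms)
    return wrapper

# Toy app, just for demonstration
def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

app = timing_middleware(app)
```

A log of per-request timings quickly shows whether the slowdown is in one endpoint or across the board, which tells you whether to suspect the code or the infrastructure.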
I subscribe to the "Sherlock Holmes" method of troubleshooting, a.k.a. the binary search troubleshooting method:
In my experience, you sometimes get lucky by trying some obvious things first, but once you exhaust the truly quick fixes, you need to get methodical quickly.
This method is compatible with Scientific Method and Test One Thing At A Time.
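The binary-search idea can be sketched in code. Here `is_bad(i)` is a hypothetical stand-in for "deploy the app as of change i and re-run your performance test"; the method assumes the property is monotonic (once a change makes things bad, every later state is also bad), which is the same assumption git bisect makes:

```python
def first_bad(changes, is_bad):
    """Binary-search an ordered list of changes (oldest -> newest) for
    the first one that introduced the problem. Assumes the oldest state
    is good, the newest is bad, and is_bad() is monotonic."""
    lo, hi = 0, len(changes) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid        # problem was introduced at mid or earlier
        else:
            lo = mid + 1    # problem was introduced after mid
    return changes[lo]

# Toy example: pretend everything from revision r103 onward is slow
changes = ["r100", "r101", "r102", "r103", "r104"]
print(first_bad(changes, lambda i: i >= 3))  # -> r103
```

With n changes you only need about log2(n) test deployments to pin down the culprit, which is why getting methodical beats guessing once the quick fixes are exhausted.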
The sum of the answers above is 90% of what I would say; here's the other 10%:
Some of the best tools to be found for Windows troubleshooting are from Microsoft's Sysinternals. And some of the best info on how to use them (and Windows technical info in general) can be found on Mark Russinovich's blog and webcasts. His book on Windows Internals is also full of good information.
With the above, I would suggest starting with Process Explorer and Process Monitor to take a look at whatever web service you have running and see what's going on. Both programs can display a large amount of info about running processes, configurable by right-clicking the column headings.
What was changed that introduced the performance problem? If only the code was changed, then I'd start my troubleshooting there.
Compare the Problem State to a Known Good State and look for the discrepancies.
A Known Good State can be an actual documented state. It can also be based on a standard of expected behavior, such as known expected behavior of networking protocols or such as rules of thumb about appropriate average CPU usage.
Examples:
Using Wireshark or another network sniffer, you repeatedly see duplicate packets. Now you can dig in and try to figure out why you are seeing the same IP packet on the wire twice. Perhaps you have a "local router" scenario, or perhaps something is fragmenting IP packets.
Average CPU usage is at 90%. An average that high means the server is probably pegging the CPU frequently, causing everything to back up.
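The known-good-state comparison above can be sketched as a simple diff of current metrics against a documented baseline. All of the metric names and thresholds below are illustrative, not standards:

```python
# Hypothetical baseline captured while the system was known to be healthy
KNOWN_GOOD = {
    "avg_cpu_pct": 35,
    "disk_read_mb_s": 120,
    "retransmits_per_min": 2,
}

def find_discrepancies(current, baseline=KNOWN_GOOD, tolerance=0.25):
    """Flag any metric that deviates from the known-good state by more
    than `tolerance` (25% by default). Returns {name: (good, now)}."""
    flagged = {}
    for name, good in baseline.items():
        now = current.get(name)
        if now is None:
            continue  # metric not collected this time; skip it
        if abs(now - good) > tolerance * good:
            flagged[name] = (good, now)
    return flagged

problem_state = {"avg_cpu_pct": 90, "disk_read_mb_s": 118,
                 "retransmits_per_min": 40}
print(find_discrepancies(problem_state))
# CPU and retransmits stand out; disk reads are within tolerance
```

Even a baseline this crude narrows the search: the metrics that moved tell you which subsystem to investigate first.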
At the recommendation of John T, I have been enjoying using dstat with gnuplot.