I recently wrote a complex SELECT statement (based on many very, very complex views) that would take 1 hour and 50 minutes to execute when run from Toad on my desktop, and only a few minutes faster when run from a Python script using the cx_Oracle library (written to flush to disk every 50 rows). The total size of the single result set was about 8000 rows totalling 5MB. During execution, my workstation was not thrashing and the CPU load was reasonably low.
The exact same query run on the server took an astonishing 21 seconds to produce a byte-for-byte identical result set. That run was also generated by the same Python/cx_Oracle script.
Transferring the 5MB result set file from the server to my workstation took only 3 seconds, so I don't think network bandwidth is the direct problem.
Could SQL*Net or one of its associated libraries be the culprit? Are there non-linear memory management problems when queries are invoked across the network? A 5MB result set is large, but not ginormous in this day and age. Are there perhaps some buffer size configuration settings that would help? I'm using a vanilla Oracle client install.
The workstation is Windows XP Pro SP3 (only 1GB of RAM) with the Oracle 10g client and Toad for Oracle Xpert 9.7.2.5 and Python 2.6.2 with cx_Oracle 5.0.2. The server is Red Hat 2.6.9-67.ELsmp on a quad Xeon 3.8GHz 8GB server, running Oracle 10.2.0.4, Python 2.3.4, cx_Oracle 4.4.1.
Edit: Whoops! The file was only 5 Megabytes, not GB. Very sorry.
Solved: There was a population script that ran before the extract query that I mentioned. Once that population script was re-run, the extract query took 2 hours to run regardless of the location of the client program. After that first long run, the result set must've been cached somewhere, and I didn't notice that effect until I methodically went through all of the combinations.
Trace the execution in Oracle and you'll probably see your workstation fetching the data in small chunks, with the round-trip latency adding up fast.
The solution may be to fetch the results in bulk, as described in "Resolving Oracle Contention in TOAD"; look on ToadWorld and search for that title (sorry, can't add hyperlinks).
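For what it's worth, in cx_Oracle the batch size is controlled by Cursor.arraysize. A rough sketch of a bulk-fetching version of your extract script; the connect string, query, and output file name are placeholders, not your actual names:

    import cx_Oracle

    conn = cx_Oracle.connect("user/password@remote_tns")  # placeholder credentials/TNS alias
    cursor = conn.cursor()
    cursor.arraysize = 500  # rows per SQL*Net round trip; the default is small (50 in older releases)
    cursor.execute("SELECT * FROM my_complex_view")  # placeholder for the real query

    out = open("results.txt", "w")
    while True:
        rows = cursor.fetchmany()  # fetches cursor.arraysize rows per call
        if not rows:
            break
        for row in rows:
            out.write("%s\n" % (row,))
        out.flush()  # flush once per batch instead of every 50 rows
    out.close()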
I don't know Oracle that well in particular, but previous experience with the network protocols of other databases has taught me that they tend to ignore latency entirely. In other words, even if there isn't an overly large amount of data to transfer, things can still get glacial if you have to make several round trips to the server for each value.
You could test this theory by adding a bit of latency to your network (on Linux this can be done with tc/netem; I can only imagine there's some way to do it on Windows) and seeing whether it affects the overall runtime significantly.
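A rougher, purely client-side way to probe the same thing is to time the query with a tiny versus a large fetch array: if the runtime scales with the number of round trips, latency is the bottleneck. Untested sketch; the connect string and query are placeholders, and the large batch runs first so the buffer cache doesn't flatter it:

    import time
    import cx_Oracle

    QUERY = "SELECT * FROM my_complex_view"  # placeholder for the real query
    conn = cx_Oracle.connect("user/password@remote_tns")  # placeholder

    for size in (500, 1):  # big batch first so caching doesn't skew the comparison
        cursor = conn.cursor()
        cursor.arraysize = size  # rows fetched per network round trip
        start = time.time()
        cursor.execute(QUERY)
        nrows = len(cursor.fetchall())
        print "arraysize=%d: %d rows in %.1f s" % (size, nrows, time.time() - start)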
Another option is that some of the processing is happening on the client side, which could mean transferring much more than 5 MB of intermediate data to get the result. That would probably show up as a fairly noticeable load on your desktop, though, so it's not all that likely.
All signs seem to point to a network bottleneck. What kind of throughput do you get with file transfers between these two computers? Can you try running the same query from another computer, but one closer in the network topology?
The effective transfer rate is 6.2Mbps -- not good at all. This is a shot in the dark, but... are you neglecting to tell us that there is a cable modem between your workstation and the server? :)
Perhaps your workstation is daisy-chained off an IP phone and you are actually connected at 10Mbps?
Sorry if I'm stating the obvious, but you have made no mention of your network configuration in your question.
Two points to take into account:
Have you executed the query first in Toad, and afterwards in Python+cx_Oracle? The first time you execute a query, Oracle needs to parse it, create an execution plan, and then execute that plan: read from disk into memory (the buffer cache), perform the joins, and so on. The second time around, Oracle reuses the execution plan (stored in the SGA) and reads from the buffer cache instead of disk, so the same query can take a lot less time. An easy way to check this is to time two consecutive runs, as in the sketch after this list.
Loading 8000 rows/5GB of data (655Kbytes per row!!!) into TOAD, to display them in the GUI, can take a lot of time. With python+cx_Oracle you are not displaying anything, so you are saving a lot of time here.
EDITED: OK, so 8000 rows/5 Mb of data (655 bytes per row) shouldn't be a problem for TOAD to display.
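To see the effect from point 1 directly, time two back-to-back executions of the same statement; a big drop on the second run points at plan/buffer-cache warm-up rather than the network. Untested sketch, with placeholder connect string and query:

    import time
    import cx_Oracle

    conn = cx_Oracle.connect("user/password@remote_tns")  # placeholder
    for attempt in (1, 2):
        cursor = conn.cursor()
        start = time.time()
        cursor.execute("SELECT * FROM my_complex_view")  # placeholder query
        cursor.fetchall()
        print "run %d: %.1f s" % (attempt, time.time() - start)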
One SELECT statement should be shipped in its entirety to the server for parsing and execution. It's up to the client to manage how the results are returned. I would bet a lot that the problem lies in how the results are being obtained by your clients. It's been a while since I used TOAD, so I don't know whether it does bulk fetches of the data, but that's one place you can get big improvements in speed, as @slovon notes.
Just for grins, see what traceroute reports from your workstation to the server, and vice-versa.
EDIT: another thing to try is to run your Python script on the server, but connect via TNS through the listener rather than directly. This should give you an idea of what effect the TNS/SQL*Net software is having on your query while removing any intervening network issues.
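In cx_Oracle the difference is just the connect string. Something like this, where the user, password, and SID are placeholders, and the bequeath form assumes ORACLE_SID/ORACLE_HOME are set in the server environment:

    import cx_Oracle

    USER, PASSWORD = "user", "password"  # placeholders

    # Bequeath connection: talks to the local instance directly and
    # bypasses the listener and the SQL*Net network stack entirely.
    direct = cx_Oracle.connect("%s/%s" % (USER, PASSWORD))

    # Same server, but forced through the listener and the TNS stack.
    dsn = cx_Oracle.makedsn("localhost", 1521, "ORCL")  # "ORCL" is a placeholder SID
    via_listener = cx_Oracle.connect(USER, PASSWORD, dsn)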
Also, make sure your DNS is healthy (gotta be able to do reverse lookups) and that you're not using DHCP on the server. Underscores in the hostname are a no-no, as well.
Get a trace with wait times on the sessions, both from TOAD and locally. Then compare the wait events and times, plus the execution plans. It's possible the plans differ.
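One way to capture such a trace is the 10046 event at level 8, which includes wait events. A sketch, assuming your account has the ALTER SESSION privilege; the query is a placeholder:

    import cx_Oracle

    conn = cx_Oracle.connect("user/password@remote_tns")  # placeholder
    cursor = conn.cursor()
    # Level 8 adds wait events; the trace file appears on the server
    # in user_dump_dest (udump on 10g) and can be formatted with tkprof.
    cursor.execute("ALTER SESSION SET EVENTS "
                   "'10046 trace name context forever, level 8'")
    cursor.execute("SELECT * FROM my_complex_view")  # the slow query (placeholder)
    cursor.fetchall()
    cursor.execute("ALTER SESSION SET EVENTS '10046 trace name context off'")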