"There is no significant disk activity (read 0), one CPU is pegged, and
that process is consuming 218M Resident memory, 168M Shared (10% available
memory total). All reasonable, except for the fact it doesn't come back..."
Just to let you know, I've observed the identical phenomenon on my RHEL3-WS
server running PostgreSQL V7.3.4: One of the CPU's pegged at 100% (2-way
SMP with hyperthreading, so 4 apparent CPU's), virtually zero disk I/O
activity, high memory usage, etc. I thought it might be due to a casting
problem in a JOIN's ON clause, but that did not turn out to be the case. I
*have* recently observed that if I run a vacuum analyze on the entire
database, the amount of time spent in this looping state decreases greatly,
but it has *not* disappeared in all cases.
Next week I hope to be able to run some directed test with stats collection
turned on, to try to see if I can find out what's causing this to occur.
I'll post the results if I find anything significant.
Senior IT Architect/Specialist | IBM Virtual Server Administration
Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001
Text Messaging: 6463456978 @ mobile.mycingular.com
Email: srosenst @ us.ibm.com
"Learn from the mistakes of others because you can't live long enough to
make them all yourself." -- Eleanor Roosevelt
Sent by: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Re: [PERFORM] This query is still
09/28/2004 11:44 running after 10 hours...
When grilled further on (Tue, 28 Sep 2004 11:04:23 -0400),
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> confessed:
> Robert Creager <Robert_Creager(at)LogicalChaos(dot)org> writes:
> > Normally, this query takes from 5 minutes to 2 hours to run. On this
> > update, it's been running for more than 10 hours.
> > ...
> > -> Nested Loop (cost=250.69..129094.19 rows=77211 width=59)
> > -> Hash Join (cost=250.69..307.34 rows=67 width=12)
> > Hash Cond: ("outer".pair_id = "inner".pair_id)
> > ...
> It chose a nested loop here because it was only expecting 67 rows out of
> the next-lower join, and so it thought it would only need 67 repetitions
> of the index probe into obs_v_file_id_index. I'm suspicious that that
> estimate was way low and so the nestloop is taking forever. You might
> try "SET enable_nestloop = off" as a crude way of avoiding that trap.
I tried your suggestion. Did generate a different plan (below), but the
estimation is blown as it still used a nested loop. The query is currently
running(42 minutes so far). For the query in question, there are 151
pair_id's in the pairs table, which equates to 302 entries in the files
(part of the query), which moves on to 533592 entries in the obs_v table
533699 entries in the obs_i table.
The groups table has 76 total entries, files 9028, zero_pair 2532,
147. Only the obs_v and obs_i tables have data of any significant
with 10M rows apiece. The trigger hitting the catalog table (875499
searching for single entries to match (one fire per obs_v/obs_i update) on
index (took 54ms on the first query of a random id just now).
There is no significant disk activity (read 0), one CPU is pegged, and that
process is consuming 218M Resident memory, 168M Shared (10% available
total). All reasonable, except for the fact it doesn't come back...
Hash Join (cost=100267870.17..100751247.13 rows=1578889 width=63)
Hash Cond: ("outer".star_id = "inner".star_id)
-> Seq Scan on obs_i i (cost=0.00..213658.19 rows=10391319 width=8)
-> Hash (cost=100266886.39..100266886.39 rows=77113 width=59)
-> Hash Join (cost=100000307.51..100266886.39 rows=77113
Hash Cond: ("outer".file_id = "inner".file_id)
-> Seq Scan on obs_v (cost=0.00..213854.50 rows=10390650
1) -> Hash (cost=100000307.34..100000307.34 rows=67
-> Hash Join (cost=100000250.69..100000307.34 rows=67
width=12) Hash Cond: ("outer".pair_id =
"inner".pair_id) -> Seq Scan on zero_pair zp
(cost=0.00..43.32 rows=2532 width=8) -> Hash
(cost=100000250.40..100000250.40 rows=118 width=12)
-> Hash Join (cost=100000004.80..100000250.40 rows=118 width=12)
Hash Cond: ("outer".night_id = "inner".night_id)
-> Seq Scan on files f (cost=0.00..199.28
rows=9028 width=12) -> Hash
-> Nested Loop (cost=100000000.00..100000004.80 rows=1 width=8)
-> Seq Scan on color_groups cg
(cost=0.00..2.84 rows=1 width=8)
Filter: (175 = group_id)
-> Seq Scan on groups g (cost=0.00..1.95 rows=1 width=8)
Filter: (group_id = 175)
20:48:23 up 5 days, 23:14, 2 users, load average: 2.56, 2.91, 2.78
Linux 2.6.5-02 #8 SMP Mon Jul 12 21:34:44 MDT 2004
(See attached file: color.explain)(See attached file: attlakjy.dat)
Description: application/octet-stream (3.6 KB)
Description: application/octet-stream (204 bytes)
In response to
pgsql-performance by date
|Next:||From: Shea,Dan [CIS]||Date: 2004-09-29 14:44:12|
|Subject: Re: Interest in perf testing?|
|Previous:||From: Richard Huxton||Date: 2004-09-29 08:40:11|
|Subject: Re: [PERFORM] Poor Performance for large queries|