Re: This query is still running after 10 hours...

From: Robert Creager <Robert_Creager(at)LogicalChaos(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PGPerformance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: This query is still running after 10 hours...
Date: 2004-09-29 03:44:24
Message-ID: 20040928214424.2583b925@thunder.mshome.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

When grilled further on (Tue, 28 Sep 2004 11:04:23 -0400),
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> confessed:

> Robert Creager <Robert_Creager(at)LogicalChaos(dot)org> writes:
> > Normally, this query takes from 5 minutes to 2 hours to run. On this
> > update, it's been running for more than 10 hours.
>
> > ...
> > -> Nested Loop (cost=250.69..129094.19 rows=77211 width=59)
> > -> Hash Join (cost=250.69..307.34 rows=67 width=12)
> > Hash Cond: ("outer".pair_id = "inner".pair_id)
> > ...
>
> It chose a nested loop here because it was only expecting 67 rows out of
> the next-lower join, and so it thought it would only need 67 repetitions
> of the index probe into obs_v_file_id_index. I'm suspicious that that
> estimate was way low and so the nestloop is taking forever. You might
> try "SET enable_nestloop = off" as a crude way of avoiding that trap.

I tried your suggestion. Did generate a different plan (below), but the
estimation is blown as it still used a nested loop. The query is currently
running(42 minutes so far). For the query in question, there are 151 different
pair_id's in the pairs table, which equates to 302 entries in the files table
(part of the query), which moves on to 533592 entries in the obs_v table and
533699 entries in the obs_i table.

The groups table has 76 total entries, files 9028, zero_pair 2532, color_groups
147. Only the obs_v and obs_i tables have data of any significant quantities
with 10M rows apiece. The trigger hitting the catalog table (875499 entries) is
searching for single entries to match (one fire per obs_v/obs_i update) on an
index (took 54ms on the first query of a random id just now).

There is no significant disk activity (read 0), one CPU is pegged, and that
process is consuming 218M Resident memory, 168M Shared (10% available memory
total). All reasonable, except for the fact it doesn't come back...

Hash Join (cost=100267870.17..100751247.13 rows=1578889 width=63)
Hash Cond: ("outer".star_id = "inner".star_id)
-> Seq Scan on obs_i i (cost=0.00..213658.19 rows=10391319 width=8)
-> Hash (cost=100266886.39..100266886.39 rows=77113 width=59)
-> Hash Join (cost=100000307.51..100266886.39 rows=77113 width=59)
Hash Cond: ("outer".file_id = "inner".file_id)
-> Seq Scan on obs_v (cost=0.00..213854.50 rows=10390650 width=5
1) -> Hash (cost=100000307.34..100000307.34 rows=67 width=12)
-> Hash Join (cost=100000250.69..100000307.34 rows=67
width=12) Hash Cond: ("outer".pair_id =
"inner".pair_id) -> Seq Scan on zero_pair zp
(cost=0.00..43.32 rows=2532 width=8) -> Hash
(cost=100000250.40..100000250.40 rows=118 width=12)
-> Hash Join (cost=100000004.80..100000250.40 rows=118 width=12)
Hash Cond: ("outer".night_id = "inner".night_id)
-> Seq Scan on files f (cost=0.00..199.28
rows=9028 width=12) -> Hash
(cost=100000004.80..100000004.80rows=1 width=8)
-> Nested Loop (cost=100000000.00..100000004.80 rows=1 width=8)
-> Seq Scan on color_groups cg
(cost=0.00..2.84 rows=1 width=8)
Filter: (175 = group_id)
-> Seq Scan on groups g (cost=0.00..1.95 rows=1 width=8)
Filter: (group_id = 175)

--
20:48:23 up 5 days, 23:14, 2 users, load average: 2.56, 2.91, 2.78
Linux 2.6.5-02 #8 SMP Mon Jul 12 21:34:44 MDT 2004

Attachment Content-Type Size
color.explain application/octet-stream 3.6 KB

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Creager 2004-09-29 03:51:56 Re: This query is still running after 10 hours...
Previous Message John Meinel 2004-09-29 03:14:23 Poor Performance for large queries in functions