Re: Query causing explosion of temp space with join involving partitioning

From: Krzysztof Nienartowicz <Krzysztof(dot)Nienartowicz(at)unige(dot)ch>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: Query causing explosion of temp space with join involving partitioning
Date: 2010-05-26 15:27:50
Message-ID: AANLkTiljwIjxXQU-TFrt1qYky3dkxDLUAAWUss35dnFy@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-performance

Hello,
Thank you for the clarifications. The plan as run from the psql looks ok,
also did not notice any specific locks for this particular query.

Logs of the system running queries are not utterly clear, so chasing the
parameters for the explosive query is not that simple (shared logs between
multiple threads), but from what I see there is no difference between them
and the plan looks like (without removal of irrelevant parameters this time,
most of them are float8, but also bytea)

explain SELECT t0.surveyid, t0.srcid, t1.survey_pk, t1.source_pk, t1.tstype,
t1.homoscedasticitytest, t1.ljungboxrandomnesstest, t1.maxvalue,
t1.meanobstime,
t1.meanvalue, t1.median, t1.minvalue, t1.range, t1.robustweightedstddev,
t1.symmetrytest, t1.trimmedweightedmean, t1.trimmedweightedrange,
t1.variabilityflag, t1.weightedkurtosis, t1.weightedmean,
t1.weightedmeanconfidenceinterval, t1.weightedmeanobstime,
t1.weightednormalizedp2pscatter,
t1.weightedskewness, t1.weightedstddevdf,
t1.weightedstddevwdf, t1.vals, t1.ccdids, t1.flags, t1.obstime, t1.len,
t1.valueerrors FROM sources t0 INNER JOIN ts t1 ON
t0.surveyid = t1.survey_pk AND t0.srcid = t1.source_pk WHERE (t0.surveyid =
16 AND t0.srcid >= 200210107009116 AND t0.srcid <= 200210107009991)
ORDER BY t0.surveyid ASC, t0.srcid ASC ;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2363.21 rows=835683 width=1527)
Join Filter: (t0.srcid = t1.source_pk)
-> Index Scan using sources_pkey on sources t0 (cost=0.00..17.88 rows=1
width=12)
Index Cond: ((surveyid = 16) AND (srcid >= 200210107009116::bigint)
AND (srcid <= 200210107009991::bigint))
-> Append (cost=0.00..2325.93 rows=1552 width=1053)
-> Index Scan using ts_pkey on ts t1 (cost=0.00..4.27 rows=1
width=1665)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_bs3000l00000_ts_pkey on
ts_part_bs3000l00000 t1 (cost=0.00..6.30 rows=2 width=327)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_sm0073k00001_ts_pkey on
ts_part_sm0073k00001 t1 (cost=0.00..1232.63 rows=608 width=327)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on
ts_part_bs3000l00001_cg0346l00000 t1 (cost=0.00..145.41 rows=127
width=1556)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_cg0346l00001_cg0816k00000_ts_pkey on
ts_part_cg0346l00001_cg0816k00000 t1 (cost=0.00..147.64 rows=127
width=1669)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_cg0816k00001_cg1180k00000_ts_pkey on
ts_part_cg0816k00001_cg1180k00000 t1 (cost=0.00..138.09 rows=119
width=1615)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_cg1180k00001_cg6204k00000_ts_pkey on
ts_part_cg1180k00001_cg6204k00000 t1 (cost=0.00..125.69 rows=109
width=1552)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_cg6204k00001_lm0022n00000_ts_pkey on
ts_part_cg6204k00001_lm0022n00000 t1 (cost=0.00..133.23 rows=116
width=1509)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_lm0022n00001_lm0276m00000_ts_pkey on
ts_part_lm0022n00001_lm0276m00000 t1 (cost=0.00..131.08 rows=115
width=1500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_lm0276m00001_lm0584k00000_ts_pkey on
ts_part_lm0276m00001_lm0584k00000 t1 (cost=0.00..158.11 rows=135
width=1471)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_lm0584k00001_sm0073k00000_ts_pkey on
ts_part_lm0584k00001_sm0073k00000 t1 (cost=0.00..103.47 rows=93 width=1242)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))

I could increase debug level on the server, but not sure if the plan printed
there is of any help. Could this be caused by some race where there is too
much activity? - DB box is at around 10% CPU load, small io wait, before the
query starts to overload the machine.

For sake of clarity this is the plan for the non-joined parameters to show
which partition would be used (i.e. a single one)

explain select * from ts t0 where t0.survey_pk = 16 AND t0.source_pk >=
200210107009116 AND t0.source_pk <= 200210107009991;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..10.72 rows=2 width=1614)
-> Append (cost=0.00..10.72 rows=2 width=1614)
-> Index Scan using ts_pkey on ts t0 (cost=0.00..4.27 rows=1
width=1669)
Index Cond: ((survey_pk = 16) AND (source_pk >=
200210107009116::bigint) AND (source_pk <= 200210107009991::bigint))
-> Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on
ts_part_bs3000l00001_cg0346l00000 t0 (cost=0.00..6.45 rows=1 width=1560)
Index Cond: ((survey_pk = 16) AND (source_pk >=
200210107009116::bigint) AND (source_pk <= 200210107009991::bigint))
(6 rows)

Time: 1.559 ms

and to check the bin size:
> select count(*) from ts t0 where t0.survey_pk = 16 AND t0.source_pk >=
200210107009116 AND t0.source_pk <= 200210107009991;
count
-------
1000
(1 row)

and analyzed plan:
> explain analyze SELECT t0.surveyid, t0.srcid, t1.survey_pk, t1.source_pk,
t1.tstype, t1.homoscedasticitytest, t1.ljungboxrandomnesstest, t1.maxvalue,
t1.meanobstime,
t1.meanvalue, t1.median, t1.minvalue, t1.range, t1.robustweightedstddev,
t1.symmetrytest, t1.trimmedweightedmean, t1.trimmedweightedrange,
t1.variabilityflag, t1.weightedkurtosis, t1.weightedmean,
t1.weightedmeanconfidenceinterval, t1.weightedmeanobstime,
t1.weightednormalizedp2pscatter,
t1.weightedskewness, t1.weightedstddevdf,
t1.weightedstddevwdf, t1.vals, t1.ccdids, t1.flags, t1.obstime, t1.len,
t1.valueerrors FROM oglehip.sources t0 INNER JOIN oglehip.ts t1 ON
t0.surveyid = t1.survey_pk AND t0.srcid = t1.source_pk WHERE (t0.surveyid =
16 AND t0.srcid >= 200210107009116 AND t0.srcid <= 200210107009991)
ORDER BY t0.surveyid ASC, t0.srcid ASC ;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2363.21 rows=835692 width=1527) (actual
time=73.629..585.003 rows=1000 loops=1)
Join Filter: (t0.srcid = t1.source_pk)
-> Index Scan using sources_pkey on sources t0 (cost=0.00..17.88 rows=1
width=12) (actual time=73.507..560.589 rows=500 loops=1)
Index Cond: ((surveyid = 16) AND (srcid >= 200210107009116::bigint)
AND (srcid <= 200210107009991::bigint))
-> Append (cost=0.00..2325.93 rows=1552 width=1053) (actual
time=0.014..0.045 rows=2 loops=500)
-> Index Scan using ts_pkey on ts t1 (cost=0.00..4.27 rows=1
width=1665) (actual time=0.001..0.001 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_bs3000l00000_ts_pkey on
ts_part_bs3000l00000 t1 (cost=0.00..6.30 rows=2 width=327) (actual
time=0.002..0.002 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_sm0073k00001_ts_pkey on
ts_part_sm0073k00001 t1 (cost=0.00..1232.63 rows=608 width=327) (actual
time=0.004..0.004 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_bs3000l00001_cg0346l00000_ts_pkey on
ts_part_bs3000l00001_cg0346l00000 t1 (cost=0.00..145.41 rows=127
width=1556) (actual time=0.006..0.007 rows=2 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_cg0346l00001_cg0816k00000_ts_pkey on
ts_part_cg0346l00001_cg0816k00000 t1 (cost=0.00..147.64 rows=127
width=1669) (actual time=0.004..0.004 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_cg0816k00001_cg1180k00000_ts_pkey on
ts_part_cg0816k00001_cg1180k00000 t1 (cost=0.00..138.09 rows=119
width=1615) (actual time=0.004..0.004 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_cg1180k00001_cg6204k00000_ts_pkey on
ts_part_cg1180k00001_cg6204k00000 t1 (cost=0.00..125.69 rows=109
width=1552) (actual time=0.004..0.004 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_cg6204k00001_lm0022n00000_ts_pkey on
ts_part_cg6204k00001_lm0022n00000 t1 (cost=0.00..133.23 rows=116
width=1509) (actual time=0.004..0.004 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_lm0022n00001_lm0276m00000_ts_pkey on
ts_part_lm0022n00001_lm0276m00000 t1 (cost=0.00..131.08 rows=115
width=1500) (actual time=0.004..0.004 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_lm0276m00001_lm0584k00000_ts_pkey on
ts_part_lm0276m00001_lm0584k00000 t1 (cost=0.00..158.11 rows=135
width=1471) (actual time=0.004..0.004 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
-> Index Scan using ts_part_lm0584k00001_sm0073k00000_ts_pkey on
ts_part_lm0584k00001_sm0073k00000 t1 (cost=0.00..103.47 rows=93 width=1242)
(actual time=0.004..0.004 rows=0 loops=500)
Index Cond: ((t1.survey_pk = 16) AND (t1.source_pk =
t0.srcid))
Total runtime: 585.566 ms
(28 rows)

Time: 588.102 ms

Would be grateful for any pointers as the server restart is the only option
now once such a query starts trashing the disk.

Best Regards,
Krzysztof

Krzysztof Nienartowicz <krzysztof(dot)nienartowicz(dot)cern(at)gmail(dot)com> writes:
> surveys-> SELECT t1.SURVEY_PK, t1.SOURCE_PK, t1.TSTYPE, t1.VALS
> surveys-> FROM sources t0 ,TS t1 where
> surveys-> (t0.SURVEYID = 16 AND t0.SRCID >= 203510110032281 AND
> t0.SRCID <= 203520107001677 and t0.SURVEYID = t1.SURVEY_PK AND t0.SRCID =
> t1.SOURCE_PK ) ORDER BY t0.SURVEYID ASC, t0.SRCID ASC

We don't make any attempt to infer derived inequality conditions,
so no, those constraints on t0.srcid won't be propagated over to
t1.source_pk. Sorry. It's been suggested before, but it would be
a lot of new mechanism and expense in the planner, and for most
queries it'd just slow things down to try to do that.

> I have around 30 clients running the same query with different
> parameters, but the query always returns 1000 rows (boundary values
> are pre-calculated,so it's like traversal of the equiwidth histogram
> if it comes to srcid/source_pk) and the rows from parallel queries
> cannot be overlapping. Usually query returns within around a second.
> I noticed however there are some queries that hang for many hours and
> what is most curious some of them created several GB of temp files.

Can you show us the query plan for the slow cases?

regards, tom lane

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kevin Grittner 2010-05-26 16:28:35 Re: BUG #5474: Installation
Previous Message MarceloG 2010-05-26 13:53:13 Reinicialização - Pg_Listener

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2010-05-26 15:55:04 Re: Random Page Cost and Planner
Previous Message tv 2010-05-26 10:19:28 Re: Random Page Cost and Planner