Skip site navigation (1) Skip section navigation (2)

query has huge variance in execution times

From: Brian Cox <brian(dot)cox(at)ca(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: query has huge variance in execution times
Date: 2010-03-31 04:11:36
Message-ID: 4BB2CB78.1020409@ca.com (view raw or flat)
Thread:
Lists: pgsql-performance
postgres 8.3.5 on RHEL4 update 6

This query starts executing at 18:41:

cemdb=> select query_start,current_query from pg_stat_activity where 
procpid=10022;           query_start          | 
 
          current_query
-------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  2010-03-30 18:41:11.685261-07 | select b.ts_id from 
ts_stats_tranunit_user_daily b, ts_stats_tranunit_user_interval c where 
b.ts_transet_incarnation_id = c.ts_transet_incarnation_id and 
b.ts_tranunit_id = c.ts_tranunit_id and b.ts_user_incarnation_id = 
c.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and 
c.ts_interval_start_time < $2 and b.ts_interval_start_time >= $3 and 
b.ts_interval_start_time < $4
(1 row)

about 5 mins later, I, suspecting problems, do (the values are the same 
as for $1 et al above; EXPLAIN was done on purpose to keep stats 
[hopefully] the same as when pid 10022 started; there are 80,000 rows in 
each of the 2 tables at the time of this EXPLAIN and when 10022 started):

cemdb=> explain select b.ts_id from ts_stats_tranunit_user_daily b, 
ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = 
c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and 
b.ts_user_incarnation_id = c.ts_user_incarnation_id and 
c.ts_interval_start_time >= '2010-3-29 01:00' and 
c.ts_interval_start_time < '2010-3-29 02:00' and 
b.ts_interval_start_time >= '2010-3-29' and b.ts_interval_start_time < 
'2010-3-30';
 
                    QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Merge Join  (cost=33574.89..34369.38 rows=25207 width=8)
    Merge Cond: ((b.ts_transet_incarnation_id = 
c.ts_transet_incarnation_id) AND (b.ts_tranunit_id = c.ts_tranunit_id) 
AND (b.ts_user_incarnation_id = c.ts_user_incarnation_id))
    ->  Sort  (cost=13756.68..13854.96 rows=78623 width=32)
          Sort Key: b.ts_transet_incarnation_id, b.ts_tranunit_id, 
b.ts_user_incarnation_id
          ->  Index Scan using ts_stats_tranunit_user_daily_starttime on 
ts_stats_tranunit_user_daily b  (cost=0.00..10560.13 rows=78623 width=32)
                Index Cond: ((ts_interval_start_time >= '2010-03-29 
00:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < 
'2010-03-30 00:00:00-07'::timestamp with time zone))
    ->  Sort  (cost=19818.21..19959.72 rows=113207 width=24)
          Sort Key: c.ts_transet_incarnation_id, c.ts_tranunit_id, 
c.ts_user_incarnation_id
          ->  Index Scan using ts_stats_tranunit_user_interval_starttime 
on ts_stats_tranunit_user_interval c  (cost=0.00..15066.74 rows=113207 
width=24)
                Index Cond: ((ts_interval_start_time >= '2010-03-29 
01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < 
'2010-03-29 02:00:00-07'::timestamp with time zone))
(10 rows)

cemdb=> \q

I then run the query manually:

[root(at)rdl64xeoserv01 log]# time PGPASSWORD=quality psql -U admin -d 
cemdb -c "select b.ts_id from ts_stats_tranunit_user_daily b, 
ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = 
c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and 
b.ts_user_incarnation_id = c.ts_user_incarnation_id and 
c.ts_interval_start_time >= '2010-3-29 01:00' and 
c.ts_interval_start_time < '2010-3-29 02:00' and 
b.ts_interval_start_time >= '2010-3-29' and b.ts_interval_start_time < 
'2010-3-30'" > /tmp/select.txt 2>&1

real    0m0.813s
user    0m0.116s
sys     0m0.013s

I let process 10022 run for an hour. an strace shows lots of I/O:

[root(at)rdl64xeoserv01 log]# strace -p 10022
read(18, "\214\2\0\0\374<\200#\1\0\0\0<\0P\3\0 \4 \0\0\0\0\320\234"..., 
8192) = 8192
semop(73007122, 0xbfe0fc20, 1)          = 0
_llseek(18, 538451968, [538451968], SEEK_SET) = 0
read(18, "\214\2\0\0\274\347\t#\1\0\0\0<\0P\3\0 \4 \0\0\0\0\320\234"..., 
8192) = 8192
_llseek(18, 535928832, [535928832], SEEK_SET) = 0
read(18, "\214\2\0\0\310\300\226\"\1\0\0\0<\0P\3\0 \4 \0\0\0\0\320"..., 
8192) = 8192
_llseek(18, 532398080, [532398080], SEEK_SET) = 0

<many more similar lines>

I then kill 10022 and the application retries the same query:

[10022-cemdb-admin-2010-03-30 19:02:37.460 PDT]FATAL:  terminating 
connection due to administrator command
[10022-cemdb-admin-2010-03-30 19:02:37.460 PDT]STATEMENT:  select 
b.ts_id from ts_stats_tranunit_user_daily b, 
ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = 
c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and 
b.ts_user_incarnation_id = c.ts_user_incarnation_id and 
c.ts_interval_start_time >= $1 and c.ts_interval_start_time < $2 and 
b.ts_interval_start_time >= $3 and b.ts_interval_start_time < $4

[10820-cemdb-admin-2010-03-30 19:02:40.363 PDT]LOG:  duration: 1096.598 
ms  execute <unnamed>: select b.ts_id from ts_stats_tranunit_user_daily 
b, ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = 
c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and 
b.ts_user_incarnation_id = c.ts_user_incarnation_id and 
c.ts_interval_start_time >= $1 and c.ts_interval_start_time < $2 and 
b.ts_interval_start_time >= $3 and b.ts_interval_start_time < $4

Ideas on why the big difference in execution times of the same query and 
how to avoid same are solicited. I tend to doubt, but don't know how to 
prove, that the query stats for these 2 tables were updated between the 
start of pid 10022 and when I did the EXPLAIN.

Thanks,
Brian

Responses

pgsql-performance by date

Next:From: David WilsonDate: 2010-03-31 04:37:40
Subject: Re: query has huge variance in execution times
Previous:From: Robert HaasDate: 2010-03-30 19:59:46
Subject: Re: experiments in query optimization

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group