Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time

From: Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time
Date: 2011-01-13 12:13:21
Message-ID: 201101131413.21757.achill@matrix.gatewaynet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Στις Wednesday 12 January 2011 18:45:20 ο/η Tom Lane έγραψε:
> Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> writes:
> > Στις Wednesday 12 January 2011 17:07:53 ο/η Tom Lane έγραψε:
> >> Right offhand I'd wonder whether that was more bound by gettimeofday or
> >> by printf. Please try it without printf in the loop.
>
> > Changed that to smth like: micros_total = micros_total + (double) micros;
> > instead of the printf to beat any compiler optimization, and still linux runs at light speed:
> > FBSD_TEST : user 0.089s, sys 1.4s
> > FBSD_DEV : user 0.183s, sys 3.8s
> > LINUX_PROD : user 0.168s, sys 0s
>
> Well, there's your problem all right: the FBSD_DEV system takes 22X
> longer to execute gettimeofday() than the LINUX_PROD system. The
> particular plan that 9.0 is choosing is especially vulnerable to this
> because it involves a whole lot of calls of the Materialize plan node:
>
> > -> Nested Loop (cost=3390.49..9185.73 rows=1 width=4) (actual time=279.916..93280.499 rows=1824 loops=1)
> > Join Filter: (msold.marinerid = mold.id)
> > -> Index Scan using mariner_pkey on mariner mold (cost=0.00..1957.89 rows=15914 width=4) (actual time=0.009..38.449 rows=15914 loops=1)
> > Filter: ((marinertype)::text = 'Mariner'::text)
> > -> Materialize (cost=3390.49..6989.13 rows=1 width=8) (actual time=0.013..2.881 rows=1888 loops=15914)
>
> You've got 1888 * 15914 * 2 gettimeofday() calls needed to count up the
> time spent in Materialize, which at 3.8usec apiece adds up to ... hmm
> ... I get 228 seconds, which is more than the reported runtime. So I
> think there's something wrong with your measurement of the gettimeofday
> cost. But I wouldn't be a bit surprised to hear that the true cost of
> gettimeofday on that machine is a microsecond or so, in which case all
> of the EXPLAIN ANALYZE time bloat is indeed due to that.
>
> This problem has been discussed repeatedly on the pgsql-performance
> list, which is where your question really belongs, not here. The short
> of it is that cheap motherboards tend to provide cheap clock hardware
> that takes about a microsecond to read. It also helps to be running
> an OS that is able to read the clock in userspace without invoking a
> kernel call. It looks like your Linux box is winning on both counts
> compared to your BSD boxes.

Just comparing the two FBSD boxes on equal terms gives crazy results, so
I'll try to leave LINUX out of the equation for a while, and also i'll try to leave the
difference in DB sizes out of the equation so what i did was make a dump of the small DB
on the small (slow) FBSD box (FBSD_DEV) and load it on the bigger (fast) FBSD box (FBSD_TEST).
I set : sysctl kern.timecounter.hardware=TSC on both systems
and now the below program

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
int main(int argc,char** argv) {
struct timeval *tp=calloc(1,sizeof(struct timeval));
int runna;
long double micros_total=0;
for (runna=0;runna<1000000;runna++) {
int rc=gettimeofday(tp,NULL);
long micros = tp->tv_sec * 1000000 + tp->tv_usec;
micros_total = micros_total + (long double) micros;
}
printf("micros_total=%LE\n",micros_total);
}

% time ./gettimeofday_test gives

FBSD_DEV (slow) : 0.127u 0.231s 0:00.35 100.0% 5+192k 0+0io 0pf+0w
FBSD_TEST (fast) : 0.002u 0.014s 0:00.15 6.6% 84+24192k 0+0io 0pf+0w

Database on the fast system, as i said is exactly the same as in the slow one.

The query "SELECT avg(md.perioddue) from status st,items it,machdefs md WHERE st.apptblidval=it.id AND it.defid=md.defid;"
gives with \timing

FBSD_DEV (slow) : Time: 28747,374 ms, EXPLAIN ANALYZE: Total runtime: 31816.648 ms
FBSD_TEST (fast) : Time: 3153,372 ms, EXPLAIN ANALYZE: Total runtime: 5701.942 ms

(the fast system is indeed a lot faster on this one (by 10x))

However the query (known from the initial mail as well)
SELECT distinct m.id,coalesce(m.givenname,''),coalesce(m.midname,''),m.surname from marinerstates ms,vessels vsl,mariner m
where m.id=ms.marinerid and ms.vslid=vsl.id and ms.state='Active' and coalesce(ms.endtime,now())::date >= '2006-07-15' and
ms.starttime::date <= '2007-01-11' and m.marinertype='Mariner' and m.id not in
(SELECT distinct mold.id from marinerstates msold,vessels vslold,mariner mold where mold.id=msold.marinerid and msold.vslid=vslold.id
and msold.state='Active' and coalesce(msold.endtime,now())::date >= '2006-07-15' and msold.starttime::date <= '2007-01-11' and exists
(select 1 from marinerstates msold2 where msold2.marinerid=msold.marinerid and msold2.state='Active' and msold2.id <> msold.id and
msold2.starttime<msold.starttime AND (msold.starttime-msold2.endtime)<='18 months') and mold.marinertype='Mariner' )
order by m.surname,coalesce(m.givenname,''),coalesce(m.midname,'')
gives:

FBSD_DEV (slow) : Time : 246,032 ms EXPLAIN ANALYZE: Total runtime: 274.425 ms
FBSD_TEST (fast) : Time: 2314,914 ms EXPLAIN ANALYZE: Total runtime: 10438.061 ms

This query runs much much slower in the case of FBSD_TEST.

The explain analyze for this query are:

FBSD_DEV:

Unique (cost=633677.56..633700.48 rows=1834 width=23) (actual time=270.184..271.042 rows=603 loops=1)
-> Sort (cost=633677.56..633682.14 rows=1834 width=23) (actual time=270.182..270.440 rows=603 loops=1)
Sort Key: m.surname, (COALESCE(m.givenname, ''::character varying)), (COALESCE(m.midname, ''::character varying)), m.id
Sort Method: quicksort Memory: 53kB
-> Hash Join (cost=630601.65..633578.15 rows=1834 width=23) (actual time=194.815..269.294 rows=603 loops=1)
Hash Cond: (ms.vslid = vsl.id)
-> Hash Join (cost=630580.33..633530.01 rows=2261 width=27) (actual time=194.444..268.140 rows=603 loops=1)
Hash Cond: (ms.marinerid = m.id)
-> Seq Scan on marinerstates ms (cost=0.00..2875.32 rows=4599 width=8) (actual time=0.012..70.281 rows=2625 loops=1)
Filter: (((state)::text = 'Active'::text) AND ((starttime)::date <= '2007-01-11'::date) AND ((COALESCE(endtime, now()))::date >= '
2006-07-15'::date))
-> Hash (cost=630491.54..630491.54 rows=7103 width=23) (actual time=194.391..194.391 rows=12832 loops=1)
-> Index Scan using mariner_pkey on mariner m (cost=628776.89..630491.54 rows=7103 width=23) (actual time=162.607..184.282 rows=
12832 loops=1)
Filter: ((NOT (hashed subplan)) AND ((marinertype)::text = 'Mariner'::text))
SubPlan
-> Unique (cost=0.00..628772.30 rows=1834 width=4) (actual time=0.064..161.099 rows=1454 loops=1)
-> Nested Loop (cost=0.00..628767.72 rows=1834 width=4) (actual time=0.062..159.091 rows=1835 loops=1)
-> Nested Loop (cost=0.00..627027.98 rows=1865 width=4) (actual time=0.056..147.464 rows=1876 loops=1)
-> Index Scan using marinerstates_marinerid on marinerstates msold (cost=0.00..626316.07 rows=2299 wid
th=8) (actual time=0.050..138.100 rows=1876 loops=1)
Filter: (((state)::text = 'Active'::text) AND ((starttime)::date <= '2007-01-11'::date) AND ((COAL
ESCE(endtime, now()))::date >= '2006-07-15'::date) AND (subplan))
SubPlan
-> Bitmap Heap Scan on marinerstates msold2 (cost=4.28..12.11 rows=1 width=0) (actual time=0.0
08..0.008 rows=1 loops=2625)
Recheck Cond: ((marinerid = $0) AND (starttime < $2))
Filter: ((id <> $1) AND ((state)::text = 'Active'::text) AND (($2 - endtime) <= '1 year 6
mons'::interval))
-> Bitmap Index Scan on marinerstates_marinerid_starttime (cost=0.00..4.28 rows=2 width=
0) (actual time=0.005..0.005 rows=6 loops=2625)
Index Cond: ((marinerid = $0) AND (starttime < $2))
-> Index Scan using vessels_pkey on vessels vslold (cost=0.00..0.30 rows=1 width=4) (actual time=0.002
..0.003 rows=1 loops=1876)
Index Cond: (vslold.id = msold.vslid)
-> Index Scan using mariner_pkey on mariner mold (cost=0.00..0.92 rows=1 width=4) (actual time=0.004..0.004
rows=1 loops=1876)
Index Cond: (mold.id = msold.marinerid)
Filter: ((mold.marinertype)::text = 'Mariner'::text)
-> Hash (cost=17.81..17.81 rows=281 width=4) (actual time=0.363..0.363 rows=272 loops=1)
-> Seq Scan on vessels vsl (cost=0.00..17.81 rows=281 width=4) (actual time=0.009..0.199 rows=272 loops=1)
Total runtime: 271.466 ms

FBSD_TEST:

Unique (cost=11525.09..11571.55 rows=3717 width=23) (actual time=10455.286..10455.635 rows=603 loops=1)
-> Sort (cost=11525.09..11534.38 rows=3717 width=23) (actual time=10455.285..10455.390 rows=603 loops=1)
Sort Key: m.surname, (COALESCE(m.givenname, ''::character varying)), (COALESCE(m.midname, ''::character varying)), m.id
Sort Method: quicksort Memory: 71kB
-> Hash Join (cost=8281.98..11304.67 rows=3717 width=23) (actual time=10418.032..10454.399 rows=603 loops=1)
Hash Cond: (ms.marinerid = m.id)
-> Hash Join (cost=20.12..2963.83 rows=3717 width=4) (actual time=0.199..34.986 rows=2625 loops=1)
Hash Cond: (ms.vslid = vsl.id)
-> Seq Scan on marinerstates ms (cost=0.00..2889.32 rows=4590 width=8) (actual time=0.015..33.507 rows=2625 loops=1)
Filter: (((state)::text = 'Active'::text) AND ((starttime)::date <= '2007-01-11'::date) AND ((COALESCE(endtime, now()))::date >= '
2006-07-15'::date))
-> Hash (cost=16.72..16.72 rows=272 width=4) (actual time=0.179..0.179 rows=272 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on vessels vsl (cost=0.00..16.72 rows=272 width=4) (actual time=0.003..0.099 rows=272 loops=1)
-> Hash (cost=8172.57..8172.57 rows=7143 width=23) (actual time=10417.803..10417.803 rows=12832 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 702kB
-> Seq Scan on mariner m (cost=7614.86..8172.57 rows=7143 width=23) (actual time=10402.282..10412.840 rows=12832 loops=1)
Filter: ((NOT (hashed SubPlan 1)) AND ((marinertype)::text = 'Mariner'::text))
SubPlan 1
-> Unique (cost=2768.00..7614.86 rows=1 width=4) (actual time=85.291..10401.289 rows=1454 loops=1)
-> Nested Loop (cost=2768.00..7614.86 rows=1 width=4) (actual time=85.290..10400.399 rows=1835 loops=1)
Join Filter: (msold.marinerid = mold.id)
-> Index Scan using mariner_pkey on mariner mold (cost=0.00..1728.60 rows=14286 width=4) (actual time=0.007..13.53
1 rows=14286 loops=1)
Filter: ((marinertype)::text = 'Mariner'::text)
-> Materialize (cost=2768.00..5671.97 rows=1 width=8) (actual time=0.003..0.329 rows=1876 loops=14286)
-> Nested Loop (cost=2768.00..5671.96 rows=1 width=8) (actual time=37.502..83.326 rows=1876 loops=1)
-> Hash Semi Join (cost=2768.00..5671.67 rows=1 width=12) (actual time=37.491..79.434 rows=1876 loops=
1)
Hash Cond: (msold.marinerid = msold2.marinerid)
Join Filter: ((msold2.id <> msold.id) AND (msold2.starttime < msold.starttime) AND ((msold.startti
me - msold2.endtime) <= '1 year 6 mons'::interval))
-> Seq Scan on marinerstates msold (cost=0.00..2889.32 rows=4590 width=20) (actual time=0.004..3
4.281 rows=2625 loops=1)
Filter: (((state)::text = 'Active'::text) AND ((starttime)::date <= '2007-01-11'::date) AND
((COALESCE(endtime, now()))::date >= '2006-07-15'::date))
-> Hash (cost=2251.66..2251.66 rows=41307 width=24) (actual time=37.446..37.446 rows=41250 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 2246kB
-> Seq Scan on marinerstates msold2 (cost=0.00..2251.66 rows=41307 width=24) (actual time=0.002..23.350 rows=41250 loops=1)
Filter: ((state)::text = 'Active'::text)
-> Index Scan using vessels_pkey on vessels vslold (cost=0.00..0.28 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1876)
Index Cond: (vslold.id = msold.vslid)
Total runtime: 10456.092 ms

So the questions remain,
a) Why do we have "EXPLAIN ANALYZE" in FBSD_TEST (fast) taking an order of magnitude slower than the actual query.
b) Why this particular query is so slow on this generally much faster system.

PS

I agree that this is a lot "performance" related, but the "EXPLAIN ANALYZE" part just dont seem right.
Thats why i posted on admin in the first place.

>
> regards, tom lane
>

--
Achilleas Mantzios

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Achilleas Mantzios 2011-01-13 12:27:26 Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time
Previous Message Achilleas Mantzios 2011-01-13 10:35:12 Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time