Re: [HACKERS] All things equal, we are still alot slower then MySQL?

From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] All things equal, we are still alot slower then MySQL?
Date: 1999-09-23 00:25:03
Message-ID: Pine.BSF.4.10.9909222100560.38923-100000@thelab.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Okay, after playing around with this some more tonight, and playing with
the PGOPTIONS you've presented...I've gotten the query to be faster then
with mysql :) The error of my ways: not enough indices *sigh* I created a
few more on the fields that were being used on the query, and have:

SELECT c.id, c.name, c.url
FROM aecCategory c
WHERE EXISTS (
SELECT a.status
FROM aecEntMain a, aecWebEntry b
WHERE a.status LIKE 'active:ALL%'
AND a.representation LIKE '%:ALL%'
AND b.status LIKE 'active:ALL%'
AND b.indid='000001'
AND b.divid='100016'
AND ((a.id,a.mid) = (b.id,b.mid))
AND ((b.catid,b.indid,b.divid) = (c.id,c.ppid,c.pid)));

==========
Seq Scan on aeccategory c (cost=69.61 rows=1170 width=36)
SubPlan
-> Nested Loop (cost=4.10 rows=1 width=60)
-> Index Scan using aecwebentry_divid on aecwebentry b (cost=2.03 rows=1 width=24)
-> Index Scan using aecentmain_primary on aecentmain a (cost=2.07 rows=480 width=36)
===========

producing the results I need in 1.26seconds, using 1.5% of the CPU.

Now, something does bother me here, and I'm not sure if its a problem we
need to address, or if its expected, but if I remove the index
aecwebentry_divid, it reverts to using aecwebentry_primary and increases
the query time to 12secs, which is:

create unique index aecWebEntry_primary on aecWebEntry ( indid,divid,catid,id,mid);

Should it do that?

On Wed, 22 Sep 1999, Tom Lane wrote:

> The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> > Anyone get a chance to look into this?
>
> Only just now, but I do have a couple of thoughts.
>
> For the query
>
> SELECT distinct b.indid, b.divid, b.catid, a.id, a.mid \
> FROM aecEntMain a, aecWebEntry b \
> WHERE (a.id=b.id AND a.mid=b.mid) \
> AND (a.status like 'active%' and b.status like 'active%')
> AND (a.status like '%active:ALL%' and b.status like '%active:ALL%')
> AND (a.representation like '%:ALL%')
> AND (b.indid=? and b.divid=? and b.catid=?)";
>
> you're showing a plan of
>
> Unique (cost=1271.15 rows=5 width=84)
> -> Sort (cost=1271.15 rows=5 width=84)
> -> Nested Loop (cost=1271.15 rows=5 width=84)
> -> Index Scan using aecwebentry_primary on aecwebentry b (cost=1269.08 rows=1 width=60)
> -> Index Scan using aecentmain_primary on aecentmain a (cost=2.07 rows=16560 width=24)
>
> which indicates that the optimizer is guessing only one match in
> aecwebentry and is therefore putting it on the outside of the nested
> loop (so that the inner scan over aecentmain would only have to be
> done once, if it's guessing right). But in a later message you
> say that the actual number of hits is more like 39 for aecwebentry
> and one for aecentmain. Which means that the nested loop would go
> faster if it were done the other way round, aecentmain on the outside.
> I'm not sure of a way to force the system to try it that way, though.
>
> The other question is why is it using a nested loop at all, rather
> than something more intelligent like merge or hash join. Presumably
> the optimizer thinks those would be more expensive, but it might be
> wrong.
>
> You could try forcing selection of merge and hash joins for this
> query and see (a) what kind of plan do you get, (b) how long does
> it really take? To do that, start psql with PGOPTIONS environment
> variable set:
>
> PGOPTIONS="-fn -fh" # forbid nestloop and hash, ie, force mergejoin
>
> PGOPTIONS="-fn -fm" # forbid nestloop and merge, ie, force hashjoin
>
> Also, I don't think you ever mentioned exactly what the available
> indexes are on these tables?
>
> regards, tom lane
>

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 1999-09-23 01:19:39 Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions
Previous Message Jan Wieck 1999-09-23 00:22:54 Re: [HACKERS] Compile timing