Re: [HACKERS] Another index "buglet"?

From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Another index "buglet"?
Date: 2000-01-08 19:05:12
Message-ID: Pine.BSF.4.21.0001081500330.18498-100000@thelab.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


After the VACUUM ANALYZE:

Straight start up:
Hash Join (cost=9994.31 rows=2740488 width=36)
-> Seq Scan on url (cost=3368.58 rows=37866 width=20)
-> Hash (cost=3368.58 rows=37866 width=16)
-> Seq Scan on url url2 (cost=3368.58 rows=37866 width=16)

788u 0.327s 0:03.89 28.2% 104+14868k 0+179io 0pf+0w

Forbid merge:
Hash Join (cost=9994.31 rows=2740488 width=36)
-> Seq Scan on url (cost=3368.58 rows=37866 width=20)
-> Hash (cost=3368.58 rows=37866 width=16)
-> Seq Scan on url url2 (cost=3368.58 rows=37866 width=16)

0.900u 0.217s 0:04.19 26.4% 103+14638k 0+175io 0pf+0w

Forbid Hash:
Merge Join (cost=11188.76 rows=2740488 width=36)
-> Index Scan using url_pkey on url url2 (cost=4347.30 rows=37866 width=16)
-> Index Scan using url_referrer on url (cost=4342.30 rows=37866 width=20)

0.897u 0.210s 0:03.19 34.4% 106+15120k 0+179io 0pf+0w

On Sat, 8 Jan 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> > SELECT url.status,url2.url,url.url
> > FROM url,url url2
> > WHERE url.referrer=url2.rec_id;
>
> > There is an index on rec_id and one on referrer ... shouldn't one of the
> > be used?
>
> Not necessarily --- hash join is a perfectly respectable alternative
> choice. I'd expect to see either a hash or a merge join here (the
> merge *would* use both indexes).
>
> Now it could be that the optimizer is misestimating the relative costs
> of merge and hash join. If you're interested in checking that, do
> this (*after* running VACUUM ANALYZE, ahem):
>
> 1. Start psql with environment variable PGOPTIONS="-fh" (forbid hash).
> Do the EXPLAIN --- it'll probably give a mergejoin plan now. Note
> the estimated total cost. Run the query itself, and note the runtime.
>
> 2. Start psql with environment variable PGOPTIONS="-fm" (forbid merge),
> and repeat the experiment to get the estimated cost and actual time
> for the hash join.
>
> I'd be interested to know what you find out. I'm in the middle of
> rejiggering the optimizer's cost estimates right now, so more data
> points would be helpful.
>
> regards, tom lane
>

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-01-08 20:07:35 Re: [HACKERS] Re: ERROR: out of free buffers: time to abort !
Previous Message The Hermit Hacker 2000-01-08 18:58:54 Re: [HACKERS] Costs: Index vs Non-Index