Re: Performance improvement hints + measurement

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: devik(at)cdi(dot)cz
Cc: pgsql-hackers(at)hub(dot)org
Subject: Re: Performance improvement hints + measurement
Date: 2000-09-13 14:47:36
Message-ID: 7484.968856456@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

devik(at)cdi(dot)cz writes:
> What do you think about this approach:

> 1) add all validity & tx fields from heap tuple into
> index tuple too

Non-starter I'm afraid. That would mean that whenever we update a
tuple, we'd have to find and update all the index entries that refer to
it. You'd be taking a tremendous performance hit on all update
operations in the hope of saving time on only a relatively small number
of inquiries.

This has been discussed before (repeatedly, IIRC). Please peruse the
pghackers archives.

> I regulary need to run this query against it:
> select nazev,sum(cnt) from bigrel group by name;
> With index, in pg there is a big overhead of heap tuple
> reading - mssql uses data directly from scanned index.

How exactly is MSSQL going to do that with only an index on "name"?
You need to have access to the cnt field as well, which wouldn't be
present in an index entry for name.

> I'm not sure how complex the proposed changes are. Another
> way would be to implement another aggregator like HashAgg
> which will use hashing.

That would be worth looking at --- we have no such plan type now.

> But it could be even more complicated as one has to use
> temp relation to store all hash buckets ..

You could probably generalize the existing code for hashjoin tables
to support hash aggregation as well. Now that I think about it, that
sounds like a really cool idea. Should put it on the TODO list.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-09-13 15:10:11 Re: null in constraints
Previous Message Philip Warner 2000-09-13 14:17:02 RE: current is broken