Re: Performance improvement hints + measurement

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	devik(at)cdi(dot)cz
Cc:	pgsql-hackers(at)hub(dot)org
Subject:	Re: Performance improvement hints + measurement
Date:	2000-09-15 00:17:30
Message-ID:	15856.968977050@sss.pgh.pa.us
Lists:	pgsql-hackers

devik(at)cdi(dot)cz writes:
>> You could probably generalize the existing code for hashjoin tables
>> to support hash aggregation as well. Now that I think about it, that
>> sounds like a really cool idea. Should put it on the TODO list.

> Yep. It should be easy. It could be used as part of Hash
> node by extending ExecHash to return all hashed rows and
> adding value{1,2}[nbuckets] to HashJoinTableData.

Actually I think what we want is a hash table indexed by the
grouping-column value(s) and storing the current running aggregate
states for each agg function being computed. You wouldn't really
need to store any of the original tuples. You might want to form
the agg states for each entry into a tuple just for convenience of
storage though.
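
To make the idea concrete, here is a minimal sketch of such a table, assuming a single integer grouping column and two aggregates (a count and a sum). All names here (AggTable, AggEntry, agg_advance, NBUCKETS) are hypothetical and are not taken from the PostgreSQL sources; the fixed bucket count and chained collisions are simplifications for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 64             /* hypothetical fixed bucket count, no resizing */

/* One entry per distinct grouping value; only running states are stored. */
typedef struct AggEntry {
    int32_t key;                /* grouping-column value */
    int64_t count;              /* running state for a COUNT(*)-like aggregate */
    int64_t sum;                /* running state for a SUM(val)-like aggregate */
    struct AggEntry *next;      /* collision chain within the bucket */
} AggEntry;

typedef struct AggTable {
    AggEntry *buckets[NBUCKETS];
} AggTable;

static unsigned hash_key(int32_t key)
{
    /* simple multiplicative hash; any reasonable hash function would do */
    return ((uint32_t) key * 2654435761u) % NBUCKETS;
}

void agg_init(AggTable *t)
{
    for (size_t b = 0; b < NBUCKETS; b++)
        t->buckets[b] = NULL;
}

/* Find the entry for key, creating a zeroed one if absent; NULL if out of memory. */
AggEntry *agg_lookup(AggTable *t, int32_t key)
{
    assert(t != NULL);
    unsigned b = hash_key(key);
    AggEntry *e;

    for (e = t->buckets[b]; e != NULL; e = e->next)
        if (e->key == key)
            return e;

    e = calloc(1, sizeof *e);
    if (e == NULL)
        return NULL;
    e->key = key;
    e->next = t->buckets[b];
    t->buckets[b] = e;
    return e;
}

/* Advance the aggregate states for one input row; the row itself is not kept. */
int agg_advance(AggTable *t, int32_t key, int64_t val)
{
    AggEntry *e = agg_lookup(t, key);

    if (e == NULL)
        return -1;
    e->count += 1;
    e->sum += val;
    return 0;
}

/* Release every entry and leave the table empty. */
void agg_free(AggTable *t)
{
    for (size_t b = 0; b < NBUCKETS; b++) {
        AggEntry *e = t->buckets[b];

        while (e != NULL) {
            AggEntry *next = e->next;
            free(e);
            e = next;
        }
        t->buckets[b] = NULL;
    }
}
```

A caller would invoke agg_advance once per input row and, after the input is exhausted, walk the buckets to emit one result row per entry. Memory use then grows with the number of distinct groups rather than the number of input rows.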

> By the way, what are the "portal" and "slot"?

As far as the hash code is concerned, a portal is just a memory
allocation context. Destroying the portal gets rid of all the
memory allocated therein, without the hassle of finding and freeing
each palloc'd block individually.
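
As a rough illustration of that pattern (not the actual portal or palloc code), a context can be modeled as a list of allocations that is released in one call. MemCtx, ctx_alloc and ctx_destroy are hypothetical names, and this sketch assumes a C11 compiler for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every allocation; the union keeps the payload
 * that follows it suitably aligned for any object type. */
typedef union Block {
    union Block *next;          /* next allocation in the same context */
    max_align_t align;          /* forces worst-case alignment (C11) */
} Block;

typedef struct MemCtx {
    Block *blocks;              /* every allocation made in this context */
    size_t nblocks;             /* bookkeeping only */
} MemCtx;

void ctx_init(MemCtx *ctx)
{
    ctx->blocks = NULL;
    ctx->nblocks = 0;
}

/* Allocate size bytes that belong to ctx; returns NULL on failure. */
void *ctx_alloc(MemCtx *ctx, size_t size)
{
    Block *b;

    assert(ctx != NULL);
    if (size > SIZE_MAX - sizeof(Block))
        return NULL;            /* request would overflow */
    b = malloc(sizeof(Block) + size);
    if (b == NULL)
        return NULL;
    b->next = ctx->blocks;
    ctx->blocks = b;
    ctx->nblocks++;
    return (void *) (b + 1);    /* payload starts right after the header */
}

/* Free everything allocated in ctx in one pass; callers never free
 * individual blocks themselves. */
void ctx_destroy(MemCtx *ctx)
{
    Block *b = ctx->blocks;

    while (b != NULL) {
        Block *next = b->next;
        free(b);
        b = next;
    }
    ctx->blocks = NULL;
    ctx->nblocks = 0;
}
```

A hash table built this way would take every bucket and entry from one context and drop the whole structure with a single ctx_destroy call when the node shuts down.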

As for slots, you are probably thinking of tuple table slots, which
are used to hold the tuples returned by plan nodes. The input
tuples read by the hash node are stored in a slot that's filled
by the child Plan node each time it's called. Similarly, the hash
join node has to return a new tuple in its output slot each time
it's called. It's a pretty simplistic form of memory management,
but it works fine for plan node output tuples.
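
A toy version of the slot convention might look like the following, assuming a two-column tuple and two node types invented for the example (ScanNode, FilterNode). These are hypothetical and far simpler than the real executor structures; the point is only that each node owns one slot and overwrites it on every call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Tuple {
    int key;
    int val;
} Tuple;

/* A slot holds at most one tuple: the most recent output of its node. */
typedef struct Slot {
    Tuple tuple;
    bool  empty;                /* true once the node has no more tuples */
} Slot;

/* Child node: returns the rows of an in-memory array one at a time. */
typedef struct ScanNode {
    const Tuple *rows;
    size_t nrows;
    size_t pos;
    Slot   slot;                /* refilled on every call */
} ScanNode;

Slot *scan_next(ScanNode *n)
{
    assert(n != NULL);
    if (n->pos < n->nrows) {
        n->slot.tuple = n->rows[n->pos++];
        n->slot.empty = false;
    } else {
        n->slot.empty = true;
    }
    return &n->slot;
}

/* Parent node: reads the child's slot and fills its own output slot. */
typedef struct FilterNode {
    ScanNode *child;
    int   min_val;              /* pass only tuples with val >= min_val */
    Slot  slot;
} FilterNode;

Slot *filter_next(FilterNode *n)
{
    assert(n != NULL);
    for (;;) {
        Slot *in = scan_next(n->child);

        if (in->empty) {
            n->slot.empty = true;
            return &n->slot;
        }
        if (in->tuple.val >= n->min_val) {
            n->slot.tuple = in->tuple;
            n->slot.empty = false;
            return &n->slot;
        }
    }
}
```

Because the same slot is reused, a tuple obtained from one call is only valid until the next call on that node; a caller that needs it longer has to copy it out.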

If you are interested in working on this idea, you should be looking
at current sources --- both the memory management for hash tables
and the implementation of aggregate state storage have changed
materially since 7.0, so code based on 7.0 would need a lot of work
to be usable.

regards, tom lane

In response to

Re: Performance improvement hints + measurement at 2000-09-13 14:47:36 from Tom Lane
