| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | mac_man2005(at)hotmail(dot)it | 
| Cc: | pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Replacement Selection | 
| Date: | 2007-11-26 22:17:07 | 
| Message-ID: | 4430.1196115427@sss.pgh.pa.us | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
<mac_man2005(at)hotmail(dot)it> writes:
> Anyway, even in my RS implementation a longer run is created. The first M 
> initialization elements will surely form part of the current run. M is the 
> memory size so at least a run sized M will be created. After initialization, 
> the elements are not suddenly output, but an element from heap is output 
> into run as soon as I get an element from stream. In other words, for each 
> element from stream, the root element of the heap is output, and the input 
> element takes the root place into the heap. If that element is a "good 
> record" I just heapify (since the element will be placed at the now free 
> root place). If that input element is a dead record I swap it with the last 
> leaf and reduce the heap size.
AFAICS that produces runs that are *exactly* the same length as Knuth's
method --- you're just using a different technique for detecting when
the run is over, to wit "record is not in heap" vs "record is in heap
but with a higher run number".  I guess you would save some comparisons
while the heap is shrinking, but it's not at all clear that you'd save
more than what it will cost you to re-heapify all the dead records once
the run is over.
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Mark Cave-Ayland | 2007-11-26 22:17:09 | Re: Locating sharedir in PostgreSQL on Windows | 
| Previous Message | Tom Lane | 2007-11-26 22:07:59 | Re: 8.3devel slower than 8.2 under read-only load |