
Re: Replacement Selection

From: <mac_man2005(at)hotmail(dot)it>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replacement Selection
Date: 2007-11-26 18:08:55
Message-ID: BAY132-DS194113DD78CD5D3842F97E6750@phx.gbl
Lists: pgsql-hackers

I'm trying to integrate my code into PostgreSQL. At the moment I have my own 
working code, with its own main() and so on.
The code is supposed to perform run generation during external sorting. 
That's all, my code won't do any mergesort. Just run generation.

I'm studying the code and I don't know where to put my code: which parts 
I need to replace and which others are absolutely "untouchable".
I admit I'm not an excellent programmer. I've always written my own 
code, simple code. Now I have some ideas that could possibly help 
PostgreSQL get better, and for the first time I have to integrate code into 
someone else's code. I say this just to apologize in case some things that 
are obvious to someone else are not obvious to me.

Anyway... back to work.
My code has the following structure.

1) Generate a random input stream to sort.
For this part, I just generate a stream of integers, not a stream of 
db records. I talk about a stream because I'm considering the general case 
in which the input source can be unknown and we cannot even know how many 
elements there are to sort.

2) Fill the available memory with the first M elements from the stream. They 
will be arranged into a heap structure.

3) Start run generation. For this phase, I see that the PostgreSQL code (like 
Knuth's algorithm) marks elements in order to know which run they 
belong to and to know when the current heap has finished building the 
current run. I don't store this kind of information. I just output from the 
heap to the run all of the elements going into the current run. The elements 
that have to go into the next run (I call them "dead records") are still 
stored in main memory, but as leaves of the heap. This means reducing the 
heap size and so heapifying a smaller number of elements each time I get a 
dead record (it's not necessary to sort the dead records). When the heap size 
is zero, a new run is created by heapifying all the dead records currently 
present in main memory.
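
Steps 2 and 3 can be sketched in C roughly as follows. This is only an
illustration of the description above, not the actual code: the names
generate_runs, sift_down and heapify, the plain int elements, and the use of
arrays in place of a real input stream and run files are all assumptions made
to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Restore the min-heap property below index i, within heap[0..n). */
static void sift_down(int *heap, size_t n, size_t i)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, s = i;
        if (l < n && heap[l] < heap[s]) s = l;
        if (r < n && heap[r] < heap[s]) s = r;
        if (s == i) return;
        int tmp = heap[i]; heap[i] = heap[s]; heap[s] = tmp;
        i = s;
    }
}

/* Build a min-heap over heap[0..n). */
static void heapify(int *heap, size_t n)
{
    for (size_t i = n / 2; i-- > 0; )
        sift_down(heap, n, i);
}

/*
 * Hypothetical run generator. Reads in[0..n_in), uses mem[0..m) as the
 * workspace, writes elements to out in output order, and records in
 * run_start[k] the index in out where run k begins. The caller must size
 * out and run_start for n_in entries. Returns the number of runs.
 */
size_t generate_runs(const int *in, size_t n_in, int *mem, size_t m,
                     int *out, size_t *run_start)
{
    assert(m > 0);
    size_t next = 0, n_out = 0, n_runs = 0;
    size_t live = 0;   /* mem[0..live) is the heap of the current run */
    size_t dead = 0;   /* mem[cap-dead..cap) holds the dead records   */

    /* Step 2: fill memory with the first M elements and heapify them. */
    while (live < m && next < n_in)
        mem[live++] = in[next++];
    size_t cap = live;
    heapify(mem, live);

    /* Step 3: run generation. */
    while (live > 0) {
        run_start[n_runs++] = n_out;
        while (live > 0) {
            int last = mem[0];
            out[n_out++] = last;
            if (next < n_in && in[next] >= last) {
                mem[0] = in[next++];      /* still fits the current run */
            } else {
                live--;                   /* heap shrinks by one leaf */
                mem[0] = mem[live];
                if (next < n_in) {        /* dead record: park it in the */
                    mem[live] = in[next++];   /* leaf slot just freed    */
                    dead++;
                }
            }
            sift_down(mem, live, 0);
        }
        /* Heap is empty: the dead records become the next run's heap. */
        memmove(mem, mem + (cap - dead), dead * sizeof *mem);
        live = dead;
        dead = 0;
        heapify(mem, live);
    }
    return n_runs;
}
```

With the input 5 3 8 1 9 2 7 and M = 3, this sketch produces two runs:
3 5 8 9 and 1 2 7. No per-element run number is stored; the boundary between
the heap and the dead records is the only bookkeeping.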

I haven't seen anything similar in tuplesort.c: apparently no heapify is 
called, no new run is created, and so on.
Do you see any parallel between the PostgreSQL code and what I described in 
the previous points?
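
For comparison, the run-marking approach mentioned in step 3 can be sketched
like this. It is a minimal illustration of the idea only, not PostgreSQL's
actual code: TaggedItem, tagged_less and tag_incoming are hypothetical names,
and the keys are plain ints.

```c
#include <stdbool.h>

/* Hypothetical heap entry: every element carries the run it belongs to. */
typedef struct {
    int run;   /* number of the run this element will be written to */
    int key;   /* sort key */
} TaggedItem;

/*
 * Heap ordering: compare run numbers first, then keys. Elements tagged for
 * the next run therefore stay below every element of the current run, and
 * the run ends when the root of the heap carries a larger run number.
 */
static bool tagged_less(const TaggedItem *a, const TaggedItem *b)
{
    if (a->run != b->run)
        return a->run < b->run;
    return a->key < b->key;
}

/*
 * Tag an incoming key: it joins the current run if it can still follow the
 * last element written out, otherwise it is marked for the next run.
 */
static TaggedItem tag_incoming(int key, int last_output, int current_run)
{
    TaggedItem item;
    item.run = (key >= last_output) ? current_run : current_run + 1;
    item.key = key;
    return item;
}
```

In this scheme a single heap of constant size holds both runs at once, so no
separate heapify is needed at a run boundary; in my scheme the heap shrinks
and is rebuilt from the dead records instead.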

Thanks for your attention.

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Sent: Monday, November 26, 2007 5:42 PM
To: <mac_man2005(at)hotmail(dot)it>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Replacement Selection

> mac_man2005(at)hotmail(dot)it wrote:
>> Unfortunately I'm lost into the code... any good soul helping me to 
>> understand what should be the precise part to be modified?
> You haven't given any details on what you're trying to do. What are you 
> trying to do?
> -- 
>   Heikki Linnakangas
>   EnterpriseDB
