Re: Replacement Selection

From: <mac_man2005(at)hotmail(dot)it>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replacement Selection
Date: 2007-11-26 18:08:55
Message-ID: BAY132-DS194113DD78CD5D3842F97E6750@phx.gbl
Lists: pgsql-hackers

Sorry.

I'm trying to integrate my code into PostgreSQL. At the moment I have
working code, with my own main() and so on.
The code is supposed to perform run generation during external sorting.
That's all: my code does no merging, only run generation.

I'm studying the code and I don't know where to put mine: which parts
I need to replace and which others are absolutely "untouchable".
I admit I'm not an excellent programmer. I've always written my own
code, simple code. Now I have some ideas that could possibly help
PostgreSQL get better, and for the first time I have to integrate code into
someone else's code. I say this just to apologize in case some things that
are obvious to someone else are not obvious to me.

Anyway... back to work.
My code has the following structure.

1) Generate a random input stream to sort.
For this part, I just generate a stream of integers, not a stream of
db records. I say "stream" because I'm considering the general case in which
the input source may be unknown and we cannot even know how many elements
there are to sort.

2) Fill the available memory with the first M elements from the stream. They
are arranged into a heap structure.

3) Start run generation. For this phase, I see that the PostgreSQL code (like
Knuth's algorithm) marks each element with the run it belongs to, in order to
know which run it goes into and when the current heap has finished building
the current run. I don't store this kind of information. I just output from
the heap to the run all of the elements going into the current run. The
elements that have to go into the next run (I call them "dead records") are
still stored in main memory, but in the positions of the last leaves of the
heap. This means reducing the heap size by one, and so heapifying a smaller
number of elements, each time I get a dead record (it's not necessary to sort
dead records). When the heap size reaches zero, a new run is created by
heapifying all the dead records currently in main memory.
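
The three steps above can be sketched roughly as follows. This is only a
minimal illustration under some assumptions, not the actual code: it sorts
plain ints in ascending order with a min-heap, the input is an array rather
than a real stream, and the names sift_down, heapify and generate_runs are
made up for this example. Moving the dead records to the front of the buffer
with memmove once the input is exhausted is also an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Restore the min-heap property below index i; the heap is a[0..n-1]. */
static void sift_down(int *a, size_t n, size_t i)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && a[l] < a[m]) m = l;
        if (r < n && a[r] < a[m]) m = r;
        if (m == i) return;
        int t = a[i]; a[i] = a[m]; a[m] = t;
        i = m;
    }
}

/* Build a min-heap over a[0..n-1] (bottom-up heapify). */
static void heapify(int *a, size_t n)
{
    for (size_t i = n / 2; i-- > 0; )
        sift_down(a, n, i);
}

/*
 * Hypothetical driver: read n_in ints from in[], use mem[0..cap-1] as the
 * work memory, write all runs back to back into out[] and the length of
 * each run into run_len[]. out[] and run_len[] must have room for n_in
 * entries. Returns the number of runs produced.
 */
size_t generate_runs(const int *in, size_t n_in, int *mem, size_t cap,
                     int *out, size_t *run_len)
{
    size_t pos = 0, filled = 0, nout = 0, nruns = 0;

    assert(cap > 0);
    while (filled < cap && pos < n_in)      /* step 2: fill the memory */
        mem[filled++] = in[pos++];

    size_t heap = filled;   /* live heap is mem[0..heap-1] */
    size_t dead = 0;        /* dead records are mem[filled-dead..filled-1] */
    heapify(mem, heap);
    if (heap > 0)
        run_len[0] = 0;

    while (heap > 0) {                      /* step 3: run generation */
        int v = mem[0];
        out[nout++] = v;
        run_len[nruns]++;

        if (pos < n_in && in[pos] >= v) {
            mem[0] = in[pos++];             /* still fits the current run */
        } else {
            mem[0] = mem[heap - 1];         /* heap shrinks by one slot */
            heap--;
            if (pos < n_in) {               /* dead record takes that slot */
                mem[heap] = in[pos++];
                dead++;
            }
        }
        sift_down(mem, heap, 0);

        if (heap == 0) {                    /* current run is finished */
            nruns++;
            if (dead > 0) {                 /* next run: heapify the dead */
                memmove(mem, mem + (filled - dead), dead * sizeof *mem);
                filled = heap = dead;
                dead = 0;
                heapify(mem, heap);
                run_len[nruns] = 0;
            }
        }
    }
    return nruns;
}
```

For example, with the input 5 3 8 1 9 2 7 4 6 and M = 3, this sketch produces
two runs: 3 5 8 9 and 1 2 4 6 7. No run number is stored with any element;
the boundary between live and dead records is just the current heap size.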

I haven't seen anything similar in tuplesort.c: apparently no heapify is
called, no new run is created, and so on.
Do you see any parallel between the PostgreSQL code and what I described in
the previous points?
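
For comparison, the run-marking scheme mentioned in point 3 can be sketched
like this. It is only an illustration of the general idea of ordering the
heap by (run number, key); the type TaggedItem and the functions tagged_cmp
and tag_incoming are hypothetical names and are not taken from tuplesort.c.

```c
/* Hypothetical heap entry: a key plus the run it has been assigned to. */
typedef struct {
    int run;
    int key;
} TaggedItem;

/* Order by run number first, then by key; negative if a sorts before b. */
static int tagged_cmp(const TaggedItem *a, const TaggedItem *b)
{
    if (a->run != b->run)
        return (a->run < b->run) ? -1 : 1;
    return (a->key > b->key) - (a->key < b->key);
}

/*
 * Tag an incoming key: it joins the current run only if it is not smaller
 * than the key that was just written out; otherwise it waits for the next.
 */
static TaggedItem tag_incoming(int key, int last_output, int current_run)
{
    TaggedItem it;
    it.run = (key >= last_output) ? current_run : current_run + 1;
    it.key = key;
    return it;
}
```

With this ordering the elements of the next run stay inside the same heap
but sink below every element of the current run, so no separate heapify is
needed when a run ends. The run changes when the top of the heap carries a
higher run number.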

Thanks for your attention.

--------------------------------------------------
From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Sent: Monday, November 26, 2007 5:42 PM
To: <mac_man2005(at)hotmail(dot)it>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Replacement Selection

> mac_man2005(at)hotmail(dot)it wrote:
>> Unfortunately I'm lost into the code... any good soul helping me to
>> understand what should be the precise part to be modified?
>
> You haven't given any details on what you're trying to do. What are you
> trying to do?
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
