Re: CTID issues and a soc student in need of help

From: Tzahi Fadida <tzahi(dot)ml(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CTID issues and a soc student in need of help
Date: 2006-06-01 17:39:24
Message-ID: 1149183564.4871.71.camel@llord
Lists: pgsql-hackers
On Thu, 2006-06-01 at 12:45 -0400, Tom Lane wrote:
> Tzahi Fadida <tzahi(dot)ml(at)gmail(dot)com> writes:
> > I am not sure about the definition of a context of a single SQL command.
> 
> Well, AFAICS selecting a disjunction ought to qualify as a single SQL
> command using a single snapshot.  It's not that different from a JOIN
> or UNION operation, no?

Yes, it is (at least in the current version I am implementing) a
one-shot computation. It is computed top-down, not bottom-up as regular
joins are. For example, A natural join B natural join C can be broken
down into a left-deep plan tree. Full disjunctions cannot be broken down
that way (in this version), so FD('A,B,C') directly returns a set of
results.

> 
> > Inside C-language FullDisjunctions() function i repeatedly call, using
> > SPI:
> > SELECT * FROM Relation1;
> > SELECT * FROM Relation2;
> > SELECT * FROM Relation1 WHERE...;
> > SELECT * FROM Relation3;
> > ....
> 
> You would need to force all these operations to be done with the same
> snapshot; should be possible with SPI_execute_snapshot.  But really the
> above sounds like a toy prototype implementation to me.  Why aren't you
> building this as executor plan-tree machinery?

I actually use cursors, because I iterate repeatedly over the
"SELECT * FROM Relation1" queries using the FETCH_ALL technique.
Hopefully cursors use something similar to SPI_execute_snapshot?
(Maybe with the READ_ONLY option that I use; I see it uses something
called ActiveSnapshot.)
The WHERE queries, however, are intended to exploit indices on the
relations, so those I must execute repeatedly.

The reason is twofold:
- At this time I don't see any big advantage (aside from the schema)
in putting it in the parser and subsequently the executor.
- I want to work within the time frame of the SoC.

I think that I should first have a stable contrib module that looks
acceptable before I continue to something more problematic to maintain.

We have a new paper, accepted to VLDB yesterday, that breaks the problem
down into smaller ones, uses iterators, and has polynomial delay, which
suits streaming. Hence the possibility of implementing it in the
planner, but that is too complex for SoC. Let's have something stable
first.

> 
> > p.s.: In a different version of the function i create a temporary
> > relation and insert tuples in it, but it is exclusively used and
> > destroyed by the specific instance of that function.
> 
> Why?  You could use a tuplestore for transient data.

I do use a tuplestore, but the other version needs an index, and you
can't put an index on a tuplestore. Unless you can give me a hint on how
to create a btree/hash index without a relation, but one that can be
stored on disk like a tuplestore, i.e. all data is stored in the index.
The key is the whole tuple (the array of CTIDs) anyway.
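
To make "the key is the whole tuple" concrete, here is a minimal sketch
of a set keyed on an array of CTIDs. Everything in it is an assumption
for illustration: Ctid is a hypothetical stand-in for PostgreSQL's
ItemPointerData, the fd_* names are made up, and it lives purely in
memory, so it does not address the spill-to-disk part of the question.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ItemPointerData: (block number, offset). */
typedef struct { uint32_t block; uint16_t offset; } Ctid;

#define FD_NBUCKETS 1024            /* fixed bucket count, never resized */

typedef struct FdEntry {
    struct FdEntry *next;           /* chain of entries in the same bucket */
    Ctid key[];                     /* one CTID per participating relation */
} FdEntry;

typedef struct {
    size_t   ncols;                 /* number of CTIDs in every key */
    FdEntry *buckets[FD_NBUCKETS];
} FdSet;

/* FNV-style mix over the fields of every CTID in the key. */
uint32_t fd_hash(const Ctid *key, size_t ncols)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < ncols; i++) {
        h = (h ^ key[i].block) * 16777619u;
        h = (h ^ key[i].offset) * 16777619u;
    }
    return h % FD_NBUCKETS;
}

/* Field-wise comparison; memcmp would also compare struct padding. */
int fd_key_equal(const Ctid *a, const Ctid *b, size_t ncols)
{
    for (size_t i = 0; i < ncols; i++)
        if (a[i].block != b[i].block || a[i].offset != b[i].offset)
            return 0;
    return 1;
}

FdSet *fd_create(size_t ncols)
{
    FdSet *s = calloc(1, sizeof *s);
    assert(s != NULL);
    s->ncols = ncols;
    return s;
}

/* Returns 1 if the key is present, 0 otherwise. */
int fd_contains(const FdSet *s, const Ctid *key)
{
    for (const FdEntry *e = s->buckets[fd_hash(key, s->ncols)]; e; e = e->next)
        if (fd_key_equal(e->key, key, s->ncols))
            return 1;
    return 0;
}

/* Returns 1 if the key was inserted, 0 if it was already present. */
int fd_insert(FdSet *s, const Ctid *key)
{
    if (fd_contains(s, key))
        return 0;
    FdEntry *e = malloc(sizeof *e + s->ncols * sizeof(Ctid));
    assert(e != NULL);
    memcpy(e->key, key, s->ncols * sizeof(Ctid));
    uint32_t b = fd_hash(key, s->ncols);
    e->next = s->buckets[b];        /* push onto the front of the chain */
    s->buckets[b] = e;
    return 1;
}

/* Frees every entry and then the set itself. */
void fd_destroy(FdSet *s)
{
    for (size_t b = 0; b < FD_NBUCKETS; b++)
        for (FdEntry *e = s->buckets[b], *n; e; e = n) {
            n = e->next;
            free(e);
        }
    free(s);
}
```

A caller would fd_create(3) for FD('A,B,C'), then fd_insert() each
candidate result as an array of three CTIDs; a return value of 0 means
that combination was already produced.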

> 
> 			regards, tom lane

