Re: CTID issues and a soc student in need of help

From: Tzahi Fadida <tzahi(dot)ml(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CTID issues and a soc student in need of help
Date: 2006-06-01 17:39:24
Message-ID: 1149183564.4871.71.camel@llord
Lists: pgsql-hackers

On Thu, 2006-06-01 at 12:45 -0400, Tom Lane wrote:
> Tzahi Fadida <tzahi(dot)ml(at)gmail(dot)com> writes:
> > I am not sure about the definition of a context of a single SQL command.
>
> Well, AFAICS selecting a disjunction ought to qualify as a single SQL
> command using a single snapshot. It's not that different from a JOIN
> or UNION operation, no?

Yes, it is (at least in the version I am currently implementing) a
one-shot computation. It is computed top-down rather than bottom-up like
regular joins. For example, A natural join B natural join C can be broken
down into a left-deep plan tree. A full disjunction cannot be decomposed
that way (in this version), so FD('A,B,C') directly returns a set of
results.
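
To make the contrast concrete, here is a minimal sketch of the left-deep decomposition mentioned above, ((A join B) join C), built from a binary natural join. It is only an illustration in Python, not code from the module: the list-of-dicts representation and the names natural_join and left_deep_join are assumptions made for the example, and it does not attempt to compute a full disjunction.

```python
from functools import reduce

def natural_join(left, right):
    """Natural join of two relations, each given as a list of dicts."""
    out = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()  # attributes present in both tuples
            if all(l[k] == r[k] for k in shared):
                out.append({**l, **r})
    return out

def left_deep_join(relations):
    """Fold the binary join left to right: ((R1 join R2) join R3) ..."""
    return reduce(natural_join, relations)

# Hypothetical sample relations.
A = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
B = [{"b": 10, "c": 100}, {"b": 30, "c": 300}]
C = [{"c": 100, "d": "x"}]

result = left_deep_join([A, B, C])
```

Each intermediate result feeds the next binary join, which is what lets a planner build the answer bottom-up; FD('A,B,C') as described above has no such pairwise decomposition in this version.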

>
> > Inside C-language FullDisjunctions() function i repeatedly call, using
> > SPI:
> > SELECT * FROM Relation1;
> > SELECT * FROM Relation2;
> > SELECT * FROM Relation1 WHERE...;
> > SELECT * FROM Relation3;
> > ....
>
> You would need to force all these operations to be done with the same
> snapshot; should be possible with SPI_execute_snapshot. But really the
> above sounds like a toy prototype implementation to me. Why aren't you
> building this as executor plan-tree machinery?

I actually use cursors, because I iterate repeatedly over the
"SELECT * FROM Relation1" queries using the FETCH_ALL technique.
Hopefully cursors use something similar to SPI_execute_snapshot?
(Maybe with the READ_ONLY option that I use; I see it uses something
called ActiveSnapshot.)
The WHERE queries, however, are intended to exploit indexes on the
relations, so those I must execute repeatedly.

The reason is twofold:
- At this time I don't see any big advantage (aside from the schema)
in putting it in the parser and subsequently the executor.
- I want to work within the time frame of the SoC.

I think that I should first have a stable contrib module that looks
acceptable before I continue to something more problematic to maintain.

We have a new paper, accepted to VLDB yesterday, that breaks the problem
down into smaller ones, uses iterators, and has polynomial delay, which
is suited for streaming; hence the possibility of implementing it in
the planner. But it's too complex for the SoC. Let's have something
stable first.

>
> > p.s.: In a different version of the function i create a temporary
> > relation and insert tuples in it, but it is exclusively used and
> > destroyed by the specific instance of that function.
>
> Why? You could use a tuplestore for transient data.

I do use a tuplestore, but the other version needs an index, and you
can't put an index on a tuplestore. Unless you can give me a hint on how
to create a btree/hash index without a relation but that can be stored on
disk like a tuplestore, i.e. all the data is stored in the index. The key
is the whole tuple (the array of CTIDs) anyway.
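
As a rough illustration of an index whose key is the whole tuple, the sketch below keeps CTID arrays in sorted order and finds them by binary search. It is an assumption-laden toy in Python, not PostgreSQL code: the class name CtidArrayIndex is hypothetical, each CTID is modelled as a (block, offset) pair, every key is assumed to have the same length, and the structure lives in memory only, so it does not address the on-disk requirement.

```python
import bisect

class CtidArrayIndex:
    """Sorted, duplicate-free collection of CTID arrays (illustrative only)."""

    def __init__(self):
        self._keys = []  # kept sorted so lookups can use binary search

    def insert(self, ctids):
        """Add a CTID array; return False if it was already present."""
        key = tuple(ctids)
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            return False
        self._keys.insert(pos, key)
        return True

    def __contains__(self, ctids):
        key = tuple(ctids)
        pos = bisect.bisect_left(self._keys, key)
        return pos < len(self._keys) and self._keys[pos] == key

# Hypothetical result tuples, one (block, offset) CTID per source relation.
index = CtidArrayIndex()
first = index.insert([(0, 1), (0, 3), (2, 7)])
duplicate = index.insert([(0, 1), (0, 3), (2, 7)])
second = index.insert([(0, 1), (0, 4), (2, 7)])
```

Because the key is the entire array, the index stores nothing besides the keys themselves, which matches the "all the data is stored in the index" requirement above.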

>
> regards, tom lane
