On 6/27/08, Chris Browne <cbbrowne(at)acm(dot)org> wrote:
> josh(at)agliodbs(dot)com (Josh Berkus) writes:
> > Jonah,
> >> Hmm, I didn't think the Skype tools could really provide federated
> >> database functionality without a good amount of custom work. Or, am I
> >> mistaken?
> > Sure, what do you think pl/proxy is for?
> Ah, but the thing is, it changes the model from a relational one,
> where you can have fairly arbitrary "where clauses," to one where
> parameterization of queries must be predetermined.
> The "hard part" of federated database functionality at this point is
> the [parenthesized portion] of...
> select * from table(at)node [where criterion = x];
> What we'd like to be able to do is to ascertain that [where criterion
> = x] portion, and run it on the remote DBMS, so that only the relevant
> tuples would come back.
> What if table(at)node is a remote table with 200 million tuples, and
> [where criterion = x] restricts the result set to 200 of those?
> If you *cannot* push the "where clause" down to the remote node, then
> you're stuck with pulling all 200 million tuples, and filtering out,
> on the "local" node, the 200 tuples that need to be kept.
> To do better, with pl/proxy, requires having a predetermined function
> that would do that filtering, and if it's missing, you're stuck
> pulling 200M tuples, and throwing out nearly all of them.
> In contrast, with the work David Fetter's looking at, the [where
> criterion = x] clause would get pushed to the node which the data is
> being drawn from, and so the query, when running on "table(at)node,"
> could use indices, and return only the 200 tuples that are of
> interest.
> It's a really big win, if it works.
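To put numbers on the difference described above, here is a minimal sketch in Python. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for "table(at)node", and the names make_remote_node, fetch_without_pushdown and fetch_with_pushdown are hypothetical, not part of PL/Proxy or of any federation patch. It compares how many tuples cross the "wire" in each case.

```python
import sqlite3


def make_remote_node(n_rows=20000, n_matching=2):
    """Build a stand-in for table@node (assumption: in-memory SQLite, not a real remote DBMS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, criterion INTEGER)")
    # The first n_matching rows have criterion = 1; all other rows have criterion = 0.
    conn.executemany(
        "INSERT INTO t (id, criterion) VALUES (?, ?)",
        ((i, 1 if i < n_matching else 0) for i in range(n_rows)),
    )
    # An index the remote node can use only if it actually sees the where clause.
    conn.execute("CREATE INDEX t_criterion_idx ON t (criterion)")
    conn.commit()
    return conn


def fetch_without_pushdown(remote, x):
    """Pull every tuple from the remote node, then filter on the local side."""
    rows = remote.execute("SELECT id, criterion FROM t").fetchall()
    kept = [row for row in rows if row[1] == x]
    return kept, len(rows)  # (result set, tuples transferred)


def fetch_with_pushdown(remote, x):
    """Send the where clause to the remote node so only matching tuples come back."""
    rows = remote.execute(
        "SELECT id, criterion FROM t WHERE criterion = ?", (x,)
    ).fetchall()
    return rows, len(rows)  # (result set, tuples transferred)


if __name__ == "__main__":
    remote = make_remote_node()
    _, moved_all = fetch_without_pushdown(remote, 1)
    _, moved_few = fetch_with_pushdown(remote, 1)
    print("tuples transferred without pushdown:", moved_all)
    print("tuples transferred with pushdown:", moved_few)
```

Both functions return the same result set; the only thing that changes is how many tuples have to leave the remote node to get there.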
I agree that for doing free-form queries on a remote database,
PL/Proxy is not the right answer. (Although the recent patch
to support dynamic records with an AS clause at least makes them work.)
But I want to clarify its goal: it is not to run "pre-determined
queries." It is to run "pre-determined complex transactions."
And making those work in a "federated database" takes a huge amount
of complexity that PL/Proxy simply sidesteps, at the price of
requiring a function-based API. But as a function-based API has
other advantages even without PL/Proxy, it seems a fine tradeoff.
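
As a rough illustration of what a "pre-determined complex transaction" behind a function-based API looks like, here is a small Python sketch. Everything in it is an assumption made for the example: the Node class, the account table, and the "transfer" and "balance" functions are hypothetical, and an in-memory SQLite database stands in for a real node. The point is only that the caller names a function and passes parameters, while the multi-statement transaction runs entirely on the node that owns the data.

```python
import sqlite3


class Node:
    """Hypothetical stand-in for one database node that exposes a function-based API."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        self.conn.executemany(
            "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
        )
        self.conn.commit()
        # The pre-determined API: callers may only invoke these by name.
        self.api = {"transfer": self._transfer, "balance": self._balance}

    def call(self, func_name, *args):
        """Dispatch a remote-style call: a function name plus parameters, no free-form SQL."""
        return self.api[func_name](*args)

    def _balance(self, name):
        row = self.conn.execute(
            "SELECT balance FROM account WHERE name = ?", (name,)
        ).fetchone()
        return row[0]

    def _transfer(self, src, dst, amount):
        # The connection context manager commits on success and rolls back on an exception,
        # so both updates happen together or not at all.
        with self.conn:
            cur = self.conn.execute(
                "UPDATE account SET balance = balance - ? "
                "WHERE name = ? AND balance >= ?",
                (amount, src, amount),
            )
            if cur.rowcount != 1:
                raise ValueError("insufficient funds or unknown source account")
            cur = self.conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                raise ValueError("unknown destination account")
        return True


if __name__ == "__main__":
    node = Node()
    node.call("transfer", "alice", "bob", 30)
    print(node.call("balance", "alice"), node.call("balance", "bob"))
```

The caller never needs to know the table layout or coordinate a transaction across the wire, which is the complexity being sidestepped; the cost is that anything not covered by a function in the API cannot be asked at all.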