i have been thinking about this issue for quite a while ...
given your idea i am not sure how this can work at all.
assume this ends up in the same node,
now you split it into two …
1 and 2 will have exactly the same visibility to and transaction.
i am not sure how you can get this right without looking at the data.
alternative idea: what if the proxy would add / generate a filter by looking at the data?
a quick idea would be that once you split you add a simple directive such as "FILTER GENERATOR $1" or so to the PL/proxy code.
it would then behind the scene arrange the filter passed on.
what do you think?
On Sep 1, 2011, at 10:13 AM, Hannu Krosing wrote:
> Hallow hackers
> I have the following problem to solve and would like to get advice on
> the best way to do it.
> The problem:
> When growing a pl/proxy based database cluster, one of the main
> operations is splitting a partition. The standard flow is as follows:
> 1) make a copy of the partitions table(s) to another database
> 2) reconfigure pl/proxy to use 2 partitions instead of one
> The easy part is making a copy of all or half of the table to another
> database. The hard part is fast deletion (i mean milliseconds,
> comparable to TRUNCATE) the data that should not be in a partition (so
> that RUN ON ALL functions will continue to return right results).
> It would be relatively easy, if we still had the RULES for select
> available for plain tables, but even then the eventual cleanup would
> usually mean at least 3 passes of disk writes (set xmax, write deleted
> flag, vacuum and remove)
> What I would like to have is possibility for additional visibility
> checks, which would run some simple C function over tuple data (usually
> hash(fieldval) + and + or ) and return visibility (is in this partition)
> as a result. It would be best if this is run at so low level that also
> vacuum would use it and can clean up the foreign partition data in one
> pass, without doing the delete dance first.
> So finally the QUESTION :
> where in code would be the best place to check for this so that
> 1) both regular queries and VACUUM see it
> 2) the tuple data (and not only system fields or just xmin/xmax) would
> be available for the function to use
> Hannu Krosing
> PostgreSQL Unlimited Scalability and Performance Consultant
> 2ndQuadrant Nordic
> PG Admin Book: http://www.2ndQuadrant.com/books/
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
Cybertec Schönig & Schönig GmbH
A-2700 Wiener Neustadt, Austria
In response to
pgsql-hackers by date
|Next:||From: Magnus Hagander||Date: 2011-09-02 12:19:35|
|Subject: Cascaded standby message|
|Previous:||From: Magnus Hagander||Date: 2011-09-02 10:45:34|
|Subject: Re: PATCH: regular logging of checkpoint progress|