Re: dblink: add polymorphic functions.

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: dblink: add polymorphic functions.
Date: 2015-07-29 16:40:07
Message-ID: CADkLM=eiYFUR2P-V-5UN-=LipvshVY4CtGUM5y+_FeqHOO-u9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 29, 2015 at 12:14 PM, Joe Conway <mail(at)joeconway(dot)com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 07/29/2015 08:56 AM, Corey Huinker wrote:
> > On Wed, Jul 29, 2015 at 10:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us Not
> > sure why inserting a variable name is so much better than inserting
> > a type name?
>
> > In a polymorphic function, I don't know the return type. It's
> > whatever type was specified on the function call.
> >
> > Say I've written a function with a function like
> > outer_polymorphic_function(p_rowtype anyelement,p1 ,p2,p3, ...)
> > returns setof anyelement
>
> Remind me where you get p_rowtype at runtime again? At some point you
> are still having to calculate it, no?
>
>
Say I've got a table my_partitioned_table (key1 integer, key2 integer,
metric1 integer, metric2 integer);

And I've got many partitions on that table.

My code lets you do something like this:

select key1, key2, sum(metric1) as a_sum_of_sums, sum(metric2) as
another_sum_of_sums
from execute_buncha_queries(null::my_partitioned_table,
'connection_string_thats_just_a_loopback',
array['select key1, key2, sum(metric1),
sum(metric2) from my_partition_p1 group by 1,2',
'select key1, key2, sum(metric1),
sum(metric2) from my_partition_p2 group by 1,2',
...])
group by 1,2

All those queries happen to return a shape the same as
my_partitioned_table. The query takes the partially summed values, spread
out across a lot of processors, and does the lighter work of summing the
sums.

The function execute_buncha_queries fires off those string queries async,
enough to fill up X number of processors, fetches results as they happen,
and keeps feeding the processors queries until it runs out. But
execute_buncha_queries needs to send back results in the shape of whatever
value was passed in the first parameter.

I can't put a cast around execute_buncha_queries because the function won't
know how to cast the results coming back from dblink.

select * from
execute_lotta_queries(null::my_table_or_type,'connection_string_to_remote_db',
array['query 1','query 2','query 3'])

Now, it's up to the user to make sure that all the query strings return a
query of shape "my_table_or_type", but that's a runtime problem.
And obviously, there are a lot of connection strings, but that's too much
detail for the problem at hand.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2015-07-29 16:40:18 Re: more RLS oversights
Previous Message Heikki Linnakangas 2015-07-29 16:37:53 Re: Supporting TAP tests with MSVC and Windows