Re: pg_background (and more parallelism infrastructure patches)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-11 11:34:31
Message-ID: CAA4eK1J=X_Kuhm8XQD37+yQiqnYF04OsZy4KCRxWu4Wz1rYy5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 25, 2014 at 11:41 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Patch 4 adds infrastructure that allows one session to save all of its
> non-default GUC values and another session to reload those values.
> This was written by Amit Khandekar and Noah Misch. It allows
> pg_background to start up the background worker with the same GUC
> settings that the launching process is using. I intend this as a
> demonstration of how to synchronize any given piece of state between
> cooperating backends. For real parallelism, we'll need to synchronize
> snapshots, combo CIDs, transaction state, and so on, in addition to
> GUCs. But GUCs are ONE of the things that we'll need to synchronize
> in that context, and this patch shows the kind of API we're thinking
> about for these sorts of problems.

Don't we need some way to prohibit changing GUC by launching process,
once it has shared the existing GUC?

> Patch 6 is pg_background itself. I'm quite pleased with how easily
> this came together. The existing background worker, dsm, shm_toc, and
> shm_mq infrastructure handles most of the heavily lifting here -
> obviously with some exceptions addressed by the preceding patches.
> Again, this is the kind of set-up that I'm expecting will happen in a
> background worker used for actual parallelism - clearly, more state
> will need to be restored there than here, but nonetheless the general
> flow of the code here is about what I'm imagining, just with somewhat
> more different kinds of state. Most of the work of writing this patch
> was actually figuring out how to execute the query itself; what I
> ended up with is mostly copied form exec_simple_query, but with some
> difference here and there. I'm not sure if it would be
> possible/advisable to try to refactor to reduce duplication.

1. This patch generates warning on windows
1>pg_background.obj : error LNK2001: unresolved external symbol
StatementTimeout

You need to add PGDLLIMPORT for StatementTimeout

2.
CREATE FUNCTION pg_background_launch(sql pg_catalog.text,
queue_size pg_catalog.int4 DEFAULT 65536)

Here shouldn't queue_size be pg_catalog.int8 as I could see
some related functions in test_shm_mq uses int8?

CREATE FUNCTION test_shm_mq(queue_size pg_catalog.int8,
CREATE FUNCTION test_shm_mq_pipelined(queue_size pg_catalog.int8,

Anyway I think corresponding C function doesn't use matching function
to extract the function args.

pg_background_launch(PG_FUNCTION_ARGS)
{
text *sql = PG_GETARG_TEXT_PP(0);
int32 queue_size = PG_GETARG_INT64(1);

Here it should _INT32 variant to match with current sql definition,
otherwise it leads to below error.

postgres=# select pg_background_launch('vacuum verbose foo');
ERROR: queue size must be at least 64 bytes

3.
Comparing execute_sql_string() and exec_simple_query(), I could see below
main differences:
a. Allocate a new memory context different from message context
b. Start/commit control of transaction is outside execute_sql_string
c. enable/disable statement timeout is done from outside incase of
execute_sql_string()
d. execute_sql_string() prohibits Start/Commit/Abort transaction statements.
e. execute_sql_string() doesn't log statements
f. execute_sql_string() uses binary format for result whereas
exec_simple_query()
uses TEXT as defult format
g. processed stat related info from caller incase of execute_sql_string().

Can't we handle all these or other changes inside exec_simple_query()
based on some parameter or may be a use a hook similar to what we
do in ProcessUtility?

Basically it looks bit odd to have a duplicate (mostly) copy of
exec_simple_query().

4.
Won't it be better if pg_background_worker_main() can look more
like PostgresMain() (in terms of handling different kind of messages),
so that it can be extended in future to handle parallel worker.

5.
pg_background_result()
{
..
/* Read and processes messages from the shared memory queue. */
}

Shouldn't the processing of messages be a separate function as
we do for pqParseInput3().

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2014-09-11 11:51:54 Re: inherit support for foreign tables
Previous Message Etsuro Fujita 2014-09-11 11:30:29 Re: inherit support for foreign tables