Re: assessing parallel-safety

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: assessing parallel-safety
Date: 2015-02-12 16:44:41
Message-ID: CA+TgmoaMH0akw30V5WHidNTeLgd9OBrYv1bmwp9htXeYUr7uZA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 11, 2015 at 3:21 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think we may want a dedicated parallel-safe property for functions
> rather than piggybacking on provolatile ...

I went through the current contents of pg_proc and tried to assess how
much parallel-unsafe stuff we've got. I think there are actually
three categories of things: (1) functions that can be called in
parallel mode either in the worker or in the leader ("parallel safe"),
(2) functions that can be called in parallel mode in the leader, but
not in the worker ("parallel restricted"), and (3) functions that
cannot be called in parallel mode at all ("parallel unsafe"). On a
first read-through, the number of things that looked to be anything
other than parallel-safe was fairly small; many of these could be made
parallel-safe with more work, but it's unlikely to be worth the effort.
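
To make the three categories concrete, here is a minimal sketch in
Python, not anything from the tree. The names ParallelSafety and
may_call are made up for illustration. It reads "restricted" as
leader-only, which is what the per-function reasons below imply
(pg_backend_pid(), for example, returns a different value in a worker).

```python
from enum import Enum, auto


class ParallelSafety(Enum):
    """Hypothetical labels for the three categories described above."""
    SAFE = auto()        # callable in the leader or in a worker
    RESTRICTED = auto()  # callable in parallel mode, but only in the leader
    UNSAFE = auto()      # not callable in parallel mode at all


def may_call(level, in_parallel_mode, in_worker):
    """Return True if a function with this label may be called here.

    Assumption: outside parallel mode nothing is forbidden, so the
    label only matters once parallel mode has been entered.
    """
    if not in_parallel_mode:
        return True
    if level is ParallelSafety.SAFE:
        return True
    if level is ParallelSafety.RESTRICTED:
        return not in_worker
    return False  # UNSAFE while in parallel mode
```

So a restricted function is still fine in the leader while workers are
running; only an unsafe one rules out parallel mode altogether.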

current_query() - Restricted because debug_query_string is not copied.
lo_open(), lo_close(), loread(), lowrite(), and other large object
functions - Restricted because large object state is not shared.
age(xid) - Restricted because it uses a transaction-lifespan cache
which is not shared.
now() - Restricted because transaction start timestamp is not copied.
statement_timestamp() - Restricted because statement start timestamp
is not copied.
pg_conf_load_time() - Restricted because PgReloadTime is not copied.
nextval(), currval() - Restricted because sequence-related state is not shared.
setval() - Unsafe because no data can be written in parallel mode.
random(), setseed() - Restricted because random seed state is not
shared. (We could alternatively treat setseed() as unsafe and random()
as restricted only in sessions where setseed() has previously been
called, and otherwise safe.)
pg_stat_get_* - Restricted because there's no guarantee the value
would be the same in the parallel worker as in the leader.
pg_backend_pid() - Restricted because the worker has a different PID.
set_config() - Unsafe because GUC state must remain synchronized.
pg_my_temp_schema() - Restricted because temporary namespaces aren't
shared with parallel workers.
pg_export_snapshot() - Restricted because the worker will go away quickly.
pg_prepared_statement(), pg_cursor() - Restricted because the prepared
statements and cursors are not synchronized with the worker.
pg_listening_channels() - Restricted because listening channels are
not synchronized with the worker.
pg*advisory*lock*() - Restricted because advisory lock state is not
shared with workers - and even if it were, the semantics would be hard
to reason about.
txid_current() - Unsafe because it might attempt XID assignment.
pg_logical_slot*() - Unsafe because they do all kinds of crazy stuff.
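
As a rough illustration of how a list like this might be consumed,
here is a small Python sketch. It is only a sketch under assumptions:
the table holds a handful of the entries above, and the names
CLASSIFICATION and worst_level are hypothetical. Treating any function
missing from the table as safe is also an assumption, and it is only
reasonable for built-ins, not for user-written functions.

```python
from enum import IntEnum


class ParallelSafety(IntEnum):
    """Hypothetical labels, ordered from least to most restrictive."""
    SAFE = 0
    RESTRICTED = 1
    UNSAFE = 2


# A few entries copied from the list above; not the complete list.
CLASSIFICATION = {
    "current_query": ParallelSafety.RESTRICTED,
    "now": ParallelSafety.RESTRICTED,
    "nextval": ParallelSafety.RESTRICTED,
    "currval": ParallelSafety.RESTRICTED,
    "setval": ParallelSafety.UNSAFE,
    "random": ParallelSafety.RESTRICTED,
    "setseed": ParallelSafety.RESTRICTED,
    "pg_backend_pid": ParallelSafety.RESTRICTED,
    "set_config": ParallelSafety.UNSAFE,
    "txid_current": ParallelSafety.UNSAFE,
}


def worst_level(function_names):
    """Return the most restrictive label among the named functions.

    Assumption: names missing from CLASSIFICATION are treated as SAFE.
    """
    levels = (
        CLASSIFICATION.get(name, ParallelSafety.SAFE)
        for name in function_names
    )
    return max(levels, default=ParallelSafety.SAFE)
```

For example, worst_level(["now", "setval"]) comes out as UNSAFE,
because one unsafe call is enough to rule out parallel mode, while
worst_level(["random", "lower"]) is only RESTRICTED.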

That's not a lot, and very little of it is anything you'd care about
parallelizing anyway. I expect that the incidence of user-written
parallel-unsafe functions will be considerably higher. I'm not sure
if this impacts the decision about how to design the facility for
assessing parallel-safety or not, but I thought it was worth sharing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
