Re: assessing parallel-safety

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: assessing parallel-safety
Date: 2015-03-12 15:21:37
Message-ID: CA+TgmoaDFwAsuWiQZiXoTCFk3cStvROCYx9UwoBQVBGuPYhmYA@mail.gmail.com
Lists: pgsql-hackers

[ deferring responses to some points until a later time ]

On Thu, Feb 19, 2015 at 1:19 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> This seems backwards to me. If some hypothetical selectivity
>> estimator were PROPARALLEL_UNSAFE, then any operator that uses that
>> function would also need to be PROPARALLEL_UNSAFE.
>
> It's fair to have a PROPARALLEL_UNSAFE estimator for a PROPARALLEL_SAFE
> function, because the planning of a parallel query is often not itself done in
> parallel mode. In that case, "SELECT * FROM tablename WHERE colname OP 0"
> might use a parallel seqscan but fail completely if called from inside a
> function running in parallel mode. That is to say, an affected query can
> itself use parallelism, but placing the query in a function makes the function
> PROPARALLEL_UNSAFE. Surprising, but not wrong.
>
> Rereading my previous message, I failed to make the bottom line clear: I
> recommend marking eqsel etc. PROPARALLEL_UNSAFE but *not* checking an
> estimator's proparallel before calling it in the planner.

But what do these functions do that is actually unsafe?

>> > - Assuming you don't want to propagate XactLastRecEnd from the slave back to
>> > the master, restrict XLogInsert() during parallel mode. Restrict functions
>> > that call it, including pg_create_restore_point, pg_switch_xlog and
>> > pg_stop_backup.
>>
>> Hmm. Altogether prohibiting XLogInsert() in parallel mode seems
>> unwise, because it would foreclose heap_page_prune_opt() in workers.
>> I realize there's a separate conversation about whether pruning during
>> SELECT queries is good policy, but in the interest of separating
>> mechanism from policy, and in the sure knowledge that allowing at
>> least some writes in parallel mode is certainly going to be something
>> people will want, it seems better to look into propagating
>> XactLastRecEnd.
>
> Good points; that works for me.

The key design decision here seems to be this: How eagerly do we need
to synchronize XactLastRecEnd? What exactly goes wrong if it isn't
synchronized? For example, if the value were absolutely critical in
all circumstances, one could imagine storing a shared XactLastRecEnd
in shared memory. This doesn't appear to be the case: the main
requirement is that we definitely need an up-to-date value at commit
time. Also, at abort time, we don't really need the value for anything
critical, but it's worth kicking the WAL writer so that any
accumulated WAL gets flushed.
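
To make the "synchronize lazily" idea concrete, here is a minimal,
standalone sketch. It is not the attached patch and not PostgreSQL
code: ProcState, merge_worker_rec_end and has_wal_to_flush are
hypothetical names, and XLogRecPtr is just a 64-bit stand-in. It
assumes each worker reports the end of its last WAL record once, when
it finishes, and that 0 means "wrote no WAL". The master keeps the
maximum, so the value is current by the time it matters at commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a WAL position; 0 is assumed to mean "no WAL written". */
typedef uint64_t XLogRecPtr;

/* Hypothetical per-process state: end of the last WAL record written here. */
typedef struct ProcState
{
    XLogRecPtr  xact_last_rec_end;
} ProcState;

/*
 * Fold a finished worker's value into the master's copy.  WAL positions
 * only grow, so keeping the maximum yields the end of the latest record
 * written by any participant.
 */
static void
merge_worker_rec_end(ProcState *master, XLogRecPtr worker_end)
{
    assert(master != NULL);
    if (worker_end > master->xact_last_rec_end)
        master->xact_last_rec_end = worker_end;
}

/*
 * At commit or abort, is there any WAL from this transaction worth
 * flushing (commit) or nudging the WAL writer about (abort)?
 */
static bool
has_wal_to_flush(const ProcState *master)
{
    assert(master != NULL);
    return master->xact_last_rec_end != 0;
}
```

With this shape nothing needs to live in shared memory while the
workers run; the only synchronization point is worker exit, which is
cheap if the value is needed only at end of transaction.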

Here's an incremental patch - which I will incorporate into the
parallel mode patch if it seems about right to you - that tries to
tackle all this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
sync-xactlastrecend.patch text/x-patch 5.5 KB
