Re: Parallel Sort

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Sort
Date: 2013-05-14 04:51:42
Message-ID: CAB7nPqRfK2e_iM2L-ccMGSUGajDZTwm2Xzro3fLn9CE0LhgfCA@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 13, 2013 at 11:28 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:

> * Identifying Parallel-Compatible Functions
>
> Not all functions can reasonably run on a worker backend. We should not
> presume that a VOLATILE function can tolerate the unstable execution order
> imposed by parallelism, though a function like clock_timestamp() is perfectly
> reasonable to run that way. STABLE does not have that problem, but neither
> does it constitute a promise that the function implementation is compatible
> with parallel execution. Consider xid_age(), which would need code changes to
> operate correctly in parallel. IMMUTABLE almost guarantees enough; there may
> come a day when all IMMUTABLE functions can be presumed parallel-safe. For
> now, an IMMUTABLE function could cause trouble by starting a (read-only)
> subtransaction. The bottom line is that parallel-compatibility needs to be
> separate from volatility classes for the time being.
>
I am not sure this problem is limited to functions; it seems to apply to all
the expressions and clauses of a query that could be shipped to and evaluated
on the worker backends when fetching tuples, in order to accelerate a parallel
sort. Imagine, for example, a LIMIT clause that the worker backends could use
to limit the number of tuples to sort for the final result.
In some ways, Postgres-XC has faced (and is still facing) similar challenges,
and they have been partially solved.
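
To make the LIMIT case concrete, here is a minimal sketch in Python. It is
not Postgres or Postgres-XC code: worker_top_n and leader_merge are
hypothetical names, and plain integers stand in for tuples. Each worker keeps
only its own first N rows in sort order, and the leader merges the partial
results and applies the limit once more.

```python
import heapq
from itertools import islice


def worker_top_n(rows, limit):
    """Hypothetical worker step: return the `limit` smallest rows, sorted."""
    return heapq.nsmallest(limit, rows)


def leader_merge(partials, limit):
    """Hypothetical leader step: merge sorted partial results, re-apply LIMIT."""
    return list(islice(heapq.merge(*partials), limit))


# Three workers, each holding one chunk of the table (integers as stand-ins).
chunks = [[9, 1, 7], [4, 8, 2], [6, 3, 5]]
partials = [worker_top_n(chunk, 3) for chunk in chunks]
result = leader_merge(partials, 3)
print(result)  # [1, 2, 3], same as sorting all rows and taking the first 3
```

The leader only ever sees at most N rows per worker, which is where the
saving would come from.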

> I'm not sure what the specific answer here should look like. Simply having a
> CREATE FUNCTION ... PARALLEL_IS_FINE flag is not entirely satisfying, because
> the rules are liable to loosen over time.
>
Having a flag would be enough to control parallelism, but could we not also
determine whether the execution of a function can be shipped safely to a
worker based on its volatility alone? Immutable functions are presumably safe,
as they do not modify the database state and always give the same result;
volatile and stable functions are definitely not safe. For such reasons, it
would be better to keep things simple and rely on simple rules to determine
whether a given expression can be executed safely on a worker backend.
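
As a rough sketch of such a simple rule (Python, illustration only): the
one-letter codes are the ones stored in pg_proc.provolatile, while
can_ship_to_worker and the idea of passing in the flags of every function an
expression calls are assumptions made up for this example, not an existing
API.

```python
# Volatility codes as stored in pg_proc.provolatile.
PROVOLATILE_IMMUTABLE = "i"
PROVOLATILE_STABLE = "s"
PROVOLATILE_VOLATILE = "v"


def can_ship_to_worker(provolatile_flags):
    """Hypothetical rule: an expression may run on a worker only if every
    function it calls is immutable. An expression that calls no functions
    (empty list) is treated as shippable."""
    return all(flag == PROVOLATILE_IMMUTABLE for flag in provolatile_flags)


# An expression calling only immutable functions may be shipped.
print(can_ship_to_worker(["i", "i"]))
# One stable or volatile function is enough to keep it on the leader.
print(can_ship_to_worker(["i", "s"]))
print(can_ship_to_worker(["v"]))
```
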
--
Michael
