Re: Append with naive multiplexing of FDWs

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: movead(dot)li(at)highgo(dot)ca
Cc: bruce(at)momjian(dot)us, ahsan(dot)hadi(at)highgo(dot)ca, robertmhaas(at)gmail(dot)com, thomas(dot)munro(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, sfrost(at)snowman(dot)net
Subject: Re: Append with naive multiplexing of FDWs
Date: 2020-01-29 08:39:35
Message-ID: 20200129.173935.1752784195747118665.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Thanks!

At Wed, 29 Jan 2020 14:41:07 +0800, Movead Li <movead(dot)li(at)highgo(dot)ca> wrote in
> >"Parallel scan" at the moment means multiple workers fetch unique
> >blocks from *one* table in an arbitrated manner. In this sense
> >"parallel FDW scan" means multiple local workers fetch unique bundles
> >of tuples from *one* foreign table, which means it is running on a
> >single session. That doesn't offer an advantage.
>
> It may not be "parallel FDW scan"; it can be "parallel shards scan", where
> the local workers each pick a foreign partition to scan. I once drew a
> picture of that; you can see it at the link below.
>
> https://www.highgo.ca/2019/08/22/parallel-foreign-scan-of-postgresql/
>
> I think the "parallel shards scan" makes sense in this way.

What the patch does is "asynchronous append on async-capable
postgres_fdw scans". It could be called "parallel shards scan" in the
sense that it is intended to be used with sharding.

> >If parallel query processing worked in worker-per-table mode,
> >especially on partitioned tables, maybe the current FDW would work
> >without much modification. But I believe asynchronous append on
> >foreign tables in a single process is far more resource-efficient and
> >moderately faster than parallel append.
>
> According to the test results, the current patch cannot gain performance
> when it returns a huge number of tuples. The "parallel shards scan" method
> works well in that case, because 'parallel' can make full use of the CPUs
> while 'asynchronous' can't.

Did you look at my benchmarking results upthread? The patch gives a
significant gain even when gathering a large number of tuples from
multiple servers, or even from a single server. That is because of its
asynchronous nature.
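
As a rough illustration of what "asynchronous" buys in a single
process, here is a toy Python sketch. It is not the patch's code:
fetch_shard, async_append and run_demo are hypothetical names, and
asyncio.sleep merely stands in for the time spent waiting on a remote
server.

```python
import asyncio
import time


async def fetch_shard(shard_id, n_tuples, latency):
    """Simulate one foreign scan: wait on the remote side, then return rows."""
    # Hypothetical stand-in for network round trip plus remote execution.
    await asyncio.sleep(latency)
    return [(shard_id, i) for i in range(n_tuples)]


async def async_append(shards):
    """Send every shard request up front, then consume results as they arrive."""
    tasks = [asyncio.create_task(fetch_shard(*s)) for s in shards]
    rows = []
    for finished in asyncio.as_completed(tasks):
        rows.extend(await finished)
    return rows


def run_demo():
    # Four simulated shards, 1000 tuples each, 0.1 s of remote latency each.
    shards = [(sid, 1000, 0.1) for sid in range(4)]
    start = time.monotonic()
    rows = asyncio.run(async_append(shards))
    elapsed = time.monotonic() - start
    return rows, elapsed


if __name__ == "__main__":
    rows, elapsed = run_demo()
    print(f"{len(rows)} tuples in {elapsed:.2f} s")
```

Because the four waits overlap, the elapsed time should be close to
the latency of one shard rather than the sum of all four, and only one
local process is used. The sketch models only the waiting; it says
nothing about CPU-bound work on the local side.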

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
