Re: Global temporary tables

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global temporary tables
Date: 2019-08-19 11:16:56
Message-ID: d43b82e2-41da-eaae-065f-6aaf1c7728a8@postgrespro.ru
Lists: pgsql-hackers

On 19.08.2019 11:51, Konstantin Knizhnik wrote:
>
>
> On 18.08.2019 11:28, Pavel Stehule wrote:
>>
>>
>> ne 18. 8. 2019 v 9:02 odesílatel Konstantin Knizhnik
>> <k(dot)knizhnik(at)postgrespro(dot)ru <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>> napsal:
>>
>>
>>
>> On 16.08.2019 20:17, Pavel Stehule wrote:
>>>
>>>
>>> pá 16. 8. 2019 v 16:12 odesílatel Konstantin Knizhnik
>>> <k(dot)knizhnik(at)postgrespro(dot)ru <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>>
>>> napsal:
>>>
>>> I did more investigation of the performance of global temp
>>> tables with shared buffers vs. vanilla (local) temp tables.
>>>
>>> 1. Combination of persistent and temporary tables in the
>>> same query.
>>>
>>> Preparation:
>>> create table big(pk bigint primary key, val bigint);
>>> insert into big values (generate_series(1,100000000), generate_series(1,100000000));
>>> create temp table lt(key bigint, count bigint);
>>> create global temp table gt(key bigint, count bigint);
>>>
>>> The table is about 6GB; I ran this test on a desktop with
>>> 16GB of RAM and Postgres configured with 1GB of shared buffers.
>>> I ran two queries:
>>>
>>> insert into T (select count(*), pk/P as key from big group by key);
>>> select sum(count) from T;
>>>
>>> where P is 100, 10, or 1 and T is the name of the temp table (lt or gt).
>>> The table below contains the times of both queries in msec:
>>>
>>> Percent of selected data         1%            10%            100%
>>> Local temp table           44610 / 90   47920 / 891   63414 / 21612
>>> Global temp table          44669 / 35   47939 / 298   59159 / 26015
>>>
>>> (each cell: insert time / select time)
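For clarity, here is how the parametrized queries expand for the 1% case (P = 100, T = lt); only the divisor and the table name change across the other cells:

```sql
-- 1% case: P = 100 against the local temp table lt
insert into lt (select count(*), pk/100 as key from big group by key);
select sum(count) from lt;
```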
>>>
>>>
>>> As you can see, the insertion times are almost the same,
>>> while traversal of the global temp table is about twice as fast
>>> when it fits in RAM together with the persistent table, and
>>> slightly worse when it doesn't fit.
>>>
>>>
>>>
>>> 2. Temporary table only access.
>>> The same system, but Postgres is configured with
>>> shared_buffers=10GB, max_parallel_workers = 4,
>>> max_parallel_workers_per_gather = 4
>>>
>>> Local temp tables:
>>> create temp table local_temp(x1 bigint, x2 bigint, x3 bigint, x4 bigint, x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint);
>>> insert into local_temp values (generate_series(1,100000000),0,0,0,0,0,0,0,0);
>>> select sum(x1) from local_temp;
>>>
>>> Global temp tables:
>>> create global temporary table global_temp(x1 bigint, x2 bigint, x3 bigint, x4 bigint, x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint);
>>> insert into global_temp values (generate_series(1,100000000),0,0,0,0,0,0,0,0);
>>> select sum(x1) from global_temp;
>>>
>>> Results (msec):
>>>
>>>                      Insert   Select
>>> Local temp table      37489    48322
>>> Global temp table     44358     3003
>>>
>>>
>>> So insertion into the local temp table is slightly faster,
>>> but select is 16 times slower!
>>>
>>> Conclusion:
>>> Assuming the temp table fits in memory, global temp tables
>>> with shared buffers provide better performance than local
>>> temp tables.
>>> I didn't consider global temp tables with local buffers here,
>>> because their results should be similar to those of local
>>> temp tables.
>>>
>>>
>>> There is probably no reason why shared buffers should be
>>> slower than local buffers when the system is under low load.
>>>
>>> Access to shared memory is protected by spinlocks (which are
>>> cheap for a few processes), so tests with one or a few
>>> processes are not very telling (they show only one side of the picture).
>>>
>>> Another topic is performance on Microsoft systems: there are
>>> stories about shared memory not performing well there.
>>>
>>> Regards
>>>
>>> Pavel
>>>
>> One more test, used to simulate access to temp tables
>> under high load:
>> I am using an "upsert" into a temp table from multiple connections.
>>
>> create global temp table gtemp (x integer primary key, y bigint);
>>
>> upsert.sql:
>> insert into gtemp values (random() * 1000000, 0) on conflict(x) do update set y=gtemp.y+1;
>>
>> pgbench -c 10 -M prepared -T 100 -P 1 -n -f upsert.sql postgres
>>
>>
>> I failed to find a standard way in pgbench to perform
>> per-session initialization (to create the local temp table),
>> so I just inserted this code into pgbench:
>>
>> diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
>> index 570cf33..af6a431 100644
>> --- a/src/bin/pgbench/pgbench.c
>> +++ b/src/bin/pgbench/pgbench.c
>> @@ -5994,6 +5994,7 @@ threadRun(void *arg)
>>                 {
>>                         if ((state[i].con = doConnect()) == NULL)
>>                                 goto done;
>> +                       executeStatement(state[i].con, "create temp table ltemp(x integer primary key, y bigint)");
>>                 }
>>         }
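For completeness, the local-table counterpart of upsert.sql (my reconstruction; it targets the ltemp table created by the patch above) would be:

```sql
-- same upsert, but against the per-session local temp table
insert into ltemp values (random() * 1000000, 0) on conflict(x)
do update set y=ltemp.y+1;
```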
>>
>>
>> Results are the following:
>> Global temp table: 117526 TPS
>> Local temp table:   107802 TPS
>>
>>
>> So even for this workload, global temp tables with shared buffers
>> are a little bit faster.
>> I will be pleased if you can propose some other testing scenario.
>>
>>
>> Please try to increase the number of connections.
>
> With 20 connections and 4 pgbench threads, the results are similar:
> 119k TPS for global temp tables and 115k TPS for local temp tables.
>
> I have tried yet another scenario: read-only access to temp tables:
>
> \set id random(1,10000000)
> select sum(y) from ltemp where x=:id;
>
> Tables are created and initialized in pgbench session startup:
>
> knizhnik(at)knizhnik:~/postgresql$ git diff
> diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
> index 570cf33..95295b0 100644
> --- a/src/bin/pgbench/pgbench.c
> +++ b/src/bin/pgbench/pgbench.c
> @@ -5994,6 +5994,8 @@ threadRun(void *arg)
>                 {
>                         if ((state[i].con = doConnect()) == NULL)
>                                 goto done;
> +                       executeStatement(state[i].con, "create temp table ltemp(x integer primary key, y bigint)");
> +                       executeStatement(state[i].con, "insert into ltemp values (generate_series(1,1000000), generate_series(1,1000000))");
>                 }
>         }
>
>
> Results for 10 connections with 10 million inserted records per table
> and 100 connections with 1 million inserted records per table:
>
> #connections                        10      100
> local temp                         68k      90k
> global temp, shared_buffers=1G     63k      61k
> global temp, shared_buffers=10G   150k     150k
>
>
>
> So temporary tables with local buffers are slightly faster when the data
> doesn't fit in shared buffers, but significantly slower when it does.
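To verify the "fits in shared buffers" assumption directly, one option (my suggestion; it requires the contrib pg_buffercache extension) is to count how many shared buffers a relation occupies:

```sql
-- approximate cached size of the global temp table (8kB pages)
create extension if not exists pg_buffercache;
select pg_size_pretty(count(*) * 8192) as cached
from pg_buffercache
where relfilenode = pg_relation_filenode('gtemp');
```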
>
>

All previously reported results were produced on my desktop.
I also ran this read-only test on a large IBM server (POWER9, 2 NUMA nodes,
176 CPUs, 1TB RAM).

Here the difference between local and global tables is not so large:

Local temp:   739k TPS
Global temp:  924k TPS

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
