Re: Global temporary tables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global temporary tables
Date: 2019-08-18 08:28:26
Message-ID: CAFj8pRBEVp-wp-EAbuHaTuefc1WPg897iBW_1r5-wZ2NKKkNOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

ne 18. 8. 2019 v 9:02 odesílatel Konstantin Knizhnik <
k(dot)knizhnik(at)postgrespro(dot)ru> napsal:

>
>
> On 16.08.2019 20:17, Pavel Stehule wrote:
>
>
>
> pá 16. 8. 2019 v 16:12 odesílatel Konstantin Knizhnik <
> k(dot)knizhnik(at)postgrespro(dot)ru> napsal:
>
>> I did more investigations of performance of global temp tables with
>> shared buffers vs. vanilla (local) temp tables.
>>
>> 1. Combination of persistent and temporary tables in the same query.
>>
>> Preparation:
>> create table big(pk bigint primary key, val bigint);
>> insert into big values
>> (generate_series(1,100000000),generate_series(1,100000000));
>> create temp table lt(key bigint, count bigint);
>> create global temp table gt(key bigint, count bigint);
>>
>> Size of table is about 6Gb, I run this test on desktop with 16GB of RAM
>> and postgres with 1Gb shared buffers.
>> I run two queries:
>>
>> insert into T (select count(*),pk/P as key from big group by key);
>> select sum(count) from T;
>>
>> where P is (100,10,1) and T is name of temp table (lt or gt).
>> The table below contains times of both queries in msec:
>>
>> Percent of selected data
>> 1%
>> 10%
>> 100%
>> Local temp table
>> 44610
>> 90
>> 47920
>> 891
>> 63414
>> 21612
>> Global temp table
>> 44669
>> 35
>> 47939
>> 298
>> 59159
>> 26015
>>
>> As you can see, time of insertion in temporary table is almost the same
>> and time of traversal of temporary table is about twice smaller for
>> global temp table
>> when it fits in RAM together with persistent table and slightly worser
>> when it doesn't fit.
>>
>>
>>
>> 2. Temporary table only access.
>> The same system, but Postgres is configured with shared_buffers=10GB,
>> max_parallel_workers = 4, max_parallel_workers_per_gather = 4
>>
>> Local temp tables:
>> create temp table local_temp(x1 bigint, x2 bigint, x3 bigint, x4 bigint,
>> x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint);
>> insert into local_temp values
>> (generate_series(1,100000000),0,0,0,0,0,0,0,0);
>> select sum(x1) from local_temp;
>>
>> Global temp tables:
>> create global temporary table global_temp(x1 bigint, x2 bigint, x3
>> bigint, x4 bigint, x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint);
>> insert into global_temp values
>> (generate_series(1,100000000),0,0,0,0,0,0,0,0);
>> select sum(x1) from global_temp;
>>
>> Results (msec):
>>
>> Insert
>> Select
>> Local temp table 37489
>> 48322
>> Global temp table 44358
>> 3003
>>
>> So insertion in local temp table is performed slightly faster but select
>> is 16 times slower!
>>
>> Conclusion:
>> In the assumption then temp table fits in memory, global temp tables with
>> shared buffers provides better performance than local temp table.
>> I didn't consider here global temp tables with local buffers because for
>> them results should be similar with local temp tables.
>>
>
> Probably there is not a reason why shared buffers should be slower than
> local buffers when system is under low load.
>
> access to shared memory is protected by spin locks (are cheap for few
> processes), so tests in one or few process are not too important (or it is
> just one side of space)
>
> another topic can be performance on MS Sys - there are stories about not
> perfect performance of shared memory there.
>
> Regards
>
> Pavel
>
> One more test which is used to simulate access to temp tables under high
> load.
> I am using "upsert" into temp table in multiple connections.
>
> create global temp table gtemp (x integer primary key, y bigint);
>
> upsert.sql:
> insert into gtemp values (random() * 1000000, 0) on conflict(x) do update
> set y=gtemp.y+1;
>
> pgbench -c 10 -M prepared -T 100 -P 1 -n -f upsert.sql postgres
>
>
> I failed to find some standard way in pgbech to perform per-session
> initialization to create local temp table,
> so I just insert this code in pgbench code:
>
> diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
> index 570cf33..af6a431 100644
> --- a/src/bin/pgbench/pgbench.c
> +++ b/src/bin/pgbench/pgbench.c
> @@ -5994,6 +5994,7 @@ threadRun(void *arg)
> {
> if ((state[i].con = doConnect()) == NULL)
> goto done;
> + executeStatement(state[i].con, "create temp table
> ltemp(x integer primary key, y bigint)");
> }
> }
>
>
> Results are the following:
> Global temp table: 117526 TPS
> Local temp table: 107802 TPS
>
>
> So even for this workload global temp table with shared buffers are a
> little bit faster.
> I will be pleased if you can propose some other testing scenario.
>

please, try to increase number of connections.

Regards

Pavel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Juan José Santamaría Flecha 2019-08-18 08:42:52 Allow to_date() and to_timestamp() to accept localized names
Previous Message Daniel Migowski 2019-08-18 07:43:09 Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements