Re: Improving connection scalability: GetSnapshotData()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving connection scalability: GetSnapshotData()
Date: 2020-04-06 20:53:29
Message-ID: 20200406205329.eiklmey7yqa4yd6y@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-04-06 06:39:59 -0700, Andres Freund wrote:
> These benchmarks are on my workstation. The larger VM I used in the last
> round wasn't currently available.

One way to reproduce the problem at smaller connection counts / on smaller
machines is to take more snapshots. That doesn't fully reproduce the problem,
because resetting ->xmin without xact overhead is part of it, but it's
helpful.

I use a volatile function that loops over a trivial statement. There's
probably an easier / more extreme way to reproduce the problem, but this is
good enough.

-- setup
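CREATE OR REPLACE FUNCTION snapme(p_ret int, p_loop int)
RETURNS int VOLATILE LANGUAGE plpgsql AS $$
BEGIN
    -- VOLATILE functions obtain a fresh snapshot for each statement they
    -- execute, so every iteration takes another snapshot
    FOR x IN 1..p_loop LOOP
        EXECUTE 'SELECT 1';
    END LOOP;
    RETURN p_ret;
END;
$$;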
-- statement executed in parallel
SELECT snapme(17, 10000);

before (all above 1.5%):
+ 37.82% postgres postgres [.] GetSnapshotData
+ 6.26% postgres postgres [.] AllocSetAlloc
+ 3.77% postgres postgres [.] base_yyparse
+ 3.04% postgres postgres [.] core_yylex
+ 1.94% postgres postgres [.] grouping_planner
+ 1.83% postgres libc-2.30.so [.] __strncpy_avx2
+ 1.80% postgres postgres [.] palloc
+ 1.73% postgres libc-2.30.so [.] __memset_avx2_unaligned_erms

after:
+ 5.75% postgres postgres [.] base_yyparse
+ 4.37% postgres postgres [.] palloc
+ 4.29% postgres postgres [.] AllocSetAlloc
+ 3.75% postgres postgres [.] expression_tree_walker.part.0
+ 3.14% postgres postgres [.] core_yylex
+ 2.51% postgres postgres [.] subquery_planner
+ 2.48% postgres postgres [.] CheckExprStillValid
+ 2.45% postgres postgres [.] check_stack_depth
+ 2.42% postgres plpgsql.so [.] exec_stmt
+ 1.92% postgres libc-2.30.so [.] __memset_avx2_unaligned_erms
+ 1.91% postgres postgres [.] query_tree_walker
+ 1.88% postgres libc-2.30.so [.] __GI_____strtoll_l_internal
+ 1.86% postgres postgres [.] _SPI_execute_plan
+ 1.85% postgres postgres [.] assign_query_collations_walker
+ 1.84% postgres postgres [.] remove_useless_results_recurse
+ 1.83% postgres postgres [.] grouping_planner
+ 1.50% postgres postgres [.] set_plan_refs
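
For comparing listings like the two above symbol by symbol, a small helper
can be sketched as follows. This is only an illustration: the names
parse_profile and diff_profiles are made up here, and the regular expression
assumes the "overhead command object [.] symbol" layout shown in this mail.
Note that the percentages are shares of each profile's own total, and a
symbol missing from one listing just means it fell below the 1.5% cutoff,
not that it cost nothing.

```python
import re

# Matches lines such as "+ 37.82% postgres postgres [.] GetSnapshotData".
PERF_LINE = re.compile(
    r"^\+?\s*(\d+(?:\.\d+)?)%\s+\S+\s+\S+\s+\[.\]\s+(\S+)$"
)


def parse_profile(text):
    """Map symbol name -> overhead percentage for each recognizable line."""
    profile = {}
    for line in text.splitlines():
        match = PERF_LINE.match(line.strip())
        if match:
            profile[match.group(2)] = float(match.group(1))
    return profile


def diff_profiles(before, after):
    """Return (symbol, before_pct, after_pct) rows, largest drop first.

    A symbol absent from one listing is reported as None there and is
    treated as 0.0 only for the purpose of sorting.
    """
    rows = [(sym, before.get(sym), after.get(sym))
            for sym in set(before) | set(after)]
    return sorted(rows, key=lambda r: ((r[2] or 0.0) - (r[1] or 0.0), r[0]))
```

Feeding the "before" and "after" listings to parse_profile and passing the
results to diff_profiles puts GetSnapshotData first, since it is the only
symbol that drops out of the listing entirely.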

If I change the workload to be
BEGIN;
SELECT txid_current();
SELECT snapme(17, 1000);
COMMIT;

the difference shrinks (because GetSnapshotData() only needs to look at
procs with xids, and here xids stay assigned for much longer), but it is
still significant:

before (all above 1.5%):
+ 35.89% postgres postgres [.] GetSnapshotData
+ 7.94% postgres postgres [.] AllocSetAlloc
+ 4.42% postgres postgres [.] base_yyparse
+ 3.62% postgres libc-2.30.so [.] __memset_avx2_unaligned_erms
+ 2.87% postgres postgres [.] LWLockAcquire
+ 2.76% postgres postgres [.] core_yylex
+ 2.30% postgres postgres [.] expression_tree_walker.part.0
+ 1.81% postgres postgres [.] MemoryContextAllocZeroAligned
+ 1.80% postgres postgres [.] transformStmt
+ 1.66% postgres postgres [.] grouping_planner
+ 1.64% postgres postgres [.] subquery_planner

after:
+ 24.59% postgres postgres [.] GetSnapshotData
+ 4.89% postgres postgres [.] base_yyparse
+ 4.59% postgres postgres [.] AllocSetAlloc
+ 3.00% postgres postgres [.] LWLockAcquire
+ 2.76% postgres postgres [.] palloc
+ 2.27% postgres postgres [.] MemoryContextAllocZeroAligned
+ 2.26% postgres postgres [.] check_stack_depth
+ 1.77% postgres postgres [.] core_yylex
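
The point about only needing to look at procs with xids can be illustrated
with a toy model. This is emphatically not the PostgreSQL data structure or
the patch itself: INVALID_XID, scan_for_snapshot and the flat list of
per-backend xids are all assumptions made up for the sketch. It only shows
that the per-snapshot work grows with the number of backends that currently
hold an xid.

```python
INVALID_XID = 0  # toy stand-in for "this backend has no xid assigned"


def scan_for_snapshot(proc_xids):
    """Return (running_xids, entries_with_xid) for one toy snapshot.

    Backends without an xid are skipped immediately; only backends with
    an assigned xid contribute an entry to the snapshot.
    """
    running = [xid for xid in proc_xids if xid != INVALID_XID]
    return running, len(running)


# First workload: 100 backends, none of them holds an xid.
no_xids = [INVALID_XID] * 100

# Second workload: txid_current() assigned an xid in every backend, and it
# stays assigned until COMMIT.
all_xids = list(range(1000, 1100))

_, work_first = scan_for_snapshot(no_xids)
_, work_second = scan_for_snapshot(all_xids)
```

In this model the first workload has no entries to collect, while the
second has one per backend, which matches the direction of the profiles
above.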

Greetings,

Andres Freund
