Re: LSM tree for Postgres

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: LSM tree for Postgres
Date: 2020-08-05 07:08:45
Message-ID: d5619cec-c591-536a-a941-aeb4fcd33206@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 04.08.2020 20:44, Tomas Vondra wrote:
>
> IMO the 6x difference is rather misleading, as it very much depends on
> the duration of the benchmark and how much data it ends up with. I think
> it's better to test 'stable states' i.e. with small data set that does
> not exceed RAM during the whole test, and large ones that already starts
> larger than RAM. Not sure if it makes sense to make a difference between
> cases that fit into shared buffers and those that exceed shared buffers
> but still fit into RAM.

I have changed benchmark scenario.
Now I inserted 200 million records with sequential key: it is fast
enough and makes index size about 19Gb.
Then I perform 1 million random inserts.

-- init schema
create table t(k bigint, v1 bigint, v2 bigint, v3 bigint, v4 bigint, v5
bigint, v6 bigint, v7 bigint, v8 bigint);
create index lsm_index on t using lsm3(k) include (v1,v2,v3,v4,v5,v6,v7,v8);
create table t2(k bigint, v1 bigint, v2 bigint, v3 bigint, v4 bigint, v5
bigint, v6 bigint, v7 bigint, v8 bigint);
create index on t2(k) include (v1,v2,v3,v4,v5,v6,v7,v8);

-- fill with sequential data
insert into t values (generate_series(1,200000000),0,0,0,0,0,0,0,0);
Time: 520655,635 ms (08:40,656)

insert into t2 values (generate_series(1,200000000),0,0,0,0,0,0,0,0);
Time: 372245,093 ms (06:12,245)

-- random inserts
insert into t (v1,k,v2,v3,v4,v5,v6,v7,v8) values
(generate_series(1,1000000),(random()*1000000000)::bigint,0,0,0,0,0,0,0);
Time: 3781,614 ms (00:03,782)

insert into t2 (v1,k,v2,v3,v4,v5,v6,v7,v8) values
(generate_series(1,1000000),(random()*1000000000)::bigint,0,0,0,0,0,0,0);
Time: 39034,574 ms (00:39,035)

The I perform random selects

select.sql:
\set k random(1, 1000000000)
select * from t where k=:k;

select2.sql:
\set k random(1, 1000000000)
select * from t2 where k=:k;

pgbench -n -T 100 -P 10 -M prepared -f select.sql postgres
tps = 11372.821006 (including connections establishing)

pgbench -n -T 100 -P 10 -M prepared -f select2.sql postgres
tps = 10392.729026 (including connections establishing)

So as you can see - insertion speed of Lsm3 is ten times higher and
select speed is the same as of nbtree.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2020-08-05 07:29:50 Re: Reg. Postgres 13
Previous Message Asim Praveen 2020-08-05 07:08:41 Re: [PATCH] - Provide robust alternatives for replace_string