Re: [WIP] Zipfian distribution in pgbench

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alik Khilazhev <a(dot)khilazhev(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] Zipfian distribution in pgbench
Date: 2017-07-07 18:53:13
Message-ID: CAH2-WznTTGXCKPiXskTGEGpxnBkFBnaDzey=v5SPdDs0nRHuQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 7, 2017 at 12:45 AM, Alik Khilazhev
<a(dot)khilazhev(at)postgrespro(dot)ru> wrote:
> On scale = 10(1 million rows) it gives following results on machine with 144 cores(with synchronous_commit=off):
> nclients tps
> 1 8842.401870
> 2 18358.140869
> 4 45999.378785
> 8 88713.743199
> 16 170166.998212
> 32 290069.221493
> 64 178128.030553
> 128 88712.825602
> 256 38364.937573
> 512 13512.765878
> 1000 6188.136736

Is it possible for you to instrument the number of B-Tree page
accesses using custom instrumentation for pgbench_accounts_pkey?

If that seems like too much work, then it would still be interesting
to see what the B-Tree keyspace looks like before and after varying
the "nclient" count from, say, 32 to 128. Maybe there is a significant
difference in how balanced or skewed it is in each case. Or, the index
could simply be more bloated.

There is a query that I sometimes use, that itself uses pageinspect,
to summarize the keyspace quickly. It shows you the highkey for every
internal page, starting from the root and working down to the lowest
internal page level (the one just before the leaf level -- level 1),
in logical/keyspace order. You can use it to visualize the
distribution of values. It could easily include the leaf level, too,
but that's less interesting and tends to make the query take ages. I
wonder what the query will show here.

Here is the query:

with recursive index_details as (
select
'pgbench_accounts_pkey'::text idx
),
size_in_pages_index as (
select
(pg_relation_size(idx::regclass) / (2^13))::int4 size_pages
from
index_details
),
page_stats as (
select
index_details.*,
stats.*
from
index_details,
size_in_pages_index,
lateral (select i from generate_series(1, size_pages - 1) i) series,
lateral (select * from bt_page_stats(idx, i)) stats),
internal_page_stats as (
select
*
from
page_stats
where
type != 'l'),
meta_stats as (
select
*
from
index_details s,
lateral (select * from bt_metap(s.idx)) meta),
internal_items as (
select
*
from
internal_page_stats
order by
btpo desc),
-- XXX: Note ordering dependency within this CTE, on internal_items
ordered_internal_items(item, blk, level) as (
select
1,
blkno,
btpo
from
internal_items
where
btpo_prev = 0
and btpo = (select level from meta_stats)
union
select
case when level = btpo then o.item + 1 else 1 end,
blkno,
btpo
from
internal_items i,
ordered_internal_items o
where
i.btpo_prev = o.blk or (btpo_prev = 0 and btpo = o.level - 1)
)
select
idx,
btpo as level,
item as l_item,
blkno,
btpo_prev,
btpo_next,
btpo_flags,
type,
live_items,
dead_items,
avg_item_size,
page_size,
free_size,
-- Only non-rightmost pages have high key.
case when btpo_next != 0 then (select data from bt_page_items(idx,
blkno) where itemoffset = 1) end as highkey
from
ordered_internal_items o
join internal_items i on o.blk = i.blkno
order by btpo desc, item;

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-07-07 19:59:56 Re: [WIP] Zipfian distribution in pgbench
Previous Message Peter Geoghegan 2017-07-07 18:19:44 Re: [WIP] Zipfian distribution in pgbench