Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2022-09-22 15:11:18
Message-ID: CAFBsxsF6OT+4Hkjm3J1+qtE_OuegWO3D-UQ5hO0krTTrRZ=8hw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 22, 2022 at 11:46 AM John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
wrote:
> One thing I want to try soon is storing fewer than 16/32 etc entries, so
that the whole node fits comfortably inside a power-of-two allocation. That
would allow us to use aset without wasting space for the smaller nodes,
which would be faster and possibly would solve the fragmentation problem
Andres referred to in

>
https://www.postgresql.org/message-id/20220704220038.at2ane5xkymzzssb%40awork3.anarazel.de

While calculating node sizes that fit within a power-of-two size, I noticed
the current base node is a bit wasteful, taking up 8 bytes. The node kind
only has a small number of values, so it doesn't really make sense to use
an enum here in the struct (in fact, Andres' prototype used a uint8 for
node_kind). We could use a bitfield for the count and kind:

uint16 -- kind and count bitfield
uint8 shift;
uint8 chunk;

That's only 4 bytes. Plus, if the kind is ever encoded in a pointer tag,
the bitfield can just go back to being count only.

Here are the v6 node kinds:

node4: 8 + 4 +(4) + 4*8 = 48 bytes
node16: 8 + 16 + 16*8 = 152
node32: 8 + 32 + 32*8 = 296
node128: 8 + 256 + 128/8 + 128*8 = 1304
node256: 8 + 256/8 + 256*8 = 2088

And here are the possible ways we could optimize nodes for space using aset
allocation. Parentheses are padding bytes. Even if my math has mistakes,
the numbers shouldn't be too far off:

node3: 4 + 3 +(1) + 3*8 = 32 bytes
node6: 4 + 6 +(6) + 6*8 = 64
node13: 4 + 13 +(7) + 13*8 = 128
node28: 4 + 28 + 28*8 = 256
node31: 4 + 256 + 32/8 + 31*8 = 512 (XXX not good)
node94: 4 + 256 + 96/8 + 94*8 = 1024
node220: 4 + 256 + 224/8 + 220*8 = 2048
node256: = 4096

The main disadvantage is that node256 would balloon in size.

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-09-22 15:13:28 Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Previous Message Andres Freund 2022-09-22 15:10:07 Re: proposal: possibility to read dumped table's name from file