Re: WIP: Avoid creation of the free space map for small tables

From: John Naylor <jcnaylor(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Avoid creation of the free space map for small tables
Date: 2018-11-19 02:00:02
Message-ID: CAJVSVGXKUN8x+Xv9PsGCQ=btyNN4ybLosbAhO2zxb6QuC81L6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/16/18, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> +/* Status codes for the local map. */
> +#define FSM_LOCAL_ZERO 0x00 /* Beyond the end of the relation */
> +#define FSM_LOCAL_AVAIL 0x01 /* Available to try */
> +#define FSM_LOCAL_TRIED 0x02 /* Already tried, not enough space */
>
> Instead of maintaining three states, can't we do with two states
> (Available and Not Available), basically combine 0 and 2 in your case.
> I think it will save some cycles in
> fsm_local_set, where each time you need to initialize all the entries
> in the map. I think we can argue that it is not much overhead, but I
> think it is better code-wise also if we can make it happen with fewer
> states.

That'd work too, but let's consider this scenario: We have a 2-block
table that has no free space. After trying each block, the local cache
looks like

0123
TT00

Let's say we have to wait to acquire a relation extension lock,
because another backend had already started extending the heap by 1
block. We call GetPageWithFreeSpace() and now the local map looks like

0123
TTA0

By using bitwise OR to set availability, the already-tried blocks
remain as they are. With only 2 states, the map would look like this
instead:

0123
AAAN

If we assume that an insert into the newly-created block 2 will almost
always succeed, we don't have to worry about wasting time re-checking
the first 2 full blocks. Does that sound right to you?

> Some assorted comments:
> 1.
> <para>
> -Each heap and index relation, except for hash indexes, has a Free Space
> Map
> +Each heap relation, unless it is very small, and each index relation,
> +except for hash indexes, has a Free Space Map
> (FSM) to keep track of available space in the relation. It's stored
>
> It appears that line has ended abruptly.

Not sure what you're referring to here.

> 2.
> page = BufferGetPage(buffer);
> + targetBlock = BufferGetBlockNumber(buffer);
>
> if (!PageIsNew(page))
> elog(ERROR, "page %u of relation \"%s\" should be empty but is not",
> - BufferGetBlockNumber(buffer),
> + targetBlock,
> RelationGetRelationName(relation));
>
> PageInit(page, BufferGetPageSize(buffer), 0);
> @@ -623,7 +641,18 @@ loop:
> * current backend to make more insertions or not, which is probably a
> * good bet most of the time. So for now, don't add it to FSM yet.
> */
> - RelationSetTargetBlock(relation, BufferGetBlockNumber(buffer));
> + RelationSetTargetBlock(relation, targetBlock);
>
> Is this related to this patch? If not, I suggest let's do it
> separately if required.

I will separate this out.

> 3.
> static int fsm_set_and_search(Relation rel, FSMAddress addr, uint16 slot,
> - uint8 newValue, uint8 minValue);
> + uint8 newValue, uint8 minValue);
>
> This appears to be a spurious change.

It was intentional, but I will include it separately as above.

> 4.
> @@ -378,24 +386,15 @@ RelationGetBufferForTuple(Relation relation, Size
> len,
> * target.
> */
> targetBlock = GetPageWithFreeSpace(relation, len + saveFreeSpace);
> +
> + /*
> + * In case we used an in-memory map of available blocks, reset
> + * it for next use.
> + */
> + if (targetBlock < HEAP_FSM_CREATION_THRESHOLD)
> + ClearLocalMap();
>
> How will you clear the local map during error? I think you need to
> clear it in abort path and you can name the function as
> FSMClearLocalMap or something like that.

That sounds right, and I will rename the function that way. For the
abort path, were you referring to this or somewhere else?

if (!PageIsNew(page))
elog(ERROR, "page %u of relation \"%s\" should be empty but is not",
targetBlock,
RelationGetRelationName(relation));

> 5.
> +/*#define TRACE_TARGETBLOCK */
>
> Debugging leftover, do you want to retain this and related stuff
> during the development of patch?

I modeled this after TRACE_VISIBILITYMAP in visibilitymap.c. It's
useful for development, but I don't particularly care whether it's in
the final verision.

Also, I found an off-by-one error that caused an unnecessary
smgrexists() call in tables with threshold + 1 pages. This will be
fixed in the next version.

-John Naylor

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-11-19 02:17:37 Re: ToDo: show size of partitioned table
Previous Message Michael Paquier 2018-11-19 01:57:44 Re: New GUC to sample log queries