Re: WIP: Avoid creation of the free space map for small tables

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: jcnaylor(at)gmail(dot)com
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Avoid creation of the free space map for small tables
Date: 2018-10-15 03:40:49
Message-ID: CAA4eK1J957pGLFVRzB1ao-rOwfY2j-AwMjMfNAzZrCaJn5-rmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 14, 2018 at 1:09 AM John Naylor <jcnaylor(at)gmail(dot)com> wrote:
>
> On 10/13/18, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > I think you have found a good way to avoid creating FSM, but can't we
> > use some simpler technique like if the FSM fork for a relation doesn't
> > exist, then check the heapblk number for which we try to update the
> > FSM and if it is lesser than HEAP_FSM_EXTENSION_THRESHOLD, then avoid
> > creating the FSM.
>
> I think I see what you mean, but to avoid the vacuum problem you just
> mentioned, we'd need to check the relation size, too.
>

Sure, but vacuum already has relation size. In general, I think we
should try to avoid adding more system calls in the common code path.
It can impact the performance.

Few comments on your latest patch:
-
+static bool
+allow_write_to_fsm(Relation rel, BlockNumber heapBlk)
+{
+ BlockNumber heap_nblocks;
+
+ if (heapBlk > HEAP_FSM_EXTENSION_THRESHOLD ||
+ rel->rd_rel->relkind != RELKIND_RELATION)
+ return true;
+
+ /* XXX is this value cached? */
+ heap_nblocks = RelationGetNumberOfBlocks(rel);
+
+ if (heap_nblocks > HEAP_FSM_EXTENSION_THRESHOLD)
+ return true;
+ else
+ {
+ RelationOpenSmgr(rel);
+ return smgrexists(rel->rd_smgr, FSM_FORKNUM);
+ }
+}

I think you can avoid calling RelationGetNumberOfBlocks, if you call
smgrexists before and for the purpose of vacuum, we can get that as an
input parameter. I think one can argue for not changing the interface
functions like RecordPageWithFreeSpace to avoid calling
RelationGetNumberOfBlocks, but to me, it appears worth to save the
additional system call.

-
targetBlock = GetPageWithFreeSpace(relation, len + saveFreeSpace);
-
- /*
- * If the FSM knows nothing of the rel, try the last page before we
- * give up and extend. This avoids one-tuple-per-page syndrome during
- * bootstrapping or in a recently-started system.
- */
if (targetBlock == InvalidBlockNumber)
- {
- BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
-
- if (nblocks > 0)
- targetBlock = nblocks - 1;
- }
+ targetBlock = get_page_no_fsm(relation, InvalidBlockNumber,
+ &try_every_page);

Is it possible to hide the magic of trying each block within
GetPageWithFreeSpace? It will simplify the code and in future, if
another storage API has a different function for
RelationGetBufferForTuple, it will work seamlessly, provided they are
using same FSM. One such user is zheap.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Borodin 2018-10-15 04:57:29 Re: B-tree cache prefetches
Previous Message Tom Lane 2018-10-15 03:25:09 Re: "SELECT ... FROM DUAL" is not quite as silly as it appears