Re: Relation extension scalability

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Relation extension scalability
Date: 2016-01-25 06:29:46
Message-ID: CAFiTN-u2dZiXLP9jMEtA+OanwE8VPUPFmcJkk+y4c64Sht_n=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 23, 2016 at 12:19 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote
>
>
> Few comments about patch:
>
Thanks for reviewing..

> 1.
> Patch is not getting compiled.
>
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
>
Oh, My mistake, my preprocessor is ignoring this error and relacing it with
BLKSIZE

I will fix in next version of patch.

> 2.
> ! page = BufferGetPage(buffer);
> ! PageInit(page, BufferGetPageSize
> (buf), 0);
> !
> ! freespace = PageGetHeapFreeSpace(page);
> !
> MarkBufferDirty(buffer);
> ! UnlockReleaseBuffer(buffer);
> !
> RecordPageWithFreeSpace(relation, BufferGetBlockNumber(buffer), freespace);
>
> What is the need to mark page dirty here, won't it automatically
> be markerd dirty once the page is used? I think it is required
> if you wish to WAL-log this action.
>

These pages are not going to be used immediately and we have done PageInit
so i think it should be marked dirty before adding to FSM, so that if
buffer get replaced out then it flushes the init data.

> 3. I think you don't need to multi-extend a relation if
> HEAP_INSERT_SKIP_FSM is used, as for that case it anyways try to
> get a new page by extending a relation.
>

Yes, if HEAP_INSERT_SKIP_FSM is used and we use multi-extend atleast in
current transaction it will not take pages from FSM and everytime it will
do multi-extend, however pages will be used if there are parallel backend,
but still not a good idea to extend every time in multiple chunk in current
backend.

So i will change this..

4. Again why do you need this multi-extend optimization for local
> relations (those only accessible to current backend)?
>

I think we can change this while adding the table level "extend_by_blocks"
for local table we will not allow this property, so no need to change at
this place.

What do you think ?

5. Do we need this for nbtree as well? One way to check that
> is by Copying large data in table having index.
>
> Ok, i will try this test and update.

> Note: Test with both data and WAL on Magnetic Disk : No significant
>> improvement visible
>> -- I think wall write is becoming bottleneck in this case.
>>
>>
> In that case, can you try the same test with un-logged tables?
>

OK

>
> Also, it is good to check the performance of patch with read-write work
> load to ensure that extending relation in multiple-chunks should not
> regress such cases.
>

Ok

>
> Currently i have kept extend_num_page as session level parameter but i
>> think later we can make this as table property.
>> Any suggestion on this ?
>>
>>
> I think we should have a new storage_parameter at table level
> extend_by_blocks or something like that instead of GUC. The
> default value of this parameter should be 1 which means retain
> current behaviour by default.
>

+1

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2016-01-25 06:30:56 Re: Relation extension scalability
Previous Message Tom Lane 2016-01-25 06:22:24 Re: Add generate_series(date, date) and generate_series(date, date, integer)