| From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> | 
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> | 
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Relation extension scalability | 
| Date: | 2016-01-25 06:29:46 | 
| Message-ID: | CAFiTN-u2dZiXLP9jMEtA+OanwE8VPUPFmcJkk+y4c64Sht_n=A@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Sat, Jan 23, 2016 at 12:19 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote
>
>
> Few comments about patch:
>
Thanks for reviewing..
> 1.
> Patch is not getting compiled.
>
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
>
Oh, My mistake, my preprocessor is ignoring this error and relacing it with
BLKSIZE
I will fix in next version of patch.
> 2.
> ! page = BufferGetPage(buffer);
> ! PageInit(page, BufferGetPageSize
> (buf), 0);
> !
> ! freespace = PageGetHeapFreeSpace(page);
> !
> MarkBufferDirty(buffer);
> ! UnlockReleaseBuffer(buffer);
> !
> RecordPageWithFreeSpace(relation, BufferGetBlockNumber(buffer), freespace);
>
> What is the need to mark page dirty here, won't it automatically
> be markerd dirty once the page is used?  I think it is required
> if you wish to WAL-log this action.
>
These pages  are not going to be used immediately and we have done PageInit
so i think it should be marked dirty before adding to FSM, so that if
buffer get replaced out then it flushes the init data.
> 3. I think you don't need to multi-extend a relation if
> HEAP_INSERT_SKIP_FSM is used, as for that case it anyways try to
> get a new page by extending a relation.
>
Yes, if HEAP_INSERT_SKIP_FSM is used and we use multi-extend atleast in
current transaction it will not take pages from FSM and everytime it will
do multi-extend, however pages will be used if there are parallel backend,
but still not a good idea to extend every time in multiple chunk in current
backend.
So i will change this..
4. Again why do you need this multi-extend optimization for local
> relations (those only accessible to current backend)?
>
I think we can change this while adding the  table level "extend_by_blocks"
for local table we will not allow this property, so no need to change at
this place.
What do you think ?
5. Do we need this for nbtree as well?  One way to check that
> is by Copying large data in table having index.
>
> Ok, i will try this test and update.
> Note: Test with both data and WAL on Magnetic Disk : No significant
>> improvement visible
>> -- I think wall write is becoming bottleneck in this case.
>>
>>
> In that case, can you try the same test with un-logged tables?
>
OK
>
> Also, it is good to check the performance of patch with read-write work
> load to ensure that extending relation in multiple-chunks should not
> regress such cases.
>
Ok
>
> Currently i have kept extend_num_page as session level parameter but i
>> think later we can make this as table property.
>> Any suggestion on this ?
>>
>>
> I think we should have a new storage_parameter at table level
> extend_by_blocks or something like that instead of GUC. The
> default value of this parameter should be 1 which means retain
> current behaviour by default.
>
+1
-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Dilip Kumar | 2016-01-25 06:30:56 | Re: Relation extension scalability | 
| Previous Message | Tom Lane | 2016-01-25 06:22:24 | Re: Add generate_series(date, date) and generate_series(date, date, integer) |