Re: Improving the Performance of Full Table Updates

From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-10-08 06:31:19
Message-ID: 9362e74e0710072331w4b348ef8ge115918a650608a2@mail.gmail.com
Lists: pgsql-hackers

Hi Heikki,
Finally I found some time to look more into the CRC code. The
time is mostly taken when there is a backup block in the XLOG structure.
When it calculates the CRC for the entire block (there is already some
optimization for the holes), the time is spent in the CRC macro. I tried
some small optimizations, like changing the int into a uint8, thinking that
the XOR operation might get slightly faster, with no success. I don't
know whether changing the CRC algorithm is a good option.
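
For reference, here is a minimal sketch of the kind of per-byte, table-driven CRC loop being discussed, including skipping a "hole" in the block. It is a generic CRC-32 (reflected polynomial 0xEDB88320), not the actual PostgreSQL macro; the names crc32_init_table, crc32_update, crc32_page_skip_hole and SKETCH_BLOCK_SIZE are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed block size for this sketch only. */
#define SKETCH_BLOCK_SIZE 8192

uint32_t crc_table[256];

/* Build the lookup table for the reflected CRC-32 polynomial 0xEDB88320. */
void crc32_init_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[i] = c;
    }
}

/*
 * Fold len bytes into a running CRC. Each byte costs one table lookup,
 * one shift and two XORs, so the cost grows linearly with the block size.
 */
uint32_t crc32_update(uint32_t crc, const unsigned char *data, size_t len)
{
    while (len-- > 0)
        crc = crc_table[(crc ^ *data++) & 0xFF] ^ (crc >> 8);
    return crc;
}

/*
 * CRC of a whole block, skipping the unused "hole"
 * [hole_offset, hole_offset + hole_length). Only the bytes before and
 * after the hole are fed into the CRC.
 */
uint32_t crc32_page_skip_hole(const unsigned char *page,
                              size_t hole_offset, size_t hole_length)
{
    uint32_t crc = 0xFFFFFFFFu;             /* initial value */

    assert(hole_offset + hole_length <= SKETCH_BLOCK_SIZE);
    crc = crc32_update(crc, page, hole_offset);
    crc = crc32_update(crc, page + hole_offset + hole_length,
                       SKETCH_BLOCK_SIZE - (hole_offset + hole_length));
    return crc ^ 0xFFFFFFFFu;               /* final XOR */
}
```

Call crc32_init_table() once before computing any CRC. Since the loop does a fixed amount of work per byte, a block with a small hole costs almost as much as a full block, which would be consistent with the time showing up in the CRC step when a backup block is present.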
I also have some observations on including snapshot information
in the indexes, on which I will make a proposal. Please give me
guidance on that.

Thanks,
Gokul.

On 9/27/07, Heikki Linnakangas <heikki(at)enterprisedb(dot)com> wrote:
>
> Gokulakannan Somasundaram wrote:
> > Hi Tom/Heikki,
> > Thanks for the suggestion. After profiling I got similar results.
> > So I am thinking of a design like this to get the performance
> > improvement.
> >
> > a) We can get one page for inserts (during update) and hold the write
> > lock on it until the page gets filled. In this way,
> > RelationGetBufferForTuple will get called only once per page of
> > inserts.
>
> The locking actually isn't that expensive once you have the page pinned.
> For starters, keep the page pinned over calls to heap_update, and just
> relock it instead of calling RelationGetBufferForTuple. Unsurprisingly,
> this is not a new idea:
> http://archives.postgresql.org/pgsql-patches/2007-05/msg00499.php.
>
> > b) Do you think we can optimize XLogInsert in such a way that it
> > writes a page instead of writing all the records in the page? I think we
> > would need to write a recovery routine for that. Currently the page gets
> > flushed to the WAL if it gets modified after the checkpoint. So I still
> > need to understand those code pieces. But do you think it is wise to
> > continue working along this line?
>
> It's going to be very difficult at least. There are a lot of race
> conditions lurking if you try to coalesce multiple updates into a single
> WAL record.
>
> That said, making XLogInsert faster would help a lot of use cases, not
> only full-table updates. Most of the time is spent calculating the CRC,
> but it has some other non-trivial overhead as well. Profiling XLogInsert
> in more detail, and figuring out how to make it faster would be very nice.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
