Re: pgsql: Improve gist XLOG code to follow the coding

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-committers(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: pgsql: Improve gist XLOG code to follow the coding
Date: 2006-03-31 16:27:40
Message-ID: 442D587C.2040102@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

>> At line 543 itup vector was filled by invalid tuple, so newly filled root
>> will contains only invalid tuples. Hence, reindex/vacuum full is required.
>
> Hmm. That probably needs to be redesigned then. The problem is that as
> the thing is written, you *must* reinit the buffer here --- the code I
> removed that checked the page LSN was unsafe. If you want to depend on
> the prior page contents during replay, you have to give XLogInsert the
> opportunity to save the whole page when the xlog entry is made.

Don't understand. Root will be fully regenerated and filled with invalid tuples.

> old/inconsistent. We deal with this by having the first WAL record that
> touches a given page after a checkpoint contain a copy of the whole
> page, and then we can rewrite the page from that copy instead of trying

I see, there is a problem with gistSplit: it can generate more than 3 pages
(three - is a limit of XLogInsert) in a case of bad user's picksplit method...
But it is a very "corner" case...

In practice, I don't see more than three pages, three pages are rarely generated
by gist__int_ops (contrib/intarray).

> BTW, there's another problem with gistContinueInsert: it's not making
> WAL entries for the actions it takes. It needs to do so --- consider
> for example the PITR case, where the log will be shipped somewhere else
> and then followed verbatim. So you've got to add WAL entries for any
> recovery actions you take after reading the existing WAL entries.

Ugh, I see

> process. In particular, we really don't want to call any user-defined
> datatype functions inside the critical section.

Agree

> be to compute all the required changes *but not make them*, then start
> the critical section, then apply the changes and make the WAL record.
> This seemed like a large enough change that I figured I'd better talk
> with you about it. One idea I had was to generate the WAL record as the
> output of the decision-making code, and then the critical section would
> use the WAL record as its guide to what to do to the buffers.

Hmm, we will think.

> BTW, I'm confused about gistadjscans: seems to me that's either broken
> or unnecessary. Since we no longer hold an exclusive lock on a gist
> index while inserting into it, the comment at gistscan.c line 33 is
> certainly wrong now. If the adjustment is still necessary then some
> other mechanism needs to be devised to get the information to other
> backends. If it's not necessary then I think we should take it out.
> I'm not totally familiar with the gist logic, but it looks to me like
> gistadjscans is only called during an insert into a non-leaf page,
> which makes me think it might be unnecessary --- do gist index scans
> ever stop on non-leaf pages?

It seems to me - you are right. I missed that. I'll remove it and test changes.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message User Aparashar 2006-03-31 18:00:19 bizgres - bizgres: Backing out the latest bitmap patch - Ayush
Previous Message Tom Lane 2006-03-31 15:35:26 Re: pgsql: Improve gist XLOG code to follow the coding