Re: Undo logs

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: dilip(dot)kumar(at)enterprisedb(dot)com, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Undo logs
Date: 2018-11-13 08:36:33
Message-ID: CAA4eK1KSvOB9aWG09aXjHhY8iLmapEEXGF_mWpMUeN0VmOAOrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 17, 2018 at 3:27 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Oct 15, 2018 at 6:09 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Sun, Sep 2, 2018 at 12:19 AM Thomas Munro
> > <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > > I have also pushed a new WIP version of the lower level undo log
> > > storage layer patch set to a public branch[1]. I'll leave the earlier
> > > branch[2] there because the record-level patch posted by Dilip depends
> > > on it for now.
> > >
> >
>
> Till now, I have mainly reviewed undo log allocation part. This is a
> big patch and can take much more time to complete the review. I will
> review the other parts of the patch later.
>

I think I see another issue with this patch. Basically, during
extend_undo_log, there is an assumption that no one could update
log->meta.end concurrently, but it
is not true as the same can be updated by UndoLogDiscard which can
lead to assertion failure in extend_undo_log.

+static void
+extend_undo_log(UndoLogNumber logno, UndoLogOffset new_end)
{
..
/*
+ * Create all the segments needed to increase 'end' to the requested
+ * size. This is quite expensive, so we will try to avoid it completely
+ * by renaming files into place in UndoLogDiscard instead.
+ */
+ end = log->meta.end;
+ while (end < new_end)
+ {
+ allocate_empty_undo_segment(logno, log->meta.tablespace, end);
+ end += UndoLogSegmentSize;
+ }
..
+ Assert(end == new_end);
..
/*
+ * We didn't need to acquire the mutex to read 'end' above because only
+ * we write to it. But we need the mutex to update it, because the
+ * checkpointer might read it concurrently.
+ *
+ * XXX It's possible for meta.end to be higher already during
+ * recovery, because of the timing of a checkpoint; in that case we did
+ * nothing above and we shouldn't update shmem here. That interaction
+ * needs more analysis.
+ */
+ LWLockAcquire(&log->mutex, LW_EXCLUSIVE);
+ if (log->meta.end < end)
+ log->meta.end = end;
+ LWLockRelease(&log->mutex);
..
}

Assume, before we read log->meta.end in above code, concurrently,
discard process discards the undo and moves the end pointer to a later
location, then the above assertion will fail. Now, if discard happens
just after we read log->meta.end and before we do other stuff in this
function, then it will crash in recovery.

Can't we just remove this Assert?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hubert Zhang 2018-11-13 08:47:13 Control your disk usage in PG: Introduction to Disk Quota Extension
Previous Message David Rowley 2018-11-13 08:21:13 Re: PostgreSQL Limits and lack of documentation about them.