Re: RE: [COMMITTERS] pgsql/src/backend/access/transam ( xact.c xlog.c)

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>, Larry Rosenman <ler(at)lerctr(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: RE: [COMMITTERS] pgsql/src/backend/access/transam ( xact.c xlog.c)
Date: 2000-11-18 05:00:34
Message-ID: 200011180500.AAA19918@candle.pha.pa.us
Lists: pgsql-hackers

> > sleep(3) should conform to POSIX specification, if anyone has the
> > reference they can check it to see what the effect of sleep(0)
> > should be.
>
> Yes, but Posix also specifies sched_yield() which rather explicitly
> allows a process to yield its timeslice. No idea how well that is
> supported.

OK, I have a new idea.

There are two parts to transaction commit. The first is writing all
dirty buffers or log changes to the kernel, and the second is the fsync
of the log file.

I suggest having a per-backend shared memory byte that takes one of the
following values:

START_LOG_WRITE
WAIT_ON_FSYNC
NOT_IN_COMMIT
backend_number_doing_fsync
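
As a rough sketch only, one way to fit three fixed states plus a backend
number into a single byte is to reserve the low values for the states and
offset the backend number above them. The names MAX_BACKENDS,
FSYNC_OWNER_BASE, and the fsync_owner_* helpers are hypothetical and do
not come from the PostgreSQL source.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BACKENDS 64   /* hypothetical; must stay <= 253 to fit in a byte */

/* Hypothetical encoding of the per-backend commit byte. */
enum {
    NOT_IN_COMMIT    = 0,
    START_LOG_WRITE  = 1,
    WAIT_ON_FSYNC    = 2,
    FSYNC_OWNER_BASE = 3  /* values >= this mean "backend N is doing my fsync" */
};

/* Encode "backend number doing fsync" as a byte value. */
static inline uint8_t fsync_owner_encode(int backend)
{
    assert(backend >= 0 && backend < MAX_BACKENDS);
    return (uint8_t) (FSYNC_OWNER_BASE + backend);
}

/* True if the byte holds a backend number rather than a plain state. */
static inline int is_fsync_owner(uint8_t value)
{
    return value >= FSYNC_OWNER_BASE;
}

/* Recover the backend number from an owner byte. */
static inline int fsync_owner_decode(uint8_t value)
{
    assert(is_fsync_owner(value));
    return value - FSYNC_OWNER_BASE;
}
```

With this layout a waiting backend can tell "still waiting" apart from
"claimed by backend N" with a single comparison.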

I suggest that when each backend starts a commit, it sets its byte to
START_LOG_WRITE. When it gets ready to fsync, it checks all backends.
If all are NOT_IN_COMMIT, it does fsync and continues.

If one or more are in START_LOG_WRITE, it waits until no one is in
START_LOG_WRITE. It then checks all backends in WAIT_ON_FSYNC, and if
it is the lowest-numbered one, marks all the others with its backend
number and does the fsync. It then resets every backend marked with its
number to NOT_IN_COMMIT. Other backends will see they are not the
lowest in WAIT_ON_FSYNC and will wait for their byte to be set to
NOT_IN_COMMIT so they can then continue, knowing their data was synced.
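
The decision each backend makes at the fsync point might be sketched as
below. This is an illustration under stated assumptions, not working
backend code: it assumes a backend sets its own byte to WAIT_ON_FSYNC
once its log write has reached the kernel, it uses a plain array in place
of shared memory, and it ignores the locking or atomic operations a real
implementation would need. MAX_BACKENDS, OWNER_BASE, FsyncAction, and the
three functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BACKENDS 8    /* hypothetical array size */

/* Hypothetical byte encoding: three states, then owner numbers. */
enum { NOT_IN_COMMIT = 0, START_LOG_WRITE = 1, WAIT_ON_FSYNC = 2, OWNER_BASE = 3 };

typedef enum {
    DO_FSYNC_ALONE,       /* every other backend is NOT_IN_COMMIT */
    WAIT_FOR_WRITERS,     /* someone is still in START_LOG_WRITE */
    LEAD_GROUP_FSYNC,     /* lowest WAIT_ON_FSYNC backend: fsync for the group */
    WAIT_FOR_LEADER       /* a lower-numbered backend will fsync for us */
} FsyncAction;

/* Called by backend `me` after it has set its own byte to WAIT_ON_FSYNC. */
static FsyncAction decide_fsync_action(const uint8_t state[MAX_BACKENDS], int me)
{
    int others_idle = 1, writers = 0, lowest_waiter = -1;

    assert(me >= 0 && me < MAX_BACKENDS);
    for (int i = 0; i < MAX_BACKENDS; i++) {
        if (i != me && state[i] != NOT_IN_COMMIT)
            others_idle = 0;
        if (state[i] == START_LOG_WRITE)
            writers = 1;
        if (state[i] == WAIT_ON_FSYNC && lowest_waiter < 0)
            lowest_waiter = i;
    }
    if (others_idle)
        return DO_FSYNC_ALONE;
    if (writers)
        return WAIT_FOR_WRITERS;
    return lowest_waiter == me ? LEAD_GROUP_FSYNC : WAIT_FOR_LEADER;
}

/* Leader marks every other waiter with its own number before the fsync. */
static int claim_followers(uint8_t state[MAX_BACKENDS], int me)
{
    int claimed = 0;

    for (int i = 0; i < MAX_BACKENDS; i++) {
        if (i != me && state[i] == WAIT_ON_FSYNC) {
            state[i] = (uint8_t) (OWNER_BASE + me);
            claimed++;
        }
    }
    return claimed;
}

/* After the fsync, release only the backends this leader claimed. */
static void release_followers(uint8_t state[MAX_BACKENDS], int me)
{
    for (int i = 0; i < MAX_BACKENDS; i++) {
        if (state[i] == (uint8_t) (OWNER_BASE + me))
            state[i] = NOT_IN_COMMIT;
    }
    state[me] = NOT_IN_COMMIT;
}
```

A backend that reaches WAIT_ON_FSYNC after claim_followers() has run keeps
its WAIT_ON_FSYNC byte through release_followers(), so it is not told its
data was synced by an fsync that may have started before its write.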

This lets a backend that is committing alone skip the sleep entirely,
and lets multiple backends bunch up only when they are all about to
commit.

The reason backend numbers are written is so other backends entering the
commit code will not interfere with the backends performing fsync.

Comments?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
