Re: [CORE] WAL & RC1 status

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Vadim Mikheev <vadim4o(at)email(dot)com>
Cc: pgsql-core(at)postgreSQL(dot)org, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [CORE] WAL & RC1 status
Date: 2001-03-03 19:06:55
Message-ID: 7129.983646415@sss.pgh.pa.us
Lists: pgsql-hackers

Vadim Mikheev <vadim4o(at)email(dot)com> writes:
>> -- Judging from the commit timestamps surrounding prior
>> -- checkpoints, checkpoints were happening every five
>> -- minutes approximately on the 5-minute mark, so

> You can't count on this: postmaster runs checkpoint
> "maker" in 5 minutes *after* prev checkpoint was created,
> not from the moment "maker" started. And checkpoint can
> take *minutes*.

Good point, although with so little going on (this is the *whole*
relevant section of the log), that seems unlikely.
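
To make the scheduling point concrete, here is a minimal sketch, not
PostgreSQL code. The function name, the 300-second interval, and the
checkpoint durations are all illustrative assumptions. It only shows that
when the next checkpoint is timed from the *end* of the previous one, a
single slow checkpoint moves every later checkpoint off the 5-minute mark.

```python
def checkpoint_start_times(durations, interval=300):
    """Return the start time (in seconds) of each successive checkpoint.

    Assumption: each checkpoint is scheduled `interval` seconds after the
    previous one finished, so a checkpoint's own duration delays all
    later checkpoints.
    """
    starts = []
    t = 0
    for duration in durations:
        starts.append(t)
        # The next checkpoint is timed from when this one completes.
        t += duration + interval
    return starts


print(checkpoint_start_times([0, 0, 0]))   # idle system: [0, 300, 600]
print(checkpoint_start_times([0, 90, 0]))  # one 90 s checkpoint: [0, 300, 690]
```

On an idle system the durations are near zero, which is why the
checkpoints in this log would still be expected close to the 5-minute marks.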

>> -- here. But it's worse than that: check the commit
>> -- timestamps and the xid numbers before and after the
>> -- discontinuity. Did time go backwards here?

> Commit timestamps are created *before* XLogInsert call,
> which can suspend backend for some time (in multi-user
> env). Random xid-s are also ok, generally.

Hmm ... maybe. Though again, this installation doesn't seem to have
been busy enough to cause a commit to be delayed for very long.
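
The timestamp effect can be sketched as follows, again under stated
assumptions: this is a toy model, not the backend's actual logic, and the
xids, times, and delays are invented for illustration. Each commit takes
its timestamp first and reaches the log only after some delay, so log
order can disagree with timestamp order.

```python
def log_order(commits):
    """Return (xid, stamp_time) pairs in the order they reach the log.

    `commits` is a list of (xid, stamp_time, insert_delay) tuples. The
    timestamp is taken at stamp_time; the record is assumed to enter the
    log at stamp_time + insert_delay.
    """
    ordered = sorted(commits, key=lambda c: c[1] + c[2])
    return [(xid, stamp) for xid, stamp, _ in ordered]


def time_goes_backwards(records):
    """True if any record carries an earlier timestamp than its predecessor."""
    return any(later[1] < earlier[1]
               for earlier, later in zip(records, records[1:]))


# Hypothetical data: xid 101 stamps first but is held up for 3 seconds.
records = log_order([(101, 10.0, 3.0), (102, 11.0, 0.0)])
print(records)                       # [(102, 11.0), (101, 10.0)]
print(time_goes_backwards(records))  # True
```

In this model the size of any apparent backwards jump is bounded by the
longest insert delay, which is why a lightly loaded system makes a large
jump hard to explain this way.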

What I realized after posting that analysis is that the last checkpoint
record has SUI 30 whereas the earlier ones have SUI 29 ... so there was
a system restart in there somewhere. That still leaves me wondering
about the discontinuity and broken back-link, but it may account for
the "missing" checkpoint records --- perhaps they weren't generated
because the system wasn't up the entire interval.

>> -- What's even nastier (and the immediate cause of
>> -- Scott's inability to restart) is that the pg_control
>> -- file's checkPoint pointer points to 0/005AF9F0, which
>> -- is *not* the location of this checkpoint, but of
>> -- the record after it.

> Well, well. Checkpoint position is taken from
> MyLastRecord - I wonder how could this internal var
> take "invalid" data from concurrent backend.

I have not been able to figure that one out either.

> Ok, we're leaving Krasnoyarsk in 8 hrs and should
> arrive SF Feb 5 ~ 10pm.

Have a safe trip!

regards, tom lane
