Re: BUG #6041: Unlogged table was created bad in slave node

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Emanuel <postgres(dot)arg(at)gmail(dot)com>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6041: Unlogged table was created bad in slave node
Date: 2011-06-03 16:44:45
Message-ID: BANLkTi=e7+0aXyoLoZENsNeBWX7n9Q+4Pw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Wed, Jun 1, 2011 at 2:28 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, May 26, 2011 at 12:34 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>>> Excerpts from Euler Taveira de Oliveira's message of jue may 26 12:00:05 -0400 2011:
>>>> I think we should emit the real cause in those cases, if possible (not too
>>>> much overhead). The message would be "Unlogged table content is not available
>>>> in standby server".
>>
>>> I guess what it should do is create an empty file in the slave.
>>
>> Probably it should, because won't the table malfunction after the slave
>> is promoted to master, if there's no file at all there?  Or will the
>> process of coming live create an empty file even if there was none?
>
> Coming live creates an empty file.
>
>> But Euler is pointing out a different issue, which is usability.  If the
>> slave just acts like the table is present but empty, we are likely to
>> get bug reports about that too.  An error telling you you aren't allowed
>> to access such a table on slaves would be more user-friendly, if we can
>> do it without too much pain.
>
> I looked into this a bit.  A few observations:
>
> (1) This problem is actually not confined to unlogged tables;
> temporary tables have the same issue.  For example, if you create a
> temporary table on the master and then, on the slave, do SELECT * FROM
>  pg_temp_3.hi_mom (or whatever the name of the temp schema where the
> temp table is) you get the same error.  In fact I suspect if you took
> a base backup that included the temporary relation and matched the
> backend ID you could even manage to read out the old contents (modulo
> any fun and exciting XID wraparound issues).  But the problem is of
> course more noticeable for unlogged tables since they're not hidden
> away in a special funny schema.
>
> (2) It's somewhat difficult to get a clean fix for this error because
> it's coming from deep down in the bowels of the system, inside
> mdnblocks().  At that point, we haven't got a Relation available, just
> an SMgrRelation, so the information that this relation is unlogged is
> not really available, at least not without some sort of gross
> modularity violation like calling smgrexists() on the init fork.  It
> doesn't particularly seem like a good idea to conditionalize the
> behavior of mdnblocks() on relpersistence anyway; that's a pretty
> gross modularity violation all by itself.
>
> (3) mdnblocks() gets called twice during the process of doing a simple
> select on a relation - once from the planner, via estimate_rel_size,
> and again from the executor, via initscan.  A fairly obvious fix for
> this problem is to skip the call to estimate_rel_size() if we're
> dealing with a temporary or unlogged relation and are in recovery; and
> make heap_beginscan_internal() throw a suitable error under similar
> circumstances (or do we want to throw the error at plan time?  tossing
> it out from all the way down in estimate_rel_size() seems a bit
> wacky).  Patch taking this approach attached.
>
> (4) It strikes me that it might be possible to address this problem a
> bit more cleanly by allowing mdnblocks() and smgrnblocks() and
> RelationGetNumberOfBlocksInFork() to take a boolean argument
> indicating whether or not an error should be thrown if the underlying
> physical file happens not to exist.  When no error is to be signaled,
> we simply return 0 when the main fork doesn't exist, rather than
> throwing an error.  We could then make estimate_rel_size() use this
> variant, eliminating the need for an explicit test in
> get_relation_info().  I'm sort of inclined to go this route even
> though it would require touching a bit more code.  We've already
> whacked around RelationGetNumberOfBlocks() a bit in this release (the
> function that formerly had that name is now
> RelationGetNumberOfBlocksInFork(), and RelationGetNumberOfBlocks() is
> now a macro that calls that function w/MAIN_FORKNUM) so it doesn't
> seem like a bad time to get any other API changes that might be useful
> out of our system.
>
> Thoughts?

Anyone? Anyone? Bueller?

If we don't want to gum this with the above-mentioned cruft, the other
obvious alternative here is to do nothing, and live with the
non-beauty of the resulting error message.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Robert Haas 2011-06-03 16:47:01 Re: BUG #6050: Dump and restore of view after a schema change: can't restore the view
Previous Message Jehan-Guillaume (ioguix) de Rorthais 2011-06-03 15:42:13 BUG #6051: wCTE query fail with wrong error text on a table with rules

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2011-06-03 16:46:29 Re: Estimating total amount of shared memory required by postmaster
Previous Message Robert Haas 2011-06-03 16:40:41 Re: reducing the overhead of frequent table locks - now, with WIP patch