Re: Remaining beta blockers

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remaining beta blockers
Date: 2013-04-30 15:35:32
Message-ID: 1367336132.69972.YahooMailNeo@web162903.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:
>> Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

>>> 2) Since we don't have a metapage to represent scannability in 9.3
>>>    we cannot easily use one in 9.4+ without pg_upgrade emptying all
>>>    matviews unless we only rely on the catalogs which we currently
>>>    cannot.
>
>> I am not following this argument at all.  Doesn't pg_upgrade use
>> pg_dump to create the tables and matviews WITH NO DATA and take
>> later action for ones which are populated in the source?  It would
>> not be hard at all to move to a new release which used a different
>> technique for tracking populated tables by changing what pg_dump
>> does for populated tables with the switch pg_upgrade uses.
>
> What I am thinking about is a 100GB materialized view which has been
> filled in 9.3 and should now be pg_upgraded into 9.4. If we don't have a
> metapage yet and we want to add one we would need to rewrite the whole
> 100GB which seems like a rather bad idea. Or how are you proposing to
> add the metapage? You cannot add it to the end or somesuch.

Oh, you are suggesting prepending a metapage to existing matviews
(and tables?)?  So to check whether a view has been populated you
not only look at the directory but open the file and read a page?
Now I follow why you think this would be an issue.  I'm not sure I
think that is the best solution, though.  In what way would it be
better than adding info to pg_class or some other system table?
Why would this be important for unlogged matviews but not unlogged
tables?

>> I am not seeing this at all.  Given my comment above about how this
>> works for pg_upgrade and pg_dump, can you explain how this is a
>> problem?
>
> Well, that only works if there is a cheap way to add the new notation to
> existing matview heaps which I don't see.

We could perhaps reserve some space in the special area of the
first page, if we can agree on a generous enough amount of space.

>>> 3) Using the filesize as a flag will make other stuff like
>>>    preallocating on-disk data in bigger chunks and related
>>>    things more complicated.
>>
>> Why?  You don't need to preallocate for a non-populated matview,
>> since its heap will be replaced on REFRESH.  You would not *want*
>> to preallocate a lot of space for something guaranteed to remain at
>> zero length until deleted.
>
> But we would likely also want to optimize reducing the filesize in the
> same chunks because you otherwise would end up with even worse
> fragmentation. And I am not talking about an unscannable relation but
> about one which is currently empty (e.g. SELECT ... WHERE false).

Yes, if we get to both incremental updates *and* preallocations
before we develop a better technique for this, a special case would
be needed for the case where a matview was incrementally updated to
zero rows.

>>> Not sure what the consequence of this is. The most reasonable solution
>>> seems to be to introduce a metapage somewhere *now* which sucks, but
> ...
>>
>> That seems far riskier to me than using the current
>> lame-but-functional approach now and improving it in a subsequent
>> release.
>
> Why? We have bitten by the lack of such metapages several times now and
> I don't think we have bitten by their presence in relation types that
> have them by now.

Like I said, months ago I had a version which used the special area
for the first page of a matview heap, but was convinced to change
that.  I could probably be convinced to change back.  :-)  I don't
know whether you see a problem with using the special like that for
the metadata you envision.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Adrian.Vondendriesch 2013-04-30 15:40:35 BUG #8128: pg_dump (>= 9.1) failed while dumping a scheme named "old" from PostgreSQL 8.4
Previous Message Andres Freund 2013-04-30 15:18:16 Re: The missing pg_get_*def functions