Re: Displaying and dumping of table access methods

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Displaying and dumping of table access methods
Date: 2019-01-08 01:30:13
Message-ID: 20190108013013.GD2528@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Andres Freund (andres(at)anarazel(dot)de) wrote:
> On 2019-01-07 19:19:46 -0500, Stephen Frost wrote:
> > * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > > Over in [1] we're discussing the development of the pluggable storage
> > > patchset, which allows different ways of storing table data.
> > >
> > > One thing I'd like to discuss with a wider audience than the
> > > implementation details is psql and pg_dump handling of table access
> > > methods.
> > >
> > > Currently the patchset neither dumps nor displays table access
> > > methods . That's clearly not right.
> >
> > Agreed.
> >
> > > The reason for that however is not that it's hard to dump/display
> > > code-wise, but that to me the correct behaviour is not obvious.
> >
> > While it might be a lot of changes to the regression output results, I
> > tend to feel that showng the access method is a very important aspect of
> > the relation and therefore should be in \d output.
>
> I don't see how that'd be feasible. Imo it is/was absolutely crucial
> for zheap development to be able to use the existing postgres tests.

I don't agree with the general assumption that "we did this for
development and therefore it should be done that way forever".

Instead, I would look at adding tests where there's a difference
between the two, or possibly some difference, and make sure that there
isn't, and make sure that the code paths are covered.

> > > The reason to make table storage pluggable is after all that the table
> > > access method can be changed, and part of developing new table access
> > > methods is being able to run the regression tests.
> >
> > We certainly want people to be able to run the regression tests, but it
> > feels like we will need more regression tests in the future as we wish
> > to cover both the built-in heap AM and the new zheap AM, so we should
> > really have those both run independently. I don't think we'll be able
> > to have just one set of regression tests that cover everything
> > interesting for both, sadly. Perhaps there's a way to have a set of
> > regression tests which are run for both, and another set that's run for
> > each, but building all of that logic is a fair bit of work and I'm not
> > sure how much it's really saving us.
>
> I don't think there's any sort of contradiction here. I don't think it's
> feasible to have tests tests for every feature duplicated for each
> supported AM, we have enough difficulty maintaining our current
> tests. But that doesn't mean it's a problem to have individual test
> [files] run with an explicitly assigned AM - the test can just do a SET
> default_table_access_method = zheap; or explicitly say USING zheap.

I don't mean to suggest that there's a contradiction. I don't have any
problem with new tests being added which set the default AM to zheap, as
long as it's clear that such is happening for downstream tests.

> > > A patch at [2] adds display of a table's access method to \d+ - but that
> > > means that running the tests with a different default table access
> > > method (e.g. using PGOPTIONS='-c default_table_access_method=...)
> > > there'll be a significant number of test failures, even though the test
> > > results did not meaningfully differ.
> >
> > Yeah, I'm not really thrilled with this approach.
> >
> > > Similarly, if pg_dump starts to dump table access methods either
> > > unconditionally, or for all non-heap AMS, the pg_dump tests fail due to
> > > unimportant differences.
> >
> > In reality, pg_dump already depends on quite a few defaults to be in
> > place, so I don't see a particular issue with adding this into that set.
> > New tests would need to have new pg_dump checks, of course, but that's
> > generally the case as well.
>
> I am not sure what you mean here? Are you suggesting that there'd be a
> separate set of pg_dump test for zheap and every other possible future
> AM?

I'm suggesting that pg_dump would have additional tests for zheap, in
addition to the existing tests we already have. No more, no less,
really.

> To me the approach you're suggesting is going to lead to an explosion of
> redundant tests, which are really hard to maintain, especially for
> out-of-tree AMs. Out of tree AMs with the setup I propose can just
> install the AM into the template database and set PGOPTIONS, and they
> have pretty good coverage.

I'm frankly much less interested in out-of-tree AMs as we aren't going
to have in-core regression tests for them anyway, because we can't as
they aren't in our tree and, ultimately, I don't find them to have
anywhere near the same value that in-core AMs have.

I don't have any problem with out-of-tree AMs hacking things up as they
see fit and then running whatever tests they want, but it is, and always
will be, a very different discussion and ultimately a different result
when we're talking about what will be incorporated and supported as part
of core.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2019-01-08 01:34:39 Re: speeding up planning with partitions
Previous Message Michael Paquier 2019-01-08 01:24:55 Re: ALTER INDEX fails on partitioned index