Re: Zedstore - compressed in-core columnar storage

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-04-15 12:34:24
Message-ID: 20190415123424.GD6197@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
> On Tue, Apr 09, 2019 at 02:29:09PM -0400, Robert Haas wrote:
> >On Tue, Apr 9, 2019 at 11:51 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> >>This is not surprising, considering that columnar store is precisely the
> >>reason for starting the work on table AMs.
> >>
> >>We should certainly look into integrating some sort of columnar storage
> >>in mainline. Not sure which of zedstore or VOPS is the best candidate,
> >>or maybe we'll have some other proposal. My feeling is that having more
> >>than one is not useful; if there are optimizations to one that can be
> >>borrowed from the other, let's do that instead of duplicating effort.
> >
> >I think that conclusion may be premature. There seem to be a bunch of
> >different ways of doing columnar storage, so I don't know how we can
> >be sure that one size will fit all, or that the first thing we accept
> >will be the best thing.
> >
> >Of course, we probably do not want to accept a ton of storage manager
> >implementations is core. I think if people propose implementations
> >that are poor quality, or missing important features, or don't have
> >significantly different use cases from the ones we've already got,
> >it's reasonable to reject those. But I wouldn't be prepared to say
> >that if we have two significantly different column store that are both
> >awesome code with a complete feature set and significantly disjoint
> >use cases, we should reject the second one just because it is also a
> >column store. I think that won't get out of control because few
> >people will be able to produce really high-quality implementations.
> >
> >This stuff is hard, which I think is also why we only have 6.5 index
> >AMs in core after many, many years. And our standards have gone up
> >over the years - not all of those would pass muster if they were
> >proposed today.
>
> It's not clear to me whether you're arguing for not having any such
> implementation in core, or having multiple ones? I think we should aim
> to have at least one in-core implementation, even if it's not the best
> possible one for all sizes. It's not like our rowstore is the best
> possible implementation for all cases either.
>
> I think having a colstore in core is important not just for adoption,
> but also for testing and development of the executor / planner bits.

Agreed.

> If we have multiple candidates with sufficient code quality, then we may
> consider including both. I don't think it's very likely to happen in the
> same release, considering how much work it will require. And I have no
> idea if zedstore or VOPS are / will be the only candidates - it's way
> too early at this point.

Definitely, but having as many different indexes as we have is certainly
a good thing and we should be looking to a future where we have multiple
in-core options for row and column-oriented storage.

> FWIW I personally plan to focus primarily on the features that aim to
> be included in core, and that applies to colstores too.

Yeah, same here.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2019-04-15 12:44:18 Re: PANIC: could not flush dirty data: Operation not permitted power8, Redhat Centos
Previous Message Stephen Frost 2019-04-15 12:24:52 Re: [PATCH v20] GSSAPI encryption support