Re: Progress Report on Materialized Views

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'Jonathan M(dot) Gardner'" <jgardner(at)jonathangardner(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Progress Report on Materialized Views
Date: 2004-02-24 00:17:28
Message-ID: 002301c3fa6b$99d825b0$ec9387d9@LaptopDellXP
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>Tom Lane
> "Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> > Could I suggest that your next step is to sync up with the work
being
> > done on tuning the DBT-3 query workload? As I'm sure you're aware,
that
> > is very similar to TPC-H workload, where most of the commercial
RDBMS
> > vendors utilise Materialized Views to enhance certain queries.
>
> Oh? As far as I can tell, TPC-H forbids use of materialized views.
> See sections 1.5.6 and 1.5.7 of the spec. The effect of the fine
print
> seems to be that the only way you are allowed to store extra copies of
> data is as indexes over columns that are primary keys, foreign keys,
> or date columns.

Sorry, I wasn't very clear there. I'm very happy that you're looking at
this area and are able to slap my imprecision into shape so succinctly.

You are 100% right: MVs are unambiguously not allowed as part of the
TPC-H spec - what 1.5.7 shows is how useful MVs are: they've had to ban
them! My take on the reason we now have MVs in all of the major
commercial DBMS is because they weren't directly banned in the original
TPC-D spec.
[ http://www.tpc.org/tpch/spec/tpch2.1.0.pdf ]

For me, there are two issues:
i) passing the TPC-H test
ii) dealing with a real-world workload

IMHO, TPC-H is very much a real-world workload, so the difference is:
i) coping with a TPC-H style workload when you have no a priori
knowledge of what might be performed, and when: that is the heart of the
TPC-H test. I think it is important that we strive to succeed on the
pure test since there always will be many queries you can't predict.
Removing MVs from that was an important statement about the requirement
to cope with ad-hoc queries.

ii) coping with the same workload when you have learnt something about
it - i.e. you are able to tune the system over time. That's where MVs
come in. Of course, you can go too far here, so there needs to be some
judgement about what constitutes a real-world MV scenario.

For me, the issue isn't passing the test, but exceeding it.

If Jonathan's MV work can make a noticeable improvement on the DBT-3
workload, then it will be worthwhile; this is likely to be intertwined
with optimizer improvements required in other areas.

However, overall it was very cheeky of me to presume to bring you or
anybody else together on these things, so I'll butt out until I have
time to contribute personally,

Best Regards, Simon Riggs

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2004-02-24 01:02:33 Re: Too-many-files errors on OS X
Previous Message Tom Lane 2004-02-24 00:09:22 Re: aliases, &c in HAVING clause?