Re: Including Snapshot Info with Indexes

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Skype Technologies OY" <hannu(at)skype(dot)net>, "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andreas Joseph Krogh" <andreak(at)officenet(dot)no>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Including Snapshot Info with Indexes
Date: 2007-10-20 17:19:43
Message-ID: C33F86BF.4749E%llonergan@greenplum.com
Lists: pgsql-hackers pgsql-patches

Hi Hannu,

On 10/14/07 12:58 AM, "Hannu Krosing" <hannu(at)skype(dot)net> wrote:

> What has happened in reality, is that the speed difference between CPU,
> RAM and disk speeds has _increased_ tremendously

Yes.

> which makes it even
> more important to _decrease_ the size of stored data if you want good
> performance

Or bring the CPU processing closer to the data it's using (or both).

By default, the trend you mention first will continue indefinitely -
the consequence is that the "distance" between a processor and its target
data will keep increasing.

By contrast, you can only decrease the data volume so much - so in the end
you'll be left with the same problem - the data needs to be closer to the
processing. This is the essence of parallel / shared nothing architecture.

Note that we've done this at Greenplum. We're also implementing a DSM-like
capability and are investigating a couple of different hybrid row / column
store approaches.

Bitmap index with index-only access does provide nearly all of the
advantages of a column store from a speed standpoint BTW. Even though
Vertica is touting speed advantages - our parallel engine plus bitmap index
will crush them in benchmarks when they show up with real code.
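
To make the index-only point concrete, here is a minimal sketch in C. It is
not Greenplum's implementation: the fixed-size bitmap layout and the names
bitmap_t, bitmap_set and count_both are assumptions made up for
illustration. It keeps one bitmap per distinct column value and answers a
two-predicate COUNT by ANDing the bitmaps and counting set bits, so the
table rows themselves are never read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NROWS  128            /* hypothetical table size, multiple of 64 */
#define NWORDS (NROWS / 64)

/* One bitmap per distinct column value: bit r is set when row r has it. */
typedef struct {
    uint64_t words[NWORDS];
} bitmap_t;

/* Mark a row as matching the value this bitmap stands for. */
void bitmap_set(bitmap_t *bm, size_t row)
{
    assert(row < NROWS);
    bm->words[row / 64] |= (uint64_t) 1 << (row % 64);
}

/* Portable population count: clears the lowest set bit each iteration. */
static unsigned popcount64(uint64_t w)
{
    unsigned c = 0;
    while (w) {
        w &= w - 1;
        c++;
    }
    return c;
}

/*
 * COUNT(*) WHERE a AND b, answered from the two bitmaps alone.
 * Only NWORDS words per bitmap are touched, not the rows.
 */
size_t count_both(const bitmap_t *a, const bitmap_t *b)
{
    size_t total = 0;
    for (size_t i = 0; i < NWORDS; i++)
        total += popcount64(a->words[i] & b->words[i]);
    return total;
}
```

The work is proportional to the bitmap size rather than the row width,
which is where the resemblance to a column store comes from.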

Meanwhile they're moving on to new ideas - I kid you not, "Horizontica" is
Dr. Stonebraker's new idea :-)

So - bottom line - some ideas from column store make sense, but it's not a
cure-all.

> There is also a MonetDB/X100 project, which tries to make MonetDB
> order(s) of magnitude faster by doing in-page compression in order to
> get even more performance, see:

Actually, the majority of the points made by the MonetDB team involve
decreasing the abstractions in the processing path to improve the IPC
(instructions per clock) efficiency of the executor.

We are also planning to do this by operating on data in vectors of projected
rows in the executor, which will increase the IPC by reducing I-cache misses
and improving D-cache locality. Tight loops will make a much bigger
difference when long runs of data are the target operands.
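
A rough sketch of the difference, in C. This is only an illustration of the
general vector-at-a-time idea, not our executor or MonetDB/X100 code;
VECTOR_SIZE and all function names are assumptions invented for the
example. The first function calls a predicate through a function pointer
for every row; the second works on batches with two tight loops, one that
builds a selection vector and one that aggregates over it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VECTOR_SIZE 1024   /* hypothetical batch size, small enough for L1 */

/* Row-at-a-time style: one indirect call per row. */
typedef int (*row_pred_fn)(int32_t value, int32_t bound);

int pred_less_than(int32_t value, int32_t bound)
{
    return value < bound;
}

int64_t sum_row_at_a_time(const int32_t *col, size_t n,
                          row_pred_fn pred, int32_t bound)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        if (pred(col[i], bound))
            sum += col[i];
    return sum;
}

/* Vector primitive: store positions of qualifying values in sel. */
static size_t select_less_than(const int32_t *vec, size_t n,
                               int32_t bound, uint16_t *sel)
{
    assert(n <= VECTOR_SIZE);
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        sel[k] = (uint16_t) i;      /* always written, kept only if ...   */
        k += (vec[i] < bound);      /* ... the predicate advances k       */
    }
    return k;
}

/* Vector primitive: sum the values at the selected positions. */
static int64_t sum_selected(const int32_t *vec, const uint16_t *sel, size_t k)
{
    int64_t sum = 0;
    for (size_t j = 0; j < k; j++)
        sum += vec[sel[j]];
    return sum;
}

/* Vector-at-a-time style: process the column in VECTOR_SIZE batches. */
int64_t sum_vectorized(const int32_t *col, size_t n, int32_t bound)
{
    uint16_t sel[VECTOR_SIZE];
    int64_t sum = 0;
    for (size_t off = 0; off < n; off += VECTOR_SIZE) {
        size_t len = (n - off < VECTOR_SIZE) ? n - off : VECTOR_SIZE;
        size_t k = select_less_than(col + off, len, bound, sel);
        sum += sum_selected(col + off, sel, k);
    }
    return sum;
}
```

Both functions return the same result; the batched version just spends its
time in short loops over contiguous data instead of per-row call overhead.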

- Luke
