Re: Including Snapshot Info with Indexes

From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Trevor Talbot" <quension(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Including Snapshot Info with Indexes
Date: 2007-10-20 03:54:07
Message-ID: 9362e74e0710192054s666b6907l5227e96247f6ac7b@mail.gmail.com
Lists: pgsql-hackers pgsql-patches

Hi,
I think I have an initial implementation. It has some bugs, and I am working
on fixing them. To show the advantages, I want to display the number of
logical I/Os. To do that, I tried enabling the log_statement option in
postgresql.conf, but it shows only the physical reads. What I wanted was a
logical read count (the number of ReadBuffer calls, which is stored in the
ReadBufferCount variable). So I have added this statistic to bufmgr.c (the
function is BufferUsage, I suppose) so that it shows both logical reads and
physical reads. Is this an acceptable change?
I think a logical read count would be helpful even for SQL tuning: if
someone tunes SQL on a test system, the pages might already be cached, and he
would not know how much I/O the SQL is potentially capable of. Maybe we
could also add a statistic showing how many of those ReadBuffer calls hit
pinned buffers.
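
To make the distinction concrete, here is a minimal sketch of the two
counters. It is a toy LRU page cache in Python, not PostgreSQL's buffer
manager; ToyBufferCache, read_buffer and run_query are hypothetical names
used only for illustration.

```python
from collections import OrderedDict


class ToyBufferCache:
    """Tiny LRU page cache that counts logical and physical reads.

    Illustration only: this is not how bufmgr.c is implemented.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page_id -> page contents
        self.logical_reads = 0       # incremented on every read_buffer() call
        self.physical_reads = 0      # incremented only on a cache miss

    def read_buffer(self, page_id):
        self.logical_reads += 1
        if page_id in self.pages:
            # Cache hit: refresh the LRU position, no simulated disk access.
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        # Cache miss: simulate fetching the page from disk.
        self.physical_reads += 1
        page = f"contents of page {page_id}"
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        return page


def run_query(cache, page_ids):
    """Touch each page once and return (logical, physical) reads for this run."""
    logical_before = cache.logical_reads
    physical_before = cache.physical_reads
    for page_id in page_ids:
        cache.read_buffer(page_id)
    return (cache.logical_reads - logical_before,
            cache.physical_reads - physical_before)
```

Running the same page list twice against a cache that is large enough
shows the point: the second run reports zero physical reads, while the
logical read count is the same in both runs and still reflects the work
the query does.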

Looking forward to your comments.

Thanks,
Gokul.

On 10/14/07, Gokulakannan Somasundaram <gokul007(at)gmail(dot)com> wrote:
>
>
>
> On 10/14/07, Trevor Talbot <quension(at)gmail(dot)com> wrote:
> >
> > On 10/14/07, Gokulakannan Somasundaram <gokul007(at)gmail(dot)com> wrote:
> >
> > > http://www.databasecolumn.com/2007/09/one-size-fits-all.html
> >
> > > > > > The Vertica database (Monet is an open source version with the
> > > > > > same principle) makes use of the very same principle. Use more
> > > > > > disk space, since it is less costly, and optimize the data
> > > > > > warehousing.
> >
> > > What I meant there was that it has duplicated storage of certain
> > > columns of the table. A table with more than one projection always
> > > needs more space than a table with just one projection. By doing this
> > > they reduce the number of disk operations. If they are duplicating
> > > columns of data to avoid reading unnecessary information, we are
> > > duplicating the snapshot information to avoid going to the table.
> >
> > Was this about Vertica or MonetDB? I saw that article a while ago,
> > and I didn't see anything that suggested Vertica duplicated data, just
> > that it organized it differently on disk. What are you seeing as
> > being duplicated?
>
>
> Hi Trevor,
> This is a good paper to read about the basics of column-oriented
> databases:
> http://db.lcs.mit.edu/projects/cstore/vldb.pdf
> If you go to Section 2 (Data Model), it shows the data model with a sample
> EMP table.
>
> The example shows that the EMP table contains four columns: Name, Age,
> Dept, Salary.
> From this table, projections are formed (the paper shows the creation of
> four projections for Example 1):
> EMP1 (name, age)
> EMP2 (dept, age, DEPT.floor)
> EMP3 (name, salary)
> DEPT1(dname, floor)
>
> As you can see, the same column information gets duplicated in different
> projections.
> The advantage is that a query on name and age need not scan the other
> columns. But the storage requirements go up, since there is redundancy.
> As you may know, increasing data redundancy helps selects at the cost of
> inserts, updates and deletes.
>
> This is what I was trying to say.
>
> Thanks,
> Gokul.
>
>
>
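
The duplication described in the quoted message can be sketched in a few
lines of Python. The schema and projection names follow the example quoted
above; the row values are made up for illustration, project and
stored_values are hypothetical helpers, and the sort orders that the paper
attaches to projections are ignored here.

```python
# Hypothetical sample rows for the EMP and DEPT tables (values are invented).
EMP = [
    {"name": "Bob", "age": 25, "dept": "Math", "salary": 10000},
    {"name": "Bill", "age": 27, "dept": "EE", "salary": 50000},
    {"name": "Jill", "age": 24, "dept": "Biology", "salary": 80000},
]
DEPT = [
    {"dname": "Math", "floor": 1},
    {"dname": "EE", "floor": 2},
    {"dname": "Biology", "floor": 3},
]


def project(rows, columns):
    """Return one tuple per row holding only the requested columns."""
    return [tuple(row[c] for c in columns) for row in rows]


def stored_values(tables):
    """Count the individual column values stored across the given tables."""
    return sum(len(row) for rows in tables for row in rows)


# EMP2 carries DEPT.floor, so attach the floor to each employee row first.
floor_of = {d["dname"]: d["floor"] for d in DEPT}
emp_with_floor = [dict(e, floor=floor_of[e["dept"]]) for e in EMP]

projections = {
    "EMP1": project(EMP, ["name", "age"]),
    "EMP2": project(emp_with_floor, ["dept", "age", "floor"]),
    "EMP3": project(EMP, ["name", "salary"]),
    "DEPT1": project(DEPT, ["dname", "floor"]),
}

# A query on name and age reads EMP1 only; dept and salary are never touched.
names_and_ages = projections["EMP1"]

base_cells = stored_values([
    project(EMP, ["name", "age", "dept", "salary"]),
    project(DEPT, ["dname", "floor"]),
])
projection_cells = stored_values(projections.values())
```

With these three-row tables, the base tables hold 18 values and the four
projections hold 27, because name, age and floor are each stored more than
once. That extra storage is the price of letting each query read only the
columns it needs.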
