Re: Zedstore - compressed in-core columnar storage

From: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, DEV_OPS <devops(at)ww-it(dot)cn>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2019-08-27 00:33:00
Message-ID: CALfoeiu=wvjVRYm_=d=_sUVCdwvZhknuZBug4fUGuUGqqQGfZg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 26, 2019 at 5:36 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
wrote:

> Thanks Ashwin and Heikki for your responses. I've one more query here,
>
> If BTree index is created on a zedstore table, the t_tid field of
> Index tuple contains the physical tid that is not actually pointing to
> the data block instead it contains something from which the logical
> tid can be derived. So, when IndexScan is performed on a zedstore
> table, it fetches the physical tid from the index page and derives the
> logical tid out of it and then retrieves the data corresponding to
> this logical tid from the zedstore table. For that, it kind of
> performs SeqScan on the zedstore table for the given tid.

Nope, it won't perform a seqscan. Zedstore is itself laid out as a btree
with the logical TID as its key, so it can quickly find the page a logical
TID belongs to and access only that page. That's one of the rationales for
laying things out in btree fashion: it connects the logical and physical
worlds easily, without keeping any external mapping.
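
To illustrate why no seqscan is needed, here is a rough sketch of a keyed
page lookup. It is not zedstore's actual code: the PAGE_LOW_KEYS array and
find_leaf_page() are hypothetical names, and a sorted list of per-page low
keys stands in for the real btree descent.

```python
from bisect import bisect_right

# Hypothetical layout: each leaf page covers a contiguous range of logical
# TIDs, and PAGE_LOW_KEYS[i] is the lowest TID stored on leaf page i.
PAGE_LOW_KEYS = [1, 1000, 2000, 3000]


def find_leaf_page(tid):
    """Return the index of the only leaf page that can hold `tid`."""
    if tid < PAGE_LOW_KEYS[0]:
        raise KeyError(tid)
    # Binary search over the page boundaries: O(log n) comparisons,
    # and no other page has to be read.
    return bisect_right(PAGE_LOW_KEYS, tid) - 1
```

The point is only that the logical TID is the search key, so locating the
page is a key lookup rather than a walk over every page.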

> AFAIU, the following user level query on zedstore table
>
> select * from zed_tab where a = 3;
>
> gets internally converted to
>
> select * from zed_tab where tid = 3; -- assuming that index is created
> on column 'a' and the logical tid associated with a = 3 is 3.
>

So, for this it will first access only the TID btree and find the leaf page
with tid=3. It then performs the visibility checks for the tuple and, only
if the tuple is visible, fetches all the columns for that TID, again using
the btrees for those columns to fetch only the leaf page for that logical
TID.
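
The order of operations can be sketched roughly as below. This is a toy
model under stated assumptions, not the real implementation: plain dicts
keyed by logical TID stand in for the TID btree and the per-column btrees,
and tid_tree, column_trees and fetch_by_tid() are made-up names.

```python
# Toy stand-ins for the TID tree and the per-column trees. Plain dicts
# keyed by logical TID replace the real btrees (an assumption for brevity).
tid_tree = {3: {"visible": True}, 4: {"visible": False}}
column_trees = {
    "a": {3: 3, 4: 7},
    "b": {3: "foo", 4: "bar"},
}


def fetch_by_tid(tid, columns):
    """Return the requested columns for `tid`, or None if it is not visible."""
    # Step 1: look the TID up in the TID tree only.
    entry = tid_tree.get(tid)
    # Step 2: visibility check; column trees are untouched if it fails.
    if entry is None or not entry["visible"]:
        return None
    # Step 3: one keyed lookup per requested column tree.
    return {col: column_trees[col][tid] for col in columns}
```

With this toy data, fetch_by_tid(3, ["a", "b"]) returns both column values,
while fetch_by_tid(4, ["a"]) returns None without reading any column tree.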

Hope that helps to clarify the confusion.
