Re: I'd like to learn a bit more about how indexes work

From: Mike Christensen <mike(at)kitchenpc(dot)com>
To: Dann Corbit <DCorbit(at)connx(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: I'd like to learn a bit more about how indexes work
Date: 2012-06-05 23:28:29
Message-ID: CABs1bs22RJ5faYEZchWVkKAqwsrC4U_oV_ehDTM7_4Cy8P1F4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'm aware of how "B-Trees" work, but I only understand them on the
level of traversing a single tree at a time. I'm curious how Postgres
combines multiple indexes with OR and AND clauses.

I've done some Google searches, however I can't find anything basic.
Everything I've found assumes you already have knowledge of terms such
as "hash join, aggregate join, etc".

At this point I'm not looking at learning how the optimizer works..

Mike

On Tue, Jun 5, 2012 at 3:36 PM, Dann Corbit <DCorbit(at)connx(dot)com> wrote:
> If you want to discover how B+Trees or B-Trees work, I suggest a web search.  A database like PostgreSQL is not going to use an ordinary btree for an index, but they use special trees that have page level structures, such as B-Trees, GiST trees, etc.    For PostgreSQL the list includes {IIRC} B-tree, Hash, GiST and GIN, though I am not sure it is current.  I believe that there is also a GIS extension to PostgreSQL which probably uses Octrees or Quadtrees, but that is purely a guess.
> Place this criteria into your favorite search engine, for instance:
> "B-Tree" index
>
> You can qualify it with "PostgreSQL" if you like, but I suspect you just want to know how indexes work in general with different index types.
>
> I suspect that what you really want to eventually understand is:
>
> "How does the optimizer make plans to create efficient queries" which is what is indicated in your questions below.
>
> If that is the case, then I suggest performing search queries with keywords such as:
> sql cost based optimizer
>
> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Mike Christensen
> Sent: Tuesday, June 05, 2012 3:25 PM
> To: pgsql-general(at)postgresql(dot)org
> Subject: [GENERAL] I'd like to learn a bit more about how indexes work
>
> Hi -
>
> I'm trying to increase my general knowledge about how indexes work in databases.  Though my questions are probably general and implemented in a similar way across major relational DBs, I'm also curious as to how they're implemented in Postgres specifically (mainly because I like PG, and am always interested in knowing if PG does things in some cool and interesting way).
>
> I know the basics of how binary trees work, so I understand a query such as :
>
> select * from Table where Id = 5;
>
> Provided Id has a btree index on it.  I'm curious as to how indexes are used with OR and AND clauses.
>
> Something like:
>
> select * from Table where X = 5 or y = 3;
>
> It seems to me both the index of X would be scanned and those rows would be loaded into memory, and then the index of Y would be scanned and loaded.  Then, Postgres would have to merge both sets into a set of unique rows.  Is this pretty much what's going on?  Let's ignore table stats for now.
>
> Then, something like:
>
> select * from Table where X = 5 AND y = 3;
>
> I would imagine the same thing is going on, only Postgres would find rows that appear in both sets.  I also imagine Postgres might create a hash table from the larger set, and then iterate through the smaller set looking for rows that were in that hash table.
>
> Lastly, If you had a query such as:
>
> select * from Table where X IN (1,2,3,4,5,6,7);
>
> I would imagine Postgres would parse that query as a bunch of OR clauses.  Does this mean the index for X would be scanned 7 times and merged into a set of unique results?  Though, obviously if Postgres estimated this would return the majority of the rows in the table, it would probably just ignore the index completely.
>
> Thanks!
> Mike
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org) To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David Johnston 2012-06-05 23:40:39 Re: Populate Table From Two Other Tables
Previous Message Rich Shepard 2012-06-05 23:28:20 Re: Populate Table From Two Other Tables