Skip site navigation (1) Skip section navigation (2)

Combining index scans

From: Victor Yegorov <viy(at)mits(dot)lv>
To: Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Combining index scans
Date: 2005-01-31 19:15:59
Message-ID: 20050131191559.GC5500@mits.lv (view raw or flat)
Thread:
Lists: pgsql-hackers
Hello again.

This time I'd like to speak about in-memory bitmap to combine index scan
results. I know, this code should use minimal amount of memory, so I really
want to hear any possible pros and cons.

Below, pseudocode is given. After running it, we'll have a list of CTID
pointers, and one bitmap for each clause, that planner marked for index scan.

Consider query with m clauses, each marked for index scan. Let:
- j   denotes number of the clause, 0 <= j < m, m is total number of clauses;
- R   total number of CTID returned by all calls to index scans so far (for
      any of the m clauses), i.e. CTIDs list cardinality;
- p   CTID position in the list, 0 <= p < R;
- c   CTID returned by index scan API.

for (j = 0; j < m; j++) {
  if (no_bitmap_for_caluse(j)) {
    /* create bitmap of length R */
    create_bitmap(j, R);
    for (i = 0; i < R; i++)
      /* set i-th bit of j-th bitmap to 0 */
      set_bitmap_bit(j, i, 0);
  }
  c = get_ctid_from_index_scan_api();

  /* search for CTID in the list */
  p = search_ctid(c);

  /*
   * p will be either -1 if CTIDs not seen so far
   * OR it'll be in the range 0 <= p < R
   */
  if (p == -1) {
    /* this will also increase R (total number of ctids) */
    add_ctid_to_the_list(c);
    
    /* position of just-added CTID entry in list */
    p = R;

    /*
     * update all bitmaps created so far,
     * setting bit for just-added entry to 0
     */
    for (i = 0; i < j; i++) {
      set_bitmap_bit(i, p, 0);
    }
  }

  /* update current bitmap, set bit p to 1 */
  set_bitmap_bit(j, p, 1);
}
  
After that, one could do AND/OR of bitmaps and return the list of CTIDs that
matches final results.

For the case of 3 clauses, we could have such data in memoy:
idx | CTID      idx | clause1 | clause2 | clause3
  1 | 1234        1 |       1 |       0 |       0
  2 | 3456        2 |       0 |       1 |       0
  3 | 5678        3 |       0 |       0 |       1

As you can see, there were 3 scans, each returned exactly 1 CTID.
If there were at list one AND in the WHERE clause, then no rows will be
returned.

After inserting some more data to the table, we could have such more entry in
our in-memory bitmaps:

idx | CTID      idx | clause1 | clause2 | clause3
 ...             ...
  3 | 5678        3 |       0 |       1 |       1

So, in case of AND present in the WHERE clause, CTID 5678 would have been
returned.
  

That's it.
Waiting for your feedback.


-- 

Victor Y. Yegorov

pgsql-hackers by date

Next:From: Vishal Kashyap @ [SaiHertz]Date: 2005-01-31 19:25:55
Subject: Re: Last ID Problem
Previous:From: operationsengineer1Date: 2005-01-31 19:13:58
Subject: Last ID Problem

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group