RE: Index Skip Scan

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "bhushan(dot)uparkar(at)gmail(dot)com" <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: RE: Index Skip Scan
Date: 2020-04-07 20:19:08
Message-ID: 8e4710f8d4004905af454a43d72c41a5@opammb0562.comp.optiver.com
Lists: pgsql-hackers

>
> * Suspicious performance difference between different types of workload,
> mentioned by Tomas (unfortunately I had no chance yet to investigate).
>

His benchmark results most likely point to the same tuples being compared multiple times. The most likely place for these comparisons is _bt_readpage, so I suspect it is being called multiple times for the same page. Looking at your patch, I think that's indeed the case.

For example, suppose a page contains [1,2,3,4,5] and the planner completely misestimates and chooses a skip scan here. The first call to _bt_readpage already compares every tuple on the page and stores everything in the workspace, which now contains [1,2,3,4,5]. However, when a skip is done, the elements on the page (not in the workspace) are compared to find the next one. Then another _bt_readpage is done, starting at the new offnum, so we compare every tuple on the page (except 1) again, and the workspace now contains [2,3,4,5]. For the next tuple we end up with [3,4,5], and so on. So tuple 5 actually gets compared 5 times in _bt_readpage alone.
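
To make the counting concrete, here is a rough Python sketch of the behaviour described above. It is a toy model, not the patch's code: read_page and skip_scan are hypothetical names standing in for _bt_readpage and the skip loop, every value on the page is assumed to be distinct (so each skip advances by exactly one offset), and the comparisons done by the skip itself are not counted.

```python
from collections import Counter


def read_page(page, start, comparisons):
    """Toy stand-in for _bt_readpage (hypothetical, not PostgreSQL code).

    Compares every tuple from offset `start` to the end of the page and
    copies it into a fresh workspace.
    """
    workspace = []
    for value in page[start:]:
        comparisons[value] += 1  # one key comparison per tuple visited
        workspace.append(value)
    return workspace


def skip_scan(page):
    """Re-read the page after every skip, as in the example above."""
    comparisons = Counter()
    workspaces = []
    start = 0
    while start < len(page):
        workspaces.append(read_page(page, start, comparisons))
        # Assumption: all values are distinct, so the skip lands on the
        # very next offset. The skip's own comparisons are not counted.
        start += 1
    return comparisons, workspaces


comparisons, workspaces = skip_scan([1, 2, 3, 4, 5])
print(dict(comparisons))          # comparisons per tuple
print(sum(comparisons.values()))  # total comparisons in read_page
```

Under these assumptions tuple k is compared k times, so a page of n distinct tuples costs n(n+1)/2 comparisons in read_page instead of n.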

-Floris
