From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: TABLESAMPLE patch is really in pretty sad shape
Date: 2015-07-11 20:28:40
Lists: pgsql-hackers

The two contrib modules this patch added are nowhere near fit for public
consumption. They cannot clean up after themselves when dropped:

regression=# create extension tsm_system_rows;
regression=# create table big as select i, random() as x from generate_series(1,1000000) i;
SELECT 1000000
regression=# create view v1 as select * from big tablesample system_rows(10);
regression=# drop extension tsm_system_rows;

The view is still there, but is badly broken:

regression=# select * from v1;
ERROR: cache lookup failed for function 46379

Potentially this is a security issue, since a malicious user could
probably manage to create a Trojan horse function having the now-vacated
OID, whereupon use of the view would invoke that function.

Worse still, the broken pg_tablesample_method row is still there:

regression=# create extension tsm_system_rows;
ERROR: duplicate key value violates unique constraint "pg_tablesample_method_name_index"
DETAIL: Key (tsmname)=(system_rows) already exists.
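
For DROP EXTENSION to remove that row, the catalog entry would need an
extension-membership dependency ('e') recorded in pg_depend, which a raw
INSERT into pg_tablesample_method obviously never creates. For
illustration only, this is roughly the bookkeeping the extension script
is failing to do (and writing pg_depend by hand from SQL is itself not a
supportable practice, which rather underscores the point):

```sql
-- Illustration only: the 'e' (extension) dependency that DROP EXTENSION
-- relies on to clean up member objects, which a raw catalog INSERT
-- never records.
INSERT INTO pg_depend
    (classid, objid, objsubid, refclassid, refobjid, refobjsubid, deptype)
SELECT 'pg_tablesample_method'::regclass, m.oid, 0,
       'pg_extension'::regclass, e.oid, 0, 'e'
FROM pg_tablesample_method m, pg_extension e
WHERE m.tsmname = 'system_rows' AND e.extname = 'tsm_system_rows';
```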

And even if we fixed that, these modules will not survive a pg_upgrade
cycle, because pg_upgrade has no idea that it would need to create a
pg_tablesample_method row (remember that we do *not* replay the extension
script during binary upgrade). Raw inserts into system catalogs just
aren't a sane thing to do in extensions.

Some of the risks here come from what seems like a fundamentally
wrong decision to copy all of the info about a tablesample method out
of the pg_tablesample_method catalog *at parse time* and store it
permanently in the query parse tree. This makes any sort of "alter
tablesample method" DDL operation impossible in principle, because
any views/rules referencing the method have already copied the data.
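The sane alternative is to store nothing but the method's OID in the
parse tree and re-read pg_tablesample_method at plan or execution time,
so that catalog changes are seen by existing views. A rough sketch of
what such a parse node might look like (field names here are
hypothetical, not the patch's actual node layout):

```c
typedef unsigned int Oid;   /* stand-in for PostgreSQL's Oid type */

/*
 * Hypothetical parse-node layout: keep only the OID of the sampling
 * method's pg_tablesample_method row plus the user-supplied argument
 * expressions.  Handler-function OIDs, cost info, etc. get looked up
 * from the catalog at plan time, so any later "alter tablesample
 * method" DDL would be visible to views that reference the method.
 */
typedef struct TableSampleClause
{
    Oid   tsmid;        /* OID of the sampling method's catalog row */
    int   nargs;        /* number of argument expressions */
    void *args;         /* List * of bare argument expressions */
    void *repeatable;   /* REPEATABLE expression, or NULL */
} TableSampleClause;
```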

On top of that, I find the basic implementation design rather dubious,
because it supposes that the tablesample filtering step must always
come first; the moment you say TABLESAMPLE you can kiss goodbye the
idea that the query will use any indexes. For example:

d2=# create table big as select i, random() as x from generate_series(1,10000000) i;
SELECT 10000000
d2=# create index on big(i);
d2=# analyze big;
d2=# explain analyze select * from big where i between 100 and 200;
Index Scan using big_i_idx on big (cost=0.43..10.18 rows=87 width=12) (actual time=0.022..0.088 rows=101 loops=1)
Index Cond: ((i >= 100) AND (i <= 200))
Planning time: 0.495 ms
Execution time: 0.141 ms
(4 rows)

d2=# explain analyze select * from big tablesample bernoulli(10) where i between 100 and 200;
Sample Scan (bernoulli) on big (cost=0.00..54055.13 rows=9 width=12) (actual time=0.028..970.051 rows=13 loops=1)
Filter: ((i >= 100) AND (i <= 200))
Rows Removed by Filter: 999066
Planning time: 0.182 ms
Execution time: 970.093 ms
(5 rows)

Now, maybe I don't understand the use-case for this feature, but I should
think it's meant for dealing with tables that are so big that you can't
afford to scan all the data. So, OK, a samplescan is hopefully cheaper
than a pure seqscan, but that doesn't mean that fetching 999079 rows and
discarding 999066 of them is a good plan design. There needs to be an
operational mode whereby we can use an index and do random sampling of
the TIDs the index returns. I do not insist that that has to appear in
version one of the feature --- but I am troubled by the fact that, by
exposing an oversimplified API for use by external modules, this patch is
doubling down on the assumption that no such operational mode will ever
need to be implemented.
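
For what it's worth, you can get the index-first behavior today by
expressing the sampling as an ordinary WHERE condition on random()
rather than TABLESAMPLE; the planner remains free to use the index for
the range qual and apply the sampling as a plain filter. (This gives
uniform per-row sampling, comparable to BERNOULLI, not the block-level
SYSTEM behavior.) Against the same table as above:

```sql
-- The index scan fetches only the ~100 qualifying rows; random() then
-- keeps roughly 10% of those, instead of sampling all 10M rows first
-- and filtering afterward.
EXPLAIN ANALYZE
SELECT * FROM big
WHERE i BETWEEN 100 AND 200
  AND random() < 0.10;
```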

There are a whole lot of lesser sins, such as documentation that was
clearly never copy-edited by a native speaker of English, badly designed
planner APIs (Paths really ought not have rowcounts different from the
underlying RelOptInfo), potential core dumps due to dereferencing values
that could be null (the guards for null values are in the wrong places
entirely), etc etc.

While there's nothing here that couldn't be fixed by nuking the contrib
modules and putting a week or two of concentrated work into fixing the
core code, I for one certainly don't have time to put that kind of effort
into TABLESAMPLE right now. Nor do I really have the interest; I find
this feature of pretty dubious value.

What are we going to do about this?

regards, tom lane

