Re: TABLESAMPLE patch

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: TABLESAMPLE patch
Date: 2015-01-17 12:46:24
Message-ID: CAA4eK1JrGQTetQM31J5LfDeTUWE2jDUcmymgRx3MdhBP97X0qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 11, 2015 at 1:29 AM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>
>
> In second patch which implements the TABLESAMPLE itself I changed the
implementation of random generator because when I looked at the code again
I realized the old one would produce wrong results if there were multiple
TABLESAMPLE statements in same query or multiple cursors in same
transaction.
>

I have looked into this patch and would like to share my
findings with you.

1.
+static void
+set_tablesample_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte)
{
..
+ /*
+ * There is only one plan to consider but we still need to set
+ * parameters for RelOptInfo.
+ */
+ set_cheapest(rel);
}

It seems we already call set_cheapest(rel) in set_rel_pathlist()
which is a caller of set_tablesample_rel_pathlist(), so why do
we need it inside set_tablesample_rel_pathlist()?

2.
+static void
+set_tablesample_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte)
{
..
+ /* We only do sample scan if it was requested */
+ add_path(rel, (Path *) create_samplescan_path(root, rel, required_outer));
}

Do we need to add_path, if there is only one path?

3.
@@ -332,6 +334,11 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/* Foreign table */
set_foreign_pathlist(root, rel, rte);
}
+ else if (rte->tablesample != NULL)
+ {
+ /* Build sample scan on relation */
+ set_tablesample_rel_pathlist(root, rel, rte);
+ }

Have you consider to have a different RTE for this?

4.
SampleNext()
a.
Isn't it better to code SampleNext() similar to SeqNext() and
IndexNext(), which encapsulate the logic to get the tuple in
another function(tbs_next() or something like that)?

b.
Also don't we want to handle pruning of page while
scanning (heap_page_prune_opt()) and I observed
in heap scan API's after visibility check we do check
for serializable conflict (CheckForSerializableConflictOut()).
Another thing is don't we want to do anything for sync scans
for these method's as they are doing heap scan?

c.
for bernoulli method, it will first get the tupoffset with
the help of function and then apply visibility check, it seems
that could reduce the sample set way lower than expected
in case lot of tuples are not visible, shouldn't it be done in reverse
way(first visibility check, then call function to see if we want to
include it in the sample)?
I think something similar is done in acquire_sample_rows which
seems right to me.

5.

CREATE TABLE test_tablesample (id INT, name text) WITH (fillfactor=10);
INSERT INTO test_tablesample SELECT i, repeat(i::text, 200) FROM
generate_series(0, 9) s(i) ORDER BY i;

postgres=# SELECT id FROM test_tablesample TABLESAMPLE BERNOULLI (80);
id
----
1
3
4
7
8
9
(6 rows)

postgres=# SELECT id FROM test_tablesample TABLESAMPLE BERNOULLI (80);
id
----
0
1
2
3
4
5
6
7
8
9
(10 rows)

So above test yield 60% rows first time and 100% rows next time,
when the test has requested 80%.
I understand that sample percentage will result an approximate
number of rows, however I just wanted that we should check if the
implementation has any problem or not?

In this regard, I have one question related to below code:

+Datum
+tsm_bernoulli_nexttuple(PG_FUNCTION_ARGS)
+{
..
+ /* Every tuple has percent chance of being returned */
+ while (sampler_random_fract(sampler->randstate) > samplesize)
+ {
+ tupoffset++;
+
+ if (tupoffset > maxoffset)
+ break;
+ }

Why are we not considering tuple in above code
if tupoffset is less than maxoffset?

6.
One larger question about the approach used in patch, why do you
think it is better to have a new node (SampleScan/SamplePath) like
you have used in patch instead of doing it as part of Sequence Scan.
I agree that we need some changes in different parts of Sequence Scan
execution, but still sounds like a viable way. One simple thing that
occurs to me is that in ExecSeqScan(), we can use something like
SampleSeqNext instead of SeqNext to scan heap in a slightly different
way, probably doing it as part of sequence scan will have advantage that
we can use most of the existing infrastructure (sequence node path)
and have less discrepancies as mentioned in point-4.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-01-17 13:08:34 Re: moving from contrib to bin
Previous Message Andres Freund 2015-01-17 12:16:18 Re: moving from contrib to bin