Re: Specifying attribute slot for storing/reading statistics

From: Esteban Zimanyi <ezimanyi(at)ulb(dot)ac(dot)be>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Mahmoud Sakr <m_attia_sakr(at)yahoo(dot)com>, mohamed sayed <mohamed_bakli(at)aun(dot)edu(dot)eg>
Subject: Re: Specifying attribute slot for storing/reading statistics
Date: 2019-09-12 09:22:13
Message-ID: CAPqRbE4BYvTY-JdytZCrJy3DcNzSTy3qukaTJzmhHYnN0KGQNA@mail.gmail.com
Lists: pgsql-hackers

>
> So these are 4 different data types (or classes of data types) that you
> introduce in your extension? Or is that just a conceptual view and it's
> stored in some other way (e.g. normalized in some way)?
>

At the SQL level these 4 durations are not distinguishable. For example, for
a tfloat (temporal float) we can have

select tfloat '1@2000-01-01' -- Instant duration
select tfloat '{1@2000-01-01, 2@2000-01-02, 1@2000-01-03}' -- Instant set duration
select tfloat '[1@2000-01-01, 2@2000-01-02, 1@2000-01-03)' -- Sequence duration, left-inclusive and right-exclusive bounds
select tfloat '{[1@2000-01-01, 2@2000-01-02, 1@2000-01-03], [1@2000-01-04, 1@2000-01-05]}' -- Sequence set duration

Nevertheless, it is possible to restrict a column to a specific duration
with a typmod specifier, as in

create table test ( ..., measure tfloat(Instant), ...) -- only Instant durations accepted

At the C level these 4 durations are distinguished and implemented in
something equivalent to a template abstract class Temporal with four
subclasses TemporalInst, TemporalI, TemporalSeq, and TemporalS. Indeed, the
algorithms for manipulating these 4 durations are completely different.
They are called template classes since they keep the Oid of the base type
(float for tfloat or geometry for tgeompoint) in the same way arrays or
ranges do.
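
To make the description above concrete, here is a minimal sketch of how such
a "class with four subclasses" can be emulated in plain C: a common header
carries a duration tag plus the Oid of the base type, and functions dispatch
on the tag. All names below (DurationKind, TemporalHeader,
temporal_duration_name) are hypothetical and chosen for illustration; they
are not the actual MobilityDB definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; NOT the MobilityDB definitions. */
typedef unsigned int Oid;        /* stand-in for PostgreSQL's Oid */

typedef enum
{
    DURATION_INSTANT,
    DURATION_INSTANT_SET,
    DURATION_SEQUENCE,
    DURATION_SEQUENCE_SET
} DurationKind;

/* Common header shared by the four "subclasses". */
typedef struct
{
    DurationKind duration;       /* which of the four layouts follows */
    Oid          valuetypid;     /* Oid of the base type, as arrays and ranges keep */
} TemporalHeader;

/* Dispatch on the duration tag, much as a virtual call would. */
static const char *
temporal_duration_name(const TemporalHeader *temp)
{
    assert(temp != NULL);
    switch (temp->duration)
    {
        case DURATION_INSTANT:      return "Instant";
        case DURATION_INSTANT_SET:  return "InstantSet";
        case DURATION_SEQUENCE:     return "Sequence";
        case DURATION_SEQUENCE_SET: return "SequenceSet";
    }
    return "Unknown";
}
```

In this sketch a value built with DURATION_SEQUENCE and the float8 Oid (701)
would be routed to the sequence-specific code path, while the SQL-level type
stays the same.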

For more information, please refer to the manual on GitHub:
https://github.com/ULB-CoDE-WIT/MobilityDB/

> I don't think we're strongly against changing the code to allow this, as
> long as it does not break existing extensions/code (unnecessarily).
>
> >If you want I can prepare a PR in order to understand the implications of
> >these changes. Please let me know.
> >
>
> I think having an actual patch to look at would be helpful.
>

I am preparing a first patch for the files selfuncs.h and selfuncs.c and
thus for instant duration selectivity. It basically
1) Moves some prototypes of the static functions from the .c to the .h file
so that the functions are exported.
2) Passes the operator from the top level functions to the inner functions
such as mcv_selectivity or ineq_histogram_selectivity.

This allows me to call the functions twice, once for the value component
and once for the time component, e.g. as follows.

else if (cachedOp == CONTAINED_OP || cachedOp == OVERLAPS_OP)
{
    /* Either dimension may be missing, so default both selectivities
     * to 1.0, which leaves the product below unchanged */
    double selec_value = 1.0, selec_time = 1.0;

    /* Selectivity for the value dimension */
    if (MOBDB_FLAGS_GET_X(box->flags))
    {
        /* Fraction of rows below xmin */
        operator = oper_oid(LT_OP, valuetypid, valuetypid);
        selec_value = scalarineqsel(root, operator, false, false, vardata,
            Float8GetDatum(box->xmin), valuetypid);
        /* Plus fraction of rows above xmax */
        operator = oper_oid(GT_OP, valuetypid, valuetypid);
        selec_value += scalarineqsel(root, operator, true, false, vardata,
            Float8GetDatum(box->xmax), valuetypid);
        /* What remains lies within [xmin, xmax] */
        selec_value = 1 - selec_value;
    }
    /* Selectivity for the time dimension */
    if (MOBDB_FLAGS_GET_T(box->flags))
    {
        operator = oper_oid(LT_OP, T_TIMESTAMPTZ, T_TIMESTAMPTZ);
        selec_time = scalarineqsel(root, operator, false, false, vardata,
            TimestampTzGetDatum(box->tmin), TIMESTAMPTZOID);
        operator = oper_oid(GT_OP, T_TIMESTAMPTZ, T_TIMESTAMPTZ);
        selec_time += scalarineqsel(root, operator, true, false, vardata,
            TimestampTzGetDatum(box->tmax), TIMESTAMPTZOID);
        selec_time = 1 - selec_time;
    }
    selec = selec_value * selec_time;
}
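
The arithmetic in the fragment above can be checked in isolation. The
following is a minimal standalone sketch, assuming the per-bound fractions
have already been estimated (in the real code they come from scalarineqsel).
The helper names range_selectivity and box_selectivity are hypothetical, and
the clamping to [0, 1] is an addition of this sketch, not something the
fragment above does. Multiplying the two results treats the value and time
dimensions as independent.

```c
#include <assert.h>

/*
 * Hypothetical helper: fraction of rows inside [min, max], given the
 * estimated fractions below min and above max. The clamp to [0, 1] is
 * an assumption of this sketch.
 */
static double
range_selectivity(double frac_below, double frac_above)
{
    double selec = 1.0 - (frac_below + frac_above);

    if (selec < 0.0)
        selec = 0.0;
    if (selec > 1.0)
        selec = 1.0;
    return selec;
}

/*
 * Hypothetical helper: combine the value and time dimensions of a box.
 * A missing dimension keeps its default of 1.0 and so does not affect
 * the product.
 */
static double
box_selectivity(int has_value, double v_below, double v_above,
                int has_time, double t_below, double t_above)
{
    double selec_value = 1.0, selec_time = 1.0;

    if (has_value)
        selec_value = range_selectivity(v_below, v_above);
    if (has_time)
        selec_time = range_selectivity(t_below, t_above);
    return selec_value * selec_time;
}
```

For instance, with a quarter of the rows below xmin, a quarter above xmax,
half of the rows before tmin and none after tmax, the two dimensions each
contribute 0.5 and the combined estimate is 0.25.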

Regards

Esteban
