Re: [HACKERS] PATCH: multivariate histograms and MCV lists

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Adrien Nayrat <adrien(dot)nayrat(at)dalibo(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Date: 2017-11-25 20:23:17
Message-ID: 5F255F0E-3F31-4753-87D7-3C4768A2B967@gmail.com
Lists: pgsql-hackers


> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> Hi,
>
> Attached is an updated version of the patch, adopting the psql describe
> changes introduced by 471d55859c11b.
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> <0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>

Hello Tomas,

After applying both your patches, I get a warning:

histogram.c:1284:10: warning: taking the absolute value of unsigned type 'uint32' (aka 'unsigned int') has no effect [-Wabsolute-value]
    delta = fabs(data->numrows);
            ^
histogram.c:1284:10: note: remove the call to 'fabs' since unsigned values cannot be negative
    delta = fabs(data->numrows);
            ^~~~
1 warning generated.
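
For illustration, a minimal sketch of how that initialization could be written without the call to fabs. The struct and the function name are hypothetical stand-ins, not taken from the patch; the only assumption is that numrows is an unsigned 32-bit count, as the warning text indicates.

```c
#include <stdint.h>

/* Hypothetical stand-in for the patch's bucket data; only numrows matters here. */
typedef struct
{
    uint32_t    numrows;
} bucket_data;

/*
 * An unsigned count can never be negative, so fabs() adds nothing.
 * A plain conversion to double yields the same starting value for delta.
 */
static double
initial_delta(const bucket_data *data)
{
    return (double) data->numrows;
}
```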

Looking closer at this section, there is some odd integer vs. floating point arithmetic happening
that is not necessarily wrong, but might be needlessly inefficient:

    delta = fabs(data->numrows);
    split_value = values[0].value;

    for (i = 1; i < data->numrows; i++)
    {
        if (values[i].value != values[i - 1].value)
        {
            /* are we closer to splitting the bucket in half? */
            if (fabs(i - data->numrows / 2.0) < delta)
            {
                /* let's assume we'll use this value for the split */
                split_value = values[i].value;
                delta = fabs(i - data->numrows / 2.0);
                nrows = i;
            }
        }
    }

I'm not sure the compiler will be able to optimize out the recomputation of data->numrows / 2.0
each time through the loop, since the compiler might not be able to prove to itself that data->numrows
does not get changed. Perhaps you should compute it just once prior to entering the outer loop,
store it in a variable of integer type, round 'delta' off and store in an integer, and do integer comparisons
within the loop? Just a thought....
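
To make the idea concrete, here is a rough sketch of an integer-only version of the loop. It is not the patch's code: sort_item and choose_split are hypothetical names, the values are simplified to plain ints, and numrows is assumed to be at least 1. Instead of rounding, the sketch compares doubled distances, |2*i - numrows|, which orders candidates exactly as |i - numrows / 2.0| does without losing the half when numrows is odd.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the patch's sorted value array. */
typedef struct
{
    int         value;
} sort_item;

/*
 * Find the boundary between distinct values that lies closest to the middle
 * of the bucket. Returns the index of that boundary (0 if all values are
 * equal) and stores the value found there in *split_value.
 *
 * Assumes values[] is sorted and numrows >= 1.
 */
static uint32_t
choose_split(const sort_item *values, uint32_t numrows, int *split_value)
{
    int64_t     n = numrows;    /* read once, before the loop */
    int64_t     delta2 = 2 * n; /* twice the initial delta of the original */
    uint32_t    nrows = 0;
    uint32_t    i;

    *split_value = values[0].value;

    for (i = 1; i < numrows; i++)
    {
        if (values[i].value != values[i - 1].value)
        {
            /* twice the distance from the midpoint, in exact integer math */
            int64_t     dist2 = 2 * (int64_t) i - n;

            if (dist2 < 0)
                dist2 = -dist2;

            /* are we closer to splitting the bucket in half? */
            if (dist2 < delta2)
            {
                *split_value = values[i].value;
                delta2 = dist2;
                nrows = i;
            }
        }
    }

    return nrows;
}
```

With the sorted values 1, 1, 2, 3, 3, 3 this picks the boundary at index 3 (value 3), the same choice the floating point comparison makes. Whether it is measurably faster would need to be checked against the real code.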

mark
