From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Mark Dilger <hornschnorter(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Date: | 2019-01-24 01:59:50 |
Message-ID: | CAKJS1f_NLyMM7KXHt5+aK7zes-brCk0yRR9=0Us_n0fR6frNWA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, 23 Jan 2019 at 12:46, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> (Stopped in statext_mcv_build(). Need to take a break)
Continuing...
27. statext_mcv_build() could declare the int j,k variables in the
scope that they're required in.
28. "an array"
* build array of SortItems for distinct groups and counts matching items
29. No need to set isnull to false in statext_mcv_load()
30. Wondering about the reason in statext_mcv_serialize() that you're
not passing the collation to sort the array.
You have:
ssup[dim].ssup_collation = DEFAULT_COLLATION_OID;
should it not be:
ssup[dim].ssup_collation = stats[dim]->attrcollid;
?
31. uint32 should use %u, not %d:
if (mcvlist->magic != STATS_MCV_MAGIC)
elog(ERROR, "invalid MCV magic %d (expected %d)",
mcvlist->magic, STATS_MCV_MAGIC);
and
if (mcvlist->type != STATS_MCV_TYPE_BASIC)
elog(ERROR, "invalid MCV type %d (expected %d)",
mcvlist->type, STATS_MCV_TYPE_BASIC);
and
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("invalid length (%d) item array in MCVList",
mcvlist->nitems)));
I don't think %ld is the correct format for VARSIZE_ANY_EXHDR. %u or
%d seem more suited. I see that value is quite often assigned to int,
so probably can't argue much with %d.
elog(ERROR, "invalid MCV size %ld (expected %zu)",
VARSIZE_ANY_EXHDR(data), expected_size);
32. I think the format is wrong here too:
elog(ERROR, "invalid MCV size %ld (expected %ld)",
VARSIZE_ANY_EXHDR(data), expected_size);
I'd expect "invalid MCV size %d (expected %zu)"
33. How do you allocate a single chunk non-densely?
* Allocate one large chunk of memory for the intermediate data, needed
* only for deserializing the MCV list (and allocate densely to minimize
* the palloc overhead).
34. I thought I saw a few issues with pg_stats_ext_mcvlist_items() so
tried to test it:
create table ab (a int, b int);
insert into ab select x,x from generate_serieS(1,10)x;
create statistics ab_ab_stat (mcv) on a,b from ab;
analyze ab;
select pg_mcv_list_items(stxmcv) from pg_Statistic_ext where stxmcv is not null;
ERROR: cache lookup failed for type 2139062143
The issues I saw were:
You do:
appendStringInfoString(&itemValues, "{");
appendStringInfoString(&itemNulls, "{");
but never append '}' after building the string.
(can use appendStringInfoChar() BTW)
also:
if (i == 0)
{
appendStringInfoString(&itemValues, ", ");
appendStringInfoString(&itemNulls, ", ");
}
I'd have expected you to append the ", " only when i > 0.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2019-01-24 02:14:12 | Re: [HACKERS] Block level parallel vacuum |
Previous Message | Peter Geoghegan | 2019-01-24 01:44:41 | Re: Making all nbtree entries unique by having heap TIDs participate in comparisons |