|From:||Mark Dilger <hornschnorter(at)gmail(dot)com>|
|To:||Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>|
|Cc:||Adrien Nayrat <adrien(dot)nayrat(at)dalibo(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: [HACKERS] PATCH: multivariate histograms and MCV lists|
|Views:||Raw Message | Whole Thread | Download mbox|
> On Nov 27, 2017, at 8:47 AM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> Attached is an updated version of the patch series, fixing the issues
> reported by Mark Dilger:
> 1) Fix fabs() issue in histogram.c.
> 2) Do not rely on extra_data being StdAnalyzeData, and instead lookup
> the LT operator explicitly. This also adds a simple regression tests to
> make sure ANALYZE on arrays works fine, but perhaps we should invent
> some simple queries too.
> 3) I've removed / clarified some of the comments mentioned by Mark.
> 4) I haven't changed how the statistics kinds are defined in relation.h,
> but I agree there should be a comment explaining how STATS_EXT_INFO_*
> relate to StatisticExtInfo.kinds.
> 5) The most significant change happened histograms. There used to be two
> structures for histograms:
> - MVHistogram - expanded (no deduplication etc.), result of histogram
> build and never used for estimation
> - MVSerializedHistogram - deduplicated to save space, produced from
> MVHistogram before storing in pg_statistic_ext and never used for
> So there wasn't really any reason to expose the "non-serialized" version
> outside histogram.c. It was just confusing and unnecessary, so I've
> moved MVHistogram to histogram.c (and renamed it to MVHistogramBuild),
> and renamed MVSerializedHistogram. And same for the MVBucket stuff.
> So now we only deal with MVHistogram everywhere, except in histogram.c.
> 6) I've also made MVHistogram to include a varlena header directly (and
> be packed as a bytea), which allows us to store it without having to
> call any serialization functions).
> I guess if we should do (5) and (6) for the MCV lists too, it seems more
> convenient than the current approach. And perhaps even for the
> statistics added to 9.6 (it does not change the storage format).
I tested your latest patches on my mac os x laptop and got one test
failure due to the results of 'explain' coming up differently. For the record,
I followed these steps:
# this got my directory up to 8526bcb2df76d5171b4f4d6dc7a97560a73a5eff with no local changes
patch -p 1 < ../0001-multivariate-MCV-lists.patch
patch -p 1 < ../0002-multivariate-histograms.patch
./configure --prefix=/Users/mark/master/testinstall --enable-cassert --enable-tap-tests --enable-depend && make -j4 && make check-world
|Next Message||Mark Dilger||2017-12-19 19:38:08||Re: WIP: BRIN multi-range indexes|
|Previous Message||David Fetter||2017-12-19 18:59:18||Re: WIP: a way forward on bootstrap data|