Re: Abbreviated keys for text cost model fix

From:	Peter Geoghegan <pg(at)heroku(dot)com>
To:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Abbreviated keys for text cost model fix
Date:	2015-02-22 21:30:40
Message-ID:	CAM3SWZR2PDCphC+sWi9y811uYrJZopCj0PSKfafnoWHji=qckw@mail.gmail.com
Lists:	pgsql-hackers

On Sun, Feb 22, 2015 at 1:19 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> In short, this fixes all the cases except for the ASC sorted data. I
> haven't done any code review, but I think we want this.
>
> I'll use data from the i5-2500k, but it applies to the Xeon too, except
> that the Xeon results are more noisy and the speedups are not that
> significant.
>
> For the 'text' data type, and 'random' dataset, the results are these:
>
>   scale    datum   cost-model
> -------------------------------
>  100000     328%         323%
> 1000000     392%         391%
> 2000000      96%         565%
> 3000000      97%         572%
> 4000000      97%         571%
> 5000000      98%         570%
>
> The numbers are speedup vs. master, so 100% means exactly the same
> speed, 200% means twice as fast.
>
> So while with 'datum' patch this actually caused very nice speedup for
> small datasets - about 3-4x speedup up to 1M rows, for larger datasets
> we've seen small regression (~3% slower). With the cost model fix, we
> actually see a significant speedup (about 5.7x) for these cases.

Cool.

> I haven't verified whether this produces the same results, but if it
> does this is very nice.
>
> For 'DESC' dataset (i.e. data sorted in reverse order), we do get even
> better numbers, with up to 6.5x speedup on large datasets.
>
> But for 'ASC' dataset (i.e. already sorted data), we do get this:
>
>   scale    datum   cost-model
> -------------------------------
>  100000      85%          84%
> 1000000      87%          87%
> 2000000      76%          96%
> 3000000      82%          90%
> 4000000      91%          83%
> 5000000      93%          81%
>
> Ummm, not that great, I guess :-(

You should try it with the data fully sorted like this, but with one
tiny difference: The very last tuple is out of order. How does that
look?
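
For concreteness, here is a minimal sketch of the kind of input described above: keys in ascending order, except that the final one sorts before all the others. The function names and the zero-padded key format are assumptions made for illustration. They are not taken from the benchmark scripts used in this thread, and the actual test data may look quite different.

```python
def almost_sorted_keys(n):
    """Return n text keys in ascending order, except the last one.

    Hypothetical helper: keys are zero-padded decimal strings, so their
    text ordering matches their numeric ordering.
    """
    if n < 2:
        raise ValueError("need at least two keys to have one out of order")
    # Keys 1 .. n-1, already in ascending order.
    keys = [f"{i:010d}" for i in range(1, n)]
    # The very last key is the smallest, so it alone is out of order.
    keys.append(f"{0:010d}")
    return keys


def count_descents(keys):
    """Count adjacent pairs where the earlier key sorts after the later one."""
    return sum(1 for a, b in zip(keys, keys[1:]) if a > b)
```

Loading such keys into a table in this order (for example via COPY) would give a dataset with exactly one descent, located at the very end, which can then be compared against the fully sorted 'ASC' case.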

--
Peter Geoghegan

In response to

Re: Abbreviated keys for text cost model fix at 2015-02-22 21:19:33 from Tomas Vondra

Responses

Re: Abbreviated keys for text cost model fix at 2015-02-22 23:16:40 from Peter Geoghegan
Re: Abbreviated keys for text cost model fix at 2015-02-23 16:40:44 from Tomas Vondra
