Re: tuplesort test coverage

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: tuplesort test coverage
Date: 2019-10-25 11:30:26
Message-ID: CAH2-WznF4s_DZSiV+o7vqDptHb9RRycYUEMoBBouVx=0vqXqMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 24, 2019 at 7:10 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I really don't think it's ok to have as many abbrev abort related paths
> without any coverage - the relevant code isn't that trivial. And
> something like amcheck really doesn't strike me as sufficient. For one,
> it doesn't provide any coverage either. For another, plenty sorts don't
> end up in a form that amcheck sees.

I agree.

> Tests aren't just there to verify that the current behaviour isn't
> broken, they're also there to allow to develop with some confidence. And
> I don't think tuplesort as is really allows that, and e.g. abbreviated
> keys made that substantially worse. That happens, but I think it'd be
> good if you could help improving the situation.

I would like to improve this. I am mostly just pointing out that there
has been resistance to this historically. I am in favor of
mechanically increasing test coverage of tuplesort.c along the lines
you describe. I'm just a bit concerned that Tom or others may see it
differently.

> E.g.
> SELECT * FROM (SELECT ('00000000-0000-0000-0000-'||to_char(g.i, '000000000000FM'))::uuid uuid FROM generate_series(15000, 0, -1) g(i)) d ORDER BY uuid
> reliably triggers abbreviated keys, and it looks to me like that should
> be portable. With a few tweaks it'd be fairly easy to use that to
> provide some OK coverage for most the abbrev key cases.

I agree.

> Yes, that's kind of my point? Either that patch reduced coverage, or it
> created dead code. Neither is good.

I agree.

> > mergeruns() doesn't use abbreviated keys -- this code disables their
> > use in the standard way.
>
> Well, then reformulate the point that we should have coverage for
> mergeruns() when initially abbreviated keys were set up.

That doesn't seem essentially, but I'm okay with it.

> > I had to convince Tom to get the coverage of external sorts we have
> > now. Apparently there are buildfarm animals that are very sensitive to
> > that cost, that could have substantially increased test runtimes were
> > we to do more. Perhaps this could be revisited.
>
> Hm. I'm a bit confused. Isn't all that's required to set a tiny amount
> of work_mem? Then it's easy to trigger many passes without a lot of IO?

Yes, but Tom felt that this might not be good enough when this was
discussed in 2016. However, I seem to recall that he was pleasantly
surprised at how small the overhead turned out to be.

It's hard for me to test how much overhead this will have on a machine
with horribly slow I/O. Though I just bought a new Raspberry Pi, and
could test on that when I get back home from my trip to Europe -- it
uses an SD card, which is pretty slow.

> I'm not saying that tuplesort has gotten worse or anything. Just that
> there's been too much development without adding tests.

I agree.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2019-10-25 11:37:38 Re: tuplesort test coverage
Previous Message amul sul 2019-10-25 09:59:03 Re: [HACKERS] advanced partition matching algorithm for partition-wise join