Re: texteq/byteaeq: avoid detoast [REVIEW]

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: texteq/byteaeq: avoid detoast [REVIEW]
Date: 2011-01-16 21:07:13
Message-ID: AANLkTimBy4QCXodj7XHqQhEqCqWZhme=WirzNa1vAEbF@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

I looked on this patch too.

It's good idea.

I think, so we can have a function or macro that compare a varlena
sizes. Some like

Datum texteq(..)
{
if (!datumsHasSameLength(PG_GETARG_DATUM(0), PG_GETARG_DATUM(1))
PG_RETURN_FALSE();

... actual code ..
}

Regards

Pavel Stehule
2011/1/16 Andy Colson <andy(at)squeakycode(dot)net>:
> This is a review of:
> https://commitfest.postgresql.org/action/patch_view?id=468
>
> Purpose:
> ========
> Equal and not-equal _may_ be quickly determined if their lengths are
> different.   This _may_ be a huge speed up if we dont have to detoat.
>
>
> The Patch:
> ==========
> I was able to read and understand the patch, its a simple change and looked
> correct to me (a non PG hacker).
> It applies clean to git head, compiles and runs fine with debug enabled.
>
> make check passes
>
>
> Usability:
> ==========
> I used _may_ above.  The benchmark included with the patch, showing huge
> speedups, is really contrived.  It uses a where clause with a thousand
> character constant:  (where c =
> 'long...long...long...long...ConstantText...etc').  In my opinion this is
> very uncommon (the author does note this is a "best case").  If you have a
> field large enough to be toasted you are not going to be using that to
> search on, you are going to have an ID field that is indexed.  (select c
> where id = 7)
>
> This also only touches = and <>.  > < and like wont be touched.  So I think
> the scope of this is limited.
>
> THAT being said, the patch is simple, and if you do happen to hit the code,
> it will speed things up.  As a user of PG I'd like to have this included.
>  Its a corner case, but a big corner, and its a small, simple change, and it
> wont slow anything else down.
>
>
> Performance:
> ============
> I created myself a more real world test, with a table with indexes and id's
> and a large toasted field.
>
> create table junk(id serial primary key, xgroup integer, c text);
> create index junk_group on junk(xgroup);
>
>
> I filled it full of junk:
>
> do $$
>        declare i integer;
>        declare j integer;
> begin
>        for i in 1..100 loop
>                for j in 1..500 loop
>                        insert into junk(xgroup, c) values (j, 'c'||i);
>                        insert into junk (xgroup, c) select j, repeat('abc',
> 2000)|| n from generate_series(1, 5) n;
>                end loop;
>        end loop;
> end$$;
>
>
> This will make about 600 records within the same xgroup.  As well as a
> simple 'c15' type of value in c we can search for.  My thinking is you may
> not know the exact unique id, but you do know what group its in, so that'll
> cut out 90% of the records, and then you'll have to add " and c = 'c15'" to
> get the exact one you want.
>
> I still saw a nice performance boost.
>
> Old PG:
> $ psql < bench3.sql
> Timing is on.
> DO
> Time: 2010.412 ms
>
> Patched:
> $ psql < bench3.sql
> Timing is on.
> DO
> Time: 184.602 ms
>
>
> bench3.sql:
> do $$
>        declare i integer;
> begin
>        for i in 1..400 loop
>                perform count(*) from junk where xgroup = i and c like 'c' ||
> i;
>        end loop;
> end$$;
>
>
>
> Summary:
> ========
> Performance speed-up:  Oh yeah!  If you just happen to hit it, and if you do
> hit it, you might want to re-think your layout a little bit.
>
> Do I want it?  Yes please.
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira de Oliveira 2011-01-16 21:13:30 Re: [HACKERS] reviewers needed!
Previous Message Dimitri Fontaine 2011-01-16 21:06:29 Re: LOCK for non-tables