Re: [PATCHES] Eliminate more detoast copies for packed varlenas

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] Eliminate more detoast copies for packed varlenas
Date: 2008-03-24 18:24:21
Message-ID: 200803241824.m2OIOLs10022@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Added to TODO:

* Research reducing deTOASTing in more places

http://archives.postgresql.org/pgsql-hackers/2007-09/msg00895.php

---------------------------------------------------------------------------

Gregory Stark wrote:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> > Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> >> Ok, this removes what should be most if not all of the call sites where we're
> >> detoasting text or byteas. In particular it gets all the regexp/like functions
> >> and all the trim/pad functions. It also gets hashtext and hash_any.
> >
> > Applied with some fixes --- you'd missed like_match.c, which doubtless
> > explains Guillame's complaint that it didn't work ...
>
> Strange. It passed all regression tests for me and it seems like this is
> something that would have been caught even in single-byte mode by a simple
> test. It seems to me that like_match.c only used for SIMILAR is that right?
> That would explain it as there don't appear to be any tests of SIMILAR.
>
> Separately:
>
> > Although I'd misdiagnosed the reason for the recent failures on buildfarm
> > member grebe, I see no reason to revert the 1-byte-header-friendly changes I
> > made in varlena.c. Instead, tweak the code a little bit to get more
> > advantage out of that.
>
> This may have been a misdiagnosis of the buildfarm failures but it looks like
> a correct bug fix to me. That is, it looks like regexp_split_to_array() was
> capable of passing a packed varlena to setup_regexp_matches which wasn't
> expecting it. But this I don't understand why it wouldn't cause regression
> failures and indeed when I tested it with my build it didn't cause any
> problems.
>
>
> This all brings up the question of what other files should be considered for
> fixing. There is the possibility that some of the other sites could turn out
> to be performance critical for a given application. In particular I'm
> primarily concerned with calls which could be invoked during a data load or
> index build.
>
> Of the following list it seems to me the calls in adt/acl.c are probably
> interesting to fix. Especially since it seems we could just replace all those
> text* parameters with Datum parameters since they're only going to be passed
> to textin anyways.
>
> After that the tsearch and xml data types are probably interesting, and the
> various data type input functions and contrib gist/gin operator classes.
>
> I'm fine doing the drudge work but wanted to check before I do it that I'm not
> doing something we don't think is worth doing at this point in time.
>
> src/backend/access/transam/xlog.c:3
> src/backend/catalog/pg_conversion.c:2
> src/backend/commands/sequence.c:1
> src/backend/libpq/be-fsstubs.c:2
> src/backend/tsearch/dict.c:3
> src/backend/tsearch/to_tsany.c:6
> src/backend/tsearch/wparser.c:6
> src/backend/utils/adt/acl.c:61
> src/backend/utils/adt/char.c:1
> src/backend/utils/adt/date.c:3
> src/backend/utils/adt/dbsize.c:2
> src/backend/utils/adt/encode.c:1
> src/backend/utils/adt/formatting.c:14
> src/backend/utils/adt/genfile.c:3
> src/backend/utils/adt/not_in.c:2
> src/backend/utils/adt/quote.c:2
> src/backend/utils/adt/regproc.c:1
> src/backend/utils/adt/ruleutils.c:6
> src/backend/utils/adt/tid.c:1
> src/backend/utils/adt/timestamp.c:8
> src/backend/utils/adt/tsquery_rewrite.c:1
> src/backend/utils/adt/tsvector_op.c:3
> src/backend/utils/adt/xml.c:24
> src/backend/utils/mb/mbutils.c:1
> src/tutorial/funcs_new.c:3
>
> contrib/adminpack/adminpack.c:6
> contrib/chkpass/chkpass.c:2
> contrib/dblink/dblink.c:41
> contrib/fuzzystrmatch/dmetaphone.c:2
> contrib/fuzzystrmatch/fuzzystrmatch.c:6
> contrib/hstore/hstore_gin.c:1
> contrib/hstore/hstore_gist.c:1
> contrib/hstore/hstore_op.c:6
> contrib/intarray/_int_op.c:1
> contrib/ltree/ltree_op.c:3
> contrib/pageinspect/btreefuncs.c:3
> contrib/pageinspect/rawpage.c:1
> contrib/pg_trgm/trgm_gin.c:2
> contrib/pg_trgm/trgm_gist.c:1
> contrib/pg_trgm/trgm_op.c:3
> contrib/pgcrypto/pgcrypto.c:10
> contrib/pgcrypto/pgp-pgsql.c:1
> contrib/pgrowlocks/pgrowlocks.c:1
> contrib/pgstattuple/pgstatindex.c:2
> contrib/pgstattuple/pgstattuple.c:1
> contrib/sslinfo/sslinfo.c:2
> contrib/tablefunc/tablefunc.c:14
> contrib/tsearch2/dict.c:3
> contrib/tsearch2/dict_ex.c:1
> contrib/tsearch2/dict_ispell.c:1
> contrib/tsearch2/dict_snowball.c:3
> contrib/tsearch2/dict_syn.c:1
> contrib/tsearch2/dict_thesaurus.c:1
> contrib/tsearch2/query.c:4
> contrib/tsearch2/query_rewrite.c:1
> contrib/tsearch2/ts_cfg.c:1
> contrib/tsearch2/ts_stat.c:2
> contrib/tsearch2/tsvector.c:2
> contrib/tsearch2/wparser.c:9
> contrib/uuid-ossp/uuid-ossp.c:2
> contrib/xml2/xpath.c:25
> contrib/xml2/xslt_proc.c:3
>
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas 'ads' Scherbaum 2008-03-24 18:31:16 Re: New email list for emergency communications
Previous Message Andrew Dunstan 2008-03-24 18:20:07 Re: gcc 4.3 breaks ContribCheck in 8.2 and older.

Browse pgsql-patches by date

  From Date Subject
Next Message Pavel Stehule 2008-03-24 19:01:15 generate_subscripts
Previous Message Bruce Momjian 2008-03-24 17:20:16 Re: script binaries renaming