long analyze, libc bug and libicu

From: Grigory Smolkin <g(dot)smolkin(at)postgrespro(dot)ru>
To: Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: long analyze, libc bug and libicu
Date: 2018-07-05 15:05:17
Message-ID: dc49b0d2-d20d-adeb-ecdb-0ad25461386d@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello, everyone!

One of your clients with PostgreSQL 10.4 recently noticed that ANALYZE
started to take unproportionally long amount of time. Backtrace:

#0 0x00007f244b68ca06 in strcoll_l () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f244e37802d in varstrfastcmp_locale (x=139794027137284,
y=139794027967812, ssup=<optimized out>) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/utils/adt/varlena.c:2185
#2 0x00007f244e0d6b15 in ApplySortComparator (isNull1=0 '\000',
isNull2=0 '\000', ssup=0x7fffa958de80, datum2=<optimized out>,
datum1=<optimized out>) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/include/utils/sortsupport.h:225
#3 compare_scalars (a=<optimized out>, b=<optimized out>,
arg=0x7fffa958de70) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/commands/analyze.c:2821
#4 0x00007f244e3e1023 in med3 (a=0x7f24578f6bd8 "\224__U$\177",
b=0x7f24578f6c78 "\004ITU$\177", c=0x7f24578f6d18 "D\365`U$\177",
cmp=0x7f244e0d6af0 <compare_scalars>, arg=0x7fffa958de70) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:108
#5 0x00007f244e3e13ec in qsort_arg (a=a(at)entry=0x7f24578f6bd8,
n=<optimized out>, n(at)entry=82, es=es(at)entry=16,
cmp=cmp(at)entry=0x7f244e0d6af0 <compare_scalars>,
arg=arg(at)entry=0x7fffa958de70) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:157
#6 0x00007f244e3e1475 in qsort_arg (a=a(at)entry=0x7f24578f6248,
n=<optimized out>, n(at)entry=235, es=es(at)entry=16,
cmp=cmp(at)entry=0x7f244e0d6af0 <compare_scalars>,
arg=arg(at)entry=0x7fffa958de70) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:217
#7 0x00007f244e3e1475 in qsort_arg (a=a(at)entry=0x7f24578f5348,
n=<optimized out>, es=es(at)entry=16, cmp=cmp(at)entry=0x7f244e0d6af0
<compare_scalars>, arg=arg(at)entry=0x7fffa958de70) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:217
#8 0x00007f244e3e13ac in qsort_arg (a=a(at)entry=0x7f24578f5348,
n=<optimized out>, es=es(at)entry=16, cmp=cmp(at)entry=0x7f244e0d6af0
<compare_scalars>, arg=arg(at)entry=0x7fffa958de70) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:203
#9 0x00007f244e3e13ac in qsort_arg (a=0x7f24578f5348,
a(at)entry=0x7f2457816928, n=<optimized out>, n(at)entry=109083,
es=es(at)entry=16, cmp=cmp(at)entry=0x7f244e0d6af0 <compare_scalars>,
arg=arg(at)entry=0x7fffa958de70) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:203
#10 0x00007f244e0d604f in compute_scalar_stats (stats=0x7f244fd168f8,
fetchfunc=<optimized out>, samplerows=110194, totalrows=110194) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/commands/analyze.c:2378
#11 0x00007f244e0d76f9 in do_analyze_rel
(onerel=onerel(at)entry=0x7f244fc339d8, options=options(at)entry=99,
params=params(at)entry=0x7f244f98b9f0, va_cols=va_cols(at)entry=0x0,
acquirefunc=<optimized out>, relpages=11515, inh=0 '\000',
in_outer_xact=0 '\000', elevel=13) at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/commands/analyze.c:545

After a bit of digging it became clear that they hit the libc bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=18441

We decided to switch to libicu collation, but it became clear that
ANALYZE ignore column level COLLATE and stubbornly use default collation.

It happens thanks to this piece of code:

static void
compute_scalar_stats(VacAttrStatsP stats,
AnalyzeAttrFetchFunc fetchfunc,
int samplerows,
double totalrows)
....
/* We always use the default collation for statistics */
ssup.ssup_collation = DEFAULT_COLLATION_OID;
....

Ok, we decided to set libicu collation as default for database, although
it is a major pain.
But turns out it`s impossible to set libicu collation as default for
database:

postgres=# create database icu LC_COLLATE = 'en-x-icu';
ERROR: invalid locale name: "en-x-icu"

Why ANALYZE igrones column COLLATE? It there workaround to force ANALYZE
to use libicu?

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Taiki Kondo 2018-07-06 08:33:56 Typo in Japanese translation of psql.
Previous Message Jeff Janes 2018-07-04 22:11:03 Re: NOTIFY does not work as expected