Skip site navigation (1) Skip section navigation (2)

Analyzing foreign tables & memory problems

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Analyzing foreign tables & memory problems
Date: 2012-04-30 10:27:45
Message-ID: D960CB61B694CF459DCFB4B0128514C207D4F6FA@exadv11.host.magwien.gv.at (view raw or flat)
Thread:
Lists: pgsql-hackers
During ANALYZE, in analyze.c, functions compute_minimal_stats
and compute_scalar_stats, values whose length exceed
WIDTH_THRESHOLD (= 1024) are not used for calculating statistics
other than that they are counted as "too wide rows" and assumed
to be all different.

This works fine with regular tables; values exceeding that threshold
don't get detoasted and won't consume excessive memory.

With foreign tables the situation is different.  Even though
values exceeding WIDTH_THRESHOLD won't get used, the complete
rows will be fetched from the foreign table.  This can easily
exhaust maintenance_work_mem.

A foreign data wrapper has no good way to counter the problem.
It can return truncated values in ist AcquireSampleRowsFunc,
but WIDTH_THRESHOLD is private to analyze.c and it's a bad idea
to hard code a cutoff limit of 1025.

I can think of two remedies:
1) Expose WIDTH_THRESHOLD in commands/vacuum.h and add documentation
   so that the authors of foreign data wrappers are aware of the
   problem and can avoid it on their side.
   This would be quite simple.

2) Instead of one callback that returns all sample rows, have
   a callback that just returns the next table row (or the next
   table row from a subset of the pages of the table in the
   internal case).  This function could be called in a short-lived
   memory context.  Vitter's algorithm for selecting a sample
   and the truncation of excessively long values would then be
   handled in analyze.c.
   This would avoid the problem completely and make it easier
   to write a foreign data wrapper.
   I haven't thought this out completely, and it would require
   bigger changes to analyze.c and the API than are probably
   welcome this close to beta.

What is your opinion?

Yours,
Laurenz Albe



Responses

pgsql-hackers by date

Next:From: Magnus HaganderDate: 2012-04-30 10:28:21
Subject: Re: enable_indexonly
Previous:From: Wolfgang WilhelmDate: 2012-04-30 08:55:08
Subject: Re: smart shutdown at end of transaction (was: Default mode for shutdown)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group