Skip site navigation (1) Skip section navigation (2)

Re: Applying TOAST to CURRENT

From: JanWieck(at)t-online(dot)de (Jan Wieck)
To: The Hermit Hacker <scrappy(at)hub(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jan Wieck <JanWieck(at)yahoo(dot)com>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL HACKERS <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Applying TOAST to CURRENT
Date: 2000-05-31 01:10:20
Message-ID: 200005310110.DAA14672@hot.jw.home (view raw or flat)
Thread:
Lists: pgsql-hackers
The Hermit Hacker wrote:
>
> have to third this one ... I think it should be totally transparent to the
> admin/user ... just create it when the table is created, what's the worst
> case scenario?  it never gets used and you waste 16k of disk space?
>

    Not exactly.

    I've  made  some  good  experiences  with  having the toaster
    trying to keep the main tuple size below 1/4 of  MaxTupleSize
    (BLKSIZE  -  block  header).   Remember  that external stored
    attributes are only fetched from the  secondary  relation  if
    really  needed  (when the result set is sent to the client or
    if explicitly used in the query). So in a usual case, where a
    relatively  small  amount of the entire data is retrieved and
    key attributes are small, it's a win. With this  config  more
    main tuples fit into one block, and if the attributes used in
    the WHERE clause  aren't  stored  external,  the  result  set
    (including  sort  and  group  actions)  can be collected with
    fewer block reads. Only those big  values,  that  the  client
    really wanted, have to be fetched at send time.

    If  no  external  table  exists, the toaster will try the <2K
    thing by compression only. If the resulting tuple  fits  into
    the  8K  limit,  it's OK. But if a secondary relation exists,
    it'll store external to make the tuple <2K.  Thus, a 4K or 6K
    tuple,  that  actually  fits  and would be stored in the main
    table, will cause the toaster to jump in if we allways create
    the secondary table.

    Hmmm - thinking about that it doesn't sound bad if we allways
    create a secondary relation at CREATE TABLE time, but NOT the
    index  for  it.  And at VACUUM time we create the index if it
    doesn't exist AND there is external stored data.

    The table is prepared for external  storage  allways  and  we
    avoid  the  risks  from  creating  tables  in  possibly later
    aborting transactions or due to concurrency  issues.  But  we
    don't  waste  the  index space for really allways-small-tuple
    tables.

    Another benefit would  be,  that  reloads  should  be  faster
    because  with  this  technique,  the  toaster doesn't need to
    insert index tuples during the load. The indices are  created
    later at VACUUM after reload.

    The  toaster  needs  to  use sequential scans on the external
    table until the next vacuum  run,  but  index  usage  allways
    depends on vacuum so that's not a real issue from my PoV.

    At least a transparent compromise - isn't it?


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck(at)Yahoo(dot)com #



In response to

Responses

pgsql-hackers by date

Next:From: The Hermit HackerDate: 2000-05-31 01:33:16
Subject: Re: CVS log problem
Previous:From: Tatsuo IshiiDate: 2000-05-31 01:03:46
Subject: Re: CVS log problem

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group