Skip site navigation (1) Skip section navigation (2)

Re: How are null's stored? -- Some numbers

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: How are null's stored? -- Some numbers
Date: 2003-05-13 21:55:41
Message-ID: 20030513215541.GC40542@flake.decibel.org (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
I did two experiments. First, as someone mentioned, changing between
char and varchar made absolutely no difference size-wise. In some other
RDBMSes, performance wise char might still win out because the database
wouldn't have to do the math to figure out where in the tuple the fields
are. I know it's splitting hairs, but on what will be a 40M row table...

Second, I modified the table (see below; all fields were originally
nullable):

Before:
usps=# vacuum full analyze verbose zip4_detail;
INFO:  --Relation public.zip4_detail--
INFO:  Pages 12728: Changed 0, reaped 1, Empty 0, New 0; Tup 467140: Vac
0, Keep/VTL 0/0, UnUsed 19, MinLen 154, MaxLen 302; Re-using:
Free/Avail. Space 1009820/264028; EndEmpty/Avail. Pages 0/1521.
        CPU 0.65s/0.86u sec elapsed 1.51 sec.
INFO:  Rel zip4_detail: Pages: 12728 --> 12728; Tuple(s) moved: 0.
        CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Analyzing public.zip4_detail


After:
INFO:  --Relation public.zip4_detail--
INFO:  Pages 13102: Changed 0, reaped 6961, Empty 0, New 0; Tup 467140:
Vac 0, Keep/VTL 0/0, UnUsed 31795, MinLen 166, MaxLen 306; Re-using:
Free/Avail. Space 1136364/190188; EndEmpty/Avail. Pages 0/1056.
        CPU 0.41s/0.79u sec elapsed 1.20 sec.
INFO:  Rel zip4_detail: Pages: 13102 --> 13102; Tuple(s) moved: 0.
        CPU 0.59s/10.02u sec elapsed 18.17 sec.
INFO:  Analyzing public.zip4_detail

As you can see, space useage actually went up, by 2.9% (pages). In other
words, it appears to be more efficient to store a null than to store an
empty string in a varchar.

usps=# select count(*) from zip4_detail where street_pre_drctn_abbrev=''
and street_suffix_abbrev='' and street_post_drctn_abbrev='';
-------
  9599

usps=# select count(*) from zip4_detail where street_pre_drctn_abbrev=''
or street_suffix_abbrev='';
--------
 128434

(all rows have at least one of the 3 fields empty)

Hope someone finds this info useful... :)

                  Table "public.zip4_detail"
          Column           |         Type          | Modifiers
---------------------------+-----------------------+-----------
 zip_code                  | character varying(5)  |
 update_key_no             | character varying(10) |
 action_code               | character varying(1)  |
 record_type_code          | character varying(1)  |
 carrier_route_id          | character varying(4)  |
 street_pre_drctn_abbrev   | character varying(2)  | not null
 street_name               | character varying(28) |
 street_suffix_abbrev      | character varying(4)  | not null
 street_post_drctn_abbrev  | character varying(2)  | not null
 addr_primary_low_no       | character varying(10) |
 addr_primary_high_no      | character varying(10) |
 addr_prmry_odd_even_code  | character varying(1)  |
 building_or_firm_name     | character varying(40) |
 addr_secondary_abbrev     | character varying(4)  |
 addr_secondary_low_no     | character varying(8)  |
 addr_secondary_high_no    | character varying(8)  |
 addr_secny_odd_even_code  | character varying(1)  |
 zip_add_on_low_no         | character varying(4)  |
 zip_add_on_high_no        | character varying(4)  |
 base_alt_code             | character varying(1)  |
 lacs_status_ind           | character varying(1)  |
 govt_bldg_ind             | character varying(1)  |
 finance_no                | character varying(6)  |
 state_abbrev              | character varying(2)  |
 county_no                 | character varying(3)  |
 congressional_dist_no     | character varying(2)  |
 muncipality_ctyst_key     | character varying(6)  |
 urbanization_ctyst_key    | character varying(6)  |
 prefd_last_line_ctyst_key | character varying(6)  |

-- 
Jim C. Nasby (aka Decibel!)                    jim(at)nasby(dot)net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


In response to

pgsql-performance by date

Next:From: Jamie LawrenceDate: 2003-05-13 22:30:56
Subject: Finding filenames for tables
Previous:From: Andreas PflugDate: 2003-05-13 21:52:29
Subject: Re: Finding filenames for tables

pgsql-hackers by date

Next:From: Dennis GearonDate: 2003-05-13 22:06:04
Subject: Re: fomatting an interval (resend)
Previous:From: scott.marloweDate: 2003-05-13 21:51:28
Subject: Re: Scheduled jobs

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group