Re: How to setup default value "0000-00-00" for "date"

From: "Jim Wilson" <jimw(at)kelcomaine(dot)com>
To: "Michal Taborsky" <michal(at)taborsky(dot)cz>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: How to setup default value "0000-00-00" for "date"
Date: 2004-08-20 14:38:12
Message-ID: twig.1093012692.59157@kelcomaine.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Michal Taborsky said:

> Richard Huxton wrote:
> > Where you don't have a valid date to store you should use NULL. This
> > business of storing zeroes is a horrible MySQL design mistake.
>
> Well, yes and no. It certainly is a design mistake and introduces
> incosistency into the database, but after I was bitten several times by
> NULL values I'd go for solution like this any day. Let me explain.
>
> We had a table of messages, which was inserted to randomly and every few
> minutes we'd walk through the unprocessed messages and perform some work
> on them. I, trying to have the database as clean as possible, used this
> table definition (simplified):
>
> messages (
> id serial,
> data text,
> arrived timestamp default now,
> processed timestamp)
>
> So after the message arrived, it had the processed field set to null,
> which was perfectly consistent and represented what it realy was--an
> unknown value.
>
> We'd then simply SELECT * FROM messages WHERE processed IS NULL and were
> happy ever after, only to realise after the table had grown to few
> thousands rows, that the SELECT took ages, because the system had to
> perform seqscan. Aha! So we added an index on processed, because common
> sense told me, that as long as there are 100k rows and only 10 of them
> are NULL, the index would be very selective and therefore useful.
>
> I guess you know where it ends--the index is not used for IS [NOT] NULL
> expressions. The obvious workaround was to add DEFAULT value to
> "processed" in form of kind of anchor (we used '-infinity') and then do
> SELECT * FROM messages WHERE processed='-infinity'.
>
> Bingo! The thing went 100x faster. So we could choose to have
> standards-compliant, very clean database design OR the thing that does
> what it's supposed to do in reasonable time. And believe me, it's kind
> of difficult to explain to our logistics department that we could have
> done the thing to return results in milliseconds instead of 10 secs, but
> chose not to for sake of clean design.
>
> It'd be really nice if we didn't have to use such hacks, but hey, life's
> inperfect.

It'd probably be better design to not use the date as a flag. This issue
actually came up for me yesterday with an application that is now being ported
to Postgres. Previously a null "ship date" indicated that an item to be
shipped had not gone yet. I'm adding a flag, not just because of this issue
you describe, but it is also more intuitive for anyone looking at the data
who is unfamiliar with the business logic.

Best,

Jim Wilson

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2004-08-20 14:56:01 Re: int8, primary key, seq scan
Previous Message Tom Lane 2004-08-20 14:32:59 Re: libpq: passwords WAS: scripting & psql issues