Re: Storing questionnaire data

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Thom Brown <thombrown(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Storing questionnaire data
Date: 2008-10-23 23:31:05
Message-ID: 1224804665.25425.77.camel@dell.linuxdev.us.dell.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 2008-10-23 at 21:38 +0100, Thom Brown wrote:
> I'm afraid such a rigid structure is completely tailored for a
> specific questionnaire. What if I, or even a client, wanted to
> generate different questionnaires? I would like the data to indicate
> the flow of questions and answers rather than just use the database as
> pure storage for a completely coded solution.

If you construct a schema in such a way that there's really no
constraint on the data at all, then the user of that permissive schema
is effectively designing the database.

This is not necessarily a bad thing (or rather, it may not be
avoidable). For instance, if you have no idea what kind of questions
might be asked by the questionnaire, nor any idea what kind of questions
might be asked about the responses they receive to the questionnaire,
there's not much you can do. Pretty much anything is going to look a lot
like EAV.

The cost of this, however, is that you (as the designer of the
permissive schema) can no longer see any meaning in the data at all. You
may be able to dig around manually a bit and find out a few specific
things, but you can't do it in any automated way. This is because you
aren't the real designer of the database, you've passed that job along
to your users. Only they (hopefully) have any idea what it might mean.
The users might not be good database designers, in which case they'll
end up with a mess, and you won't be able to help them.

Also, as a performance matter, the optimizer also has no idea what your
data means, and so it can't take any useful shortcuts. So, it will
probably be slow.

The best you can really do is try to find whatever basic meaning you
can. Usually there is something there: there are basic data types people
will want (e.g. string, numeric, timestamp). There are questions,
perhaps groups of questions, order in which the questions should be
asked, order in which the questions are answered, time the question was
answered, and respondents. There is one (or fewer) answer per question
per respondent. Try to piece this stuff together in some way as to
provide maximum meaning to you (and to PostgreSQL) without destroying
the usefulness to your customers.

I think the article David mentioned:
http://www.varlena.com/GeneralBits/110.php
Is a pretty reasonable compromise for many use-cases. Perhaps more can
be done, but usually questionnaires are either too unimportant to really
dig in, or so important that designing a database around it is the
obvious thing to do.

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Pavel Stehule 2008-10-24 03:44:06 Re: overhead of plpgsql functions over simple select
Previous Message Guy Rouillier 2008-10-23 23:18:35 Re: Annoying Reply-To