Re: Implementing "thick"/"fat" databases

From: Sim Zacks <sim(at)compulab(dot)co(dot)il>
To: PostgreSQL general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Implementing "thick"/"fat" databases
Date: 2011-07-26 08:04:02
Message-ID: 4E2E74F2.7010404@compulab.co.il
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 07/25/2011 06:24 PM, Chris Travers wrote:

> On Sun, Jul 24, 2011 at 11:53 PM, Sim Zacks<sim(at)compulab(dot)co(dot)il> wrote:
>
>> The goal is to make our system client agnostic, Most of our GUI is written
>> in wxpython, we also have some web functions and even a barcode terminal
>> function, written in C#. We would like to use an application server, so that
>> all the code is run on the server but we don't want to be web-based. I don't
>> want to have the same business logic written into various clients as that
>> causes maintenance headaches.
> Ok. So we are talking about a multi-application database, and you are
> trying to provide some degree of consistency in business logic across
> the software applications (and hence uses of your data). So far, so
> good. Agreed to put anything that might need to be in common.
>
>> The way that postgresql works is that each session is its own process. That
>> means that from the server perspective (at least in most ways), my plpython
>> function is not connected to any other call on the database. My goal is to
>> allow any client to request functionality and have it execute the same way
>> every time.
> So.... I assume that means using Pl/Mono to make your barcode stuff
> work in the db too?
No need for PL/Mono or any other client specific language. The GUI
should be dumb, so all I really need to program design is the interface
and input output methods. When you push a button, it should call the
appropriate function. The functions on the barcode terminal (which is a
Windows Mobile platform) can also be run on the client application
written in wxpython. Keeping as much as possible off of the client
allows me to share the work and be sure that both clients do the exact
same thing when the user pushes the button.

There is always functionality that needs to be on the client, however in
my experience with multi-client applications, if it is business logic
you will end up duplicating it and then needing to maintain 2 copies of
the same thing.

>> Data logic is built in by use of constraints and triggers and some
>> functions, business logic is built only into functions.
> I guess I am using "data logic" more broadly than you. Any select,
> update, or insert statement against a relation (except, of course,
> simple function calls) is part of that data logic, and there are
> supporting dependencies too, such as security and permissions.
>
> IOW, I think a lot of 'business logic' is data logic. I separate this
> with what you actually use the data for.
>
> IOW, Generate a report with the following format is data logic (even
> perhaps, and email it to the head of product ordering), but use this
> report to decide what inventory to order is application logic (you
> probably want people making these decisions).
>
> Similarly "Generate me a list of invoices to be paid in the following
> format": is data logic, but selecting those invoices to be paid,
> entering the data, determining what bank account you want to pay them
> from etc. is application logic. Recording the payments is again data
> logic. Application logic might be better thought of as workflow
> logic.
So much of human interaction is definition of terms. Now I understand
what you mean by data logic.

To me data logic is strictly whether the data will be considered corrupt
if the database allows an action to occur. For example, not enforcing
relationships. If you erase the parent the child has no meaning and
therefore you have data in the system which is suspect.
As my db is designed with the business rules in mind, some of the
business logic is included in the data logic, for example, 3 tables must
be updated at the same time in order for the transaction to be
considered valid. This is taken care of either through a function or
triggers.

An example of Business Logic is when my stock is updated to 0, I want
the record deleted.
This is done through a rule that on update that changes the value to 0
delete instead.

>> In other words, our postgresql server is a hybrid database/application
>> server.
> Right. One thing we are seeing in LedgerSMB as we go through this is
> that the database does indeed become an application server. The areas
> which are taking the new approach are mostly SQL and HTML templates
> (in TemplateToolkit), while the Perl code is shrinking pretty fast.
>
> I guess what I am wondering is if you are using it as an application
> server, is there a benefit to trying to put everything in the
> database? Or just those parts which necessarily follow from the
> database independent of workflow, etc or are likely to be common
> across diverse applications hitting your server?
>
> If so, what is it?
The benefits of putting everything into the database is having one
platform to maintain. Also, anything that can talk to the database can
call the functions, so we have a defined interface for all our
functions. If I wanted to use JBOSS, for example, how would my wxpython
or my c# app talk to it?
IOW, I did not find a middleware that was more appropriate for a
multi-client/multi-platform environment then the database server.
Another big advantage of using the database for business logic, is that
the plpython functions can be called from other functions. For example,
in the python prompt function I posted earlier, I can have an plpgsql
function say
if python_prompt('Are you sure?"') then
process
end if

It can also say
insert into table X with a trigger on the table that calls a send email
function.

I have a database function that builds an HTML page and FTPs it to our
website. This gives me a number of benefits: a) I don't have to worry
about the ftp client version on the client or connection issues,
firewalls, etc. b) it can be done with mutliple clients without changing
the code, c) I don't have to worry about path issues.

All I need is to know that my server can do it and that my clients can
connect to my server.

>> As I mentioned, I am still playing around and testing this, and have
>> The concept is that each client has a SocketServer thread that listens on a
>> specific port (8080 in this case).
> Ok, so just pointing out here that if you go this route you are
> starting to break the client-agnosticism requirement, unless I
> misunderstand what you are trying to do :-).
Any client that has the ability to listen to a port will be able to use
this functionality.
In C# for example, I would use System.Net.Sockets and the TcpListener class
In Python I would use the socketserver module
>> My database function is called python_prompt pasted below.
>> It has a timeout of 2 minutes after which the function errors out with a
>> timeout. The user will get a message that the function did not complete. The
>> important thing is that it has a timeout. If the timeout is 20 minutes, that
>> is also ok. Depending on the query that calls this function, there do not
>> have to be any locks being used and as this is its own process and it is
>> basically sleeping while waiting for the answer, it doesn't use resources
>> (aside from a connection and the work mem in the calling query, of course).
> Now, if for sake of argument this happens after an insert or update,
> that means things are locked for that time, right? This would be more
> likely to cause deadlocks and if the person steps up entering the
> data, there would be a rollback, right?
Yes. This is the type of functionality that must be used smartly.

>> Using this, a plpgsql query can prompt the user, "You are about to use more
>> stock then you have, are you sure you want to do this?"
>>
>> CREATE OR REPLACE FUNCTION python_prompt(v_question text)
>> RETURNS boolean AS
>> $BODY$
>> import time
>> import socket
>> ipaddr=plpy.execute("select inet_client_addr()")[0]["inet_client_addr"]
>> HOST, PORT = str(ipaddr), 8080
>> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>> sock.settimeout(120)
>> sock.connect((HOST, PORT))
>> sock.send(v_question + "\n")
>> ans=sock.recv(1024)
>> sock.close()
>> if ans=="yes":
>> return true
>> else:
>> return false
>> $BODY$
>> LANGUAGE plpythonu VOLATILE;
> I guess I am wondering what you get by requiring that a client listens
> on port 8080 AND knows how to communicate with the server on it that
> you don't get from assuming that the client can manage the workflow
> (breaking the routines into two separate ones and prompting for input
> between). I also think there is a better way to solve this problem
> (see below).
Breaking it into 2 functions is not always practical because you lose
your context (all variables etc..). It is not all done in a single
transaction which means when you start the second function you would
then have to verify that the work that is there is actually what you put
there and nobody modified it.

What we do today (the reason I am playing with this interactivity) is
have the database return an error code, which the client understands to
be a question. It then prompts the user and the query is sent again with
the answer as a parameter. It works but it is ugly (IMO).
>> I haven't fully decided to use this yet, but one of my goals is to make my
>> GUI as dumb as possible.
>>
> Probably a personal preference but I don't like to try to push
> something as far as possible :-)
>
> Anyway, here's an alternative to your interactive query approach, the
> one we use for LedgerSMB user creation for 1.3. it';s the "if you
> don't tell me to do it anyway, I am raising an exception" approach.
>
> Basically the problem is that db users are used to enforce application
> permissions, and db users are cluster-wide while application users are
> specific to a database. So what happens when you are running two
> companies (which have to have complete data separation and different
> permissions) but a single user account is needed to access both of
> them?
>
> What we decided to do is have a stored procedure which creates the
> user, sets the password (valid for 1 day by default, until changed by
> the user, at which point it is valid for an admin-specified time).
>
> Now, one parameter to the function is whether this is an import of an
> existing db role or not (i.e. not == creation of new user/role). In
> the event you specify that this is an import, if you also specify a
> password, you get an error unconditionally. Don't want to change
> password on import. However, if you don't specify that this is an
> import and the role exists you get a different exception. Exception
> strings in this case are short but descriptive ('No password allowed,'
> and 'Duplicate user'), and these are handed back up to the application
> to process. The application then can flag that this role exists, and
> the user of the software can resubmit as an import. Now there's no
> reason the application if it wasn't a web-based app couldn't just pop
> up a confirmation box saying "This role exists. Do you want to
> import? Yes/no" and handle it there.
>
> Similarly if you add a do_anyway arg to your invoice item query, you
> can raise an exception "Negative Inventory" and the application can
> handle that, flag the user, and let the user decide whether to do it
> anyway.
>
> It seems to me that as you have timeouts, the application has to be
> prepared to retry functions anyway.
>
> Also one huge cost to putting too much interactive logic in the
> database in some environments. I haven't yet found a decent way of
> localizing strings in database queries. That means that if you want
> to support multiple languages, it's a LOT easier to deal with these
> strings if they are in the client codebase than if they are in the
> server codebase and interpolated in the client (gettext throws errors
> with variable interpolation if you identify the string on the server
> and mix localization of client-side things with server-side strings).
> I suppose you can patch gettext to ignore these errors.......
>
> Best Wishes,
> Chris Travers

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mcleod, John 2011-07-26 12:47:27 Re: pgsql error
Previous Message Toby Corkindale 2011-07-26 05:29:13 practical Fail-over methods (was: streaming replication trigger file)