bug in Google translate snippet

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: pg-peter(at)alvh(dot)no-ip(dot)org
Subject: bug in Google translate snippet
Date: 2009-07-02 20:51:15
Message-ID: 20090702205115.GL4698@alvh.no-ip.org
Lists: pgsql-hackers

Hi,

I was having a look at this snippet:
http://wiki.postgresql.org/wiki/Google_Translate
and it turns out that it doesn't work if the result contains non-ASCII
chars.  Does anybody know how to fix it?

alvherre=# select gtranslate('en', 'es', 'he');
ERROR:  plpython: function "gtranslate" could not create return value
DETAIL:  <type 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

By adding a plpy.log() call you can see that the answer is "él":
LOG:  (u'\xe9l',)
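
The failure can be reproduced outside PostgreSQL. The sketch below assumes that PL/Python's conversion of the return value behaves like an ASCII encode, which is what the error message suggests; encode_ascii is a hypothetical helper name used only for illustration.

```python
# The value shown in the plpy.log() output above: "él" as a unicode string.
translated = u'\xe9l'

def encode_ascii(value):
    """Try the ASCII encode and hand back either the bytes or the exception."""
    try:
        return value.encode('ascii')
    except UnicodeEncodeError as exc:
        return exc

# Encoding to ASCII fails on the first character, u'\xe9'.
failure = encode_ascii(translated)

# Encoding explicitly to UTF-8 succeeds and yields a plain byte string.
utf8_bytes = translated.encode('utf-8')
```

Here failure is a UnicodeEncodeError pointing at position 0, matching the DETAIL line, while utf8_bytes holds the two-byte UTF-8 form of the accented character followed by "l".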

I guess it needs some treatment similar to the one in this function:
http://wiki.postgresql.org/wiki/Strip_accents_from_strings
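
For reference, a common way to strip accents in Python is shown below. This is a sketch of the general technique, not a copy of the wiki function, whose exact contents are not reproduced here; strip_accents is an illustrative name.

```python
import unicodedata

def strip_accents(text):
    """Drop combining marks so that only ASCII characters remain."""
    # NFKD splits an accented letter into its base letter plus a combining mark.
    decomposed = unicodedata.normalize('NFKD', text)
    # Encoding with 'ignore' discards everything that is not ASCII.
    return decomposed.encode('ascii', 'ignore')

stripped = strip_accents(u'\xe9l')
```

This avoids the error, but it is lossy: "él" becomes "el", which is a different Spanish word. For translated text, keeping the characters and encoding them explicitly is probably the better treatment.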


For completeness, here is the code:

CREATE OR REPLACE FUNCTION gtranslate(src text, target text, phrase text) RETURNS text
LANGUAGE plpythonu
AS $$
import re
import urllib

import simplejson as json

class UrlOpener(urllib.FancyURLopener):
    version = "py-gtranslate/1.0"

base_uri = "http://ajax.googleapis.com/ajax/services/language/translate"
default_params = {'v': '1.0'}

def translate(src, to, phrase):
    args = default_params.copy()
    args.update({
        'langpair': '%s%%7C%s' % (src, to),
        'q': urllib.quote_plus(phrase),
    })
    argstring = '%s' % ('&'.join(['%s=%s' % (k,v) for (k,v) in args.iteritems()]))
    resp = json.load(UrlOpener().open('%s?%s' % (base_uri, argstring)))
    try:
        return resp['responseData']['translatedText']
    except:
        # should probably warn about failed translation
        return phrase

return translate(src, target, phrase)
$$;
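
One possible fix is to encode the translated text before returning it, so that PL/Python receives a byte string rather than a unicode object. The sketch below isolates that step in a hypothetical helper, extract_translation, which would replace the try/except at the end of translate(). It assumes the database encoding is UTF8; a different server encoding would need a different encoding argument.

```python
def extract_translation(resp, fallback, encoding='utf-8'):
    """Return the translated text as encoded bytes, or fallback on failure."""
    try:
        text = resp['responseData']['translatedText']
    except (KeyError, TypeError):
        # Covers a missing key or a non-dict value such as None.
        return fallback
    if text is None:
        return fallback
    if isinstance(text, bytes):
        # Already a byte string, so there is nothing to encode.
        return text
    return text.encode(encoding)
```

Inside the function body the last lines of translate() would then read "return extract_translation(resp, phrase)". Catching only KeyError and TypeError, instead of a bare except, keeps unrelated errors visible.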

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
