comp.lang.python archive
Re: 'ascii' codec can't encode character u'\u2013'
From: Fredrik Lundh <fredrik@pythonware.com>
Date: Fri Sep 30 2005 - 15:50:05 CEST
Thomas Armstrong wrote:
> I'm trying to parse a UTF-8 document with special characters like
> It works, but I don't want to substitute each special character, because there
if you really want to use latin-1 in the database, and you don't mind dropping
text_extrated = text_extrated.encode('iso-8859-1', 'replace')
or
text_extrated = text_extrated.encode('iso-8859-1', 'ignore')
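To see what the two error handlers do with a character that Latin-1 cannot represent, here is a small sketch. It assumes Python 3, where `str` plays the role of the unicode string discussed above; the sample text is made up for illustration.

```python
# Hypothetical sample text containing U+2013 (EN DASH), which Latin-1 cannot represent.
text_extrated = "caf\u00e9 \u2013 bar"

# 'replace' substitutes '?' for every character that cannot be encoded.
replaced = text_extrated.encode("iso-8859-1", "replace")

# 'ignore' silently drops every character that cannot be encoded.
ignored = text_extrated.encode("iso-8859-1", "ignore")

print(replaced)  # b'caf\xe9 ? bar'
print(ignored)   # b'caf\xe9  bar'
```

Either way the dash is lost; the accented letter survives because Latin-1 does include it.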
a better approach is of course to convert your database to use UTF-8 and use
text_extrated = text_extrated.encode('utf-8')
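As a quick check that UTF-8 loses nothing, the following sketch encodes and decodes a string containing the en dash. It assumes Python 3, and the sample text is hypothetical.

```python
# Hypothetical sample text containing U+2013 (EN DASH).
text_extrated = "caf\u00e9 \u2013 bar"

# UTF-8 can encode every Unicode character, so no error handler is needed.
encoded = text_extrated.encode("utf-8")

# Decoding the bytes gives back the original string unchanged.
decoded = encoded.decode("utf-8")

print(encoded)                   # b'caf\xc3\xa9 \xe2\x80\x93 bar'
print(decoded == text_extrated)  # True
```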
it's also a good idea to switch to parameter substitution in your SQL queries:
cursor.execute("update ... set text = %s where id = %s", (text_extrated, id))
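Parameter substitution can be tried out without a database server. The sketch below is an illustration only: it uses the standard library's `sqlite3` module, whose placeholder is `?` rather than `%s`, and the table `docs`, the column `body`, and the sample values are all made up for the example. The original poster's database and schema are not known.

```python
import sqlite3

# In-memory database with a hypothetical table, just for the demonstration.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table docs (id integer primary key, body text)")
cursor.execute("insert into docs (id, body) values (?, ?)", (1, "old text"))

# Text with a quote and an en dash; no manual escaping or encoding is done here.
text_extrated = "it's a \u2013 test"
doc_id = 1

# The values are passed separately from the SQL, as a single tuple.
cursor.execute("update docs set body = ? where id = ?", (text_extrated, doc_id))
conn.commit()

# Read the row back to confirm the text was stored unchanged.
cursor.execute("select body from docs where id = ?", (doc_id,))
stored = cursor.fetchone()[0]
print(stored == text_extrated)  # True
```

Because the driver handles quoting, the apostrophe in the text does not break the statement, and the same pattern applies to other DB-API modules with their own placeholder style.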
it's possible that your database layer can automatically encode unicode strings if
</F>