Re: PHP & Unicode

comp.lang.php archive

From: Sander Tekelenburg <user@domain.invalid>
Date: Mon Oct 24 2005 - 23:08:36 CEST

In article <1130157575.614621.249910@z14g2000cwz.googlegroups.com>,
 slavi.marinov@gmail.com wrote:

[...]

> Let's say I have a database and a php script that communicates with the
> database. The database has some kind of character encoding - let's say
> UTF-8, UTF-16, or something different.

[...]

> My question is, how do you tell the PHP interpreter what encoding to
> use when displaying the text that the mysql queries return? In other
> words, will the $row[0] be displayed correctly regardless the database
> encoding, provided the database encoding and the HTML <meta> tags are
> the same

No. First off you'll need to use a character encoding that makes sense
on the Web. utf-8 makes sense, utf-16 does not. So if your database uses
utf-16, you'll need to convert to utf-8 before serving.[*]

In addition you need to ensure that the user-agent (a browser for
example) is informed correctly of which character encoding applies.
(Unless you want to rely on chance this is *always* a requirement, with
any character encoding. Not just when you work with utf-8.) You do so
by having your server accompany the document with an appropriate
Content-Type header. For example, if it's a utf-8 encoded HTML file,
your server must say Content-Type: text/html; charset=utf-8. (Whether
the file name extension is ".php" or ".html" is irrelevant.)

An alternative to configuring the server to do so is to have PHP
generate the Content-Type header:

   header("Content-Type: text/html; charset=utf-8");

Contrary to popular belief, a META HTTP-EQUIV is *not* a reliable
alternative: if the server sends a charset in the real HTTP header, that
overrides whatever the META says.
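
To make the header() approach a bit more concrete, here is a minimal
sketch. The helper name content_type_header() and the sample $row data
are assumptions for illustration only; in the real script $row would come
from the database query.

```php
<?php
// Hypothetical helper: build the Content-Type header line for an HTML page.
function content_type_header(string $charset = 'utf-8'): string
{
    return "Content-Type: text/html; charset={$charset}";
}

// header() only works before any output has been sent, so guard the call.
if (!headers_sent()) {
    header(content_type_header());
}

// Assumed sample data standing in for $row[0]: "Grüße" as UTF-8 bytes.
$row = ["Gr\xC3\xBC\xC3\x9Fe"];

// Pass the encoding to htmlspecialchars() too, so escaping treats the
// string as UTF-8 rather than guessing.
echo htmlspecialchars($row[0], ENT_QUOTES, 'UTF-8'), "\n";
```

The point of the guard is that any stray output before header() (even a
blank line before the opening <?php tag) means the header can no longer
be set.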

Notes:
- I'm not entirely sure what you mean by "displaying". PHP doesn't
display. Nor does a Web server. It is the *browser*'s job to "display"
(whether visually or otherwise).
- all this assumes what you're trying to do is meant for the Web. An
intranet situation may have different requirements and possibilities.

[*] How exactly to do the conversion in PHP I can't tell you. I'm sure
it can be found in the documentation. It might also be that your
database allows you to request output in a specific character
encoding. If so, that route might be more efficient.
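
For what it's worth, a minimal sketch of such a conversion, assuming the
mbstring extension is available. The helper name to_utf8() and the choice
of little-endian utf-16 as the source encoding are assumptions for the
example; check what your database actually hands back.

```php
<?php
// Hypothetical helper: convert text from a source encoding to UTF-8.
// Assumes the mbstring extension is loaded.
function to_utf8(string $text, string $sourceEncoding = 'UTF-16LE'): string
{
    // Argument order is: string, target encoding, source encoding.
    return mb_convert_encoding($text, 'UTF-8', $sourceEncoding);
}

// "é" is U+00E9; as UTF-16LE that is the two bytes E9 00.
$utf16 = "\xE9\x00";
$utf8  = to_utf8($utf16);

// Show the resulting bytes in hex; UTF-8 encodes U+00E9 as C3 A9.
echo bin2hex($utf8), "\n"; // prints "c3a9"
```

Where iconv is available instead, iconv('UTF-16LE', 'UTF-8', $text)
should do the same job. If the database is MySQL, asking for utf-8 on the
connection itself (for instance with mysqli's set_charset()) is probably
the more efficient route, since no conversion is needed in PHP at all.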

-- 
Sander Tekelenburg, <http://www.euronet.nl/~tekelenb/>
Mac user: "Macs only have 40 viruses, tops!"
PC user: "SEE! Not even the virus writers support Macs!"
Received on Mon Nov 21 02:48:54 2005