iconv

This is a discussion on iconv within the PHP Language forums, part of the PHP Programming Forums category.


#1  03-03-2005  pbuchheit@hotmail.com
iconv

I need to convert Japanese characters to UTF-8. I tried using this
command on a PHP 5.0.3 server:
$out = iconv("iso-2022-jp", "utf-8", $in);
But it does not work. Can anyone tell me what I'm doing wrong?

(Note: I also tried this, unsuccessfully, with Chinese characters:
$out = iconv("gb2312", "utf-8", $in);
)

#2  03-03-2005  Brion Vibber
Re: iconv

pbuchheit@hotmail.com wrote:
> I need to convert Japanese characters to UTF-8. I tried using this
> command on a PHP 5.0.3 server:
> $out = iconv("iso-2022-jp", "utf-8", $in);
> But it does not work. Can anyone tell me what I'm doing wrong?


You might start by mentioning what happens. Do you get an error message?
An empty string? A garbled string? FALSE?

Can you confirm that the iconv module is built-in or loaded? Can you
confirm that the input really is iso-2022-jp, and not some other
encoding such as Shift-JIS? Can you confirm that the input is valid?
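
A minimal sketch of how those questions could be answered from a script. The helper names describe_iconv_result() and could_be_iso2022jp() are hypothetical, not part of PHP or of the original post. The first reports whether iconv() returned FALSE, an empty string, or converted bytes; the second relies on ISO-2022-JP being a 7-bit encoding, so it can only rule that encoding out, not confirm it.

```php
<?php
// Hypothetical helper, not from the thread: reports which failure mode occurred.
function describe_iconv_result($from, $to, $in)
{
    // @ hides the notice/warning so the return value can be inspected directly.
    $out = @iconv($from, $to, $in);
    if ($out === false) {
        return 'FALSE (unknown charset or invalid input)';
    }
    if ($out === '' && $in !== '') {
        return 'empty string';
    }
    // Hex makes garbled output visible regardless of terminal encoding.
    return 'converted, hex: ' . bin2hex($out);
}

// ISO-2022-JP is 7-bit and switches character sets with ESC sequences,
// so any byte >= 0x80 means the data must be in some other encoding
// (Shift-JIS and EUC-JP both use high bytes).
function could_be_iso2022jp($in)
{
    return preg_match('/[\x80-\xFF]/', $in) === 0;
}

echo describe_iconv_result('ISO-8859-1', 'UTF-8', "caf\xE9"), "\n";
var_dump(could_be_iso2022jp("\x93\xFA\x96\x7B")); // Shift-JIS bytes for 日本
```

Running the same helper on the real $in with "iso-2022-jp" as the source charset would show which of the cases applies.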

-- brion vibber (brion @ pobox.com)
#3  03-03-2005  pbuchheit@hotmail.com
Re: iconv

That's why I added the note about gb2312 - I'm SURE my input is in
simplified Chinese, but I get NULL output with the PHP command $out =
iconv("gb2312", "utf-8", $in);

My PHP 4 ISP activated ICONV, and I tried this also on a PHP 5.0.3
server. When you say 'confirm that the iconv module is built-in or
loaded,' are you suggesting that something else might need to be done?

Thanks.

#4  03-03-2005  Brion Vibber
Re: iconv

pbuchheit@hotmail.com wrote:
> That's why I added the note about gb2312 - I'm SURE my input is in
> simplified Chinese, but I get NULL output with the PHP command $out =
> iconv("gb2312", "utf-8", $in);


You haven't provided any sample data or a test program or described the
system it's running on (OS, version, etc), so it's hard to reproduce. :)

Try this program:

<?php
$in = urldecode('%CE%AC%BB%F9%B0%D9%BF%C6%A3%AC%D7%D4%D3%C9%B5%C4%B0%D9%BF%C6%C8%AB%CA%E9');
$out = iconv("gb2312", "utf-8", $in);
echo urlencode($out);
?>

The output should be:

%E7%BB%B4%E5%9F%BA%E7%99%BE%E7%A7%91%EF%BC%8C%E8%87%AA%E7%94%B1%E7%9A%84%E7%99%BE%E7%A7%91%E5%85%A8%E4%B9%A6

I tested this successfully on PHP 4.3.10 on a Red Hat Linux 9 system.

> My PHP 4 ISP activated ICONV, and I tried this also on a PHP 5.0.3
> server. When you say 'confirm that the iconv module is built-in or
> loaded,' are you suggesting that something else might need to be done?


Check, for instance, the phpinfo() output to make sure iconv is listed
properly. (If it weren't, you would get a fatal error because the iconv
function wouldn't be defined, but...)
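
The same check can be done from a script instead of reading through phpinfo(). This is only a sketch; iconv_status() is a hypothetical helper name, while extension_loaded() and function_exists() are standard PHP functions.

```php
<?php
// Hypothetical helper: summarises whether iconv is usable in this PHP build.
function iconv_status()
{
    return array(
        // true when the module is compiled in or loaded as a shared extension
        'extension_loaded' => extension_loaded('iconv'),
        // true when the iconv() function itself is defined
        'function_exists'  => function_exists('iconv'),
    );
}

var_dump(iconv_status());
```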

It's also possible there's something wrong with the iconv library on
your system; see http://www.php.net/iconv
"Supported character sets depend on the iconv implementation of your
system. Note that the iconv function on some systems may not work as
you expect. In such case, it'd be a good idea to install the GNU
libiconv library. It will most likely end up with more consistent results."
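
To see which iconv implementation a given PHP build is linked against, the iconv extension defines the constants ICONV_IMPL and ICONV_VERSION. A small sketch, with iconv_implementation() as a hypothetical helper name; the values printed will differ from system to system.

```php
<?php
// ICONV_IMPL and ICONV_VERSION are defined by the iconv extension itself,
// so guard with defined() in case the extension is missing.
function iconv_implementation()
{
    $impl    = defined('ICONV_IMPL') ? ICONV_IMPL : 'unknown';
    $version = defined('ICONV_VERSION') ? ICONV_VERSION : 'unknown';
    return $impl . ' ' . $version;
}

echo iconv_implementation(), "\n";
```

Comparing this line between the two servers would show whether they use the same library.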

Does it fail in the same way on both systems? What OS are they running?
Are only Asian encodings affected, or all encodings? Can you do an iconv
from iso-8859-1 to utf-8, for instance?
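
One way to answer that is to run a few known byte sequences through iconv and compare the results. This is a sketch under assumptions: try_conversions() is a hypothetical helper, and the sample strings are short test values chosen here, not data from the original poster.

```php
<?php
// Hypothetical helper: converts one small sample per charset to UTF-8
// and returns the result as hex, or 'FAILED' when iconv() returns false.
function try_conversions()
{
    $samples = array(
        'ISO-8859-1'  => "caf\xE9",                        // "café" in Latin-1
        'GB2312'      => "\xB0\xD9\xBF\xC6",               // 百科
        'ISO-2022-JP' => "\x1B\x24\x42" . 'F|' . "\x1B(B", // 日, wrapped in escape sequences
    );
    $results = array();
    foreach ($samples as $charset => $bytes) {
        $out = @iconv($charset, 'UTF-8', $bytes);
        $results[$charset] = ($out === false) ? 'FAILED' : bin2hex($out);
    }
    return $results;
}

foreach (try_conversions() as $charset => $result) {
    printf("%-12s %s\n", $charset, $result);
}
```

If only the Asian charsets report FAILED, the system's iconv library is the likely culprit rather than the PHP code.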

You might also try mb_convert_encoding if the mbstring module is enabled.
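
A sketch of the mbstring route, assuming the mbstring extension is available. Note that mb_convert_encoding() takes the target encoding before the source encoding, the reverse of iconv(). The helper name gb2312_to_utf8_mb() is hypothetical, and the source encoding is given as EUC-CN, which is the byte encoding of GB2312 text.

```php
<?php
// Hypothetical helper: converts GB2312 (EUC-CN) bytes to UTF-8 with mbstring.
// Returns null when the mbstring extension is not available.
function gb2312_to_utf8_mb($in)
{
    if (!function_exists('mb_convert_encoding')) {
        return null;
    }
    // Argument order: string, target encoding, source encoding.
    return mb_convert_encoding($in, 'UTF-8', 'EUC-CN');
}

$out = gb2312_to_utf8_mb(urldecode('%B0%D9%BF%C6')); // 百科
echo ($out === null) ? 'mbstring not available' : urlencode($out), "\n";
```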

-- brion vibber (brion @ pobox.com)