This is a discussion on DOCTYPE within the PHP Language forums, part of the PHP Programming Forums category.
Jonas Werres wrote:
>> So? They're the ones setting the recommendations. That's what most
>> of the rest of us follow - including the browser developers. And if
>> they say XHTML isn't going anywhere soon, browser developers won't be
>> spending a lot of time supporting it.
>
> Did you have a look at their website?
>

Sure. I go there regularly. So what?

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Tony wrote:
> Jerry Stuckle wrote:
>>
>> What can you say when even W3C doesn't recommend it?
>>
>
> I would be very interested in where the W3C says that - I haven't been
> able to find anything like that. Most of the info I'm able to find seems
> to point the other way - such as:
> http://www.webstandards.org/learn/ar...skw3c/oct2003/
> Of course, that's over 4 years old - and that's also the case with the
> info that I find...
>

That's right - it's over 4 years old. Check the www.w3c.org site. I
don't have the link handy, but they are now pushing towards HTML 5.0
instead of a new version of XHTML.

--
Jerry Stuckle
Jerry Stuckle wrote:
>
> What can you say when even W3C doesn't recommend it?
>

I would be very interested in where the W3C says that - I haven't been
able to find anything like that. Most of the info I'm able to find seems
to point the other way - such as:
http://www.webstandards.org/learn/ar...skw3c/oct2003/
Of course, that's over 4 years old - and that's also the case with the
info that I find...
> That's right - it's over 4 years old. Check the www.w3c.org site. I
> don't have the link handy, but they are now pushing towards HTML 5.0
> instead of a new version of XHTML.

Yeah ... HTML 5... Already had a look at THAT?
> I took a brief tour of the web now and see that some Chinese sites have
> their own charset (which is included in FF), but Korea and Thailand did
> not. Also, Denmark did not. They all used UTF-8. Is there any overriding
> principle that determines whether they use UTF-8 or not? Or is it just
> case by case?

I think this is just case-by-case. It could be dependent on what
character sets are installed for Unicode on most systems (I heard that
even Klingon characters have their place), but frankly I don't know.
Also, utf-8 allows you to mix character sets, thus rendering Korean and
Danish in one sentence if you wish. It could be that the Chinese sites
were not that internationally oriented.

I did some international and foreign sites in utf-8, and learned the
hard way. The main problem here is that there is a difference between a
text and a string. A string is just a sequence of bytes, whereas a text
is something that you can read. The difference, of course, is the
encoding it is rendered in, and the problem is that texts are stored as
strings. So an encoding is usually passed along with the string, and is
more "metadata" than part of the value itself. And that metadata is
usually sent through separate channels and easily separated and lost.
You must often do some work to know the encoding, and even in modern web
applications this information can be missing.

By the way, if you want a nice introduction on the matter, here's a good
start:

http://www.joelonsoftware.com/articles/Unicode.html

... and beware of onions ;) There is one thing I strongly disagree with
the above site: the remark that character encodings would be easy. They
are not. Especially if you take some quirky behaviours of Windows and
MySQL into account.

Best regards.
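To illustrate the string-versus-text point above, here is a minimal sketch (in Python rather than PHP, and not taken from the thread). The sample sentence is an assumption chosen to mix Danish and Korean; the characters are written as escapes so the source file's own encoding does not matter.

```python
# A "string" in the sense used above is only bytes; it becomes readable
# text once an encoding is applied to it.
text = "K\u00f8benhavn \uc11c\uc6b8"   # Danish and Korean in one sentence

raw = text.encode("utf-8")             # what is actually stored or sent

# Decoding with the encoding that produced the bytes recovers the text.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 never fails, because every byte
# maps to some character, but the result is no longer the original text.
as_latin1 = raw.decode("iso-8859-1")

print(as_utf8 == text)      # the round trip is lossless
print(as_latin1 == text)    # same bytes, wrong metadata
print(len(text), len(raw))  # character count versus byte count
```

The bytes are identical in both cases; only the encoding "metadata" differs, which is why losing it along the way is so damaging.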
Jonas Werres wrote:
>> Sure. I go there regularly. So what?
>>
> It is XHTML? So obviously at least they do not recommend NOT to use it.
>

What does that prove? They just haven't rewritten their website since it
became obvious XHTML isn't going anywhere.

--
Jerry Stuckle
Jonas Werres wrote:
>> That's right - it's over 4 years old. Check the www.w3c.org site. I
>> don't have the link handy, but they are now pushing towards HTML 5.0
>> instead of a new version of XHTML.
>
> Yeah ... HTML 5... Already had a look at THAT?
>

Just preliminary specs is all. Looks ok to me.

--
Jerry Stuckle
..oO(Dikkie Dik)
>By the way, if you want a nice introduction on the matter, here's a good
>start:
>
>http://www.joelonsoftware.com/articles/Unicode.html
>
>... and beware of onions ;) There is one thing I strongly disagree with
>the above site: the remark that character encodings would be easy. They
>are not. Especially if you take some quirky behaviours of Windows and
>MySQL into account.

IMHO it _can_ be easy - you just have to do it consistently.
With UTF-8 you have to make sure that

* your data is stored as UTF-8 in the DB
* correctly transferred to your script (SET NAMES utf8)
* correctly transferred to the browser (header())

In short: UTF-8 all the way from the source to the reader.

Micha
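The last step of that chain (declaring the charset to the browser) can be sketched roughly as follows. This is a Python illustration rather than PHP; `build_response` is a hypothetical helper, not an API from the thread, and the database steps (storage and SET NAMES utf8) are left out because they depend on the MySQL setup.

```python
def build_response(text, charset="utf-8"):
    """Encode the body and declare the same charset in the header.

    Hypothetical helper: it only shows that the declared charset and the
    actual byte encoding have to agree.
    """
    body = text.encode(charset)
    headers = {
        "Content-Type": "text/html; charset=" + charset,
        "Content-Length": str(len(body)),  # length in bytes, not characters
    }
    return headers, body


page = "<p>Gr\u00fc\u00dfe, \u4e16\u754c</p>"
headers, body = build_response(page)

# A client that trusts the header decodes the bytes correctly.
declared = headers["Content-Type"].split("charset=")[1]
print(body.decode(declared) == page)
```

In PHP the equivalent of the header entry would be a header() call sending the same Content-Type value; the point is only that the declaration and the bytes must match.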
>> By the way, if you want a nice introduction on the matter, here's a good
>> start:
>>
>> http://www.joelonsoftware.com/articles/Unicode.html
>>
>> ... and beware of onions ;) There is one thing I strongly disagree with
>> the above site: the remark that character encodings would be easy. They
>> are not. Especially if you take some quirky behaviours of Windows and
>> MySQL into account.
>
> IMHO it _can_ be easy - you just have to do it consistently.
> With UTF-8 you have to make sure that
>
> * your data is stored as UTF-8 in the DB
> * correctly transferred to your script (SET NAMES utf8)
> * correctly transferred to the browser (header())
>
> In short: UTF-8 all the way from the source to the reader.

That is the theory, yes. But I have a few "exercises" if you like:

- Try to configure MySQL in a way that the server will assume client
  connections in utf-8 by default.
- You forgot e-mail. Try to send a utf-8 encoded e-mail to MS-Outlook.
- Nice one: have your PHP-enabled server tell that it accepts both
  iso-8859-1 and utf-8. Not hard, eh? Now create a form. Fill in some
  data and post it. Do this again, but now select the other encoding in
  your browser. Look at the submitted headers, the submitted data and
  tell me how on earth the server is to know which one the browser used.
  Tried this with Firefox, IE and Safari...
- See above. Modify your form such that it will tell the server the
  encoding (being right or not). Tell me what happens.
- MySQL has so many encoding translator settings. Explain how you can
  see what is actually stored in the database.
- If you really want to make sure you are getting the characters right,
  you can use a Unicode escape in some languages. How would you do this
  for MySQL if you know that your source file passes at least 3 programs
  that use different encodings but do not say anything in their
  documentation?
- Try and find a nice database front-end program that can reliably
  render encodings. For Windows, Linux and Mac if you think it is too
  easy...
- Well, utf-8 is relatively easy, of course. How would you make MySQL
  accept utf-16?
- MySQL again: Try to convert latin-1 tables to utf-8. Now do the same
  if they have compound indexes.
- FPDF: Just create a nice PDF with utf-8 as its base. What does this do
  to your targeted consistency?
- Explain why Windows and IE render a euro sign in iso-8859-1 while that
  character is not even covered by it.
- An advanced one: what can a hacker abuse with character encodings? (No
  short answer, I'm afraid).

Happy puzzling!
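Two of the exercises above (the euro sign and the ambiguous form post) can be poked at with a short Python sketch. It is not an answer from the thread: `can_encode` and `guess_decode` are hypothetical helpers, and the fallback to Windows-1252 is an assumption based on the commonly cited behaviour that browsers treat a page labelled iso-8859-1 as Windows-1252.

```python
# Byte 0x80 is one of the places where ISO-8859-1 and Windows-1252 differ.
EURO_BYTE = b"\x80"

as_cp1252 = EURO_BYTE.decode("cp1252")      # the euro sign, U+20AC
as_latin1 = EURO_BYTE.decode("iso-8859-1")  # a C1 control character, U+0080


def can_encode(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def guess_decode(raw):
    """Heuristic only: try strict UTF-8 first, then fall back to cp1252.

    The submitted bytes carry no label, so this can guess wrong, for
    example when legacy-encoded bytes happen to form valid UTF-8.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("cp1252", errors="replace"), "cp1252"


print(repr(as_cp1252), repr(as_latin1))
print(can_encode("\u20ac", "iso-8859-1"), can_encode("\u20ac", "cp1252"))

# The same form value, submitted once in each encoding.
value = "caf\u00e9"
for submitted in (value.encode("utf-8"), value.encode("iso-8859-1")):
    text, guessed = guess_decode(submitted)
    print(submitted, "->", repr(text), "guessed as", guessed)
```

The guess works for this value only because a lone 0xE9 byte is not valid UTF-8; nothing in the posted data itself says which encoding the browser used, which is the point of the exercise.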