This is a discussion on DOCTYPE within the PHP Language forums, part of the PHP Programming Forums category.
Here's what I'm using for my PHP files:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

1) Is that standard, and the only way it should be?

2) With that meta tag declaring utf-8, is it imperative that my MySQL tables also use utf-8? (I have them at latin 1 now.)
.oO(Jerry)

>here's what I'm using for my PHP files:
>
><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
><html xmlns="http://www.w3.org/1999/xhtml">
><head>
><meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Transitional doesn't really make sense for new sites, since you're not transitioning from anything. And neither does XHTML, unless you have an explicit need for it and know exactly what you're doing. Remember: the most used browser on earth still doesn't understand real XHTML, and XHTML 2.0 - if it ever reaches a usable state - will _not_ be backwards-compatible with the current XHTML versions 1.0 and 1.1.

HTML 4.01 Strict(!) is still the document type of choice in most cases.

>2) with that meta tag declaring utf-8, is it imperative that my
>MySQL tables also use utf-8? (I have them at latin 1 now.)

You can drop the meta tag. Its only use is when documents are read without a web server, for example a copy stored on the user's local disk. But in a Web context these meta thingies are useless. What matters is the HTTP header sent by the server.

The answer to the question is "it depends". Usually, if you want your output to be UTF-8, then it's a very good idea to use UTF-8 everywhere, from the DB through the scripts to the final HTML. Of course you can keep your data in Latin-1 and let the DB convert it automatically when it's transferred to your script, but why would you want that? Such character conversions are always prone to problems and should be avoided if possible.

In short: If you want to use UTF-8, you should do it consistently.

Micha
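The point about conversions being prone to problems can be made concrete with a small sketch. It assumes PHP 7 or later with the mbstring extension loaded, and the helper name roundTripThroughLatin1 is hypothetical, not something from the thread. It shows the lossy direction only: text that arrives as UTF-8 (for example from a form) and is pushed through a Latin-1 stage on its way back out.

```php
<?php
// Hypothetical helper (not from the thread): sends a UTF-8 string through
// ISO-8859-1 (Latin-1) and back, as happens when one stage of the pipeline
// uses a different charset than the others. Assumes the mbstring extension.
function roundTripThroughLatin1(string $utf8): string
{
    $latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
    return mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
}

// "café" fits into Latin-1; the euro sign in "5 €" does not.
$samples = ["caf\u{00E9}", "5 \u{20AC}"];

foreach ($samples as $text) {
    $survived = roundTripThroughLatin1($text) === $text;
    echo $text, ': ', $survived ? 'unchanged' : 'damaged', "\n";
}
```

Characters that exist in Latin-1 survive the trip, while anything outside it (the euro sign here) cannot be restored afterwards. The database connection charset would have to match as well; that part is not shown here.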
Michael Fesser wrote:
> .oO(Jerry)
>
>> here's what I'm using for my PHP files:
>>
>> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>> <html xmlns="http://www.w3.org/1999/xhtml">
>> <head>
>> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
>
> Transitional doesn't really make sense for new sites, since you're not
> transitioning from anything. And neither does XHTML, unless you have an
> explicit need for it and know exactly what you're doing.

Thanks. That's curious, because everything that I quoted above is what is inserted by Dreamweaver for new PHP files.

> HTML 4.01 Strict(!) is still the document type of choice in most cases.
...
> In short: If you want to use UTF-8, you should do it consistently.

My conclusion after just now reading up is that I shouldn't want to. My visitors will all be in the US or Western Europe. Latin 1 (aka 8859-1) is what is served by the webserver (although it also accepts UTF-8). Latin-1 is also the default for PHP.

I also just noticed that Dreamweaver doesn't even offer Latin 1 as an option for the default encoding - so maybe it is DW that is wacky.
Jerry wrote:
> Michael Fesser wrote:
>> .oO(Jerry)
>>
>>> here's what I'm using for my PHP files:
>>>
>>> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>>> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>>> <html xmlns="http://www.w3.org/1999/xhtml">
>>> <head>
>>> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
>>
>> Transitional doesn't really make sense for new sites, since you're not
>> transitioning from anything. And neither does XHTML, unless you have an
>> explicit need for it and know exactly what you're doing.
>
> Thanks. That's curious, because everything that I quoted above is what
> is inserted by Dreamweaver for new PHP files.
>

That's your first mistake. Don't let ANY product do ANYTHING for you unless you understand what it's doing.

Learn HTML and you'll be much better off. The best HTML editor in the world is Notepad on Windows or vi on Linux/Unix. They force you to learn how to do it right.

>> HTML 4.01 Strict(!) is still the document type of choice in most cases.
>
> ...
>
>> In short: If you want to use UTF-8, you should do it consistently.
>
> My conclusion after just now reading up is that I shouldn't want to. My
> visitors will all be in the US or Western Europe. Latin 1 (aka 8859-1) is what
> is served by the webserver (although it also accepts UTF-8). Latin-1 is
> also the default for PHP.
>
> I also just noticed that Dreamweaver doesn't even offer Latin 1 as an
> option for the default encoding - so maybe it is DW that is wacky.
>

See above.

-- 
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
> HTML 4.01 Strict(!) is still the document type of choice in most cases.
NACK. Since HTML, in contrast to XHTML, allows some very strange things, XHTML is much easier to debug using a validator. Indeed, most of the stuff (except some specialties like checked="checked", <br /> ...) sane people are using in HTML is already XHTML.
Defining it as XHTML has the advantage that constructions that are commonly mistakes but are valid will be found by the validator.
On Thu, 28 Feb 2008 13:03:45 +0100, Jonas Werres <jonas@example.org> wrote:
>> HTML 4.01 Strict(!) is still the document type of choice in most cases.
>
> NACK. Since HTML, in contrast to XHTML, allows some very strange things,
> XHTML is much easier to debug using a validator. Indeed, most of the
> stuff (except some specialties like checked="checked", <br /> ...) sane
> people are using in HTML is already XHTML.
> Defining it as XHTML has the advantage that constructions that are
> commonly mistakes but are valid will be found by the validator.

Writing HTML Strict is at least as easy as XHTML, and it is easily validated. Some major browser whose name shall not be mentioned still doesn't support real XHTML; until then there's no good reason to use XHTML. The XHTML hype is past us: most serious HTML developers/designers are back to HTML Strict, as using XHTML has little to no advantage due to browsers' implementations. It's been shelved as 'could be useful in the future, certainly not now'. Even the W3C has kind of given up on it, and focuses more on HTML5 than XHTML2.

I'd be curious as to what 'constructions that are commonly mistakes but are valid' are more easily found in XHTML than HTML, care to give an example?

Of course, I'd love to use SVG or MathML in pages. Support is sadly lacking.
-- 
Rik Wasmus
Jerry Stuckle wrote:
>>
>> I also just noticed that Dreamweaver doesn't even offer Latin 1 as an
>> option for the default encoding - so maybe it is DW that is wacky.

On further checking, DW offers "Western European" as an option, and if I choose that then ISO-8859-1 is what actually gets inserted into the (mostly useless) meta-tag. So that makes 3 synonyms for the same charset.

Also, after thinking a bit, the fact that the output originates dynamically as PHP shouldn't matter, right? If I were to download my PHP page, save the source as *.html, and then publish that file as *.html, then the end result is the same - as far as DOCTYPE goes.

So this was really an HTML question, independent of whether the output originated as PHP.
> ..., DW offers "Western European" as an option, and if I
> choose that then ISO-8859-1 is what actually gets inserted into the
> (mostly useless) meta-tag. So that makes 3 synonyms for the same charset.
>
> Also, after thinking a bit, the fact that the output originates
> dynamically as PHP shouldn't matter, right? If I were to download my PHP
> page, save the source as *.html, and then publish that file as *.html,
> then the end result is the same - as far as DOCTYPE goes.

Not entirely. The best way to send the character set is _outside_ the document itself. It is strange to have to parse the character set out of a document that is itself encoded in that character set. What about utf-16? Or utf-32? Could you parse those character sets as easily out of a document that is written in them? Of course not. And if you can, there is no reason to do so anymore.

So character sets are sent in a header. PHP supports this in two ways:
- set the default document type and character set in your php.ini,
- send a header yourself with the header() function.

Especially the first way can be tricky if you do not realize it is in effect. Your server may be sending
"Content-type: text/html; charset=utf-8"
in a header, while your code may contain
"Content-type: text/html; charset=iso-8859-1"
in a meta tag. Which of them is true?

> So this was really an HTML question, independent of whether the
> output originated as PHP.

As you saw above, not entirely. Plain static HTML files have no control over their headers.

Best regards.
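The second of the two ways above, sending the header yourself, might look roughly like the following. This is a minimal sketch, not the poster's own code: the helper name contentTypeHeader is hypothetical, and utf-8 is only an example value.

```php
<?php
// Hypothetical helper (not from the thread): builds the Content-Type
// header line for an HTML page in the given character set.
function contentTypeHeader(string $charset = 'utf-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() has to be called before any output is sent; once output has
// started, the headers can no longer be changed.
if (!headers_sent()) {
    header(contentTypeHeader('utf-8'));
}
```

The php.ini route uses the default_mimetype and default_charset directives instead. When the HTTP header and a meta tag disagree, browsers give the HTTP header precedence, which is why a mismatch between the two is easy to overlook.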
> I'd be curious as to what 'constructions that are commonly mistakes but
> are valid' are more easily found in XHTML than HTML, care to give an
> example?

Sure. This is valid HTML 4.01 Strict:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<title>Title</title>
<h1>Text</h1>
<p>
<select name="Choice" size="1">
<option>1. Entry
<option>2. Entry
<option>3. Entry
</select>
</p>

Notice the missing <head>, <body>, </option>.

Try to check this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<title>Title</title>
<h1>Text</h1>
<p>
a< b
</p>

It is valid. But don't try to remove the space:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<title>Title</title>
<h1>Text</h1>
<p>
a<b
</p>

Ha. Broken.

Want it really bad? This is valid:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<>
<title//
<p ltr<span></span</p>
</>

At least there are warnings now at validator.w3.org. For a long time there weren't.

Having to close <br /> and the like is peanuts compared to all the SGML stuff XHTML left behind. That's what I meant when I wrote that the HTML a sane person writes IS mostly XHTML.
.oO(Jonas Werres)

>> HTML 4.01 Strict(!) is still the document type of choice in most cases.
>
>NACK. Since HTML, in contrast to XHTML, allows some very strange things,
>XHTML is much easier to debug using a validator.

Only if you use a schema validator. The W3 validator is SGML-based, which is not appropriate for validating X(HT)ML.

Micha