This is a discussion on xhtml, htaccess and accept-headers within the alt.comp.lang.php forums, part of the PHP Programming Forums category.
(This is barely related to PHP, but from experience it seems that most PHP developers dabble in .htaccess and HTTP at some point.)

I recently noticed that even though my pages are all written to conform to the xhtml standard, they were being parsed as html because the server sent a text/html content type.

So I renamed all the files to .xml, so that the server would send them with an application/xml content type. It worked like a charm - Firefox's page info now showed that the page was treated as a "real" xml file.

It was wonderful until I tried to visit the page with lynx (a text-only browser). It seems that older browsers cannot handle xhtml served this way, which is annoying in this case, because lynx would be able to parse the pages just fine. It is only getting stumped by the server's "application/xml" header.

There's a silver lining. lynx and Firefox are both well-behaved browsers and send proper Accept headers - Firefox's preference is "text/xml, application/xml, application/xhtml+xml, text/html [...]", while lynx wants "text/html [...]".

---

Now all I need is a way to determine the preferred content type and then send the appropriate headers.

I know this could be done in PHP, but purely for performance reasons I wanted to know if there's a way to do it from inside .htaccess - these pages are simple, static files, and launching the PHP parser just for content negotiation seems a bit overboard.

Any ideas?

--
CB

Okay, I've experimented a bit and actually found a way to do this with mod_rewrite, which I'll put here in case someone else ever needs this. It's really amazingly simple.

1. Files now have the .html extension again
2. The AddType declaration that sends application/xhtml+xml for .html files is gone.
3. This rewrite rule adds the relevant content type:

    RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
    RewriteCond %{REQUEST_FILENAME} \.html$
    RewriteRule .* - [L,T=application/xhtml+xml]

This nicely works together with other rewrite rules that may already be there. For example, I have one that allows the ".html" extension to be removed from the URL - /about will return about.html:

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule ^(.*)$ $1.html [QSA]

That rule just needs to be added above the earlier one, and both will work as intended.

--
Christoph Burschka

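One thing the content-type rule in this post does not cover is caching: the same URL can now come back with two different content types, so a shared cache might hand the XHTML variant to a client that only asked for text/html. Below is a minimal sketch of how that might be handled. It assumes mod_headers is loaded and that the host allows these directives in .htaccess, neither of which is stated in the thread. The second RewriteCond is also an optional addition rather than part of the posted rule; it skips clients that list application/xhtml+xml with q=0, which in an Accept header means "not acceptable".

```apache
# Sketch only: assumes mod_rewrite and mod_headers are enabled and that
# .htaccess overrides (AllowOverride FileInfo) are permitted on this host.

RewriteEngine On

# Send XHTML only to clients that list it in Accept ...
RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
# ... and that do not explicitly refuse it with q=0 (added condition, not in the posted rule)
RewriteCond %{HTTP_ACCEPT} !application/xhtml\+xml\s*;\s*q=0(\.0+)?\s*(,|$)
RewriteCond %{REQUEST_FILENAME} \.html$
RewriteRule .* - [L,T=application/xhtml+xml]

<IfModule mod_headers.c>
    <FilesMatch "\.html$">
        # Tell caches that the response varies with the request's Accept header
        Header append Vary Accept
    </FilesMatch>
</IfModule>
```

This still does not compare q-values the way a full negotiation would; it only rules out the clearest case where a client names the type in order to reject it.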
Christoph Burschka wrote:
> [...]
> Now all I need is a way to determine the preferred content type and then
> send the appropriate headers.
>
> I know this could be done in PHP, but just for performance reasons I
> wanted to know if there's a way to do it from inside .htaccess - these
> pages are simple, static files and to launch the PHP parser just for
> content negotiation seems a bit overboard.
>
> Any ideas?

First of all, you don't need to rename your files. Just use the header() function to set the content type yourself:

    header('Content-type: application/xml'); //or whatever else

Secondly, there are lots of ways to determine whether to use HTML or XHTML (XML) as the content type. This one works for me (I stole it from somewhere, but I can't remember where):

    // Return true if the browser accepts XHTML at least as readily as HTML, false otherwise
    function xhtmlAccept() {
        // The Accept header may be missing entirely
        $accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
        if (stristr($accept, 'application/xhtml+xml') === false) return false;

        // Longer alternatives come first, so that "0.9" is not captured as just "0"
        $q = '\s*;\s*q=(0\.\d{1,3}|1\.0{1,3}|[01])/i';
        $matches = array();
        if (preg_match('/' . preg_quote('application/xhtml+xml', '/') . $q, $accept, $matches)) {
            $xhtml_q = (float)$matches[1];
            // q=0 means "not acceptable"
            if ($xhtml_q == 0) return false;
            if (preg_match('/' . preg_quote('text/html', '/') . $q, $accept, $matches)) {
                // Both types carry explicit q-values: XHTML wins only if it ranks at least as high
                return $xhtml_q >= (float)$matches[1];
            }
        }
        // XHTML is listed and nothing ranks text/html above it
        return true;
    }

> First of all, you don't need to rename your files. Just use the
> header() function to set the content type yourself:
>
> header('Content-type: application/xml'); //or whatever else

Thanks, but these are static html files with no PHP available - which is why I needed the .htaccess thing. For my PHP pages, however, this method works very well. :)

--
Christoph Burschka