XML RSS reader with BBC Website (PHP Language forum, part of the PHP Programming Forums category)
|
I have made an RSS reader and am testing it on the BBC website. I use the code below to grab the contents of the XML file. However, when I compare the contents grabbed by my function with the source of the BBC website's XML as shown in the browser, they are different... how is that even possible?

Anyone have an XML parser that they could test this on, please? Here's a sample link and my code:

http://newsrss.bbc.co.uk/rss/sporton...otball/rss.xml

// $feed holds the feed URL (set elsewhere)
$rss_name = "filename.xml";

$ch = curl_init($feed);
$fp = fopen($rss_name, "w");

// write the response body straight into the file, without headers
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);

curl_exec($ch);
curl_close($ch);
fclose($fp);
|
|||
|
On Wed, 15 Aug 2007 16:57:21 +0200, junkmate <junkmate@gmail.com> wrote:
> [...] when I look at the contents grabbed by my function, and the HTML
> source of the bbc website XML, they are different... how is that even
> possible?

Viewing the feed source and the file from cURL, the only difference I see is (understandably) <lastBuildDate />. What do you see, and what do you expect?

--
Rik Wasmus
|
|||
|
I get an old set of items... the latest items are not included.

Now I am thinking my cURL function may be grabbing cached versions of the XML file. Is that possible, and if so, can it be switched off?

On Aug 15, 4:13 pm, Rik <luiheidsgoe...@hotmail.com> wrote:
> Viewing the feed source & the file from CURL, the only difference I see is
> (understandably) <lastBuildDate />. What do you see and what do you expect?
|
|||
|
On Wed, 15 Aug 2007 17:23:05 +0200, junkmate <junkmate@gmail.com> wrote:
> I get an old set of items... the latest items are not included...
> Now I am thinking my cUrl function maybe grabbing cached versions of
> the xml file? is that possible and if so, can it be switched off?

(topposting fixed)

No such problem here, though it might depend on server setup. Are you sure that what cURL gets is cached data, and that it is not your own output on the web which is cached? (I.e. your file gets updated, but the browser still shows the old file.)

--
Rik Wasmus
|
|||
|
No, I have a button which grabs a fresh XML file and writes a fresh htm file to be included every time via AJAX.

I did find this:

curl_setopt($ch, CURLOPT_DNS_CACHE_TIMEOUT, 0);

Since adding that, I get the latest results... which means one of two things:
1) The cache finally ran out and it refreshed anyway!
2) It's fixed...

On Aug 15, 4:44 pm, Rik <luiheidsgoe...@hotmail.com> wrote:
> No such problem here, though it might depend on sever setup. Are you sure
> that what CURL gets is cached data, and it is not your own output on the
> web which is? (i.e. your file gets updated, browser still shows old file)
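A note on the option above: as far as I know, CURLOPT_DNS_CACHE_TIMEOUT only controls how long cURL remembers hostname lookups, and cURL does not keep its own cache of response bodies. If stale items really do arrive over the wire, a proxy or cache somewhere between the script and the feed is a more likely source. A minimal sketch of asking for a fresh copy, assuming the curl extension is loaded; the function names fresh_fetch_options and fetch_fresh are made up for illustration:

```php
<?php
// Hypothetical helper: cURL options that ask every hop for a fresh copy.
function fresh_fetch_options(): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_HEADER         => false, // keep response headers out of the body
        CURLOPT_FRESH_CONNECT  => true,  // do not reuse a previously opened connection
        CURLOPT_FORBID_REUSE   => true,  // close the connection when the transfer ends
        CURLOPT_HTTPHEADER     => [      // ask intermediate caches to revalidate
            'Cache-Control: no-cache',
            'Pragma: no-cache',
        ],
    ];
}

// Hypothetical helper: download $url, returning the body or null on failure.
function fetch_fresh(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, fresh_fetch_options());
    $body = curl_exec($ch);
    curl_close($ch);
    return is_string($body) ? $body : null;
}
```

Whether the request headers are honoured depends on the caches involved, so this is something to try rather than a guaranteed fix.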
|
|||
|
No, I just tried on a brand new fresh feed:

http://newsrss.bbc.co.uk/rss/newsonl...t_page/rss.xml

The second item is different...
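One way to pin down exactly which items differ is to compare the titles of the two copies position by position. A rough sketch, assuming both copies are plain RSS 2.0 (channel/item/title) and that SimpleXML is available; item_titles and differing_positions are hypothetical names, not part of any library:

```php
<?php
// Hypothetical helper: list the <item><title> texts of an RSS 2.0 document.
function item_titles(string $xml): array
{
    $previous = libxml_use_internal_errors(true); // keep parse warnings quiet
    $doc = simplexml_load_string($xml);
    libxml_use_internal_errors($previous);
    if ($doc === false || !isset($doc->channel)) {
        return [];
    }
    $titles = [];
    foreach ($doc->channel->item as $item) {
        $titles[] = trim((string) $item->title);
    }
    return $titles;
}

// Hypothetical helper: 1-based positions at which the two feeds disagree.
function differing_positions(string $xmlA, string $xmlB): array
{
    $a = item_titles($xmlA);
    $b = item_titles($xmlB);
    $positions = [];
    $count = max(count($a), count($b));
    for ($i = 0; $i < $count; $i++) {
        // a missing item on either side also counts as a difference
        if (($a[$i] ?? null) !== ($b[$i] ?? null)) {
            $positions[] = $i + 1;
        }
    }
    return $positions;
}
```

Feeding it the file written by cURL and the source saved from the browser would show whether only the second item differs or whether the whole list has shifted.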
|
|||
|
OK, something's erratic... I added a date to the top of my parser's output, showing the lastBuildDate of the XML file being parsed. It changes as you click on refresh... and is always different from the one in the actual XML source found by clicking the rss button.

Is it my browser? Is it my page being cached? I don't know. Any ideas?

Here: http://dev.oldsushi.com/joe
The top one, labeled BBC News (the actual RSS feed can be accessed by clicking the rss button in the top right).
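To separate "the download is stale" from "the page showing it is stale", it may help to read lastBuildDate straight from the saved file and to tell the browser not to cache the generated page. A sketch under those assumptions: the RSS 2.0 layout (channel/lastBuildDate), SimpleXML being available, and the made-up function names feed_last_build_date and send_no_cache_headers:

```php
<?php
// Hypothetical helper: return the feed's <lastBuildDate> text, or null if absent.
function feed_last_build_date(string $xml): ?string
{
    $previous = libxml_use_internal_errors(true); // keep parse warnings quiet
    $doc = simplexml_load_string($xml);
    libxml_use_internal_errors($previous);
    if ($doc === false || !isset($doc->channel->lastBuildDate)) {
        return null;
    }
    $date = trim((string) $doc->channel->lastBuildDate);
    return $date === '' ? null : $date;
}

// Hypothetical helper: call before any output in the script that serves
// the generated page, so the browser does not reuse an old copy.
function send_no_cache_headers(): void
{
    header('Cache-Control: no-store, no-cache, must-revalidate');
    header('Pragma: no-cache');
    header('Expires: 0');
}
```

If the date read from filename.xml right after the download matches the live feed but the page still shows an older one, the stale copy is on the display side rather than in cURL.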