XML RSS reader with BBC Website..

#1
08-15-2007
junkmate

I have made an RSS reader and am testing it on the BBC website. I use
the code below to grab the contents of the XML file, but the contents
grabbed by my function differ from what I see when I view the source of
the XML on the BBC website. How is that even possible?

Does anyone have an XML parser that they could test this with, please?
Here's a sample link and my code:
http://newsrss.bbc.co.uk/rss/sporton...otball/rss.xml


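// $feed holds the feed URL (for example, the sample link above)
$rss_name = "filename.xml";

$ch = curl_init($feed);
$fp = fopen($rss_name, "w");            // local file that receives the feed

curl_setopt($ch, CURLOPT_FILE, $fp);    // write the response body straight to $fp
curl_setopt($ch, CURLOPT_HEADER, 0);    // keep the HTTP headers out of the file

curl_exec($ch);
curl_close($ch);
fclose($fp);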

#2
08-15-2007
Rik

On Wed, 15 Aug 2007 16:57:21 +0200, junkmate <junkmate@gmail.com> wrote:

> I have made an RSS reader and am testing on the BBC website, and I use
> this code to grab the contents of the XML file, however when I look at
> the contents grabbed by my function, and the HTML source of the bbc
> website XML, they are different... how is that even possible?

Viewing the feed source & the file from CURL, the only difference I see is
(understandably) <lastBuildDate />. What do you see and what do you expect?


--
Rik Wasmus
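
A rough way to run the comparison Rik describes without eyeballing two XML files is to reduce each copy of the feed to its lastBuildDate and its item titles, then list the titles that one copy has and the other lacks. This is only a sketch: it assumes the feed is plain RSS 2.0 with a channel element containing item elements, and the function names summarise_feed and titles_missing_from are made up for illustration.

```php
<?php
// Hypothetical helper: reduce an RSS 2.0 document to its lastBuildDate
// and the list of item titles, in document order.
function summarise_feed(string $xml): array
{
    libxml_use_internal_errors(true);   // collect parse errors instead of printing warnings
    $rss = simplexml_load_string($xml);
    if ($rss === false) {
        throw new InvalidArgumentException("feed is not well-formed XML");
    }

    $titles = [];
    foreach ($rss->channel->item as $item) {
        $titles[] = (string) $item->title;
    }

    return [
        'lastBuildDate' => (string) $rss->channel->lastBuildDate,
        'titles'        => $titles,
    ];
}

// Hypothetical helper: titles that appear in summary $b but not in summary $a.
function titles_missing_from(array $a, array $b): array
{
    return array_values(array_diff($b['titles'], $a['titles']));
}
```

With the file written by the cURL code in one string and the source copied from the browser in another, titles_missing_from(summarise_feed($saved), summarise_feed($live)) would list the items the saved copy lacks.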

#3
08-15-2007
junkmate

I get an old set of items... the latest items are not included.
Now I am thinking my cURL function may be grabbing cached versions of
the XML file. Is that possible, and if so, can it be switched off?



#4
08-15-2007
Rik

On Wed, 15 Aug 2007 17:23:05 +0200, junkmate <junkmate@gmail.com> wrote:

(topposting fixed)

> I get an old set of items... the latest items are not included...
> Now I am thinking my cUrl function maybe grabbing cached versions of
> the xml file? is that possible and if so, can it be switched off?


No such problem here, though it might depend on server setup. Are you sure
that what cURL gets is cached data, and that it is not your own output on
the web which is cached? (i.e. your file gets updated, but the browser
still shows the old file)
--
Rik Wasmus
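
One way to approach this question is to look at the saved file on the server directly, which takes the browser out of the picture. The sketch below reports when the file was last written, its size, and a checksum; if these change on every fetch while the page in the browser stays the same, the stale copy is more likely on the browser side. The function name snapshot_file is hypothetical, and "filename.xml" in the usage note is simply the file name from the original post.

```php
<?php
// Hypothetical helper: describe a file on disk so that successive fetches
// can be compared without going through the browser.
function snapshot_file(string $path): array
{
    clearstatcache(true, $path);    // PHP caches stat() results within a request
    return [
        'modified' => date('c', filemtime($path)),  // ISO 8601 timestamp of the last write
        'bytes'    => filesize($path),
        'md5'      => md5_file($path),              // changes whenever the content changes
    ];
}
```

Calling print_r(snapshot_file("filename.xml")) right after the cURL code has run, for example from the command line, shows whether a new copy was actually written.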

#5
08-15-2007
junkmate

No, I have a button which grabs a fresh XML file and writes a fresh
.htm file, which is included every time via AJAX.

I did find this:
curl_setopt($ch, CURLOPT_DNS_CACHE_TIMEOUT, 0);

Since adding that, I get the latest results, which means one of two
things:
1) The cache finally ran out and it refreshed anyway!
2) It's fixed...
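
For context, CURLOPT_DNS_CACHE_TIMEOUT controls how long resolved host names are kept in memory, not whether an HTTP response is served from a cache, so the first explanation may well be the right one. A sketch of request settings that are aimed at HTTP caches instead is shown below: a throwaway query parameter, no-cache request headers, and a fresh connection. The function names and the "nocache" parameter name are invented for illustration, and whether the BBC servers or any proxy in between honour them has not been verified here.

```php
<?php
// Hypothetical helper: append a throwaway query parameter so that caches
// keyed on the full URL treat each request as a different resource.
// Assumes the URL has no #fragment.
function cache_busting_url(string $url, string $token): string
{
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . 'nocache=' . rawurlencode($token);
}

// Hypothetical helper: cURL options that ask for a fresh copy.
function fresh_fetch_options(): array
{
    return [
        CURLOPT_FRESH_CONNECT => true,   // do not reuse an existing connection
        CURLOPT_FORBID_REUSE  => true,   // close the connection when done
        CURLOPT_HTTPHEADER    => [       // ask caches along the way to revalidate
            'Cache-Control: no-cache',
            'Pragma: no-cache',
        ],
        CURLOPT_FAILONERROR   => true,   // make curl_exec() fail on HTTP status >= 400
    ];
}
```

In the original code this would amount to calling curl_init(cache_busting_url($feed, (string) time())) and then curl_setopt_array($ch, fresh_fetch_options()) before the existing CURLOPT_FILE line.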



#6
08-15-2007
junkmate

No, I just tried it on a brand-new feed:
http://newsrss.bbc.co.uk/rss/newsonl...t_page/rss.xml

The second item is different...




#7
08-15-2007
junkmate

OK, something's erratic... I added a date at the top of my parser's
output which shows the lastBuildDate of the XML file being parsed. It
changes as you click on refresh, and it is always different from the one
found in the actual XML source found by clicking the RSS button.

Is it my browser? Is it my page being cached? I don't know. Any ideas?


Here: http://dev.oldsushi.com/joe
The top one, labeled BBC News
(the actual RSS feed can be accessed by clicking the rss button in the
top right)
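
If the browser or the page itself turns out to be the part that is cached, one common remedy is to have the script that serves the generated output send response headers that forbid reuse of a stored copy. The sketch below assumes the output is delivered through a PHP script, which the thread does not confirm (a static .htm file would need the equivalent web server configuration instead); the function names are hypothetical.

```php
<?php
// Hypothetical helper: response headers telling browsers and proxies
// not to reuse a stored copy of the page.
function no_cache_headers(): array
{
    return [
        'Cache-Control: no-store, no-cache, must-revalidate, max-age=0',
        'Pragma: no-cache',   // for HTTP/1.0 caches
        'Expires: 0',         // an invalid date is treated as already expired
    ];
}

// Hypothetical helper: send the headers. It has to run before the
// script produces any output.
function send_no_cache_headers(): void
{
    foreach (no_cache_headers() as $line) {
        header($line);
    }
}
```

Calling send_no_cache_headers() at the very top of the script that the AJAX request hits would be the typical place for it.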



