explode with the foreach not working (beginner)
Dear group,

The function is to be used as follows:

    $links = "http://www.campaignindia.in/feature/analysis";
    $tag1 = '<div class=feature-wrapper>';
    $tag2 = '<h1><a href';
    $tag3 = "</a>";

    $op = doGetTags($links,"START",$tag1,$tag2,$tag3,"END");

Returns: the exploded array of the page at the link.

    <?php

    function doGetTags()
    {
        $expCounter = 0;
        $numargs = func_num_args();
        $toExplode_encoded = array();
        $link = "";
        $condentExploded = "";
        if ($numargs <= 1)
        {
            echo "The argument can not be less that two";
            die();
        }
        $arg_list = func_get_args();
        for($i = 0; $i < $numargs; $i++)
        {
            if($i == 0)
            {
                $link = $arg_list[$i];
                $first_delimiter = $arg_list[++$i];
                for($j = $i; $j < count($arg_list); $j++)
                {
                    if($arg_list[$j] != "START" && $arg_list[$j] != "END")
                    {
                        // the tag to be exploded and stripped of attributes
                        $toExplode_encoded[$expCounter] = htmlentities($arg_list[$j]);
                        $expCounter++;
                    }
                }
            }
        }
        $aContext = array('http' => array('proxy' => 'tcp://192.168.10.131:8020', 'request_fulluri' => True));
        $cxContext = stream_context_create($aContext);
        $fpcont = file_get_contents($link, False, $cxContext);

        foreach($toExplode_encoded as $key => $val)
        {
            $delimiter = '\''.$val.'\'';
            $condentExploded = explode($delimiter,$fpcont);
            print_r($condentExploded);
        }
    }

    $links = "http://www.campaignindia.in/feature/analysis";
    $tag1 = '<div class=feature-wrapper>';
    $tag2 = '<h1><a href';
    $tag3 = "</a>";

    $op = doGetTags($links,"START",$tag1,$tag2,$tag3,"END");

    ?>

When I pass the delimiter to explode() directly, for example

    $condentExploded = explode('<div class=feature-wrapper>', $fpcont);

it works perfectly and the array is exploded. Thanks for any help.
On 21 May, 09:30, sathyashrayan <asm_f...@yahoo.co.uk> wrote:
> [.. original post snipped..]

Have you tried echoing the $delimiter variable just before the

    $condentExploded = explode($delimiter,$fpcont);

statement? (No, of course I know you didn't, otherwise you would have seen what was wrong!)

Try this foreach instead:

    foreach($toExplode_encoded as $key => $val)
    {
        $condentExploded = explode($val,$fpcont);
        print_r($condentExploded);
    }
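What the hint above points at can be sketched as follows. The $page string is a made-up stand-in for the fetched HTML (an assumption; the thread never shows the real page source). Concatenating '\'' on both sides puts two literal quote characters into the delimiter, so explode() searches for text that does not occur on the page.

```php
<?php
// Hypothetical stand-in for the fetched page; not the real site markup.
$page = 'intro<div class=feature-wrapper>first<div class=feature-wrapper>second';

$val = '<div class=feature-wrapper>';

// As in the original loop: the delimiter gains a literal quote at each end.
$quoted = '\'' . $val . '\'';
var_dump($quoted);

// Delimiter not found: explode() returns the whole input as a single element.
$withQuotes = explode($quoted, $page);

// Plain value: the page is split at every occurrence of the tag.
$withoutQuotes = explode($val, $page);

echo count($withQuotes), ' ', count($withoutQuotes), "\n";
```

Quotes in PHP source only mark where a string literal starts and ends; a value already held in a variable needs none.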
On May 21, 2:54 pm, Captain Paralytic <paul_laut...@yahoo.com> wrote:
> [.. quoted text snipped..]

Thanks for your reply. Yes, I did echo $val, and I passed $val directly to explode() as shown above. It does not work. If I pass the literal HTML delimiter to explode() instead of $val, it works. I really don't know why. Is it a pass-by-reference or pass-by-value problem? I am using PHP 5.
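One plausible reason the echoed value looked right (an assumption, since the echoed output was never posted): htmlentities() turns < and > into &lt; and &gt;, and a browser renders those entities back as angle brackets. The sketch below uses var_dump() and a hypothetical $page string to show the difference. It also checks that a variable and a literal behave identically in explode(), so pass-by-value is not the issue.

```php
<?php
// Hypothetical stand-in for the fetched page.
$page = 'a<h1><a href="/x">X</a>b';

$tag     = '<h1><a href';
$encoded = htmlentities($tag);

// In a browser, echo $encoded displays as <h1><a href, hiding the change.
// var_dump() shows the actual characters and the string length.
var_dump($tag, $encoded);

// The encoded form does not occur in the raw HTML, so nothing is split.
$parts = explode($encoded, $page);

// A variable holding the raw tag behaves exactly like the literal.
$same = explode($tag, $page) === explode('<h1><a href', $page);
var_dump(count($parts), $same);
```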
On 21 May, 11:12, sathyashrayan <asm_f...@yahoo.co.uk> wrote:
> [.. quoted text snipped..]

Please show us the result of the echo.
Hi,
You should drop both the htmlentities() call on the value and the quotes you wrap around it. The explode function needs the delimiter to match the text on the page exactly.

If you just want the article links, you can use a simple regexp like so:

    <?php

    class MyScraper
    {
        // Static, because both methods are called without an instance.
        public static function extractLinks($html)
        {
            $ptn = '#<div .*?class="feature-wrapper".*?>.*?' .
                '<a .*?href="(.*?)".*?>\s*<img .*?src="(.*?)".*?>\s*</a>.*?' .
                '<h1>.*?<a .*?href="(.*?)".*?>(.*?)</a>.*?</h1>.*?' .
                '<p>([^<>]*?)</p>.*?<p .*?class="news-date".*?>(.*?)</p>#is';
            $n = preg_match_all($ptn, $html, $matches);
            $articles = array();
            for ($i = 0; $i < $n; $i++) {
                $articles[] = array(
                    'title'   => trim($matches[4][$i]),
                    'url'     => trim($matches[3][$i]),
                    'teaser'  => trim($matches[5][$i]),
                    'time'    => strtotime(trim($matches[6][$i])),
                    'img'     => trim($matches[2][$i]),
                    'imgLink' => trim($matches[1][$i])
                );
            }
            return $articles;
        }

        public static function getArticles($url = 'http://www.campaignindia.in/feature/analysis')
        {
            $html = file_get_contents($url);
            return self::extractLinks($html);
        }
    }

    print_r(MyScraper::getArticles());

    ?>

This will parse the page into:

    Array
    (
        [0] => Array
            (
                [title] => India does a blink and miss at the One Show '08
                [url] => http://www.campaignindia.in/feature/india_does_a_blink_and_miss_at_the_one_show__08
                [teaser] => A Pencil is a Pencil is a Pencil. No argument. Winning a Pencil has...
                [time] => Wed, 21 May 2008 00:00:00 -0500
                [img] => http://www.campaignindia.in/files/images/Oneshow_85.GIF
                [imgLink] => http://www.campaignindia.in/feature/india_does_a_blink_and_miss_at_the_one_show__08
            )

        [1] => Array
            (
                [title] => Metrosexual gobbled by the urban lion?
                [url] => http://www.campaignindia.in/Feature/Metrosexual_gobbled_by_the_urban_lion
                [teaser] => Much has recently been written about this trend of maleness articul...
                [time] => Wed, 07 May 2008 00:00:00 -0500
                [img] => http://www.campaignindia.in/files/images/ShahrukhLUX85x60-copy.gif
                [imgLink] => http://www.campaignindia.in/Feature/Metrosexual_gobbled_by_the_urban_lion
            )

        [2] => Array
            (
                [title] => Exclusive chat with Viacom's Kamat: We'll be No. 1
                [url] => http://www.campaignindia.in/Feature/Exclusive_chat_with_Viacom_Kamat_We_ll_be_No_1
                [teaser] => Rajesh Kamat has a clear mandate: to put Colors at a formidable pos...
                [time] => Wed, 07 May 2008 00:00:00 -0500
                [img] => http://www.campaignindia.in/files/images/Rajesh_Kamat85x60-copy_0.gif
                [imgLink] => http://www.campaignindia.in/Feature/Exclusive_chat_with_Viacom_Kamat_We_ll_be_No_1
            )

        [3] => Array
            (
                [title] => Front row view of the Media Spikes 2008
                [url] => http://www.campaignindia.in/features/Front_View_of_the_Media_Spikes_08
                [teaser] => India’s niggardly tally of four bronzes at the Media Spikes 2...
                [time] => Tue, 29 Apr 2008 00:00:00 -0500
                [img] => http://www.campaignindia.in/files/images/spikes85x60.gif
                [imgLink] => http://www.campaignindia.in/features/Front_View_of_the_Media_Spikes_08
            )

    )

You could also ask the site to add an RSS feed.

Weird, Google Groups now has a captcha to post...

Regards,
John Peters
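For readers who want to keep the explode() approach, the advice about dropping htmlentities() and the added quotes could look roughly like this. It is a sketch under stated assumptions: splitByTags() is a hypothetical name, it takes the HTML as a string instead of fetching it through the proxy, the sample markup is made up, and the variadic signature with type hints needs PHP 7 or later rather than the PHP 5 mentioned in the thread.

```php
<?php
// Sketch only: splitByTags() is a hypothetical name, not from the thread.
// It takes the HTML as a string so it runs without network or proxy access.
function splitByTags(string $html, string ...$tags): array
{
    $result = array();
    foreach ($tags as $tag) {
        // Skip the START/END markers used in the original call.
        if ($tag === 'START' || $tag === 'END') {
            continue;
        }
        // Raw tag as the delimiter: no htmlentities(), no added quotes.
        $result[$tag] = explode($tag, $html);
    }
    return $result;
}

// Made-up sample markup for illustration.
$html = '<div class=feature-wrapper><h1><a href="/a">A</a></h1>'
      . '<div class=feature-wrapper><h1><a href="/b">B</a></h1>';

$op = splitByTags($html, 'START', '<div class=feature-wrapper>', '<h1><a href', '</a>', 'END');
print_r($op);
```

Unlike the original doGetTags(), this version returns the pieces keyed by tag, so $op actually receives a value instead of null.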
On May 21, 4:28 pm, petersprc <peters...@gmail.com> wrote:
> Hi,
>
> You can avoid HTML encoding the value with htmlentities on the value
> and enclosing it in quotes... The explode function will want to match
> the string exactly.
>
> If you just want the article links, you can use a simple regexp like

[.. good reply snipped..]

Thanks a lot to all of you who helped me learn.