Finding a key word in a text file (PHP Language forum, PHP Programming Forums)
Hi all,

I would like to find a word stored in a text file.

Structure:
I have one file named keyWords.txt that stores some key words I'm interested in finding. In addition I also have a file named textOrigin.txt in which I store the text to search in.
I would like my prog to check if a certain word appears in the text and then to tell me what line it found it in (if it did...).

My problem is that the script can't find the words I'm looking for. I took one word from the word list and put it into the text file to be searched; for some reason this word is not found by the prog. I used 'enter' at the end of each line.
The word being used is on line 3 in the keyWords.txt file.

I have some reason to believe that the reason lies here:
if ($pos) { echo " line $i: $storeWord[$n]\n"; }
I also tried it with
if (!$pos === FALSE) {...}
but nothing there either...

Anyone?
Thank you very much for any help!
Dekers

the keyWords.txt file:
-------------------------------
Recording Site Recording Type INTRA SUA MUA LFP Acquisition Type Windowed Digitilized Electrode Type Tetrode Metal Pipette Pipette Charakteristics Tetrode Charakteristics Electrode Tip Length (µm) Electrode Tip OD (µm) Bandwidth in Hz Impegance in MegOhm Number of Penetrations Neurons Encountered Neurones Analysed Spike Amplitude in µVolts Spike Width in msec Number of Pyramidal Number of Interneurons Background Activity Max Modulation (Spikes/Sec) Min Modulation (Spikes/Sec)

The textOrigin.txt file:
-------------------------------
I found the INTRA inside

My code:
*******************

PHP:--------------------------------------------------------------------------------
<?php

$filesource = "keyWords.txt";
$fp = fopen ($filesource, "r");

$storeWord = array();

if ($fp)
{
    $i = 0;
    while (!feof($fp))
    {
        /**********************************
        get the list of words to find and
        store them in an array for later use
        **********************************/
        $line = fgets ($fp, 100);
        $storeWord[$i] = $line;

        $i = $i+1;
    }
    fclose($fp);
}
else echo "File could not be found";

/**********************************
open the text source file, pick each line
and compare it to the complete list of key words
**********************************/
$filesource = "textOrigin.txt";
$fp = fopen ($filesource, "r");

if ($fp)
{
    $i = 1; //this is the line number
    while (!feof($fp))
    {
        $textLine = fgets ($fp, 300);

        /**********************************
        compare all the words stored in the array
        with each line in the origin file
        **********************************/
        for($n=0; $n<=count($storeWord)-1; $n=$n+1)
        {
            $pos = strpos($textLine, $storeWord[$n]);
            if ($pos)
            {
                echo " line $i: $storeWord[$n]\n";
            }
            $i = $i+1;
        }
    }
    fclose($fp);
}
else echo "File could not be found";
?>
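A side note on the `if ($pos)` test the poster suspects. strpos() returns the integer offset of the match, or false when there is no match, so a plain truthiness test treats a match at offset 0 as "not found". Writing `!$pos === FALSE` does not help, because `!` is applied before `===`. In the sample text INTRA is not at the start of the line, so this is probably not what breaks the script here, but it is a latent bug. A minimal sketch of the strict comparison follows; `containsPhrase` is a hypothetical helper name, not something from the original post.

```php
<?php
// Hypothetical helper (not from the original post):
// true when $needle occurs anywhere in $haystack.
function containsPhrase(string $haystack, string $needle): bool
{
    // strpos() returns an int offset or false; compare strictly
    // so that a match at offset 0 still counts as a match.
    return strpos($haystack, $needle) !== false;
}

$line = "INTRA was found here";
$pos  = strpos($line, "INTRA");            // int(0): match at the very start

var_dump((bool) $pos);                     // bool(false): plain if ($pos) misses it
var_dump(!$pos === false);                 // bool(false): parsed as (!$pos) === false
var_dump(containsPhrase($line, "INTRA"));  // bool(true)
```

With this helper the test in the loop would read `if (containsPhrase($textLine, $storeWord[$n]))`.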
Noam Dekers wrote:
>
> Hi all,
> I would like to find a word stored in a text file.
>
> the keyWords.txt file:
> -------------------------------
> Recording Site

[snip]

> The textOrigin.txt file:
> -------------------------------
> I found the INTRA inside
>
> My code:
> *******************
>
> PHP:--------------------------------------------------------------------------------
> <?php
>
> $filesource = "keyWords.txt";
> $fp = fopen ($filesource, "r");
>
> $storeWord = array();
>
> if ($fp)
> { $i = 0;
> while (!feof($fp))
> {
> /**********************************
> get the list of words to find and
> store them in an array for later use
> **********************************/
> $line = fgets ($fp, 100);
> $storeWord[$i] = $line;
>
> $i = $i+1;

The index isn't necessary, just do

$storeWord[] = $line;

The biggest problem is that you are storing lines, not keywords. In the first place, many of these lines are multiple words. Is that what you want? If so, then call them key phrases.

But most importantly, what about the newline at the end? That screws up matching. See the manual:

fgets

(PHP 3, PHP 4 )
fgets -- Gets line from file pointer
Description
string fgets ( resource handle [, int length])

Returns a string of up to length - 1 bytes read from the file pointed to by handle. Reading ends when length - 1 bytes have been read, on a newline (which is included in the return value), or on EOF (whichever comes first). If no length is specified, the length defaults to 1k, or 1024 bytes.

Remove whitespace from the end with chop().

> /**********************************
> open the text source file, pick each line
> and compare it to the complete list of key words
> **********************************/

Again, even with removing the newline you have phrases not words. If you want words, you'll need more processing.

Brian Rodenborn
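The advice above (append with `[]`, strip the line ending that fgets() keeps) could look roughly like the sketch below. It assumes one key phrase per line; `loadPhrases` is an illustrative name, and the temporary file only stands in for keyWords.txt so the example runs on its own.

```php
<?php
// Sketch: load one key phrase per line, dropping the line ending that
// fgets() includes in its return value.
// loadPhrases() is an illustrative name, not part of the original script.
function loadPhrases(string $path): array
{
    $phrases = [];
    $fp = fopen($path, "r");
    if ($fp === false) {
        return $phrases;            // empty list if the file cannot be opened
    }
    while (($line = fgets($fp)) !== false) {
        $line = rtrim($line);       // chop() is an alias of rtrim()
        if ($line !== "") {         // skip blank lines, e.g. a trailing empty one
            $phrases[] = $line;     // [] appends, so no manual index is needed
        }
    }
    fclose($fp);
    return $phrases;
}

// Demo with a temporary file standing in for keyWords.txt.
$tmp = tempnam(sys_get_temp_dir(), "kw");
file_put_contents($tmp, "Recording Site\nRecording Type\nINTRA\n");
$phrases = loadPhrases($tmp);
unlink($tmp);

var_dump($phrases[2] === "INTRA");  // bool(true): no trailing "\n"
```

Looping on the return value of fgets() rather than on feof() also avoids storing a spurious empty entry after the last line.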
Default User <first.last@company.com> wrote in message news:<3F392601.677C01BE@company.com>...
> Noam Dekers wrote:
> >
> > Hi all,
> > I would like to find a word stored in a text file.

[snip]

> > $line = fgets ($fp, 100);
> > $storeWord[$i] = $line;
> >
> > $i = $i+1;
>
> The index isn't necessary, just do
>
> $storeWord[] = $line;

Thank you for this advice - I find the most difficult thing is to make my code efficient. I have only little experience with PHP...

> The biggest problem is that you are storing lines, not keywords. In the
> first place, many of these lines are multiple words. Is that what you
> want? If so, then call them key phrases.

Well, I sometimes want to search for words and sometimes for phrases. There are certain places where a constant set of words is written in the same order and I would like to track it as is.

> But most importantly, what about the newline at the end? That screws up
> matching. See the manual:
>
> fgets
>
> (PHP 3, PHP 4 )
> fgets -- Gets line from file pointer
> Description
> string fgets ( resource handle [, int length])
>
> Returns a string of up to length - 1 bytes read from the file pointed to
> by handle. Reading ends when length - 1 bytes have been read, on a
> newline (which is included in the return value), or on EOF (whichever
> comes first). If no length is specified, the length defaults to 1k, or
> 1024 bytes.
>
> Remove white space from the end with chop().

Well - I must say I didn't understand that one. Why is white space a problem? Don't I have a white space in both cases: keyWord/keyPhrase and on the line of the original text? Or - don't I actually have a 'new line' at the end of each line? - I think I have a new line because I always press enter at the end of each line.

Thank you for any answer,
Dekers.

> > /**********************************
> > open the text source file, pick each line
> > and compare it to the complete list of key words
> > **********************************/
>
> Again, even with removing the newline you have phrases not words. If you
> want words, you'll need more processing.
>
> Brian Rodenborn
Noam Dekers wrote:
> > The biggest problem is that you are storing lines, not keywords. In the
> > first place, many of these lines are multiple words. Is that what you
> > want? If so, then call them key phrases.
>
> Well, I sometimes want to search for words and sometimes for phrases.
> There are certain places where a constant set of words is written in
> the same order and I would like to track it as is.

Ok, but that's not how you have the code written. You are reading in lines from a file, storing them in an array, then searching for these strings within another string.

For instance, your data set has "Electrode Type", but not "Electrode". There's no way for you to search for just that word as you have it coded. Is that what you want? Does the data set cover all substrings? Again, key words is the wrong term. I don't care, I just want to make sure you know what you want.

> > But most importantly, what about the newline at the end? That screws up
> > matching. See the manual:

> > Remove white space from the end with chop().
>
> Well - I must say I didn't understand that one. Why is white space a
> problem? Don't I have a white space in both cases: keyWord/keyPhrase
> and on the line of the original text? Or - don't I actually have a
> 'new line' at the end of each line? - I think I have new line because
> I always press enter in the end of each line.

Whitespace is any of the following: newline, carriage return, space character, tab, some others. In this case, as you are reading from the file with fgets(), the newlines that you use to separate the lines in your file are retained.

You might think the first line read from your file is "Recording Site" but it is really "Recording Site\n" where '\n' is the newline character. So you won't get a match with strpos() or strstr() unless that exact string, newline included, appears in the text.

Use chop() or rtrim() to remove that from the end of each line, along with any spaces that might be there invisibly. Actually, trim() might be best, in case there are any space characters leading the strings.

Brian Rodenborn
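To make the point above concrete, here is a rough sketch of the search loop with the phrases trimmed before comparison. It is not the poster's script: `findPhrases` is a hypothetical name, the inputs are plain arrays rather than files, and the line number is advanced once per text line rather than once per phrase, which is an assumption about what the poster intended.

```php
<?php
// Sketch of the search loop with trimmed phrases.
// findPhrases() is a hypothetical name, not from the thread.
// $phrases:   list of key phrases, possibly still carrying "\n"
// $textLines: list of lines from the text to search
// Returns [lineNumber => [matched phrases]] with 1-based line numbers.
function findPhrases(array $phrases, array $textLines): array
{
    $hits = [];
    foreach ($textLines as $index => $textLine) {
        foreach ($phrases as $phrase) {
            $phrase = trim($phrase);            // drop newline and stray spaces
            if ($phrase === "") {
                continue;                       // skip empty entries
            }
            if (strpos($textLine, $phrase) !== false) {
                $hits[$index + 1][] = $phrase;  // one line number per text line
            }
        }
    }
    return $hits;
}

$phrases   = ["Recording Site\n", "INTRA\n", "Electrode Type\n"];
$textLines = ["nothing here", "I found the INTRA inside"];

var_dump(strpos($textLines[1], "INTRA\n"));   // bool(false): untrimmed phrase never matches
print_r(findPhrases($phrases, $textLines));   // one hit: line 2, phrase INTRA
```

This still matches substrings, so a phrase such as "Electrode Type" is found only where that exact text appears; matching single words would need the extra processing mentioned earlier in the thread.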