detection of a robot in php (PHP Language forum, PHP Programming Forums)
Hello everybody :)
A friend recently showed me something odd while playing with the wget command under Linux. I don't know why it happens, but the result surprised me:

$ wget http://www.prizee.com/parole.php
--02:35:29--  http://www.prizee.com/parole.php
           => `parole.php'
Resolving www.prizee.com... 213.186.63.5
Connecting to www.prizee.com|213.186.63.5|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: /index.php?joueur=1 [following]
--02:35:30--  http://www.prizee.com/index.php?joueur=1
           => `index.php?joueur=1.1'
Connecting to www.prizee.com|213.186.63.5|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

    [ <=>          ] 12,521        --.--K/s

02:35:30 (103.57 KB/s) - `index.php?joueur=1.1' saved [12521]

So he gets an HTTP 302 status code, which redirects him to the site's index page. With a browser such as Firefox, IE or Safari, we get the right page without any redirection.

After that, I ran some tests. I changed wget's user-agent string to identify it as Mozilla, but got the same result (a redirect). I also tried links (a command-line browser) and curl, with the same problem. Here is the output of the curl command:

$ curl -v http://www.prizee.com/parole.php
* About to connect() to www.prizee.com port 80
*   Trying 213.186.63.5... connected
* Connected to www.prizee.com (213.186.63.5) port 80
> GET /parole.php HTTP/1.1
> User-Agent: curl/7.15.1 (i686-pc-linux-gnu) libcurl/7.15.1 GnuTLS/1.2.10 zlib/1.2.3 libidn/0.5.15
> Host: www.prizee.com
> Accept: */*
>
< HTTP/1.1 302 Found
< Date: Wed, 09 Aug 2006 00:02:57 GMT
< Server: Apache/1.3.33 (Unix) PHP/4.3.10
< X-Powered-By: PHP/4.3.10
< X-Accelerated-By: PHPA/1.3.3r2
< Expires: Mon, 26 Jul 1997 05:00:00 GMT
< Last-Modified: Wed, 09 Aug 2006 00:02:59 GMT
< Cache-Control: no-cache, must-revalidate
< Pragma: no-cache
< Set-Cookie: COOKIEis_accepted=1; path=/; domain=.prizee.com
< Location: /index.php?joueur=1
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: text/html
* Closing connection #0

So my question is: how can a web site detect the use of a command-line tool, like the site above does?

Thank you for your answers.
Sorry for my bad English, I'm French ;)
giminik@gmail.com wrote:
> [...]
> So my question is: how can a web site detect the use of a
> command-line tool, like the site above does?

I tried both Firefox and Konqueror, and both were redirected to the second page, so as far as I can see there is no difference between using wget and using a graphical browser.

You can't detect the use of a command-line tool if it sets the user agent to match a browser's. For example,

wget --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

followed by the URL will tell the website you're using IE on Windows XP.

--
Chris Hope | www.electrictoolbox.com | www.linuxcdmall.com
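To illustrate why the User-Agent header alone cannot settle this, here is a minimal Python sketch of a naive server-side check. The function name `looks_like_cli_tool` and the token list are hypothetical, not anything prizee.com is known to use; in PHP the same string would come from `$_SERVER['HTTP_USER_AGENT']`.

```python
# Hypothetical list of substrings that command-line clients put in User-Agent.
CLI_TOKENS = ("wget", "curl", "links", "lynx")


def looks_like_cli_tool(user_agent: str) -> bool:
    """Naive check: True if the User-Agent names a known command-line client."""
    ua = user_agent.lower()
    return any(token in ua for token in CLI_TOKENS)


# The User-Agent curl sent in the trace from the original post.
curl_ua = ("curl/7.15.1 (i686-pc-linux-gnu) libcurl/7.15.1 "
           "GnuTLS/1.2.10 zlib/1.2.3 libidn/0.5.15")
# The spoofed value from the wget example above.
spoofed_ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

print(looks_like_cli_tool(curl_ua))     # True: default curl is caught
print(looks_like_cli_tool(spoofed_ua))  # False: the spoofed client passes as IE
```

The check only sees what the client chooses to send, so any tool that copies a browser's string is indistinguishable from that browser at this level.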
giminik@gmail.com wrote:
> [...]
> So my question is: how can a web site detect the use of a
> command-line tool, like the site above does?

It probably just redirects Linux users, and not Windows ones, by checking the User-Agent string.

Flamer.
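The curl trace in the original post shows a `Set-Cookie: COOKIEis_accepted=1` header on the 302 response, so another possibility, and this is only an assumption about how the site works, is that the redirect depends on a cookie rather than on the client type. A minimal Python sketch of that logic follows; the function `respond` and its return shape are hypothetical, and in PHP the cookie would be read from `$_COOKIE`.

```python
from typing import Dict, Optional, Tuple

COOKIE_NAME = "COOKIEis_accepted"  # cookie name taken from the curl trace
INDEX_URL = "/index.php?joueur=1"  # redirect target taken from the curl trace


def respond(path: str, cookies: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a request.

    Assumed behaviour, not confirmed by the site: a client that does not
    send the cookie back is redirected to the index page, while a client
    that does send it gets the requested page.
    """
    if path != "/index.php" and cookies.get(COOKIE_NAME) != "1":
        return 302, INDEX_URL
    return 200, None


# A first request from wget or curl carries no cookie.
print(respond("/parole.php", {}))
# A browser that already holds the cookie from an earlier visit.
print(respond("/parole.php", {COOKIE_NAME: "1"}))
```

Under this assumption a fresh browser session would be redirected too, which matches what was reported with Firefox and Konqueror above. A command-line tool can send the cookie as well, for example with curl's `--cookie` option or wget's `--header` option.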