Validating directory and file path

#1  02-20-2005  Ted G
Validating directory and file path

Hello,

I'm new to PHP and would like to ask:
what is the common way to check whether a
directory path (e.g. in a URL) and the requested
file are in the proper format?

E.g. if I gave my homepage URL
in the format /usr/software/index.php
it would be OK in my case, but e.g.
/usr//software/index.php
would of course be wrong.

I tried parse_url() and similar functions, but
they accept almost anything (at least in my PHP version)
and do not report any errors.

Could a regular expression solve this?
I haven't used them much, so can anyone say
how to test and notice that /usr//software/index.php
is not in the proper format?

Thanks,


#2  02-20-2005  Alvaro G. Vicario
Re: Validating directory and file path

*** Ted G wrote (Sun, 20 Feb 2005 14:30:59 +0200):
> E.g. if I would give my homepage URL
> in format /usr/software/index.php
> it would be ok in my case but e.g.
> /usr//software/index.php
> would of course be wrong.


I can't figure out what you're trying to do but if you are talking about
file system paths then realpath() can be used to return canonical absolute
paths or FALSE if file does not exist.
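A minimal sketch of the realpath() suggestion. The helper name canonical_path() is made up for illustration. Note that realpath() only works on file system paths that actually exist, and it silently collapses "//" instead of reporting it, so it canonicalizes rather than validates:

```php
<?php
// Hypothetical helper: returns the canonical absolute path,
// or null when the path does not exist on this file system.
function canonical_path(string $path): ?string
{
    $resolved = realpath($path);   // false when the path cannot be resolved
    return $resolved === false ? null : $resolved;
}
```

For an existing directory, canonical_path($dir . '//.') gives the same result as canonical_path($dir).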



--
-+ Álvaro G. Vicario - Burgos, Spain
+- http://www.demogracia.com (the humor site, varnished to withstand the weather)
-+ Send your questions to the group, not to my mailbox
--
#3  02-20-2005  Ted G
Re: Validating directory and file path

Alvaro G. Vicario wrote:

> *** Ted G wrote (Sun, 20 Feb 2005 14:30:59 +0200):
>
>>E.g. if I would give my homepage URL
>>in format /usr/software/index.php
>>it would be ok in my case but e.g.
>>/usr//software/index.php
>>would of course be wrong.

>
>
> I can't figure out what you're trying to do but if you are talking about
> file system paths then realpath() can be used to return canonical absolute
> paths or FALSE if file does not exist.
>
>

OK, let me clarify.

In a web application the user has a chance to give his/her homepage
address. So, I should validate that he/she gives it in the
proper format.
An example of an error was: /usr//homepages/index.php

=> those // characters

There might also be other typing errors when you give your URL.

So, the question was: what is the easiest way in PHP to
check that a URL or its path part (e.g. /usr/homepages/index.php) is
written as the syntax requires?

Br

#4  02-20-2005  Alvaro G. Vicario
Re: Validating directory and file path

*** Ted G wrote (Sun, 20 Feb 2005 15:07:00 +0200):
> As an example of error was: /usr//homepages/index.php
>
> => those // characters
>
> There might be also other typing errors when you give your URL.
>
> So, the question was that what is the easiest way in PHP to
> check that URL or it's path part (e.g. /usr/homepages/index.php) is
> written as syntax requires it?


I don't think it's illegal to have // in the path part of a URL. Perhaps
the best approach is to open a socket, send a HEAD request
and check the returned status code. That way you can actually check whether
the page exists and is up.
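
The socket approach could look roughly like the sketch below. All three function names are made up for illustration, plain HTTP on port 80 is assumed, and redirects, HTTPS and error reporting are left out:

```php
<?php
// Build a minimal HTTP/1.1 HEAD request for the given host and path.
function build_head_request(string $host, string $path): string
{
    return "HEAD {$path} HTTP/1.1\r\nHost: {$host}\r\nConnection: close\r\n\r\n";
}

// Extract the numeric status code from a status line such as "HTTP/1.1 200 OK".
function parse_status_code(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Open a socket, send the HEAD request and return the status code,
// or null when the connection or the read fails.
function head_status(string $host, string $path = '/', int $port = 80, float $timeout = 5.0): ?int
{
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return null;
    }
    stream_set_timeout($fp, (int) $timeout);
    fwrite($fp, build_head_request($host, $path));
    $statusLine = fgets($fp);      // the first response line carries the status
    fclose($fp);
    return $statusLine === false ? null : parse_status_code($statusLine);
}
```

A call such as head_status('www.example.com', '/index.php') needs network access; treating only 2xx codes as "exists" is a decision left to the caller.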


#5  02-20-2005  Ted G
Re: Validating directory and file path

Alvaro G. Vicario wrote:

> *** Ted G wrote (Sun, 20 Feb 2005 15:07:00 +0200):
>
>>As an example of error was: /usr//homepages/index.php
>>
>>=> those // characters
>>
>>There might be also other typing errors when you give your URL.
>>
>>So, the question was that what is the easiest way in PHP to
>>check that URL or it's path part (e.g. /usr/homepages/index.php) is
>>written as syntax requires it?

>
>
> I don't think it's illegal to have // in the path part of an URL. Perhaps
> the best approach you can use is opening a socket to send a HEAD request
> and check the returned status code. That way you can actually check if the
> page exists and is up.
>


What I have usually done is use JavaScript on the browser side and
Java's features on the server side to validate data.

Unfortunately I'm not a pro in regexps or PHP either, so that's
why I ask these questions ;)

Yep, // is not valid in a URL string (path).

OK, I can check with a string operation whether there are // or other illegal
characters in the URL string and then use checkdnsrr() to check whether
the host part is a living/valid host.
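
A rough sketch of that two-step check. Both helper names are made up for illustration, and whether "//" should be rejected at all is an application policy rather than a URL syntax rule:

```php
<?php
// Hypothetical helper: true if the path contains an empty segment ("//").
function has_empty_segment(string $path): bool
{
    return strpos($path, '//') !== false;
}

// Hypothetical helper: true if DNS has an A or AAAA record for the host.
// Needs network access; a DNS record does not prove a web server is running.
function host_resolves(string $host): bool
{
    return checkdnsrr($host, 'A') || checkdnsrr($host, 'AAAA');
}
```

has_empty_segment('/usr//homepages/index.php') is true and has_empty_segment('/usr/homepages/index.php') is false. Only the path part should be passed in, because the "//" after the scheme in a full URL would match as well.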

But I think there is also a more elegant way to do that...

Br,

#6  02-20-2005  Chung Leong
Re: Validating directory and file path


"Ted G" <tg@not.valid.mail> wrote in message
news:37ripjF5h05hoU1@individual.net...
> Alvaro G. Vicario wrote:
> Yep, // is not valid in URL string (path).
>
> Ok, I can check as a string operation if thera are // or other unlegal
> characters in URL string and then use checkdnsrr-method to check if
> the host part is a living/valid host.


That's not true. Having // in the path part of a URL does not make it
syntactically incorrect. It's at the discretion of the server to interpret
what the path means. If it chooses to, the server can correct for the
obvious typo.


#7  02-20-2005  Daniel Tryba
Re: Validating directory and file path

Chung Leong <chernyshevsky@hotmail.com> wrote:
>> Yep, // is not valid in URL string (path).
>>
>> Ok, I can check as a string operation if thera are // or other unlegal
>> characters in URL string and then use checkdnsrr-method to check if
>> the host part is a living/valid host.

>
> That's not true. Having // in the path part of URL does not make it
> syntatically incorrect. It's at the discretion of the server to interpret
> what the path means. If it chooses to, the server can correct for the
> obvious typo.


The URI RFC (2396) says otherwise; servers correcting this do so at their
own peril:

3.3. Path Component

The path component contains data, specific to the authority (or the
scheme if there is no authority component), identifying the resource
within the scope of that scheme and authority.

   path          = [ abs_path | opaque_part ]

   path_segments = segment *( "/" segment )
   segment       = *pchar *( ";" param )
   param         = *pchar

   pchar         = unreserved | escaped |
                   ":" | "@" | "&" | "=" | "+" | "$" | ","

The path may consist of a sequence of path segments separated by a
single slash "/" character. Within a path segment, the characters
"/", ";", "=", and "?" are reserved. Each path segment may include a
sequence of parameters, indicated by the semicolon ";" character.
The parameters are not significant to the parsing of relative
references.

#8  02-20-2005  Andy Hassall
Re: Validating directory and file path

On 20 Feb 2005 19:28:42 GMT, Daniel Tryba <spam@tryba.invalid> wrote:

>Chung Leong <chernyshevsky@hotmail.com> wrote:
>>> Yep, // is not valid in URL string (path).

>>
>> That's not true. Having // in the path part of URL does not make it
>> syntatically incorrect.

>
>URI RFC (2396) says otherwise, servers correcting this do that at their
>own peril:
>
>3.3. Path Component
>
> The path component contains data, specific to the authority (or the
> scheme if there is no authority component), identifying the resource
> within the scope of that scheme and authority.
>
> path = [ abs_path | opaque_part ]
>
> path_segments = segment *( "/" segment )
> segment = *pchar *( ";" param )
> param = *pchar
>
> pchar = unreserved | escaped |
> ":" | "@" | "&" | "=" | "+" | "$" | ","
>
> The path may consist of a sequence of path segments separated by a
> single slash "/" character. Within a path segment, the characters
> "/", ";", "=", and "?" are reserved. Each path segment may include a
> sequence of parameters, indicated by the semicolon ";" character.
> The parameters are not significant to the parsing of relative
> references.


Under section 1.6, the definition of the BNF-like grammar, it's got:

"elements may be preceded with <n>* to designate n or more repetitions of the
following element; n defaults to 0."

Segment's declared as:

segment = *pchar *( ";" param )

Doesn't that imply that a segment may be the empty string, consisting of zero
repetitions of pchar and zero repetitions of ( ";" param ), so "//" is a valid
production of segment *( "/" segment )? Or am I reading it wrong?
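
One way to test that reading is to transliterate the quoted rules into a regular expression. This is only a sketch: the function name is made up, and the definitions of unreserved and escaped are taken from elsewhere in RFC 2396, not from the quoted section:

```php
<?php
// Hypothetical helper: does $path match abs_path = "/" path_segments
// from RFC 2396, built from the quoted grammar?
function is_rfc2396_abs_path(string $path): bool
{
    // pchar = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
    $pchar = '(?:[A-Za-z0-9\-_.!~*\'():@&=+$,]|%[0-9A-Fa-f]{2})';
    // segment = *pchar *( ";" param ), param = *pchar
    $segment = $pchar . '*(?:;' . $pchar . '*)*';
    // path_segments = segment *( "/" segment )
    $absPath = '#^/' . $segment . '(?:/' . $segment . ')*$#D';
    return preg_match($absPath, $path) === 1;
}
```

With this pattern both '/usr/software/index.php' and '/usr//software/index.php' match, because *pchar allows a segment to be empty, while a path containing a raw space does not.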

--
Andy Hassall / <andy@andyh.co.uk> / <http://www.andyh.co.uk>
<http://www.andyhsoftware.co.uk/space> Space: disk usage analysis tool
#9  02-20-2005  John Dunlop
parse_url

Ted G wrote:

> I tried those parse_url() etc methods but
> they eat at least everything (or my php-version)
> and does not understand any errors.


Should you pass something other than a URI to parse_url, it
only 'tries its best'. It does not validate the string.

http://www.php.net/manual/en/function.parse-url.php

Sadly, parse_url does not parse all URIs properly. Take
<http://host.invalid?query>, for example, which parse_url
thinks contains a host <host.invalid?query>. It doesn't; the
host is <host.invalid>, the path is empty though still
defined, and the query is <query>.

Instead, you can make use of the regular expression given in
RFC3986. Change

`^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?`

to

`^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?`

which separates a URI, giving you the scheme name, the
authority (the host of an HTTP URI) if present, the path,
the query if present, and the fragment identifier if
present. It always finds the scheme name and path, which
are always defined for every URI.
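
A sketch of how the second expression might be used from PHP. The function name split_uri() and the array keys are made up for illustration; the pattern is the one given above, wrapped in "~" delimiters so that "/" and "#" need no escaping. It only splits a string into components and does not validate them:

```php
<?php
// Hypothetical helper: split a URI reference into its five components.
// Components that are absent come back as null; a component that is
// present but empty comes back as ''.
function split_uri(string $uri): ?array
{
    $pattern = '~^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?~';
    if (preg_match($pattern, $uri, $m, PREG_UNMATCHED_AS_NULL) !== 1) {
        return null;
    }
    return [
        'scheme'    => $m[1] ?? null,
        'authority' => $m[2] ?? null,
        'path'      => $m[3] ?? null,
        'query'     => $m[4] ?? null,
        'fragment'  => $m[5] ?? null,
    ];
}
```

For 'http://host.invalid?query' this gives the scheme 'http', the authority 'host.invalid', an empty path, the query 'query' and no fragment. PREG_UNMATCHED_AS_NULL requires PHP 7.2 or later.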

--
Jock
#10  02-20-2005  John Dunlop
Re: Validating directory and file path

Daniel Tryba wrote:

> Chung Leong <chernyshevsky@hotmail.com> wrote:


> > That's not true. Having // in the path part of URL does not make it
> > syntatically incorrect. It's at the discretion of the server to interpret
> > what the path means. If it chooses to, the server can correct for the
> > obvious typo.

>
> URI RFC (2396) says otherwise, servers correcting this do that at their
> own peril:


(Note that 2396 was obsoleted by 3986 over a month ago. The
additions and modifications are listed in an appendix.)

> 3.3. Path Component


[ ... ]

Sorry, but I don't see anything in section 3.3 which
contradicts phpSt.Chung. He's right, as usual.

<http://host.invalid/path> and <http://host.invalid//path>
are both syntactically correct; in other words, they conform
to the rules 'URI' in RFC3986 and 'http_URL' in 2616. What
each one identifies, however, as Chung said, depends on the
server.

--
Jock