Validating directory and file path (PHP Language forum, PHP Programming Forums)
Hello,

I'm new to PHP and would like to ask: what is the common way to check that a directory path (e.g. in a URL) and the requested file are in a proper format? For example, if I gave my homepage URL as /usr/software/index.php, that would be fine in my case, but /usr//software/index.php would of course be wrong.

I tried parse_url() and similar functions, but they accept just about anything (at least in my PHP version) and do not report any errors. Could a regular expression solve this? I haven't used them much, so can anyone say how to test for and detect that /usr//software/index.php is not in proper format?

Thanks,
*** Ted G wrote (Sun, 20 Feb 2005 14:30:59 +0200):
> E.g. if I would give my homepage URL
> in format /usr/software/index.php
> it would be ok in my case but e.g.
> /usr//software/index.php
> would of course be wrong.

I can't figure out what you're trying to do, but if you are talking about file system paths then realpath() can be used to return canonical absolute paths, or FALSE if the file does not exist.

--
-+ Álvaro G. Vicario - Burgos, Spain
+- http://www.demogracia.com (the humor website weatherproofed for the outdoors)
-+ Send your questions to the group, not to my mailbox
--
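For reference, a minimal sketch of the realpath() behaviour described above. The helper name `canonicalPath` is made up for illustration, and the temp directory is used only because it is likely to exist on any system. Note that realpath() works on the local file system, so it says nothing about URLs.

```php
<?php
// Hypothetical helper: canonical form of an existing path, or null if it does not exist.
function canonicalPath(string $path): ?string
{
    $resolved = realpath($path);   // realpath() returns false when the path cannot be resolved
    return $resolved === false ? null : $resolved;
}

$dir = sys_get_temp_dir();

// A doubled separator and a "." segment are both collapsed by realpath().
$messy = $dir . DIRECTORY_SEPARATOR . DIRECTORY_SEPARATOR . '.';
var_dump(canonicalPath($messy) === canonicalPath($dir));

// A path that (presumably) does not exist yields null.
var_dump(canonicalPath($dir . DIRECTORY_SEPARATOR . 'no-such-entry-' . uniqid()));
```

Because the doubled slash is silently normalised, this tells you whether a file exists, not whether the path was typed "cleanly".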
Alvaro G. Vicario wrote:
> *** Ted G wrote (Sun, 20 Feb 2005 14:30:59 +0200):
>
>> E.g. if I would give my homepage URL
>> in format /usr/software/index.php
>> it would be ok in my case but e.g.
>> /usr//software/index.php
>> would of course be wrong.
>
> I can't figure out what you're trying to do but if you are talking about
> file system paths then realpath() can be used to return canonical absolute
> paths or FALSE if file does not exist.

OK, let me clarify. In a web application, the user has a chance to give his or her homepage address, so I should validate that it is given in a proper format.

An example of an error was: /usr//homepages/index.php

=> those // characters

There might also be other typing errors when you enter a URL.

So the question was: what is the easiest way in PHP to check that a URL, or its path part (e.g. /usr/homepages/index.php), is written as the syntax requires?

Br
*** Ted G wrote (Sun, 20 Feb 2005 15:07:00 +0200):
> As an example of error was: /usr//homepages/index.php
>
> => those // characters
>
> There might be also other typing errors when you give your URL.
>
> So, the question was that what is the easiest way in PHP to
> check that URL or its path part (e.g. /usr/homepages/index.php) is
> written as syntax requires it?

I don't think it's illegal to have // in the path part of a URL. Perhaps the best approach is to open a socket, send a HEAD request, and check the returned status code. That way you can actually check whether the page exists and is up.

--
-+ Álvaro G. Vicario - Burgos, Spain
--
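A rough sketch of the HEAD-request idea, assuming plain HTTP on port 80. The function names `parseStatusCode` and `headStatus` are invented for this example, and HTTPS, redirects, and proper error reporting are deliberately left out.

```php
<?php
// Extract the numeric status code from a status line such as "HTTP/1.1 200 OK".
function parseStatusCode(string $statusLine): ?int
{
    if (preg_match('~^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b~', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Send a HEAD request over a plain socket; return the status code, or null on failure.
function headStatus(string $host, string $path = '/', int $port = 80, float $timeout = 5.0): ?int
{
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return null;               // could not connect (unreachable host, DNS failure, ...)
    }
    stream_set_timeout($socket, (int) ceil($timeout));
    fwrite($socket, "HEAD $path HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n");
    $statusLine = fgets($socket);  // the first response line carries the status code
    fclose($socket);

    return $statusLine === false ? null : parseStatusCode($statusLine);
}

// Usage (needs network access; the host name is a placeholder):
// var_dump(headStatus('www.example.com', '/'));
```

This checks that the page answers, which is a different question from whether the URL is syntactically valid; a server is free to answer 200 for a path containing //.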
Alvaro G. Vicario wrote:
> *** Ted G wrote (Sun, 20 Feb 2005 15:07:00 +0200):
>
>> As an example of error was: /usr//homepages/index.php
>>
>> => those // characters
>>
>> There might be also other typing errors when you give your URL.
>>
>> So, the question was that what is the easiest way in PHP to
>> check that URL or its path part (e.g. /usr/homepages/index.php) is
>> written as syntax requires it?
>
> I don't think it's illegal to have // in the path part of an URL. Perhaps
> the best approach you can use is opening a socket to send a HEAD request
> and check the returned status code. That way you can actually check if the
> page exists and is up.

What I have usually done is use JavaScript on the browser side and Java's features on the server side to validate data. Unfortunately I'm not a pro in regular expressions or PHP either, which is why I ask these questions ;)

Yep, // is not valid in a URL string (path).

OK, I can check with a string operation whether there are // or other illegal characters in the URL string, and then use checkdnsrr() to check whether the host part is a living/valid host.

But I think there is also a more elegant way to do that...

Br,
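The two checks mentioned above could look roughly like this. The helper names and the example URL are placeholders, and rejecting // is treated here purely as an application policy, not as a statement about URL syntax.

```php
<?php
// True when the path contains an empty segment, i.e. two slashes in a row.
function hasDoubleSlash(string $path): bool
{
    return strpos($path, '//') !== false;
}

// True when DNS has an A or AAAA record for the host (requires network access).
function hostResolves(string $host): bool
{
    return checkdnsrr($host, 'A') || checkdnsrr($host, 'AAAA');
}

$url  = 'http://www.example.com/usr//homepages/index.php'; // placeholder URL
$path = (string) parse_url($url, PHP_URL_PATH);            // only the path part, not the "//" after the scheme
var_dump(hasDoubleSlash($path));
```

Running the check on the path component rather than the whole URL matters, since every absolute URL contains // right after the scheme.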
"Ted G" <tg@not.valid.mail> wrote in message news:37ripjF5h05hoU1@individual.net...
> Alvaro G. Vicario wrote:
> Yep, // is not valid in URL string (path).
>
> Ok, I can check as a string operation if there are // or other illegal
> characters in URL string and then use checkdnsrr-method to check if
> the host part is a living/valid host.

That's not true. Having // in the path part of a URL does not make it syntactically incorrect. It's at the discretion of the server to interpret what the path means. If it chooses to, the server can correct for the obvious typo.
Chung Leong <chernyshevsky@hotmail.com> wrote:
>> Yep, // is not valid in URL string (path).
>>
>> Ok, I can check as a string operation if there are // or other illegal
>> characters in URL string and then use checkdnsrr-method to check if
>> the host part is a living/valid host.
>
> That's not true. Having // in the path part of URL does not make it
> syntactically incorrect. It's at the discretion of the server to interpret
> what the path means. If it chooses to, the server can correct for the
> obvious typo.

The URI RFC (2396) says otherwise; servers that correct this do so at their own peril:

3.3. Path Component

   The path component contains data, specific to the authority (or the
   scheme if there is no authority component), identifying the resource
   within the scope of that scheme and authority.

      path          = [ abs_path | opaque_part ]

      path_segments = segment *( "/" segment )
      segment       = *pchar *( ";" param )
      param         = *pchar

      pchar         = unreserved | escaped |
                      ":" | "@" | "&" | "=" | "+" | "$" | ","

   The path may consist of a sequence of path segments separated by a
   single slash "/" character. Within a path segment, the characters
   "/", ";", "=", and "?" are reserved. Each path segment may include a
   sequence of parameters, indicated by the semicolon ";" character.
   The parameters are not significant to the parsing of relative
   references.
On 20 Feb 2005 19:28:42 GMT, Daniel Tryba <spam@tryba.invalid> wrote:
>Chung Leong <chernyshevsky@hotmail.com> wrote:
>>> Yep, // is not valid in URL string (path).
>>
>> That's not true. Having // in the path part of URL does not make it
>> syntactically incorrect.
>
>URI RFC (2396) says otherwise, servers correcting this do that at their
>own peril:
>
>3.3. Path Component
>
>   The path component contains data, specific to the authority (or the
>   scheme if there is no authority component), identifying the resource
>   within the scope of that scheme and authority.
>
>      path          = [ abs_path | opaque_part ]
>
>      path_segments = segment *( "/" segment )
>      segment       = *pchar *( ";" param )
>      param         = *pchar
>
>      pchar         = unreserved | escaped |
>                      ":" | "@" | "&" | "=" | "+" | "$" | ","
>
>   The path may consist of a sequence of path segments separated by a
>   single slash "/" character. Within a path segment, the characters
>   "/", ";", "=", and "?" are reserved. Each path segment may include a
>   sequence of parameters, indicated by the semicolon ";" character.
>   The parameters are not significant to the parsing of relative
>   references.

Under section 1.6, the definition of the BNF-like grammar, it's got:

"elements may be preceded with <n>* to designate n or more repetitions of the following element; n defaults to 0."

Segment is declared as:

   segment = *pchar *( ";" param )

Doesn't that imply that a segment may be the empty string, consisting of zero repetitions of pchar and zero repetitions of ( ";" param ), so that "//" is a valid production of segment *( "/" segment )?

Or am I reading it wrong?

--
Andy Hassall / <andy@andyh.co.uk> / <http://www.andyh.co.uk>
<http://www.andyhsoftware.co.uk/space> Space: disk usage analysis tool
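One way to test this reading is to transcribe the quoted productions into a regular expression and see what it accepts. The sketch below is my own transcription, not something from the RFC: it assumes the RFC 2396 definitions of `unreserved` (alphanumerics plus the marks `- _ . ! ~ * ' ( )`) and `escaped` (`%` followed by two hex digits), and the function names are invented.

```php
<?php
// Build a PCRE fragment for RFC 2396 abs_path = "/" path_segments.
function rfc2396AbsPathPattern(): string
{
    // pchar = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
    $pchar   = '(?:[A-Za-z0-9\-_.!~*\'():@&=+$,]|%[0-9A-Fa-f]{2})';
    // segment = *pchar *( ";" param ), with param = *pchar
    $segment = $pchar . '*(?:;' . $pchar . '*)*';
    // path_segments = segment *( "/" segment )
    return '/' . $segment . '(?:/' . $segment . ')*';
}

// True when the whole string is a production of abs_path under this transcription.
function isAbsPath(string $path): bool
{
    return preg_match('#^' . rfc2396AbsPathPattern() . '\z#', $path) === 1;
}

var_dump(isAbsPath('/usr//software/index.php')); // empty segment between the two slashes
var_dump(isAbsPath('/usr/soft ware/index.php')); // a raw space is not a pchar
```

Since `*pchar` allows zero repetitions, the pattern accepts an empty segment, which is consistent with the reading that // is a valid production.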
Ted G wrote:
> I tried those parse_url() etc methods but
> they eat at least everything (or my php-version)
> and does not understand any errors.

Should you pass something other than a URI to parse_url, it only 'tries its best'. It does not validate the string.

http://www.php.net/manual/en/function.parse-url.php

Sadly, parse_url does not parse all URIs properly. Take <http://host.invalid?query>, for example, which parse_url thinks contains a host <host.invalid?query>. It doesn't; the host is <host.invalid>, the path is empty though still defined, and the query is <query>.

Instead, you can make use of the regular expression given in RFC3986. Change

`^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?`

to

`^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?`

which separates a URI, giving you the scheme name, the authority (the host of an HTTP URI) if present, the path, the query if present, and the fragment identifier if present. It always finds the scheme name and path, which are always defined for every URI.

--
Jock
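As an illustration, the modified expression can be dropped into preg_match() more or less as-is. The function name `splitUri` and the array keys are my own choice, and the `PREG_UNMATCHED_AS_NULL` flag assumes PHP 7.2 or later. Note that parse_url() in current PHP versions may not behave as described for the PHP of 2005.

```php
<?php
// Split a URI reference into its five components using the RFC 3986 Appendix B
// expression with non-capturing groups. Components that are absent come back as null.
function splitUri(string $uri): array
{
    $pattern = '~^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?~s';
    preg_match($pattern, $uri, $m, PREG_UNMATCHED_AS_NULL);

    return [
        'scheme'    => $m[1] ?? null,
        'authority' => $m[2] ?? null,
        'path'      => $m[3] ?? '',   // the path always matches, possibly as ''
        'query'     => $m[4] ?? null,
        'fragment'  => $m[5] ?? null,
    ];
}

print_r(splitUri('http://host.invalid?query'));
```

The expression only splits; it does not validate. Every string matches it, so a separate check of each component is still needed if the goal is to reject malformed input.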
Daniel Tryba wrote:
> Chung Leong <chernyshevsky@hotmail.com> wrote:
> > That's not true. Having // in the path part of URL does not make it
> > syntactically incorrect. It's at the discretion of the server to interpret
> > what the path means. If it chooses to, the server can correct for the
> > obvious typo.
>
> URI RFC (2396) says otherwise, servers correcting this do that at their
> own peril:

(Note that 2396 was obsoleted by 3986 over a month ago. The additions and modifications are listed in an appendix.)

> 3.3. Path Component
[ ... ]

Sorry, but I don't see anything in section 3.3 which contradicts phpSt.Chung. He's right, as usual.

<http://host.invalid/path> and <http://host.invalid//path> are both syntactically correct; in other words, they conform to the rules 'URI' in RFC3986 and 'http_URL' in 2616. What each one identifies, however, as Chung said, depends on the server.

--
Jock