How to stop search engines listing two URLs

Chris, 10-19-2004

Hello,
I'm really worried, so I wondered if anyone could help.

My site outgrew itself recently, so we've had to make changes to the URL
structure.
I have some important URLs like this: www.mysite.com/bluewidgets/. Now,
with the expansion of the site and the URL structure change (it had to be
done), we also have URLs like www.mysite.com/country1/bluewidgets/, which
serve up content identical to the first URL.

Is this bad? There is no way around it, because if I dump my old URLs (I have
kept 50 important ones) I will have to get around 6,000 webmasters to change
my link URL on their pages, which I don't want to have to do.
My programmer says it won't be a problem with Google etc., but I'm worried. I
rely on this site for my income.
Is it possible to stop Google from crawling and, most importantly, listing the
50 new URLs in the new format, so it sticks with the old ones? Everything
is done with PHP/mod_rewrite rules, so it's not simple for me to know.
Or is it possible to have 50 redirects from the new URLs to the old ones?
Will that stop Google listing both?

How do I get round this? The problem is, because the site is very much
database driven, I have no way of making it use the old-format URLs for the
50 in question. I hope this all makes sense. Thanks for any help,

Chris


Centurion, 10-20-2004


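Put a robots.txt file in your web server's root dir and tell all user agents
NOT to crawl /country1, /country2, etc. The original robots.txt convention
has no wildcards in paths (some crawlers support them as an extension, but
not all), so DON'T rely on something like "Disallow: /country*". List each
directory on its own Disallow line instead.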
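
As a rough sketch of the above (not a drop-in file): the snippet below builds a robots.txt with one Disallow line per directory and then checks it with Python's standard urllib.robotparser. The directory names in COUNTRY_DIRS and the helper names build_robots_txt and is_allowed are made up for illustration; swap in your real country prefixes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical directory names; replace with the real country prefixes.
COUNTRY_DIRS = ["country1", "country2"]


def build_robots_txt(dirs):
    """Return robots.txt text with one Disallow line per directory."""
    lines = ["User-agent: *"]
    # The trailing slash keeps the prefix match from also catching
    # unrelated paths such as /country10/.
    lines += [f"Disallow: /{d}/" for d in dirs]
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, url, agent="*"):
    """Check a URL against robots.txt text using the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots_txt = build_robots_txt(COUNTRY_DIRS)
print(robots_txt)
print(is_allowed(robots_txt, "http://www.mysite.com/bluewidgets/"))
print(is_allowed(robots_txt, "http://www.mysite.com/country1/bluewidgets/"))
```

The old-format URL should come back as allowed and the /country1/ one as blocked. Keep in mind this only tests the rules as the Python parser reads them; individual crawlers may interpret edge cases differently.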

I currently tell robots/spiders to leave a bunch of virtual directories
alone and it works well. By "virtual" I mean they don't really exist in
the file system; they are URLs that get rewritten with Apache rewrite rules.
E.g., http://www.mysite.eg/gallery/foo doesn't exist, but the rewrite
rules get the correct files from the right place in the file system
(<webroot>/content/users/foo/gallery). I've got "Disallow: /gallery/foo"
in my robots.txt file and Google/MSN/Yahoo etc. all honour that.
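
To make the "virtual directory" idea concrete, here is a small Python sketch of the kind of mapping a rewrite rule performs. It is not the actual Apache configuration; the pattern GALLERY_RULE and the function resolve_virtual_path are assumptions for illustration, based only on the /gallery/foo example above.

```python
import re
from typing import Optional

# Hypothetical rule: /gallery/<user> maps to content/users/<user>/gallery
GALLERY_RULE = re.compile(r"^/gallery/([A-Za-z0-9_-]+)/?$")


def resolve_virtual_path(url_path: str) -> Optional[str]:
    """Map a virtual URL path to a path relative to the web root.

    Returns None when the URL does not match the gallery rule.
    """
    match = GALLERY_RULE.match(url_path)
    if match is None:
        return None
    return f"content/users/{match.group(1)}/gallery"


print(resolve_virtual_path("/gallery/foo"))
print(resolve_virtual_path("/somewhere/else"))
```

The point is that robots.txt only ever sees the public URL path (/gallery/foo), never the rewritten location, so the Disallow line has to name the virtual path.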

There's heaps of info online and tools to verify robots.txt files online -
just google it ;)

Cheers,

James
--
"In short, _N is Richardian if, and only if, _N is not Richardian."
