This is a discussion on mysql request with regexs within the MySQL Database forums, part of the Database Forums category; Hi, I've got a table that contains list of url. I want a list of unique (de-duplicated) domain ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
Hi,
I've got a table that contains list of url. I want a list of unique (de-duplicated) domain name with a mysql query. => so I want the mysql request to return only domain and not full url Table is like this : name: list columns: - url (varchar 255) - id (int) row example : url: http://groups.google.fr/group/comp.databases.mysql/post id: 1 I've tried this : SELECT DISTINCT(wl.url) as domain FROM list wl WHERE wl.url REGEXP "^(.*)/.*$" GROUP BY wl.url LIMIT 0, 100 But in this example, I don't know how to get the value within the parenthesis (in the regexp) that I would like to insert into the distinct function => I'm searching something like : SELECT DISTINCT($1) Is there another way to select what I want ? Thanks for your help |
|
|||
|
If I recall correctly, REGEXP in MySQL is pretty weak. It will only
return 1 on match, 0 on no match.. it won't do replaces, won't return matching segments, etc. In situations where RegEx were needed, I always found myself writing a PHP script to handle the RegEx portion. |
|
|||
|
ernond_paul@yahoo.fr wrote:
> Hi, > > I've got a table that contains list of url. > I want a list of unique (de-duplicated) domain name with a mysql > query. => so I want the mysql request to return only domain and not > full url > > Table is like this : > name: list > columns: > - url (varchar 255) > - id (int) > > row example : > url: http://groups.google.fr/group/comp.databases.mysql/post > id: 1 > > I've tried this : > SELECT DISTINCT(wl.url) as domain FROM list wl WHERE wl.url REGEXP > "^(.*)/.*$" GROUP BY wl.url LIMIT 0, 100 > > But in this example, I don't know how to get the value within the > parenthesis (in the regexp) that I would like to insert into the > distinct function => I'm searching something like : > SELECT DISTINCT($1) > > Is there another way to select what I want ? > Thanks for your help Well, asuming that your urls all begin http://. you could do something like: SELECT DISTINCT LEFT(wl.url,LOCATE('/',CONCAT(wl.url,'/'),8)-1) as domain FROM list wl WHERE wl.url REGEXP "^(.*)/.*$" GROUP BY wl.url LIMIT 0, 100 |
![]() |
| Thread Tools | |
| Display Modes | |
|
|