Bluehost.com Web Hosting $6.95

MySQL search with result scoring / relevance

This is a discussion on MySQL search with result scoring / relevance within the MySQL Database forums, part of the Database Forums category; Hi, Suppose i have the following sql table: my_table (col_1, col_2, col_3, col_4) and the folowing user input for the ...


Go Back   Usenet Forums > Database Forums > MySQL Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 03-26-2007
mareeus@gmail.com
 
Posts: n/a
Default MySQL search with result scoring / relevance

Hi,

Suppose i have the following sql table:
my_table (col_1, col_2, col_3, col_4)

and the folowing user input for the search:
col_1:value_1
col_2:value_2
value_3

The above input means the folowing: search col_1 for value_1, col_2
for value_2 and value_3 in all columns.

Besides this i need some scoring (this is used for sorting the results
by their relevance).
A match in col_1 will be 4 points, col_2 will be 3 points, col_3 will
be 2 poits and the last one 1 point.

All the results will be sorted by computing this score.
For example: matching in col_1 for value_1 (+4), matching in col_2
(+3) and col_4 (+1) for value_3 will mean a +8 score for a row.

I need some ideas and if it is possible some web links that might help
me or even code.
I'm also open to any suggestions that can offer a solution to my
problem.

Regards,
Marius.

Reply With Quote
  #2 (permalink)  
Old 03-27-2007
AlterEgo
 
Posts: n/a
Default Re: MySQL search with result scoring / relevance

Marius,

I think something like this is what you want:

DROP TABLE IF EXISTS `test`.`test`;
CREATE TABLE `test`.`test` (
`TestID` int(11) NOT NULL auto_increment,
`Col1` varchar(255) default NULL,
`Col2` varchar(255) default NULL,
`Col3` varchar(255) default NULL,
`Col4` varchar(255) default NULL,
PRIMARY KEY (`TestID`)
) ENGINE=InnoDB ;

insert Test (col1, col2, col3, col4) select 'mystring', '', '', '';
insert Test (col1, col2, col3, col4) select '', 'mystring', '', '';
insert Test (col1, col2, col3, col4) select '', '', 'mystring', '';
insert Test (col1, col2, col3, col4) select '', '', '', 'mystring';
insert Test (col1, col2, col3, col4) select 'mystring', 'mystring',
'mystring', 'mystring';

select
Col1
, Col2
, Col3
, Col4
, case when ifnull(Col1,'') = 'mystring' then 4 else 0 end /* now lets
add them all up */
+ case when ifnull(Col2,'') = 'mystring' then 3 else 0 end
+ case when ifnull(Col3,'') = 'mystring' then 2 else 0 end
+ case when ifnull(Col4,'') = 'mystring' then 1 else 0 end
MyRank
from Test

-- Bill


<mareeus@gmail.com> wrote in message
news:1174945884.429520.117870@n59g2000hsh.googlegr oups.com...
> Hi,
>
> Suppose i have the following sql table:
> my_table (col_1, col_2, col_3, col_4)
>
> and the folowing user input for the search:
> col_1:value_1
> col_2:value_2
> value_3
>
> The above input means the folowing: search col_1 for value_1, col_2
> for value_2 and value_3 in all columns.
>
> Besides this i need some scoring (this is used for sorting the results
> by their relevance).
> A match in col_1 will be 4 points, col_2 will be 3 points, col_3 will
> be 2 poits and the last one 1 point.
>
> All the results will be sorted by computing this score.
> For example: matching in col_1 for value_1 (+4), matching in col_2
> (+3) and col_4 (+1) for value_3 will mean a +8 score for a row.
>
> I need some ideas and if it is possible some web links that might help
> me or even code.
> I'm also open to any suggestions that can offer a solution to my
> problem.
>
> Regards,
> Marius.
>



Reply With Quote
  #3 (permalink)  
Old 03-27-2007
mareeus@gmail.com
 
Posts: n/a
Default Re: MySQL search with result scoring / relevance

Thanks a lot Bill.

Your example works very fine.

select
Col1
, Col2
, Col3
, Col4
, case when ifnull(Col1,'') = 'mystring' then 4 else 0 end /* now
lets
add them all up */
+ case when ifnull(Col2,'') = 'mystring' then 3 else 0 end
+ case when ifnull(Col3,'') = 'mystring' then 2 else 0 end
+ case when ifnull(Col4,'') = 'mystring' then 1 else 0 end
MyRank
from Test


Now i have another ouestion. Suppose the folowing user input:
col_1:value_1 or (col_2:value_2 and col_3:value_3) or col_4:value_4

I need to adapt your example to work with "()", "AND", "OR", and in
the case of one condition not meeting the boolean expression to
exclude that row from my selection.

Reply With Quote
  #4 (permalink)  
Old 03-28-2007
mareeus@gmail.com
 
Posts: n/a
Default Re: MySQL search with result scoring / relevance

I've solved the 'or' and 'and' problem but now I'm facing a new one (a
shameful syntax error).
This is my query:

select *
, case when `col1` LIKE '%val1%' then 1 else 0 end Token1
, case when `col2` LIKE '%val2%' then 2 else 0 end Token2
, case when Token1 * Token2 > 0 then Token1 * Token2 else 0 end score
from `table` HAVING score > 0 order by score DESC

I get the following error:
#1054 - Unknown column 'Token0' in 'field list'

How can i make this work keeping Token1 and Token2 and using them to
compute the score and order results by score

Reply With Quote
  #5 (permalink)  
Old 03-28-2007
Tigger
 
Posts: n/a
Default Re: MySQL search with result scoring / relevance

On Mar 28, 8:22 am, "mare...@gmail.com" <mare...@gmail.com> wrote:
> I've solved the 'or' and 'and' problem but now I'm facing a new one (a
> shameful syntax error).
> This is my query:
>
> select *
> , case when `col1` LIKE '%val1%' then 1 else 0 end Token1
> , case when `col2` LIKE '%val2%' then 2 else 0 end Token2
> , case when Token1 * Token2 > 0 then Token1 * Token2 else 0 end score
> from `table` HAVING score > 0 order by score DESC
>
> I get the following error:
> #1054 - Unknown column 'Token0' in 'field list'
>
> How can i make this work keeping Token1 and Token2 and using them to
> compute the score and order results by score


My last post attempt seemed to fail...lets try again...

You cant reference aliases from within the field list. You have to re-
state the formulas again.

Heres my possible solution to it all...

SELECT
col1,
col2,
col3,
col4,
((`col1` LIKE '%val1%') * 4) +
((`col2` LIKE '%val2%') * 3) +
((`col3` LIKE '%val3%') * 2) +
((`col4` LIKE '%val4%') * 1) score,
(`col1` LIKE '%val1%') OR
((`col2` LIKE '%val2%') AND
(`col3` LIKE '%val3%') ) OR
(`col4` LIKE '%val4%') booleanMatch
FROM test
HAVING booleanMatch = 1 AND score > 0 order by score DESC

Reply With Quote
  #6 (permalink)  
Old 03-28-2007
mareeus@gmail.com
 
Posts: n/a
Default Re: MySQL search with result scoring / relevance

Thanks Tigger,

Your solution is almost perfect but imagine a complex search string
with 10 or 20 search values. Computing both score and booleanmatch
won't be very optimized. They both mean almost the same thing. That's
why i need some temporary values like Token1 and Token2 to store the
values once computed and reduce the query complexity a little. So if
anyone could help me to rewrite my query keeping Token1 and Token2,
would be great.

select *
, case when `col1` LIKE '%val1%' then 1 else 0 end Token1
, case when `col2` LIKE '%val2%' then 2 else 0 end Token2
, case when Token1 * Token2 > 0 then Token1 * Token2 else 0 end score
from `table` HAVING score > 0 order by score DESC

#1054 - Unknown column 'Token0' in 'field list'

Regards,
Marius.

Reply With Quote
  #7 (permalink)  
Old 03-29-2007
Tigger
 
Posts: n/a
Default Re: MySQL search with result scoring / relevance

On Mar 28, 11:44 pm, "mare...@gmail.com" <mare...@gmail.com> wrote:
> Thanks Tigger,
>
> Your solution is almost perfect but imagine a complex search string
> with 10 or 20 search values. Computing both score and booleanmatch
> won't be very optimized. They both mean almost the same thing. That's
> why i need some temporary values like Token1 and Token2 to store the
> values once computed and reduce the query complexity a little. So if
> anyone could help me to rewrite my query keeping Token1 and Token2,
> would be great.
>
> select *
> , case when `col1` LIKE '%val1%' then 1 else 0 end Token1
> , case when `col2` LIKE '%val2%' then 2 else 0 end Token2
> , case when Token1 * Token2 > 0 then Token1 * Token2 else 0 end score
> from `table` HAVING score > 0 order by score DESC
>
> #1054 - Unknown column 'Token0' in 'field list'
>
> Regards,
> Marius.



You could play around with the order of things. Like

SELECT
col1,
col2,
col3,
col4
FROM test
WHERE
(`col1` LIKE '%val1%') OR
((`col2` LIKE '%val2%') AND
(`col3` LIKE '%val3%') ) OR
(`col4` LIKE '%val4%')
ORDER BY (((`col1` LIKE '%val1%') * 4) +
((`col2` LIKE '%val2%') * 3) +
((`col3` LIKE '%val3%') * 2) +
((`col4` LIKE '%val4%') * 1)) DESC

This would limit the 2nd calculations to only happen for rows that
match the boolean expression.

Another thing to look at is doing Full-Text Searches using the MATCH
AGAINST syntax...
http://dev.mysql.com/doc/refman/5.0/...xt-search.html

This can provide cleverer results for your searches, and I remember
reading once that repeating the same MATCH statement in the field and
in a where clause will only cause the one calculation, so you don't
have to worry about the same efficiency issue.

Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On




All times are GMT +1. The time now is 06:54 AM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0