keyword search methodology

This is a discussion on keyword search methodology within the MySQL Database forums, part of the Database Forums category; Hello, I'd like some input on the following: I have a table on which I will need to perform ...


Go Back   Usenet Forums > Database Forums > MySQL Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 03-06-2008
henribaeyens
 
Posts: n/a
Default keyword search methodology

Hello,

I'd like some input on the following:

I have a table on which I will need to perform keyword searches. To make
it short, the search is performed on three fields:

TABLE QandR
question which is varchar
answer which is text
keywords which is varchar

All data is in French and I need to account for accent insensitive
searches. So far I'm using the LIKE %kw% form but I'm considering using
fulltext. So, my idea is this: create a separate table (QandR_search)
that will hold two fields, an integer id, and a text. So each time the
QandR table is updated, by feeding it a question, an answer, and
keywords, I shall take those three, strip them from accents, concatenate
the whole thing into one big string that I'll insert into the
QandR_search table. When a keyword is searched for, it is first de-
accented and then fulltext searched against the text. When there's a
match, I use the id to point to the QandR table and then echo out the
original question along with it's answer. Does this sound like a viable
way of doing accent-insensitive fulltext searches?

Thank you
Reply With Quote
  #2 (permalink)  
Old 03-06-2008
Captain Paralytic
 
Posts: n/a
Default Re: keyword search methodology

On 6 Mar, 12:05, henribaeyens <cont...@myname.com> wrote:
> Hello,
>
> I'd like some input on the following:
>
> I have a table on which I will need to perform keyword searches. To make
> it short, the search is performed on three fields:
>
> TABLE QandR
> question which is varchar
> answer which is text
> keywords which is varchar
>
> All data is in French and I need to account for accent insensitive
> searches. So far I'm using the LIKE %kw% form but I'm considering using
> fulltext. So, my idea is this: create a separate table (QandR_search)
> that will hold two fields, an integer id, and a text. So each time the
> QandR table is updated, by feeding it a question, an answer, and
> keywords, I shall take those three, strip them from accents, concatenate
> the whole thing into one big string that I'll insert into the
> QandR_search table. When a keyword is searched for, it is first de-
> accented and then fulltext searched against the text. When there's a
> match, I use the id to point to the QandR table and then echo out the
> original question along with it's answer. Does this sound like a viable
> way of doing accent-insensitive fulltext searches?
>
> Thank you


Of course before going through all this design work you did your
google search for
mysql fulltext accent
and clicked I'm Feeling Lucky.
I'm therefore wondering why you ignored the advice therein?
Reply With Quote
  #3 (permalink)  
Old 03-06-2008
henribaeyens
 
Posts: n/a
Default Re: keyword search methodology

On Thu, 06 Mar 2008 04:26:00 -0800, Captain Paralytic wrote:

> On 6 Mar, 12:05, henribaeyens <cont...@myname.com> wrote:
>> Hello,
>>
>> I'd like some input on the following:
>>
>> I have a table on which I will need to perform keyword searches. To
>> make it short, the search is performed on three fields:
>>
>> TABLE QandR
>> question which is varchar
>> answer which is text
>> keywords which is varchar
>>
>> All data is in French and I need to account for accent insensitive
>> searches. So far I'm using the LIKE %kw% form but I'm considering using
>> fulltext. So, my idea is this: create a separate table (QandR_search)
>> that will hold two fields, an integer id, and a text. So each time the
>> QandR table is updated, by feeding it a question, an answer, and
>> keywords, I shall take those three, strip them from accents,
>> concatenate the whole thing into one big string that I'll insert into
>> the QandR_search table. When a keyword is searched for, it is first de-
>> accented and then fulltext searched against the text. When there's a
>> match, I use the id to point to the QandR table and then echo out the
>> original question along with it's answer. Does this sound like a viable
>> way of doing accent-insensitive fulltext searches?
>>
>> Thank you

>
> Of course before going through all this design work you did your google
> search for
> mysql fulltext accent
> and clicked I'm Feeling Lucky.
> I'm therefore wondering why you ignored the advice therein?


No I didn't. This page describes exactly what I intend to do, however.
What advice did I ignore? That maybe I should think about Ruby's Ferret
instead of relying on mysql fulltext? I know about Ferret and Sphinx but
these are not what I'm looking for at the moment.

Reply With Quote
  #4 (permalink)  
Old 03-06-2008
Captain Paralytic
 
Posts: n/a
Default Re: keyword search methodology

On 6 Mar, 12:59, henribaeyens <cont...@myname.com> wrote:
> On Thu, 06 Mar 2008 04:26:00 -0800, Captain Paralytic wrote:
> > On 6 Mar, 12:05, henribaeyens <cont...@myname.com> wrote:
> >> Hello,

>
> >> I'd like some input on the following:

>
> >> I have a table on which I will need to perform keyword searches. To
> >> make it short, the search is performed on three fields:

>
> >> TABLE QandR
> >> question which is varchar
> >> answer which is text
> >> keywords which is varchar

>
> >> All data is in French and I need to account for accent insensitive
> >> searches. So far I'm using the LIKE %kw% form but I'm considering using
> >> fulltext. So, my idea is this: create a separate table (QandR_search)
> >> that will hold two fields, an integer id, and a text. So each time the
> >> QandR table is updated, by feeding it a question, an answer, and
> >> keywords, I shall take those three, strip them from accents,
> >> concatenate the whole thing into one big string that I'll insert into
> >> the QandR_search table. When a keyword is searched for, it is first de-
> >> accented and then fulltext searched against the text. When there's a
> >> match, I use the id to point to the QandR table and then echo out the
> >> original question along with it's answer. Does this sound like a viable
> >> way of doing accent-insensitive fulltext searches?

>
> >> Thank you

>
> > Of course before going through all this design work you did your google
> > search for
> > mysql fulltext accent
> > and clicked I'm Feeling Lucky.
> > I'm therefore wondering why you ignored the advice therein?

>
> No I didn't. This page describes exactly what I intend to do, however.
> What advice did I ignore?

That FULLTEXT is perfectly capable of being "accent insensitive"
without you having to jump through any of the hoops you were
designing.

Reply With Quote
  #5 (permalink)  
Old 03-07-2008
henribaeyens
 
Posts: n/a
Default Re: keyword search methodology

On Thu, 06 Mar 2008 06:03:53 -0800, Captain Paralytic wrote:

> On 6 Mar, 12:59, henribaeyens <cont...@myname.com> wrote:
>> On Thu, 06 Mar 2008 04:26:00 -0800, Captain Paralytic wrote:
>> > On 6 Mar, 12:05, henribaeyens <cont...@myname.com> wrote:
>> >> Hello,

>>
>> >> I'd like some input on the following:

>>
>> >> I have a table on which I will need to perform keyword searches. To
>> >> make it short, the search is performed on three fields:

>>
>> >> TABLE QandR
>> >> question which is varchar
>> >> answer which is text
>> >> keywords which is varchar

>>
>> >> All data is in French and I need to account for accent insensitive
>> >> searches. So far I'm using the LIKE %kw% form but I'm considering
>> >> using fulltext. So, my idea is this: create a separate table
>> >> (QandR_search) that will hold two fields, an integer id, and a text.
>> >> So each time the QandR table is updated, by feeding it a question,
>> >> an answer, and keywords, I shall take those three, strip them from
>> >> accents, concatenate the whole thing into one big string that I'll
>> >> insert into the QandR_search table. When a keyword is searched for,
>> >> it is first de- accented and then fulltext searched against the
>> >> text. When there's a match, I use the id to point to the QandR table
>> >> and then echo out the original question along with it's answer. Does
>> >> this sound like a viable way of doing accent-insensitive fulltext
>> >> searches?

>>
>> >> Thank you

>>
>> > Of course before going through all this design work you did your
>> > google search for
>> > mysql fulltext accent
>> > and clicked I'm Feeling Lucky.
>> > I'm therefore wondering why you ignored the advice therein?

>>
>> No I didn't. This page describes exactly what I intend to do, however.
>> What advice did I ignore?

> That FULLTEXT is perfectly capable of being "accent insensitive" without
> you having to jump through any of the hoops you were designing.


well, I wonder how 'coz I'm using a collation which is supposed to be
both case insensitive and accent insensitive (latin1_swedish_ci) and it
don't do it.
Reply With Quote
  #6 (permalink)  
Old 03-08-2008
Tigger
 
Posts: n/a
Default Re: keyword search methodology

"henribaeyens" <contact@myname.com> wrote in message
news:47cfddf3$0$897$ba4acef3@news.orange.fr...
> Hello,
>
> I'd like some input on the following:
>
> I have a table on which I will need to perform keyword searches. To make
> it short, the search is performed on three fields:
>
> TABLE QandR
> question which is varchar
> answer which is text
> keywords which is varchar
>
> All data is in French and I need to account for accent insensitive
> searches. So far I'm using the LIKE %kw% form but I'm considering using
> fulltext. So, my idea is this: create a separate table (QandR_search)
> that will hold two fields, an integer id, and a text. So each time the
> QandR table is updated, by feeding it a question, an answer, and
> keywords, I shall take those three, strip them from accents, concatenate
> the whole thing into one big string that I'll insert into the
> QandR_search table. When a keyword is searched for, it is first de-
> accented and then fulltext searched against the text. When there's a
> match, I use the id to point to the QandR table and then echo out the
> original question along with it's answer. Does this sound like a viable
> way of doing accent-insensitive fulltext searches?
>
> Thank you


I do something similar to cope with hyphenated words. e.g. "first-last"

My searchable string would contain "first last firstlast" to cover all
bases.

It works well. Especially with one client who has complex hyphenated product
codes which he can now search on.

Another bonus is that you only have to index and search a single field,
which should be faster.

Tigger


Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT +1. The time now is 04:37 AM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0