Bluehost.com Web Hosting $6.95

Why Is This Query A "Select_full_join"?

This is a discussion on Why Is This Query A "Select_full_join"? within the MySQL Database forums, part of the Database Forums category; Hi there While getting some undesired results from a query, I looked a little closer at it and was surprised ...


Go Back   Usenet Forums > Database Forums > MySQL Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 11-24-2006
Good Man
 
Posts: n/a
Default Why Is This Query A "Select_full_join"?

Hi there

While getting some undesired results from a query, I looked a little
closer at it and was surprised when my EXPLAIN statement told me that my
query was looking through all the rows in a particular table, and no
keys were being used on this table - "possible_keys" and "key" are both
NULL for this table in the EXPLAIN statement. Sure enough, the
"Select_full_join" variable in MySQL Administrator increments by one
every time I use the query.

I can't figure out WHY this is happening, as I have an index on the
table that should be used.... here is my query:

SELECT
ue.EmailID,ue.ToAddress,ue.Subject,ue.Message,ue.S avedDate,ue.EmailKey,
f.FileKey,f.FileName
FROM UserEmails ue
LEFT JOIN UserEmailAttachments ua ON ue.EmailKey=ua.EmailKey
LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey
WHERE ue.AdminID=1 AND ((ue.SavedDate > 1164344436) AND (ue.SentDate=0))

Basically, what I want to be doing here is grabbing a users 'email
message' in the database. LEFT JOINS are used on the
UserEmailAttachments and ProjFiles tables because an 'email' may or may
not have any files attached to it.

The problem table in the EXPLAIN statement is ProjFiles. It has 7803
rows so far, and they're all being looked at, even though the FileKey
column (which is the one being checked in the query) is indexed:

CREATE TABLE `ProjFiles` (
`FileID` int(10) unsigned NOT NULL auto_increment,
`FolderID` int(10) unsigned NOT NULL default '0',
`ClientID` int(10) unsigned NOT NULL default '0',
`FileSetID` mediumint(8) unsigned NOT NULL default '0',
`FileType` varchar(255) NOT NULL default '',
`FileSize` varchar(20) NOT NULL default '0',
`VersionNum` varchar(10) default NULL,
`Note` text,
`AddedBy` int(10) unsigned default NULL,
`AddedByUser` int(10) unsigned default NULL,
`DateAdded` varchar(100) default 'May 15th, 2005',
`DateStamp` bigint(20) unsigned NOT NULL default '0',
`FileName` varchar(250) NOT NULL default '',
`FilePath` text NOT NULL,
`FileKey` varchar(30) NOT NULL default '',
PRIMARY KEY (`FileID`),
KEY `CliDex` (`ClientID`),
KEY `FoldDex` (`FolderID`),
KEY `SetDex` (`FileSetID`),
KEY `KeyDex` (`FileKey`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Any tips? Thanks.


Reply With Quote
  #2 (permalink)  
Old 11-24-2006
onedbguru@yahoo.com
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?


Good Man wrote:
> Hi there
>
> While getting some undesired results from a query, I looked a little
> closer at it and was surprised when my EXPLAIN statement told me that my
> query was looking through all the rows in a particular table, and no
> keys were being used on this table - "possible_keys" and "key" are both
> NULL for this table in the EXPLAIN statement. Sure enough, the
> "Select_full_join" variable in MySQL Administrator increments by one
> every time I use the query.
>
> I can't figure out WHY this is happening, as I have an index on the
> table that should be used.... here is my query:
>
> SELECT
> ue.EmailID,ue.ToAddress,ue.Subject,ue.Message,ue.S avedDate,ue.EmailKey,
> f.FileKey,f.FileName
> FROM UserEmails ue
> LEFT JOIN UserEmailAttachments ua ON ue.EmailKey=ua.EmailKey
> LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey
> WHERE ue.AdminID=1 AND ((ue.SavedDate > 1164344436) AND (ue.SentDate=0))
>
> Basically, what I want to be doing here is grabbing a users 'email
> message' in the database. LEFT JOINS are used on the
> UserEmailAttachments and ProjFiles tables because an 'email' may or may
> not have any files attached to it.
>
> The problem table in the EXPLAIN statement is ProjFiles. It has 7803
> rows so far, and they're all being looked at, even though the FileKey
> column (which is the one being checked in the query) is indexed:
>
> CREATE TABLE `ProjFiles` (
> `FileID` int(10) unsigned NOT NULL auto_increment,
> `FolderID` int(10) unsigned NOT NULL default '0',
> `ClientID` int(10) unsigned NOT NULL default '0',
> `FileSetID` mediumint(8) unsigned NOT NULL default '0',
> `FileType` varchar(255) NOT NULL default '',
> `FileSize` varchar(20) NOT NULL default '0',
> `VersionNum` varchar(10) default NULL,
> `Note` text,
> `AddedBy` int(10) unsigned default NULL,
> `AddedByUser` int(10) unsigned default NULL,
> `DateAdded` varchar(100) default 'May 15th, 2005',
> `DateStamp` bigint(20) unsigned NOT NULL default '0',
> `FileName` varchar(250) NOT NULL default '',
> `FilePath` text NOT NULL,
> `FileKey` varchar(30) NOT NULL default '',
> PRIMARY KEY (`FileID`),
> KEY `CliDex` (`ClientID`),
> KEY `FoldDex` (`FolderID`),
> KEY `SetDex` (`FileSetID`),
> KEY `KeyDex` (`FileKey`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
>
> Any tips? Thanks.



because you do not have an index on AdminID, SaveDate and SentDate.

create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`,
`SentDate`).....

Reply With Quote
  #3 (permalink)  
Old 11-24-2006
Good Man
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?

onedbguru@yahoo.com wrote in news:1164392562.354785.112340
@f16g2000cwb.googlegroups.com:

> because you do not have an index on AdminID, SaveDate and SentDate.
>
> create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`,
> `SentDate`).....



That doesn't help at all, and I can't see why it would considering that the
table causing problems isn't UserEmails but ProjFiles.... the ProjFiles
table is the one being scanned.


Reply With Quote
  #4 (permalink)  
Old 11-25-2006
Michael Austin
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?

Good Man wrote:

> onedbguru@yahoo.com wrote in news:1164392562.354785.112340
> @f16g2000cwb.googlegroups.com:
>
>
>>because you do not have an index on AdminID, SaveDate and SentDate.
>>
>>create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`,
>>`SentDate`).....

>
>
>
> That doesn't help at all, and I can't see why it would considering that the
> table causing problems isn't UserEmails but ProjFiles.... the ProjFiles
> table is the one being scanned.
>
>


sorry, overlooked that in your description...

What in your query makes some rows in ProjFiles more selectable? Since you did
not exclude or include any ProjFiles values in your where clause, of course it
needs to scan the whole table.


--
Michael Austin.
Database Consultant
Reply With Quote
  #5 (permalink)  
Old 11-25-2006
Paul Lautman
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?

Michael Austin wrote:
> Good Man wrote:
>
>> onedbguru@yahoo.com wrote in news:1164392562.354785.112340
>> @f16g2000cwb.googlegroups.com:
>>
>>
>>> because you do not have an index on AdminID, SaveDate and SentDate.
>>>
>>> create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`,
>>> `SentDate`).....

>>
>>
>>
>> That doesn't help at all, and I can't see why it would considering
>> that the table causing problems isn't UserEmails but ProjFiles....
>> the ProjFiles table is the one being scanned.
>>
>>

>
> sorry, overlooked that in your description...
>
> What in your query makes some rows in ProjFiles more selectable?
> Since you did not exclude or include any ProjFiles values in your
> where clause, of course it needs to scan the whole table.


I think you are mising his point Michael. He is doing a JOIN to that table.
So the only records that need to be fetched from ProjFiles are those that
have f.FileKey = ua.FileKey. So he is expecting the index to be used to
locate the indovidual f.FileKey values, rather than having to scan the table
for them.


Reply With Quote
  #6 (permalink)  
Old 11-25-2006
Michael Austin
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?

Paul Lautman wrote:
> Michael Austin wrote:
>
>>Good Man wrote:
>>
>>
>>>onedbguru@yahoo.com wrote in news:1164392562.354785.112340
>>>@f16g2000cwb.googlegroups.com:
>>>
>>>
>>>
>>>>because you do not have an index on AdminID, SaveDate and SentDate.
>>>>
>>>>create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`,
>>>>`SentDate`).....
>>>
>>>
>>>
>>>That doesn't help at all, and I can't see why it would considering
>>>that the table causing problems isn't UserEmails but ProjFiles....
>>>the ProjFiles table is the one being scanned.
>>>
>>>

>>
>>sorry, overlooked that in your description...
>>
>>What in your query makes some rows in ProjFiles more selectable?
>>Since you did not exclude or include any ProjFiles values in your
>>where clause, of course it needs to scan the whole table.

>
>
> I think you are mising his point Michael. He is doing a JOIN to that table.
> So the only records that need to be fetched from ProjFiles are those that
> have f.FileKey = ua.FileKey. So he is expecting the index to be used to
> locate the indovidual f.FileKey values, rather than having to scan the table
> for them.
>
>



Optimizers are a funny thing. They don't always do what we think they should do.
Maybe the cardinality of the selectivity is such that the optimizer determines
it can get the data faster by doing a FTS.

and doesn't he really want f.FileKey = ua.FileKey (as you stated), so, isn't
this backwards?

"LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey"

this, to me, says look at all records in f first, then match with ua. and since
there is no where-clause for f, it does a FTS. Order really does matter.

--
Michael Austin.
Database Consultant
Reply With Quote
  #7 (permalink)  
Old 11-25-2006
Paul Lautman
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?

Michael Austin wrote:
> and doesn't he really want f.FileKey = ua.FileKey (as you stated),
> so, isn't this backwards?
>
> "LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey"
>
> this, to me, says look at all records in f first, then match with ua.
> and since there is no where-clause for f, it does a FTS. Order really
> does matter.


Have you tried an experiment to prove this statement?
The order of operands around the equals statement should not matter. The
JOIN condition is what is stating that one should take all the records in ua
and lookup the equivalent row in f.
if the optimiser is sensitive to a=5 vs 5=a then it needs serious looking
at.


Reply With Quote
  #8 (permalink)  
Old 11-26-2006
Good Man
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?

"Paul Lautman" <paul.lautman@btinternet.com> wrote in
news:4srceaF10pcc8U1@mid.individual.net:

> Michael Austin wrote:
>> and doesn't he really want f.FileKey = ua.FileKey (as you stated),
>> so, isn't this backwards?
>>
>> "LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey"
>>
>> this, to me, says look at all records in f first, then match with ua.
>> and since there is no where-clause for f, it does a FTS. Order really
>> does matter.

>
> Have you tried an experiment to prove this statement?
> The order of operands around the equals statement should not matter.
> The JOIN condition is what is stating that one should take all the
> records in ua and lookup the equivalent row in f.


Hello, I appreciate all the comments here.

I tried swapping the columns on either side of the '=', but alas there
was no difference.

I have read about the optimizer sometimes choosing to ignore indexes (or
using an index you wouldn't expect it to use), so I'm *almost* ready to
accept that as an explanation, but seeing anything other than a '0' in
the "Select_full_join" MySQL status variable makes me feel soiled in
some way - I want to take a 14-hour shower!

The thing is, the ProjFiles table (f) is one of the largest in my
database; the 7000 entries I have in there now will surely be much
larger with real life usage.

Would using subqueries make a difference? I've never really explored
them; JOINS have been sufficient for me so far, so I'm still a little
baffled why MySQL isn't using the f.FileKey index....

Can my query be written in any other way? A comment above about
f.FileKey not being in the WHERE clause is the way it has to be!



Thanks
Reply With Quote
  #9 (permalink)  
Old 11-28-2006
Paul Lautman
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?

Good Man wrote:
> "Paul Lautman" <paul.lautman@btinternet.com> wrote in
> news:4srceaF10pcc8U1@mid.individual.net:
>
> Can my query be written in any other way? A comment above about
> f.FileKey not being in the WHERE clause is the way it has to be!
>

I would try deconstructing the query and building it up bit by bit. First
try with a basic where and see if the index gets used.



Reply With Quote
  #10 (permalink)  
Old 11-28-2006
Bill Karwin
 
Posts: n/a
Default Re: Why Is This Query A "Select_full_join"?

Good Man wrote:
> Can my query be written in any other way? A comment above about
> f.FileKey not being in the WHERE clause is the way it has to be!


It's kind of grasping at straws in a haystack in the dark, but I'd try
something like this:

SELECT ue.EmailID, ue.ToAddress, ue.Subject, ue.Message,
ue.SavedDate, ue.EmailKey, f.FileKey, f.FileName
FROM UserEmails ue
LEFT JOIN
(UserEmailAttachments ua JOIN ProjFiles f ON ua.FileKey = f.FileKey)
ON ue.EmailKey = ua.EmailKey
WHERE ue.AdminID = 1
AND ue.SavedDate > 1164344436
AND ue.SentDate = 0

I'm assuming that if a record exists in ua, then there must be a
matching record in f. Making the join between ua and f could therefore
be improved by making it an inner join.

I can't be precise about my reasoning, I just think this may give the
optimizer an "easier" task to analyze the join between ua and f.

Regarding using subqueries, here's a possible solution. It resembles my
suggestion above.

SELECT ue.EmailID, ue.ToAddress, ue.Subject, ue.Message,
ue.SavedDate, ue.EmailKey, uaf.FileKey, uaf.FileName
FROM UserEmails ue
LEFT JOIN
(SELECT f.FileKey, f.FileName, ua.EmailKey
FROM UserEmailAttachments ua JOIN ProjFiles f
ON ua.FileKey = f.FileKey) AS uaf
ON ue.EmailKey = uaf.EmailKey
WHERE ue.AdminID = 1
AND ue.SavedDate > 1164344436
AND ue.SentDate = 0

Another solution using correlated subqueries. Highly subject to
performance problems unless the outer query is likely to restrict its
rows to a very small subset.

SELECT ue.EmailID, ue.ToAddress, ue.Subject, ue.Message,
ue.SavedDate, ue.EmailKey,
(SELECT f1.FileKey FROM UserEmailAttachments ua1 JOIN ProjFiles f1
USING (FileKey) WHERE ua1.EmailKey = ue.EmailKey) AS FileKey,
(SELECT f2.FileName FROM UserEmailAttachments ua2 JOIN ProjFiles f2
USING (FileKey) WHERE ua2.EmailKey = ue.EmailKey) AS FileName
FROM UserEmails ue
WHERE ue.AdminID = 1
AND ue.SavedDate > 1164344436
AND ue.SentDate = 0

Note that MySQL 4.0 and earlier do not support subqueries.

Regards,
Bill K.
Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On




All times are GMT +1. The time now is 03:25 PM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0