This is a discussion on Why Is This Query A "Select_full_join"? within the MySQL Database forums, part of the Database Forums category; Hi there While getting some undesired results from a query, I looked a little closer at it and was surprised ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
Hi there
While getting some undesired results from a query, I looked a little closer at it and was surprised when my EXPLAIN statement told me that my query was looking through all the rows in a particular table, and no keys were being used on this table - "possible_keys" and "key" are both NULL for this table in the EXPLAIN statement. Sure enough, the "Select_full_join" variable in MySQL Administrator increments by one every time I use the query. I can't figure out WHY this is happening, as I have an index on the table that should be used.... here is my query: SELECT ue.EmailID,ue.ToAddress,ue.Subject,ue.Message,ue.S avedDate,ue.EmailKey, f.FileKey,f.FileName FROM UserEmails ue LEFT JOIN UserEmailAttachments ua ON ue.EmailKey=ua.EmailKey LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey WHERE ue.AdminID=1 AND ((ue.SavedDate > 1164344436) AND (ue.SentDate=0)) Basically, what I want to be doing here is grabbing a users 'email message' in the database. LEFT JOINS are used on the UserEmailAttachments and ProjFiles tables because an 'email' may or may not have any files attached to it. The problem table in the EXPLAIN statement is ProjFiles. It has 7803 rows so far, and they're all being looked at, even though the FileKey column (which is the one being checked in the query) is indexed: CREATE TABLE `ProjFiles` ( `FileID` int(10) unsigned NOT NULL auto_increment, `FolderID` int(10) unsigned NOT NULL default '0', `ClientID` int(10) unsigned NOT NULL default '0', `FileSetID` mediumint(8) unsigned NOT NULL default '0', `FileType` varchar(255) NOT NULL default '', `FileSize` varchar(20) NOT NULL default '0', `VersionNum` varchar(10) default NULL, `Note` text, `AddedBy` int(10) unsigned default NULL, `AddedByUser` int(10) unsigned default NULL, `DateAdded` varchar(100) default 'May 15th, 2005', `DateStamp` bigint(20) unsigned NOT NULL default '0', `FileName` varchar(250) NOT NULL default '', `FilePath` text NOT NULL, `FileKey` varchar(30) NOT NULL default '', PRIMARY KEY (`FileID`), KEY `CliDex` (`ClientID`), KEY `FoldDex` (`FolderID`), KEY `SetDex` (`FileSetID`), KEY `KeyDex` (`FileKey`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; Any tips? Thanks. |
|
|||
|
Good Man wrote: > Hi there > > While getting some undesired results from a query, I looked a little > closer at it and was surprised when my EXPLAIN statement told me that my > query was looking through all the rows in a particular table, and no > keys were being used on this table - "possible_keys" and "key" are both > NULL for this table in the EXPLAIN statement. Sure enough, the > "Select_full_join" variable in MySQL Administrator increments by one > every time I use the query. > > I can't figure out WHY this is happening, as I have an index on the > table that should be used.... here is my query: > > SELECT > ue.EmailID,ue.ToAddress,ue.Subject,ue.Message,ue.S avedDate,ue.EmailKey, > f.FileKey,f.FileName > FROM UserEmails ue > LEFT JOIN UserEmailAttachments ua ON ue.EmailKey=ua.EmailKey > LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey > WHERE ue.AdminID=1 AND ((ue.SavedDate > 1164344436) AND (ue.SentDate=0)) > > Basically, what I want to be doing here is grabbing a users 'email > message' in the database. LEFT JOINS are used on the > UserEmailAttachments and ProjFiles tables because an 'email' may or may > not have any files attached to it. > > The problem table in the EXPLAIN statement is ProjFiles. It has 7803 > rows so far, and they're all being looked at, even though the FileKey > column (which is the one being checked in the query) is indexed: > > CREATE TABLE `ProjFiles` ( > `FileID` int(10) unsigned NOT NULL auto_increment, > `FolderID` int(10) unsigned NOT NULL default '0', > `ClientID` int(10) unsigned NOT NULL default '0', > `FileSetID` mediumint(8) unsigned NOT NULL default '0', > `FileType` varchar(255) NOT NULL default '', > `FileSize` varchar(20) NOT NULL default '0', > `VersionNum` varchar(10) default NULL, > `Note` text, > `AddedBy` int(10) unsigned default NULL, > `AddedByUser` int(10) unsigned default NULL, > `DateAdded` varchar(100) default 'May 15th, 2005', > `DateStamp` bigint(20) unsigned NOT NULL default '0', > `FileName` varchar(250) NOT NULL default '', > `FilePath` text NOT NULL, > `FileKey` varchar(30) NOT NULL default '', > PRIMARY KEY (`FileID`), > KEY `CliDex` (`ClientID`), > KEY `FoldDex` (`FolderID`), > KEY `SetDex` (`FileSetID`), > KEY `KeyDex` (`FileKey`) > ) ENGINE=InnoDB DEFAULT CHARSET=latin1; > > Any tips? Thanks. because you do not have an index on AdminID, SaveDate and SentDate. create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`, `SentDate`)..... |
|
|||
|
onedbguru@yahoo.com wrote in news:1164392562.354785.112340
@f16g2000cwb.googlegroups.com: > because you do not have an index on AdminID, SaveDate and SentDate. > > create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`, > `SentDate`)..... That doesn't help at all, and I can't see why it would considering that the table causing problems isn't UserEmails but ProjFiles.... the ProjFiles table is the one being scanned. |
|
|||
|
Good Man wrote:
> onedbguru@yahoo.com wrote in news:1164392562.354785.112340 > @f16g2000cwb.googlegroups.com: > > >>because you do not have an index on AdminID, SaveDate and SentDate. >> >>create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`, >>`SentDate`)..... > > > > That doesn't help at all, and I can't see why it would considering that the > table causing problems isn't UserEmails but ProjFiles.... the ProjFiles > table is the one being scanned. > > sorry, overlooked that in your description... What in your query makes some rows in ProjFiles more selectable? Since you did not exclude or include any ProjFiles values in your where clause, of course it needs to scan the whole table. -- Michael Austin. Database Consultant |
|
|||
|
Michael Austin wrote:
> Good Man wrote: > >> onedbguru@yahoo.com wrote in news:1164392562.354785.112340 >> @f16g2000cwb.googlegroups.com: >> >> >>> because you do not have an index on AdminID, SaveDate and SentDate. >>> >>> create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`, >>> `SentDate`)..... >> >> >> >> That doesn't help at all, and I can't see why it would considering >> that the table causing problems isn't UserEmails but ProjFiles.... >> the ProjFiles table is the one being scanned. >> >> > > sorry, overlooked that in your description... > > What in your query makes some rows in ProjFiles more selectable? > Since you did not exclude or include any ProjFiles values in your > where clause, of course it needs to scan the whole table. I think you are mising his point Michael. He is doing a JOIN to that table. So the only records that need to be fetched from ProjFiles are those that have f.FileKey = ua.FileKey. So he is expecting the index to be used to locate the indovidual f.FileKey values, rather than having to scan the table for them. |
|
|||
|
Paul Lautman wrote:
> Michael Austin wrote: > >>Good Man wrote: >> >> >>>onedbguru@yahoo.com wrote in news:1164392562.354785.112340 >>>@f16g2000cwb.googlegroups.com: >>> >>> >>> >>>>because you do not have an index on AdminID, SaveDate and SentDate. >>>> >>>>create index ue_idx on `UserEmails` (`AdminID`, `SaveDate`, >>>>`SentDate`)..... >>> >>> >>> >>>That doesn't help at all, and I can't see why it would considering >>>that the table causing problems isn't UserEmails but ProjFiles.... >>>the ProjFiles table is the one being scanned. >>> >>> >> >>sorry, overlooked that in your description... >> >>What in your query makes some rows in ProjFiles more selectable? >>Since you did not exclude or include any ProjFiles values in your >>where clause, of course it needs to scan the whole table. > > > I think you are mising his point Michael. He is doing a JOIN to that table. > So the only records that need to be fetched from ProjFiles are those that > have f.FileKey = ua.FileKey. So he is expecting the index to be used to > locate the indovidual f.FileKey values, rather than having to scan the table > for them. > > Optimizers are a funny thing. They don't always do what we think they should do. Maybe the cardinality of the selectivity is such that the optimizer determines it can get the data faster by doing a FTS. and doesn't he really want f.FileKey = ua.FileKey (as you stated), so, isn't this backwards? "LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey" this, to me, says look at all records in f first, then match with ua. and since there is no where-clause for f, it does a FTS. Order really does matter. -- Michael Austin. Database Consultant |
|
|||
|
Michael Austin wrote:
> and doesn't he really want f.FileKey = ua.FileKey (as you stated), > so, isn't this backwards? > > "LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey" > > this, to me, says look at all records in f first, then match with ua. > and since there is no where-clause for f, it does a FTS. Order really > does matter. Have you tried an experiment to prove this statement? The order of operands around the equals statement should not matter. The JOIN condition is what is stating that one should take all the records in ua and lookup the equivalent row in f. if the optimiser is sensitive to a=5 vs 5=a then it needs serious looking at. |
|
|||
|
"Paul Lautman" <paul.lautman@btinternet.com> wrote in
news:4srceaF10pcc8U1@mid.individual.net: > Michael Austin wrote: >> and doesn't he really want f.FileKey = ua.FileKey (as you stated), >> so, isn't this backwards? >> >> "LEFT JOIN ProjFiles f ON ua.FileKey=f.FileKey" >> >> this, to me, says look at all records in f first, then match with ua. >> and since there is no where-clause for f, it does a FTS. Order really >> does matter. > > Have you tried an experiment to prove this statement? > The order of operands around the equals statement should not matter. > The JOIN condition is what is stating that one should take all the > records in ua and lookup the equivalent row in f. Hello, I appreciate all the comments here. I tried swapping the columns on either side of the '=', but alas there was no difference. I have read about the optimizer sometimes choosing to ignore indexes (or using an index you wouldn't expect it to use), so I'm *almost* ready to accept that as an explanation, but seeing anything other than a '0' in the "Select_full_join" MySQL status variable makes me feel soiled in some way - I want to take a 14-hour shower! The thing is, the ProjFiles table (f) is one of the largest in my database; the 7000 entries I have in there now will surely be much larger with real life usage. Would using subqueries make a difference? I've never really explored them; JOINS have been sufficient for me so far, so I'm still a little baffled why MySQL isn't using the f.FileKey index.... Can my query be written in any other way? A comment above about f.FileKey not being in the WHERE clause is the way it has to be! Thanks |
|
|||
|
Good Man wrote:
> "Paul Lautman" <paul.lautman@btinternet.com> wrote in > news:4srceaF10pcc8U1@mid.individual.net: > > Can my query be written in any other way? A comment above about > f.FileKey not being in the WHERE clause is the way it has to be! > I would try deconstructing the query and building it up bit by bit. First try with a basic where and see if the index gets used. |
|
|||
|
Good Man wrote:
> Can my query be written in any other way? A comment above about > f.FileKey not being in the WHERE clause is the way it has to be! It's kind of grasping at straws in a haystack in the dark, but I'd try something like this: SELECT ue.EmailID, ue.ToAddress, ue.Subject, ue.Message, ue.SavedDate, ue.EmailKey, f.FileKey, f.FileName FROM UserEmails ue LEFT JOIN (UserEmailAttachments ua JOIN ProjFiles f ON ua.FileKey = f.FileKey) ON ue.EmailKey = ua.EmailKey WHERE ue.AdminID = 1 AND ue.SavedDate > 1164344436 AND ue.SentDate = 0 I'm assuming that if a record exists in ua, then there must be a matching record in f. Making the join between ua and f could therefore be improved by making it an inner join. I can't be precise about my reasoning, I just think this may give the optimizer an "easier" task to analyze the join between ua and f. Regarding using subqueries, here's a possible solution. It resembles my suggestion above. SELECT ue.EmailID, ue.ToAddress, ue.Subject, ue.Message, ue.SavedDate, ue.EmailKey, uaf.FileKey, uaf.FileName FROM UserEmails ue LEFT JOIN (SELECT f.FileKey, f.FileName, ua.EmailKey FROM UserEmailAttachments ua JOIN ProjFiles f ON ua.FileKey = f.FileKey) AS uaf ON ue.EmailKey = uaf.EmailKey WHERE ue.AdminID = 1 AND ue.SavedDate > 1164344436 AND ue.SentDate = 0 Another solution using correlated subqueries. Highly subject to performance problems unless the outer query is likely to restrict its rows to a very small subset. SELECT ue.EmailID, ue.ToAddress, ue.Subject, ue.Message, ue.SavedDate, ue.EmailKey, (SELECT f1.FileKey FROM UserEmailAttachments ua1 JOIN ProjFiles f1 USING (FileKey) WHERE ua1.EmailKey = ue.EmailKey) AS FileKey, (SELECT f2.FileName FROM UserEmailAttachments ua2 JOIN ProjFiles f2 USING (FileKey) WHERE ua2.EmailKey = ue.EmailKey) AS FileName FROM UserEmails ue WHERE ue.AdminID = 1 AND ue.SavedDate > 1164344436 AND ue.SentDate = 0 Note that MySQL 4.0 and earlier do not support subqueries. Regards, Bill K. |