This is a discussion on How can I improve the performance of this kind of query? within the MySQL Database forums, part of the Database Forums category; Please consider the following stored procedure: DROP PROCEDURE IF EXISTS `get_etf_stocks_and_prices`; CREATE PROCEDURE `get_etf_stocks_and_prices`(id INTEGER) BEGIN SELECT stocks.stock_id ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
Please consider the following stored procedure:
DROP PROCEDURE IF EXISTS `get_etf_stocks_and_prices`; CREATE PROCEDURE `get_etf_stocks_and_prices`(id INTEGER) BEGIN SELECT stocks.stock_id AS stock_id, stocks.symbol AS symbol, stocks.name AS stock_name, A.`close` AS close_price, A.price_date AS price_date FROM etf_stocks LEFT JOIN stocks ON etf_stocks.stock_id = stocks.stock_id LEFT JOIN stockprices A ON stocks.stock_id = A.stock_id LEFT JOIN stockprices B ON A.stock_id = B.stock_id AND B.price_date > A.price_date WHERE B.price_date IS NULL AND etf_id = id; END // While price_date is part of the primary key, creating an extra index on it dramatically improved how long it takes to complete. However, it takes close to 10 minutes for the results to be displayed where there are 28 stocks in the etf required (out of almost 500 possible stocks for almost 500 etfs), the times shown by MySQL Query browser are 0.0037s (520.3263s): the latter s almost the 20 minutes I experience waiting for the results to be displayed. Given that there is daily data going back over 40 years, at least for some stocks, I don't want to think how long it would take to get the final close price for each week represented in the database for the stocks requested. The data I get appear to be correct, but I need to get it MUCH much more quickly. Thanks Ted |
|
|||
|
Ted
The statement AND B.price_date > A.price_date WHERE B.price_date IS NULL is contradictory. For B.price_date > A.price_date to return true, then B.price_date has to be NOT NULL (otherwise it returns null). On the next line you are asking B.price_date to be NULL. Seems like you are doing a lot of comparison work for no purpose. Also B.price_date is null will return unmatched rows in B or matched rows where B.price_date is null. Not sure if that is what you want? In fact I can't see a reason to have table B in there at all. What was your purpose for including it? X |
|
|||
|
On Nov 6, 2:54 am, xeni...@gmail.com wrote:
> Ted > > The statement > > AND B.price_date > A.price_date > WHERE B.price_date IS NULL > > is contradictory. For B.price_date > A.price_date to return true, > then B.price_date has to be NOT NULL (otherwise it returns null). On > the next line you are asking B.price_date to be NULL. Seems like you > are doing a lot of comparison work for no purpose. > > Also B.price_date is null will return unmatched rows in B or matched > rows where B.price_date is null. Not sure if that is what you want? > In fact I can't see a reason to have table B in there at all. What > was your purpose for including it? > > X This is what I was advised to use in the recent thread " Trouble getting latest record for a given item." Articles in that thread give URLs to additional sources, including MySQL documentation, that describe using this method for getting the most recent value. The essence of the rationale for it is that there will be exactly one record in A that has no price greater than that in that record. We're selecting records from a LEFT JOIN, so that join will have records in A even for comparisons for which ALL values on the B side are NULL. I agree there is a lot of comparison, but the question is, is there a faster way to get the result. This is a worry since the table in question has millions of records containing over 100 MB of data. The first objective was to get the most recent price for a given item. The next step is to get the last price for a given item for each week for which there is data. The complication for this is that while there is data for every business day, statutory holidays mean that in some weeks the last business day of the week is a Thursday rather than a Friday. It takes so long to get the last value for just a couple dozen items, that using the same kind of query to get the last record in each week, for series that go back 50 years, seems impracticable: the application would just sit there apparently doing nothing, for hours. Thanks, Ted |
|
|||
|
Hmmm...
SELECT stocks.stock_id AS stock_id, stocks.symbol AS symbol, stocks.name AS stock_name, A.`close` AS close_price, A.price_date AS price_date FROM etf_stocks LEFT JOIN stocks ON etf_stocks.stock_id = stocks.stock_id LEFT JOIN stockprices A ON stocks.stock_id = A.stock_id /*This is grabbing the most recent date for that STOCK_ID from an aliased STOCKPRICES table and using that as a criteria. It's the same as using a max, but I've found on MySQL that MAX in a sub-query sucks, and an order/limit doesn't - no idea why */ where A.price_date = (select price_date from stockprices A2 where A2.stock_id = A.stock_id order by price_date desc limit 1) AND etf_id = id; An index on stock_id and possibly on stock_id,price_date would probably help in this case to make the subquery run faster. |
|
|||
|
On Nov 9, 12:55 am, xeni...@gmail.com wrote:
> Hmmm... > > SELECT stocks.stock_id AS stock_id, > stocks.symbol AS symbol, > stocks.name AS stock_name, > A.`close` AS close_price, > A.price_date AS price_date > FROM etf_stocks LEFT JOIN stocks ON etf_stocks.stock_id = > stocks.stock_id > LEFT JOIN stockprices A ON stocks.stock_id = A.stock_id > /*This is grabbing the most recent date for that STOCK_ID from an > aliased STOCKPRICES table and using that as a criteria. > It's the same as using a max, but I've found on MySQL that MAX in a > sub-query sucks, and an order/limit doesn't - no idea why > */ > where A.price_date = (select price_date from stockprices A2 where > A2.stock_id = A.stock_id order by price_date desc limit 1) > AND etf_id = id; > > An index on stock_id and possibly on stock_id,price_date would > probably help in this case to make the subquery run faster. Thanks xeni. I tried my original with yours, and it seems an exercise in frustration. First note, while there is no index on stock_id, stock_id plus price_date form the primary key. Adding an index on price_date has made virtually no difference, neither in the times I see nor in the timing numbers MySQL Query Browser reports. For yours, applied to etf # 397, I get 28 rows in what Query Browser reports as 0.0054s (1133.5915s). It felt like about 20 minutes. BTW: do you know what the times are that the Query Browser reports? Sometimes the number within parentheses corresponds to perceived elapsed time on the wall and sometimes it does not. Anyway, with my original query, the corresponding times (on the same etf) are 0.0039s (516.0759s) Thanks Ted |
![]() |
| Thread Tools | |
| Display Modes | |
|
|