> From: David Krings > > At 02:21 PM 10/8/2006, you wrote: > > > From: David Krings > > > > > > SELECT DISTINCT pivientities.Entityid FROM pivientities, > > > pivicategories INNER JOIN pivicategories ON > > > pivientities.Entityid=pivicategories.Entityid > > > WHERE 1=1 AND pivientities.Entitytype = 'pic' OR > > > pivicategories.Category_name='Alton' OR > > > pivicategories.Category_name='Bristol' OR > > > pivicategories.Category_name='Soft' ORDER BY > > > pivientities.Creation_date ASC > > > >I don't think that's really what you want. > > > >You are doing a three-product cartesian join there, pivientities x > >pivicategories x pivicategories, which means that you are producing > >(number of rows in pivicategories) duplicate rows of each row you > >really do want. The only reason you are not seeing them is > because of the DISTINCT. > > I left the distinct out and did see the additional rows, > which I didn't mind too much, since they are gone with > distinct in the query. Except it's more work for the database to consider all those rows that you are going to throw out anyway! Suppose you have 50 rows in the category table, that's 50 _times_ the number of rows the query has to consider. Maybe I should have gotten to the point and mentioned that avoiding that is as simple are leaving out the unnecessary join: SELECT DISTINCT pivientities.Entityid FROM pivientities INNER JOIN pivicategories ON pivientities.Entityid=pivicategories.Entityid ... Also consider that INNER JOIN and "," are the same thing excepting that INNER JOIN can take on "ON" clause (think of "," as shorthand notation for INNER JOIN without ON). If you think of is this way, then what you wrote is like writing either one of these: SELECT DISTINCT pivientities.Entityid FROM pivientities, pivicategories, pivicategories WHERE ... Or SELECT DISTINCT pivientities.Entityid FROM pivientities INNER JOIN pivicategories INNER JOIN pivicategories ON pivientities.Entityid=pivicategories.Entityid ... > I looked up the operator precedence and found a table that > lists levels from 1 through 9 and the corresponding > operators. OR was level 9 and AND was level 8. The > description stated that the "higher level operators have > precedence over lower level ones". It did not explain if > level 1 or level 9 is the highest level. So, based on what > you write AND is preferred over OR. Very helpful documentation, eh? In this case, level 1 is the highest level, that is, without any parenthesis or other forced grouping, all the ANDs will be evaluated before all the ORs. > I'm not too concerned about performance of the query, as I > am sure that the execution time and transfer of the page > data drastically exceeds the query run time. So if I am > looking at optimizing the system I'd take a closer look at > my PHP code than the SQL queries. You might reconsider that. I'm making a lot of assumptions about your application here, but suppose your table grows to have thousands of rows in it. Probably you are still only going to display the results 20 or 50 rows at a time, so once your database gets to the size that you tend to produce at least that many rows per query, your PHP and transfer time will not increase any as the database gets bigger, since they are still only handling 20 or 50 at a time. Your SQL time will continue to go up though, because the query needs to sift through all the rows every time you display 20 or 50 rows. Once your indexes get bigger than will stay in memory, your SQL time will start to go way up. Also, PHP time will tend to scale with CPU speed, where SQL time will tend to scale with disk I/O and memory, which are generally more costly to scale up enough to make a difference than CPU is. David
More information about the ECLUG mailing list