« LOWER() :: MySQL Function of the Day | Main | DATE_FORMAT :: MySQL Function of the Day »
MySQL Pop Quiz #14
By Carsten | March 10, 2008
This quiz is based on an email I got from a somewhat annoyed Jesper Krogh.
btw, I’m still looking for new entries. I get quite a few suggestions, but not all of them make it into quiz questions. Do send in your suggestions!
Given:
SELECT * FROM car_parts: +---------+---------+-------+--------+ | part_id | car_id | price | weight | +---------+---------+-------+--------+ | 1 | 1 | 2.20 | 3.00 | | 2 | 1 | 3.30 | 5.00 | | 3 | 1 | 4.40 | 7.00 | | 4 | 2 | 2.20 | 9.00 | | 5 | 2 | 3.30 | 11.00 | | 6 | 2 | 4.40 | 13.00 | +---------+---------+-------+--------+
We want to know the least expensive part of each car, and how much that part weighs:
SELECT car_id, MIN(price), weight FROM parts GROUP BY car_id; +---------+------------+--------+ | car_id | MIN(price) | weight | +---------+------------+--------+ | 1 | 2.20 | 3.00 | | 2 | 2.20 | 9.00 | +---------+------------+--------+
Which looks fine. We then do a similar search to find most expensive part of each car, and — again — how much that part weighs:
SELECT car_id, MAX(price), weight FROM parts GROUP BY part_of
which gives us:
+---------+------------+--------+ | car_id | MAX(price) | weight | +---------+------------+--------+ | 1 | 4.40 | 3.00 | | 2 | 4.40 | 9.00 | +---------+------------+--------+
Question 1: Why is the content of the weight column wrong in the second query?
Question 2: What’s the reason these queries aren’t rejected?
Do you have your own favorite MySQL question? Is there something in MySQL that has you scratching your head? Send your ideas for new pop quiz questions to carsten (at) bitbybit (dot) dk. Send it in, even if you don’t know the answer — that only makes it more interesting.
More Pop Quiz entries: WWW or RSS.
March 10th, 2008 at 12:52 pm
Using subqueries is a way to solve this specific problem:
SELECT p1.car_id,
MIN(p1.price) AS minPrice,
(SELECT weight
FROM parts AS p2
WHERE (p1.part_id, p1.car_id, p1.price) = (p2.part_id, p2.car_id, p2.price)
ORDER BY p2.price ASC, p2.weight ASC
LIMIT 1
) AS weightOfMinPrice
FROM parts AS p1
GROUP BY p1.car_id;
Luckily MySQL permit use of LIMIT in subqueries too (but test it on your DB version it was buggy once).
The query show the lower weight product that has the minimum prize, it’s possible to achieve the four combination of min/max between price and weight changing the order by from ASC to DESC.
March 10th, 2008 at 3:35 pm
Good reading from Roland on this subject (dispelling the myth that the SQL standard *requires* GROUP BY to list all non-aggregate/grouped fields used in the SELECT):
http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html
March 10th, 2008 at 4:36 pm
Hi!
I believe answer #2 is wrong:
“Unlike other RDBMSs, MySQL allows you to GROUP BY columns that aren’t named in the SELECT clause.”
There is no such restriction in any RDBMS that I know of. It is the other way around as described in the manual section you refer to:
“MySQL extends the use of GROUP BY so that you can use non-aggregated columns or calculations in the SELECT list that do not appear in the GROUP BY clause.”
kind regards,
Roland
March 10th, 2008 at 8:13 pm
Yes, an ambiguity occurs when an expression in the select-list has multiple values per distinct value in the column(s) named in the GROUP BY clause. Roland’s explanation is excellent.
The database I have the most experience with besides MySQL is InterBase (aka Firebird). InterBase’s implementation of SQL is based on SQL-92, and it actually throws a SQL error if you try to run a query of the form “SELECT a, b FROM t GROUP BY a”.
The problem is that most databases aren’t optimized to have a functional dependency map on the tip of their tongue at all times. So it’s hard for the SQL engine to know if a grouped query will result in an ambiguous result set or not. That’s probably for the best; it would be expensive to track this meta-information, and it would be useful infrequently.
In any case, whether it results in an error or not, there is such a thing as the “Single Value Rule” in SQL-99. Requiring that any non-aggregate column must be named in the GROUP BY clause is a simple guideline, and it does prevent ambiguous results, but certainly it’s more subtle than that.
March 10th, 2008 at 9:34 pm
Thanks for the comments. As you can see, I’ve updated answer #2 to point people to read those.
March 31st, 2008 at 9:10 am
[...] ???????????? ? ?????????? Carsten Pedersen. ???????? [...]