« Kristian’s MySQL Certification Experience | Main | Kai on MySQL certification »
Doing INTERSECT and MINUS in MySQL
By Carsten | October 3, 2005
Recently, I was asked to contribute a series of entries for the MySQL Network Knowledge Base (to which you get access only if you’re a MySQL Network subscriber).
I thought the following ones were kind of fun, so I asked the maintainer if I would be allowed to post them here — and maybe someone could tell me if I’m completely off-track or have suggestions for improvements.
Updated: My esteemed colleague Dean Ellis made me aware of a couple of issues with my suggestions, so here’s an update
Doing an INTERSECT
An INTERSECT is simply an inner join where we compare the tuples of one table with those of the other, and select those that appear in both while weeding out duplicates. So
SELECT member_id, name FROM a
INTERSECT
SELECT member_id, name FROM b
can simply be rewritten to
SELECT a.member_id, a.name
FROM a INNER JOIN b
USING (member_id, name)
For more MySQL-related information om this site, see:
MySQL Pop Quizzes || MySQL Function of the Day
MySQL DATE_FORMAT() and PHP DATE() tool
Other MySQL blog entries
Performing a MINUS
To transform the statement
SELECT member_id, name FROM a
MINUS
SELECT member_id, name FROM b
into something that MySQL can process, we can utilize subqueries (available from MySQL 4.1 onward). The easy-to-understand transformation is:
SELECT DISTINCT member_id, name
FROM a
WHERE (member_id, name) NOT IN
(SELECT member_id, name FROM table2);
Of course, to any long-time MySQL user, this is immediately obvious as the classical use-left-join-to-find-what-isn’t-in-the-other-table:
SELECT DISTINCT a.member_id, a.name
FROM a LEFT JOIN b USING (member_id, name)
WHERE b.member_id IS NULL
which tends to be a lot more efficient.
Topics: MySQL |
October 16th, 2005 at 9:31 pm
These examples are very usefull, however, I think it is important to comment that internally something totally different happens. When you would do and INTERSECT query for example, it would query the results from table A and table B and then compare both sets. With the MySQL solution you describe it would query all results from the first table and then do a lookup for each result in the second table. This results in worse performance when your query returns a huge number of results. For most uses your approach will be a good solution doh
October 4th, 2006 at 1:44 pm
This is realy very helpful. i have across the fact that mysql 4.0 version
has lots of limitation. coming from oracle background it is difficult to think of writting queries when you are not sure whether this will work or not. so this kind of website proves to be very handy. Thank you
May 4th, 2007 at 10:05 am
what if i have 3 tables or more to intersect with each other? thanks!
August 10th, 2007 at 8:21 am
I would assume for 3 or more tables you would do:
SELECT a.member_id, a.name
FROM a INNER JOIN b
USING (member_id, name) INNER JOIN c USING (member_id, name)
for intersect. Please correct me if I’m wrong. Haven’t tried this.
November 21st, 2007 at 3:38 pm
I need to perform ‘minus’ between two tables. But the only field I am using in select statement is ‘count(*)’. How to preform it?
Thanks in advance.
December 1st, 2007 at 3:47 pm
[...] Alternativ kann man das auch mit einem Left Join lösen. Alle Varianten haben mit dem Original-MINUS gemein, dass sie recht unperformant sind. (Gefunden bei bitbybit) [...]
December 12th, 2007 at 7:54 am
hi INTERSECT is not working in mysql
my query is this
select user_id from answer_master where answer=’abc’ and q_id=’199′ INTERSECT select user_id from answer_master where answer=’abc’ and q_id=’198′
December 15th, 2007 at 5:42 pm
Bahesh Singh, this article is about how to do an intersection in MySQL. Even if the author refers abou INTERSECT, this statement doesn’t exist in MySQL sql language.
January 3rd, 2008 at 6:21 pm
Actually statements
SELECT member_id, name FROM a
INTERSECT
SELECT member_id, name FROM b
and
SELECT a.member_id, a.name
FROM a INNER JOIN b
USING (member_id, name)
are functionally DIFFERENT.
INTERSECT and MINUS (EXCEPT) are operating on unique tuples, see example (in Oracle but this is not relevant):
SQL> create table a (member_id number, name varchar(10));
Table created.
SQL> create table b (member_id number, name varchar(10));
Table created.
SQL> insert into a values (1, ‘a’);
1 row created.
SQL> insert into a values (1, ‘b’);
1 row created.
SQL> insert into a values (1, ‘b’);
1 row created.
SQL> insert into b values (1, ‘b’);
1 row created.
SQL> commit;
Commit complete.
SQL> SELECT member_id, name FROM a
2 INTERSECT
3 SELECT member_id, name FROM b
SQL> /
MEMBER_ID NAME
———- ———-
1 b
SQL> SELECT a.member_id, a.name
2 FROM a INNER JOIN b
3 USING (member_id, name)
SQL> /
SELECT a.member_id, a.name
*
ERROR at line 1:
ORA-25154: column part of USING clause cannot have qualifier
SQL> ed
Wrote file afiedt.buf
1 SELECT member_id, name
2 FROM a INNER JOIN b
3* USING (member_id, name)
SQL> /
MEMBER_ID NAME
———- ———-
1 b
1 b
So as you can see returned sets are different.
February 5th, 2008 at 8:37 pm
use-left-join-to-find-what-isn’t-in-the-other-table:
thank you thank you thank you
February 8th, 2008 at 4:12 am
the use-left-join-to-find……. is an:
EXCLUSION JOIN
June 17th, 2008 at 10:45 pm
[...] Doing INTERSECT and MINUS in MySQL: classical use-left-join-to-find-what-isn’t-in-the-other-table [...]
June 17th, 2008 at 10:47 pm
use-left-join-to-find-what-isn’t-in-the-other-table: Sir, you deserve
a post.
July 20th, 2008 at 11:47 pm
I think that the two MINUS implementations (subquery vs. left join) are semantically different if (member_id, name) is not unique in both tables.
As an example (taken from Amarok’s MySQL back-end)
given (only showing relevant fields):
TABLE artist (
— One row per artist
artist.id INT UNSIGNED NOT NULL,
artist.name TEXT NOT NULL);
TABLE tags (
— One row per track. There may be multiple
— tracks per artist.
artist INT UNSIGNED NOT NULL,
url TEXT NOT NULL,
FOREIGN KEY (artist) REFERENCES (artist.id));
TABLE lyrics (
— One row per track with lyrics. Many tracks
— do not have lyrics available.
url TEXT NOT NULL,
FOREIGN KEY (url) REFERENCES (tags.url));
we would like to generate a list of artists for which there are no lyrics available (e.g., if there are five tracks for an artist, this artist should be a member of the returned list if _none_ of the tracks have lyrics).
The subquery implementation generates the desired result, e.g.:
SELECT DISTINCT artist.name FROM artist WHERE artist.name NOT IN (SELECT artist.name FROM artist, tags, lyrics WHERE artist.id = tags.artist AND tags.url = lyrics.url) ORDER BY artist.name
But the left join implementation (in my hands) generates a larger result set: all artists for which _any_ track does not have lyrics:
SELECT DISTINCT artist.name FROM artist JOIN tags ON tags.artist = artist.id LEFT JOIN lyrics ON tags.url = lyrics.url WHERE lyrics.url IS NULL ORDER BY artist.name
I would expect the semantics of the MINUS statement to match the first case, but I do not have SQL experience outside of MySQL.
February 2nd, 2009 at 11:49 pm
[...] found a page that has two alternatives to doing this. The first is supposedly less [...]
March 2nd, 2009 at 12:29 am
Thank you so much! It has been extremely helpful. By the way, use-left-join-to-find-what-isn’t-in-the-other-table ROCKS!!!
March 24th, 2009 at 1:26 pm
Intersect function isn’t working in MySql.
April 28th, 2009 at 11:50 am
I was wondering why ‘Intersect’ isn’t supported by Mysql. Is there a reason for this?
June 3rd, 2009 at 3:52 pm
hey all,
“use-left-join-to-find-what-isn’t-in-the-other-table” is a nice solution if you have 2 queries, kinda sucks if you have like 10 …
instead of Q1 intersect Q2 intersect Q3, you have to join Q(n) to Q(n-1), which is quite a pain in the ass…
Mysql should have intersect, its important…
Bye,
Zs
June 9th, 2009 at 1:59 pm
uhm Zs… if you need to do 10 intersects it might be time to consider your db design?
October 14th, 2009 at 12:19 pm
How about if I want to intersect within the same table? E.g. I have a table ‘prescription’ with details of patients and the drugs they are taking. I want to select patients that are taking BOTH drug A and drug B so I can’t use IN operator as it will also give me those only taking one of the drugs. Does MySQL allow use of the INTERSECT operator? I am getting an error for this query but am not sure I can achieve it with a JOIN operator Any suggestions?
March 4th, 2010 at 10:49 pm
Dude, you just answered what I was working on for hours! And this article was written like 5 years ago so you’ll probably never read this! Specifically I couldn’t get MINUS to work in Access 2007, so I used your alternative and it seems to work correctly! Thanks!
March 5th, 2010 at 9:57 am
Evan,
Of course I read your comment. The internet doesn’t forget
March 16th, 2010 at 6:29 am
The constant “small minded” thinking of MYSQL does nothing encourage any kind of respect. UNIONS are NOT necessarily a simple join from one table to another. A Union can and quite often is a s complex selection over several tables. The suggestion above that a simple “join” replaces an Intersect is a stupid as the person who wrote it.
March 16th, 2010 at 7:55 am
Fil,
Being as smart as you obviously are, I’m sure you’d have no problems extending the principles in the article to more complex situations using subselects and other methods.
March 17th, 2010 at 1:39 pm
specific and useful. thanks..
March 17th, 2010 at 9:48 pm
Thank you, Carsten! You just helped me solve a problem I’ve been struggling with for a couple of days. Best of luck to you!
April 23rd, 2010 at 5:32 am
i have about 9000+ in both ‘a’ and ‘b’, and my server locks up everytime using the LEFT JOIN…suggestions? i used the syntax above exactly. please help!!!