SQL select only rows with max value on a column [duplicate]
This question already has an answer here:
Retrieving the last record in each group - MySQL
22 answers
I have this table for documents (simplified version here):
+------+-------+--------------------------------------+
| id | rev | content |
+------+-------+--------------------------------------+
| 1 | 1 | ... |
| 2 | 1 | ... |
| 1 | 2 | ... |
| 1 | 3 | ... |
+------+-------+--------------------------------------+
How do I select one row per id and only the greatest rev?
With the above data, the result should contain two rows: [1, 3, ...] and [2, 1, ...]. I'm using MySQL.
Currently I use checks in the while loop to detect and overwrite old revs from the result set. But is this the only method to achieve the result? Isn't there a SQL solution?
Update
As the answers suggest, there is a SQL solution, and here is a sqlfiddle demo.
Update 2
I noticed after adding the above sqlfiddle, the rate at which the question is upvoted has surpassed the upvote rate of the answers. That has not been the intention! The fiddle is based on the answers, especially the accepted answer.
mysql sql aggregate-functions greatest-n-per-group groupwise-maximum
We're looking for long answers that provide some explanation and context. Don't just give a one-line answer; explain why your answer is right, ideally with citations. Answers that don't include explanations may be removed.
marked as duplicate by Bill Karwin
Mar 20 at 0:19
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
Do you need the corresponding content field for the row?
– Mark Byers
Oct 12 '11 at 19:45
Yes, and that would pose no problem, I have cut out many columns which I'd be adding back.
– Majid Fouladpour
Oct 12 '11 at 19:48
1
@MarkByers I have edited my answer to comply with OP needs. Since I was at it, I decided to write a more comprehensive answer on the greatest-n-per-group topic.
– Adrian Carneiro
Oct 12 '11 at 20:57
This is a common greatest-n-per-group problem, which has well-tested and optimized solutions. I prefer the left join solution by Bill Karwin (the original post). Note that a bunch of solutions to this common problem can, surprisingly, be found in one of the most official sources, the MySQL manual! See Examples of Common Queries :: The Rows Holding the Group-wise Maximum of a Certain Column.
– TMS
Apr 28 '14 at 11:50
2
duplicate of Retrieving the last record in each group
– TMS
Jul 8 '14 at 18:39
edited Jan 2 at 22:44 by Rick James
asked Oct 12 '11 at 19:42 by Majid Fouladpour
27 Answers
At first glance...
All you need is a GROUP BY clause with the MAX aggregate function:
SELECT id, MAX(rev)
FROM YourTable
GROUP BY id
It's never that simple, is it?
I just noticed you need the content column as well.
This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one of the questions I answered in my current job's technical interview.
It is, actually, so common that the StackOverflow community has created a single tag just to deal with questions like that: greatest-n-per-group.
Basically, you have two approaches to solve that problem:
Joining with simple group-identifier, max-value-in-group Sub-query
In this approach, you first find the group-identifier, max-value-in-group (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:
SELECT a.id, a.rev, a.content
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
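To try the join-with-sub-query approach against the question's sample data, here is a minimal sketch. It assumes Python's built-in sqlite3 module instead of MySQL, purely so the example is self-contained, and the content values 'a' to 'd' are placeholders because the question elides them.

```python
import sqlite3

# In-memory database holding the question's sample rows
# (content values are made-up placeholders).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# The sub-query finds (id, max rev) per group; the join pulls in the
# remaining columns of the rows that match both values.
rows = conn.execute("""
    SELECT a.id, a.rev, a.content
    FROM YourTable a
    INNER JOIN (
        SELECT id, MAX(rev) AS rev
        FROM YourTable
        GROUP BY id
    ) b ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id
""").fetchall()

print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```

The ORDER BY is only there to make the printed order predictable; the technique itself does not need it.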
Left Joining with self, tweaking join conditions and filters
In this approach, you left join the table with itself. Equality, of course, goes on the group-identifier. Then, 2 smart moves:
- The second join condition requires the left-side value to be less than the right-side value.
- When you do step 1, the row(s) that actually have the max value will have NULL on the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.
So you end up with:
SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
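The two steps of the self-join can be watched separately. The sketch below is an illustration only: it assumes Python's built-in sqlite3 module rather than MySQL, and the content values are placeholders. It prints the raw joined rows first, then the filtered result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# Step 1: the raw self-join. A row holding its group's maximum finds no
# partner with a greater rev, so its right-hand side is NULL (None).
joined = conn.execute("""
    SELECT a.id, a.rev, b.rev
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    ORDER BY a.id, a.rev, b.rev
""").fetchall()
print(joined)
# [(1, 1, 2), (1, 1, 3), (1, 2, 3), (1, 3, None), (2, 1, None)]

# Step 2: keep only the rows whose right-hand side is NULL.
latest = conn.execute("""
    SELECT a.id, a.rev, a.content
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
    ORDER BY a.id
""").fetchall()
print(latest)  # [(1, 3, 'd'), (2, 1, 'b')]
```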
Conclusion
Both approaches bring the exact same result.
If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.
Both approaches are ANSI SQL compatible and thus will work with your favorite RDBMS, regardless of its "flavor".
Both approaches are also performance friendly; however, your mileage may vary (RDBMS, DB structure, indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one that makes the most sense to you.
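The point about ties can be checked with a small experiment. This sketch assumes Python's built-in sqlite3 module and adds a hypothetical extra row (1, 3, 'e') to the question's data so that id 1 has two rows sharing the maximum rev.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    # (1, 3, 'e') is an invented duplicate of the max rev for id 1.
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d"), (1, 3, "e")],
)

join_sql = """
    SELECT a.id, a.rev, a.content
    FROM YourTable a
    INNER JOIN (SELECT id, MAX(rev) AS rev FROM YourTable GROUP BY id) b
        ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id, a.content
"""
left_join_sql = """
    SELECT a.id, a.rev, a.content
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
    ORDER BY a.id, a.content
"""

via_join = conn.execute(join_sql).fetchall()
via_left_join = conn.execute(left_join_sql).fetchall()

# Both approaches keep both tied rows for id 1.
print(via_join)       # [(1, 3, 'd'), (1, 3, 'e'), (2, 1, 'b')]
print(via_left_join)  # [(1, 3, 'd'), (1, 3, 'e'), (2, 1, 'b')]
```

If exactly one row per group is required, a tie-breaking rule has to be added on top of either query.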
8
I know that MySQL allows you to add non-aggregate fields to a "grouped by" query, but I find that kinda pointless. Try running this: select id, max(rev), rev from YourTable group by id and you'll see what I mean. Take your time and try to understand it.
– Adrian Carneiro
Oct 12 '11 at 20:05
3
@JasonMcCarrell I'm glad this answer helped you! I get your point, this is why I called it group_identifier, which could be one or more columns. In your case, group_identifier is the combination of name and age.
– Adrian Carneiro
Dec 12 '12 at 16:50
6
How do I get it to return only one row per group though? Don't these answers return every row in each group that has a compare value equal to the maximum value? For instance, suppose there was a second row in the OP's dataset with id = 1, rev = 3. Wouldn't it return both rows with id=1, rev=3?
– Michael Lang
Jun 24 '13 at 22:42
2
@RobertChrist to arbitrarily break ties with the first version, just add DISTINCT ON (yt.id) after the initial SELECT. That made my query take twice as long though. So, I don't tie-break since ties are practically impossible in my case.
– ma11hew28
Mar 14 '14 at 0:29
2
Why would the first solution work? Won't the max function run per each group consisting of a single row instead of all the rows as a whole?
– Gherman
Sep 18 '14 at 8:24
My preference is to use as little code as possible...
You can do it using IN. Try this:
SELECT *
FROM t1 WHERE (id,rev) IN
( SELECT id, MAX(rev)
FROM t1
GROUP BY id
)
To my mind it is less complicated... easier to read and maintain.
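A quick way to try the row-value IN form is sketched below. It assumes Python's built-in sqlite3 module linked against SQLite 3.15 or later, where row values are accepted; it was not run on MySQL or Oracle, and the content values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# The (id, rev) pair of each outer row is compared against the set of
# (id, max rev) pairs produced by the grouped sub-query.
rows = conn.execute("""
    SELECT *
    FROM t1
    WHERE (id, rev) IN (
        SELECT id, MAX(rev)
        FROM t1
        GROUP BY id
    )
    ORDER BY id
""").fetchall()

print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```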
23
Curious - which database engine can we use this type of WHERE clause in? This is not supported in SQL Server.
– Kash
Nov 17 '11 at 17:04
18
oracle & mysql (not sure about other databases sorry)
– Kevin Burton
Nov 17 '11 at 18:03
20
Works on PostgreSQL too.
– lcguida
Jan 15 '14 at 17:43
10
Confirmed working in DB2
– coderatchet
Jan 29 '14 at 2:32
11
Does not work with SQLite.
– Marcel Pfeiffer
Oct 26 '14 at 20:32
Yet another solution is to use a correlated subquery:
select yt.id, yt.rev, yt.content
from YourTable yt
where rev =
(select max(rev) from YourTable st where yt.id=st.id)
Having an index on (id,rev) makes the subquery almost a simple lookup...
Following are comparisons with the solutions in @AdrianCarneiro's answer (subquery, leftjoin), based on MySQL measurements on an InnoDB table of ~1 million records, with group sizes of 1-3.
For full table scans, the subquery/leftjoin/correlated timings relate to each other as 6/8/9. When it comes to direct lookups or batches (id in (1,2,3)), however, subquery is much slower than the others (due to rerunning the subquery). I couldn't tell the leftjoin and correlated solutions apart in speed.
One final note: as leftjoin creates roughly n*(n-1)/2 joined row pairs within a group of n rows, its performance can be heavily affected by the size of the groups...
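The correlated subquery together with the (id, rev) index can be sketched as follows. This assumes Python's built-in sqlite3 module, so the printed query plan is SQLite's and says nothing about what MySQL's optimizer does; the index name idx_id_rev and the content values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)
# Composite index so MAX(rev) for a given id can be found via the index.
conn.execute("CREATE INDEX idx_id_rev ON YourTable (id, rev)")

query = """
    SELECT yt.id, yt.rev, yt.content
    FROM YourTable yt
    WHERE yt.rev = (SELECT MAX(st.rev)
                    FROM YourTable st
                    WHERE st.id = yt.id)
    ORDER BY yt.id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]

# Show how SQLite plans the query; the last column is the readable detail.
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step[-1])
```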
This is the only one so far that worked in the way I needed it, thanks (needed to match by name, not by id)
– Doomed Mind
Feb 2 '17 at 15:27
1
I dont think this works if rev is not unique.
– Pita
Jun 5 '17 at 21:13
@Pita no. it works even if rev is not unique
– Pradeep Kumar Prabaharan
Sep 29 '17 at 16:44
Good point for mentioning index required for simple lookup (apparently cannot plus 1 in comments anymore)
– Jared Becksfort
Nov 13 '17 at 18:10
"However I couldn't differentiate between leftjoin and correlated solutions in speed." - the same for me for SQL Server
– nahab
Feb 15 '18 at 13:48
I am flabbergasted that no answer offered a SQL window function solution:
SELECT a.id, a.rev, a.content
FROM (SELECT id, rev, content,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) rn
FROM YourTable) a
WHERE a.rn = 1
Window (or windowing) functions were added to the SQL standard in ANSI/ISO SQL:2003 and later extended in ANSI/ISO SQL:2008, and they are now available from all major vendors. There are more types of rank functions available to deal with a tie issue: RANK, DENSE_RANK, PERCENT_RANK.
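Here is a minimal sketch of the ROW_NUMBER() approach, including how it settles ties. It assumes Python's built-in sqlite3 module linked against SQLite 3.25 or later (the first release with window functions), a hypothetical tied row (1, 3, 'e'), and an arbitrary tie-break on content that is not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    # (1, 3, 'e') is an invented tie with (1, 3, 'd').
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d"), (1, 3, "e")],
)

# ROW_NUMBER() numbers the rows of each id from the highest rev down.
# The secondary sort on content makes the choice among tied rows
# deterministic; row number 1 is then the single winner per id.
rows = conn.execute("""
    SELECT a.id, a.rev, a.content
    FROM (SELECT id, rev, content,
                 ROW_NUMBER() OVER (
                     PARTITION BY id
                     ORDER BY rev DESC, content
                 ) AS rn
          FROM YourTable) a
    WHERE a.rn = 1
    ORDER BY a.id
""").fetchall()

print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```

Unlike the join-based queries, this returns exactly one row per id even when the maximum rev is duplicated.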
I think it is less intuitive and potentially less clear - but it can definitely work/be a solution.
– mmcrae
Jan 10 '17 at 16:52
4
intuition is tricky thing. I find it more intuitive than other answers as it builds explicit data structure that answers the question. But, again, intuition is the other side of bias...
– topchef
Jan 10 '17 at 18:22
8
This might work in MariaDB 10.2 and MySQL 8.0.2, but not before.
– Rick James
Apr 1 '17 at 22:01
2
At last, I was beginning to wonder why this wasn't here. This is far more "intuitive" than the vast majority of the "old hat" answers on this page, and way more efficient in almost all cases as it requires just a single pass of the data. Most databases now support these standard window functions (MySQL is late but will from v8 onward).
– Used_By_Already
Dec 11 '17 at 0:42
1
I had no idea this feature existed. Dug deeply into a bunch of manuals this evening. This makes so much more sense than left joins (just from a lack of frustration perspective).
– Andrew Philips
Oct 19 '18 at 4:42
I can't vouch for the performance, but here's a trick inspired by the limitations of Microsoft Excel. It has some good features
GOOD STUFF
- It should force return of only one "max record" even if there is a tie (sometimes useful)
- It doesn't require a join
APPROACH
It is a little bit ugly and requires that you know something about the range of valid values of the rev column. Let us assume that we know the rev column is a number between 0.00 and 999 including decimals but that there will only ever be two digits to the right of the decimal point (e.g. 34.17 would be a valid value).
The gist of the thing is that you create a single synthetic column by string concatenating/packing the primary comparison field along with the data you want. In this way, you can force SQL's MAX() aggregate function to return all of the data (because it has been packed into a single column). Then you have to unpack the data.
Here's how it looks with the above example, written in SQL:
SELECT id,
CAST(SUBSTRING(max(packed_col) FROM 2 FOR 6) AS float) as max_rev,
-- content starts at position 12: 8 characters of packed rev plus the 3-character '---' separator
SUBSTRING(max(packed_col) FROM 12) AS content_for_max_rev
FROM (SELECT id,
CAST(1000 + rev + .001 as CHAR) || '---' || CAST(content AS char) AS packed_col
FROM yourtable
) AS packed
GROUP BY id
The packing begins by forcing the rev column to be a number of known character length regardless of the value of rev so that for example
- 3.2 becomes 1003.201
- 57 becomes 1057.001
- 923.88 becomes 1923.881
If you do it right, string comparison of two numbers should yield the same "max" as numeric comparison of the two numbers and it's easy to convert back to the original number using the substring function (which is available in one form or another pretty much everywhere).
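The pack / MAX / unpack cycle can be sketched with Python's built-in sqlite3 module. This is an adaptation, not the answer's exact SQL: it assumes SQLite's printf() and substr() functions in place of CAST ... AS CHAR and SUBSTRING ... FROM, and the rev and content values are invented to match the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER, rev REAL, content TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [(1, 3.2, "a"), (1, 57, "b"), (1, 923.88, "c"), (2, 1, "d")],
)

# Pack: printf('%08.3f', ...) yields a fixed 8-character number such as
# '1003.201', so string order equals numeric order. After the 3-character
# '---' separator, the content starts at position 12.
rows = conn.execute("""
    SELECT id,
           CAST(substr(MAX(packed_col), 2, 6) AS REAL) AS max_rev,
           substr(MAX(packed_col), 12) AS content_for_max_rev
    FROM (SELECT id,
                 printf('%08.3f', 1000 + rev + 0.001) || '---' || content
                     AS packed_col
          FROM yourtable) AS packed
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

For id 1 the largest packed string is '1923.881---c', so the unpacked row carries rev 923.88 and content 'c'.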
Great solution, it performs much faster than join and other proposed solutions.
– danial
Sep 29 '14 at 22:10
I think this is the easiest solution :
SELECT *
FROM
(SELECT *
FROM Employee
ORDER BY Salary DESC)
AS employeesub
GROUP BY employeesub.Salary;
- SELECT * : Return all fields.
- FROM Employee : Table searched on.
- (SELECT *...) subquery : Return all people, sorted by Salary.
- GROUP BY employeesub.Salary : Force the top-sorted, Salary row of each employee to be the returned result.
If you happen to need just the one row, it's even easier :
SELECT *
FROM Employee
ORDER BY Employee.Salary DESC
LIMIT 1
I also think it's the easiest to break down, understand, and modify to other purposes:
- ORDER BY Employee.Salary DESC : Order the results by the salary, with highest salaries first.
- LIMIT 1 : Return just one result.
Understanding this approach, solving any of these similar problems becomes trivial: get employee with lowest salary (change DESC to ASC), get top-ten earning employees (change LIMIT 1 to LIMIT 10), sort by means of another field (change ORDER BY Employee.Salary to ORDER BY Employee.Commission), etc.
1
This does not answer the question. The question is asking how to get the data for one row (as was asked, "one row per ID") in a group query where value x is the max within each group of rows. For example a customer order table with multiple orders per customer where you want to retrieve the largest order for each customer. Your query might very well return more than one row per customer (if, for example, the two largest orders were placed by the same customer).
– Aaron J Spetner
Oct 2 '17 at 6:39
"one row per ID" <-- keep reading, please, and you'll see "and only the greatest". That is logically equivalent to just the greatest.
– HoldOffHunger
Oct 2 '17 at 12:17
Yes, but it says "and". Which means the requirements are BOTH one row per ID AND only the greatest. Using this answer will not satisfy the first requirement. Additionally, the question implies the need to retrieve a single record for ALL of the IDs. This answer requires knowledge of the number of IDs beforehand (in order to configure the LIMIT), which will require additional code. The question's goal is stated specifically as seeking a SQL-only solution. Finally, even if you know the number of unique IDs, if there are multiple occurrences of the MAX value, the LIMIT clause will be wrong.
– Aaron J Spetner
Oct 3 '17 at 7:12
1
I did not have the exact same situation like in the original post but this is the most easy to understand and straightforward and working solution i came across so far for my problem. I am amazed how all the geeks and freaks try to overtake each other by bragging with complex / weird queries.
– sba
Oct 5 '17 at 14:58
1
This is a hacky solution, totally busted in later MySQL versions; it won't work on servers with ONLY_FULL_GROUP_BY enabled within the server config... sqlfiddle.com/#!9/215cd/4
– Raymond Nijland
Jun 18 '18 at 15:55
Something like this?
SELECT yourtable.id, rev, content
FROM yourtable
INNER JOIN (
SELECT id, max(rev) as maxrev FROM yourtable
WHERE yourtable
GROUP BY id
) AS child ON (yourtable.id = child.id) AND (yourtable.rev = maxrev)
The join-less ones wouldn't cut it?
– Majid Fouladpour
Oct 12 '11 at 19:51
1
If they work, then they're fine too.
– Marc B
Oct 12 '11 at 19:54
10
What does WHERE yourtable do?
– Brian McCutchon
Jun 3 '16 at 5:19
This seems to be the fastest one (with proper indexes).
– Salman A
Feb 13 at 12:27
Since this is the most popular question with regard to this problem, I'll re-post another answer to it here as well:
It looks like there is a simpler way to do this (but only in MySQL):
select *
from (select * from mytable order by id, rev desc ) x
group by id
Please credit answer of user Bohemian in this question for providing such a concise and elegant answer to this problem.
EDIT: though this solution works for many people, it may not be stable in the long run, since MySQL doesn't guarantee that a GROUP BY statement will return meaningful values for columns not in the GROUP BY list. So use this solution at your own risk.
7
Except that it's wrong, as there is no guarantee that the order of the inner query means anything, nor is the GROUP BY always guaranteed to take the first encountered row. At least in MySQL and I would assume all others. In fact I was under the assumption that MySQL would simply ignore the whole ORDER BY. Any future version or a change in configuration might break this query.
– Jannes
Oct 10 '14 at 10:14
@Jannes this is interesting remark :) I welcome you to answer my question providing proofs: stackoverflow.com/questions/26301877/…
– Yura
Oct 10 '14 at 14:41
1
@Jannes concerning GROUP BY not guaranteed to take the first encountered row - you are totally right - found this issue bugs.mysql.com/bug.php?id=71942 which asks to provide such guarantees. Will update my answer now
– Yura
Oct 10 '14 at 14:59
I think I remember where I got the ORDER BY being discarded from: MySQL does that with UNIONs if you ORDER BY the inner queries, it's just ignore: dev.mysql.com/doc/refman/5.0/en/union.html says "If ORDER BY appears without LIMIT in a SELECT, it is optimized away because it will have no effect anyway." I haven't seen such a statement for the query in question here, but I don't see why it couldn't do that.
– Jannes
Oct 11 '14 at 19:09
I like to use a NOT EXISTS-based solution for this problem:
SELECT id, rev
FROM YourTable t
WHERE NOT EXISTS (
SELECT * FROM YourTable t2 WHERE t2.id = t.id AND t2.rev > t.rev
)
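A runnable sketch of the NOT EXISTS idea follows. It assumes Python's built-in sqlite3 module rather than MySQL, placeholder content values, and distinct aliases (t for the outer table, t2 for the inner one) so the correlation between the two copies of the table is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# Keep a row only if no other row with the same id has a greater rev.
rows = conn.execute("""
    SELECT t.id, t.rev
    FROM YourTable t
    WHERE NOT EXISTS (
        SELECT 1
        FROM YourTable t2
        WHERE t2.id = t.id AND t2.rev > t.rev
    )
    ORDER BY t.id
""").fetchall()

print(rows)  # [(1, 3), (2, 1)]
```

If both copies of the table shared one alias, every column in the inner WHERE clause would resolve to the inner copy, the comparison would never be true, and every row would be returned.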
1
yes, not exists like this has generally been the preferred way rather than a left join. In older versions of SQL server it was faster, although i think now it makes no difference. I normally do SELECT 1 instead of SELECT *, again because in prior versions it was faster.
– EGP
Oct 8 '14 at 12:38
A third solution I hardly ever see mentioned is MySQL specific and looks like this:
SELECT id, MAX(rev) AS rev
, 0+SUBSTRING_INDEX(GROUP_CONCAT(numeric_content ORDER BY rev DESC), ',', 1) AS numeric_content
FROM t1
GROUP BY id
Yes, it looks awful (converting to string and back etc.), but in my experience it's usually faster than the other solutions. Maybe that's just for my use cases, but I have used it on tables with millions of records and many unique ids. Maybe it's because MySQL is pretty bad at optimizing the other solutions (at least in the 5.0 days when I came up with this solution).
One important thing is that GROUP_CONCAT has a maximum length for the string it can build up. You probably want to raise this limit by setting the group_concat_max_len variable. And keep in mind that this will be a limit on scaling if you have a large number of rows.
Anyway, the above doesn't directly work if your content field is already text. In that case you probably want to use a different separator, like maybe. You'll also run into the group_concat_max_len limit quicker.
If you have many fields in select statement and you want latest value for all of those fields through optimized code:
select * from
(select * from table_name
order by id,rev desc) temp
group by id
This works OK for small tables, but takes 6 passes over the entire dataset, so not fast for large tables.
– Rick James
May 17 '17 at 0:48
This is the query I needed because there were other columns involved, too.
– Mike Viens
Jun 1 '18 at 19:07
NOT MySQL, but for other people finding this question and using SQL, another way to resolve the greatest-n-per-group problem is using CROSS APPLY in MS SQL:
WITH DocIds AS (SELECT DISTINCT id FROM docs)
SELECT d2.id, d2.rev, d2.content
FROM DocIds d1
CROSS APPLY (
SELECT Top 1 * FROM docs d
WHERE d.id = d1.id
ORDER BY rev DESC
) d2
Here's an example in SqlFiddle
very slow comparing to other methods - group by, windows, not exists
– nahab
Feb 15 '18 at 13:40
I think you want this?
select * from docs where (id, rev) IN (select id, max(rev) as rev from docs group by id order by id)
SQL Fiddle :
Check here
I would use this:
select t.*
from test as t
join
(select max(rev) as rev
from test
group by id) as o
on o.rev = t.rev
A subquery SELECT may not be very efficient, but in a JOIN clause it seems to be usable. I'm not an expert in optimizing queries, but I've tried it on MySQL, PostgreSQL, and Firebird, and it works very well.
You can use this schema in multiple joins and with a WHERE clause. Here is my working example (solving a problem identical to yours, with the table "firmy"):
select *
from platnosci as p
join firmy as f
on p.id_rel_firmy = f.id_rel
join (select max(id_obj) as id_obj
from firmy
group by id_rel) as o
on o.id_obj = f.id_obj and p.od > '2014-03-01'
It is run on tables with tens of thousands of records, and it takes less than 0.01 second on a really not too strong machine.
I wouldn't use an IN clause (as mentioned somewhere above). IN is meant for short lists of constants, not as a query filter built on a subquery, because the subquery in IN is performed for every scanned record, which can make the query take a very long time.
I think using that subquery as a CTE might at least improve performance
– mmcrae
Jan 10 '17 at 18:52
Hi! For me it looks like your 1st query needs ...and o.id = t.id at the end (and the subquery should return id for that). Doesn't it?
– Dmitry Grekov
Aug 10 '18 at 11:37
How about this:
SELECT all_fields.*
FROM (SELECT id, MAX(rev) AS max_rev FROM yourtable GROUP BY id) AS max_recs
LEFT OUTER JOIN yourtable AS all_fields
ON max_recs.id = all_fields.id AND max_recs.max_rev = all_fields.rev
SELECT *
FROM Employee
WHERE (Employee.Employe_id, Employee.Salary) IN (SELECT Employe_id, MAX(Salary) FROM Employee GROUP BY Employe_id)
ORDER BY Employee.Salary
Another way to do the job is to use the MAX() analytic function with an OVER (PARTITION BY ...) clause:
SELECT t.*
FROM
(
SELECT id
,rev
,contents
,MAX(rev) OVER (PARTITION BY id) as max_rev
FROM YourTable
) t
WHERE t.rev = t.max_rev
The other solution, ROW_NUMBER() with OVER (PARTITION BY ...), already documented in this post, is:
SELECT t.*
FROM
(
SELECT id
,rev
,contents
,ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) rank
FROM YourTable
) t
WHERE t.rank = 1
These two SELECTs work well on Oracle 10g.
The MAX() solution should run faster than the ROW_NUMBER() solution, because MAX() has O(n) complexity while ROW_NUMBER() needs a sort and so has at least O(n log n) complexity, where n is the number of records in the table. The actual difference depends on the optimizer and the available indexes.
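Performance aside, the two queries are not interchangeable when rev values tie within an id: the MAX() version keeps every tied row, while ROW_NUMBER() keeps exactly one. A small sketch with Python's built-in sqlite3 module shows the difference. It assumes SQLite 3.25 or later for window functions, and the sample rows (including the deliberate tie) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 3, "b"), (1, 3, "c"), (2, 1, "d")],  # id 1 ties at rev 3
)

# MAX() OVER: every row whose rev equals the group maximum survives.
with_max = conn.execute("""
    SELECT id, rev, contents FROM (
        SELECT id, rev, contents,
               MAX(rev) OVER (PARTITION BY id) AS max_rev
        FROM YourTable
    ) AS t
    WHERE rev = max_rev
    ORDER BY id, contents
""").fetchall()

# ROW_NUMBER(): exactly one row per id, chosen arbitrarily among ties.
with_row_number = conn.execute("""
    SELECT id, rev FROM (
        SELECT id, rev,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) AS rn
        FROM YourTable
    ) AS t
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(with_max)         # [(1, 3, 'b'), (1, 3, 'c'), (2, 1, 'd')]
print(with_row_number)  # [(1, 3), (2, 1)]
```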
This solution makes only one selection from YourTable, therefore it's faster. According to tests on sqlfiddle.com it works only for MySQL and SQLite (for SQLite, remove DESC). Maybe it can be tweaked to work on other database systems that I am not familiar with.
SELECT *
FROM ( SELECT *
FROM ( SELECT 1 as id, 1 as rev, 'content1' as content
UNION
SELECT 2, 1, 'content2'
UNION
SELECT 1, 2, 'content3'
UNION
SELECT 1, 3, 'content4'
) as YourTable
ORDER BY id, rev DESC
) as YourTable
GROUP BY id
This doesn't appear to work for the general case. And it doesn't work at all in PostgreSQL, returning: ERROR: column "yourtable.rev" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT *
– ma11hew28
Mar 13 '14 at 16:26
Sorry, I didn't make clear the first time which databases it works on.
– plavozont
Mar 17 '14 at 5:11
Here is a nice way of doing that. Use the following code:
with temp as (
select count(field1) as summ , field1
from table_name
group by field1 )
select * from temp where summ = (select max(summ) from temp)
I like to do this by ranking the records by some column. In this case, rank rev values grouped by id. Those with a higher rev will have lower rankings, so the highest rev will have a ranking of 1.
select id, rev, content
from
(select
@rowNum := if(@prevValue = id, @rowNum+1, 1) as row_num,
id, rev, content,
@prevValue := id
from
(select id, rev, content from YOURTABLE order by id asc, rev desc) TEMP,
(select @rowNum := 1 from DUAL) X,
(select @prevValue := -1 from DUAL) Y) TEMP
where row_num = 1;
Not sure if introducing variables makes the whole thing slower. But at least I'm not querying YOURTABLE twice.
I only tried this approach in MySQL. Oracle has a similar function for ranking records, so the idea should work there too.
– user5124980
Jul 16 '15 at 18:54
1
Reading & writing a variable in a select statement is undefined in MySQL although particular versions happen to give the answer you might expect for certain syntax involving case expressions.
– philipxy
Sep 22 '18 at 10:57
I sorted the rows by rev in descending order and then grouped by id, which gave the first row of each group, i.e. the one with the highest rev value.
SELECT * FROM (SELECT * FROM table1 ORDER BY id, rev DESC) X GROUP BY X.id;
Tested in http://sqlfiddle.com/ with the following data
CREATE TABLE table1
(`id` int, `rev` int, `content` varchar(11));
INSERT INTO table1
(`id`, `rev`, `content`)
VALUES
(1, 1, 'One-One'),
(1, 2, 'One-Two'),
(2, 1, 'Two-One'),
(2, 2, 'Two-Two'),
(3, 2, 'Three-Two'),
(3, 1, 'Three-One'),
(3, 3, 'Three-Three')
;
This gave the following result in MySQL 5.5 and 5.6:
id rev content
1 2 One-Two
2 2 Two-Two
3 3 Three-Three
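For comparison, a NOT EXISTS filter returns the latest row per id without depending on how GROUP BY picks values for non-grouped columns. This is a different technique from the one in this answer, sketched here with Python's built-in sqlite3 module on the same sample data; the in-memory database is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, 1, "One-One"), (1, 2, "One-Two"),
        (2, 1, "Two-One"), (2, 2, "Two-Two"),
        (3, 2, "Three-Two"), (3, 1, "Three-One"), (3, 3, "Three-Three"),
    ],
)

# Keep a row only if no other row with the same id has a higher rev.
rows = conn.execute("""
    SELECT t.id, t.rev, t.content
    FROM table1 AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM table1 AS newer
        WHERE newer.id = t.id AND newer.rev > t.rev
    )
    ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
# (1, 2, 'One-Two')
# (2, 2, 'Two-Two')
# (3, 3, 'Three-Three')
```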
This technique used to work, but no longer. See mariadb.com/kb/en/mariadb/…
– Rick James
Apr 1 '17 at 22:02
1
The original question is tagged "mysql" and I have stated very clearly that my solution was tested with both MySQL 5.5 and 5.6 on sqlfiddle.com. I have provided all the steps to independently verify the solution. I have not made any false claim that my solution works with MariaDB. MariaDB is not MySQL; it's just a drop-in replacement for MySQL, and the two are owned by different companies. Your comment will help anyone who is trying to implement it in MariaDB, but my post in no way deserves a negative vote, as it clearly answers the question that was asked.
– blokeish
Apr 3 '17 at 0:34
1
Yes, it works in older versions. And I have used that technique in the past, only to be burned when it stopped working. Also, MySQL (in 5.7?) will be ignoring the ORDER BY in a subquery. Since lots of people will read your answer, I am trying to steer them away from a technique that will break in their future. (And I did not give you the -1 vote.)
– Rick James
Apr 3 '17 at 2:38
1
Tests prove nothing. ORDER BY in a subquery has no guaranteed effect other than for a LIMIT in the same subquery. Even if order was preserved, the GROUP BY would not preserve it. Even if it were preserved, non-standard GROUP BY relying on disabled ONLY_FULL_GROUP_BY is specified to return some row in a group for a non-grouped column but not necessarily the first. So your query is not correct.
– philipxy
Sep 22 '18 at 11:50
Here is another solution; I hope it helps someone:
Select a.id , a.rev, a.content from Table1 a
inner join
(SELECT id, max(rev) rev FROM Table1 GROUP BY id) x on x.id =a.id and x.rev =a.rev
None of these answers have worked for me.
This is what worked for me.
with score as (select max(score_up) from history)
select history.* from score, history where history.score_up = score.max
Here's another solution that retrieves only the records holding the maximum value of a given field. This works for SQL400, which is the platform I work on. In this example, the records with the maximum value in field FIELD5 are retrieved by the following SQL statement.
SELECT A.KEYFIELD1, A.KEYFIELD2, A.FIELD3, A.FIELD4, A.FIELD5
FROM MYFILE A
WHERE RRN(A) IN
(SELECT RRN(B)
FROM MYFILE B
WHERE B.KEYFIELD1 = A.KEYFIELD1 AND B.KEYFIELD2 = A.KEYFIELD2
ORDER BY B.FIELD5 DESC
FETCH FIRST ROW ONLY)
I used the below to solve a problem of my own. I first created a temp table and inserted the max rev value per unique id.
CREATE TABLE #temp1
(
id varchar(20)
, rev int
)
INSERT INTO #temp1
SELECT a.id, MAX(a.rev) as rev
FROM
(
SELECT id, content, SUM(rev) as rev
FROM YourTable
GROUP BY id, content
) as a
GROUP BY a.id
ORDER BY a.id
I then joined these max values (#temp1) to all of the possible id/content combinations. By doing this, I naturally filter out the non-maximum id/content combinations, and am left with only the max rev values for each.
SELECT a.id, a.rev, content
FROM #temp1 as a
LEFT JOIN
(
SELECT id, content, SUM(rev) as rev
FROM YourTable
GROUP BY id, content
) as b on a.id = b.id and a.rev = b.rev
GROUP BY a.id, a.rev, b.content
ORDER BY a.id
You can make the select without a join when you combine the rev and id into one maxRevId value for MAX() and then split it back into the original values:
SELECT maxRevId & ((1 << 32) - 1) as id, maxRevId >> 32 AS rev
FROM (SELECT MAX(((rev << 32) | id)) AS maxRevId
FROM YourTable
GROUP BY id) x;
This is especially fast when there is a complex join instead of a single table. With the traditional approaches the complex join would be done twice.
The above combination is simple with bit functions when rev and id are INT UNSIGNED (32 bit) and the combined value fits into BIGINT UNSIGNED (64 bit). When id and rev are larger than 32-bit values or are made of multiple columns, you need to combine the values into e.g. a binary value with suitable padding for MAX().
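The packing arithmetic can be checked outside the database. The following is a plain-Python sketch of the same shifts and masks, assuming both values fit in 32 unsigned bits; the helper names pack and unpack and the sample rows are hypothetical.

```python
MASK_32 = (1 << 32) - 1  # low 32 bits, same mask as in the query above


def pack(rev: int, id_: int) -> int:
    """Put rev in the high 32 bits so MAX() orders by rev first."""
    return (rev << 32) | id_


def unpack(max_rev_id: int) -> tuple:
    """Split a packed value back into (id, rev)."""
    return max_rev_id & MASK_32, max_rev_id >> 32


rows = [(1, 1), (2, 1), (1, 2), (1, 3)]  # (id, rev) pairs

# Emulate SELECT MAX((rev << 32) | id) ... GROUP BY id
best = {}
for id_, rev in rows:
    best[id_] = max(best.get(id_, 0), pack(rev, id_))

result = sorted(unpack(packed) for packed in best.values())
print(result)  # [(1, 3), (2, 1)]
```

Because rev occupies the high bits, comparing packed values compares rev first, which is what lets a single MAX() carry the second column along.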
Explanation
This is not pure SQL. This will use the SQLAlchemy ORM.
I came here looking for SQLAlchemy help, so I will duplicate Adrian Carneiro's answer with the python/SQLAlchemy version, specifically the outer join part.
This query answers the question of:
"Can you return me the records in this group of records (based on same id) that have the highest version number".
This allows me to duplicate the record, update it, increment its version number, and have the copy of the old version in such a way that I can show change over time.
Code
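from sqlalchemy import and_
from sqlalchemy.orm import aliased, join

# MyTable (the mapped class) and appdb (the application's database handle)
# are defined elsewhere in the application.
MyTableAlias = aliased(MyTable)
newest_records = appdb.session.query(MyTable).select_from(
    join(
        MyTable,
        MyTableAlias,
        onclause=and_(
            MyTable.id == MyTableAlias.id,
            MyTable.version_int < MyTableAlias.version_int,
        ),
        isouter=True,
    )
).filter(
    # Rows with no newer version have no match on the alias side.
    MyTableAlias.id.is_(None),
).all()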
Tested on a PostgreSQL database.
27 Answers
At first glance...
All you need is a GROUP BY clause with the MAX aggregate function:
SELECT id, MAX(rev)
FROM YourTable
GROUP BY id
It's never that simple, is it?
I just noticed you need the content column as well.
This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I have heard it a lot during my career. Actually, it was one of the questions I answered in my current job's technical interview.
It is, actually, so common that the Stack Overflow community has created a single tag just to deal with questions like that: greatest-n-per-group.
Basically, you have two approaches to solve that problem:
Joining with simple group-identifier, max-value-in-group sub-query
In this approach, you first find the group-identifier, max-value-in-group pair (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:
SELECT a.id, a.rev, a.content
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
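To see the two steps on the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the content strings are assumptions for illustration; the SQL itself is the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "v1"), (2, 1, "other"), (1, 2, "v2"), (1, 3, "v3")],
)

# Step 1: the (group-identifier, max-value-in-group) pairs.
max_per_id = conn.execute(
    "SELECT id, MAX(rev) FROM YourTable GROUP BY id ORDER BY id"
).fetchall()

# Step 2: join the table back to those pairs to recover the full rows.
latest = conn.execute("""
    SELECT a.id, a.rev, a.content
    FROM YourTable AS a
    INNER JOIN (
        SELECT id, MAX(rev) AS rev
        FROM YourTable
        GROUP BY id
    ) AS b ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id
""").fetchall()

print(max_per_id)  # [(1, 3), (2, 1)]
print(latest)      # [(1, 3, 'v3'), (2, 1, 'other')]
```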
Left Joining with self, tweaking join conditions and filters
In this approach, you left join the table with itself. Equality goes on the group-identifier. Then, 2 smart moves:
- The second join condition is having the left side value less than the right value
- When you do step 1, the row(s) that actually have the max value will have NULL in the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.
So you end up with:
SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
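It can help to look at the joined rows before the NULL filter is applied. The sketch below does that with Python's built-in sqlite3 module on the question's sample data; the in-memory database is an assumption for illustration, and the content column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [(1, 1), (2, 1), (1, 2), (1, 3)],
)

JOIN = """
    FROM YourTable AS a
    LEFT OUTER JOIN YourTable AS b
      ON a.id = b.id AND a.rev < b.rev
"""

# Every left row paired with each newer row of the same id (or NULL if none).
joined = conn.execute(
    "SELECT a.id, a.rev, b.rev" + JOIN + "ORDER BY a.id, a.rev, b.rev"
).fetchall()

# Keeping only the unmatched left rows leaves the newest revision per id.
latest = conn.execute(
    "SELECT a.id, a.rev" + JOIN + "WHERE b.id IS NULL ORDER BY a.id"
).fetchall()

print(joined)  # [(1, 1, 2), (1, 1, 3), (1, 2, 3), (1, 3, None), (2, 1, None)]
print(latest)  # [(1, 3), (2, 1)]
```

Python's None stands for SQL NULL here: only the rows with no newer partner carry it.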
Conclusion
Both approaches bring the exact same result.
If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.
Both approaches are ANSI SQL compatible and thus will work with your favorite RDBMS, regardless of its "flavor".
Both approaches are also performance friendly; however, your mileage may vary (RDBMS, DB structure, indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one that makes the most sense to you.
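The claim about ties can be checked directly. This sketch runs both queries against a table where id 1 has two rows at its maximum rev, using Python's built-in sqlite3 module; the sample rows and the QUERIES dictionary are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 3, "b"), (1, 3, "c"), (2, 1, "d")],  # id 1 ties at rev 3
)

QUERIES = {
    "join": """
        SELECT a.id, a.rev, a.content
        FROM YourTable AS a
        INNER JOIN (SELECT id, MAX(rev) AS rev FROM YourTable GROUP BY id) AS b
          ON a.id = b.id AND a.rev = b.rev
    """,
    "left_join": """
        SELECT a.id, a.rev, a.content
        FROM YourTable AS a
        LEFT OUTER JOIN YourTable AS b
          ON a.id = b.id AND a.rev < b.rev
        WHERE b.id IS NULL
    """,
}

# Sort in Python so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in QUERIES.items()}

print(results["join"])  # [(1, 3, 'b'), (1, 3, 'c'), (2, 1, 'd')]
print(results["join"] == results["left_join"])  # True
```

If exactly one row per group is required, a tie-breaking rule has to be added on top of either query.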
8
I know that MySQL allows you to add non-aggregate fields to a "grouped by" query, but I find that kinda pointless. Try running this: select id, max(rev), rev from YourTable group by id and you'll see what I mean. Take your time and try to understand it.
– Adrian Carneiro
Oct 12 '11 at 20:05
3
@JasonMcCarrell I'm glad this answer helped you! I get your point; this is why I called it group_identifier, which could be one or more columns. In your case, group_identifier is the combination of name and age.
– Adrian Carneiro
Dec 12 '12 at 16:50
6
How do I get it to return only one row per group though? Don't these answers return every row in each group that has a compare value equal to the maximum value? For instance, suppose there was a second row in the OP's dataset with id = 1, rev = 3. Wouldn't it return both rows with id=1, rev=3?
– Michael Lang
Jun 24 '13 at 22:42
2
@RobertChrist to arbitrarily break ties with the first version, just add DISTINCT ON (yt.id) after the initial SELECT. That made my query take twice as long though. So, I don't tie-break since ties are practically impossible in my case.
– ma11hew28
Mar 14 '14 at 0:29
2
Why would the first solution work? Won't the max function run once per group consisting of a single row, instead of over all the rows as a whole?
– Gherman
Sep 18 '14 at 8:24
edited Nov 8 '15 at 11:52 by user456814
answered Oct 12 '11 at 19:43 by Adrian Carneiro
My preference is to use as little code as possible... You can do it using IN; try this:
SELECT *
FROM t1 WHERE (id,rev) IN
( SELECT id, MAX(rev)
FROM t1
GROUP BY id
)
To my mind it is less complicated, and easier to read and maintain.
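As a quick check of the row-value IN form, here is a sketch using Python's built-in sqlite3 module. It assumes SQLite 3.15 or later, which is when row values were introduced; the sample rows are the question's data with made-up content strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, 1, "v1"), (2, 1, "other"), (1, 2, "v2"), (1, 3, "v3")],
)

# The pair (id, rev) must match one of the (id, MAX(rev)) pairs as a whole.
rows = conn.execute("""
    SELECT *
    FROM t1
    WHERE (id, rev) IN (
        SELECT id, MAX(rev)
        FROM t1
        GROUP BY id
    )
    ORDER BY id
""").fetchall()

print(rows)  # [(1, 3, 'v3'), (2, 1, 'other')]
```

Comparing the pair as a unit is what keeps a stale row from matching another id's maximum.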
23
Curious - which database engine can we use this type of WHERE clause in? This is not supported in SQL Server.
– Kash
Nov 17 '11 at 17:04
18
Oracle & MySQL (not sure about other databases, sorry)
– Kevin Burton
Nov 17 '11 at 18:03
20
Works on PostgreSQL too.
– lcguida
Jan 15 '14 at 17:43
10
Confirmed working in DB2
– coderatchet
Jan 29 '14 at 2:32
11
Does not work with SQLite.
– Marcel Pfeiffer
Oct 26 '14 at 20:32
edited Dec 16 '13 at 13:08
answered Oct 12 '11 at 19:47 by Kevin Burton
Yet another solution is to use a correlated subquery:
select yt.id, yt.rev, yt.contents
from YourTable yt
where rev =
(select max(rev) from YourTable st where yt.id=st.id)
Having an index on (id, rev) makes the subquery almost a simple lookup.
The following compares this with the solutions in @AdrianCarneiro's answer (subquery, leftjoin), based on MySQL measurements on an InnoDB table of ~1 million records with group sizes of 1-3.
For full table scans, the subquery/leftjoin/correlated timings relate to each other as 6/8/9. For direct lookups or batches (id in (1,2,3)), however, subquery is much slower than the others (due to rerunning the subquery), and I couldn't tell leftjoin and correlated apart in speed.
One final note: as leftjoin creates roughly n²/2 joined rows for a group of n rows, its performance can be heavily affected by the size of the groups.
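Here is a small sketch of the correlated subquery with the suggested index, plus a count of the rows the left self-join produces for one group, using Python's built-in sqlite3 module. The index name idx_id_rev and the sample rows are assumptions for illustration, and SQLite's planner will not necessarily behave like MySQL's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "b"), (1, 3, "c"), (1, 4, "d"), (2, 1, "e")],
)
# Composite index so MAX(rev) for a given id can be answered from the index.
conn.execute("CREATE INDEX idx_id_rev ON YourTable (id, rev)")

latest = conn.execute("""
    SELECT yt.id, yt.rev, yt.contents
    FROM YourTable AS yt
    WHERE yt.rev = (SELECT MAX(st.rev) FROM YourTable AS st WHERE st.id = yt.id)
    ORDER BY yt.id
""").fetchall()

# Rows produced by the left self-join for the group id = 1 (4 rows):
# 3 + 2 + 1 pairs with a newer row, plus one NULL-extended row for the newest.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM YourTable AS a
    LEFT JOIN YourTable AS b ON a.id = b.id AND a.rev < b.rev
    WHERE a.id = 1
""").fetchone()[0]

print(latest)       # [(1, 4, 'd'), (2, 1, 'e')]
print(joined_rows)  # 7
```

The join count grows quadratically with group size, which is why large groups hurt the left join approach more than the correlated subquery.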
This is the only one so far that worked in the way I needed it, thanks (needed to match by name, not by id)
– Doomed Mind
Feb 2 '17 at 15:27
1
I don't think this works if rev is not unique.
– Pita
Jun 5 '17 at 21:13
@Pita no. it works even if rev is not unique
– Pradeep Kumar Prabaharan
Sep 29 '17 at 16:44
Good point for mentioning index required for simple lookup (apparently cannot plus 1 in comments anymore)
– Jared Becksfort
Nov 13 '17 at 18:10
"However I couldnt differentiate between leftjoin and correlated solutions in speed." The same for me on SQL Server.
– nahab
Feb 15 '18 at 13:48
Yet another solution is to use a correlated subquery:
select yt.id, yt.rev, yt.contents
from YourTable yt
where rev =
(select max(rev) from YourTable st where yt.id=st.id)
Having an index on (id,rev) renders the subquery almost as a simple lookup...
Following are comparisons to the solutions in @AdrianCarneiro's answer (subquery, leftjoin), based on MySQL measurements with InnoDB table of ~1million records, group size being: 1-3.
While for full table scans subquery/leftjoin/correlated timings relate to each other as 6/8/9, when it comes to direct lookups or batch (id in (1,2,3)
), subquery is much slower then the others (Due to rerunning the subquery). However I couldnt differentiate between leftjoin and correlated solutions in speed.
One final note, as leftjoin creates n*(n+1)/2 joins in groups, its performance can be heavily affected by the size of groups...
This is the only one so far that worked in the way I needed it, thanks (needed to match by name, not by id)
– Doomed Mind
Feb 2 '17 at 15:27
1
I dont think this works if rev is not unique.
– Pita
Jun 5 '17 at 21:13
@Pita no. it works even if rev is not unique
– Pradeep Kumar Prabaharan
Sep 29 '17 at 16:44
Good point for mentioning index required for simple lookup (apparently cannot plus 1 in comments anymore)
– Jared Becksfort
Nov 13 '17 at 18:10
However I couldnt differentiate between leftjoin and correlated solutions in speed.
- the same for me for Sql Server
– nahab
Feb 15 '18 at 13:48
|
show 1 more comment
Yet another solution is to use a correlated subquery:
select yt.id, yt.rev, yt.contents
from YourTable yt
where rev =
(select max(rev) from YourTable st where yt.id=st.id)
Having an index on (id,rev) renders the subquery almost as a simple lookup...
Following are comparisons to the solutions in @AdrianCarneiro's answer (subquery, leftjoin), based on MySQL measurements with InnoDB table of ~1million records, group size being: 1-3.
While for full table scans subquery/leftjoin/correlated timings relate to each other as 6/8/9, when it comes to direct lookups or batch (id in (1,2,3)
), subquery is much slower then the others (Due to rerunning the subquery). However I couldnt differentiate between leftjoin and correlated solutions in speed.
One final note, as leftjoin creates n*(n+1)/2 joins in groups, its performance can be heavily affected by the size of groups...
Yet another solution is to use a correlated subquery:
select yt.id, yt.rev, yt.contents
from YourTable yt
where rev =
(select max(rev) from YourTable st where yt.id=st.id)
Having an index on (id,rev) renders the subquery almost as a simple lookup...
The following compares this with the solutions in @AdrianCarneiro's answer (subquery, leftjoin), based on MySQL measurements on an InnoDB table of ~1 million records, with group sizes of 1-3.
For full table scans, the subquery/leftjoin/correlated timings relate to each other as 6/8/9. For direct lookups or batches (id in (1,2,3)), however, subquery is much slower than the others, due to rerunning the subquery. I couldn't measure a speed difference between the leftjoin and correlated solutions.
One final note: as leftjoin creates on the order of n*(n+1)/2 joins within a group of n rows, its performance can be heavily affected by the size of groups...
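For anyone who wants to try the correlated subquery without a MySQL server at hand, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite behaves like MySQL for this particular query; the sample rows and the index name idx_id_rev are made up for illustration.

```python
import sqlite3

# Hypothetical sample data mirroring the table in the question: (id, rev, contents).
ROWS = [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")]

# The correlated subquery from the answer, with an ORDER BY added so the
# output order is predictable.
LATEST_PER_ID = """
    SELECT yt.id, yt.rev, yt.contents
    FROM YourTable yt
    WHERE yt.rev = (SELECT MAX(st.rev) FROM YourTable st WHERE st.id = yt.id)
    ORDER BY yt.id
"""


def latest_revisions(rows):
    """Load rows into an in-memory table and return the max-rev row(s) per id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
        # Composite index so the inner MAX(rev) can be answered per id from the index.
        conn.execute("CREATE INDEX idx_id_rev ON YourTable (id, rev)")
        conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", rows)
        return conn.execute(LATEST_PER_ID).fetchall()
    finally:
        conn.close()


result = latest_revisions(ROWS)
print(result)  # [(1, 3, 'd'), (2, 1, 'b')]
```

Note that if the same (id, rev) pair occurs more than once, this query returns every tied row rather than a single row per id, for example `latest_revisions(ROWS + [(1, 3, "e")])` yields both rev 3 rows for id 1.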
answered Jan 23 '14 at 14:16
Vajk Hermecz
This is the only one so far that worked in the way I needed it, thanks (needed to match by name, not by id)
– Doomed Mind
Feb 2 '17 at 15:27
I don't think this works if rev is not unique.
– Pita
Jun 5 '17 at 21:13
@Pita no. it works even if rev is not unique
– Pradeep Kumar Prabaharan
Sep 29 '17 at 16:44
Good point for mentioning index required for simple lookup (apparently cannot plus 1 in comments anymore)
– Jared Becksfort
Nov 13 '17 at 18:10
"However I couldnt differentiate between leftjoin and correlated solutions in speed." - the same for me for SQL Server
– nahab
Feb 15 '18 at 13:48
I am flabbergasted that no answer offered a SQL window function solution:
SELECT a.id, a.rev, a.contents
FROM (SELECT id, rev, contents,
-- the alias is rn, not rank: RANK is a reserved word in MySQL 8.0.2+
ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) AS rn
FROM YourTable) a
WHERE a.rn = 1
Window (or windowing) functions were added to the SQL standard in ANSI/ISO SQL:2003 and later extended in ANSI/ISO SQL:2008; they are available from all major vendors now. There are more ranking functions available to deal with ties: RANK, DENSE_RANK, PERCENT_RANK.
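To see how the choice of ranking function affects ties, here is a rough sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or later (the first release with window functions) and that it behaves like other engines for this query; the sample rows, including the deliberate tie on (1, 3), are made up.

```python
import sqlite3

# Hypothetical data: id 1 has two rows tied at rev 3.
ROWS = [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d"), (1, 3, "e")]

# Only these fixed function names are ever substituted into the SQL text.
ALLOWED_FUNCS = {"ROW_NUMBER", "RANK"}

QUERY_TEMPLATE = """
    SELECT a.id, a.rev, a.contents
    FROM (SELECT id, rev, contents,
                 {func}() OVER (PARTITION BY id ORDER BY rev DESC) AS rn
          FROM YourTable) a
    WHERE a.rn = 1
    ORDER BY a.id, a.contents
"""


def top_rows(func):
    """Return the rows ranked first within each id by the given window function."""
    if func not in ALLOWED_FUNCS:
        raise ValueError(f"unsupported ranking function: {func}")
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
        conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", ROWS)
        return conn.execute(QUERY_TEMPLATE.format(func=func)).fetchall()
    finally:
        conn.close()


row_number_rows = top_rows("ROW_NUMBER")  # exactly one row per id; tie winner is arbitrary
rank_rows = top_rows("RANK")              # every row tied for the max rev is kept
print(row_number_rows)
print(rank_rows)
```

With ROW_NUMBER, which of the tied rows wins is up to the engine unless a tie-breaking column is added to the ORDER BY inside OVER; with RANK both tied rows come back.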
edited Aug 14 '16 at 23:16
answered Aug 9 '16 at 15:29
topchef
I think it is less intuitive and potentially less clear - but it can definitely work/be a solution.
– mmcrae
Jan 10 '17 at 16:52
Intuition is a tricky thing. I find it more intuitive than other answers as it builds an explicit data structure that answers the question. But, again, intuition is the other side of bias...
– topchef
Jan 10 '17 at 18:22
This might work in MariaDB 10.2 and MySQL 8.0.2, but not before.
– Rick James
Apr 1 '17 at 22:01
At last, I was beginning to wonder why this wasn't here. This is far more "intuitive" than the vast majority of the "old hat" answers on this page, and way more efficient in almost all cases as it requires just a single pass of the data. Most databases now support these standard window functions (MySQL is late but will from v8 onward).
– Used_By_Already
Dec 11 '17 at 0:42
I had no idea this feature existed. Dug deeply into a bunch of manuals this evening. This makes so much more sense than left joins (just from a lack of frustration perspective).
– Andrew Philips
Oct 19 '18 at 4:42
I can't vouch for the performance, but here's a trick inspired by the limitations of Microsoft Excel. It has some good features.
GOOD STUFF
- It should force return of only one "max record" even if there is a tie (sometimes useful)
- It doesn't require a join
APPROACH
It is a little bit ugly and requires that you know something about the range of valid values of the rev column. Let us assume that we know the rev column is a number between 0.00 and 999 including decimals but that there will only ever be two digits to the right of the decimal point (e.g. 34.17 would be a valid value).
The gist of the thing is that you create a single synthetic column by string concatenating/packing the primary comparison field along with the data you want. In this way, you can force SQL's MAX() aggregate function to return all of the data (because it has been packed into a single column). Then you have to unpack the data.
Here's how it looks with the above example, written in SQL
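SELECT id,
CAST(SUBSTRING(max(packed_col) FROM 2 FOR 6) AS float) AS max_rev,
-- the packed prefix (8-character number plus '---') is 11 characters long, so content starts at position 12
SUBSTRING(max(packed_col) FROM 12) AS content_for_max_rev
FROM (SELECT id,
-- || is standard SQL concatenation; in MySQL use CONCAT() unless PIPES_AS_CONCAT is enabled
CAST(1000 + rev + .001 AS CHAR) || '---' || CAST(content AS CHAR) AS packed_col
FROM yourtable
) AS packed -- MySQL, among others, requires an alias on a derived table
GROUP BY id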
The packing begins by forcing the rev column to be a number of known character length regardless of the value of rev, so that, for example:
- 3.2 becomes 1003.201
- 57 becomes 1057.001
- 923.88 becomes 1923.881
If you do it right, string comparison of two numbers should yield the same "max" as numeric comparison of the two numbers and it's easy to convert back to the original number using the substring function (which is available in one form or another pretty much everywhere).
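Here is a small runnable sketch of the same packing idea, using Python's built-in sqlite3 module. It is a variation rather than the exact query above: it assumes SQLite, uses SQLite's printf() to zero-pad rev to a fixed 7 characters instead of the 1000 + rev + .001 trick, and uses made-up sample rows (including a tie at rev 10 for id 2).

```python
import sqlite3

# Hypothetical sample data: (id, rev, content). id 2 has two rows tied at rev 10.
ROWS = [
    (1, 3.25, "a"), (1, 57, "b"), (1, 923.5, "c"),
    (2, 9, "x"), (2, 10, "y"), (2, 10, "z"),
]

# printf('%07.2f', rev) yields a fixed 7-character number such as '0009.00',
# so string comparison agrees with numeric comparison ('0009.00' < '0010.00').
# The packed value is 7 characters + '---' (3 characters) + content,
# which means the content starts at position 11.
PACKED_MAX = """
    SELECT id,
           CAST(substr(MAX(packed_col), 1, 7) AS REAL) AS max_rev,
           substr(MAX(packed_col), 11) AS content_for_max_rev
    FROM (SELECT id,
                 printf('%07.2f', rev) || '---' || content AS packed_col
          FROM yourtable)
    GROUP BY id
    ORDER BY id
"""


def packed_max(rows):
    """Return one (id, max_rev, content) row per id using the packed-column trick."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE yourtable (id INTEGER, rev REAL, content TEXT)")
        conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)
        return conn.execute(PACKED_MAX).fetchall()
    finally:
        conn.close()


result = packed_max(ROWS)
print(result)  # [(1, 923.5, 'c'), (2, 10.0, 'z')]
```

The tie for id 2 collapses to a single row because MAX() compares the whole packed string, so the content that sorts last ('z') wins. Without the fixed-width padding, a plain string comparison would rank '9' above '10'.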
answered Jun 30 '13 at 6:02
David Foster
Great solution, it performs much faster than join and other proposed solutions.
– danial
Sep 29 '14 at 22:10
I think this is the easiest solution:
SELECT *
FROM
(SELECT *
FROM Employee
ORDER BY Salary DESC)
AS employeesub
GROUP BY employeesub.Salary;
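SELECT * : Return all fields.
FROM Employee : Table searched on.
(SELECT *...) subquery : Return all people, sorted by Salary.
GROUP BY employeesub.Salary : Force the top-sorted row of each Salary group to be the returned result. This relies on MySQL returning the first row of each group for the non-grouped columns, which is not guaranteed and fails when ONLY_FULL_GROUP_BY is enabled.
If you happen to need just the one row, it's even easier: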
SELECT *
FROM Employee
ORDER BY Employee.Salary DESC
LIMIT 1
I also think it's the easiest to break down, understand, and modify to other purposes:
ORDER BY Employee.Salary DESC : Order the results by the salary, with highest salaries first.
LIMIT 1 : Return just one result.
Understanding this approach, solving any of these similar problems becomes trivial: get the employee with the lowest salary (change DESC to ASC), get the top-ten earning employees (change LIMIT 1 to LIMIT 10), sort by another field (change ORDER BY Employee.Salary to ORDER BY Employee.Commission), etc.
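The ORDER BY / LIMIT variations listed above can be tried out with a short sketch like this one, using Python's built-in sqlite3 module. It assumes SQLite handles ORDER BY and LIMIT the same way MySQL does here; the Name column and the three sample employees are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: (Name, Salary, Commission).
EMPLOYEES = [("Ann", 5000, 100), ("Bob", 7000, 50), ("Cid", 6000, 300)]


def query(sql):
    """Run one SELECT against a fresh in-memory Employee table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Employee (Name TEXT, Salary INTEGER, Commission INTEGER)"
        )
        conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", EMPLOYEES)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Highest salary: sort descending, keep one row.
highest = query("SELECT * FROM Employee ORDER BY Employee.Salary DESC LIMIT 1")
# Lowest salary: change DESC to ASC.
lowest = query("SELECT * FROM Employee ORDER BY Employee.Salary ASC LIMIT 1")
# Top two earners: raise the LIMIT.
top_two = query("SELECT * FROM Employee ORDER BY Employee.Salary DESC LIMIT 2")
# Sort by another field.
by_commission = query("SELECT * FROM Employee ORDER BY Employee.Commission DESC LIMIT 1")

print(highest)        # [('Bob', 7000, 50)]
print(lowest)         # [('Ann', 5000, 100)]
print(top_two)        # [('Bob', 7000, 50), ('Cid', 6000, 300)]
print(by_commission)  # [('Cid', 6000, 300)]
```

Keep in mind that ORDER BY ... LIMIT 1 returns the single top row of the whole table, not one row per group, so on its own it does not cover the "one row per id" case from the question.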
edited Mar 8 at 17:55
Bikramjeet Singh
answered Sep 14 '16 at 0:28
HoldOffHunger
This does not answer the question. The question is asking how to get the data for one row (as was asked, "one row per ID") in a group query where value x is the max within each group of rows. For example a customer order table with multiple orders per customer where you want to retrieve the largest order for each customer. Your query might very well return more than one row per customer (if, for example, the two largest orders were placed by the same customer).
– Aaron J Spetner
Oct 2 '17 at 6:39
"one row per ID" <-- keep reading, please, and you'll see "and only the greatest". That is logically equivalent to just the greatest.
– HoldOffHunger
Oct 2 '17 at 12:17
Yes, but it says "and". Which means the requirements are BOTH one row per ID AND only the greatest. Using this answer will not satisfy the first requirement. Additionally, the question implies the need to retrieve a single record for ALL of the IDs. This answer requires knowledge of the number of IDs beforehand (in order to configure the LIMIT), which will require additional code. The question's goal is stated specifically as seeking a SQL-only solution. Finally, even if you know the number of unique IDs, if there are multiple occurrences of the MAX value, the LIMIT clause will be wrong.
– Aaron J Spetner
Oct 3 '17 at 7:12
I did not have the exact same situation as in the original post, but this is the easiest to understand, most straightforward, working solution I have come across so far for my problem. I am amazed how all the geeks and freaks try to overtake each other by bragging with complex / weird queries.
– sba
Oct 5 '17 at 14:58
This is a hacky solution, totally busted in later MySQL versions; it won't work on servers with ONLY_FULL_GROUP_BY enabled within the server config... sqlfiddle.com/#!9/215cd/4
– Raymond Nijland
Jun 18 '18 at 15:55
Yes, but it says "and". Which means the requirements are BOTH one row per ID AND only the greatest. Using this answer will not satisfy the first requirement. Additionally, the question implies the need to retrieve a single record for ALL of the IDs. This answer requires knowledge of the number of IDs beforehand (in order to configure the LIMIT), which will require additional code. The question's goal is stated specifically as seeking a SQL-only solution. Finally, even if you know the number of unique IDs, if there are multiple occurrences of the MAX value, the LIMIT clause will be wrong.
– Aaron J Spetner
Oct 3 '17 at 7:12
Yes, but it says "and". Which means the requirements are BOTH one row per ID AND only the greatest. Using this answer will not satisfy the first requirement. Additionally, the question implies the need to retrieve a single record for ALL of the IDs. This answer requires knowledge of the number of IDs beforehand (in order to configure the LIMIT), which will require additional code. The question's goal is stated specifically as seeking a SQL-only solution. Finally, even if you know the number of unique IDs, if there are multiple occurrences of the MAX value, the LIMIT clause will be wrong.
– Aaron J Spetner
Oct 3 '17 at 7:12
1
1
I did not have the exact same situation like in the original post but this is the most easy to understand and straightforward and working solution i came across so far for my problem. I am amazed how all the geeks and freaks try to overtake each other by bragging with complex / weird queries.
– sba
Oct 5 '17 at 14:58
I did not have the exact same situation like in the original post but this is the most easy to understand and straightforward and working solution i came across so far for my problem. I am amazed how all the geeks and freaks try to overtake each other by bragging with complex / weird queries.
– sba
Oct 5 '17 at 14:58
1
1
This is a hacky solution, totally busted in the later MySQL versions won't work on servers with
ONLY_FULL_GROUP_BY
enabled within the server config... sqlfiddle.com/#!9/215cd/4– Raymond Nijland
Jun 18 '18 at 15:55
This is a hacky solution, totally busted in the later MySQL versions won't work on servers with
ONLY_FULL_GROUP_BY
enabled within the server config... sqlfiddle.com/#!9/215cd/4– Raymond Nijland
Jun 18 '18 at 15:55
|
show 3 more comments
Something like this?
SELECT yourtable.id, rev, content
FROM yourtable
INNER JOIN (
    SELECT id, MAX(rev) AS maxrev
    FROM yourtable
    GROUP BY id
) AS child ON (yourtable.id = child.id) AND (yourtable.rev = child.maxrev)
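The join against a per-id MAX(rev) subquery can be tried out on the question's sample data. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name docs and the content values are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],  # invented content
)

# Join each row to the greatest rev of its id; only the matching rows survive.
rows = conn.execute("""
    SELECT d.id, d.rev, d.content
    FROM docs AS d
    INNER JOIN (
        SELECT id, MAX(rev) AS maxrev
        FROM docs
        GROUP BY id
    ) AS child ON d.id = child.id AND d.rev = child.maxrev
    ORDER BY d.id
""").fetchall()
print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```

If several rows of the same id share the greatest rev, this query returns all of them, so a unique (id, rev) pair is assumed here.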
The join-less ones wouldn't cut it?
– Majid Fouladpour
Oct 12 '11 at 19:51
1
If they work, then they're fine too.
– Marc B
Oct 12 '11 at 19:54
10
What does WHERE yourtable do?
– Brian McCutchon
Jun 3 '16 at 5:19
This seems to be the fastest one (with proper indexes).
– Salman A
Feb 13 at 12:27
edited Oct 12 '11 at 19:54
answered Oct 12 '11 at 19:48
Marc B
Since this is the most popular question on this problem, I'll repost another answer here as well:
It looks like there is a simpler way to do this (but only in MySQL):
select *
from (select * from mytable order by id, rev desc) x
group by id
Please credit user Bohemian's answer in this question for providing such a concise and elegant solution to this problem.
EDIT: although this solution works for many people, it may not be stable in the long run: MySQL doesn't guarantee which row's values a GROUP BY query returns for columns that are not in the GROUP BY list, and servers with ONLY_FULL_GROUP_BY enabled reject such a query outright. Use this solution at your own risk.
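For a result that does not depend on which row GROUP BY happens to pick, one option is to rank the rows explicitly. The sketch below swaps in a different technique, ROW_NUMBER() window ranking, and assumes window-function support (MySQL 8.0 or later; here run through Python's sqlite3 with SQLite 3.25 or later as a stand-in). The table name docs and the content values are invented.

```python
import sqlite3

# SQLite stands in for MySQL 8.0+ here; both support ROW_NUMBER() OVER (...).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],  # invented content
)

# Number the rows of each id from the greatest rev downwards, then keep rank 1.
rows = conn.execute("""
    SELECT id, rev, content
    FROM (
        SELECT id, rev, content,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) AS rn
        FROM docs
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
""").fetchall()
print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```

Unlike the GROUP BY shortcut, every selected column here comes from the same, explicitly chosen row.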
7
Except that it's wrong, as there is no guarantee that the order of the inner query means anything, nor is the GROUP BY always guaranteed to take the first encountered row. At least in MySQL and I would assume all others. In fact I was under the assumption that MySQL would simply ignore the whole ORDER BY. Any future version or a change in configuration might break this query.
– Jannes
Oct 10 '14 at 10:14
@Jannes this is an interesting remark :) I welcome you to answer my question providing proofs: stackoverflow.com/questions/26301877/…
– Yura
Oct 10 '14 at 14:41
1
@Jannes concerning GROUP BY not guaranteed to take the first encountered row - you are totally right - found this issue bugs.mysql.com/bug.php?id=71942 which asks to provide such guarantees. Will update my answer now
– Yura
Oct 10 '14 at 14:59
I think I remember where I got the ORDER BY being discarded from: MySQL does that with UNIONs. If you ORDER BY the inner queries, it's just ignored: dev.mysql.com/doc/refman/5.0/en/union.html says "If ORDER BY appears without LIMIT in a SELECT, it is optimized away because it will have no effect anyway." I haven't seen such a statement for the query in question here, but I don't see why it couldn't do that.
– Jannes
Oct 11 '14 at 19:09
edited May 23 '17 at 12:34
Community♦
answered Jul 3 '14 at 14:33
Yura
I like to use a NOT EXISTS-based solution for this problem:
SELECT id, rev
FROM YourTable t
WHERE NOT EXISTS (
    SELECT * FROM YourTable t2 WHERE t2.id = t.id AND t2.rev > t.rev
)
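The NOT EXISTS form keeps a row only when no other row of the same id has a greater rev, which requires distinct aliases for the outer and inner table references. A minimal sketch, run through Python's sqlite3 as a stand-in for MySQL; the table name docs, the aliases t and t2, and the content values are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],  # invented content
)

# Outer alias t, inner alias t2: a row survives if nothing in its id beats its rev.
rows = conn.execute("""
    SELECT t.id, t.rev
    FROM docs AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM docs AS t2
        WHERE t2.id = t.id AND t2.rev > t.rev
    )
    ORDER BY t.id
""").fetchall()
print(rows)  # [(1, 3), (2, 1)]
```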
1
yes, not exists like this has generally been the preferred way rather than a left join. In older versions of SQL server it was faster, although i think now it makes no difference. I normally do SELECT 1 instead of SELECT *, again because in prior versions it was faster.
– EGP
Oct 8 '14 at 12:38
edited Jul 17 '17 at 1:46
HoldOffHunger
answered Sep 5 '14 at 21:58
Bulat
A third solution I hardly ever see mentioned is MySQL specific and looks like this:
SELECT id, MAX(rev) AS rev
, 0+SUBSTRING_INDEX(GROUP_CONCAT(numeric_content ORDER BY rev DESC), ',', 1) AS numeric_content
FROM t1
GROUP BY id
Yes, it looks awful (converting to a string and back, etc.), but in my experience it's usually faster than the other solutions. Maybe that's just for my use cases, but I have used it on tables with millions of records and many unique ids. Maybe it's because MySQL is pretty bad at optimizing the other solutions (at least in the 5.0 days when I came up with this solution).
One important thing is that GROUP_CONCAT has a maximum length for the string it can build up (1024 bytes by default). You probably want to raise this limit by setting the group_concat_max_len variable. And keep in mind that this will be a limit on scaling if you have a large number of rows.
Anyway, the above doesn't directly work if your content field is already text, because the text may contain commas. In that case you probably want to use a different separator. You'll also run into the group_concat_max_len limit quicker.
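The separator and length caveats can be seen without a database. The following is a pure-Python emulation of the two string steps (joining like GROUP_CONCAT, then keeping the text before the first separator like SUBSTRING_INDEX(..., sep, 1)); it is a sketch, not MySQL itself. The helper name first_after_concat and the sample strings are hypothetical, and the character-based truncation only roughly mimics the byte-based group_concat_max_len.

```python
def first_after_concat(values, sep=",", max_len=None):
    """Join values with sep, optionally truncate, and return the text before the first sep."""
    joined = sep.join(values)          # what GROUP_CONCAT(... ORDER BY rev DESC) builds
    if max_len is not None:
        joined = joined[:max_len]      # rough stand-in for group_concat_max_len
    return joined.split(sep, 1)[0]     # what SUBSTRING_INDEX(joined, sep, 1) keeps


# Content values of one id, already ordered by rev DESC (invented sample text).
newest_first = ["hello, world", "older text"]

comma_result = first_after_concat(newest_first)                        # comma inside the text cuts it short
safe_result = first_after_concat(newest_first, sep="\x1f")             # unlikely separator keeps it whole
cut_result = first_after_concat(newest_first, sep="\x1f", max_len=5)   # length limit truncates the value

print(comma_result, "|", safe_result, "|", cut_result)
```

With a comma separator the newest content is silently shortened to the part before its own comma, which is why a separator that cannot occur in the data matters for text columns.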
answered Oct 10 '14 at 11:57
Jannes
If you have many fields in the select statement and you want the latest value for all of those fields in one compact query:
select * from
    (select * from table_name
     order by id, rev desc) temp
group by id
This works OK for small tables, but takes 6 passes over the entire dataset, so not fast for large tables.
– Rick James
May 17 '17 at 0:48
This is the query I needed because there were other columns involved, too.
– Mike Viens
Jun 1 '18 at 19:07
answered Sep 4 '15 at 5:33
seahawk
Not MySQL, but for other people finding this question and using SQL, another way to solve the greatest-n-per-group problem is CROSS APPLY in MS SQL:
WITH DocIds AS (SELECT DISTINCT id FROM docs)
SELECT d2.id, d2.rev, d2.content
FROM DocIds d1
CROSS APPLY (
    SELECT TOP 1 * FROM docs d
    WHERE d.id = d1.id
    ORDER BY rev DESC
) d2
Here's an example in SqlFiddle
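CROSS APPLY is SQL Server syntax, so it cannot be run here directly. The same idea of looking up the single top row per id can be approximated with a correlated scalar subquery using ORDER BY ... LIMIT 1. This is a swapped-in technique rather than CROSS APPLY itself, sketched with Python's sqlite3; the content values are invented.

```python
import sqlite3

# SQLite has no CROSS APPLY; a correlated scalar subquery plays the per-id lookup role.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],  # invented content
)

# For each row, look up the top rev of its id and keep the row if it matches.
rows = conn.execute("""
    SELECT d.id, d.rev, d.content
    FROM docs AS d
    WHERE d.rev = (
        SELECT d2.rev FROM docs AS d2
        WHERE d2.id = d.id
        ORDER BY d2.rev DESC
        LIMIT 1
    )
    ORDER BY d.id
""").fetchall()
print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```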
very slow comparing to other methods - group by, windows, not exists
– nahab
Feb 15 '18 at 13:40
edited Aug 17 '18 at 14:55
answered May 30 '14 at 13:47
KyleMit
I think you want this?
select *
from docs
where (id, rev) IN (select id, max(rev) as rev from docs group by id)
SQL Fiddle: check here
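The row-value IN form can be checked against the question's sample data. A minimal sketch using Python's sqlite3 as a stand-in for MySQL; it assumes SQLite 3.15 or later (which added row values), and the content values are invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],  # invented content
)

# Keep rows whose (id, rev) pair appears among the per-id maxima.
rows = conn.execute("""
    SELECT id, rev, content
    FROM docs
    WHERE (id, rev) IN (SELECT id, MAX(rev) FROM docs GROUP BY id)
    ORDER BY id
""").fetchall()
print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```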
answered Dec 29 '18 at 11:00
Abhishek Rana
I would use this:
select t.*
from test as t
join
    (select id, max(rev) as rev
     from test
     group by id) as o
on o.id = t.id and o.rev = t.rev
A SELECT subquery may not be very efficient, but in a JOIN clause it seems to be usable. I'm not an expert in optimizing queries, but I've tried this on MySQL, PostgreSQL and Firebird, and it works very well.
You can use this pattern in multiple joins and with a WHERE clause. Here is my working example (solving a problem identical to yours, with the table "firmy"):
select *
from platnosci as p
join firmy as f
on p.id_rel_firmy = f.id_rel
join (select max(id_obj) as id_obj
      from firmy
      group by id_rel) as o
on o.id_obj = f.id_obj and p.od > '2014-03-01'
It runs on tables with tens of thousands of records and takes less than 0.01 second on a not very powerful machine.
I wouldn't use the IN clause (as mentioned somewhere above). IN is meant for short lists of constants, not as a query filter built on a subquery. The subquery in IN may be executed for every scanned record, which can make the query take a very long time.
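When joining against the per-id maxima, the join condition needs id as well as rev; matching on rev alone also picks up rows of other ids that happen to share that rev value. A minimal sketch comparing the two conditions, using Python's sqlite3 as a stand-in; the table name docs and the content values are invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],  # invented content
)

max_revs = "SELECT id, MAX(rev) AS rev FROM docs GROUP BY id"

# Join on rev only: row (1, 1) matches the maximum of id 2 and slips through.
rev_only = conn.execute(f"""
    SELECT t.id, t.rev FROM docs AS t
    JOIN ({max_revs}) AS o ON o.rev = t.rev
    ORDER BY t.id, t.rev
""").fetchall()

# Join on id and rev: exactly one row per id.
id_and_rev = conn.execute(f"""
    SELECT t.id, t.rev FROM docs AS t
    JOIN ({max_revs}) AS o ON o.id = t.id AND o.rev = t.rev
    ORDER BY t.id, t.rev
""").fetchall()

print(rev_only)    # [(1, 1), (1, 3), (2, 1)]
print(id_and_rev)  # [(1, 3), (2, 1)]
```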
I think using that subquery as a CTE might at least improve performance
– mmcrae
Jan 10 '17 at 18:52
Hi! For me it looks like your 1st query needs ...and o.id = t.id in the end (and the subquery should return id for that). Doesn't it?
– Dmitry Grekov
Aug 10 '18 at 11:37
answered Mar 4 '15 at 18:12
Marek Wysmułek
311
How about this:
SELECT all_fields.*
FROM (SELECT id, MAX(rev) AS max_rev FROM yourtable GROUP BY id) AS max_recs
LEFT OUTER JOIN yourtable AS all_fields
ON max_recs.id = all_fields.id AND max_recs.max_rev = all_fields.rev
edited Dec 29 '18 at 15:51
Munim Munna
10.3k41543
answered Jul 14 '13 at 16:09
inor
1,27711630
SELECT *
FROM Employee
WHERE (Employee.Employe_id, Employee.Salary) IN (SELECT Employe_id, MAX(Salary) FROM Employee GROUP BY Employe_id)
ORDER BY Employee.Salary
edited Feb 22 at 18:30
Cody Gray♦
195k35382470
answered Jul 30 '17 at 18:12
guru008
412
Another way to do the job is to use the MAX() analytic function with an OVER (PARTITION BY ...) clause:
SELECT t.*
FROM
(
SELECT id
,rev
,content
,MAX(rev) OVER (PARTITION BY id) as max_rev
FROM YourTable
) t
WHERE t.rev = t.max_rev
The other solution, ROW_NUMBER() with OVER (PARTITION BY ...), already documented in this post, is:
SELECT t.*
FROM
(
SELECT id
,rev
,content
,ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) rank
FROM YourTable
) t
WHERE t.rank = 1
Both SELECT statements work well on Oracle 10g.
The MAX() solution should run faster than the ROW_NUMBER() solution, because computing MAX() is O(n) while ROW_NUMBER() needs a sort, which is at least O(n log n), where n is the number of records in the table. The actual difference depends on the optimizer and the available indexes.
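A minimal sketch of the two window-function queries, run through Python's sqlite3 module. This is an assumption on my side: the answer was tested on Oracle 10g, and SQLite needs version 3.25 or later for window functions. The alias rn replaces rank, and the extra row (1, 3, 'e') is hypothetical, added only to show how the two queries behave on ties: MAX() keeps every tied row, while ROW_NUMBER() keeps exactly one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    # The last row is a hypothetical duplicate of (id 1, rev 3) to create a tie.
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d"), (1, 3, "e")],
)

# MAX() as a window function: every row whose rev equals its group's maximum.
max_rows = conn.execute("""
    SELECT id, rev, content
    FROM (
        SELECT id, rev, content,
               MAX(rev) OVER (PARTITION BY id) AS max_rev
        FROM YourTable
    ) AS t
    WHERE rev = max_rev
    ORDER BY id, content
""").fetchall()

# ROW_NUMBER(): exactly one row per id, even when the top rev is tied.
# Only id and rev are selected, because which tied content wins is arbitrary.
row_number_rows = conn.execute("""
    SELECT id, rev
    FROM (
        SELECT id, rev, content,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) AS rn
        FROM YourTable
    ) AS t
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(max_rows)
print(row_number_rows)
```

If (id, rev) is unique, as it probably is in the question, both queries return the same rows.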
edited Feb 24 at 9:55
answered Feb 20 '18 at 9:07
schlebe
1,0071128
This solution reads YourTable only once, so it should be faster. According to tests on sqlfiddle.com, it works only on MySQL and SQLite (for SQLite, remove DESC). Maybe it can be tweaked to work on other database systems that I am not familiar with.
SELECT *
FROM ( SELECT *
FROM ( SELECT 1 as id, 1 as rev, 'content1' as content
UNION
SELECT 2, 1, 'content2'
UNION
SELECT 1, 2, 'content3'
UNION
SELECT 1, 3, 'content4'
) as YourTable
ORDER BY id, rev DESC
) as YourTable
GROUP BY id
This doesn't appear to work for the general case. And, it doesn't work at all in PostgreSQL, returning:
ERROR: column "your table.reb" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT *
– ma11hew28
Mar 13 '14 at 16:26
Sorry, I didn't make clear at first which database it worked on.
– plavozont
Mar 17 '14 at 5:11
edited Mar 17 '14 at 8:28
answered Jan 29 '14 at 7:49
plavozont
446410
Here is a nice way of doing that. Use the following code:
with temp as (
select count(field1) as summ , field1
from table_name
group by field1 )
select * from temp where summ = (select max(summ) from temp)
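For what it's worth, here is a small sketch of what this CTE computes, using Python's sqlite3 module with made-up rows. The data and the CTE name counts are assumptions; the name is changed from temp only to stay clear of SQLite's TEMP keyword. It returns the field1 value(s) that occur most often, rather than one row per group:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (field1 TEXT)")
# Made-up sample values: 'x' and 'y' occur twice, 'z' once.
conn.executemany(
    "INSERT INTO table_name VALUES (?)",
    [("x",), ("y",), ("x",), ("z",), ("y",)],
)

most_frequent = conn.execute("""
    WITH counts AS (
        SELECT COUNT(field1) AS summ, field1
        FROM table_name
        GROUP BY field1
    )
    SELECT field1, summ
    FROM counts
    WHERE summ = (SELECT MAX(summ) FROM counts)
    ORDER BY field1
""").fetchall()

print(most_frequent)  # the value(s) with the highest count, ties included
```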
edited Jan 7 '15 at 12:11
Ashish Kakkad
18.6k871111
answered Jan 7 '15 at 11:36
shay
211
I like to do this by ranking the records by some column. In this case, rank the rev values grouped by id. Those with a higher rev will have lower rankings, so the highest rev will have a ranking of 1.
select id, rev, content
from
(select
@rowNum := if(@prevValue = id, @rowNum+1, 1) as row_num,
id, rev, content,
@prevValue := id
from
(select id, rev, content from YOURTABLE order by id asc, rev desc) TEMP,
(select @rowNum := 1 from DUAL) X,
(select @prevValue := -1 from DUAL) Y) TEMP
where row_num = 1;
Not sure if introducing variables makes the whole thing slower. But at least I'm not querying YOURTABLE twice.
I only tried this approach in MySQL. Oracle has a similar function for ranking records, so the idea should work there too.
– user5124980
Jul 16 '15 at 18:54
1
Reading & writing a variable in a select statement is undefined in MySQL although particular versions happen to give the answer you might expect for certain syntax involving case expressions.
– philipxy
Sep 22 '18 at 10:57
answered Jul 16 '15 at 18:52
user5124980
211
I sorted by the rev field in descending order and then grouped by id, which gave the first row of each group, that is, the one with the highest rev value.
SELECT * FROM (SELECT * FROM table1 ORDER BY id, rev DESC) X GROUP BY X.id;
Tested in http://sqlfiddle.com/ with the following data
CREATE TABLE table1
(`id` int, `rev` int, `content` varchar(11));
INSERT INTO table1
(`id`, `rev`, `content`)
VALUES
(1, 1, 'One-One'),
(1, 2, 'One-Two'),
(2, 1, 'Two-One'),
(2, 2, 'Two-Two'),
(3, 2, 'Three-Two'),
(3, 1, 'Three-One'),
(3, 3, 'Three-Three')
;
This gave the following result in MySql 5.5 and 5.6
id rev content
1 2 One-Two
2 2 Two-Two
3 3 Three-Three
This technique used to work, but no longer. See mariadb.com/kb/en/mariadb/…
– Rick James
Apr 1 '17 at 22:02
1
The original question tag is "mysql" and I have stated very clearly that my solution was tested with both MySQL 5.5 and 5.6 in sqlfiddle.com. I have provided all steps to independently verify the solution. I have not made any false claims that my solution works with MariaDB. MariaDB is not MySQL; it is just a drop-in replacement for MySQL, and the two are owned by different companies. Your comment will help anyone who is trying to implement it in MariaDB, but my post in no way deserves a negative vote, as it clearly answers the question that was asked.
– blokeish
Apr 3 '17 at 0:34
1
Yes, it works in older versions. And I have used that technique in the past, only to be burned when it stopped working. Also MySQL (in 5.7?) will also be ignoring the ORDER BY in a subquery. Since lots of people will read your answer, I am trying to steer them away from a technique that will break in their future. (And I did not give you the -1 vote.)
– Rick James
Apr 3 '17 at 2:38
1
Tests prove nothing. ORDER BY in a subquery has no guaranteed effect other than for a LIMIT in the same subquery. Even if order was preserved, the GROUP BY would not preserve it. Even if it were preserved, non-standard GROUP BY relying on disabled ONLY_FULL_GROUP_BY is specified to return some row in a group for a non-grouped column but not necessarily the first. So your query is not correct.
– philipxy
Sep 22 '18 at 11:50
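Given the objections in the comments, one deterministic alternative can be sketched with the same sample data. This is not the author's query but a swapped-in correlated subquery, run here through Python's sqlite3 module as a stand-in for MySQL (an assumption). It does not depend on row order inside a subquery or on relaxed GROUP BY rules:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, rev INTEGER, content TEXT)")
# Same rows as the sample data in the answer.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, 1, "One-One"), (1, 2, "One-Two"),
        (2, 1, "Two-One"), (2, 2, "Two-Two"),
        (3, 2, "Three-Two"), (3, 1, "Three-One"), (3, 3, "Three-Three"),
    ],
)

# Keep a row only if its rev equals the maximum rev of its own id.
latest = conn.execute("""
    SELECT t.id, t.rev, t.content
    FROM table1 AS t
    WHERE t.rev = (SELECT MAX(x.rev) FROM table1 AS x WHERE x.id = t.id)
    ORDER BY t.id
""").fetchall()

for row in latest:
    print(row)
```

If several rows share the maximum rev for an id, this returns all of them.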
answered Dec 11 '15 at 3:14
blokeish
50548
Here is another solution; I hope it helps someone.
Select a.id , a.rev, a.content from Table1 a
inner join
(SELECT id, max(rev) rev FROM Table1 GROUP BY id) x on x.id =a.id and x.rev =a.rev
answered Jun 20 '17 at 10:10
Abdul Samad
3416
None of these answers have worked for me.
This is what worked for me.
with score as (select max(score_up) from history)
select history.* from score, history where history.score_up = score.max
answered Jul 13 '17 at 18:19
qaisjp
359320
Here's another solution for retrieving only the records that have the maximum value in a given field. This works on SQL400, which is the platform I work on. In this example, the records with the maximum value in field FIELD5 are retrieved by the following SQL statement.
SELECT A.KEYFIELD1, A.KEYFIELD2, A.FIELD3, A.FIELD4, A.FIELD5
FROM MYFILE A
WHERE RRN(A) IN
(SELECT RRN(B)
FROM MYFILE B
WHERE B.KEYFIELD1 = A.KEYFIELD1 AND B.KEYFIELD2 = A.KEYFIELD2
ORDER BY B.FIELD5 DESC
FETCH FIRST ROW ONLY)
edited Oct 17 '17 at 0:18
Axel
answered Oct 16 '17 at 23:48
Cesar
I used the approach below to solve a problem of my own (the #temp1 syntax is SQL Server's). I first created a temp table and inserted the max rev value for each unique id.
CREATE TABLE #temp1
(
id varchar(20)
, rev int
)
INSERT INTO #temp1
SELECT a.id, MAX(a.rev) as rev
FROM
(
SELECT id, content, SUM(rev) as rev
FROM YourTable
GROUP BY id, content
) as a
GROUP BY a.id
ORDER BY a.id
I then joined these max values (#temp1) to all of the possible id/content combinations. This naturally filters out the non-maximum id/content combinations and leaves only the max rev value for each id. Note that SUM(rev) matches the original rev only when each id/content pair occurs once in the table.
SELECT a.id, a.rev, content
FROM #temp1 as a
LEFT JOIN
(
SELECT id, content, SUM(rev) as rev
FROM YourTable
GROUP BY id, content
) as b on a.id = b.id and a.rev = b.rev
GROUP BY a.id, a.rev, b.content
ORDER BY a.id
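The two steps above can be sketched in a simplified form. This is not the answer's exact code: it assumes SQLite instead of SQL Server (so `CREATE TEMP TABLE` replaces `#temp1`), takes MAX(rev) directly rather than going through SUM(rev), and uses hypothetical names `docs` and `max_rev`.

```python
import sqlite3

# Hypothetical table modelled on the question's data (id, rev, content).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# Step 1: store the highest rev for each id in a temporary table.
conn.execute("CREATE TEMP TABLE max_rev (id INTEGER, rev INTEGER)")
conn.execute("INSERT INTO max_rev SELECT id, MAX(rev) FROM docs GROUP BY id")

# Step 2: join back to the base table to pick up the remaining columns
# of exactly those (id, rev) pairs.
rows = conn.execute(
    """
    SELECT m.id, m.rev, d.content
    FROM max_rev AS m
    JOIN docs AS d ON d.id = m.id AND d.rev = m.rev
    ORDER BY m.id
    """
).fetchall()

print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```

The temporary table is mostly useful when the per-id maxima are reused by several later queries; for a single query, a derived table in the FROM clause does the same job.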
answered Jan 5 '18 at 10:51
Richard Ball
You can do the select without a join by combining rev and id into a single maxRevId value for MAX() and then splitting it back into the original values:
SELECT maxRevId & ((1 << 32) - 1) as id, maxRevId >> 32 AS rev
FROM (SELECT MAX(((rev << 32) | id)) AS maxRevId
FROM YourTable
GROUP BY id) x;
This is especially fast when there is a complex join instead of a single table. With the traditional approaches the complex join would be done twice.
The above combination is simple with bit functions when rev and id are INT UNSIGNED (32 bit) and the combined value fits into BIGINT UNSIGNED (64 bit). When id and rev are larger than 32-bit values or are made of multiple columns, you need to combine the values into e.g. a binary value with suitable padding for MAX().
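The packing trick can be checked on a small data set. The sketch below assumes SQLite, whose integers are signed 64-bit, so it only holds while rev stays below 2^31; the table `docs` and the alias `packed` are hypothetical names.

```python
import sqlite3

# Hypothetical table modelled on the question's data (id, rev, content).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# Pack rev into the high 32 bits and id into the low 32 bits, take MAX()
# per id, then unpack: the low bits give id back, the high bits give rev.
rows = conn.execute(
    """
    SELECT packed & ((1 << 32) - 1) AS id, packed >> 32 AS rev
    FROM (SELECT MAX((rev << 32) | id) AS packed FROM docs GROUP BY id) AS x
    ORDER BY id
    """
).fetchall()

print(rows)  # [(1, 3), (2, 1)]
```

Since rev occupies the high bits, MAX() is decided by rev first. Any further column that should come back, such as content, would have to be packed into the combined value too or fetched in a second step.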
answered Sep 17 '18 at 9:08
zovio
Explanation
This is not pure SQL. This will use the SQLAlchemy ORM.
I came here looking for SQLAlchemy help, so I will reproduce Adrian Carneiro's answer in Python/SQLAlchemy, specifically the outer join part.
This query answers the question:
"Can you return the record in each group of records (grouped by id) that has the highest version number?"
This allows me to duplicate the record, update it, increment its version number, and have the copy of the old version in such a way that I can show change over time.
Code
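from sqlalchemy import and_
from sqlalchemy.orm import aliased, join

# MyTable (the mapped class) and appdb (the app's database handle) are defined elsewhere.
# Second reference to the same table, used as the right side of the self-join.
MyTableAlias = aliased(MyTable)
newest_records = appdb.session.query(MyTable).select_from(join(
    MyTable,
    MyTableAlias,
    # Pair each row with rows of the same id that have a strictly higher version.
    onclause=and_(
        MyTable.id == MyTableAlias.id,
        MyTable.version_int < MyTableAlias.version_int
    ),
    isouter=True
    )
).filter(
    # No higher version was found, so this row is the newest for its id.
    MyTableAlias.id.is_(None),
).all()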
Tested on a PostgreSQL database.
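For comparison, the outer-join pattern this ORM query is built on can be written as plain SQL. The sketch below assumes SQLite and a hypothetical table `docs` using the question's column names (`rev` in place of `version_int`); it shows the pattern, not the exact statement SQLAlchemy emits.

```python
import sqlite3

# Hypothetical table modelled on the question's data (id, rev, content).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# Left-join each row to rows with the same id and a higher rev.
# Rows for which no such partner exists (b.id IS NULL) are the newest ones.
rows = conn.execute(
    """
    SELECT a.id, a.rev, a.content
    FROM docs AS a
    LEFT OUTER JOIN docs AS b
        ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
    ORDER BY a.id
    """
).fetchall()

print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```

If two rows of the same id share the highest rev, neither has a higher partner, so both are returned.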
answered Feb 22 at 15:18
Ian A McElhenny
Do you need the corresponding content field for the row?
– Mark Byers
Oct 12 '11 at 19:45
Yes, and that would pose no problem, I have cut out many columns which I'd be adding back.
– Majid Fouladpour
Oct 12 '11 at 19:48
1
@MarkByers I have edited my answer to comply with OP needs. Since I was at it, I decided to write a more comprehensive answer on the greatest-n-per-group topic.
– Adrian Carneiro
Oct 12 '11 at 20:57
This is the common greatest-n-per-group problem, which has well-tested and optimized solutions. I prefer the left join solution by Bill Karwin (the original post). Note that a bunch of solutions to this common problem can, surprisingly, be found in one of the most official sources, the MySQL manual! See Examples of Common Queries :: The Rows Holding the Group-wise Maximum of a Certain Column.
– TMS
Apr 28 '14 at 11:50
2
duplicate of Retrieving the last record in each group
– TMS
Jul 8 '14 at 18:39