In SQL how do I select only rows with max value on a column?

Richard C.

The Problem

If you have a SQL table with different versions of content in multiple rows, how do you get the latest version of the content? Or how do you select the row with the maximum value from a group of rows? This is called the greatest-n-per-group problem.

Consider the example below, where user email addresses change over time, and you want to select the most recent address for every user:

CREATE TABLE EmailAddress ( UserId INT, Email VARCHAR(255), DateCreated DATE ); INSERT INTO EmailAddress(UserId, Email, DateCreated) VALUES (1, 'Amir@example.com', '2020-01-01'), (1, 'Amir2example.com', '2021-01-01'), (1, 'Amir3example.com', '2022-01-01'), (2, 'Sofia@example.com', '2020-01-01'), (2, 'Sofia2@example.com', '2021-01-01'), (2, 'Sofia3@example.com', '2022-01-01');

The Solution

Unfortunately, you cannot return an ungrouped column value in a grouped query. Instead, for each row you have to find the maximum value in the group, return one result set, and then match that value in the original table.

The query below will return the correct result and run in MS SQL Server, MySQL, and PostgreSQL:

SELECT * FROM EmailAddress as a WHERE DateCreated = (SELECT MAX(DateCreated) FROM EmailAddress as b WHERE a.UserId = b.UserId); -- UserId Email DateCreated -- 1 Amir3example.com 2022-01-01 -- 2 Sofia3@example.com 2022-01-01

For every UserId in the outer select, the subquery with the inner select is run once to find the maximum value. This is called a correlated subquery. The code is short and easy to understand, but the performance might be poor on large tables.

Solution With Simple Join

If your tables are large, and you want to try a faster query, you can revert to using a simple join with the table on itself.

SELECT a.* FROM EmailAddress AS a JOIN (SELECT UserId, MAX(DateCreated) DateCreated FROM EmailAddress GROUP BY UserId) AS b ON a.UserId = b.UserId AND a.DateCreated = b.DateCreated;

The query consists of two parts:

  • An inner query to get the maximum date for each user.
  • An outer query to select the email address based on the inner query.

Loved by over 4 million developers and more than 90,000 organizations worldwide, Sentry provides code-level observability to many of the world’s best-known companies like Disney, Peloton, Cloudflare, Eventbrite, Slack, Supercell, and Rockstar Games. Each month we process billions of exceptions from the most popular products on the internet.

Share on Twitter
Bookmark this page
Ask a questionJoin the discussion

Related Answers

A better experience for your users. An easier life for your developers.

    TwitterGitHubDribbbleLinkedinDiscord
© 2024 • Sentry is a registered Trademark
of Functional Software, Inc.