Identifying and displaying duplicate entries in a MySQL database

Hey everyone, I’m working on a project where I need to find and show duplicate records in my MySQL database. I know how to count the duplicates, but I’m struggling to display the full details of each duplicate entry.

Here’s what I’ve got so far:

SELECT address, COUNT(*) AS total
FROM customers
GROUP BY address
HAVING total > 1

This gives me the addresses with duplicates and how many times they appear. But what I really want is to see all the info for each duplicate record, like this:

Name       Surname    Address
-------------------------------
Sarah      Lee        42 Oak Street
Michael    Chen       42 Oak Street

Is there a way to do this in a single query? I’d rather not use two separate queries if possible. Any help would be awesome!

I’ve dealt with this exact issue before, and I found a neat trick using window functions that might help you out. Here’s what worked for me:

SELECT name, surname, address
FROM (
    SELECT *, 
           COUNT(*) OVER (PARTITION BY address) as addr_count
    FROM customers
) subq
WHERE addr_count > 1
ORDER BY address, name;

This approach uses a window function to count duplicates, which is often faster than self-joins for larger datasets. It also avoids the need for a subquery in the FROM clause, which can sometimes cause performance issues.

One thing to keep in mind: if your database version doesn’t support window functions, you might need to stick with the join method. But if it does, give this a shot - it could save you some query execution time!

You’re on the right track with your initial query! To display all the details for duplicate entries, you can use a self-join approach. Here’s a query that should work for your needs:

SELECT c1.name, c1.surname, c1.address
FROM customers c1
JOIN (
    SELECT address
    FROM customers
    GROUP BY address
    HAVING COUNT(*) > 1
) c2 ON c1.address = c2.address
ORDER BY c1.address, c1.name;

This query first identifies the duplicate addresses in a subquery, then joins it back to the main table to fetch all columns for those records. It’s efficient and gives you the full details in one go. Hope this helps solve your problem!

hey there, i’ve got a quick solution for ya. try this out:

SELECT *
FROM customers c
WHERE EXISTS (
  SELECT 1 FROM customers c2
  WHERE c2.address = c.address
  GROUP BY c2.address
  HAVING COUNT(*) > 1
)
ORDER BY address, name;

this should give u all the details for duplicate addresses in one go. lemme know if it works!