Removing duplicate entries while keeping one instance in MySQL database

Hey guys, I’m stuck on a MySQL problem. I’ve got a table with duplicate entries and I want to clean it up. How can I remove all the duplicates but keep one copy of each unique entry?

Here’s what my table looks like right now:

+----+---------+
| id | company |
+----+---------+
| 1  | apple   |
| 2  | samsung |
| 3  | sony    |
| 4  | apple   |
| 5  | apple   |
| 6  | samsung |
+----+---------+

I know I can use SELECT DISTINCT to view unique entries, but how do I actually delete the extras? Any help would be awesome!

I’ve encountered this issue in my database management work. Here’s a solution that’s worked well for me:

CREATE TABLE temp_table AS
SELECT MIN(id) AS id, company
FROM your_table
GROUP BY company;

TRUNCATE TABLE your_table;

INSERT INTO your_table
SELECT * FROM temp_table;

DROP TABLE temp_table;

This approach creates a temporary table with unique entries, clears the original table, then repopulates it with the deduplicated data. It’s efficient for larger datasets and maintains data integrity. Just remember to adjust ‘your_table’ to your actual table name.

Always backup your data before running these commands. It’s saved me from potential disasters more than once.

hey neo, i’ve dealt with this before. you can use a subquery to keep the lowest id for each unique company. try something like:

DELETE FROM your_table
WHERE id NOT IN (
SELECT MIN(id)
FROM your_table
GROUP BY company
);

this should do the trick for ya!

For situations like this, I’ve found using a self-join with DELETE to be quite effective. Here’s a query that should work:

DELETE t1 FROM your_table t1
INNER JOIN your_table t2
WHERE t1.company = t2.company AND t1.id > t2.id;

This method compares each row with others having the same company name and removes the ones with higher IDs. It’s generally faster than subqueries for larger tables and doesn’t require creating temporary tables.

Remember to run a SELECT with the same join condition first to verify which rows will be deleted. Always backup your data before running DELETE operations on your production database.