How to safely remove only duplicated rows?












1















I have a tables that contains 6.820.483 and between these rows there a lot of duplicates, I discovered that running this query:



SELECT player_id, match_id, team_id, count(*) 
FROM fixtures
GROUP BY player_id, match_id, team_id
HAVING COUNT(*) > 1


structure example:



player_id | match_id  | team_id
19014 2506172 12573
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
19016 2506172 12573
19016 2506172 12573


how can I safely remove only the duplicates? In the example above the table should looks like:



player_id | match_id  | team_id
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573


table structure:



CREATE TABLE IF NOT EXISTS `swp`.`fixtures` (
`player_id` INT NOT NULL,
`match_id` INT NOT NULL,
`team_id` INT NOT NULL,
INDEX `player_id_idx` (`player_id` ASC),
INDEX `match_id_idx` (`match_id` ASC),
INDEX `FK_team_fixtures_id_idx` (`team_id` ASC),
CONSTRAINT `FK_player_fixtures_id`
FOREIGN KEY (`player_id`)
REFERENCES `swp`.`player` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_match_fixtures_id`
FOREIGN KEY (`match_id`)
REFERENCES `swp`.`match` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_team_fixtures_id`
FOREIGN KEY (`team_id`)
REFERENCES `swp`.`team` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;









share|improve this question




















  • 1





    What do you mean by remove? Do you want to delete them from the table, or just select only unique values?

    – WillardSolutions
    Nov 20 '18 at 15:24











  • @WillardSolutions I need to delete them because are duplicated check the example for understand

    – Spartaok
    Nov 20 '18 at 15:25













  • Is there a primary key on the fixtures table?

    – WillardSolutions
    Nov 20 '18 at 15:25













  • Does your table have some PK that uniquely identifies a record?

    – Robert Kock
    Nov 20 '18 at 15:25











  • @WillardSolutions nope, the table contains FK of records available in other tables

    – Spartaok
    Nov 20 '18 at 15:26
















1















I have a tables that contains 6.820.483 and between these rows there a lot of duplicates, I discovered that running this query:



SELECT player_id, match_id, team_id, count(*) 
FROM fixtures
GROUP BY player_id, match_id, team_id
HAVING COUNT(*) > 1


structure example:



player_id | match_id  | team_id
19014 2506172 12573
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
19016 2506172 12573
19016 2506172 12573


how can I safely remove only the duplicates? In the example above the table should looks like:



player_id | match_id  | team_id
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573


table structure:



CREATE TABLE IF NOT EXISTS `swp`.`fixtures` (
`player_id` INT NOT NULL,
`match_id` INT NOT NULL,
`team_id` INT NOT NULL,
INDEX `player_id_idx` (`player_id` ASC),
INDEX `match_id_idx` (`match_id` ASC),
INDEX `FK_team_fixtures_id_idx` (`team_id` ASC),
CONSTRAINT `FK_player_fixtures_id`
FOREIGN KEY (`player_id`)
REFERENCES `swp`.`player` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_match_fixtures_id`
FOREIGN KEY (`match_id`)
REFERENCES `swp`.`match` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_team_fixtures_id`
FOREIGN KEY (`team_id`)
REFERENCES `swp`.`team` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;









share|improve this question




















  • 1





    What do you mean by remove? Do you want to delete them from the table, or just select only unique values?

    – WillardSolutions
    Nov 20 '18 at 15:24











  • @WillardSolutions I need to delete them because are duplicated check the example for understand

    – Spartaok
    Nov 20 '18 at 15:25













  • Is there a primary key on the fixtures table?

    – WillardSolutions
    Nov 20 '18 at 15:25













  • Does your table have some PK that uniquely identifies a record?

    – Robert Kock
    Nov 20 '18 at 15:25











  • @WillardSolutions nope, the table contains FK of records available in other tables

    – Spartaok
    Nov 20 '18 at 15:26














1












1








1








I have a tables that contains 6.820.483 and between these rows there a lot of duplicates, I discovered that running this query:



SELECT player_id, match_id, team_id, count(*) 
FROM fixtures
GROUP BY player_id, match_id, team_id
HAVING COUNT(*) > 1


structure example:



player_id | match_id  | team_id
19014 2506172 12573
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
19016 2506172 12573
19016 2506172 12573


how can I safely remove only the duplicates? In the example above the table should looks like:



player_id | match_id  | team_id
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573


table structure:



CREATE TABLE IF NOT EXISTS `swp`.`fixtures` (
`player_id` INT NOT NULL,
`match_id` INT NOT NULL,
`team_id` INT NOT NULL,
INDEX `player_id_idx` (`player_id` ASC),
INDEX `match_id_idx` (`match_id` ASC),
INDEX `FK_team_fixtures_id_idx` (`team_id` ASC),
CONSTRAINT `FK_player_fixtures_id`
FOREIGN KEY (`player_id`)
REFERENCES `swp`.`player` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_match_fixtures_id`
FOREIGN KEY (`match_id`)
REFERENCES `swp`.`match` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_team_fixtures_id`
FOREIGN KEY (`team_id`)
REFERENCES `swp`.`team` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;









share|improve this question
















I have a tables that contains 6.820.483 and between these rows there a lot of duplicates, I discovered that running this query:



SELECT player_id, match_id, team_id, count(*) 
FROM fixtures
GROUP BY player_id, match_id, team_id
HAVING COUNT(*) > 1


structure example:



player_id | match_id  | team_id
19014 2506172 12573
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573
19016 2506172 12573
19016 2506172 12573


how can I safely remove only the duplicates? In the example above the table should looks like:



player_id | match_id  | team_id
19014 2506172 12573
19015 2506172 12573
19016 2506172 12573


table structure:



CREATE TABLE IF NOT EXISTS `swp`.`fixtures` (
`player_id` INT NOT NULL,
`match_id` INT NOT NULL,
`team_id` INT NOT NULL,
INDEX `player_id_idx` (`player_id` ASC),
INDEX `match_id_idx` (`match_id` ASC),
INDEX `FK_team_fixtures_id_idx` (`team_id` ASC),
CONSTRAINT `FK_player_fixtures_id`
FOREIGN KEY (`player_id`)
REFERENCES `swp`.`player` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_match_fixtures_id`
FOREIGN KEY (`match_id`)
REFERENCES `swp`.`match` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `FK_team_fixtures_id`
FOREIGN KEY (`team_id`)
REFERENCES `swp`.`team` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;






mysql sql






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 20 '18 at 15:32







Spartaok

















asked Nov 20 '18 at 15:20









SpartaokSpartaok

748




748








  • 1





    What do you mean by remove? Do you want to delete them from the table, or just select only unique values?

    – WillardSolutions
    Nov 20 '18 at 15:24











  • @WillardSolutions I need to delete them because are duplicated check the example for understand

    – Spartaok
    Nov 20 '18 at 15:25













  • Is there a primary key on the fixtures table?

    – WillardSolutions
    Nov 20 '18 at 15:25













  • Does your table have some PK that uniquely identifies a record?

    – Robert Kock
    Nov 20 '18 at 15:25











  • @WillardSolutions nope, the table contains FK of records available in other tables

    – Spartaok
    Nov 20 '18 at 15:26














  • 1





    What do you mean by remove? Do you want to delete them from the table, or just select only unique values?

    – WillardSolutions
    Nov 20 '18 at 15:24











  • @WillardSolutions I need to delete them because are duplicated check the example for understand

    – Spartaok
    Nov 20 '18 at 15:25













  • Is there a primary key on the fixtures table?

    – WillardSolutions
    Nov 20 '18 at 15:25













  • Does your table have some PK that uniquely identifies a record?

    – Robert Kock
    Nov 20 '18 at 15:25











  • @WillardSolutions nope, the table contains FK of records available in other tables

    – Spartaok
    Nov 20 '18 at 15:26








1




1





What do you mean by remove? Do you want to delete them from the table, or just select only unique values?

– WillardSolutions
Nov 20 '18 at 15:24





What do you mean by remove? Do you want to delete them from the table, or just select only unique values?

– WillardSolutions
Nov 20 '18 at 15:24













@WillardSolutions I need to delete them because are duplicated check the example for understand

– Spartaok
Nov 20 '18 at 15:25







@WillardSolutions I need to delete them because are duplicated check the example for understand

– Spartaok
Nov 20 '18 at 15:25















Is there a primary key on the fixtures table?

– WillardSolutions
Nov 20 '18 at 15:25







Is there a primary key on the fixtures table?

– WillardSolutions
Nov 20 '18 at 15:25















Does your table have some PK that uniquely identifies a record?

– Robert Kock
Nov 20 '18 at 15:25





Does your table have some PK that uniquely identifies a record?

– Robert Kock
Nov 20 '18 at 15:25













@WillardSolutions nope, the table contains FK of records available in other tables

– Spartaok
Nov 20 '18 at 15:26





@WillardSolutions nope, the table contains FK of records available in other tables

– Spartaok
Nov 20 '18 at 15:26












3 Answers
3






active

oldest

votes


















3














I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):



CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);

SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;

TRUNCATE TABLE fixtures;


In order to make sure no duplicated records are created anymore, you could do the following:



ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);


After this, repopulate the table and clean up:



INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;

DROP TABLE tmp_fixtures;





share|improve this answer


























  • This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.

    – afe
    Nov 20 '18 at 15:47











  • this give me that error: undeclared tmp_fixtures

    – Spartaok
    Nov 20 '18 at 15:48











  • Right idea, wrong database. You should be using create table as.

    – Gordon Linoff
    Nov 20 '18 at 16:23











  • Updated the answer.

    – Robert Kock
    Nov 21 '18 at 8:16



















3














Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.



As with all solutions, if you do attempt this backup your table first.



DECLARE @i INT = 0

WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f

SET @i = @i + 1
END


Also as pointed out by the other answers, you will likely want to create a Primary Key to prevent this in the future.






share|improve this answer


























  • but why robert solution give me the error error: undeclared tmp_fixtures?

    – Spartaok
    Nov 20 '18 at 16:21











  • @Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do INSERT INTO tmp_fixtures SELECT DISTINCT...

    – Derrick Moeller
    Nov 20 '18 at 16:25



















1














There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:

duplicates can appear again just like before.

So prior to restoring the table run this statement:



ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);


to add a multi-column primary key so the problem will not appear again.

Edit1

From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html




MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:




INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;



Edit2 (after Gordon Linoff's suggestion)

So the whole code should be:



CREATE TABLE tmp_fixtures AS 
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;

TRUNCATE TABLE fixtures;

ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);

INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;

DROP TABLE tmp_fixtures;



Use with caution and only if you have a backup of your data.






share|improve this answer


























  • the solution propose by robert return an error could you check that'

    – Spartaok
    Nov 20 '18 at 15:57











  • @LukStorms I don't. The documentation did...

    – forpas
    Nov 20 '18 at 16:07











  • Ah right you are. It was text copied from the source. My bad.

    – LukStorms
    Nov 20 '18 at 16:08











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53396170%2fhow-to-safely-remove-only-duplicated-rows%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























3 Answers
3






active

oldest

votes








3 Answers
3






active

oldest

votes









active

oldest

votes






active

oldest

votes









3














I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):



CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);

SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;

TRUNCATE TABLE fixtures;


In order to make sure no duplicated records are created anymore, you could do the following:



ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);


After this, repopulate the table and clean up:



INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;

DROP TABLE tmp_fixtures;





share|improve this answer


























  • This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.

    – afe
    Nov 20 '18 at 15:47











  • this give me that error: undeclared tmp_fixtures

    – Spartaok
    Nov 20 '18 at 15:48











  • Right idea, wrong database. You should be using create table as.

    – Gordon Linoff
    Nov 20 '18 at 16:23











  • Updated the answer.

    – Robert Kock
    Nov 21 '18 at 8:16
















3














I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):



CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);

SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;

TRUNCATE TABLE fixtures;


In order to make sure no duplicated records are created anymore, you could do the following:



ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);


After this, repopulate the table and clean up:



INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;

DROP TABLE tmp_fixtures;





share|improve this answer


























  • This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.

    – afe
    Nov 20 '18 at 15:47











  • this give me that error: undeclared tmp_fixtures

    – Spartaok
    Nov 20 '18 at 15:48











  • Right idea, wrong database. You should be using create table as.

    – Gordon Linoff
    Nov 20 '18 at 16:23











  • Updated the answer.

    – Robert Kock
    Nov 21 '18 at 8:16














3












3








3







I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):



CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);

SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;

TRUNCATE TABLE fixtures;


In order to make sure no duplicated records are created anymore, you could do the following:



ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);


After this, repopulate the table and clean up:



INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;

DROP TABLE tmp_fixtures;





share|improve this answer















I'm not a MySQL expect but you could try this (if you're sure no new records will be inserted in the meantime):



CREATE TABLE tmp_fixtures
(
player_id INT NOT NULL,
match_id INT NOT NULL,
team_id INT NOT NULL
);

SELECT DISTINCT
player_id,
match_id,
team_id
INTO tmp_fixtures
FROM fixtures;

TRUNCATE TABLE fixtures;


In order to make sure no duplicated records are created anymore, you could do the following:



ALTER TABLE fixtures ADD PRIMARY KEY (player_id, match_id, team_id);


After this, repopulate the table and clean up:



INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id,
match_id,
team_id
FROM tmp_fixtures;

DROP TABLE tmp_fixtures;






share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 21 '18 at 8:15

























answered Nov 20 '18 at 15:40









Robert KockRobert Kock

4,0841617




4,0841617













  • This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.

    – afe
    Nov 20 '18 at 15:47











  • this give me that error: undeclared tmp_fixtures

    – Spartaok
    Nov 20 '18 at 15:48











  • Right idea, wrong database. You should be using create table as.

    – Gordon Linoff
    Nov 20 '18 at 16:23











  • Updated the answer.

    – Robert Kock
    Nov 21 '18 at 8:16



















  • This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.

    – afe
    Nov 20 '18 at 15:47











  • this give me that error: undeclared tmp_fixtures

    – Spartaok
    Nov 20 '18 at 15:48











  • Right idea, wrong database. You should be using create table as.

    – Gordon Linoff
    Nov 20 '18 at 16:23











  • Updated the answer.

    – Robert Kock
    Nov 21 '18 at 8:16

















This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.

– afe
Nov 20 '18 at 15:47





This should be the accepted answer. When you do not use a PK, most DBMS would fail deleting a row in case of ambiguity. Moving old data to a new table which would use PKs is the only safe way to step out of this mess.

– afe
Nov 20 '18 at 15:47













this give me that error: undeclared tmp_fixtures

– Spartaok
Nov 20 '18 at 15:48





this give me that error: undeclared tmp_fixtures

– Spartaok
Nov 20 '18 at 15:48













Right idea, wrong database. You should be using create table as.

– Gordon Linoff
Nov 20 '18 at 16:23





Right idea, wrong database. You should be using create table as.

– Gordon Linoff
Nov 20 '18 at 16:23













Updated the answer.

– Robert Kock
Nov 21 '18 at 8:16





Updated the answer.

– Robert Kock
Nov 21 '18 at 8:16













3














Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.



As with all solutions, if you do attempt this backup your table first.



DECLARE @i INT = 0

WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f

SET @i = @i + 1
END


Also as pointed out by the other answers, you will likely want to create a Primary Key to prevent this in the future.






share|improve this answer


























  • but why robert solution give me the error error: undeclared tmp_fixtures?

    – Spartaok
    Nov 20 '18 at 16:21











  • @Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do INSERT INTO tmp_fixtures SELECT DISTINCT...

    – Derrick Moeller
    Nov 20 '18 at 16:25
















3














Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.



As with all solutions, if you do attempt this backup your table first.



DECLARE @i INT = 0

WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f

SET @i = @i + 1
END


Also as pointed out by the other answers, you will likely want to create a Primary Key to prevent this in the future.






share|improve this answer


























  • but why robert solution give me the error error: undeclared tmp_fixtures?

    – Spartaok
    Nov 20 '18 at 16:21











  • @Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do INSERT INTO tmp_fixtures SELECT DISTINCT...

    – Derrick Moeller
    Nov 20 '18 at 16:25














3












3








3







Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.



As with all solutions, if you do attempt this backup your table first.



DECLARE @i INT = 0

WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f

SET @i = @i + 1
END


Also as pointed out by the other answers, you will likely want to create a Primary Key to prevent this in the future.






share|improve this answer















Robert and forpas both provided much better answers, but technically I believe this could be done without creating a new table (at least in MSSQL). I've attempted to translate into MySQL. Again I would likely never do it this way, especally on large data sets, but it was an interesting exercise.



As with all solutions, if you do attempt this backup your table first.



DECLARE @i INT = 0

WHILE @i < 6820483
BEGIN
DELETE FROM f
FROM (
SELECT *
FROM fixtures
WHERE player_id IN (SELECT player_id FROM fixtures GROUP BY player_id HAVING COUNT(*) > 1)
LIMIT 1
) f

SET @i = @i + 1
END


Also as pointed out by the other answers, you will likely want to create a Primary Key to prevent this in the future.







share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 20 '18 at 16:29

























answered Nov 20 '18 at 16:14









Derrick MoellerDerrick Moeller

2,72121433




2,72121433













  • but why robert solution give me the error error: undeclared tmp_fixtures?

    – Spartaok
    Nov 20 '18 at 16:21











  • @Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do INSERT INTO tmp_fixtures SELECT DISTINCT...

    – Derrick Moeller
    Nov 20 '18 at 16:25



















  • but why robert solution give me the error error: undeclared tmp_fixtures?

    – Spartaok
    Nov 20 '18 at 16:21











  • @Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do INSERT INTO tmp_fixtures SELECT DISTINCT...

    – Derrick Moeller
    Nov 20 '18 at 16:25

















but why robert solution give me the error error: undeclared tmp_fixtures?

– Spartaok
Nov 20 '18 at 16:21





but why robert solution give me the error error: undeclared tmp_fixtures?

– Spartaok
Nov 20 '18 at 16:21













@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do INSERT INTO tmp_fixtures SELECT DISTINCT...

– Derrick Moeller
Nov 20 '18 at 16:25





@Spartaok I'm not sure, maybe SELECT INTO isn't valid in MySQL. His syntax would work in MSSQL. If that isn't allowed in MySQL you could always create the table first and do INSERT INTO tmp_fixtures SELECT DISTINCT...

– Derrick Moeller
Nov 20 '18 at 16:25











1














There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:

duplicates can appear again just like before.

So prior to restoring the table run this statement:



ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);


to add a multi-column primary key so the problem will not appear again.

Edit1

From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html




MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:




INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;



Edit2 (after Gordon Linoff's suggestion)

So the whole code should be:



CREATE TABLE tmp_fixtures AS 
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;

TRUNCATE TABLE fixtures;

ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);

INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;

DROP TABLE tmp_fixtures;



Use with caution and only if you have a backup of your data.






share|improve this answer


























  • the solution propose by robert return an error could you check that'

    – Spartaok
    Nov 20 '18 at 15:57











  • @LukStorms I don't. The documentation did...

    – forpas
    Nov 20 '18 at 16:07











  • Ah right you are. It was text copied from the source. My bad.

    – LukStorms
    Nov 20 '18 at 16:08
















1














There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:

duplicates can appear again just like before.

So prior to restoring the table run this statement:



ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);


to add a multi-column primary key so the problem will not appear again.

Edit1

From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html




MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:




INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;



Edit2 (after Gordon Linoff's suggestion)

So the whole code should be:



CREATE TABLE tmp_fixtures AS 
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;

TRUNCATE TABLE fixtures;

ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);

INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;

DROP TABLE tmp_fixtures;



Use with caution and only if you have a backup of your data.






share|improve this answer


























  • the solution propose by robert return an error could you check that'

    – Spartaok
    Nov 20 '18 at 15:57











  • @LukStorms I don't. The documentation did...

    – forpas
    Nov 20 '18 at 16:07











  • Ah right you are. It was text copied from the source. My bad.

    – LukStorms
    Nov 20 '18 at 16:08














1












1








1







There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:

duplicates can appear again just like before.

So prior to restoring the table run this statement:



ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);


to add a multi-column primary key so the problem will not appear again.

Edit1

From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html




MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:




INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;



Edit2 (after Gordon Linoff's suggestion)

So the whole code should be:



CREATE TABLE tmp_fixtures AS 
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;

TRUNCATE TABLE fixtures;

ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);

INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;

DROP TABLE tmp_fixtures;



Use with caution and only if you have a backup of your data.






share|improve this answer















There is no other solution but backup distinct rows of the table in a temporary table and restore it afterwards just like @Robert Kock proposed, but:

duplicates can appear again just like before.

So prior to restoring the table run this statement:



ALTER TABLE swp.fixtures ADD PRIMARY KEY(player_id, match_id, team_id);


to add a multi-column primary key so the problem will not appear again.

Edit1

From: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html




MySQL Server doesn't support the SELECT ... INTO TABLE Sybase SQL
extension. Instead, MySQL Server supports the INSERT INTO ... SELECT
standard SQL syntax, which is basically the same thing. See Section
13.2.6.1, “INSERT ... SELECT Syntax”. For example:




INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;



Edit2 (after Gordon Linoff's suggestion)

So the whole code should be:



CREATE TABLE tmp_fixtures AS 
SELECT DISTINCT player_id, match_id, team_id FROM fixtures;

TRUNCATE TABLE fixtures;

ALTER TABLE fixtures ADD PRIMARY KEY(player_id, match_id, team_id);

INSERT INTO fixtures (player_id, match_id, team_id)
SELECT player_id, match_id, team_id FROM tmp_fixtures;

DROP TABLE tmp_fixtures;



Use with caution and only if you have a backup of your data.







share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 20 '18 at 16:35

























answered Nov 20 '18 at 15:53









forpasforpas

11k3423




11k3423













  • the solution propose by robert return an error could you check that'

    – Spartaok
    Nov 20 '18 at 15:57











  • @LukStorms I don't. The documentation did...

    – forpas
    Nov 20 '18 at 16:07











  • Ah right you are. It was text copied from the source. My bad.

    – LukStorms
    Nov 20 '18 at 16:08



















  • the solution propose by robert return an error could you check that'

    – Spartaok
    Nov 20 '18 at 15:57











  • @LukStorms I don't. The documentation did...

    – forpas
    Nov 20 '18 at 16:07











  • Ah right you are. It was text copied from the source. My bad.

    – LukStorms
    Nov 20 '18 at 16:08

















the solution propose by robert return an error could you check that'

– Spartaok
Nov 20 '18 at 15:57





the solution propose by robert return an error could you check that'

– Spartaok
Nov 20 '18 at 15:57













@LukStorms I don't. The documentation did...

– forpas
Nov 20 '18 at 16:07





@LukStorms I don't. The documentation did...

– forpas
Nov 20 '18 at 16:07













Ah right you are. It was text copied from the source. My bad.

– LukStorms
Nov 20 '18 at 16:08





Ah right you are. It was text copied from the source. My bad.

– LukStorms
Nov 20 '18 at 16:08


















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53396170%2fhow-to-safely-remove-only-duplicated-rows%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Can a sorcerer learn a 5th-level spell early by creating spell slots using the Font of Magic feature?

ts Property 'filter' does not exist on type '{}'

mat-slide-toggle shouldn't change it's state when I click cancel in confirmation window