Create index and then insert or insert and then create index?

I'm inserting a large volume of data into a MySQL table, and I need an index so the data can be accessed quickly. Is there a performance difference between these two scenarios?

1. Create an index and then insert all data
2. Insert all data and then create an index

Thanks in advance!

edited Nov 21 '18 at 15:37

asked Nov 21 '18 at 15:32

Luis Montano


My gut feeling says scenario 2 is faster. Scenario 1: The index is there, a record is inserted and the index tree must be altered, another record gets inserted and the index tree must be altered, ... Scenario 2: All records get inserted straight away. Then an index is built with all branches perfectly balanced, as the data is already known. I may be wrong, but I'm pretty sure I'm right ;-)

– Thorsten Kettner
Nov 21 '18 at 15:41

Let's see SHOW CREATE TABLE; there could be other subtle issues. Also, will you be replacing the entire table, or merely augmenting it?

– Rick James
Nov 21 '18 at 22:59

mysql database indexing


1 Answer

With the InnoDB storage engine, it is faster to define the clustered index (i.e. the PRIMARY KEY) on the table before inserting data.

This is because if no PRIMARY KEY is defined (and the table has no UNIQUE index on NOT NULL columns that InnoDB can use instead), InnoDB builds the clustered index on a hidden 6-byte row ID that increases as rows are inserted. If a PRIMARY KEY is specified later, the entire table has to be rebuilt.
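As a rough illustration of the two orderings for the clustered index, here is a minimal sketch. The table and column names (`load_target`, `load_target_nokey`, `id`, `payload`) are hypothetical and not taken from the question.

```sql
-- Hypothetical tables for illustration; names are not from the question.

-- Preferred for a bulk load: declare the PRIMARY KEY up front, so rows are
-- stored in clustered-index order from the first insert.
CREATE TABLE load_target (
  id      BIGINT UNSIGNED NOT NULL,
  payload VARCHAR(255)    NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- The costly alternative: create the table without a key, load it, and add
-- the key afterwards. The ALTER has to rebuild the whole table around the
-- new clustered index.
CREATE TABLE load_target_nokey (
  id      BIGINT UNSIGNED NOT NULL,
  payload VARCHAR(255)    NOT NULL
) ENGINE=InnoDB;

-- ... bulk load into load_target_nokey happens here ...

ALTER TABLE load_target_nokey ADD PRIMARY KEY (id);
```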

For secondary indexes (i.e. non-clustered indexes) with InnoDB, it is usually faster to load the data with no secondary indexes defined and then build them once the load is finished.
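The "load first, index afterwards" ordering for secondary indexes might look like the sketch below. The table, column, and index names (`orders_import`, `idx_customer`, `idx_created`) are assumptions made up for the example.

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE orders_import (
  id          BIGINT UNSIGNED NOT NULL,
  customer_id INT UNSIGNED    NOT NULL,
  created_at  DATETIME        NOT NULL,
  PRIMARY KEY (id)            -- clustered index defined before the load
) ENGINE=InnoDB;

-- ... bulk load happens here, with no secondary indexes to maintain ...

-- Build the secondary indexes once, after the data is in place.
-- Both indexes are added in a single ALTER TABLE statement.
ALTER TABLE orders_import
  ADD INDEX idx_customer (customer_id),
  ADD INDEX idx_created  (created_at);
```

Whether this actually beats maintaining the indexes during the load depends on the data volume and the server, so it is worth timing both variants on a representative sample.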

FOLLOWUP

As far as the speed of loading a table goes (in particular, a table that is truncated or emptied and then reloaded), dropping and re-creating indexes is a well-known technique for speeding up the load, not just with MySQL but with other RDBMSs such as Oracle.

There is no guarantee that the processing will be faster; as with most things in databases, we need to test to determine which approach is faster.

If the table already contains millions of rows and we're adding only a relatively small number of rows (hundreds, say), then dropping and rebuilding the indexes is likely to be a lot slower, because of all the extra work needed to re-index the existing rows. In that case it would be faster to let the index maintenance happen while the rows are being inserted.

In terms of speeding up a load, the "drop and recreate indexes" technique isn't going to give us the kind of dramatic improvement we get from other changes. For example, it won't come anywhere near the improvement we would see from using LOAD DATA in place of INSERT statements, or from using multi-row INSERT statements instead of a series of single-row INSERT statements.
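For comparison, the three loading styles mentioned above could be sketched as follows. The table `people_import`, the sample values, and the file path `/path/to/people.csv` are all hypothetical, and the CSV layout (comma-separated, one header line) is an assumption.

```sql
-- Hypothetical table, values, and file path, for illustration only.
CREATE TABLE people_import (
  id   INT UNSIGNED NOT NULL,
  name VARCHAR(100) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Slowest: one statement per row.
INSERT INTO people_import (id, name) VALUES (1, 'Ann');
INSERT INTO people_import (id, name) VALUES (2, 'Bob');

-- Faster: one statement carrying many rows.
INSERT INTO people_import (id, name) VALUES
  (3, 'Cy'),
  (4, 'Dee'),
  (5, 'Eve');

-- Usually fastest: bulk-load from a delimited file.
-- Assumes a comma-separated file with a header line to skip.
LOAD DATA INFILE '/path/to/people.csv'
INTO TABLE people_import
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(id, name);
```

Note that `LOAD DATA INFILE` reads the file on the server, and the server's `secure_file_priv` setting may restrict which directories it can read from; `LOAD DATA LOCAL INFILE` reads the file from the client instead, if that feature is enabled.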

edited Nov 21 '18 at 21:48

answered Nov 21 '18 at 16:03

spencer7593

