PostgreSQL - optimal hardware for retrieving 50 million records
I run PostgreSQL 9.6 on AWS RDS with an r4.xlarge instance. Initially the table had fewer records and queries were lightning fast; later the table grew to 8 GB with 50 million records and queries became extremely slow.
What is the optimal configuration for this kind of data retrieval?
Server specs: 4 cores, 30 GB RAM
postgresql amazon-web-services amazon-rds hardware
First of all, that's NOT a lot of data. Your server's RAM alone could hold the entire database in memory three times over. If you have performance issues, you should check indexes and execution plans, not assume the hardware is insufficient. You should also check query patterns: if client applications open transactions and hold locks for a long time without reason, they'll end up blocking each other.
– Panagiotis Kanavos
Jan 2 at 12:07
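For example, the execution plan and existing indexes can be inspected directly in PostgreSQL. This is a sketch; `big_table` and `some_column` are placeholder names for the actual table and filter column:

```sql
-- Show the actual plan, timing, and buffer usage for a slow query
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM big_table WHERE some_column = 42;

-- List the indexes defined on the table
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'big_table';
```

A `Seq Scan` node in the plan on a 50-million-row table, combined with no matching index in `pg_indexes`, usually points at a missing index rather than undersized hardware.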
In any case, you should first determine why performance is bad; you can't fix concurrency issues by throwing extra hardware at them, for example. Adding missing indexes can improve performance 100x or more and reduce blocking. A good schema can yield similar improvements. To put it another way, you may find that a far smaller machine can handle the same workload with far better performance.
– Panagiotis Kanavos
Jan 2 at 12:10
Thanks for the reply, Panagiotis. But my CPU reaches 80%+, which is why I suspect that beefing up the hardware would solve this issue.
– Kaja Hussain
Jan 2 at 13:18
I'm just running count(*) on the table, which is very slow when comparing r4.xlarge and r4.large.
– Kaja Hussain
Jan 2 at 13:25
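Note that `count(*)` in PostgreSQL has to scan the whole table (or a full index), so it is slow on 50 million rows regardless of instance size. When an approximation suffices, the planner's statistics give a near-instant estimate. A sketch, with `big_table` as a placeholder name:

```sql
-- Exact count: scans all 50M rows, slow by design
SELECT count(*) FROM big_table;

-- Approximate row count from planner statistics: nearly instant;
-- accuracy depends on how recently the table was (auto)analyzed
SELECT reltuples::bigint AS approx_rows
FROM pg_class
WHERE relname = 'big_table';
```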
The issue was with Talend jobs that were holding sessions open and hogging the CPU.
– Kaja Hussain
Jan 3 at 6:44
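Sessions held open by ETL jobs like this can be spotted from `pg_stat_activity`. A sketch of two diagnostic queries that work on PostgreSQL 9.6:

```sql
-- Sessions sitting idle inside an open transaction
-- (often ETL jobs holding locks), oldest first
SELECT pid, usename, application_name, state,
       now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_age DESC;

-- For each waiting backend, which PIDs are blocking it (9.6+)
SELECT pid, pg_blocking_pids(pid) AS blocked_by, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```

Offending sessions can then be ended with `SELECT pg_terminate_backend(pid);`, though fixing the jobs to commit or close their transactions is the durable solution.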
edited Jan 2 at 13:38 by Mark B
asked Jan 2 at 12:03 by Kaja Hussain