PostgreSQL - optimal hardware for retrieving 50 million records
I run PostgreSQL 9.6 on AWS RDS with an r4.xlarge instance. Initially the table had fewer records and queries were lightning fast; later the table grew to 8 GB with 50 million records and queries became extremely slow.
What is the optimal configuration for this kind of data retrieval?
Server specs: 4 cores, 30 GB RAM
postgresql amazon-web-services amazon-rds hardware
First of all, that's NOT a lot of data. Your server's RAM alone could hold the entire database in memory three times over. If you have performance issues, you should check indexes and execution plans, not assume the hardware is insufficient. You should also check query patterns: if client applications open transactions and hold locks for a long time without reason, they'll end up blocking each other.
– Panagiotis Kanavos
Jan 2 at 12:07
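For example, the execution plan and existing indexes can be inspected directly in PostgreSQL. This is a sketch; `big_table` and `some_column` are placeholder names for the actual table and filter column:

```sql
-- Show the actual plan, timing, and buffer usage for a slow query
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM big_table WHERE some_column = 42;

-- List the indexes defined on the table
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'big_table';
```

A `Seq Scan` node in the plan on a 50-million-row table, combined with no matching index in `pg_indexes`, usually points at a missing index rather than undersized hardware.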
In any case, you should first determine why performance is bad; you can't fix concurrency issues by throwing extra hardware at them, for example. Adding missing indexes can improve performance 100x or more and reduce blocking. A good schema can yield similar improvements. To put it another way, you may find that a far smaller machine can handle the same workload with far better performance.
– Panagiotis Kanavos
Jan 2 at 12:10
Thanks for the reply, Panagiotis. But my CPU reaches 80%+, which is why I suspect that beefing up the hardware would solve this issue.
– Kaja Hussain
Jan 2 at 13:18
I'm just running count(*) on the table, which is very slow when comparing r4.xlarge and r4.large.
– Kaja Hussain
Jan 2 at 13:25
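Note that `count(*)` in PostgreSQL has to scan the whole table (or a full index), so it is slow on 50 million rows regardless of instance size. When an approximation suffices, the planner's statistics give a near-instant estimate. A sketch, with `big_table` as a placeholder name:

```sql
-- Exact count: scans all 50M rows, slow by design
SELECT count(*) FROM big_table;

-- Approximate row count from planner statistics: nearly instant;
-- accuracy depends on how recently the table was (auto)analyzed
SELECT reltuples::bigint AS approx_rows
FROM pg_class
WHERE relname = 'big_table';
```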
The issue was with Talend jobs that were holding sessions open and hogging the CPU.
– Kaja Hussain
Jan 3 at 6:44
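Sessions held open by ETL jobs like this can be spotted from `pg_stat_activity`. A sketch of two diagnostic queries that work on PostgreSQL 9.6:

```sql
-- Sessions sitting idle inside an open transaction
-- (often ETL jobs holding locks), oldest first
SELECT pid, usename, application_name, state,
       now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_age DESC;

-- For each waiting backend, which PIDs are blocking it (9.6+)
SELECT pid, pg_blocking_pids(pid) AS blocked_by, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```

Offending sessions can then be ended with `SELECT pg_terminate_backend(pid);`, though fixing the jobs to commit or close their transactions is the durable solution.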
edited Jan 2 at 13:38 by Mark B
asked Jan 2 at 12:03 by Kaja Hussain