PostgreSQL - optimal hardware for working with 50 million record retrieval

I run PostgreSQL 9.6 on AWS RDS on an r4.xlarge instance. Initially the table had fewer records and queries were lightning fast; later the table grew to 8 GB with 50 million records, and queries became extremely slow.

What is the optimal configuration for this kind of data retrieval?

Server specs: 4 cores, 30 GB RAM

postgresql amazon-web-services amazon-rds hardware

asked Jan 2 at 12:03 by Kaja Hussain, edited Jan 2 at 13:38 by Mark B
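
Since the question is about configuration: before resizing hardware, it is worth checking how much of that 30 GB PostgreSQL is actually allowed to use. A minimal sketch; the percentages in the comments are common rules of thumb from the PostgreSQL community, not RDS defaults, and on RDS these parameters are changed through a DB parameter group:

    -- Current memory-related settings (values shown are whatever the
    -- parameter group defines; the targets below are rules of thumb).
    SHOW shared_buffers;         -- often ~25% of RAM (~7.5 GB here)
    SHOW effective_cache_size;   -- often ~75% of RAM (planner hint only)
    SHOW work_mem;               -- memory per sort/hash operation
    SHOW maintenance_work_mem;   -- used by VACUUM and CREATE INDEX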

  • 5

    First of all, that's NOT a lot of data. Your server's RAM alone could hold the entire database in memory three times over. If you have performance issues, you should check indexes and execution plans, not assume the hardware is insufficient. You should also check query patterns: if client applications open transactions and keep locks for a long time without reason, they'll end up blocking each other.

    – Panagiotis Kanavos
    Jan 2 at 12:07

  • 2

    In any case, you should first determine why performance is bad. You can't fix concurrency issues by throwing extra hardware at them, for example. Adding missing indexes can improve performance 100x or more and reduce blocking, and a good schema can yield similar improvements. To put it another way, you may find that a far smaller machine can serve the same workload with far better performance.

    – Panagiotis Kanavos
    Jan 2 at 12:10
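
As an illustration of the index-and-plan check suggested in the two comments above, a minimal sketch (the table and column names are hypothetical):

    -- Show the real plan, timings and buffer usage for a slow query.
    -- A Seq Scan over 50 million rows on a filtered column usually
    -- points to a missing index.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM orders WHERE customer_id = 42;

    -- If the plan shows a sequential scan, a matching index typically
    -- helps; CONCURRENTLY avoids blocking writes while it builds.
    CREATE INDEX CONCURRENTLY idx_orders_customer_id
        ON orders (customer_id);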

  • Thanks for the reply, Panagiotis. But my CPU reaches 80%+, which is why I suspect that beefing up the hardware will solve this issue.

    – Kaja Hussain
    Jan 2 at 13:18

  • I'm just running a count(*) on the table, which is very slow on both r4.xlarge and r4.large.

    – Kaja Hussain
    Jan 2 at 13:25
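
Worth noting: an exact count(*) in PostgreSQL has to visit every live row (9.6 can parallelize the scan, but the work is still proportional to table size), so it stays slow on 50 million rows regardless of instance size. If an estimate suffices, the planner statistics are near-instant; a minimal sketch, with a hypothetical table name:

    -- Exact count: scans the whole table (or an index); inherently slow.
    SELECT count(*) FROM orders;

    -- Estimate from planner statistics: near-instant, and accurate
    -- enough for monitoring while autovacuum/ANALYZE keeps stats fresh.
    SELECT reltuples::bigint AS approx_rows
    FROM pg_class
    WHERE relname = 'orders';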

  • The issue was with Talend jobs that were holding sessions open and tying up the CPU.

    – Kaja Hussain
    Jan 3 at 6:44
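
That diagnosis matches a common pattern: ETL sessions left "idle in transaction" hold locks and snapshots and block other work. A minimal sketch for spotting such sessions before deciding whether to end them:

    -- List sessions sitting in an open transaction, oldest first.
    SELECT pid, usename, application_name,
           now() - xact_start AS xact_age,
           query
    FROM pg_stat_activity
    WHERE state = 'idle in transaction'
    ORDER BY xact_age DESC;

    -- A specific offender can then be ended with:
    -- SELECT pg_terminate_backend(<pid>);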