How to speed up PostgreSQL aggregate select with sub queries and case statements

Background: I have a table containing financial transaction records. The table has several tens of millions of rows for tens of thousands of users. I need to fetch the sum of the transactions for showing balances and other aspects of the site.



My current query can get extremely slow and often times out. I have tried optimizing the query but can't seem to get it to run efficiently.



Environment: My application runs on Heroku using a Postgres Standard-2 plan (8 GB RAM, 400 max connections, 256 GB allowed storage). I use at most about 20 connections at any given time, and my current DB size is 35 GB. According to the statistics, this query takes about 1,000 ms on average and is run very frequently, which has a big impact on site performance.



For the database, the index cache hit rate is 99% and the table cache hit rate is 97%. Autovacuum runs about every other day based on the current thresholds.



Here's my current transactions table setup:



CREATE TABLE transactions (
    id               bigint DEFAULT nextval('transactions_id_seq'::regclass) NOT NULL,
    user_id          integer NOT NULL,
    date             timestamp without time zone NOT NULL,
    amount           numeric(15,2) NOT NULL,
    transaction_type integer DEFAULT 0 NOT NULL,
    account_id       integer DEFAULT 0,
    reconciled       integer DEFAULT 0,
    parent           integer DEFAULT 0,
    ccparent         integer DEFAULT 0,
    created_at       timestamp without time zone DEFAULT now() NOT NULL
);

CREATE INDEX transactions_user_id_key ON transactions USING btree (user_id);
CREATE INDEX transactions_user_date_idx ON transactions (user_id, date);
CREATE INDEX transactions_user_ccparent_idx ON transactions (user_id, ccparent) WHERE ccparent > 0;


And here's my current query:



SELECT account_id,
       sum(deposit) - sum(withdrawal)     AS balance,
       sum(r_deposit) - sum(r_withdrawal) AS r_balance,
       sum(deposit)                       AS o_deposit,
       sum(withdrawal)                    AS o_withdrawal,
       sum(r_deposit)                     AS r_deposit,
       sum(r_withdrawal)                  AS r_withdrawal
FROM (SELECT t.account_id,
             CASE WHEN transaction_type > 0 THEN sum(amount) ELSE 0 END AS deposit,
             CASE WHEN transaction_type = 0 THEN sum(amount) ELSE 0 END AS withdrawal,
             CASE WHEN transaction_type > 0 AND reconciled = 0 THEN sum(amount) ELSE 0 END AS r_deposit,
             CASE WHEN transaction_type = 0 AND reconciled = 0 THEN sum(amount) ELSE 0 END AS r_withdrawal
      FROM transactions AS t
      WHERE user_id = $1
        AND parent = 0
        AND ccparent = 0
      GROUP BY transaction_type, account_id, reconciled) AS t0
GROUP BY account_id;


The query has several parts. I have to get the following for each account the user has:



1) the overall account balance



2) the balance for all reconciled transactions



3) separately, the sum of all deposits, withdrawals, reconciled deposits and reconciled withdrawals.
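
For reference, the same per-account figures can be computed in a single aggregation pass. The following is only a sketch using PostgreSQL's aggregate FILTER clause (available since 9.4, which the post doesn't confirm); the COALESCE calls reproduce the zeros that the CASE branches yield when a category has no rows, and it is not claimed to be faster, since the same rows still have to be read:

-- Sketch: single-pass conditional aggregation; FILTER replaces the
-- nested CASE/GROUP BY, and COALESCE supplies 0 for empty categories.
SELECT account_id,
       COALESCE(sum(amount) FILTER (WHERE transaction_type > 0), 0)
     - COALESCE(sum(amount) FILTER (WHERE transaction_type = 0), 0) AS balance,
       COALESCE(sum(amount) FILTER (WHERE transaction_type > 0 AND reconciled = 0), 0)
     - COALESCE(sum(amount) FILTER (WHERE transaction_type = 0 AND reconciled = 0), 0) AS r_balance,
       COALESCE(sum(amount) FILTER (WHERE transaction_type > 0), 0) AS o_deposit,
       COALESCE(sum(amount) FILTER (WHERE transaction_type = 0), 0) AS o_withdrawal,
       COALESCE(sum(amount) FILTER (WHERE transaction_type > 0 AND reconciled = 0), 0) AS r_deposit,
       COALESCE(sum(amount) FILTER (WHERE transaction_type = 0 AND reconciled = 0), 0) AS r_withdrawal
FROM transactions
WHERE user_id = $1
  AND parent = 0
  AND ccparent = 0
GROUP BY account_id;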



Here's one query plan from running EXPLAIN ANALYZE on the query:



                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=13179.85..13180.14 rows=36 width=132) (actual time=1326.200..1326.204 rows=6 loops=1)
   Group Key: t.account_id
   ->  HashAggregate  (cost=13179.29..13179.58 rows=36 width=18) (actual time=1326.163..1326.171 rows=16 loops=1)
         Group Key: t.transaction_type, t.account_id, t.reconciled
         ->  Bitmap Heap Scan on transactions t  (cost=73.96..13132.07 rows=13491 width=18) (actual time=17.410..1317.863 rows=12310 loops=1)
               Recheck Cond: (user_id = 1)
               Filter: ((parent = 0) AND (ccparent = 0))
               Rows Removed by Filter: 2
               Heap Blocks: exact=6291
               ->  Bitmap Index Scan on transactions_user_id_key  (cost=0.00..73.29 rows=13601 width=0) (actual time=15.901..15.901 rows=12343 loops=1)
                     Index Cond: (user_id = 1)
 Planning time: 0.895 ms
 Execution time: 1326.424 ms


Does anyone have any suggestions on how to speed up this query? Like I said, it's the most frequently run query in my application and also one of the most demanding on the DB. If I could optimize it, the benefits to the app in general would be tremendous.







postgresql






asked Jan 3 at 16:27 by BrandonO

  • I suspect that indices are all you'll need here; however, because of the nature of your query you have some options. You could calculate the sums for a historical set of records, store that value, and then only aggregate the new data when it's requested, storing the most recent id and the sums associated with it as the new historical value before returning. That way you only have to look back as far as the most recent lookup. (A sketch of this idea follows after these comments.)

    – sage88
    Jan 4 at 6:13













  • Also, your primary key syntax could use some work: id bigserial primary key will do the same thing

    – sage88
    Jan 4 at 6:20
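
A minimal sketch of the incremental idea from the first comment (the snapshot table, its columns, and the refresh policy are hypothetical and not part of the original post):

-- One row per (user, account) holding totals up to a known transaction id.
CREATE TABLE account_balance_snapshots (
    user_id      integer       NOT NULL,
    account_id   integer       NOT NULL,
    last_txn_id  bigint        NOT NULL,
    o_deposit    numeric(15,2) NOT NULL DEFAULT 0,
    o_withdrawal numeric(15,2) NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, account_id)
);

-- At read time, combine the stored totals with only the rows added since
-- the snapshot instead of re-aggregating the user's whole history.
-- (The reconciled totals would follow the same pattern.)
SELECT s.account_id,
       s.o_deposit
         + COALESCE(sum(t.amount) FILTER (WHERE t.transaction_type > 0), 0) AS o_deposit,
       s.o_withdrawal
         + COALESCE(sum(t.amount) FILTER (WHERE t.transaction_type = 0), 0) AS o_withdrawal
FROM account_balance_snapshots AS s
LEFT JOIN transactions AS t
       ON t.user_id = s.user_id
      AND t.account_id = s.account_id
      AND t.id > s.last_txn_id
      AND t.parent = 0
      AND t.ccparent = 0
WHERE s.user_id = $1
GROUP BY s.account_id, s.o_deposit, s.o_withdrawal;

The snapshot rows would need to be refreshed periodically (for example after a batch of new transactions), and any update or delete of an already-counted row would have to adjust or invalidate the stored totals.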



















1 Answer
Try whether the planner picks up an index on transactions (user_id, parent, ccparent, transaction_type, account_id, reconciled):



CREATE INDEX transactions_u_p_ccp_tt_a_r_idx
    ON transactions (user_id,
                     parent,
                     ccparent,
                     transaction_type,
                     account_id,
                     reconciled);


Maybe you can even include amount in the index.



CREATE INDEX transactions_u_p_ccp_tt_a_r_a_idx
    ON transactions (user_id,
                     parent,
                     ccparent,
                     transaction_type,
                     account_id,
                     reconciled,
                     amount);
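
A hedged follow-up on verifying the effect (the simplified SELECT below is only for checking the access path, with user_id = 1 as a literal as in the posted plan; it is not the full query from the question):

-- Refresh statistics and the visibility map; index-only scans are only
-- chosen when most heap pages are marked all-visible.
VACUUM ANALYZE transactions;

-- Buffer accounting shows whether the ~6291 heap block visits from the
-- original plan are still needed or whether the index now covers the read.
EXPLAIN (ANALYZE, BUFFERS)
SELECT account_id, sum(amount) AS total
FROM transactions
WHERE user_id = 1
  AND parent = 0
  AND ccparent = 0
GROUP BY account_id;

On PostgreSQL 11 or later (the post doesn't state the version), amount could instead be carried as a non-key column via INCLUDE (amount), which keeps the key shorter while still allowing an index-only scan.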






answered Jan 3 at 17:00 by sticky bit

  • I was worried that having such complex indexes might affect the performance of the table when it comes to other actions (insert, update). Is that an unfair assumption? Again, the table has about 30 million rows in it and is used extremely frequently.

    – BrandonO
    Jan 3 at 18:46






  • @BrandonO: Of course it will have an impact on INSERTs and UPDATEs (though for the latter, indexes can also be a benefit, since part of an UPDATE is the retrieval, which could be faster with an appropriate index). But I wouldn't worry too much about it. You'd have to try it and then see which weighs more. I think it's likely that the decrease in write speed is negligible while the increase in speed for your query is significant.

    – sticky bit
    Jan 3 at 19:02













  • It's proving difficult to test since, after running the query once, the query planner seems to account for it better, and running the same query back to back results in the second run taking something like 78 ms vs 5,000 ms. I'll keep playing with this and see if some different indexes help.

    – BrandonO
    Jan 3 at 19:16
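
A hedged aside on the last comment (not from the original thread): the gap between back-to-back runs comes from the data sitting in shared buffers and the OS page cache after the first run, not from the planner learning the query. The BUFFERS counters above separate cache hits from disk reads, and, if the Heroku plan allows the extension, pg_prewarm can warm the table deliberately so repeated timings compare like with like:

-- Load the table and the index used by the query into shared buffers so
-- that subsequent runs are measured against a warm cache.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('transactions');
SELECT pg_prewarm('transactions_user_id_key');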


















