How to speed up PostgreSQL aggregate select with sub queries and case statements
Background: I have a table containing financial transaction records. The table has several tens of millions of rows across tens of thousands of users. I need to fetch sums of these transactions to show balances and other figures around the site.
My current query can get extremely slow and often times out. I have tried optimizing it but can't get it to run efficiently.
Environment: My application runs on Heroku with a Postgres Standard-2 plan (8 GB RAM, 400 max connections, 256 GB allowed storage). I use at most about 20 connections at any given time and my current DB size is 35 GB. According to the statistics, this query takes about 1,000 ms on average and is run very frequently, which has a big impact on site performance.
For the database, the index cache hit rate is 99% and the table cache hit rate is 97%. Autovacuum runs about every other day based on the current thresholds.
Here's my current transactions table setup:
CREATE TABLE transactions (
    id bigint DEFAULT nextval('transactions_id_seq'::regclass) NOT NULL,
    user_id integer NOT NULL,
    date timestamp without time zone NOT NULL,
    amount numeric(15,2) NOT NULL,
    transaction_type integer DEFAULT 0 NOT NULL,
    account_id integer DEFAULT 0,
    reconciled integer DEFAULT 0,
    parent integer DEFAULT 0,
    ccparent integer DEFAULT 0,
    created_at timestamp without time zone DEFAULT now() NOT NULL
);
CREATE INDEX transactions_user_id_key ON transactions USING btree (user_id);
CREATE INDEX transactions_user_date_idx ON transactions (user_id, date);
CREATE INDEX transactions_user_ccparent_idx ON transactions (user_id, ccparent) WHERE ccparent >0;
And here's my current query:
SELECT account_id,
       sum(deposit) - sum(withdrawal) AS balance,
       sum(r_deposit) - sum(r_withdrawal) AS r_balance,
       sum(deposit) AS o_deposit,
       sum(withdrawal) AS o_withdrawal,
       sum(r_deposit) AS r_deposit,
       sum(r_withdrawal) AS r_withdrawal
FROM
  (SELECT t.account_id,
          CASE
              WHEN transaction_type > 0 THEN sum(amount)
              ELSE 0
          END AS deposit,
          CASE
              WHEN transaction_type = 0 THEN sum(amount)
              ELSE 0
          END AS withdrawal,
          CASE
              WHEN transaction_type > 0 AND reconciled = 0 THEN sum(amount)
              ELSE 0
          END AS r_deposit,
          CASE
              WHEN transaction_type = 0 AND reconciled = 0 THEN sum(amount)
              ELSE 0
          END AS r_withdrawal
   FROM transactions AS t
   WHERE user_id = $1 AND parent = 0 AND ccparent = 0
   GROUP BY transaction_type, account_id, reconciled) AS t0
GROUP BY account_id;
The query has several parts. For each account the user has, I need to get:
1) the overall account balance
2) the balance for all reconciled transactions
3) separately, the sums of all deposits, withdrawals, reconciled deposits and reconciled withdrawals (an equivalent single-pass formulation is sketched below).
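For reference, here is a sketch of how the same per-account figures could be computed in a single aggregation pass with FILTER (available in PostgreSQL 9.4+). This is not the query from the question, just an equivalent formulation; whether it is faster still depends on the plan, since most of the time in the plan below is spent reading the rows rather than aggregating them.
-- Conditional aggregation with FILTER instead of the nested CASE/GROUP BY;
-- COALESCE turns "no matching rows" (NULL) back into 0.
SELECT account_id,
       COALESCE(sum(amount) FILTER (WHERE transaction_type > 0), 0)
     - COALESCE(sum(amount) FILTER (WHERE transaction_type = 0), 0)                    AS balance,
       COALESCE(sum(amount) FILTER (WHERE transaction_type > 0 AND reconciled = 0), 0)
     - COALESCE(sum(amount) FILTER (WHERE transaction_type = 0 AND reconciled = 0), 0) AS r_balance,
       COALESCE(sum(amount) FILTER (WHERE transaction_type > 0), 0)                    AS o_deposit,
       COALESCE(sum(amount) FILTER (WHERE transaction_type = 0), 0)                    AS o_withdrawal,
       COALESCE(sum(amount) FILTER (WHERE transaction_type > 0 AND reconciled = 0), 0) AS r_deposit,
       COALESCE(sum(amount) FILTER (WHERE transaction_type = 0 AND reconciled = 0), 0) AS r_withdrawal
FROM transactions
WHERE user_id = $1 AND parent = 0 AND ccparent = 0
GROUP BY account_id;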
Here's one query plan from running EXPLAIN ANALYZE on the original query:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=13179.85..13180.14 rows=36 width=132) (actual time=1326.200..1326.204 rows=6 loops=1)
Group Key: t.account_id
-> HashAggregate (cost=13179.29..13179.58 rows=36 width=18) (actual time=1326.163..1326.171 rows=16 loops=1)
Group Key: t.transaction_type, t.account_id, t.reconciled
-> Bitmap Heap Scan on transactions t (cost=73.96..13132.07 rows=13491 width=18) (actual time=17.410..1317.863 rows=12310 loops=1)
Recheck Cond: (user_id = 1)
Filter: ((parent = 0) AND (ccparent = 0))
Rows Removed by Filter: 2
Heap Blocks: exact=6291
-> Bitmap Index Scan on transactions_user_id_key (cost=0.00..73.29 rows=13601 width=0) (actual time=15.901..15.901 rows=12343 loops=1)
Index Cond: (user_id = 1)
Planning time: 0.895 ms
Execution time: 1326.424 ms
Does anyone have any suggestions on how to speed up this query? Like I said, it's the most run query in my application and is also one of the most demanding on the DB. If I could optimize this, it would have tremendous benefits to the app in general.
postgresql
I suspect that indices are all you'll need here; however, because of the nature of your query you have some options. You could calculate the sums for a historical set of records and store that value, then only aggregate the new data when it's requested, store the most recent id and the sums associated with it as the new historical value, and return that. That way you only have to look back as far as the most recent lookup.
– sage88
Jan 4 at 6:13
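A minimal sketch of the incremental approach this comment describes, assuming a hypothetical summary table (the names account_balance_snapshots and last_txn_id are invented for illustration and not from the question):
-- Hypothetical summary table: one row per (user_id, account_id) holding the
-- sums accumulated so far and the highest transaction id they cover.
CREATE TABLE account_balance_snapshots (
    user_id      integer       NOT NULL,
    account_id   integer       NOT NULL,
    last_txn_id  bigint        NOT NULL,
    deposit      numeric(15,2) NOT NULL DEFAULT 0,
    withdrawal   numeric(15,2) NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, account_id)
);

-- Combine the stored snapshot with only the transactions added since it
-- (reconciled sums would be handled the same way).
SELECT s.account_id,
       s.deposit    + COALESCE(sum(t.amount) FILTER (WHERE t.transaction_type > 0), 0) AS o_deposit,
       s.withdrawal + COALESCE(sum(t.amount) FILTER (WHERE t.transaction_type = 0), 0) AS o_withdrawal
FROM account_balance_snapshots s
LEFT JOIN transactions t
       ON t.user_id = s.user_id
      AND t.account_id = s.account_id
      AND t.id > s.last_txn_id
      AND t.parent = 0
      AND t.ccparent = 0
WHERE s.user_id = $1
GROUP BY s.user_id, s.account_id, s.deposit, s.withdrawal;
The application would then periodically fold the new rows into the snapshot and advance last_txn_id.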
Also, your primary key syntax could use some work: id bigserial primary key will do the same thing.
– sage88
Jan 4 at 6:20
1 Answer
See if it picks up an index on transactions (user_id, parent, ccparent, transaction_type, account_id, reconciled):
CREATE INDEX transactions_u_p_ccp_tt_a_r_idx
ON transactions
(user_id,
parent,
ccparent,
transaction_type,
account_id,
reconciled);
Maybe you can even include amount in the index:
CREATE INDEX transactions_u_p_ccp_tt_a_r_a_idx
ON transactions
(user_id,
parent,
ccparent,
transaction_type,
account_id,
reconciled,
amount);
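A possible variant (my own assumption, not part of this answer): since the query always filters on parent = 0 AND ccparent = 0, a partial index that bakes those predicates into its WHERE clause indexes only the qualifying rows and, with amount included, might allow an index-only scan. The index name below is made up for illustration.
-- Partial covering index: only rows matching the query's fixed predicates are
-- indexed, and amount is stored so the aggregate may avoid the heap entirely
-- (index-only scans also depend on the visibility map being reasonably fresh).
CREATE INDEX transactions_user_acct_partial_idx
    ON transactions (user_id, account_id, transaction_type, reconciled, amount)
    WHERE parent = 0 AND ccparent = 0;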
I was worried that having such complex indexes might affect the performance of the table when it comes to other actions (insert, update). Is that an unfair assumption? Again, the table has about 30 million rows in it and is used extremely frequently.
– BrandonO
Jan 3 at 18:46
@BrandonO: Of course it will have an impact on INSERTs and UPDATEs (though for the latter, indexes can also be a benefit, since part of an UPDATE is the retrieval, which could be faster with an appropriate index). But I wouldn't worry that much about it. You'd have to try it and then see which weighs more. I think it's likely that the decrease in write speed is negligible while the increase in speed for your query is significant.
– sticky bit
Jan 3 at 19:02
It's proving difficult to test, since after running the query once the query planner seems to better account for it, and running the same query back to back results in the second run taking something like 78 ms vs 5,000 ms. I'll keep playing with this and see if some different indexes will help.
– BrandonO
Jan 3 at 19:16
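One general way to tell warm-cache runs apart from cold ones (a standard PostgreSQL facility, not something suggested in this thread) is EXPLAIN's BUFFERS option, which reports how many pages were served from the buffer cache versus storage:
-- "shared hit" = pages already in PostgreSQL's buffer cache,
-- "read"       = pages fetched from disk / the OS cache.
EXPLAIN (ANALYZE, BUFFERS)
SELECT account_id, sum(amount)
FROM transactions
WHERE user_id = 1 AND parent = 0 AND ccparent = 0
GROUP BY account_id;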