Does SQL Server Table-Scan Time depend on the Query?
I observed that a full table scan takes a different amount of time depending on the query. I expected that, under similar conditions (same selected columns, same column data types), a table scan would take roughly the same time. That does not seem to be the case, and I want to understand why.



I ran CHECKPOINT and DBCC DROPCLEANBUFFERS before each query to make sure cached data pages had no impact on the timings.



Table:




  • 10 columns

  • 10M rows

  • Each column has a different density, ranging from 0.1 to 0.000001

  • No indexes


Queries:



Query A: returned 100 rows, took ~900 ms



SELECT [COL00]
FROM [TEST].[dbo].[Test]
WHERE COL07 = 50000


Query B: returned 910,595 rows, took ~15,000 ms



SELECT [COL00]
FROM [TEST].[dbo].[Test]
WHERE COL01 = 5


** COL07 was randomly populated with integers from 0 to 100,000, and COL01 was randomly populated with integers from 0 to 10.



Time Taken:





  • Query A: around 900 ms


  • Query B: around 18000 ms


What am I missing here?
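The cold-cache setup can be sketched as follows (it assumes the TEST.dbo.Test table described above; SET STATISTICS TIME/IO report elapsed time and page reads per run):

```sql
-- Cold-cache timing sketch: run before each test query.
USE [TEST];
CHECKPOINT;              -- write dirty pages to disk
DBCC DROPCLEANBUFFERS;   -- empty the buffer pool so the scan reads from disk
SET STATISTICS TIME ON;  -- report CPU and elapsed time per statement
SET STATISTICS IO ON;    -- report logical/physical reads per table

SELECT [COL00]
FROM [TEST].[dbo].[Test]
WHERE COL07 = 50000;
```

With STATISTICS IO on, both queries should report the same number of reads for a full scan, which makes it easier to see whether the time difference comes from I/O or from something else.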










Comments:

  • There are about x100 more rows to fetch in the second test; that alone could explain why it took 20 times longer. – GMB, Jan 2 at 23:21

  • Switch to a count(1) as the result. – user2864740, Jan 2 at 23:28
















Tags: sql, sql-server






asked Jan 2 at 23:12 by Malinga; edited Jan 3 at 0:21 by GMB
3 Answers






Answer (score: 3) — GMB, answered Jan 2 at 23:25
Query A: returned 100 rows, took ~900 ms

Query B: returned 910,595 rows, took ~15,000 ms




I believe that what you are missing is that there are about x100 more rows to fetch in the second query. That alone could explain why it took 20 times longer.






  • Yep, it seems like that. However, doesn't a table scan mean going through the actual row data? If so, both queries have to go through all the data anyway. Or is there a separate cost involved in fetching? – Malinga, Jan 3 at 2:36

  • As an example, suppose I used SELECT [COL07] FROM [TEST].[dbo].[Test] WHERE COL07 = 50000 and SELECT [COL01] FROM [TEST].[dbo].[Test] WHERE COL01 = 5 (which behave similarly to the queries in the question), so I'm selecting exactly the column in the WHERE clause. If the query has to go through all values of that column to find the matching rows, why is there an additional fetching cost? – Malinga, Jan 3 at 2:37

  • It still has to actually do something with the rows when it finds them, which takes time as well. If you tried SELECT COUNT(1) FROM ... instead of SELECT *, that would minimise the amount of processing per row, and I suspect the times would be a lot closer. – DancingFool, Jan 3 at 2:54

  • Ya, I checked with count(1) and Query B now takes only a very small time. I might have to dig deeper to understand what that something is. Thanks. – Malinga, Jan 3 at 5:38



















Answer (score: 0) — Marc0, answered Jan 3 at 0:18, edited Jan 4 at 22:15
The two columns have different data densities:

Query A, COL07: 10,000,000 / 100,000 = 100 matching rows

Query B, COL01: 10,000,000 / 10 = 1,000,000 matching rows

The fact that both search parameters are in the middle of the data range doesn't necessarily impact the speed of the search. What matters is how many rows the engine has to return for the search predicate.

To see whether this is indeed the case, I would try the following:

COL04: 10,000,000 / 1,000 = 10,000 matching rows, filtering on WHERE COL04 = 500

COL08: 10,000,000 / 10,000 = 1,000 matching rows, filtering on WHERE COL08 = 5000

Given the times from the initial test, you would expect to see COL04 at ~7200 ms and COL08 at ~3600 ms.

An interesting article about SQL Server COUNT() Function Performance Comparison
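The intermediate-density probes suggested above could be sketched as follows (the value ranges of COL04 and COL08 are assumptions about how the test table was populated):

```sql
-- ~10,000 matching rows expected if COL04 holds values 0..1,000
SELECT [COL00] FROM [TEST].[dbo].[Test] WHERE COL04 = 500;

-- ~1,000 matching rows expected if COL08 holds values 0..10,000
SELECT [COL00] FROM [TEST].[dbo].[Test] WHERE COL08 = 5000;
```

If scan time grows with the number of matching rows across these probes, that supports the row-fetch-cost explanation.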






Answer (score: -1) — Derviş Kayımbaşıoğlu, answered Jan 2 at 23:23
    Full Table Scan (also known as Sequential Scan) is a scan made on a database where each row of the table under scan is read in a sequential (serial) order

    Reference

In your case, the full table scan reads rows sequentially (in order), so it does not need to scan the whole table to advance to the next record, because COL07 is ordered.

But in Query 2 that is not the case: COL01 is randomly distributed, so a full table scan is needed.

Query 1 is an optimistic scan, whereas Query 2 is a pessimistic scan.






  • None of them are ordered. – Malinga, Jan 3 at 2:16











