Why is the auto created statistic on this column empty?

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ margin-bottom:0;
}

Info

My question relates to a moderately big table (~40GB data space) that is a heap
(Unfortunately, I am not allowed to add a clustered index to the table by the application owners)

An auto created statistic on an Identity column (ID) was created, but is empty.

Auto create stats & auto update stats are on

Modifications have happened in the table

There are other (auto created) statistics that are getting updated

There is another statistic on the same column created by an index (duplicate)

Build: 12.0.5546

The duplicate statistic is getting updated:
enter image description here

The actual question

To my understanding, all stats could be used and modifications are tracked, even if there are two statistics on exactly the same columns (duplicates), so why does this statistic remain empty?

Stats Info

enter image description here

DB stat info

enter image description here

Table Size

enter image description here

Column Information where the statistic is created on

enter image description here

[ID] [int] IDENTITY(1,1) NOT NULL

Identity column

select * from sys.stats  

where name like '%_WA_Sys_0000000A_6B7099F3%';

enter image description here Auto created

Getting some info on another statistic

select * From sys.dm_db_stats_properties (1802541555, 3)

enter image description here

In comparison with my empty stat:

enter image description here

Stats + Histogram from "generate scripts":

/****** Object:  Statistic [_WA_Sys_0000000A_6B7099F3]    Script Date: 2/1/2019 10:18:19 AM ******/



    CREATE STATISTICS [_WA_Sys_0000000A_6B7099F3] ON [dbo].[table]([ID]) WITH STATS_STREAM = 0x01000000010000000000000000000000EC03686B0000000040000000000000000000000000000000380348063800000004000A00000000000000000000000000

When creating a copy of the stats, no data is inside

CREATE STATISTICS [_WA_Sys_0000000A_6B7099F3_TEST] ON [dbo].[table]([ID]) WITH STATS_STREAM = 0x01000000010000000000000000000000EC03686B0000000040000000000000000000000000000000380348063800000004000A00000000000000000000000000

enter image description here

When manually updating the stat they do get updated.

UPDATE STATISTICS [dbo].[Table]([_WA_Sys_0000000A_6B7099F3_TEST])

enter image description here

edited Feb 1 at 12:15

Paul White♦

54.2k14288461

asked Feb 1 at 9:02

Randi Vertongen

4,6911924

add a comment |

Info

My question relates to a moderately big table (~40GB data space) that is a heap
(Unfortunately, I am not allowed to add a clustered index to the table by the application owners)

An auto created statistic on an Identity column (ID) was created, but is empty.

Auto create stats & auto update stats are on

Modifications have happened in the table

There are other (auto created) statistics that are getting updated

There is another statistic on the same column created by an index (duplicate)

Build: 12.0.5546

The duplicate statistic is getting updated:
enter image description here

The actual question

To my understanding, all stats could be used and modifications are tracked, even if there are two statistics on exactly the same columns (duplicates), so why does this statistic remain empty?

Stats Info

enter image description here

DB stat info

enter image description here

Table Size

enter image description here

Column Information where the statistic is created on

enter image description here

[ID] [int] IDENTITY(1,1) NOT NULL

Identity column

select * from sys.stats  

where name like '%_WA_Sys_0000000A_6B7099F3%';

enter image description here Auto created

Getting some info on another statistic

select * From sys.dm_db_stats_properties (1802541555, 3)

enter image description here

In comparison with my empty stat:

enter image description here

Stats + Histogram from "generate scripts":

/****** Object:  Statistic [_WA_Sys_0000000A_6B7099F3]    Script Date: 2/1/2019 10:18:19 AM ******/



    CREATE STATISTICS [_WA_Sys_0000000A_6B7099F3] ON [dbo].[table]([ID]) WITH STATS_STREAM = 0x01000000010000000000000000000000EC03686B0000000040000000000000000000000000000000380348063800000004000A00000000000000000000000000

When creating a copy of the stats, no data is inside

CREATE STATISTICS [_WA_Sys_0000000A_6B7099F3_TEST] ON [dbo].[table]([ID]) WITH STATS_STREAM = 0x01000000010000000000000000000000EC03686B0000000040000000000000000000000000000000380348063800000004000A00000000000000000000000000

enter image description here

When manually updating the stat they do get updated.

UPDATE STATISTICS [dbo].[Table]([_WA_Sys_0000000A_6B7099F3_TEST])

enter image description here

edited Feb 1 at 12:15

Paul White♦

54.2k14288461

asked Feb 1 at 9:02

Randi Vertongen

4,6911924

add a comment |

Info

My question relates to a moderately big table (~40GB data space) that is a heap
(Unfortunately, I am not allowed to add a clustered index to the table by the application owners)

An auto created statistic on an Identity column (ID) was created, but is empty.

Auto create stats & auto update stats are on

Modifications have happened in the table

There are other (auto created) statistics that are getting updated

There is another statistic on the same column created by an index (duplicate)

Build: 12.0.5546

The duplicate statistic is getting updated:
enter image description here

The actual question

To my understanding, all stats could be used and modifications are tracked, even if there are two statistics on exactly the same columns (duplicates), so why does this statistic remain empty?

Stats Info

enter image description here

DB stat info

enter image description here

Table Size

enter image description here

Column Information where the statistic is created on

enter image description here

[ID] [int] IDENTITY(1,1) NOT NULL

Identity column

select * from sys.stats  

where name like '%_WA_Sys_0000000A_6B7099F3%';

enter image description here Auto created

Getting some info on another statistic

select * From sys.dm_db_stats_properties (1802541555, 3)

enter image description here

In comparison with my empty stat:

enter image description here

Stats + Histogram from "generate scripts":

/****** Object:  Statistic [_WA_Sys_0000000A_6B7099F3]    Script Date: 2/1/2019 10:18:19 AM ******/



    CREATE STATISTICS [_WA_Sys_0000000A_6B7099F3] ON [dbo].[table]([ID]) WITH STATS_STREAM = 0x01000000010000000000000000000000EC03686B0000000040000000000000000000000000000000380348063800000004000A00000000000000000000000000

When creating a copy of the stats, no data is inside

CREATE STATISTICS [_WA_Sys_0000000A_6B7099F3_TEST] ON [dbo].[table]([ID]) WITH STATS_STREAM = 0x01000000010000000000000000000000EC03686B0000000040000000000000000000000000000000380348063800000004000A00000000000000000000000000

enter image description here

When manually updating the stat they do get updated.

UPDATE STATISTICS [dbo].[Table]([_WA_Sys_0000000A_6B7099F3_TEST])

enter image description here

edited Feb 1 at 12:15

Paul White♦

54.2k14288461

asked Feb 1 at 9:02

Randi Vertongen

4,6911924

Info

My question relates to a moderately big table (~40GB data space) that is a heap
(Unfortunately, I am not allowed to add a clustered index to the table by the application owners)

An auto created statistic on an Identity column (ID) was created, but is empty.

Auto create stats & auto update stats are on

Modifications have happened in the table

There are other (auto created) statistics that are getting updated

There is another statistic on the same column created by an index (duplicate)

Build: 12.0.5546

The duplicate statistic is getting updated:
enter image description here

The actual question

To my understanding, all stats could be used and modifications are tracked, even if there are two statistics on exactly the same columns (duplicates), so why does this statistic remain empty?

Stats Info

enter image description here

DB stat info

enter image description here

Table Size

enter image description here

Column Information where the statistic is created on

enter image description here

[ID] [int] IDENTITY(1,1) NOT NULL

Identity column

select * from sys.stats  

where name like '%_WA_Sys_0000000A_6B7099F3%';

enter image description here Auto created

Getting some info on another statistic

select * From sys.dm_db_stats_properties (1802541555, 3)

enter image description here

In comparison with my empty stat:

enter image description here

Stats + Histogram from "generate scripts":

/****** Object:  Statistic [_WA_Sys_0000000A_6B7099F3]    Script Date: 2/1/2019 10:18:19 AM ******/



    CREATE STATISTICS [_WA_Sys_0000000A_6B7099F3] ON [dbo].[table]([ID]) WITH STATS_STREAM = 0x01000000010000000000000000000000EC03686B0000000040000000000000000000000000000000380348063800000004000A00000000000000000000000000

When creating a copy of the stats, no data is inside

CREATE STATISTICS [_WA_Sys_0000000A_6B7099F3_TEST] ON [dbo].[table]([ID]) WITH STATS_STREAM = 0x01000000010000000000000000000000EC03686B0000000040000000000000000000000000000000380348063800000004000A00000000000000000000000000

enter image description here

When manually updating the stat they do get updated.

UPDATE STATISTICS [dbo].[Table]([_WA_Sys_0000000A_6B7099F3_TEST])

enter image description here

sql-server sql-server-2014 statistics

edited Feb 1 at 12:15

Paul White♦

54.2k14288461

asked Feb 1 at 9:02

Randi Vertongen

4,6911924

edited Feb 1 at 12:15

Paul White♦

54.2k14288461

asked Feb 1 at 9:02

Randi Vertongen

4,6911924

edited Feb 1 at 12:15

Paul White♦

54.2k14288461

edited Feb 1 at 12:15

Paul White♦

54.2k14288461

edited Feb 1 at 12:15

Paul White♦

54.2k14288461

asked Feb 1 at 9:02

Randi Vertongen

4,6911924

asked Feb 1 at 9:02

Randi Vertongen

4,6911924

asked Feb 1 at 9:02

Randi Vertongen

4,6911924

add a comment |

1 Answer
1

active

oldest

votes

I was able to reproduce this, both with an empty statistic, and a populated statistic. I arranged for an automatic statistic to be created on an empty table, and the index was created later:

IF OBJECT_ID(N'dbo.Heap', N'U') IS NOT NULL

BEGIN

    DROP TABLE dbo.Heap;

END;

GO

CREATE TABLE dbo.Heap 

(

    id integer NOT NULL IDENTITY,

    val integer NOT NULL,

);

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Empty table

TRUNCATE TABLE dbo.Heap;

GO

-- Repeat exact same query (RT = 500 + 0.2 * 1000 = 700)

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

-- Add index

ALTER TABLE dbo.Heap ADD 

    CONSTRAINT [PK dbo.Heap id]

    PRIMARY KEY NONCLUSTERED (id);

GO

SELECT

    S.[name],

    S.auto_created,

    DDSP.stats_id,

    DDSP.last_updated,

    DDSP.[rows],

    DDSP.rows_sampled,

    DDSP.steps,

    DDSP.unfiltered_rows,

    DDSP.modification_counter

FROM sys.stats AS S

CROSS APPLY sys.dm_db_stats_properties(S.[object_id], S.stats_id) AS DDSP

WHERE 

    S.[object_id] = OBJECT_ID(N'dbo.Heap', N'U');

Output

I found that modifications continue to be tracked accurately on all non-empty duplicates, but only one statistic is updated automatically (regardless of the asynchronous setting).

Automatic statistics updates only occur when the query optimizer needs a particular statistic, and finds that it is out of date (an optimality-related recompile).

The optimizer chooses from duplicate statistics as mentioned in the Plan Caching and Recompilations in SQL Server 2012 paper:

An issue not directly related to the topic of this document is: given multiple statistics on the same set of columns in the same order, how does the query optimizer decide which ones to load during query optimization? The answer is not simple, but the query optimizer uses such guidelines as: Give preference to recent statistics over older statistics; Give preference to statistics computed using FULLSCAN option to those computed using sampling; and so on.

The point being that the optimizer chooses one of the available duplicate statistics (the "best one), and that one is automatically updated if found to be stale.

I believe this is a change in behaviour from older releases - or at least the documentation suggests that all out-of-date statistics for an object would be updated as part of this process, but I have no idea when this changed. It was certainly after August 2013 when Matt Bowler posted Duplicate Statistics, which contains a handy AdventureWorks-based repo. That script now results in only one of the statistics objects being updated, while at the time both were.

The above explanation matches all the behaviours I observed while attempting to reproduce your scenario, but I doubt it is explicitly documented anywhere. It does seem like a sensible optimization, since there is little value in keeping duplicates fully updated.

This is probably all at a level of detail below that which Microsoft are willing to support. This also means it could change without notice.

edited Feb 1 at 12:59

answered Feb 1 at 12:15

Paul White♦

54.2k14288461

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f228656%2fwhy-is-the-auto-created-statistic-on-this-column-empty%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

I was able to reproduce this, both with an empty statistic, and a populated statistic. I arranged for an automatic statistic to be created on an empty table, and the index was created later:

IF OBJECT_ID(N'dbo.Heap', N'U') IS NOT NULL

BEGIN

    DROP TABLE dbo.Heap;

END;

GO

CREATE TABLE dbo.Heap 

(

    id integer NOT NULL IDENTITY,

    val integer NOT NULL,

);

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Empty table

TRUNCATE TABLE dbo.Heap;

GO

-- Repeat exact same query (RT = 500 + 0.2 * 1000 = 700)

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

-- Add index

ALTER TABLE dbo.Heap ADD 

    CONSTRAINT [PK dbo.Heap id]

    PRIMARY KEY NONCLUSTERED (id);

GO

SELECT

    S.[name],

    S.auto_created,

    DDSP.stats_id,

    DDSP.last_updated,

    DDSP.[rows],

    DDSP.rows_sampled,

    DDSP.steps,

    DDSP.unfiltered_rows,

    DDSP.modification_counter

FROM sys.stats AS S

CROSS APPLY sys.dm_db_stats_properties(S.[object_id], S.stats_id) AS DDSP

WHERE 

    S.[object_id] = OBJECT_ID(N'dbo.Heap', N'U');

Output

I found that modifications continue to be tracked accurately on all non-empty duplicates, but only one statistic is updated automatically (regardless of the asynchronous setting).

Automatic statistics updates only occur when the query optimizer needs a particular statistic, and finds that it is out of date (an optimality-related recompile).

The optimizer chooses from duplicate statistics as mentioned in the Plan Caching and Recompilations in SQL Server 2012 paper:

An issue not directly related to the topic of this document is: given multiple statistics on the same set of columns in the same order, how does the query optimizer decide which ones to load during query optimization? The answer is not simple, but the query optimizer uses such guidelines as: Give preference to recent statistics over older statistics; Give preference to statistics computed using FULLSCAN option to those computed using sampling; and so on.

The point being that the optimizer chooses one of the available duplicate statistics (the "best one), and that one is automatically updated if found to be stale.

This is probably all at a level of detail below that which Microsoft are willing to support. This also means it could change without notice.

edited Feb 1 at 12:59

answered Feb 1 at 12:15

Paul White♦

54.2k14288461

add a comment |

I was able to reproduce this, both with an empty statistic, and a populated statistic. I arranged for an automatic statistic to be created on an empty table, and the index was created later:

IF OBJECT_ID(N'dbo.Heap', N'U') IS NOT NULL

BEGIN

    DROP TABLE dbo.Heap;

END;

GO

CREATE TABLE dbo.Heap 

(

    id integer NOT NULL IDENTITY,

    val integer NOT NULL,

);

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Empty table

TRUNCATE TABLE dbo.Heap;

GO

-- Repeat exact same query (RT = 500 + 0.2 * 1000 = 700)

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

-- Add index

ALTER TABLE dbo.Heap ADD 

    CONSTRAINT [PK dbo.Heap id]

    PRIMARY KEY NONCLUSTERED (id);

GO

SELECT

    S.[name],

    S.auto_created,

    DDSP.stats_id,

    DDSP.last_updated,

    DDSP.[rows],

    DDSP.rows_sampled,

    DDSP.steps,

    DDSP.unfiltered_rows,

    DDSP.modification_counter

FROM sys.stats AS S

CROSS APPLY sys.dm_db_stats_properties(S.[object_id], S.stats_id) AS DDSP

WHERE 

    S.[object_id] = OBJECT_ID(N'dbo.Heap', N'U');

Output

I found that modifications continue to be tracked accurately on all non-empty duplicates, but only one statistic is updated automatically (regardless of the asynchronous setting).

Automatic statistics updates only occur when the query optimizer needs a particular statistic, and finds that it is out of date (an optimality-related recompile).

The optimizer chooses from duplicate statistics as mentioned in the Plan Caching and Recompilations in SQL Server 2012 paper:

An issue not directly related to the topic of this document is: given multiple statistics on the same set of columns in the same order, how does the query optimizer decide which ones to load during query optimization? The answer is not simple, but the query optimizer uses such guidelines as: Give preference to recent statistics over older statistics; Give preference to statistics computed using FULLSCAN option to those computed using sampling; and so on.

The point being that the optimizer chooses one of the available duplicate statistics (the "best one), and that one is automatically updated if found to be stale.

This is probably all at a level of detail below that which Microsoft are willing to support. This also means it could change without notice.

edited Feb 1 at 12:59

answered Feb 1 at 12:15

Paul White♦

54.2k14288461

add a comment |

I was able to reproduce this, both with an empty statistic, and a populated statistic. I arranged for an automatic statistic to be created on an empty table, and the index was created later:

IF OBJECT_ID(N'dbo.Heap', N'U') IS NOT NULL

BEGIN

    DROP TABLE dbo.Heap;

END;

GO

CREATE TABLE dbo.Heap 

(

    id integer NOT NULL IDENTITY,

    val integer NOT NULL,

);

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Empty table

TRUNCATE TABLE dbo.Heap;

GO

-- Repeat exact same query (RT = 500 + 0.2 * 1000 = 700)

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

-- Add index

ALTER TABLE dbo.Heap ADD 

    CONSTRAINT [PK dbo.Heap id]

    PRIMARY KEY NONCLUSTERED (id);

GO

SELECT

    S.[name],

    S.auto_created,

    DDSP.stats_id,

    DDSP.last_updated,

    DDSP.[rows],

    DDSP.rows_sampled,

    DDSP.steps,

    DDSP.unfiltered_rows,

    DDSP.modification_counter

FROM sys.stats AS S

CROSS APPLY sys.dm_db_stats_properties(S.[object_id], S.stats_id) AS DDSP

WHERE 

    S.[object_id] = OBJECT_ID(N'dbo.Heap', N'U');

Output

I found that modifications continue to be tracked accurately on all non-empty duplicates, but only one statistic is updated automatically (regardless of the asynchronous setting).

Automatic statistics updates only occur when the query optimizer needs a particular statistic, and finds that it is out of date (an optimality-related recompile).

The optimizer chooses from duplicate statistics as mentioned in the Plan Caching and Recompilations in SQL Server 2012 paper:

An issue not directly related to the topic of this document is: given multiple statistics on the same set of columns in the same order, how does the query optimizer decide which ones to load during query optimization? The answer is not simple, but the query optimizer uses such guidelines as: Give preference to recent statistics over older statistics; Give preference to statistics computed using FULLSCAN option to those computed using sampling; and so on.

The point being that the optimizer chooses one of the available duplicate statistics (the "best one), and that one is automatically updated if found to be stale.

This is probably all at a level of detail below that which Microsoft are willing to support. This also means it could change without notice.

edited Feb 1 at 12:59

answered Feb 1 at 12:15

Paul White♦

54.2k14288461

I was able to reproduce this, both with an empty statistic, and a populated statistic. I arranged for an automatic statistic to be created on an empty table, and the index was created later:

IF OBJECT_ID(N'dbo.Heap', N'U') IS NOT NULL

BEGIN

    DROP TABLE dbo.Heap;

END;

GO

CREATE TABLE dbo.Heap 

(

    id integer NOT NULL IDENTITY,

    val integer NOT NULL,

);

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Empty table

TRUNCATE TABLE dbo.Heap;

GO

-- Repeat exact same query (RT = 500 + 0.2 * 1000 = 700)

GO

SELECT COUNT_BIG(*) 

FROM dbo.Heap AS H

JOIN dbo.Heap AS H2

    ON H2.id = H.id

WHERE H.id > 0

AND H2.id > 0;

GO

-- Add 1000 rows

INSERT dbo.Heap

    WITH (TABLOCKX)

    (val)

SELECT

    SV.number

FROM master.dbo.spt_values AS SV

WHERE

    SV.[type] = N'P'

    AND SV.number BETWEEN 1 AND 1000;

GO

-- Add index

ALTER TABLE dbo.Heap ADD 

    CONSTRAINT [PK dbo.Heap id]

    PRIMARY KEY NONCLUSTERED (id);

GO

SELECT

    S.[name],

    S.auto_created,

    DDSP.stats_id,

    DDSP.last_updated,

    DDSP.[rows],

    DDSP.rows_sampled,

    DDSP.steps,

    DDSP.unfiltered_rows,

    DDSP.modification_counter

FROM sys.stats AS S

CROSS APPLY sys.dm_db_stats_properties(S.[object_id], S.stats_id) AS DDSP

WHERE 

    S.[object_id] = OBJECT_ID(N'dbo.Heap', N'U');

Output

I found that modifications continue to be tracked accurately on all non-empty duplicates, but only one statistic is updated automatically (regardless of the asynchronous setting).

Automatic statistics updates only occur when the query optimizer needs a particular statistic, and finds that it is out of date (an optimality-related recompile).

The optimizer chooses from duplicate statistics as mentioned in the Plan Caching and Recompilations in SQL Server 2012 paper:

An issue not directly related to the topic of this document is: given multiple statistics on the same set of columns in the same order, how does the query optimizer decide which ones to load during query optimization? The answer is not simple, but the query optimizer uses such guidelines as: Give preference to recent statistics over older statistics; Give preference to statistics computed using FULLSCAN option to those computed using sampling; and so on.

The point being that the optimizer chooses one of the available duplicate statistics (the "best one), and that one is automatically updated if found to be stale.

This is probably all at a level of detail below that which Microsoft are willing to support. This also means it could change without notice.

edited Feb 1 at 12:59

answered Feb 1 at 12:15

Paul White♦

54.2k14288461

edited Feb 1 at 12:59

answered Feb 1 at 12:15

Paul White♦

54.2k14288461

answered Feb 1 at 12:15

Paul White♦

54.2k14288461

answered Feb 1 at 12:15

Paul White♦

54.2k14288461

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Database Administrators Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

Search This Blog

Ufyukyu