How does a SQL covering index work?

I know that when a query uses a covering index, SQL Server's execution plan shows only an Index Seek (NonClustered) or Index Scan (NonClustered) operator, with no lookup operator to retrieve the data. But how is it possible to avoid looking up values in the clustered index? A nonclustered index doesn't store the data rows at its leaf level, so regardless of how many columns it contains, it has to ask the clustered index to return the data rows, and that should appear as a lookup operator in the execution plan. Am I right? I've read https://www.red-gate.com/simple-talk/sql/learn-sql-server/using-covering-indexes-to-improve-query-performance/ and other articles, but none of them explain this.

Best answer

Any index, at all levels, stores the values for the column(s) defining that index's key. In addition, at the leaf level of a non-clustered index, the leaves store the values for any additional column(s) which are part of the clustered index key [1] and are not part of the non-clustered index key, because those values are what a subsequent clustered index lookup uses to locate the row.

If the only column(s) that the query needs to retrieve are part of either the non-clustered index key or the clustered index key, then we've already obtained all of those column values by navigating the non-clustered index, so no lookup operator is needed in the plan.
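To see this in a query plan, here is a minimal sketch. It uses SQLite through Python's sqlite3 module rather than SQL Server, purely because it runs anywhere; the table `person`, the index `ix_person_name`, and the helper `plan_for` are hypothetical names. In SQLite an `INTEGER PRIMARY KEY` column is the rowid that the table itself is ordered by, and every secondary index entry carries that rowid, which loosely mirrors a non-clustered index leaf carrying the clustered key. The plan wording is SQLite's, not SQL Server's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- id is the rowid: the key the table itself is ordered by
    CREATE TABLE person (
        id         INTEGER PRIMARY KEY,
        surname    TEXT,
        firstname  TEXT,
        occupation TEXT
    );
    -- secondary index; each entry also carries the rowid (id)
    CREATE INDEX ix_person_name ON person (surname, firstname);
    INSERT INTO person (surname, firstname, occupation) VALUES
        ('Radish', 'Ann', 'Baker'),
        ('Radish', 'Bob', 'Clerk'),
        ('Turnip', 'Cy',  'Diver');
""")

def plan_for(sql):
    """Return the detail text of SQLite's query plan for one statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each plan row is the human-readable detail.
    return " | ".join(row[3] for row in rows)

# Needs only index key columns plus the table key: answerable from the index.
covered_plan = plan_for("SELECT id, firstname FROM person WHERE surname = 'Radish'")

# Needs occupation, which is stored only in the table rows.
lookup_plan = plan_for("SELECT occupation FROM person WHERE surname = 'Radish'")

print(covered_plan)
print(lookup_plan)
```

The first plan should report a covering index, because `id` and `firstname` are both present in the index entries. The second should use the same index without the word "covering", because each match still requires a visit to the table row to fetch `occupation`.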

Queries, in general, are not trying to retrieve rows, only row values from particular columns.


As an analogy, consider that you're running a census for an entire town and are storing all of the data on physical cards. These cards contain each person's name, address, date of birth, current occupation, etc. Assume further that every individual has a unique address, so you decide to store all of these cards in address order in a big box file. This is your clustered index.

You frequently want to locate people based on their names. So you create another set of index cards that tell you, for any particular combination of surname and first name, all of the addresses at which someone with that name resides. You put these cards in a second box file and sort them by surname, then first name. This is your non-clustered index.

Finally, suppose your task is to identify the streets on which people with the surname Radish live. You can obviously use your non-clustered index to identify all of the people with the surname Radish. But remember, the cards in this secondary index give you the addresses for these people. If our only task is to identify their streets, we already have that information at hand. There's no need for us to go and look up all of the original census cards, containing all kinds of information that we've not been asked for, just to complete this query.
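The two card boxes can be modelled directly. This is only an illustrative sketch of the analogy in plain Python, not how SQL Server stores pages; the sample people, addresses, and function names are made up for the example.

```python
from bisect import bisect_left

# Box 1 ("clustered index"): full census cards, keyed by unique address.
census_cards = {
    "1 Elm Street": {"surname": "Radish", "firstname": "Ann", "occupation": "Baker"},
    "2 Oak Avenue": {"surname": "Turnip", "firstname": "Cy", "occupation": "Diver"},
    "7 Elm Street": {"surname": "Radish", "firstname": "Bob", "occupation": "Clerk"},
}

# Box 2 ("non-clustered index"): small cards sorted by (surname, firstname),
# each carrying the address that locates the full card in box 1.
name_cards = sorted(
    (card["surname"], card["firstname"], address)
    for address, card in census_cards.items()
)

def addresses_for_surname(surname):
    """Seek to the first card with this surname, then read forward."""
    start = bisect_left(name_cards, (surname,))
    found = []
    for card_surname, _firstname, address in name_cards[start:]:
        if card_surname != surname:
            break
        found.append(address)
    return found

def street_of(address):
    """Drop the house number, e.g. '1 Elm Street' -> 'Elm Street'."""
    return address.split(" ", 1)[1]

# Covered query: streets come straight from box 2; box 1 is never opened.
radish_streets = sorted({street_of(a) for a in addresses_for_surname("Radish")})

# Non-covered query: occupation is only on the full cards, so each match
# costs one extra lookup into box 1.
radish_jobs = [census_cards[a]["occupation"] for a in addresses_for_surname("Radish")]

print(radish_streets)
print(radish_jobs)
```

The street query never touches `census_cards`, which is the situation a covering index creates. The occupation query has to index into `census_cards` once per matching person, which corresponds to the lookup operator in an execution plan.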


[1] And, since SQL Server 2005, any additional columns named in an INCLUDE clause in the index definition.
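The effect of storing extra columns in the index can be sketched as follows. SQLite, used here through Python's sqlite3 module only because it is easy to run, has no INCLUDE clause, so this example substitutes a different technique: it appends the extra column to the index key. That is not equivalent to INCLUDE (a key column is part of the index ordering and appears at every level, whereas an included column is stored only at the leaf level), but it shows the same covering effect. All table, index, and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "id INTEGER PRIMARY KEY, surname TEXT, firstname TEXT, occupation TEXT)"
)

query = "SELECT occupation FROM person WHERE surname = 'Radish'"

def plan_detail(sql):
    """Return the detail text of SQLite's query plan for one statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Index on the name columns only: occupation must be fetched from the table.
conn.execute("CREATE INDEX ix_name ON person (surname, firstname)")
plan_before = plan_detail(query)

# Rebuild the index with occupation appended (stand-in for INCLUDE).
conn.execute("DROP INDEX ix_name")
conn.execute("CREATE INDEX ix_name_occ ON person (surname, firstname, occupation)")
plan_after = plan_detail(query)

print(plan_before)
print(plan_after)
```

Once `occupation` is stored in the index entries, the same query should be reported as using a covering index, with no visit to the table rows.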
