Bienvenue à Blogs CodeS-SourceS Identification | Inscription | Aide

SQL Server: Data access strategies

SQL Server can use different data access strategies when retrieving rows from the table. The strategy that will be used depends on the columns of the table, the available indexes, the query, the data in the table, and the statistics. There are 7 basic data access strategies.

Clustered index seek

A clustered index seek can only happen on a clustered table. If SQL Server identifies a seek predicate usable in the clustered index, it can perform a clustered index seek. The ordering of the clustered index is leveraged to scan only the rows satisfying the seek predicate.

This is the most efficient data access strategy.

Clustered index scan and table scan

They are both the same operation. It is called a clustered index scan when it is performed on a clustered table and table scan when it is performed on a heap. The whole table is scanned, and only the rows satisfying the query condition are returned.

A full table scan is a good strategy when the number of rows returned is big, and the query is not selective.

Nonclustered index scan + key lookup or RID lookup

When SQL Server cannot identify a seek predicate for the clustered index, but it can identify one for a nonclustered index, it will try to use it. It will perform an index seek in the nonclustered index to find all the rows qualifying for the seek predicate. Then, for every row found in the index, it will perform a lookup into the actual data table to retrieve all the columns needed by the query, and not present in the index. SQL Server uses the row locator in the index rows to locate the corresponding data rows.

The cost for that operation is always at least an index seek in the nonclustered index. Then:

  • If the table is a clustered table, the row locator is a clustering key. For every row returned by the nonclustered index seek, the corresponding clustering key is used to perform a clustered index seek (key lookup) and retrieve all the columns needed. So there will be as many clustered index seek as the number of rows returned by the nonclustered index seek.
  • If the table is a heap, the row locator is a row ID. For every row returned by the nonclustered index seek, the corresponding row ID is used to access directly the data row in the table (RID lookup). This counts as a single logical IO. However, if the row has been forwarded (forwarded record), SQL Server might have to perform a second logical IO to read the forwarded row (a second logical IO).

This data access strategy is efficient if the number of rows returned is small. However, if the nonclustered index scan returns a large number of rows, there will be a lot of lookups. Lookups are random IOs, so they are very slow.

The query optimizer will rather choose a full table scan (clustered index scan or table scan) if the number of rows exceeds few percents of the whole table. More logical IO will be performed (more pages read), but they will be sequential IOs, so they will be faster than random IOs.

Covering nonclustered index seeks

When a nonclustered index seek is performed, if all the columns needed by the query are stored in the nonclustered index (columns of the nonclustered key for instance), the additional lookups won’t be needed.

The cost of that operation is just a single index seek. It is a very fast operation.

In the case of a query returning a large number of rows, this strategy is much more efficient than a full table scan or a nonclustered index scan + lookup.

Covering nonclustered index scan

When SQL Server cannot identify any seek predicate for any index in a query, the full table scan is not always the only option. If a nonclustered index contains all the columns needed by the query, SQL Server will choose to scan this nonclustered index rather than the table.

Nonclustered indexes contain a subset of the columns of the table, so a row in a nonclustered index will be smaller than a row in the table, so every leaf page of the nonclustered index will contain more rows than a leaf page in the table structure, containing all the columns. Scanning the entire leaf level of a nonclustered index will require less logical IOs than scanning the entire level of the table structure. Therefore, it is faster to scan a nonclustered index than the table (heap of clustered table), but it can only be done if a nonclustered index is covering the query.

Publié mercredi 7 janvier 2009 11:58 par RaptorXP
Classé sous : ,
Ce post vous a plu ? Ajoutez le dans vos favoris pour ne pas perdre de temps à le retrouver le jour où vous en aurez besoin :

Commentaires

Pas de commentaires
Les commentaires anonymes sont désactivés

Les 10 derniers blogs postés

- Parution de mon livre sur WPF 4 par Perspective le il y a 1 heure et 42 minutes

- EDM : comment utiliser l’Horizontal Entity Splitting par Matthieu MEZIL le il y a 14 heures et 49 minutes

- [WP7Dev][Reactive] Rendre les Reactive Extensions Plus Stables par Jerome Laban le 09-08-2010, 02:24

- [SharePoint 2010] [Visio] Manipulation n°1 : Comment générer automatiquement la carte d’un site Web avec Visio ? par Le blog de Patrick le 09-07-2010, 14:12

- WinDbg / SOS / PSSCOR2 : Failed to load data access DLL (mscordacwks) par CoqBlog le 09-06-2010, 22:29

- Perspective 2.0 : version finale par Perspective le 09-06-2010, 19:42

- SharePoint 2010 : Comparaison entre la version 2007 et la version 2010 par Philippe Sentenac [MVP SharePoint] le 09-06-2010, 12:00

- Utilisation de la réplication SQL dans le code .NET d'une application mobile - Implémentation & Conseils : PARTIE 1/3 (classe SqlCeReplication & premi... par Le Blog de Pi-R (Pierre Cambier) le 09-06-2010, 08:37

- Quelques trucs intéressants (05/09/2010) par CoqBlog le 09-05-2010, 14:53

- EnumerableCollection par Matthieu MEZIL le 09-03-2010, 23:23