How To Visualize Clustered and Unclustered Index In SQL
Working with SQL databases forces you to come across primarily the clustered and the unclustered indexes. These are handy features that if used wisely can greatly enhance the speed of execution of the listed queries. But it can be hard to grasp their fundamentals, especially if you want to picture it in your head.
In this blog post, I explain what clustered and unclustered
indexes are, why they are of importance and most important how they can be
explained without the necessity of explaining all the details.
What is a Clustered Index?
A clustered index specifies how the data stored in a table
is sorted out in relation to the other fields in that table. Let’s consider, as
an example, a to-do list that you create and find that it has automatically
been arranged in a certain order. When you apply a clustered index on a unique
key The control column such as SQL Server will automatically add rows of that
table in accordance with the control column. Since the book-placement is done
according to the control column, there will be shorter response time when
querying a column that has been indexed.
To illustrate this better, let us picture a case where there
is a mini library. As for example the books are placed in order of the ISBN
number, then retrieving a particular book basing its retrieval on the ISBN
number will be much easier. In this regard, the ISBN can be referred to as the
clustered index.
For more detailed information on clustered indexes, check
out the official Microsoft SQL Serverdocumentation.
Diagram 1: Visualizing a Clustered Index
Let’s visualize this. Picture a table as a block of rows:
| ID (Clustered Index) | Name | Age |
|----------------------|---------|-----|
| 1
| John
| 34 |
| 2
| Sarah
| 28 |
| 3
| David | 45
|
In this table, the ID column has a clustered index, meaning that the data is physically stored in order based on the ID. As new rows are inserted, SQL Server arranges them so that the ID column stays sorted.
This means every time you query the table by ID, the
database doesn’t need to look all over the place—it goes straight to the right
block.
What is an Unclustered Index?
An unclustered index is most likely considered as some kind
of pointer. It does not cause any changes in the logical structure of the data
is located and stored in the table, but rather establishes an individual
structure that allows fast searching for information contained in the primary
structure. Imagine a table of contents in a book. As the name suggests, the
table of contents contains pages that summarize the contents. The contents
themselves of course are still spread throughout the various pages. But at
least from the index, you know where you are at least a page away from what you
want.
Following the same analogy of a library, if you wish to know
how a book is structured and wish to find a book by author’s name instead of
looking up by an ISBN then unclustered index works the same way as a card
catalog that will give you information of where the book is located.
For additional understanding of the unclustered indexes functionality see this article about Stack Overflow’s Indexing Guide.
Although the laws themselves are still sorted by the clustered index (ID), with the help of the unclustered index, SQL Server can swiftly search for rows and return them based on the Name column. Instead, it keeps track of the relevant rows located throughout the main table, thereby increasing the performance of queries that requires the Name.
Diagram 2: Visualizing an Unclustered Index
Now, let’s add an unclustered index on the Name column:
| ID | Name (Unclustered Index) | Age |
|----|--------------------------|-----|
| 1 | John | 34 |
| 2 | Sarah | 28 |
| 3 | David | 45 |
The unclustered index might look something like this:
| Name (Unclustered Index) | Pointer to Row |
|--------------------------|----------------|
| David
| Row 3 |
| John
| Row 1 |
| Sarah
| Row 2 |
Even though the rows themselves are still ordered by the clustered index (ID), the unclustered index lets SQL Server quickly locate rows based on the Name column. It stores a reference (or pointer) to the actual rows in the table, improving the efficiency of queries involving the Name.
Why Should You Care?
We need to be quick and hence indexes can do wonders with
the speed for any of your database operations. Even though there are no
indexes, SQL Server still can find what it is looking for but, it might have to
check each row in the table. An appropriate index allows SQL Server to skip
right to the necessary rows and reduce the time onioniq.com spent on the given
query.
However, it is a double-edged sword. Though indexes can help
in speeding up SELECT queries, they tend to affect the performance of INSERT,
UPDATE and DELETE queries negatively. This is owing to the fact that SQL Server
has to update the index structures as the data is being inserted.
To dive deeper into the pros and cons of indexes, you can visit this SQL Performance TuningGuide
Visualizing How Queries Use Indexes
The queries that one runs in SQL Server will utilize the indexes available in order to do data retrieval in the fastest of ways. For instance, if one questions a particular column that contains a clustered index, then he or she does not have to worry since the data will be accessed in a sorted manner. On the other hand, if a column with an unclustered index is queried, an index is used to retrieve the row pointers before the data is retrieved.
Example:
Let’s say you run this query:
SELECT FROM Employees WHERE Name = 'John';
Taking the Name column for example where there could be an unclustered index, First it will locate the name using the index, find the pointer of John and arrive at the full row. This whole exercise of course is much faster and easier than searching the entire table.
On the other hand, if you query by the ID column (the
clustered index), SQL Server goes directly to the relevant row without needing
a lookup.
You might want to Check out Must Know commands in SQL
Creating Indexes in SQL
Here’s how you would create both types of indexes in SQL:
1. Clustered Index:
CREATE CLUSTERED INDEX IX_Employees_ID ON Employees(ID);
This creates a clustered index on the ID column.
2. Unclustered Index:
CREATE NONCLUSTERED INDEX IX_Employees_Name ON
Employees(Name);
This creates an unclustered index on the Name column.
For more details on index creation and management, you can visit SQL Server’s Create Indexdocumentation.
When Should You Use a Clustered or an Unclustered Index?
A clustered index is the only type that can be created for a
particular table, since there can only ever be one way of ordering data. A
clustered index is most appropriate for columns that are used to search or for
sorting purposes, mostly the primary key.
There can be an unlimited number of unclustered indexes for
a single table. These are helpful in cases for columns that are often searched
but are not good candidates for a clustered index. For example, these
unclustered fields could be a better option for the use of unclustered
indexes-taking the name or email field as examples.
You might want to check out Loop through Select Result
Final thoughts
Knowing how each of the clustered or the unclustered index
performs in general will help a lot in improving the performance of your
database. When you are able to visualize the way data is arranged, it is more
likely that you will make better decisions concerning when and where to use
indexes.
Indexes are one of the indispensable arms in the SQL arsenal
of any developer, but as all arms are prone to abuse, so is the Index. There
are too many drawbacks of having many – excessive – indexes, a prime one being
they gain write performance while simultaneously boosting write performance.
With this, you will deal with several aspects of index
management and optimization, which form part of crucial skills in ensuring that
you produce efficient SQL queries in a shorter span.
Comments
Post a Comment