Skip to main content

When to use SQL server indexes?

SQL Indexes

Indexes are objects based on data in tables. Unlike views, indexes do indeed require physical storage. The SQL indexes are used to retrieve data super faster from database. By the end of this splash reading, you will understand the following:

  • How to create indexes
  • How to create composite indexes
  • What indexes do
  • When indexes should be used
  • When it is better not to use an index
An index is a way of presenting data differently than the way it appears on the disk. Special types of indexes reorder the record's physical location within a table. Indexes can be created on a column within a table or on a combination of columns within a table. When a index is used, the data is presented to the user in a sorted order, which you can control with the CREATE INDEX statement . You can usually gain substantial performance improvements by indexing on the correct fields, particularly fields that are being joined between tables.

Using Indexes Effectively 

Another way, besides views, to present data in a different format than it physically exists on the disk is to use an index. In addition, indexes can also reorder the data stored on the disk (something that views cannot do).
Indexes are used in a SQL database for four supreme reasons:
  • To enforce referential integrity constraints by using the UNIQUE or PRIMARY KEY keywords.
  • To facilitate the ordering of data based on the contents of the indexes field or fields 
  • To optimize the execution speed of queries
  • To sort and store the data rows in the table or view based on their key values using Clustered Indexes. 

What are Indexes?

Data can be retrieved from a database using two methods. The first method, often called the Sequential Access Method, requires SQL to go through each record looking for a match. This search method is inefficient, but is the only way for SQL to locate the correct record. Think back to the days when libraries had massive card catalog filing systems. Suppose the librarian removed the alphabetical index cards, tossed the cards into the air, and then placed them back into the filing cabinets.

When you wanted to look up this book's shelf location, you would probably start start at the very beginning, and then go through one card at a time until you found the information you wanted. Now suppose the librarian sorted the book titles alphabetically. You could quickly access this books information by using the knowledge of the alphabet to move through the catalog.

Imagine the flexibility if the librarian was diligent enough to not only sort the books by title, but also create another catalog sorted by authors name and another sorted by topic. This process would provide you, the library user, with a great deal of flexibility in retrieving information. Also, you would be able to retrieve you information in a fraction of the time it originally would have taken.

Adding indexes to you database enables SQL to use the Direct Access Method. SQL uses tree like structure to store and retrieve indexes data. Pointers to a group of data are stored at the top of the tree. These groups are called nodes. Each node contains pointers to other nodes. The nodes pointing to the left contain values that are less than its parent node. The pointers to the right point to values greater than the parent node.
The basic SQL syntax to create an index is as follows:
CREATE INDEX index_name
ON table_name(column1,column2,......);
For instance, to create an index on the BOOK_ID field of the BOOKS table, the CREATE INDEX statement would look like this:
CREATE INDEX ID_INDEX
ON BOOKS(BOOK_ID);
If you want to remove the Index that you have created, The DROP INDEX statement removes an INDEX  from the database. As usual, the DROP INDEX statement is very straightforward:
DROP INDEX ID_INDEX;

Now the BOOKS table is in its original form. Using the simplest form of the CREATE INDEX statement did not physically change the way the table was stored.

You may be wondering why database systems even provide indexes of they also enable you to use the ORDER BY clause. The difference is that an ORDER BY clause re-sorts  and orders the data each time you execute the corresponding SQL statement. When using an index, the database system creates a physical index object and reuses the same index each time you query the table.

How to Index on more than One Field?

SQL also enables you to index on more than one field. This type of index is a composite index. The following code illustrates a simple composite index. Note that even though two fields are being combined, only one physical index in created.
CREATE INDEX ID_BOOK_ID
ON BOOKS(BOOK_ID, PRICE);
You can achieve performance gains by selecting the column with the most unique values. For instance, every value in the NAME field of the BOOKS table is unique. When using a compound index, place the most selective field first in the column list. That is, place the field that you expect to select most often at the beginning of the list.
Composite indexes are also used to combine two or more columns that, by themselves, may have low selectivity.

Using Clustered Indexes

A special type of index supported by many database systems allows the database manager or developer to cluster data. When tables are clustered, the data is stored in the same data blocks, allowing fewer database block reads and resulting in quicker performance. When a clustered index is used, the physical arrangement of  the data within a table is modified. Using a clustered index usually results in faster data retrieval than using a traditional, non clustered index. However, many database systems allow only one clustered index per table.

The field used to create the clustered index is usually the primary key field. Using Transact-SQL, you could create a clustered, unique index on the BOOK_ID field of the BOOK_LISTS table using the following syntax:
CREATE UNIQUE CLUSTERED INDEX 
Id_Index
ON BOOK_LISTS(BOOK_ID);
You should add tables only to clusters that are frequently joined. Do not add tables to clusters that are accessed individually  through a simple select statement.

Clusters are a vendor-specific feature of SQL. To know more about Clustered Index, you can go through the Microsoft documentation here.

Comments

Popular posts from this blog

FORCE ORDER in SQL Server

The FORCE ORDER is a query hint it executes the order of the tables exactly specified in a statement. When we use this query hint in a statement it will tell SQL server not to change the order of the joins in the query. Basically, the SQL server rearrange your joins to be in the order that it thinks it will be optimal for your query to execute. Now, Lets see the execution plan without the FORCE ORDER: The above execution plan demonstrates the optimal order of the joins returned by the SQL server. As you can see the order starts from the sales details and goes by bank details to employee details. Suppose if you don't want the SQL server to change the order of the joins in a query you can use FORCE ORDER to stop the default ordering. The syntax for FORCE ORDER query hint is, OPTION ( FORCE ORDER ); Now, Lets see the execution plan with the FORCE ORDER: select * from tbl_EmployeeDetails as e inner join tbl_BankDetails as b on e.Id=b.EmpID inner join tbl_S

How to create comma separated values in SQL server?

Comma Separated Values in SQL Predominantly in reporting, you may gone through a situation where you need to convert a comma separated values into list of rows or to convert a list of rows into single column to display in a report. In this splash reading, you will understand the following: How to convert comma separated data in a column to multiple rows How to convert multiple rows into one comma separated values Using COALESCE function Using STUFF function Convert comma separated data in a column to multiple rows Here we have an Customer table with a list of products bought by each customer, ID CUSTOMER PRODUCTS 1 Stuart Chain Saw,Circular Saw 2 Michael Drill,Hammer 3 Jonathan Sticky Notes,Mouse 4 Nabeel Mobile,Headset Suppose you want to return this as a single table, list of products bought by each customer. we need to create a function that splits our comma separated col

OPTION (MERGE JOIN) in SQL Server

OPTION (MERGE JOIN) in SQL Server The MERGE JOIN query hint is a best available join algorithm in SQL server. It is based on first sorting  both data sets according to the join conditions and then traversing through the sorted data sets and finding matches. The MERGE JOIN itself is very fast, but it can be an expensive choice if sort operations are required. This produces the best optimal execution plan. The syntax is, OPTION ( MERGE JOIN ); Now, Lets see how to use MERGE JOIN with SELECT statement.? select * from tbl_EmployeeDetails as e inner join tbl_BankDetails as b on b.EmpId=e.Id OPTION(MERGE JOIN); The statement returned optimal execution plan is, The MERGE JOIN operator gets a row from each input and compares them. In above statement, it merges all joins and first sorting data sets and then traversing through the sorted data sets and finding matches, if they are equal the rows are returned. If they are not equal, the lower-value row is removed

How to manipulate JSON data in SQL server?

Manipulate JSON data in SQL Server JavaScript Object Notation (JSON) is a lightweight popular data exchangeable format used across modern IoT platforms, web and mobile applications etc. It is a language independent textual data format. JavaScript Object Notation (JSON) is also used for storing unstructured data in log files or NoSQL Databases such as Microsoft Azure Cosmos DB. Many RESTful web services that allow us to store and retrieve JSON formatted texts among different protocols using different Endpoints. The example of JSON text is as follows, [{         "customer":"Michael",         "products":["Watch","Mobile","Books"] }, {         "customer":"Stuart",         "products":["Laptop","Keyboard","Mouse"] } ] In this article you will understand the following, Read JSON data from table Modify JSON data in a table Validate JSON objects 

OPTION Loop Join in SQL Server

The OPTION ( LOOP JOIN ) would enforce LOOP JOIN across all joins in the query. Using the OPTION ( LOOP JOIN ) appears to allows the query optimizer to join the tables using the nested loops in which ever order SQL server decides it is optimal. The OPTION clause must be a last clause in a statement. The syntax is, OPTION ( LOOP JOIN ); Now, Lets see how to use LOOP JOIN with SELECT statement.? select *from tbl_EmployeeDetails as e inner join tbl_BankDetails as b on b.EmpID=e.Id inner join tbl_SalesDetails as s on s.BankId=b.Id OPTION(LOOP JOIN); The statement returned execution plan is, The above statement loops through all the joins that starts from sales details and goes by bank details to employee details in which the order that SQL server thinks it is optimal. This does not follow the order of joins that we specify. Suppose, if you want the SQL server to follow the order of joins that you specify you have to use FORCE ORDER in OPTION clause. Howe