Tune Queries by Using Indexers – Troubleshoot Data Storage Processing-2

9 Jan, 2024

Tune Queries by Using Indexers – Troubleshoot Data Storage Processing-2

A disadvantage of clustered indexes is that the only queries that benefit from them are those that filter the query in the way the index is created. This means that only queries that have those four IDs in theWHERE clause will utilize the index; other queries will result in a table scan. One way to determine which indexes are needed for your data analytics is by knowing about all the queries and the data they are retrieving. For smaller applications, this might be possible; however, it would not be feasible in an enterprise with numerous teams and many databases. One way you can find out which queries are being run against tables and if they would benefit from an index is by using an indexer. An indexer is the tool or process by which data is indexed. A tool called Query Store is useful for SQL databases and is supported on Azure Synapse Analytics. It is not enabled by default. To enable it, execute the following SQL statement, replacing databaseName with the name of the dedicated SQL pool where you want this feature enabled:

ALTER DATABASE [databaseName]
SET QUERY_STORE = ON;

The information you can find using the data logged by Query Store includes queries with long execution times, queries that perform the most I/O transactions, queries with the highest wait durations, and comparisons between like queries that highlight downward performance trends. This data is pulled from dynamic management views (DMV), which hold information useful for identifying which queries on the database are latent and which queries would benefit from an index. As shown in Table 10.4, there are DMVs that recommend columns in a table that should exist in an index, and much more.

TABLE 10.4 Index‐related Dynamic Management Views

Name	View	Description
sys.query_store_query_text	Query Store	Stores the query command text
sys.query_store_query	Query Store	Stores the query command text and runtime execution statistics
sys.query_store_plan	Query Store	Links plans with queries
sys.query_store_runtime_stats	Query Store	Stores runtime statistics for queries
sys.query_store_wait_stats	Query Store	Contains query wait information
sys.dm_pdw_exec_requests	Execution	Lists all steps taken for recent or active queries
sys.dm_pdw_request_steps	Execution	Lists all steps taken for a given query
sys.dm_pdw_dms_workers	Execution	Contains information about workers that process DMS steps
sys.dm_pdw_sql_requests	Execution	Holds information about SQL queries
sys.indexes	Object	Lists the indexes on the database
sys.index_columns	Object	Lists indexed columns per index
sys.pdw_index_mappings	Synapse	Links the physical index name to the logical index used on Compute node
sys.dm_db_index_operational_stats	Index	Returns I/O, latching, access method, and locking activity
sys.dm_db_index_physical_stats	Index	Returns fragments and size information for indexes and data
sys.dm_db_index_usage_stats	Index	Lists the index types and usage frequency
sys.dm_db_missing_index_columns	Index	Identifies the columns that are missing an index
sys.dm_db_missing_index_details	Index	Renders details about missing indexes

The Query Store uses these DMVs behind the scenes and offers some built‐in capabilities, which can be graphically viewed in SQL Server Management Studio (SSMS). Figure 10.7 illustrates the output of one such report named Top Resource Consuming Queries.

FIGURE 10.7 Tuning queries with the Top Resource Consuming Queries report

The default reports can give you insights into the areas in your data analytics solution that consume the most amount of CPU, memory, and/or time. Once you have the top problem queries identified, you can use the plethora of DMVs from Table 10.4 to dig deeper into the query execution and determine if there is a need to create additional indexes or even re‐create the table with a clustered columnstore index. Performing the actual tuning depends on the queries you discovered from your analysis. Take the previous SQL query, which retrieves data from the READING table and has a clustered index like the following:

CLUSTERED INDEX ( READING_ID, SESSION_ID, ELECTRODE_ID, FREQUENCY_ID )

Then hypothetically consider that you found many queries that matched the following pattern:

SELECT READING_DATETIME, COUNT, VALUE
FROM READING
WHERE ELECTRODE_ID = 2 AND FREQUENCY_ID = 4

trygone.org

trygone.org

Tune Queries by Using Indexers – Troubleshoot Data Storage Processing-2

Leave a Reply Cancel reply