Develop a Stream Processing Solution – Monitoring Azure Data Storage and Processing
Chapter 7, “Design and Implement a Data Stream Processing Solution,” covered stream processing in detail. The data streaming content in this chapter needed the context of logging and monitoring capabilities before it could be covered. At this point you should have a sophisticated understanding of the monitoring features that exist specifically for Azure Stream Analytics.
Monitor for Performance and Functional Regressions
Most of what you need to know from this section’s perspective is covered in the section “Configure Monitoring Services.” Figure 9.18 and Figure 9.19 illustrate the job diagram and available metrics, respectively. This section provides a detailed look at the available Azure Stream Analytics metrics (see Table 9.9).
TABLE 9.9 Azure Stream Analytics metrics
| Name | Description |
| --- | --- |
| Backlogged Input Events | The number of incoming event messages queued and waiting to be processed |
| CPU % Utilization | The percentage of CPU utilized by the job |
| Data Conversion Errors | The number of data conversion errors (refer to Figure 7.51) |
| Early Input Events | Event messages whose application timestamp is earlier than their arrival time by more than 5 minutes |
| Failed Function Requests | The sum of failed Azure Machine Learning function calls |
| Function Events | The sum of events sent to an Azure Machine Learning function |
| Function Requests | The sum of calls to an Azure Machine Learning function |
| Input Deserialization Errors | The number of input messages that could not be deserialized |
| Input Event Bytes | The amount of data the job receives, in bytes |
| Input Events | The sum of records deserialized from the input events |
| Input Sources Received | The number of messages the job receives |
| Late Input Events | The number of events received outside the configured late arrival tolerance window (refer to Figure 7.43) |
| Out of order Events | The sum of events received out of order that fall outside the configured out-of-order tolerance window (refer to Figure 7.43) |
| Output Events | The sum of event messages sent to output targets |
| Runtime Errors | The sum of errors that occur during query processing |
| SU (Memory) % Utilization | The percentage of streaming unit (SU) memory utilized by the job |
| Watermark Delay | The Avg, Min, or Max watermark delay across all job outputs |
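Two of these metrics, Out of order Events and Watermark Delay, are easiest to reason about with a concrete model. The Python sketch below is a simplified, conceptual model only (not Azure Stream Analytics' exact internal algorithm, and the function names are hypothetical): the watermark trails the largest event time seen by the configured out-of-order tolerance, and Watermark Delay measures how far that watermark lags behind wall-clock time.

```python
from datetime import datetime, timedelta

def watermark(max_event_time: datetime,
              out_of_order_tolerance: timedelta) -> datetime:
    # The watermark trails the largest event time observed by the configured
    # out-of-order tolerance; events stamped behind it count as out of order.
    return max_event_time - out_of_order_tolerance

def watermark_delay(wall_clock: datetime,
                    current_watermark: datetime) -> timedelta:
    # Watermark Delay: how far the watermark lags wall-clock time. In a
    # healthy job this value stays at or near zero.
    return wall_clock - current_watermark

now = datetime(2024, 1, 1, 12, 0, 0)
wm = watermark(max_event_time=datetime(2024, 1, 1, 11, 59, 55),
               out_of_order_tolerance=timedelta(seconds=5))
print(wm)                        # 2024-01-01 11:59:50
print(watermark_delay(now, wm))  # 0:00:10
```

In this toy model the delay grows whenever the job stops seeing fresh event times, which is exactly the symptom the Watermark Delay metric is meant to surface.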
Any number greater than zero for Backlogged Input Events means that your job cannot process events quickly enough: the volume and frequency of incoming event messages are too great for the compute allocated to the job. If you notice this in your metrics, you need to add more SUs to your job. An average CPU % Utilization above 90 percent is not, by itself, an indication that more SUs need to be added; use the CPU % Utilization metric in combination with the Backlogged Input Events and Watermark Delay metrics to determine whether CPU is the bottleneck.

The Early Input Events, Late Input Events, and Out of order Events metrics are governed by the settings on the Event Ordering blade in the Azure portal for the given Azure Stream Analytics job (refer to Figure 7.43). The ordering of incoming event messages is discussed in Chapter 7, in the section “Configure Checkpoints/Watermarking During Processing,” and illustrated in Figure 7.41. The Input Sources Received metric represents the number of event messages received. As you learned in Chapter 7, an event hub message is sent within an EventData object; each EventData object is counted as one received event message.

If SU (Memory) % Utilization remains near or above 80 percent while both Watermark Delay and Backlogged Input Events are rising, you should consider increasing the number of SUs allocated to the job. As mentioned in Chapter 7, the Watermark Delay should be zero. Any average value other than zero means that there are delays in processing the event messages, and a value consistently greater than zero is a good indication that more SUs are needed.
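The guidance above can be condensed into a simple decision rule. The following sketch is a hedged illustration, not an official scaling algorithm; the function name and signature are hypothetical, and the 80 percent memory and 90 percent CPU thresholds come from the text.

```python
def needs_more_sus(backlogged_events: float,
                   watermark_delay_avg: float,
                   su_memory_pct: float,
                   cpu_pct: float) -> bool:
    # Heuristic from the text: any backlog means the job cannot keep up;
    # memory pressure combined with a nonzero watermark delay confirms it.
    # High CPU alone is NOT treated as a signal to scale.
    if backlogged_events > 0:
        return True   # job cannot drain its input queue
    if watermark_delay_avg > 0 and su_memory_pct >= 80:
        return True   # memory pressure plus processing delay
    if cpu_pct > 90 and watermark_delay_avg > 0:
        return True   # CPU is the bottleneck only alongside other signals
    return False

print(needs_more_sus(backlogged_events=120, watermark_delay_avg=0,
                     su_memory_pct=40, cpu_pct=50))   # True
print(needs_more_sus(backlogged_events=0, watermark_delay_avg=0,
                     su_memory_pct=30, cpu_pct=95))   # False
```

Note that the second call returns False even at 95 percent CPU, mirroring the point that CPU % Utilization is only meaningful in combination with the backlog and watermark metrics.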