Optimize Resource Management – Troubleshoot Data Storage Processing-2
Even more important than managing costs is security, the most important aspect to consider when running any data analytics solution. You learned a lot about this in Chapter 8, “Keeping Data Safe and Secure,” which covered topics such as firewalls, private endpoints, and using Azure Key Vault to store secrets, keys, and certificates. Azure Advisor can provide the results of vulnerability scans, identify Azure products with exposed public endpoints, and identify products that would benefit from enabling firewall rules. Each recommendation is presented as a link that routes you to further details and instructions on how to implement it.
A data analytics solution that is not reliable adds minimal value to an organization. Reliability means that the data, and the system that ingests and transforms it, are available when needed. From an Azure storage account perspective, you learned about the redundancy options for storing your data: locally redundant storage (LRS), zone‐redundant storage (ZRS), geo‐redundant storage (GRS), and geo‐zone‐redundant storage (GZRS). These redundancy levels were introduced in Chapter 1, “Gaining the Azure Data Engineer Associate Certification,” and discussed in numerous chapters throughout the book. In addition to storage redundancy levels, an in‐depth review of business continuity and disaster recovery (BCDR) concepts and features was provided in Chapter 4, “The Storage of Data,” where redundancy, a key component of any reliable solution, was specifically called out from an Azure Synapse Analytics perspective. If you recall, you implemented some data redundancy features available for a dedicated SQL pool in Exercise 4.3. Some recommendations you might find in Azure Advisor for this pillar occur when your resources have an endpoint in only a single Azure region. In that case, in the rare scenario that the region becomes unavailable, so does your application. Hosting your mission‐critical applications in multiple Azure regions is recommended to avoid that scenario. Additional common recommendations include enabling soft delete and recovery points on the Azure products that support them, as well as any networking optimizations that apply to the provisioned Azure products.
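The single‐region check described above can be illustrated with a minimal sketch. It does not call Azure Advisor or any Azure API; the inventory list, the application names, and the `single_region_apps` function are hypothetical and exist only to show the logic of flagging an application whose endpoints all sit in one region.

```python
from collections import defaultdict


def single_region_apps(endpoints):
    """Return the sorted names of applications deployed to exactly one region.

    `endpoints` is an iterable of (application name, region) pairs.
    """
    regions_by_app = defaultdict(set)
    for app, region in endpoints:
        regions_by_app[app].add(region)
    return sorted(app for app, regions in regions_by_app.items() if len(regions) == 1)


# Hypothetical inventory: one (application, Azure region) pair per endpoint.
inventory = [
    ("ingest-api", "westeurope"),
    ("ingest-api", "northeurope"),
    ("reporting", "westeurope"),
]

# Applications listed here would become unavailable if their one region failed.
print(single_region_apps(inventory))
```

In this made‐up inventory only `reporting` is flagged, because `ingest-api` already has endpoints in two regions.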
Operational excellence recommendations have to do with automation, deployments, and monitoring, to name a few examples. Many Azure customers use a large number of Azure VMs and can therefore benefit from automation capabilities. Activities such as installing security patches, performing backups, or configuring a new VM can all be performed using scripts. These scripts typically work with JSON files that describe the configuration details for all dependent Azure resources required to build and configure the environment. These JSON‐structured configuration files are often used when deploying new applications, in what are known as Azure Resource Manager (ARM) deployments. Deployments using this approach can provision and configure entire architectures, including security, networking, compute, data storage, and application installation. These deployments can be very complex and demand exactly what the name of the pillar implies: operational excellence. Monitoring is another component of this pillar and was covered in detail in Chapter 9. From a monitoring perspective, the Azure Advisor report identifies which of your Azure products would benefit from enabling monitoring and configuring alerts based on defined performance thresholds.
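To make the idea of a JSON‐structured deployment file more concrete, the following sketch builds a very small ARM template as a Python dictionary and prints it as JSON. It is only an illustration: the `storage_account_template` function and the `storageAccountName` parameter name are assumptions made for this example, the API version shown is one of several that could be used, and a real deployment would normally describe many more resources.

```python
import json


def storage_account_template(sku_name="Standard_GRS"):
    """Build a minimal ARM template describing a single storage account.

    The function name and the parameter name `storageAccountName` are
    illustrative choices, not part of any Azure SDK.
    """
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            "storageAccountName": {"type": "string"},
        },
        "resources": [
            {
                "type": "Microsoft.Storage/storageAccounts",
                "apiVersion": "2022-09-01",
                "name": "[parameters('storageAccountName')]",
                "location": "[resourceGroup().location]",
                # The SKU selects the redundancy level, for example LRS, ZRS, GRS, or GZRS.
                "sku": {"name": sku_name},
                "kind": "StorageV2",
                "properties": {"supportsHttpsTrafficOnly": True},
            }
        ],
    }


# Serialize the template so it could be saved to a file and deployed.
print(json.dumps(storage_account_template(), indent=2))
```

Keeping the template in a file under source control is what makes a deployment repeatable: the same description can be applied to a test environment and then to production.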
Finally, performance efficiency concerns the ability of your system to adapt to an increase or decrease in usage demand. This pillar is primarily about scalability, which includes not only providing additional capacity when required but also deallocating capacity when it is no longer needed. Not all Azure products can scale automatically; therefore, some capacity planning and ongoing performance monitoring are necessary to maintain the optimal level of performance. There is more detail about scaling later in this chapter with regard to Azure Batch, Azure Stream Analytics, and Azure Synapse Analytics pipelines.
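The two halves of scalability, adding capacity under load and releasing it when idle, can be sketched as a simple threshold rule. This is not how any particular Azure product implements autoscaling; the `desired_capacity` function, the utilization thresholds, and the capacity limits are all hypothetical values chosen for illustration.

```python
def desired_capacity(current, utilization, low=0.30, high=0.75, minimum=1, maximum=10):
    """Return the capacity (number of units) to run next.

    Scale out by one unit when utilization exceeds `high`, scale in by one
    unit when it drops below `low`, and otherwise keep the current capacity.
    The result is always clamped to the range [minimum, maximum].
    """
    if utilization > high:
        target = current + 1
    elif utilization < low:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


# Hypothetical readings: (current capacity, observed utilization from 0.0 to 1.0).
for units, load in [(2, 0.90), (2, 0.10), (4, 0.50)]:
    print(units, load, desired_capacity(units, load))
```

For products that cannot scale automatically, a rule like this is what you would apply by hand, using the utilization figures gathered through ongoing performance monitoring.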