By: Phillip Sharpless

 
 

Introduction to Azure Databricks

Finding the right tools to manage your big data ecosystem can be a daunting task: there are a myriad of options, all advertising impressive-sounding features. One analytics platform that is well worth a look is Azure Databricks. In this blog post, we’ll give a quick overview of Azure Databricks and briefly discuss ways the technology could be useful to your business.
 

What is Azure Databricks?

To understand Databricks, one must take a step back and look at Apache Spark, the analytics engine behind Databricks. Apache Spark is a unified analytics engine for large-scale data processing. The framework is billed for its speed and flexibility. APIs for it are available in Scala, Python, Java, and R, so developers can use their preferred language. Spark also advertises the ability to “run everywhere”, meaning it can run on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud.

Databricks was founded in 2013 by the original creators of Apache Spark and seeks to further simplify the management and use of Apache Spark by providing a cloud-based platform for big data operations and collaboration.

 

Azure Databricks

In 2017, Microsoft and Databricks announced a partnership that would see Databricks offered as an integrated Azure service. The advantages of this are numerous. Azure Databricks allows you to harness the power of Apache Spark while also integrating with other components of the Azure stack that a business might already be using, such as Azure Synapse Analytics or Power BI. The service also takes advantage of Azure security features by integrating with Azure Active Directory. Another major feature of Azure Databricks is the ease of collaboration. It provides straightforward access to shared workspaces, which allow different individuals or teams to collaborate on projects easily.
 

Business Use Cases

No matter how impressive big data analytic technologies may be from a technical standpoint, some fundamental questions remain:

  • How would I actually use this?
  • How would this improve my business or organization?

The types of useful business intelligence that can be gleaned from your data can sometimes be hard to imagine. A white paper published by Microsoft gives three practical examples of the actionable intelligence one could extract. Microsoft’s first example uses data to track customer churn. In that example, one can predict with up to 90% accuracy whether a customer is on the verge of churning. Identifying these customers early, so that steps can be taken to prevent their departure, could be extremely valuable.
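To make the churn idea a little more concrete, here is a minimal sketch in plain Python. It is not the model from the white paper and it does not use Spark. It simply flags customers whose recent activity has dropped sharply compared with their earlier activity. The data, the function names, and the 50% threshold are all hypothetical and chosen only for illustration.

```python
# Hypothetical data: monthly login counts per customer, oldest month first.
monthly_logins = {
    "cust_a": [20, 22, 19, 21, 20, 18],
    "cust_b": [15, 14, 16, 6, 4, 2],
    "cust_c": [8, 9, 7, 8, 3, 9],
}


def churn_risk(history, recent_months=3):
    """Return the fractional drop in average activity, recent vs. earlier months."""
    earlier = history[:-recent_months]
    recent = history[-recent_months:]
    if not earlier or not recent:
        return 0.0  # not enough history to compare
    earlier_avg = sum(earlier) / len(earlier)
    recent_avg = sum(recent) / len(recent)
    if earlier_avg == 0:
        return 0.0  # no earlier activity, so no drop can be measured
    # A rise in activity counts as zero risk rather than a negative value.
    return max(0.0, (earlier_avg - recent_avg) / earlier_avg)


def at_risk_customers(data, threshold=0.5):
    """Return the sorted IDs of customers whose activity dropped by at least `threshold`."""
    return sorted(cust for cust, hist in data.items() if churn_risk(hist) >= threshold)


if __name__ == "__main__":
    print(at_risk_customers(monthly_logins))
```

A real churn model would be trained on many more signals than login counts, but the end goal is the same: a short list of customers to reach out to before they leave.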

The second example details a recommendation engine, something customers of platforms such as Amazon or Netflix should be very familiar with. With proper analysis of data, one can predict the types of products or services a customer may be more likely to purchase and recommend them, resulting in significantly more sales. The final example details a security scenario, building a system to detect possible network intrusion. One can analyze network traffic to detect unusual or suspicious patterns, indicating an attack of some sort may be occurring.
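As a rough illustration of the security scenario, the sketch below flags minutes in which network traffic deviates strongly from a baseline. It uses a simple z-score test in plain Python rather than Spark, and it is not the approach described in the white paper. The traffic numbers, the function name, and the threshold of 3 standard deviations are assumptions made for the example.

```python
from statistics import mean, stdev


def unusual_intervals(requests_per_minute, baseline_size=10, z_threshold=3.0):
    """Return indexes of intervals whose traffic is far from the baseline average.

    The first `baseline_size` values are treated as normal traffic.
    """
    baseline = requests_per_minute[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)
    flagged = []
    for i, count in enumerate(requests_per_minute[baseline_size:], start=baseline_size):
        # Skip the test when the baseline has no variation to compare against.
        if sigma > 0 and abs(count - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged


# Hypothetical requests-per-minute counts; two spikes follow a quiet baseline.
traffic = [100, 104, 98, 101, 97, 103, 99, 102, 100, 96, 101, 450, 99, 430]

if __name__ == "__main__":
    print(unusual_intervals(traffic))
```

A production system would look at far richer features than a single request count, but the pattern is the same: learn what normal looks like, then surface whatever falls well outside it.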

Anyone curious about potential practical applications of the platform should read Microsoft’s white paper in detail. The useful intelligence that can be generated by analyzing your data is limited by imagination as much as anything else.

 

Just the Beginning

Hopefully you now have a basic idea of what Azure Databricks is and how incorporating it may be useful. Look for more info on Azure Databricks from us in the near future, as we dive deeper into some of the specific features and how to set them up and use them.
 

Have Questions?

Thanks for reading. We hope you found this blog post to be useful. Do let us know if you have any questions or topic ideas related to BI, analytics, the cloud, machine learning, SQL Server, (Star Wars), or anything else of the like that you’d like us to write about. Simply leave us a comment below, and we’ll see what we can do!
 


 


Key2 Consulting is a data warehousing and business intelligence company located in Atlanta, Georgia. We create and deliver custom data warehouse solutions, business intelligence solutions, and custom applications.