Collecting Logs in Azure Databricks

Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. In this blog, we are going to see how we can collect logs from Azure Databricks and send them to Azure Log Analytics (ALA). Before going further, let's look at how to set up a Spark cluster in Azure Databricks.

Create a Spark Cluster in Databricks

  1. In the Azure portal, go to the Databricks workspace that you created, and then click Launch Workspace.
  2. You are redirected to the Azure Databricks portal. From the portal, click New Cluster.
  3. Name and configure the cluster, then expand “Advanced Options” and click the “Init Scripts” tab. Go to the last line under the “Init Scripts” section. Under the “Destination” dropdown, select “DBFS,” enter “dbfs:/databricks/spark-monitoring/spark-monitoring.sh” in the text box, and click the “Add” button. You can confirm the script is actually staged in DBFS with a quick notebook check, as shown below.
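
The path entered above assumes you have already copied the spark-monitoring.sh init script (from Microsoft's spark-monitoring library) and its listener JARs to dbfs:/databricks/spark-monitoring/, for example with the Databricks CLI. As a minimal sketch, a notebook cell like the following verifies the files are in place; `dbutils` is available in every Databricks notebook, and the DBFS path is the same one configured in the cluster UI.

```python
# Quick sanity check that the monitoring init script and JARs are staged in DBFS.
# Assumes the files were uploaded beforehand (e.g., via the Databricks CLI).
files = dbutils.fs.ls("dbfs:/databricks/spark-monitoring/")
for f in files:
    print(f.path, f.size)

# The cluster can only ship logs to Log Analytics if the init script is present.
assert any(f.name == "spark-monitoring.sh" for f in files), \
    "spark-monitoring.sh not found; upload it before starting the cluster"
```

Note that the script also needs your Log Analytics workspace ID and key, which are typically set as environment variables in the cluster configuration or edited directly into spark-monitoring.sh before uploading it.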

Run a Spark SQL job

  1. In the left pane, select Azure Databricks. Under Common Tasks, select New Notebook.
  2. In the Create Notebook dialog box, enter a name, select a language, and select the Spark cluster that you created earlier. Then run a query, as in the sketch below.
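
With the notebook attached to the monitored cluster, any Spark SQL job you run emits metrics and logs that the init script forwards to Log Analytics. Below is a minimal sketch of such a job in Python; the sample data, view name, and logger name are made up for illustration, and the log4j call shows one way to emit a custom message that lands in ALA alongside the regular Spark driver logs.

```python
# A small Spark SQL job; `spark` and `sc` are pre-created in Databricks notebooks.
data = [("alice", 34), ("bob", 45), ("carol", 29)]   # hypothetical sample data
df = spark.createDataFrame(data, ["name", "age"])
df.createOrReplaceTempView("people")                 # hypothetical view name

# Run a SQL query; its execution metrics are picked up by the monitoring listeners.
result = spark.sql("SELECT name FROM people WHERE age > 30")
result.show()

# Optionally log a custom message through log4j; with the init script in place,
# driver log output is forwarded to the Log Analytics workspace.
log4j = sc._jvm.org.apache.log4j
logger = log4j.LogManager.getLogger("notebook-demo") # hypothetical logger name
logger.info("Spark SQL job finished; rows returned: %d" % result.count())
```

Once the job has run, the forwarded records should appear in your Log Analytics workspace after a short delay, where they can be queried with the usual Log Analytics query tools.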