Skip to main content

Databricks

This page provides information on connecting your application to Databricks, which enables you to explore and analyze large volumes of data using SQL.

Connect Databricks

Databricks datasource
Databricks datasource

Connection parameters

The following section is a reference guide that provides a complete description of all the parameters to connect to a Databricks datasource.

Configuration method

This method allows you to select a configuration type.

Options

  • JDBC: JDBC (Java Database Connectivity) provides a standardized approach to connect to Databricks databases. This method allows you to add a JDBC URL for authentication.

  • Use form properties: This method allows you to customize and define specific connection parameters such as host, port, HTTP path, default catalog, and default schema.

JDBC URL

JDBC URL (Java Database Connectivity Uniform Resource Locator) is a string of characters that specifies the address and parameters needed to connect to a Databricks database using JDBC.

Example format:

jdbc:databricks://<server-hostname>:443;httpPath=<http-path>[;<setting1>=<value1>;<setting2>=<value2>;<settingN>=<valueN>]
  • <server-hostname>: Replace this placeholder with the hostname or IP address of your Databricks workspace server.

  • 443: Indicates the port number (default is 443) for the JDBC connection.

  • httpPath=<http-path>: Specifies the HTTP path for the connection.

  • [;<setting1>=<value1>;<setting2>=<value2>;<settingN>=<valueN>]: Additional optional settings and their corresponding values. These settings can include authentication details, catalog, schema, etc.

Learn more about JDBC URL

Personal access token

A Personal Access Token is a secure authentication token used to access resources within Databricks.

To create a Databricks personal access token, go to User Settings in your workspace, navigate to Developer, click Manage next to Access tokens, and generate a new token, e.g., dapi69test66c547ee2sample51d9f1007.

Learn more about Personal access tokens

Host

This property specifies the server hostname or IP address of your Databricks workspace. Example, dbc-dbtest-ea55.cloud.databricks.com

Port

This property defines the port number to establish the connection. The default port is set to 443, which is commonly used for secure HTTPS connections.

HTTP Path

Indicates the path for the HTTP connection. It represents the specific endpoint or location within your Databricks workspace where the connection is directed. Example, /sql/1.0/warehouses/6123123test

Default Catalog

Refers to the default database catalog to be used in your Databricks connection. It specifies the database where the tables and data are stored.

Default Schema

This property defines the default schema to be used in the connection. A schema is a logical container for database objects (tables, views, etc.), and specifying a default schema streamlines queries by indicating where to find these objects.

UserAgent Tag

The UserAgent tag identifies the client making a request, aiding in tracking and customization based on the environment. By default, the UserAgent tag is set as Appsmith and remains the same unless changed.

This is a recommended practice by Databricks, and the user agent tag would appear in the audit logs on Databricks for all queries executed from Appsmith Databricks datasources.

Query Databricks

The following section provides examples of creating basic CRUD queries on Databricks. The syntax and operations are similar to standard SQL, and Databricks supports common SQL commands for creating, reading, updating, and deleting data.

Query Databricks
Query Databricks

Fetch data

-- Use quotes for non-integer values
SELECT *
FROM default.customer
LIMIT {{ tableUsers.pageSize }} OFFSET {{ tableUsers.pageOffset }};

This comment fetches customer data from the default.customer table using the page size and offset values provided by Table widget. If the values are non-integer, make sure to enclose them in quotes.

Update data

-- Use quotes for non-integer values
UPDATE default.customer
SET email = '{{emailInput.text}}'
WHERE id = {{ tableUsers.selectedRow.id}};

This command dynamically updates customer details based on user input. If the values are non-integer, make sure to enclose them in quotes.

Insert data

-- Use quotes for non-integer values
INSERT INTO default.customer
VALUES (
'{{ nameInput.text }}',
'{{ genderDropdown.selectedOptionValue }}',
'{{ emailInput.text }}'
);

This command dynamically inserts a new customer into the database. If the values are non-integer, make sure to enclose them in quotes.

Delete data

-- Use quotes for non-integer values
DELETE FROM default.customer
WHERE id = {{tableUsers.selectedRow.id}};

This command dynamically deletes a customer based on their ID, obtained from the selected row in the Table. If the values are non-integer, make sure to enclose them in quotes.