MSSQL connection#

class onetl.connection.db_connection.mssql.connection.MSSQL(*, spark: SparkSession, user: str, password: SecretStr, host: Host, port: int = 1433, database: str, extra: MSSQLExtra = MSSQLExtra())#

MSSQL JDBC connection. support_hooks

Based on Maven package com.microsoft.sqlserver:mssql-jdbc:12.2.0.jre8 (official MSSQL JDBC driver).

Warning

Before using this connector please take into account Prerequisites

Parameters:
hoststr

Host of MSSQL database. For example: test.mssql.domain.com or 192.168.1.14

portint, default: 1433

Port of MSSQL database

userstr

User, which have proper access to the database. For example: some_user

passwordstr

Password for database connection

databasestr

Database in RDBMS, NOT schema.

See this page for more details

sparkpyspark.sql.SparkSession

Spark session.

extradict, default: None

Specifies one or more extra parameters by which clients can connect to the instance.

For example: {"connectRetryCount": 3, "connectRetryInterval": 10}

See MSSQL JDBC driver properties documentation for more details

Examples

MSSQL connection with plain auth:

from onetl.connection import MSSQL
from pyspark.sql import SparkSession

# Create Spark session with MSSQL driver loaded
maven_packages = MSSQL.get_packages()
spark = (
    SparkSession.builder.appName("spark-app-name")
    .config("spark.jars.packages", ",".join(maven_packages))
    .getOrCreate()
)

# Create connection
mssql = MSSQL(
    host="database.host.or.ip",
    user="user",
    password="*****",
    extra={
        "trustServerCertificate": "true",  # add this to avoid SSL certificate issues
    },
    spark=spark,
)

MSSQL connection with domain auth:

# Create Spark session with MSSQL driver loaded
...

# Create connection
mssql = MSSQL(
    host="database.host.or.ip",
    user="user",
    password="*****",
    extra={
        "Domain": "some.domain.com",  # add here your domain
        "IntegratedSecurity": "true",
        "authenticationScheme": "NTLM",
        "trustServerCertificate": "true",  # add this to avoid SSL certificate issues
    },
    spark=spark,
)

MSSQL read-only connection:

# Create Spark session with MSSQL driver loaded
...

# Create connection
mssql = MSSQL(
    host="database.host.or.ip",
    user="user",
    password="*****",
    extra={
        "ApplicationIntent": "ReadOnly",  # driver will open read-only connection, to avoid writing to the database
        "trustServerCertificate": "true",  # add this to avoid SSL certificate issues
    },
    spark=spark,
)
check()#

Check source availability. support_hooks

If not, an exception will be raised.

Returns:
Connection itself
Raises:
RuntimeError

If the connection is not available

Examples

connection.check()
classmethod get_packages(java_version: str | Version | None = None) list[str]#

Get package names to be downloaded by Spark. support_hooks

Parameters:
java_versionstr, default 8

Java major version.

Examples

from onetl.connection import MSSQL

MSSQL.get_packages()
MSSQL.get_packages(java_version="8")