Clickhouse connection
- class onetl.connection.db_connection.clickhouse.connection.Clickhouse(*, spark: SparkSession, user: str, password: SecretStr, host: Host, port: int = 8123, database: str | None = None, extra: ClickhouseExtra = ClickhouseExtra())
Based on Maven package ru.yandex.clickhouse:clickhouse-jdbc:0.3.2 (official Clickhouse JDBC driver).
Warning
Before using this connector, please take into account the Prerequisites.
- Parameters:
- host : str
Host of Clickhouse database. For example: test.clickhouse.domain.com or 193.168.1.11
- port : int, default: 8123
Port of Clickhouse database
- user : str
User that has proper access to the database. For example: some_user
- password : str
Password for database connection
- database : str, optional
Database in RDBMS, NOT schema.
See this page for more details
- spark : pyspark.sql.SparkSession
Spark session.
- extra : dict, default: None
Specifies one or more extra parameters by which clients can connect to the instance.
For example: {"continueBatchOnError": "false"}.
Examples
Clickhouse connection initialization
from onetl.connection import Clickhouse
from pyspark.sql import SparkSession

# Create Spark session with Clickhouse driver loaded
maven_packages = Clickhouse.get_packages()
spark = (
    SparkSession.builder.appName("spark-app-name")
    .config("spark.jars.packages", ",".join(maven_packages))
    .getOrCreate()
)

# Create connection
clickhouse = Clickhouse(
    host="database.host.or.ip",
    user="user",
    password="*****",
    extra={"continueBatchOnError": "false"},
    spark=spark,
)
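To make the roles of host, port, database and extra more concrete, here is a minimal sketch of how such parameters could be combined into a ClickHouse JDBC URL. The helper build_clickhouse_jdbc_url is hypothetical and not part of onETL, and the URL that onETL actually builds may differ. The database name "default" is an illustrative value only.

```python
from urllib.parse import urlencode


def build_clickhouse_jdbc_url(host, port=8123, database=None, extra=None):
    """Hypothetical helper (not onETL API): compose a ClickHouse JDBC URL."""
    url = f"jdbc:clickhouse://{host}:{port}"
    if database:
        # The database goes into the URL path
        url += f"/{database}"
    if extra:
        # Extra parameters become query-string options; keys are sorted
        # so the result is deterministic
        url += "?" + urlencode(sorted(extra.items()))
    return url


print(
    build_clickhouse_jdbc_url(
        "test.clickhouse.domain.com",
        database="default",  # illustrative value
        extra={"continueBatchOnError": "false"},
    )
)
# prints jdbc:clickhouse://test.clickhouse.domain.com:8123/default?continueBatchOnError=false
```

Under this assumption, omitting database leaves the path empty and the driver falls back to whatever database it treats as the default.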
- check()
Check that the connection is available. If not, an exception will be raised.
- Returns:
- Connection itself
- Raises:
- RuntimeError
If the connection is not available
Examples
connection.check()
- classmethod get_packages() -> list[str]
Get package names to be downloaded by Spark.
Examples
from onetl.connection import Clickhouse

Clickhouse.get_packages()
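When a Spark session needs packages from more than one source, the lists can be merged before being passed to spark.jars.packages, which takes a comma-separated string of Maven coordinates. The following is a minimal sketch. merge_packages is a hypothetical helper, the first list mirrors the package named on this page, and the second coordinate is made up for illustration.

```python
def merge_packages(*package_lists):
    """Hypothetical helper: join Maven coordinates, dropping duplicates and keeping order."""
    merged = []
    for packages in package_lists:
        for package in packages:
            if package not in merged:
                merged.append(package)
    return ",".join(merged)


# Stand-in for the result of Clickhouse.get_packages()
clickhouse_packages = ["ru.yandex.clickhouse:clickhouse-jdbc:0.3.2"]
# Made-up coordinate, for illustration only
other_packages = ["org.example:some-driver:1.0.0"]

print(merge_packages(clickhouse_packages, other_packages))
# prints ru.yandex.clickhouse:clickhouse-jdbc:0.3.2,org.example:some-driver:1.0.0
```

The resulting string can then be passed as the value of .config("spark.jars.packages", ...) when building the session.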