Spark LocalFS
- class onetl.connection.file_df_connection.spark_local_fs.SparkLocalFS(*, spark: SparkSession)
Spark connection to local filesystem.
Based on Spark Generic File Data Source.
Warning
To use the SparkLocalFS connector, you should have PySpark installed (or injected into sys.path) BEFORE creating the connector instance.
You can install PySpark as follows:
pip install onetl[spark]  # latest PySpark version
# or
pip install onetl pyspark==3.5.0  # pass specific PySpark version
See Spark installation instruction for more details.
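Since PySpark must be importable before the connector is created, a quick pre-flight check can make a missing installation easier to diagnose. The sketch below uses only the standard library; the helper name module_available is hypothetical and is not part of onetl.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located on sys.path."""
    # find_spec returns None when no top-level module with this name exists
    return importlib.util.find_spec(name) is not None


# Hypothetical pre-flight check before constructing SparkLocalFS
if not module_available("pyspark"):
    print("PySpark not found; install it with: pip install onetl[spark]")
```

This only verifies that the module can be found; it does not check which PySpark version is installed.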
Warning
Currently supports only Spark sessions created with option spark.master: local.
Note
Supports only reading files as Spark DataFrame and writing DataFrame to files.
Does NOT support file operations, like create, delete, rename, etc.
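The spark.master restriction mentioned in the warning above can be checked before building the connection. The following is a minimal sketch, assuming that "local", "local[N]", "local[*]" and "local[N,F]" all count as local masters; the helper is_local_master is hypothetical, and the connector's own validation may be looser or stricter than this.

```python
import re

# Assumption: these master URL shapes are treated as "local":
#   local, local[4], local[*], local[4,2], local[*,2]
_LOCAL_MASTER = re.compile(r"^local(\[(\*|\d+)(,\s*\d+)?\])?$")


def is_local_master(master: str) -> bool:
    """Return True if the given spark.master value looks like a local master."""
    return _LOCAL_MASTER.match(master.strip()) is not None


# With a live session, the value could be read via spark.conf.get("spark.master")
for candidate in ("local", "local[*]", "yarn"):
    print(candidate, is_local_master(candidate))
```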
- Parameters:
- spark
pyspark.sql.SparkSession
Spark session
Examples
from onetl.connection import SparkLocalFS
from pyspark.sql import SparkSession

# create Spark session
spark = SparkSession.builder.master("local").appName("spark-app-name").getOrCreate()

# create connection
local_fs = SparkLocalFS(spark=spark).check()
- check()
Check that the connection is available.
If not, an exception will be raised.
- Returns:
- Connection itself
- Raises:
- RuntimeError
If the connection is not available
Examples
connection.check()
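Because check() returns the connection itself and raises RuntimeError when the connection is not available, callers that prefer a soft failure can wrap it. This is a minimal sketch; check_or_none and the SupportsCheck protocol are hypothetical names introduced here for illustration and are not part of onetl.

```python
from typing import Optional, Protocol, TypeVar


class SupportsCheck(Protocol):
    """Any object exposing a check() method, such as a connection."""

    def check(self): ...


T = TypeVar("T", bound=SupportsCheck)


def check_or_none(connection: T) -> Optional[T]:
    """Return the checked connection, or None if check() raises RuntimeError."""
    try:
        return connection.check()
    except RuntimeError as exc:
        # Documented failure mode: the connection is not available
        print(f"Connection is not available: {exc}")
        return None
```

With a real connection this might be called as check_or_none(SparkLocalFS(spark=spark)); other exception types are deliberately left to propagate.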