FileDF Writer#

class onetl.file.file_df_writer.file_df_writer.FileDFWriter(*, connection: BaseFileDFConnection, format: BaseWritableFileFormat, target_path: PurePathProtocol, options: FileDFWriterOptions = FileDFWriterOptions(if_exists=FileDFExistBehavior.ERROR, partition_by=None))#

Writes a Spark DataFrame as files to a target path of the specified file connection, using the given format and options.

Parameters:
connection : BaseFileDFConnection

File DataFrame connection. See the File DataFrame Connections section.

format : BaseWritableFileFormat

File format to write.

target_path : os.PathLike or str

Directory path to write data to.

options : FileDFWriterOptions, optional

Common writing options.

Examples

Create a writer that writes CSV files to the local filesystem:

from onetl.connection import SparkLocalFS
from onetl.file import FileDFWriter
from onetl.file.format import CSV

local_fs = SparkLocalFS(spark=spark)

writer = FileDFWriter(
    connection=local_fs,
    format=CSV(delimiter=","),
    target_path="/path/to/directory",
)

All supported options:

from onetl.connection import SparkLocalFS
from onetl.file import FileDFWriter
from onetl.file.format import CSV

csv = CSV(delimiter=",")
local_fs = SparkLocalFS(spark=spark)

writer = FileDFWriter(
    connection=local_fs,
    format=csv,
    target_path="/path/to/directory",
    options=FileDFWriter.Options(if_exists="replace_entire_directory"),
)
run(df: DataFrame) → None#

Method for writing a DataFrame as files.

Note

This method supports only batch DataFrames, not streaming ones.

Parameters:
df : pyspark.sql.dataframe.DataFrame

Spark DataFrame to write.

Examples

Write df to target:

writer.run(df)
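The class signature above also shows a `partition_by` option. Partitioned writes conventionally produce a Hive-style layout: one subdirectory per distinct value of the partition column, named `column=value`. The sketch below illustrates that layout with plain Python; it is NOT onETL or Spark code, and the `write_partitioned` function is hypothetical.

```python
# Illustrative sketch of Hive-style partitioned output -- not onETL code.
# `write_partitioned` is a hypothetical helper showing the directory layout
# that a `partition_by` option conventionally produces.
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def write_partitioned(rows, partition_col, target: Path) -> list:
    """Group rows by the partition column and write one CSV per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_col]].append(row)

    created = []
    for value, part_rows in groups.items():
        # each distinct value gets its own "column=value" subdirectory
        part_dir = target / f"{partition_col}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        with open(part_dir / "part-0000.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(part_rows[0]))
            writer.writeheader()
            writer.writerows(part_rows)
        created.append(part_dir.name)
    return sorted(created)


rows = [
    {"id": 1, "country": "us"},
    {"id": 2, "country": "de"},
    {"id": 3, "country": "us"},
]
target = Path(tempfile.mkdtemp())
print(write_partitioned(rows, "country", target))  # → ['country=de', 'country=us']
```

Readers who only see one file per partition value is a simplification; a real engine may write several part files per partition directory.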