Base interface#

class onetl.base.base_file_format.BaseReadableFileFormat#

Representation of a readable file format.

abstract check_if_supported(spark: SparkSession) → None#

Check if the Spark session supports this file format.

Raises:
RuntimeError

If file format is not supported.

abstract apply_to_reader(reader: DataFrameReader) → DataFrameReader | ContextManager[DataFrameReader]#

Apply the format to a pyspark.sql.DataFrameReader.

Returns:
pyspark.sql.DataFrameReader

DataFrameReader with options applied.

ContextManager[DataFrameReader]

If a context manager is returned, it will be entered before reading data and exited after the DataFrame is created. The context manager’s __enter__ method should return a pyspark.sql.DataFrameReader instance.
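To make the contract concrete, here is a hedged sketch of a hypothetical CSV-like format implementing the two abstract methods. The onetl base class and PySpark are deliberately not imported; `FakeReader` is a stand-in that only mimics the fluent `format()`/`options()` API of pyspark.sql.DataFrameReader, and `MyCSVFormat` is an illustrative name, not part of onetl.

```python
class FakeReader:
    """Minimal stand-in for pyspark.sql.DataFrameReader (illustrative only)."""

    def __init__(self):
        self.source = None
        self.opts = {}

    def format(self, source: str) -> "FakeReader":
        self.source = source
        return self

    def options(self, **kwargs) -> "FakeReader":
        self.opts.update(kwargs)
        return self


class MyCSVFormat:
    """Hypothetical format implementing the BaseReadableFileFormat contract."""

    def __init__(self, delimiter: str = ",", header: bool = True):
        self.delimiter = delimiter
        self.header = header

    def check_if_supported(self, spark) -> None:
        # A real implementation would inspect the Spark session (e.g. its
        # version or loaded packages) and raise RuntimeError if unsupported.
        pass

    def apply_to_reader(self, reader: FakeReader) -> FakeReader:
        # Simple case: return the reader itself with options applied.
        return reader.format("csv").options(
            delimiter=self.delimiter,
            header=str(self.header).lower(),
        )


reader = MyCSVFormat(delimiter=";").apply_to_reader(FakeReader())
print(reader.source, reader.opts)
```

Since `apply_to_reader` here returns a plain reader rather than a context manager, the caller can use the result directly; the context-manager variant is only needed when setup and teardown must bracket the read.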

class onetl.base.base_file_format.BaseWritableFileFormat#

Representation of a writable file format.

abstract check_if_supported(spark: SparkSession) → None#

Check if the Spark session supports this file format.

Raises:
RuntimeError

If file format is not supported.

abstract apply_to_writer(writer: DataFrameWriter) → DataFrameWriter | ContextManager[DataFrameWriter]#

Apply the format to a pyspark.sql.DataFrameWriter.

Returns:
pyspark.sql.DataFrameWriter

DataFrameWriter with options applied.

ContextManager[DataFrameWriter]

If a context manager is returned, it will be entered before writing and exited after the DataFrame is written. The context manager’s __enter__ method should return a pyspark.sql.DataFrameWriter instance.
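The context-manager return variant can be sketched as follows. This is a hedged illustration, not onetl code: `FakeWriter` stands in for pyspark.sql.DataFrameWriter, `MyArchiveFormat` is a hypothetical format, and the setup/cleanup steps are purely illustrative placeholders for work such as creating and removing a staging directory.

```python
from contextlib import contextmanager


class FakeWriter:
    """Minimal stand-in for pyspark.sql.DataFrameWriter (illustrative only)."""

    def __init__(self):
        self.source = None

    def format(self, source: str) -> "FakeWriter":
        self.source = source
        return self


events = []  # records the order of setup, write, and cleanup


class MyArchiveFormat:
    """Hypothetical format needing setup/teardown around the write."""

    def apply_to_writer(self, writer: FakeWriter):
        @contextmanager
        def _manage():
            events.append("setup")        # e.g. create a staging directory
            try:
                # __enter__ must yield the DataFrameWriter instance
                yield writer.format("archive")
            finally:
                events.append("cleanup")  # e.g. remove the staging directory

        return _manage()


# The caller enters the context manager before writing and exits after:
with MyArchiveFormat().apply_to_writer(FakeWriter()) as writer:
    events.append(f"write:{writer.source}")

print(events)  # ['setup', 'write:archive', 'cleanup']
```

Because `__exit__` runs even if the write raises, this variant is the natural fit for formats that must guarantee cleanup of temporary resources.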