0.9.0 (2023-08-17)#

Breaking Changes#

  • Rename methods:

    • DBConnection.read_df -> DBConnection.read_source_as_df

    • DBConnection.write_df -> DBConnection.write_df_to_target (#66)

  • Rename classes:

    • HDFS.slots -> HDFS.Slots

    • Hive.slots -> Hive.Slots

    Old names are left intact, but will be removed in v1.0.0 (#103)

  • Rename options to make them self-explanatory:

    • Hive.WriteOptions(mode="append") -> Hive.WriteOptions(if_exists="append")

    • Hive.WriteOptions(mode="overwrite_table") -> Hive.WriteOptions(if_exists="replace_entire_table")

    • Hive.WriteOptions(mode="overwrite_partitions") -> Hive.WriteOptions(if_exists="replace_overlapping_partitions")

    • JDBC.WriteOptions(mode="append") -> JDBC.WriteOptions(if_exists="append")

    • JDBC.WriteOptions(mode="overwrite") -> JDBC.WriteOptions(if_exists="replace_entire_table")

    • Greenplum.WriteOptions(mode="append") -> Greenplum.WriteOptions(if_exists="append")

    • Greenplum.WriteOptions(mode="overwrite") -> Greenplum.WriteOptions(if_exists="replace_entire_table")

    • MongoDB.WriteOptions(mode="append") -> Greenplum.WriteOptions(if_exists="append")

    • MongoDB.WriteOptions(mode="overwrite") -> Greenplum.WriteOptions(if_exists="replace_entire_collection")

    • FileDownloader.Options(mode="error") -> FileDownloader.Options(if_exists="error")

    • FileDownloader.Options(mode="ignore") -> FileDownloader.Options(if_exists="ignore")

    • FileDownloader.Options(mode="overwrite") -> FileDownloader.Options(if_exists="replace_file")

    • FileDownloader.Options(mode="delete_all") -> FileDownloader.Options(if_exists="replace_entire_directory")

    • FileUploader.Options(mode="error") -> FileUploader.Options(if_exists="error")

    • FileUploader.Options(mode="ignore") -> FileUploader.Options(if_exists="ignore")

    • FileUploader.Options(mode="overwrite") -> FileUploader.Options(if_exists="replace_file")

    • FileUploader.Options(mode="delete_all") -> FileUploader.Options(if_exists="replace_entire_directory")

    • FileMover.Options(mode="error") -> FileMover.Options(if_exists="error")

    • FileMover.Options(mode="ignore") -> FileMover.Options(if_exists="ignore")

    • FileMover.Options(mode="overwrite") -> FileMover.Options(if_exists="replace_file")

    • FileMover.Options(mode="delete_all") -> FileMover.Options(if_exists="replace_entire_directory")

    Old names are left intact, but will be removed in v1.0.0 (#108)

  • Rename onetl.log.disable_clients_logging() to onetl.log.setup_clients_logging(). (#120)

Features#

  • Add new methods returning Maven packages for specific connection class:

    • Clickhouse.get_packages()

    • MySQL.get_packages()

    • Postgres.get_packages()

    • Teradata.get_packages()

    • MSSQL.get_packages(java_version="8")

    • Oracle.get_packages(java_version="8")

    • Greenplum.get_packages(scala_version="2.12")

    • MongoDB.get_packages(scala_version="2.12")

    • Kafka.get_packages(spark_version="3.4.1", scala_version="2.12")

    Deprecate old syntax:

    • Clickhouse.package

    • MySQL.package

    • Postgres.package

    • Teradata.package

    • MSSQL.package

    • Oracle.package

    • Greenplum.package_spark_2_3

    • Greenplum.package_spark_2_4

    • Greenplum.package_spark_3_2

    • MongoDB.package_spark_3_2

    • MongoDB.package_spark_3_3

    • MongoDB.package_spark_3_4 (#87)

  • Allow to set client modules log level in onetl.log.setup_clients_logging().

    Allow to enable underlying client modules logging in onetl.log.setup_logging() by providing additional argument enable_clients=True. This is useful for debug. (#120)

  • Added support for reading and writing data to Kafka topics.

    For these operations, new classes were added.

    Currently, Kafka does not support incremental read strategies, this will be implemented in future releases.

  • Added support for reading files as Spark DataFrame and saving DataFrame as Files.

    For these operations, new classes were added.

    FileDFConnections:

    High-level classes:

    • FileDFReader (#73)

    • FileDFWriter (#81)

    File formats:

Improvements#

  • Remove redundant checks for driver availability in Greenplum and MongoDB connections. (#67)

  • Check of Java class availability moved from .check() method to connection constructor. (#97)