Version history

Unreleased

v0.4.1

Released on 2025-03-10. See v0.4.1 release on GitHub

Added

  • ClickhouseServerAPI can register pandas tables with datetime columns, and allows integers to be signed #61.
  • ClickhouseServerAPI will now register dict or list via pandas #61.
  • Easier dependency resolution for python 3.13 #67.

v0.4.0

Released on 2024-12-23. See v0.4.0 release on GitHub

Changed

  • Renamed ClickhouseAPI and ClickhouseDataFrame to ClickhouseServerAPI and ClickhouseServerDataFrame respectively, and splinkclickhouse.clickhouse to splinkclickhouse.clickhouse_server #54.

v0.3.4

Released on 2024-12-16. See v0.3.4 release on GitHub

Added

  • Added Clickhouse-appropriate versions of the comparison level PairwiseStringDistanceFunctionLevel and the comparison PairwiseStringDistanceFunctionAtThresholds to the relevant libraries #51.
  • ClickhouseAPI can now properly register pandas tables with string array columns #51.

Fixed

  • Table registration in chdb now works for pandas tables whose indexes do not have a 0 entry #49.

v0.3.3

Released on 2024-12-05. See v0.3.3 release on GitHub

Added

  • Term frequency adjustments are no longer limited in Clickhouse server (or in chdb when debug_mode is switched on) #46.

Changed

  • Dropped support for Splink <= 4.0.5 #46.

v0.3.2

Released on 2024-10-23. See v0.3.2 release on GitHub

Added

  • SQL UDF days_since_epoch to parse a string representing a date to the number of days since 1970-01-01 #39.
  • Custom Clickhouse ColumnExpression with an additional transform, parse_date_to_int, to parse a string to days since epoch #39.
  • Custom date comparison and comparison levels working with integer type representing days since epoch #39.
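
As a rough illustration of the value that days_since_epoch produces, the same quantity can be sketched in plain Python. This is not the package's SQL UDF: the Python function below merely reuses the name for illustration, and it assumes the input is an ISO-formatted (YYYY-MM-DD) date string, which may not match the formats the UDF accepts.

```python
from datetime import date

# Reference date named in the entry above: 1970-01-01.
EPOCH = date(1970, 1, 1)


def days_since_epoch(date_string: str) -> int:
    """Hypothetical Python stand-in for the SQL UDF.

    Assumes an ISO-formatted (YYYY-MM-DD) string; returns the number of
    whole days between 1970-01-01 and the given date. Dates before the
    epoch give negative values.
    """
    return (date.fromisoformat(date_string) - EPOCH).days


print(days_since_epoch("1970-01-01"))  # 0
print(days_since_epoch("1970-01-02"))  # 1
print(days_since_epoch("2000-01-01"))  # 10957
```

Storing dates as integers in this way means date comparisons reduce to integer arithmetic, which is what the integer-based comparison levels listed above operate on.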

v0.3.1

Released on 2024-10-14. See v0.3.1 release on GitHub

Added

  • ClickhouseAPI now has a function .set_union_default_mode() to allow manually setting the client state necessary for clustering if the session has timed out, e.g. when running interactively #36.
  • Added support for Splink 4.0.4 #37.

Fixed

  • estimate_probability_two_random_records_match now works correctly when debug_mode is switched on #34.

v0.3.0

Released on 2024-09-26. See v0.3.0 release on GitHub

Changed

  • chdb is now an optional dependency, requiring opt-in installation for use of ChDBAPI #28.

v0.2.5

Released on 2024-09-23. See v0.2.5 release on GitHub

Changed

  • Added support for Splink >= 4.0.2, dropped support for 4.0.0, 4.0.1 #26.

v0.2.4

Released on 2024-09-19. See v0.2.4 release on GitHub

Added

  • Extended ClickhouseAPI pandas table registration to support float columns #24.
  • Added Clickhouse-specific library comparisons/levels - cll_ch.DistanceInKMLevel, cl_ch.DistanceInKMAtThresholds, and cl_ch.ExactMatchAtSubstringSizes #24.

v0.2.3

Released on 2024-09-16. See v0.2.3 release on GitHub

Changed

  • Dropped support for python 3.8 #20.
  • Removed numpy requirements #20.

v0.2.2

Released on 2024-09-12. See v0.2.2 release on GitHub

Added

  • ClickhouseAPI now allows for registering tables directly from pandas DataFrames, if they contain only integer and string columns #18.

Fixed

  • Created random as an alias for rand, so that Linker.visualisations.comparison_viewer_dashboard runs without error #14.
  • Workaround for Clickhouse count(*) filter ... parsing issue so that linker.clustering.compute_graph_metrics(...) now runs #18.

v0.2.1

Released on 2024-09-12. See v0.2.1 release on GitHub

Changed

  • Updated numpy dependency requirements to allow compatible versions for all supported python versions #9.

v0.2.0

Released on 2024-09-11. See v0.2.0 release on GitHub

Added

  • ClickhouseAPI and dataframe added to support running calculations in a Clickhouse instance #4.

v0.1.1

Released on 2024-09-10. See v0.1.1 release on GitHub

Fixed

  • Fix random_sample_sql so that u-training works when we don't sample the entire dataset #1.

Changed

  • try_parse_date and try_parse_timestamp now use DateTime64 to extend the range to more useful values, and no longer support custom format strings #2.

v0.1.0

Released on 2024-09-09. See v0.1.0 release on GitHub

Added

  • Basic working version of the package, with an API for chdb.