Version history
Unreleased
v0.4.1
Released on 2025-03-10. See v0.4.1 release on GitHub
Added
- ClickhouseServerAPI can register pandas tables with datetime columns, and allows integers to be signed #61.
- ClickhouseServerAPI will now register dict or list via pandas #61.
- Easier dependency resolution for python 3.13 #67.
 
v0.4.0
Released on 2024-12-23. See v0.4.0 release on GitHub
Changed
- Renamed ClickhouseAPI and ClickhouseDataFrame to ClickhouseServerAPI and ClickhouseServerDataFrame respectively, and splinkclickhouse.clickhouse to splinkclickhouse.clickhouse_server #54.
v0.3.4
Released on 2024-12-16. See v0.3.4 release on GitHub
Added
- Added Clickhouse-appropriate versions of comparison level PairwiseStringDistanceFunctionLevel and comparison PairwiseStringDistanceFunctionAtThresholds to the relevant libraries #51.
- ClickhouseAPI can now properly register pandas tables with string array columns #51.
Fixed
- Table registration in chdb now works for pandas tables whose indexes do not have a 0 entry #49.
v0.3.3
Released on 2024-12-05. See v0.3.3 release on GitHub
Added
- Term frequency adjustments are now not limited in Clickhouse server (or chdb when debug_mode is switched on) #46.
Changed
- Dropped support for Splink <= 4.0.5 #46.
v0.3.2
Released on 2024-10-23. See v0.3.2 release on GitHub
Added
- SQL UDF days_since_epoch to parse a string representing a date to the number of days since 1970-01-01 #39.
- Custom Clickhouse ColumnExpression with additional transform parse_date_to_int to parse string to days since epoch #39.
- Custom date comparison and comparison levels working with integer type representing days since epoch #39.
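To illustrate the days-since-epoch representation used by these entries, here is a minimal Python sketch. It is a stand-in for the idea only, not the SQL UDF itself: the Python function below is hypothetical, assumes ISO-formatted (YYYY-MM-DD) input, and raises ValueError on unparseable strings, which may differ from how the UDF behaves.

```python
from datetime import date

# Reference date named in the changelog entry.
EPOCH = date(1970, 1, 1)


def days_since_epoch(date_string: str) -> int:
    """Hypothetical Python stand-in: ISO date string -> days since 1970-01-01."""
    return (date.fromisoformat(date_string) - EPOCH).days


print(days_since_epoch("1970-01-01"))  # 0
print(days_since_epoch("2000-01-01"))  # 10957
```

Storing dates as integers in this way lets date comparisons work with plain integer differences.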
 
v0.3.1
Released on 2024-10-14. See v0.3.1 release on GitHub
Added
- ClickhouseAPI now has a function .set_union_default_mode() to allow manually setting client state necessary for clustering, if session has timed out e.g. when running interactively #36.
- Added support for Splink 4.0.4 #37.
 
Fixed
- estimate_probability_two_random_records_match now works correctly when debug_mode is switched on #34.
v0.3.0
Released on 2024-09-26. See v0.3.0 release on GitHub
Changed
- chdb is now an optional dependency, requiring opt-in installation for use of ChDBAPI #28.
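Code that has to run both with and without the optional dependency can check for it before use. A minimal sketch using only the standard library; the helper name chdb_available is hypothetical and is not part of splinkclickhouse.

```python
import importlib.util


def chdb_available() -> bool:
    """Return True if the optional chdb package can be imported."""
    # find_spec looks the module up without importing it.
    return importlib.util.find_spec("chdb") is not None


if not chdb_available():
    print("chdb is not installed; ChDBAPI cannot be used in this environment")
```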
v0.2.5
Released on 2024-09-23. See v0.2.5 release on GitHub
Changed
- Added support for Splink >= 4.0.2, dropped support for 4.0.0, 4.0.1 #26.
 
v0.2.4
Released on 2024-09-19. See v0.2.4 release on GitHub
Added
- Extended ClickhouseAPI pandas table registration to support float columns #24.
- Added Clickhouse-specific library comparisons/levels - cll_ch.DistanceInKMLevel, cl_ch.DistanceInKMAtThresholds, and cl_ch.ExactMatchAtSubstringSizes #24.
v0.2.3
Released on 2024-09-16. See v0.2.3 release on GitHub
Changed
v0.2.2
Released on 2024-09-12. See v0.2.2 release on GitHub
Added
- ClickhouseAPI now allows for registering tables directly from pandas DataFrames, if they contain only integer and string columns #18.
Fixed
- Create an alias for rand, random so that Linker.visualisations.comparison_viewer_dashboard runs without error #14.
- Workaround for Clickhouse count(*) filter ... parsing issue so that linker.clustering.compute_graph_metrics(...) now runs #18.
v0.2.1
Released on 2024-09-12. See v0.2.1 release on GitHub
Changed
- Updated numpy dependency requirements to allow compatible versions for all supported python versions #9.
v0.2.0
Released on 2024-09-11. See v0.2.0 release on GitHub
Added
- ClickhouseAPI and dataframe added to support running calculations in a Clickhouse instance #4.
v0.1.1
Released on 2024-09-10. See v0.1.1 release on GitHub
Fixed
- Fix random_sample_sql so that u-training works when we don't sample the entire dataset #1.
Changed
- try_parse_date and try_parse_timestamp now use DateTime64 to extend the range to more useful values, and no longer support custom format strings #2.
v0.1.0
Released on 2024-09-09. See v0.1.0 release on GitHub
Added
- Basic working version of package with api for chdb