Scalable identity resolution, entity resolution, data mastering and deduplication using ML

1.1K stars | 138 forks | 1.1K watchers | Java | GNU Affero General Public License v3.0
analytics cdp customer-data-platform data-science databricks dataengineering datalake dataquality dedupe deduplication entity-resolution fuzzy-matching fuzzymatch identity-resolution master-data-management masterdata mdm ml snowflake spark
1 open issue needs help | Last updated: Sep 11, 2025

