Details

Identity Mapping and De-Duplicating

Speaker: Dejan Sarka

Duration: 60 minutes

Track: Application & Database Development

In an enterprise, merging master data, like customer data, from multiple sources is a common problem. Typically, you do not have a single, i.e. the same key identifying a customer in different sources. You have to match data based on similarity of strings, like names and addresses. In this session, we are going to check how different algorithms for comparing strings included in SQL Server  work. We are going to use four different algorithms that come with Master Data Services (Levenshtein, Jaccard, Jaro-Winkler and Ratcliff-Obershelp), and Fuzzy Lookup transformation from Integration Services. Finally, we are going to introduce how SQL Server Data Quality Services help us here.


Back to Top
cage-aids
cage-aids
cage-aids
cage-aids