Friday, October 4, 2019

Word of the Day: data profiling

 
Word of the Day WhatIs.com
Daily updates on the latest technology terms | October 4, 2019
data profiling

Data profiling is the process of examining, analyzing and reviewing data to collect information about the accuracy, consistency and completeness of data in a particular dataset. The goal of data profiling is to identify data anomalies as well as shared characteristics among data elements.

Data profiling is an important tool for ensuring data quality and plays a key role in any initiative that requires moving data from one system to another. For example, data warehousing and business intelligence projects often require gathering data from multiple, disparate systems and aggregating it into one report for analysis. The profile can be used to determine whether enough data has been gathered and what work will be required before the data can be moved and used.

 

The results of a data profile typically capture key aggregate measures associated with the values in individual columns of data, including the number and percentage of populated versus null fields. A typical profile will also reveal:

 

  • The number of unique values.
  • The frequency counts for each value and for patterns in the data.
  • The maximum and minimum values.
  • Information about data types.
  • The length of character strings in data elements.
  • Dependencies between columns.
  • Relationships across tables.
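
As a rough illustration of the column-level measures listed above, here is a minimal Python sketch. It is not taken from any particular profiling tool: the function names `profile_column` and `pattern_of`, the sample ZIP code values, and the convention that `None` marks a null field are all assumptions made for the example. It also assumes every populated value in a column is of one comparable type, and it does not attempt to detect dependencies between columns or relationships across tables.

```python
from collections import Counter
import re


def pattern_of(value):
    """Reduce a string to its shape: digits become 9, ASCII letters become A."""
    shape = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", shape)


def profile_column(values):
    """Return aggregate measures for one column (hypothetical helper).

    Assumes None marks a null field and that all populated values
    share a single comparable type.
    """
    populated = [v for v in values if v is not None]
    total = len(values)
    as_text = [str(v) for v in populated]
    return {
        "total": total,
        "populated": len(populated),
        "nulls": total - len(populated),
        "percent_populated": 100.0 * len(populated) / total if total else 0.0,
        "unique_values": len(set(populated)),
        "value_counts": Counter(populated),
        "pattern_counts": Counter(pattern_of(s) for s in as_text),
        "min": min(populated) if populated else None,
        "max": max(populated) if populated else None,
        "types": Counter(type(v).__name__ for v in populated),
        "min_length": min((len(s) for s in as_text), default=None),
        "max_length": max((len(s) for s in as_text), default=None),
    }


# Made-up sample column: one null field and one malformed value.
zip_codes = ["02466", "02466", None, "10001", "1000A"]
profile = profile_column(zip_codes)
print(profile["percent_populated"])
print(profile["pattern_counts"])
```

In this sample, the pattern counts are what expose the anomaly: one value has a different shape from the rest, even though every populated value has the same length and type.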

Quote of the Day

 
"The allure of the exploration aspect of the data profiling process can draw the data quality practitioner into a never-ending spiral of analysis." - David Loshin

Learning Center

 

Align business and IT drivers through data quality best practices
Consultant Nigel Turner advises adopting data quality best practices that align business drivers with data-centric technology, incorporating an enterprise-wide view, and being proactive in validating data and preventing errors.

Naveego launches tool for analyzing data quality and health

Using data profiling techniques -- and estimating the effort required
Data profiling techniques are a key component of data quality efforts. Here's a simple formula for calculating the amount of time that a data profiling process will take.

11 features to look for in data quality management tools
As the need for quality data has increased, so have the capabilities of data quality management tools. Expert David Loshin offers a comprehensive list of features that can enable efficient data quality management.

Identifying data quality issues via data profiling, reasonability
Data quality architect Laura Sebastian-Coleman explores data profiling, identifying and managing data quality issues, and using reasonability checks in this book excerpt.

Quiz Yourself

 
The use of a spreadsheet when a data warehouse was required created a situation _______ effective analysis was impossible.
a. where
b. in which

About This E-Newsletter
The Word of the Day is published by TechTarget, Inc., 275 Grove Street, Newton, Massachusetts, 02466 US.