Spotless Data

What would your home look like if you let dirt and mess accumulate for years? It would be a health hazard, and it would be impossible to find what you need when you need it most. Eventually, the problem simply couldn’t be overlooked. This is the situation many plant managers face after accumulating huge quantities of manufacturing data over the years.

By implementing a data-driven company culture, manufacturers can significantly improve almost any aspect of production. Big data can be used, among other things, to maximise energy efficiency, improve the business’s predictive maintenance strategy, and prevent downtime caused by equipment failure. To do this, manufacturers need accurate and reliable data.
 
But when data is collected and accumulated for several years, its quality can start to decline. Dirty or rogue data is data affected by issues such as duplicates, inaccuracies, inconsistencies, and out-of-date information. When plants reach this point, it’s time for a good clean-up. 

Not The Exception

Dirty data is the norm, not the exception. As companies evolve, the data they collect grows in quantity and complexity. High employee turnover, the use of different enterprise resource planning (ERP) solutions across several departments, and a lack of standard guidelines for data entry complicate the situation. For these reasons, achieving perfect data is almost impossible, especially in large organisations.

Data cleansing, or cleaning, is the process of detecting and correcting or eliminating incomplete, inaccurate, out-of-date or irrelevant data. It differs from data validation in that the latter is automatically performed by the system at the time of data entry, while data cleaning is done later on batches of data that have become unreliable.
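The difference between the two can be illustrated with a minimal Python sketch. The record layout (part_id, quantity, updated), the rules and the cut-off date are assumptions made for illustration only, not taken from any particular ERP or cleansing tool.

```python
from datetime import date

def validate_on_entry(record):
    """Validation: reject a bad record at the moment it is entered."""
    if not record.get("part_id"):
        raise ValueError("part_id is required")
    if record.get("quantity", 0) < 0:
        raise ValueError("quantity cannot be negative")
    return record

def cleanse_batch(records, cutoff):
    """Cleansing: repair or drop records in data that already exists."""
    seen = set()
    cleaned = []
    for rec in records:
        # Normalise the key so "ab-100" and "AB-100 " count as the same part.
        part_id = (rec.get("part_id") or "").strip().upper()
        if not part_id:               # incomplete
            continue
        if rec["updated"] < cutoff:   # out of date
            continue
        if part_id in seen:           # duplicate
            continue
        seen.add(part_id)
        cleaned.append({**rec, "part_id": part_id})
    return cleaned

# Hypothetical legacy data showing three typical problems.
legacy = [
    {"part_id": "ab-100", "quantity": 4, "updated": date(2021, 3, 1)},
    {"part_id": "AB-100 ", "quantity": 4, "updated": date(2021, 3, 2)},
    {"part_id": "", "quantity": 1, "updated": date(2021, 1, 5)},
    {"part_id": "CD-200", "quantity": 9, "updated": date(2015, 6, 30)},
]
cleaned = cleanse_batch(legacy, cutoff=date(2020, 1, 1))
```

Here only the first record survives: the second is a duplicate once the key is normalised, the third has no part number, and the fourth is older than the assumed cut-off.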

Many data cleansing tools are available, such as Trifacta, Openprise, WinPure and OpenRefine. It’s also possible to use libraries like pandas for Python, or dplyr for R. The variety of solutions on the market means that manufacturers might want to consult a data analyst to choose the best one for their business case.

How Dirty, Exactly?

Regardless of the solution employed and the type of data being cleansed, the first step is assessing the quality of the existing data. In this phase, a data analyst will assess the company’s needs and establish specific KPIs for clean data. Legacy data is then audited using statistical and database methods to reveal anomalies and inconsistencies.  
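As a rough illustration of what such an audit might measure, the sketch below computes three simple indicators for a batch of sensor readings: completeness, duplicate keys and statistical outliers. The field names, the sample readings and the outlier threshold of five median absolute deviations are all assumptions; a real audit would use the KPIs agreed with the analyst.

```python
from collections import Counter
from statistics import median

def audit(records, key_field, numeric_field, required_fields):
    """Return simple data-quality metrics for a non-empty batch of records."""
    total = len(records)
    complete = sum(
        all(rec.get(f) not in (None, "") for f in required_fields)
        for rec in records
    )
    key_counts = Counter(rec.get(key_field) for rec in records)
    duplicates = sum(n - 1 for n in key_counts.values() if n > 1)

    # Flag outliers with the median absolute deviation (MAD), which is
    # less distorted by the extreme values it is trying to find.
    values = [rec[numeric_field] for rec in records
              if isinstance(rec.get(numeric_field), (int, float))]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    outliers = [v for v in values if mad and abs(v - med) / mad > 5]

    return {
        "completeness": complete / total,
        "duplicate_rows": duplicates,
        "outliers": outliers,
    }

# Hypothetical readings: one blank field, one repeated ID, one bad value.
readings = [
    {"sensor_id": "S1", "temp_c": 71.2, "line": "A"},
    {"sensor_id": "S2", "temp_c": 70.8, "line": "A"},
    {"sensor_id": "S3", "temp_c": 69.9, "line": ""},
    {"sensor_id": "S3", "temp_c": 70.1, "line": "B"},
    {"sensor_id": "S4", "temp_c": 712.0, "line": "B"},
]
report = audit(readings, "sensor_id", "temp_c", ["sensor_id", "temp_c", "line"])
```

For this sample, the report shows a completeness of 0.8, one duplicate row and a single outlier (712.0, probably a misplaced decimal point). Each figure can then be compared against the target set for it.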

This can be done using commercial software that allows the user to specify various constraints. The existing data will be uploaded and tested against these constraints, and data that doesn’t pass the test should be cleansed.  
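Commercial tools each have their own way of declaring constraints, but the underlying idea can be sketched in a few lines of Python. The constraints below (a part number pattern, a non-negative quantity, a fixed list of units) are hypothetical examples, not rules from any specific product.

```python
import re

# Hypothetical constraints: each maps a field name to a predicate.
CONSTRAINTS = {
    "part_id": lambda v: isinstance(v, str)
    and re.fullmatch(r"[A-Z]{2}-\d{3}", v) is not None,
    "quantity": lambda v: isinstance(v, int) and v >= 0,
    "unit": lambda v: v in {"pcs", "kg", "m"},
}

def check(record):
    """Return the names of the fields that violate a constraint."""
    return [field for field, ok in CONSTRAINTS.items()
            if not ok(record.get(field))]

def split_by_constraints(records):
    """Separate records that pass every constraint from those that fail."""
    passed, failed = [], []
    for rec in records:
        violations = check(rec)
        if violations:
            failed.append((rec, violations))
        else:
            passed.append(rec)
    return passed, failed

records = [
    {"part_id": "AB-100", "quantity": 4, "unit": "pcs"},
    {"part_id": "ab100", "quantity": 4, "unit": "pcs"},
    {"part_id": "CD-200", "quantity": -1, "unit": "boxes"},
]
passed, failed = split_by_constraints(records)
```

Keeping the list of violated fields alongside each failed record makes the later cleansing step easier, because it shows why the record failed rather than only that it did.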

During this phase, manufacturers should establish which input fields must be standardised across the company. Standardisation rules can help businesses prevent the build-up of dirty data in that they minimise inconsistencies and facilitate the uploading of clean data into a common ERP.
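A standardisation rule can be as simple as mapping every spelling of a value to one agreed form. The following sketch assumes three fields (part_id, unit, received), a small table of unit aliases and three date formats; all of these are illustrative choices rather than a recommended standard.

```python
from datetime import datetime

# Hypothetical mapping of free-text unit spellings to one standard code.
UNIT_ALIASES = {"pcs": "pcs", "pc": "pcs", "pieces": "pcs",
                "kg": "kg", "kgs": "kg", "kilograms": "kg"}

# Date formats assumed to occur in the legacy data, tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")

def standardise_unit(raw):
    """Return the standard unit code, or None if the spelling is unknown."""
    return UNIT_ALIASES.get(raw.strip().lower())

def standardise_date(raw):
    """Return the date in ISO format, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def standardise(record):
    """Apply the company-wide rules to one record."""
    return {
        "part_id": record["part_id"].strip().upper(),
        "unit": standardise_unit(record["unit"]),
        "received": standardise_date(record["received"]),
    }

row = standardise({"part_id": " ab-100", "unit": "Pieces", "received": "03/02/2021"})
```

The sample row comes out as part AB-100, unit pcs, received 2021-02-03. Values that match no rule become None instead of being guessed, so they can be picked up for review.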

Keep It Clean

After the audit, the cleaning process can begin. Data will pass through a series of automated software programmes that discard what is not compliant with the specified KPIs. The result is then tested for correctness, and incomplete data will be amended manually, if possible. A final quality control phase will ensure that the output data is clean enough to be seamlessly uploaded into the chosen ERP.
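The three stages described above (automated discard, manual amendment, final quality control) might be sketched as follows. The plausibility range, the required fields and the minimum yield of 50 per cent are assumptions chosen for the example, not values from a real plant.

```python
REQUIRED = ("sensor_id", "temp_c", "line")

def is_compliant(rec):
    """Hypothetical compliance rule: a reading must be physically plausible."""
    temp = rec.get("temp_c")
    return temp is None or -40 <= temp <= 150

def is_complete(rec):
    """A record is complete when every required field has a value."""
    return all(rec.get(f) not in (None, "") for f in REQUIRED)

def clean(records):
    """Discard non-compliant rows; set incomplete rows aside for manual review."""
    ready, manual_review = [], []
    for rec in records:
        if not is_compliant(rec):
            continue  # automated discard
        (ready if is_complete(rec) else manual_review).append(rec)
    return ready, manual_review

def quality_gate(ready, total, min_yield=0.5):
    """Final check before upload: enough rows survived, and all are clean."""
    return (total > 0
            and len(ready) / total >= min_yield
            and all(is_compliant(r) and is_complete(r) for r in ready))

batch = [
    {"sensor_id": "S1", "temp_c": 71.2, "line": "A"},
    {"sensor_id": "S2", "temp_c": 70.8, "line": ""},
    {"sensor_id": "S3", "temp_c": 712.0, "line": "B"},
    {"sensor_id": "S4", "temp_c": 69.9, "line": "B"},
]
ready, manual_review = clean(batch)
ok_to_upload = quality_gate(ready, total=len(batch))
```

In this sample, two readings are ready for upload, one is set aside because its production line is blank, and one is discarded as implausible. The yield check is a simple safeguard: if too few rows survive, the problem may lie in the rules rather than in the data.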

However, just like when cleaning our homes, a big clean-up every now and then is not enough. The best approach is to implement a culture of continuous data improvement, distributing tasks among each member of the team. Developing practices that support ongoing data hygiene is the key to success.

About the Author: Neil Ballinger is head of EMEA at automation parts supplier EU Automation. For more information on how to use big data to optimise your business, visit www.euautomation.com
