Spotless Data

What would your home look like if you let dirt and mess accumulate for years? It would be a health hazard, and it would be impossible to find what you need when you need it most. Eventually, you would reach a point where the problem simply couldn’t be overlooked. This is the situation many plant managers face after accumulating huge quantities of manufacturing data over the years.

By implementing a data-driven company culture, manufacturers can improve virtually any aspect of production. Big data can be used, among other things, to maximise energy efficiency, improve the business’s predictive maintenance strategy, and prevent downtime caused by equipment failure. To do this, manufacturers need accurate and reliable data.
 
But when data is collected and accumulated for several years, its quality can start to decline. Dirty or rogue data is data affected by issues such as duplicates, inaccuracies, inconsistencies, and out-of-date information. When plants reach this point, it’s time for a good clean-up. 

Not The Exception

Dirty data is the norm, not the exception. As companies evolve, the data they collect grows in quantity and complexity. High employee turnover, the use of different enterprise resource planning (ERP) solutions across several departments, and a lack of standard guidelines for data entry complicate the situation. For these reasons, achieving perfect data is almost impossible, especially in large organisations.

Data cleansing, or cleaning, is the process of detecting and correcting or eliminating incomplete, inaccurate, out-of-date or irrelevant data. It differs from data validation in that the latter is automatically performed by the system at the time of data entry, while data cleaning is done later on batches of data that have become unreliable.

There are many data cleansing tools available, such as Trifacta, Openprise, WinPure and OpenRefine. It’s also possible to use libraries like pandas for Python, or dplyr for R. The variety of solutions on the market means that manufacturers might want to consult a data analyst to choose the best one for their business case.

How Dirty, Exactly?

Regardless of the solution employed and the type of data being cleansed, the first step is assessing the quality of the existing data. In this phase, a data analyst will assess the company’s needs and establish specific KPIs for clean data. Legacy data is then audited using statistical and database methods to reveal anomalies and inconsistencies.  

This can be done using commercial software that allows the user to specify various constraints. The existing data will be uploaded and tested against these constraints, and data that doesn’t pass the test should be cleansed.  
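
As a rough illustration of testing records against user-specified constraints, the Python sketch below audits a small batch of machine readings using only the standard library. The field names (machine_id, temperature_c, recorded), the limits and the sample records are all hypothetical, chosen for illustration; a commercial tool would let the user define equivalent rules through its own interface.

```python
from datetime import date

# Hypothetical constraints; real ones would come from the company's KPIs.
def has_machine_id(record):
    return bool(record.get("machine_id"))

def temperature_in_range(record):
    value = record.get("temperature_c")
    return value is not None and -40 <= value <= 150

def is_recent(record):
    recorded = record.get("recorded")
    return recorded is not None and recorded >= date(2015, 1, 1)

CONSTRAINTS = {
    "machine_id present": has_machine_id,
    "temperature in range": temperature_in_range,
    "recorded since 2015": is_recent,
}

def audit(records):
    """Map the index of each failing record to the rules it failed."""
    failures = {}
    seen = set()
    for index, record in enumerate(records):
        failed = [name for name, check in CONSTRAINTS.items() if not check(record)]
        # Treat a repeated (machine, date) pair as a duplicate entry.
        key = (record.get("machine_id"), record.get("recorded"))
        if key in seen:
            failed.append("duplicate")
        seen.add(key)
        if failed:
            failures[index] = failed
    return failures

records = [
    {"machine_id": "M-01", "temperature_c": 72.5, "recorded": date(2021, 3, 1)},
    {"machine_id": "M-01", "temperature_c": 72.5, "recorded": date(2021, 3, 1)},  # duplicate
    {"machine_id": "", "temperature_c": 68.0, "recorded": date(2020, 6, 9)},      # missing ID
    {"machine_id": "M-02", "temperature_c": 900.0, "recorded": date(2009, 1, 5)}, # implausible, old
]

report = audit(records)
for index, failed in report.items():
    print(index, failed)
```

Records that appear in the report are the candidates for cleansing; the first record passes every rule and is left alone.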

During this phase, manufacturers should establish which input fields must be standardised across the company. Standardisation rules can help businesses prevent the build-up of dirty data in that they minimise inconsistencies and facilitate the uploading of clean data into a common ERP.
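
To make the idea of standardisation rules concrete, here is a minimal Python sketch that normalises three input fields so that differently typed entries for the same item become identical. The fields (part_number, unit, ordered), the unit aliases and the accepted date formats are assumptions made for the example, not rules taken from any particular ERP.

```python
from datetime import datetime

# Hypothetical company-wide rules; real ones would be agreed during the audit.
UNIT_ALIASES = {"kg": "kg", "kgs": "kg", "kilogram": "kg", "kilograms": "kg"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")

def standardise_part_number(raw):
    """Remove spaces and hyphens and upper-case, so 'ab-123' matches 'AB 123'."""
    return raw.replace(" ", "").replace("-", "").upper()

def standardise_unit(raw):
    """Map known spellings of a unit to one canonical form."""
    key = raw.strip().lower().rstrip(".")
    return UNIT_ALIASES.get(key, key)

def standardise_date(raw):
    """Try each accepted input format and return an ISO 8601 date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # unrecognised format: leave for manual review

def standardise(record):
    return {
        "part_number": standardise_part_number(record["part_number"]),
        "unit": standardise_unit(record["unit"]),
        "ordered": standardise_date(record["ordered"]),
    }

entries = [
    {"part_number": "ab-123", "unit": "Kgs", "ordered": "03/02/2021"},
    {"part_number": "AB 123", "unit": "kg.", "ordered": "2021-02-03"},
]
cleaned = [standardise(entry) for entry in entries]
print(cleaned[0] == cleaned[1])
```

Once both entries share one representation, a duplicate check can recognise them as the same order line.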

Keep It Clean

After the audit, the cleaning process can begin. Data will pass through a series of automated software programmes that discard what is not compliant with the specified KPIs. The result is then tested for correctness, and incomplete data will be amended manually, if possible. A final quality control phase will ensure that the output data is clean enough to be seamlessly uploaded into the chosen ERP.
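
The stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vibration_mm_s field, its 0 to 50 range and the sample batch are hypothetical stand-ins for whatever KPIs the audit produced.

```python
REQUIRED_FIELDS = ("machine_id", "vibration_mm_s")  # hypothetical schema

def is_compliant(record):
    """Hypothetical KPI: a reading, if present, must be a number from 0 to 50."""
    value = record.get("vibration_mm_s")
    return value is None or (isinstance(value, (int, float)) and 0 <= value <= 50)

def is_complete(record):
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)

def clean(records):
    """Split a batch into clean records, records for manual review, and discards."""
    clean_records, manual_review, discarded = [], [], []
    for record in records:
        if not is_compliant(record):
            discarded.append(record)      # automated step: drop non-compliant data
        elif not is_complete(record):
            manual_review.append(record)  # incomplete: a person amends it if possible
        else:
            clean_records.append(record)
    return clean_records, manual_review, discarded

def quality_check(clean_records):
    """Final gate before upload: every record must be complete and compliant."""
    return all(is_complete(r) and is_compliant(r) for r in clean_records)

batch = [
    {"machine_id": "M-01", "vibration_mm_s": 2.4},
    {"machine_id": "M-02", "vibration_mm_s": -7.0},  # impossible value: discarded
    {"machine_id": "M-03", "vibration_mm_s": None},  # incomplete: manual review
]
ready, review, dropped = clean(batch)
print(len(ready), len(review), len(dropped))
print(quality_check(ready))
```

Keeping the manual-review list separate from the discards means incomplete records are not silently lost while they wait for someone to amend them.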

However, just like when cleaning our homes, a big clean-up every now and then is not enough. The best approach is to implement a culture of continuous data improvement, distributing tasks among all members of the team. Developing practices that support ongoing data hygiene is the key to success.

About the Author: Neil Ballinger is head of EMEA at automation parts supplier EU Automation. For more information on how to use big data to optimise your business, visit www.euautomation.com

Image: Unsplash
