The hunger for data has become insatiable, and rightfully so. This digital gold is changing the face of business and, frankly, society in countless ways. Think more accessible e-health solutions and public services, automated business workflows, and even easier access to knowledge, all possible thanks to data. Yet the insatiable appetite comes with its own set of challenges.

Over the past few years, the European Union has fined companies nearly €3 billion across more than 1,400 GDPR violations. In fact, data protection fines reached an all-time high in 2023. This isn't just a corrective measure by the EU; it's a sign that privacy deserves more thorough discussion in boardrooms and server rooms alike.

Moreover, as AI embeds itself further into our day-to-day lives and unsettles the job market, a new chapter unfolds: one in which creators from different backgrounds protest the way AI collects data. In an age where every click leaves a trace and every byte holds a story, the biggest issue is no longer the rate of progress but our ability to direct it.

The Heart of the Problem

True to form, (1) ethics often lag behind progress. The field of data science expands at such a rapid pace that many practitioners overlook the ethical dimensions of their work. In fact, many leaders operate behind closed doors under the notion that "ethical constraints" are holding them back. Business and technical leaders debate product innovation, UX, resources, and strategy, but rarely data compliance. Needless to say, pursuing profitability and accelerated growth can sideline ethical codes. Striking a balance between the rate of technological progress and ethical conduct is therefore crucial to navigating privacy challenges successfully.

Furthermore, privacy is not just a technological problem requiring technological solutions. It's an (2) organizational challenge that involves people from the top to the bottom of the hierarchy. If one link in the chain falls short, a whole array of privacy issues can surface. Therefore, just like accessibility, data compliance works best when it is treated not as an initiative but as a guiding compass.

Finally, consider the (3) challenge of limited resources. Companies operating on tight budgets may find it difficult to invest in the technology and expertise needed for strong data protection. Small to mid-sized businesses in particular face obstacles such as the lack of a proper incident response plan, occasional human error, and insufficient cybersecurity measures.

Now, let’s explore how organizations can improve the way they handle data, in line with the latest advice from regulators and experts.

#1 Should You Entrust a Toddler to Do Serious Calculus?

In many companies, a Data Protection Officer (DPO) makes sure the company follows regulations when handling personal data. This holds for smaller and larger companies alike. The DPO typically has a legal background and is, ideally, fairly familiar with the technical aspects of the role.

With that said, many companies rely on a single mid-level to senior DPO to make these decisions. That's a great burden on the shoulders of one person, who can be misinformed into deciding what's okay and what's not. Even in Europe, where data regulation is taken seriously, many companies still leave it up to a mid-level professional who might not fully grasp the subject. This is far from ideal, especially when data protection and privacy rights are at stake. It's like expecting someone who was a great physics student in high school to solve genuinely hard abstract algebra.

Some big companies are setting an example by doing it differently. They have a dedicated group called an Institutional Review Board (IRB), made up of members who understand both data ethics and the technicalities. It's like a C-level board of directors, but for privacy concerns. Software engineers directly involved in projects, cybersecurity experts, data engineers, compliance experts, and even outside consultants are all part of this formalized or ad hoc team. This way, it's not just one person waving the green and red flags.

Now, for some companies, this may be too much to bother with, and reasonably so. Bringing in a third-party law firm to manage compliance is another safe bet. These professionals excel at deciphering legal intricacies and have experience helping companies adhere to regulations.

The European Data Protection Board, sensing the gravity of the situation, raised a red flag in March 2023. It expressed concerns about mismatched DPOs steering the ship without a profound overview of the issue. Through a series of questionnaires and a coordinated review of the position and tasks of data protection officers, it hopes to nudge companies toward better handling of people's sensitive data.

#2 User Consent And Carefully Handling Second-Hand Data

We live in a world where computers can understand our personalities almost as accurately as our closest friends. As shown by Youyou Wu, Michal Kosinski, and David Stillwell's research, computer estimations of personality based on Facebook likes, covering openness, agreeableness, extraversion, conscientiousness, and neuroticism, are nearly as accurate as assessments made by a spouse. And no, this is not a slight to your loved ones…

Now, imagine the gravity of the issue when companies decide to repurpose this data, or simply sell it and forget it. Where does that leave people's private thoughts and actions? The need to protect this data and use it only for the purpose clarified in the user consent is therefore paramount.

For companies, the approach to handling collected data is simple: be clear, ask for permission, and clarify the purpose and the data retention period. You should explain exactly why you want users' information in words they can easily understand, give them the option to decline, and let them change their decision later. Companies also need to provide clear information about where the data is stored, who can access it, whether and when it will be anonymized, and the timeline for its destruction. As a result, numerous companies may need to modify their current protocols and arrangements, potentially incurring additional costs.

This leads us to the next issue: repurposing data. Many companies use customer databases to offer additional services, but this is known to lead to problems. In 2021, the Information Commissioner's Office in the UK, responsible for promoting data privacy, accused Virgin Media of breaking customer privacy rules. Virgin Media had sent 1,964,562 emails to inform customers about freezing subscription prices, which was fine. However, it also used these emails to advertise other offerings. Because 450,000 people on the list had said they didn't want to receive these ads, the regulator fined Virgin £50,000 for not honoring that agreement. This is a classic case of using previously gathered data to make money without permission. Companies need to think carefully and discuss this internally before making any risky choices.
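The "be clear, ask for permission, clarify purpose and retention" principle can be sketched as a simple consent record that every data use is checked against. All names below are hypothetical; this is a minimal illustration of the idea, not a compliance implementation:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Consent:
    user_id: str
    purposes: set          # purposes the user explicitly agreed to
    granted_on: date
    retention_days: int    # how long the data may be kept
    withdrawn: bool = False  # the user can change their mind later

    def allows(self, purpose: str, today: date) -> bool:
        """Data may be used only for an agreed purpose, within the
        retention window, and only while consent is not withdrawn."""
        if self.withdrawn or purpose not in self.purposes:
            return False
        return today <= self.granted_on + timedelta(days=self.retention_days)

consent = Consent("u-123", {"billing"}, date(2023, 1, 1), retention_days=365)
consent.allows("billing", date(2023, 6, 1))    # agreed purpose, in window
consent.allows("marketing", date(2023, 6, 1))  # repurposing: not agreed to
```

Gating every use through a check like `allows()` is exactly what would have flagged the Virgin Media situation: the "marketing" purpose was never granted by those 450,000 recipients.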

#3 Anonymize, But Really…

The challenge for many companies lies in striking the balance between too little and too much data anonymization. Striking this balance is crucial: insufficient anonymization violates government regulations, while excessive measures render the data useless for AI development and marketing.

Now the question is: why is data anonymization so crucial to the ethics of data handling? Well, as more small to mid-sized companies train algorithms for various purposes, proper anonymization becomes the pivotal ethical concern. This urgency intensifies when AI is deployed for public services, healthcare, and beyond. As the world entrusts AI with crucial tasks, maintaining robust anonymization practices becomes paramount for ethical data handling.

Various anonymization techniques exist, ranging from aggregating and approximating data to pseudonymizing variables with nonrepeating values. However, even seemingly well-anonymized data can pose privacy risks. Researchers have demonstrated the ability to identify individuals using seemingly innocuous details like gender, birth date, and postal code. Netflix's attempt at anonymizing customer movie ratings failed when researchers, using a third-party dataset, identified 84% of customers. This underscores the importance of making sure data anonymization is actually done right.
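To make the quasi-identifier risk concrete, here is a minimal sketch (hypothetical names and data) that pseudonymizes the direct identifier with a salted hash and coarsens the birth-date and postal-code fields of the kind researchers exploited. As the Netflix case shows, coarsening alone is no guarantee, but it illustrates the basic moves:

```python
import hashlib

# Hypothetical secret; in practice, store the salt separately from the data.
SALT = b"rotate-me-and-keep-me-out-of-the-dataset"

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a nonreversible salted hash."""
    return hashlib.sha256(SALT + identifier.encode()).hexdigest()[:16]

def generalize(record: dict) -> dict:
    """Drop direct identifiers and coarsen quasi-identifiers."""
    return {
        "id": pseudonymize(record["email"]),
        "birth_year": record["birth_date"][:4],           # full date -> year
        "postal_area": record["postal_code"].split()[0],  # keep area prefix only
        "rating": record["rating"],
    }

row = {"email": "ann@example.com", "birth_date": "1987-05-12",
       "postal_code": "SW1A 1AA", "rating": 4}
safe = generalize(row)  # birth_year "1987", postal_area "SW1A", no email
```

Note that the salted hash is a pseudonym, not anonymity: whoever holds the salt can re-link records, which is why regulators treat pseudonymized data as still personal.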

To enhance privacy protection, a technique called differential privacy, employed by startups like Sarus, adds an extra layer by preventing algorithms from disclosing specific records. By adding controlled noise to data, it prevents the identification of individual records while still allowing valuable insights to be extracted. And although differential privacy is proven to work, this is not to say that any one method is the only way to go.
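A minimal sketch of the idea (illustrative only, not Sarus's implementation) is the Laplace mechanism: noise calibrated to the query's sensitivity divided by the privacy budget epsilon is added to a count, so the result stays useful in aggregate while no single person's presence can be pinned down:

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise via inverse-CDF of a uniform draw."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Differentially private count: true count plus calibrated noise."""
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0  # one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(42)
ages = [34, 29, 41, 52, 38, 47, 25, 60]
noisy = private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
# noisy hovers around the true count of 4 without ever revealing it exactly
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon means more accurate answers. Choosing that trade-off is precisely the "too little versus too much" balance discussed above.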

In fact, a myriad of state-of-the-art techniques help companies keep data both protected and useful. The critical factor is to do your research and add another layer of protection that reduces the risk of sensitive information being exposed.
