Regulatory data audit: exploring current legal frameworks
Even in these early years of the formation of big data governance strategies and legislation across countries, proprietary company data is subject to inspection and audit by standards bodies depending on the jurisdiction, especially in the European Union.
Since data-related regulation is in a nascent phase, and coherent, globally enforced standards remain a distant prospect, it's not possible here to cover every regulatory process to which your data may be liable and for which it may one day be audited, whether under existing rules or under new ones set to emerge over the next decade.
Nonetheless, we can examine some of the existing statutes and review others that are coming into focus during this period. For more insights into the specific requirements that apply to your region and industry, however, consider relying on data management services.
Europe’s take on data: the GDPR
The EU General Data Protection Regulation (GDPR) allows fines of up to 4% of global annual revenue for companies that breach its rules on data protection, while the draft AI regulation proposed by the European Commission in April 2021 promises a bite of up to 6% of global revenue for companies that contravene subsequent laws derived from it.
Eventually, companies whose data impinges on European borders (even indirectly, such as through geo-oblivious cloud-hosted services) will be subject to both of these frameworks, each of which specifically deals with data provenance and governance (but not in a way that is necessarily consistent).
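To make the scale of these penalties concrete, here is a minimal sketch of the fine ceilings. It assumes the "whichever is higher" rule of GDPR Article 83(5) (EUR 20 million or 4% of worldwide annual turnover) and the EUR 30 million or 6% ceiling in the April 2021 draft AI regulation. The function names are hypothetical, and the result is an upper bound for the most serious infringements, not a prediction of any actual fine.

```python
# Illustrative sketch only. The fixed amounts and percentages are assumptions
# taken from GDPR Article 83(5) and the April 2021 draft AI regulation.
# Amounts are whole euros, so integer arithmetic avoids rounding surprises.

def fine_ceiling(annual_turnover_eur: int, percent: int, fixed_eur: int) -> int:
    """Return the higher of a fixed amount and a percentage of turnover."""
    return max(fixed_eur, annual_turnover_eur * percent // 100)


def gdpr_ceiling(annual_turnover_eur: int) -> int:
    """Upper bound under the GDPR: EUR 20 million or 4%, whichever is higher."""
    return fine_ceiling(annual_turnover_eur, percent=4, fixed_eur=20_000_000)


def draft_ai_ceiling(annual_turnover_eur: int) -> int:
    """Upper bound under the 2021 draft AI rules: EUR 30 million or 6%."""
    return fine_ceiling(annual_turnover_eur, percent=6, fixed_eur=30_000_000)


if __name__ == "__main__":
    for turnover in (100_000_000, 1_000_000_000):
        print(turnover, gdpr_ceiling(turnover), draft_ai_ceiling(turnover))
```

Note that for smaller companies the fixed amount, not the percentage, sets the ceiling.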
The GDPR is being considered around the world as a template for data privacy frameworks. In the US, the GDPR-style California Consumer Privacy Act of 2018 (CCPA) led the way for formal data oversight frameworks, with frequent calls since for the country to match Europe's lead. Therefore, adhering to GDPR guidelines now may be the best preparation for future regulatory inspections, since even the EU's draft AI regulations cover a lot of the same territory.
Data audit according to the GDPR
The European Data Protection Supervisor (EDPS) provides various insights into the rationale and requirements for an on-the-spot or scheduled data audit, including a helpful overview, an inspection policy framework, and a set of general guidelines to follow.
The GDPR guidelines for a data audit are divided into four sections: lawful basis and transparency; data security; privacy rights; and accountability/governance.
Here we'll look at the parts of the European Union's advice on the GDPR that relate directly to data auditing.
Accountability: The GDPR requires companies with more than 250 employees (or companies of any size, if they handle sensitive data) to maintain an up-to-date list of processing activities for inspection. A data protection impact assessment (for which an official template is available) is the best way to gauge a company's obligations in this respect.
Justification: Additionally, a company must have legal justification for the data it records, retains, or processes, as outlined in Article 6 of the GDPR.
Disclosure: Article 12 requires robust transparency mechanisms to inform people that their data is being collected, and to define who may access the data and how the data is being secured.
Security: Stored data must be handled in line with the data protection principles laid out in Article 5 of the GDPR.
Encryption: The GDPR calls for encryption or pseudonymization of stored data as appropriate safeguards (Article 32), and an organization may need to provide evidence that these have been implemented.
Internal access: Besides the data itself, operational security will be included in any data audit, to establish the existence of a strong internal security policy.
Breach disclosure: Undisclosed breaches revealed by a data audit will incur some of the strongest penalties, depending on the extent to which it can be established that the company was aware of the breach.
Data Officer: The GDPR requires that someone in the organization be accountable for compliance, and authorized to make necessary changes in policy. In certain circumstances, a Data Protection Officer must be employed and dedicated to these matters.
Sharing agreements: Where company data is disclosed to third parties, agreements must be in place, and the EU provides a draft template for this purpose.
User control of existing data: Where company data holds material on individuals, a data audit will need to demonstrate that the end-user can access and correct data held on them. Processes must also be in place to stop sharing or delete the data, if requested.
Oversight of automated processes: This section of the GDPR overlaps most with the coming European AI regulations, mandating that any automated decision process with a legal or 'similarly significant' effect on an individual's rights must be backed by human-led processes in the event of a challenge from end-users.
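As an illustration of the pseudonymization mentioned under Encryption above, here is a minimal sketch that replaces direct identifiers with a keyed hash (HMAC-SHA256) from Python's standard library. This is one possible approach rather than a method prescribed by the GDPR. The function names, field names, and sample record are hypothetical, and the sketch assumes the secret key is stored separately from the data, since anyone holding both could re-link records to individuals.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed HMAC-SHA256 digest (hex string)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_record(record: dict, id_fields: set, secret_key: bytes) -> dict:
    """Return a copy of the record with the listed identifier fields replaced."""
    return {
        field: pseudonymize(str(value), secret_key) if field in id_fields else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    # Hypothetical key and record, for demonstration only. In practice the key
    # would come from a secrets manager, never from source code.
    key = b"example-key-kept-outside-the-dataset"
    customer = {"email": "jane@example.com", "country": "DE", "orders": 3}
    print(pseudonymize_record(customer, {"email"}, key))
```

Because the same input and key always produce the same digest, records can still be joined across tables after pseudonymization, while non-identifying fields are left untouched.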
The state of data regulation in the UK and US
In the UK, the GDPR was copy-pasted into national law at the time of Brexit, with no obligation to retain the European standards in the future. Nonetheless, the Joint Information Systems Committee (JISC) offers the Data Audit Framework Development (DAFD) guideline document as a policy guide and preparatory checklist for companies researching data audit liabilities.
It's uncertain when a specific machine learning-related regulation will come to the US. Currently, a company's data liability is still largely subject there to older statutes, such as the data protection component of the Health Insurance Portability and Accountability Act (HIPAA); the Gramm-Leach-Bliley Act (GLBA, for financial services); the US Privacy Act of 1974; the Children’s Online Privacy Protection Act (COPPA, which has at least specifically addressed issues around data retention in recent years); and, in the most general terms possible, section 5 of the 1914 Federal Trade Commission Act.
A growing focus on data audit
Two growing trends are set to contribute to an increased demand for data inspection over the next ten years: the emerging ability of new techniques, such as model inversion, to identify data that was used to train a machine learning system, and the aforementioned increase in data-related regulation.
Indeed, while current data audit legislation is at a relatively early stage and still suffers from general fragmentation at the international level, the GDPR may serve as a template and pave the way for future governance models.