The success of an artificial intelligence (AI) algorithm depends in large part upon trust, yet many AI technologies function as opaque ‘black boxes.’ Indeed, some are intentionally designed that way. This charts a mistaken course.

Trust in AI is engendered through transparency, reliability and explainability. To achieve those ends, an AI application must be trained on data of sufficient variety, volume and verifiability. Given the criticality of these factors, it is unsurprising that regulatory and enforcement agencies pay particular attention to whether personally identifiable information (“PII”) has been collected and used appropriately in the development of AI. Thus, as a threshold matter, when AI training requires PII (or even data derived from PII), organizations must address whether such data have been obtained and utilized in a permissible, transparent and compliant manner.

While regulatory and enforcement mandates require consistency, transparency and precision in the acquisition and use of sensitive data, the current environment governing AI, especially in health care, is an incomplete, and sometimes contradictory, patchwork of laws and regulations. This patchwork comprises myriad federal and state laws and regulations addressing different aspects of risk, along with international law and common law. For example, healthcare data are often regulated under the Health Insurance Portability and Accountability Act (“HIPAA”), but HIPAA does not apply to AI technology companies in many circumstances. To fill that gap in the technology space, the Federal Trade Commission (“FTC”) has taken action against some companies for lax privacy and data protection practices.

As development of complex AI models often requires years of investment in money, time and other resources, it is critical that organizations thoroughly comply with privacy laws throughout the development process. Furthermore, once developed and deployed, AI models cannot easily be replicated or replaced. Therefore, any gap in compliance with respect to obtaining proper authority to acquire or utilize training data could lead to significant (perhaps existential) consequences for an organization. Recent trends have illustrated the power of regulators and enforcement agencies to require disgorgement of data from AI or even wholesale AI destruction as a remedy for improper data practices.

Federal Trade Commission Authority and Oversight

The FTC increasingly has made it clear that its enforcement priorities will focus on consumer privacy and data protection. The FTC has broad enforcement authority under Section 5(a) of the Federal Trade Commission Act (“FTC Act”), which prohibits unfair or deceptive acts or practices in or affecting commerce. The FTC has leveraged the FTC Act to enforce against entities alleged to have engaged in deceptive or unfair privacy practices, pursuing investigations and cases based on allegations that companies failed to: (1) sufficiently notify consumers about privacy practices; (2) adhere to representations made in privacy policies; and (3) implement reasonable security safeguards to protect PII.

In several recent cases involving allegations that companies improperly collected and used personal data to train AI applications, the FTC has sought disgorgement of the personal data from the AI application, even to the extent of demanding destruction of the AI algorithm itself. One of the first FTC enforcement actions of this type arose in 2019, when the FTC ordered Cambridge Analytica to destroy AI derived from PII collected from consumers on an allegedly deceptive basis. On the heels of that action, in 2021, the FTC similarly required a photo app developer to destroy its facial recognition AI.

The FTC alleged in that case that the company deceived or otherwise misled consumers about its practices related to collecting, retaining and using user photos, even after deactivation of user accounts. In announcing the settlement, Rohit Chopra, former FTC Commissioner, stated:

“First, the FTC’s proposed order requires Everalbum to forfeit the fruits of its deception. Specifically, the company must delete the facial recognition technologies enhanced by any improperly obtained photos. Commissioners have previously voted to allow data protection law violators to retain algorithms and technologies that derive much of their value from ill-gotten data. This is an important course correction.”

Clearly, the FTC’s position is that, ‘but for’ the use of ill-gotten PII to train the AI, such AI would not exist in permissible form, and violating organizations should not be permitted to benefit from such technologies in the market.

More recently, the FTC continued this “algorithmic disgorgement” trend by seeking three remedies against a company that allegedly obtained and used children’s PII in violation of the Children’s Online Privacy Protection Act (“COPPA”). In that case, the court-approved settlement required the company to: (1) destroy any children’s PII collected without verifiable parental consent; (2) destroy any AI developed using such PII; and (3) pay a $1.5 million civil monetary penalty. Notably, the FTC was able to obtain civil monetary penalties under COPPA that would not otherwise be available under the FTC Act.

Preserving Privacy Throughout AI Development

This enforcement trend has magnified the risk of severe penalties, up to and including destruction of AI applications compiled or trained through improper data practices. It is therefore imperative that AI developers ensure they have adequate data rights to permit the collection and use of PII for purposes of AI development and training. There are three key steps an AI company should take to ensure regulatory and legal compliance.

First, AI companies should consider establishing a Data Governance Program. A robust Data Governance Program should: (1) establish the policies, procedures, and other compliance infrastructure necessary to identify key data sources; (2) determine data rights; and (3) maintain oversight of data used during AI training.

Second, an organization should undertake risk assessment and business continuity processes to evaluate which AI models could be at risk based on data rights, and establish alternative plans in the event that an AI model is ordered to be disgorged of PII or otherwise destroyed.

Third, AI companies should conduct adequate diligence of the third-party vendors and data sources with whom they partner to develop and train AI. By engaging in these activities, AI companies can develop a solid understanding of their data rights, can document that data were used appropriately, and can manage the “algorithmic disgorgement” risks emerging from regulators and enforcement agencies.
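To make the gatekeeping idea behind these steps concrete, the sketch below shows one way a data governance program might be operationalized in code. It is purely illustrative: the record fields, names and consent flags are our hypothetical constructs, not requirements drawn from any statute or FTC order. The sketch tracks the provenance of each training record and excludes PII that lacks documented consent, so that any later deletion or disgorgement obligation can be traced back to specific sources.

```python
from dataclasses import dataclass

# Hypothetical illustration only: field and function names are ours,
# not drawn from any regulation or enforcement order.

@dataclass
class DataRecord:
    record_id: str
    source: str               # originating vendor or data source
    contains_pii: bool
    consent_documented: bool  # e.g., verifiable parental consent under COPPA

def filter_training_data(records):
    """Admit only records whose data rights are documented.

    PII lacking documented consent is excluded and flagged for review,
    preserving a provenance trail in case deletion is later required."""
    admitted, excluded = [], []
    for record in records:
        if record.contains_pii and not record.consent_documented:
            excluded.append(record)   # candidate for deletion/disgorgement
        else:
            admitted.append(record)
    return admitted, excluded

records = [
    DataRecord("r1", "vendor_a", contains_pii=True, consent_documented=True),
    DataRecord("r2", "vendor_b", contains_pii=True, consent_documented=False),
    DataRecord("r3", "vendor_a", contains_pii=False, consent_documented=False),
]
admitted, excluded = filter_training_data(records)
print([r.record_id for r in admitted])   # ['r1', 'r3']
print([r.record_id for r in excluded])   # ['r2']
```

Because each excluded record retains its `source` field, an organization applying this pattern could also identify which vendor relationships produced the problematic data, supporting the third-party diligence step described above.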

To learn more about the legal risks of, and solutions for, preserving privacy in AI development, please join us at Epstein Becker Green’s virtual briefing, Explainable Artificial Intelligence and Transparency: Legal Risks and Remedies for the “Black Box” Problem, on June 9 from 1:00 – 4:00 p.m. (ET).