OpenEvidence v. Pathway: The Legal Battle Over AI Reverse Engineering

OpenEvidence v. Pathway: The Legal Battle Over AI Reverse Engineering

Can generative AI models like ChatGPT be “reverse engineered” in order to develop competing models? If so, will this activity be deemed legal reverse engineering or illegal trade secret misappropriation? This question is set to be tested in a recent trade secret lawsuit filed by OpenEvidence Inc. against Pathway Medical Inc.

Guest Post by Professor Camilla Hrdy (Rutgers Law)

Can generative AI models like ChatGPT be “reverse engineered” in order to develop competing models? If so, will this activity be deemed legal reverse engineering or illegal trade secret misappropriation? I have now written a few articles on this topic and it is clear that the lines are not clearly defined.

In software cases, courts have permitted distributors of software to assert trade secrecy in their source code, even after widespread public distribution, because the code is typically compiled into “object code,” making it difficult to “decompile,” and so still legally secret. However, a bunch of questions jumped out at me: Contractual precautions will be important, both for establishing that the putative secret is “not readily ascertainable” and for establishing that plaintiff took “reasonable” secrecy precautions.

OpenEvidence does have a “Terms of Use.” It states, among other things, that users agree the “software” “contains proprietary and confidential information that is protected by applicable intellectual property and other laws.” However, I don’t think this Terms of Use is sufficient to generate an express duty of confidentiality. It’s a mass-market, non-negotiated “contract of adhesion.”

Instead, OpenEvidence will have to argue under an “improper means” theory. The question will be whether defendant used “improper means” to access OpenEvidence trade secrets. This case raises some big picture questions about protecting generative AI models as trade secrets.

The Legal Landscape

The main types of trade secrets that generative AI companies might be able to protect include algorithms, code, training data, and aspects of the models’ overall system architecture, including how it was trained, developed, implemented, and “fine-tuned.”

The primary trade secret identified in OpenEvidence’s Complaint is the system prompt code. This refers to the instructions given to a generative AI model in the order to guide its responses to users, customize the model, and enhance its performance.

Contractual Precautions

Contractual precautions will be important, both for establishing that the putative secret is “not readily ascertainable” and for establishing that plaintiff took “reasonable” secrecy precautions.

However, I don’t think OpenEvidence’s Terms of Use is sufficient to generate an express duty of confidentiality. It’s a mass-market, non-negotiated “contract of adhesion.”

Implications

This case raises some big picture questions about protecting generative AI models as trade secrets.

First, how hard is it, in fact, to reverse engineer generative AIs? A few years ago, people apparently thought reverse engineering generative AIs would be hard if not impossible. But now it’s not so clear.

Second, how will courts view data extraction through strategic prompting in order to learn about how a particular model was developed? Will they see this as akin to buying a product on the open market and picking it apart, i.e., traditional legal reverse engineering? Or will they view this as acquisition by improper means, like hacking into a computer or flying a plane over an unfinished plant to see what’s inside?

Third, how much deference will courts give to contracts? Can attaching a terms of use that prohibits reverse engineering turn otherwise-lawful reverse engineering into acquisition by improper means?