We live in a digital world, and as more inventions and institutions come to rely on computer programming, intellectual property law is increasingly being applied to code. Given the extensive nature of programming languages and code execution, analyzing and comparing source code - the lines of commands that are compiled into a computer program - or complete software is a tedious manual task and one subject to human error. As a result, new tools for assessing coded materials in IP litigation are emerging.
Bob Zeidman, founder and president of Zeidman Consulting and Software Analysis and Forensic Engineering (SAFE), has pioneered multiple litigation tools for forensic engineering and software plagiarism detection as well as source code and software analysis methods and standards. With a formal education in physics and electrical engineering, Zeidman says he’s always had an entrepreneurial spirit and found his way into IP law when he was asked to help analyze software for a case and realized he could automate much of the work. In a recent interview, Zeidman discussed the growing role of code analysis tools in IP litigation and his work in the field.
Q&A with Bob Zeidman
What is the role of software analysis and comparison tools in IP litigation? Can you explain how this area got started and how it is evolving?
BZ: About 20 years ago, I was asked by an attorney to compare some software source code for a trade secret case. It was long, tedious work and although I got paid per hour, I decided to write a utility program to automate the task. I then began telling colleagues about the program and getting their input. I kept refining the algorithms and when lawyers heard about it they started calling me. Eventually, I released the program as CodeSuite, a set of utility programs for analyzing software for IP litigation from my new company Software Analysis and Forensic Engineering (SAFE Corporation).
Professor Mike Flynn helped me formalize the theory behind “source code correlation” and I published several articles about the algorithms and mathematics in academic journals. There were already a few programs from universities that compared software code to find “software plagiarism” (a term I avoid because it’s loaded with legal implications), but they were experimental and frankly didn’t work well and couldn’t stand up in litigation. SAFE Corporation started selling software and training IP consultants.
How can students and lawyers educate themselves about this area of IP litigation and new developments in it?
BZ: My book The Software Detective’s Handbook: Measurement, Comparison, and Infringement Detection is a good reference. In the beginning there’s a guide for which chapters are best for lawyers, programmers, and mathematicians. The book explains programming and software code to lawyers, it explains intellectual property to programmers, and it explains the algorithms and mathematics of code similarity and code correlation to mathematicians. Of course anyone can read the entire book and glean a lot of information in all of these fields.
SAFE has created tools used in roughly 100 software copyright and trade secret cases to date - can you give some detail on what those tools are, why such tools are important, and how they’ve been used?
BZ: Our main tool is CodeSuite, and the most commonly used function of CodeSuite in litigation is CodeMatch, which compares thousands of source code files in multiple directories and subdirectories to determine which files are the most highly correlated. This can be used to significantly speed up the work of finding source code plagiarism, because it can direct the examiner to look closely at a small amount of code in a handful of files rather than thousands of combinations.
CodeMatch divides the software source code of two programs into basic elements - statements, comments and strings, identifiers, and instruction sequences - before comparing them. It compares these elements separately, deriving a correlation score between 0 and 100 for each element, and then combines the individual correlation scores into an overall correlation score. CodeMatch then generates a database of the most highly correlated files in the two programs being compared. The SourceDetective function of CodeSuite can then be used to search the internet and eliminate third-party code such as open source code or common code elements. After this filtering, a software forensics expert only needs to examine the remaining correlation, using a process that I developed to filter out any remaining code elements that are not protectable by copyright. This equates to the well-known abstraction-filtration-comparison test first introduced in Computer Associates v. Altai. At the end of the process, if any correlation still exists between the two programs, then one was copied from the other.
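To make the idea of element-by-element comparison concrete, here is a minimal Python sketch. It is not CodeMatch's algorithm, which the interview does not spell out. The regular expressions, the set-overlap and sequence-similarity measures, the unweighted average used to combine scores, and every function name (`split_elements`, `correlate`, `rank_pairs`) are assumptions chosen for illustration, and the parsing only handles simple C-like text with `//` comments and double-quoted strings.

```python
import re
from difflib import SequenceMatcher

# Assumption: "//" line comments and double-quoted strings only.
COMMENT_OR_STRING = re.compile(r'//[^\n]*|"(?:\\.|[^"\\\n])*"')
IDENTIFIER = re.compile(r'[A-Za-z_]\w*')


def split_elements(source):
    """Split source text into the four element kinds named in the interview."""
    comments_strings = COMMENT_OR_STRING.findall(source)
    code_only = COMMENT_OR_STRING.sub(' ', source)
    # One statement per non-blank line, with whitespace normalized.
    statements = [' '.join(line.split())
                  for line in code_only.splitlines() if line.strip()]
    identifiers = IDENTIFIER.findall(code_only)
    # Crude stand-in for an instruction sequence: first word of each statement.
    instructions = []
    for statement in statements:
        match = IDENTIFIER.search(statement)
        if match:
            instructions.append(match.group(0))
    return {
        'statements': statements,
        'comments_strings': comments_strings,
        'identifiers': identifiers,
        'instruction_sequence': instructions,
    }


def set_score(a, b):
    """Overlap of unique items, scaled to 0-100 (order is ignored)."""
    a, b = set(a), set(b)
    return 100.0 * len(a & b) / len(a | b)


def sequence_score(a, b):
    """Order-sensitive similarity, scaled to 0-100."""
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


def correlate(source_a, source_b):
    """Return per-element scores and an overall score for two source texts."""
    elems_a, elems_b = split_elements(source_a), split_elements(source_b)
    scores = {}
    for name in elems_a:
        a, b = elems_a[name], elems_b[name]
        if not a and not b:
            continue  # element absent from both files: no evidence either way
        if name == 'instruction_sequence':
            scores[name] = sequence_score(a, b)
        else:
            scores[name] = set_score(a, b)
    # Assumption: combine with an unweighted mean of the available scores.
    overall = sum(scores.values()) / len(scores) if scores else 0.0
    return scores, overall


def rank_pairs(files_a, files_b):
    """Rank every (file in A, file in B) pair by overall score, highest first."""
    pairs = [(correlate(src_a, src_b)[1], name_a, name_b)
             for name_a, src_a in files_a.items()
             for name_b, src_b in files_b.items()]
    return sorted(pairs, reverse=True)
```

Calling `rank_pairs` with two dictionaries that map file names to file contents returns a list sorted by overall score, so an examiner could start with the handful of pairs at the top rather than reading every combination. Filtering out third-party or unprotectable code, as described above, would still be a separate step.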
Zeidman Consulting provides expert witness services for intellectual property litigation - can you explain this a bit? What sorts of trials have you been involved in and in what capacity?
BZ: We train experts in the use of CodeSuite and certify them, so that a client knows that these people can efficiently use CodeSuite and correctly produce results for court. These consultants often work through Zeidman Consulting or they work independently. Zeidman Consulting has provided consultants and expert witnesses for more than 180 court cases involving billions of dollars in disputed intellectual property including Brocade v. A10 Networks, for which I testified at trial; ConnectU v. Facebook, made famous by the Academy Award-winning movie “The Social Network”; Texas Instruments v. Samsung Electronics, which resulted in an award to TI of over $1 billion; and the landmark Oracle v. Google case about the copyrightability of software APIs. In all of these cases, our results were found to be accurate and correct.
You said you still find yourselves up against experts using inaccurate or incorrect tools - can you explain this? When has this occurred and why do you think the issue persists?
BZ: I’ve made it a personal goal to quantify and standardize the field of software forensics by providing quantitative tools and a formal methodology for comparing software for determining IP infringement. Many times, experts use their own proprietary tools to examine software, but these tools are rarely tested or peer reviewed even though they are often accepted in court. There are tools coming from major universities that have not gone through real-world testing and qualification, and they make many assumptions that can produce false results.
Many experts have impressive backgrounds in computer science but little knowledge of IP law. I think this occurs because people trust scientists and experts and don’t really understand the underlying principles of infringement analysis. In other words, the judge and jury are given a lot of confusing technical jargon that sounds impressive. And in some cases, when a party to a trial has a poor case, they want to hire an expert who can manufacture results that support their position and confuse the judge or jury. I find that really disturbing, and I like to think I’m doing my part to combat that and help ensure that justice is served.
What do you see for the future of IP litigation technology?
BZ: We are always examining ways to improve the performance of CodeSuite and our other tools. Also, we are always adding support for new programming languages as they come into being. I believe that technology is creating amazing new ways of assisting the legal profession, but it’s important to have the final results reviewed by experts and lawyers. Some technologists believe that artificial intelligence will eventually eliminate the need for experts in many fields, but that concerns me. Technology is a tool. We don’t want to lose our understanding of the law or allow technology to make important decisions for us, legal or otherwise; we want to leverage technology to make the best decisions possible.