Proving and Preventing Genericide with AI

Published: May 18, 2022

Cameron Shackell, Queensland University of Technology, Brisbane, Australia

As in many areas of society, technological disruption seems inevitable in the trademark space. Is artificial intelligence (AI) the future of genericness evidence and monitoring? To answer this question, we need to consider drivers for change and clarify some of the perceptions of AI evidence. Indeed, there are some nuances in the application of AI to trademark genericness that warrant careful consideration.

Driver One: Big “Natural” Evidence

Most genericness evidence adduced today is curated in some sense. Dictionary definitions, consumer surveys, and even large corpus linguistics projects (such as the Corpus of Contemporary American English) are all designed and selected by someone. Unconscious bias, especially availability bias, can lead to the exclusion or de-emphasis of large parts of “the relevant public.” At best, curated evidence is a tiny but well-balanced sample of how a mark is used.

AI evidence, on the other hand, can harness what is called natural language processing (NLP). The “natural” in NLP is not just a term of art. NLP’s rationale is to target real-world (rather than curated) discourse. By embracing orthographic, grammatical, and semantic oddity, it aims to make analysis more definitive.

Because NLP is not fussy about what it is applied to, it has practically unlimited capacity to consume data. So, its power grows with the terabytes of new data—contributions to Amazon, Facebook, Reddit, Twitter, and so on—that expand the digital record each day. And it is arguable that proof of genericness of virtually any mark can be found in Internet data, simply because of its size and coverage.

The rise of NLP raises an important question: is the primary significance of a term better deduced from selective, curated evidence, or from the unfiltered online behavior of practically the entire jurisdiction? Or, in other words, is genericness shown through examples selected by experts or by what people are actually saying and doing online?

Driver Two: Time and Precision

In the author’s commercial experience assessing genericness, clients do not usually expect the question, “What precise period should I analyze for genericness?” But it does matter. A cancellation action may last years. Is the year of the initial filing most relevant? The five years before that? What about last month? It seems many attorneys have never dared to dream of this sort of precision in evidence.

There are currently low expectations about the timeliness and precision of genericness evidence. It is assumed, on practical grounds, that evidence will be collated from relatively tiny language samples that may not be contemporaneous. Yet there is little acknowledgement that both factors militate against reliability.

A longitudinal picture of semantic change over precise census periods is a revelation that NLP can bring to genericness evidence for one simple reason: the data is usually timestamped.

So, an experiment can be designed with AI to answer a question such as: “How often was the word ‘aspirin’ capitalized in June 2021?” It would be a stretch to say traditional evidence could do the same.
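To make the June 2021 question concrete, here is a minimal sketch in Python of a date-bounded capitalization count. The `capitalization_rate` function, the `(date, text)` record shape, and the sample posts are all assumptions invented for illustration. They are not the author's system or data.

```python
import re
from datetime import date


def capitalization_rate(posts, term, start, end):
    """Share of occurrences of `term` written with an initial capital
    in posts dated from `start` to `end` inclusive.
    Returns None when the term does not occur in that window."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    capitalized = total = 0
    for posted_on, text in posts:
        if not (start <= posted_on <= end):
            continue  # outside the census period
        for match in pattern.finditer(text):
            total += 1
            if match.group(0)[0].isupper():
                capitalized += 1
    return capitalized / total if total else None


# Hypothetical timestamped posts (date, text), invented for illustration.
posts = [
    (date(2021, 6, 3), "Took an aspirin for my headache."),
    (date(2021, 6, 14), "Is Aspirin safe with coffee?"),
    (date(2021, 6, 20), "aspirin works, but I prefer the cheap aspirin."),
    (date(2021, 7, 1), "Aspirin again today."),
]

june_rate = capitalization_rate(
    posts, "aspirin", date(2021, 6, 1), date(2021, 6, 30)
)
print(june_rate)
```

Changing the two dates is all it takes to move the census period, which is the kind of precision the timestamped record makes possible.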

Clarifying Perceptions

It is somewhat understandable that trademark genericness evidence remains largely in the pre-Internet era. Legislators and courts want to maintain a transparent standard with historical continuity, even at the risk of some lag to altered consumer reality. Some popular perceptions of AI evidence, however, are outdated or not well founded and need clarification.

Is the ‘Relevant Public’ Online?

Digital data is, of course, biased toward those who go online. Some years ago, this might have been a severe limitation. In 2001, online discourse was probably skewed toward the relatively affluent and those interested in technology. But is this a valid objection in 2021?

The amount of analyzable data, expanding in scope and intensity during the pandemic, is staggering and will only increase. Certainly, many governments now assume that their citizens have access to the Internet for various obligations. The same can broadly be said of judicial institutions. Moreover, many purchasing decisions are now made online, and much advertising takes place there too. The link between trademark use online and consumer behavior is clearly established.

Today, even those with poor literacy are not excluded from online discourse. The spoken word is just as contributable and available as the written word thanks to the rise of podcasts and video-sharing channels. There is no longer any reason to think Internet data does not represent trademark use by the relevant public.

Gamification

Evidence derived from social media and other online sources is often perceived as manipulable by interested parties or, in the parlance of AI, open to “gamification.” A crafty hacker, for example, might create bots or employ farms of “influencers” to try to take the big “G” out of “Google” or seed genericizing phrases such as: “My Samsung is the best iPhone I’ve ever owned.”

While some of the big tech firms might be suspected of language engineering on this scale, most are not that influential. And, on the whole, digital forums are quite resistant to gamification. This integrity comes from both firms and customers. Firms realize that they will lose customers who feel targeted by controversial content, so they take all manner of steps to keep things well-moderated. In fact, this is reflected in the difficulty many have had in monetizing their platforms: people jump ship at the first sign of exploitation.

Also, from the customer side, there is the force of self-moderation, which transcends any one platform. If an individual uses a trademark in a way the community thinks strange, it will be called out. It is simply not that easy to premeditate a “viral” shift in the usage of a term. Moreover, things can easily backfire.

An attempt to genericize a mark, much like a high-profile genericide action, may only serve to publicize the fact that a term is a mark. While there is certainly a need to be aware of data sources, the results are potentially less susceptible to undue influence than traditional evidence because of the data’s size and its natural, self-curating nature.

There is also a broader fallacy in this perception. Gamification is not new. Firms have invested in promoting their marks and influencing the public for centuries. If online genericness gamification is successful, it will impact all evidence, online or traditional. It is a new mechanism rather than a new process.

Automation and Explainability

On certain well-defined tasks, it is now impossible for human beings to outperform computers. Humans can carry out these tasks, but computers are superior in both speed and precision. Unfortunately, this rather straightforward automation aspect of AI is often conflated with the complexity of more inscrutable AI techniques such as neural networks.

An AI process for genericness measurement may utilize complex mathematics, but it does not need to. The “explainability” of the process can be maintained by designing it to adhere to what humans could do if human resources were unlimited.

For example, it is theoretically possible for a team of human researchers to meticulously count every capitalized and uncapitalized occurrence of a mark in a library of books. But, if NLP is used to automate the task, nothing is lost in terms of explainability, and something is gained in precision. It is just a shortcut.

However, if a “black box” is trained using deep learning to classify marks into “generic” and “not generic,” we have a much more difficult time understanding and trusting the results, and we should not expect courts to be any different.

An AI approach to genericness evidence need not be an incomprehensible presentation of abstract mathematics and arcane engineering. It can and should be a well-grounded description of how computers were used to create evidence in a way human beings could if there were enough of us to do it on the same scale.

Our AI System for Measuring Genericness

Although NLP has advanced in leaps and bounds over the past decade, practically no work has been done on applying it to trademark genericness. Seeing this gap, the authors developed a set of metrics for genericness by working backward from the common recommendations on how to avoid it.

The first metric was simply to measure the rate of capitalization of a mark. For example, if aspirin is almost always written as “aspirin” instead of “Aspirin,” that is probably an indicator of genericness.
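One practical wrinkle in a capitalization count is that any word is capitalized at the start of a sentence. The sketch below, in Python, tallies the surface forms of a term while skipping sentence-initial words. That skipping rule, the naive sentence splitter, the `mid_sentence_forms` name, and the sample text are assumptions made for illustration, not details taken from the published metric.

```python
import re
from collections import Counter


def mid_sentence_forms(text, term):
    """Tally surface forms of `term` (e.g. 'aspirin' vs 'Aspirin'),
    ignoring the first word of each sentence, where capitalization
    is forced by orthography rather than chosen by the writer."""
    forms = Counter()
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[A-Za-z]+", sentence)
        for word in words[1:]:  # skip the sentence-initial word
            if word.lower() == term.lower():
                forms[word] += 1
    return forms


# Invented sample text, for illustration only.
sample = (
    "Aspirin helps me sleep. I took an aspirin last night. "
    "My doctor said Aspirin is fine! Any aspirin will do."
)

forms = mid_sentence_forms(sample, "aspirin")
lowercase_share = forms["aspirin"] / sum(forms.values())
print(forms, lowercase_share)
```

Real text would need a sturdier sentence splitter (abbreviations, quotations, headings), but the counting logic stays something a human team could replicate by hand.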

A possible objection to such a metric is that ASPIRIN might be systematically written with a capital “A” in some contexts but not others, even by the same person. The second metric, therefore, accounted for the context in which each form occurs. To do this, experts can use an NLP technique called word embedding, which “embeds” words into what is called a “high dimensional vector space.”

Although it sounds complex, word embedding is just a statistical way of profiling which words occur together most often. Armed with an embedding model trained on 100 billion words, experts are able to determine whether mark capitalization is systematic.

So, after seeing from the first metric that “aspirin” is very common, the model enables experts to determine that “Aspirin” and “aspirin” in fact occur in the same context very frequently. This provides evidence that most people do not differentiate the two forms.
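The idea of comparing the contexts of “Aspirin” and “aspirin” can be sketched with simple co-occurrence counts. This Python example deliberately swaps the trained embedding model for raw counts of nearby words, compared by cosine similarity. The window size, the function names, and the two sample sentences are assumptions for illustration only.

```python
import math
import re
from collections import Counter


def context_vector(sentences, target, window=2):
    """Count the words found within `window` tokens of the exact
    (case-sensitive) form `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[A-Za-z]+", sentence)
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j].lower()] += 1
    return counts


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Invented sentences, for illustration only.
sentences = [
    "I took an aspirin for my headache",
    "She took an Aspirin for her headache",
]

similarity = cosine(
    context_vector(sentences, "aspirin"),
    context_vector(sentences, "Aspirin"),
)
print(similarity)
```

A value near 1 suggests the two forms keep the same company, while a value near 0 suggests writers reserve each form for different contexts. A trained embedding model captures the same intuition with far more data and a denser representation.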

Of course, at this point, evidence that “aspirin” is used in its category as a generic term is still incomplete. To measure this further, experts need a way to compare marks and known generic terms. A theory called “distributional generality” can help.

The theory asserts that words that are hypernyms (words for categories) tend to occur in a larger range of contexts than hyponymic terms (words in a category). The intuitive example for this is that the word “dog” is found near a much larger range of words than a specific breed such as “dachshund.”

Again, experts can use word embeddings to formalize this. If “aspirin” occurs in many more contexts than, say, “salicylic acid,” or any other mark name in the category, it is very likely a generic term. To derive a third metric that preserves explainability, experts can calculate the simple mean and standard deviation of the similarity of 20 neighboring terms.
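A rough Python sketch of the neighbor statistic described above follows. It ranks all other terms by cosine similarity to the target, keeps the top `k`, and reports the mean and standard deviation. The two-dimensional toy vectors, the choice of population standard deviation, and the `neighbor_similarity_stats` name are assumptions for illustration. A real analysis would use vectors from a trained embedding model and `k=20`.

```python
import math
import statistics


def cosine(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def neighbor_similarity_stats(vectors, term, k=20):
    """Mean and population standard deviation of the cosine similarity
    between `term` and its k most similar neighbors."""
    sims = sorted(
        (cosine(vectors[term], vec) for other, vec in vectors.items() if other != term),
        reverse=True,
    )[:k]
    return statistics.mean(sims), statistics.pstdev(sims)


# Toy 2-D vectors, invented for illustration; not real embeddings.
vectors = {
    "aspirin": (1.0, 0.0),
    "painkiller": (1.0, 0.0),
    "headache": (1.0, 1.0),
    "tablet": (0.0, 1.0),
}

mean_sim, spread = neighbor_similarity_stats(vectors, "aspirin", k=2)
print(mean_sim, spread)
```

Because the result is just an average and a spread over a ranked list, each step can be checked by hand. How the two numbers should be read against known generic terms is a matter for the underlying paper rather than this sketch.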

Of course, these metrics mean only so much in isolation. To offer some validation of the above metrics, the authors cast them into a procedure that successfully classified 30 example marks into generic, somewhat generic, and distinctive with only a couple of anomalies.

Economic Gains

Developing AI to measure genericness serves a broader purpose than simply bolstering evidence in individual cases. The societal harms caused by marks that are used as generic terms are well-known.

The traditional approach to evidence in genericide cases implies a binary view of genericness. But the economic harms of genericide do not suddenly materialize at a sharp inflection point. There is a long path to recognition and cancellation along which market distortion and economic inefficiency occur. AI offers a way out of the binary view of genericness.

Using NLP and the growing body of digital data, it will eventually be possible to quantify the genericness of almost every mark in near real time. With this possibility in mind, the future of trademark genericness may not be court battles over a binary “yes” or “no” ruling. It may be genericness “speeding tickets” sent out by regulating authorities that monitor the online discourse of their jurisdictions with an AI system.

Parts of this feature are based on the article: Shackell, C., De Vine, L. Quantifying the genericness of trademarks using natural language processing: an introduction with suggested metrics. Artif Intell Law 30, 199–220 (2022).
