The Next Wave of Generative AI: Domain-Specific LLMs

ABSTRACT: This blog examines the upcoming trend of domain-specific LLMs and evaluates three different methods of implementation.

Sponsored by Intel

Generative AI dazzles us in new and exciting ways. But the business opportunity boils down to a familiar trend: using data-hungry algorithms to transform business processes, boost productivity, and gain competitive advantage.

This time around the algorithms are large language models (LLMs) that increasingly will consume domain-specific enterprise data rather than public data scraped off the internet. This blog explores this coming wave of domain-specific LLMs (which Eckerson Group also calls “small language models”) and assesses the pros and cons of three implementation options: (1) building LLMs from scratch, (2) fine-tuning pre-trained LLMs, and (3) enriching prompts. While the approaches vary by sophistication, cost, and company type, they all depend on internal enterprise data to succeed. 

Generative AI Defined

Generative AI (GenAI) is a type of neural network that trains itself to interpret and create digital content such as text, images, or audio. GenAI first gained prominence in 2014 with the rise of generative adversarial networks (GANs), in which one neural network iteratively approves or rejects synthetic content from another network. (For example, GANs create “deep fake” videos.) In 2017, researchers at Google introduced a “transformer” architecture that converts sequences of inputs into sequences of outputs, giving rise to the large language model (LLM).

An LLM is a huge calculator that predicts content, most often strings of words, based on what it learned from other words. It relies on an “attention network” whose parameters quantify how tokens—i.e., words or punctuation marks—relate to one another in a large corpus of existing text. This attention network enables the LLM to generate responses to human prompts during what is known as the inference phase. OpenAI’s release of ChatGPT 3.5 one year ago triggered today’s arms race among open-source communities and vendors such as Google, Microsoft, Hugging Face, and Anthropic. The recent chaos at OpenAI and its investor Microsoft illustrates the high stakes and potential risks of this disruptive technology.
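To make the “attention network” idea concrete, here is a minimal, illustrative sketch of scaled dot-product attention, the core calculation inside a transformer. The function names and toy vectors are our own; a real LLM applies this operation across many layers and billions of parameters, but the arithmetic is the same: score how strongly one token relates to the others, then blend their representations accordingly.

```python
import math

def softmax(xs):
    # Normalize raw scores into weights that sum to 1.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query token.

    Each key/value is a small vector standing in for a token embedding.
    The output is a blend of the values, weighted by how strongly the
    query "attends" to each key.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    blended = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return weights, blended

# Toy example: three "tokens" with two-dimensional embeddings.
keys = values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, blended = attention([1.0, 0.0], keys, values)
```

The query vector aligns with the first and third keys, so those tokens receive higher attention weights than the second; during inference, those weights determine which earlier words most influence the next predicted token.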


The LLM is a huge calculator that predicts strings of words based on what it learned from other words


GenAI manipulates words, images, numbers, and sounds, which Paul Baier, Jimmy Hexter, and John Sviokla of GAI Insights refer to as “WINS work”. Baier and team suggest that GenAI will disrupt digital WINS work in unprecedented ways. “Firms with heavy reliance on WINS work need to act today to fend off stiffer competition and to overcome disruptive competitors within 36 to 60 months.” (Eckerson Group is a partner of GAI Insights. Check out their recent article in Harvard Business Review.)


GenAI manipulates words, images, numbers, and sounds


This competitive pressure drives companies to get ahead of the curve and play GenAI to their advantage. Common use cases focus on customer service, document processing, research, sales, and marketing. Consider these examples.

  • Customer service. Recommend responses to technical questions, complaints, or unusual requests; and automatically escalate issues. 

  • Document processing. Draft custom documents to assist robotic process automation; detect anomalies and recommend responses to customer inquiries.

  • Research. Find and summarize documents to assist medical diagnoses, legal arguments, academic studies, and other research projects.

  • Sales. Identify opportunities, recommend strategies, draft sales pitches, and gauge sentiment.

  • Marketing. Outline marketing plans, research competitors, draft blogs and outbound content.

Early adopters can tackle multiple use cases at once. This summer Priceline announced plans for an external chatbot to help customers book travel, as well as internal GenAI tools to help employees develop software and create marketing content. Health providers at Meditech use GenAI to summarize patient histories, auto-generate clinical documents, and place orders. And insurance provider Lemonade positions GenAI as a strategic differentiator for the entire business. Its latest letter to shareholders boasts: “we have LLMs trained to answer customer emails, review pet medical records, evaluate satellite and other imagery, read home condition reports and more.”


Many early adopters tackle multiple GenAI use cases at once


Risks

Such use cases require careful governance because they pose a range of risks. Expert humans must oversee LLM inputs and outputs to reduce the likelihood of hallucinations, privacy breaches, bias, regulatory infractions, or mishandling of intellectual property. They also must ensure ethical behavior (including safety and fairness) as well as transparency and accountability. Given the vigilance required to mitigate these risks, we can expect humans to remain in the loop for the foreseeable future.

Time to Get Domain Specific

To boost productivity and gain competitive advantage, LLM adopters are not just using platforms such as ChatGPT or tools such as GitHub Copilot. They also are building LLMs into proprietary applications and workflows, all of which consume domain-specific data. Companies can implement these domain-specific LLMs in one of three ways. They can: 

  1. Build an LLM from scratch

  2. Fine-tune a pre-trained LLM

  3. Enrich LLM prompts

Let’s explore how these options compare in terms of their requirements, pros, cons, and types of adopters, as outlined in this chart.

Implementation Options

Option 1. Build from scratch. This option involves collecting and preparing a corpus of text sources such as documents, emails, and customer service records. Then data scientists build an LLM and train it to interpret, summarize, and generate content based on patterns it identifies within the corpus. By building an LLM from scratch, a company can address domain-specific ways in which its stakeholders use language—and increase the odds they’ll get the facts right.

However, this option has multiple cons. It requires extensive data science expertise, high data volumes, many iterations, and lots of expensive compute cycles. Cloud-native, AI-centered tech companies can overcome these challenges. But for now most other companies will avoid this option because they lack the funding and expertise to produce anything more effective than what big vendors and open-source communities have built already. 
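The training step described above can be sketched with a deliberately tiny stand-in: a bigram model that counts which word follows which in a domain corpus. The corpus and function names here are invented for illustration; a real from-scratch LLM learns billions of parameters over terabytes of text, but the principle is the same one quoted earlier in this blog: predicting strings of words based on what it learned from other words.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count word-pair frequencies in a (tiny) domain corpus."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def predict_next(model, word):
    # Return the most frequent follower of `word` seen during training.
    followers = model.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

# Hypothetical domain-specific corpus, e.g. customer service records.
corpus = [
    "the claim was approved",
    "the claim was denied",
    "the claim was approved quickly",
]
model = train_bigram_model(corpus)
```

Because the model has only ever seen this company’s language, calling `predict_next(model, "was")` returns “approved”, the domain’s most common continuation. That is the payoff of building from scratch, and also the cost: every pattern must be learned from the company’s own data.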

Option 2. Fine-tune existing model. Companies also can take a pre-trained LLM such as Llama or BLOOM and fine-tune it to better interpret domain-specific language. With this option, they apply the LLM to their own content, check outputs, and adjust parameters—that is, the many variables that describe content patterns—to make outputs more accurate over many iterations. This fine-tuning process can enable accurate use of language and facts while consuming much smaller data volumes than building from scratch.

This option will appeal to companies in healthcare, legal services, or other industries whose stakeholders use language in domain-specific ways that pre-trained LLMs cannot interpret well on their own. Companies that use language in more general ways, in contrast, will avoid this option. They get sufficient accuracy from pre-trained LLMs and therefore don’t need to invest in the necessary data science resources and compute cycles.
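The loop described in Option 2—apply the model, check outputs, adjust parameters, repeat—can be sketched in miniature. This is our own illustrative example, not an actual fine-tuning API: a real fine-tune adjusts millions of parameters via backpropagation, while here a single “parameter” is nudged toward targets drawn from hypothetical domain data.

```python
def fine_tune(weight, examples, learning_rate=0.1, epochs=50):
    """Minimal sketch of the fine-tuning loop.

    `examples` pairs an input with the output the domain expects.
    Each iteration applies the model, checks the output against the
    target, and adjusts the parameter to shrink the error.
    """
    for _ in range(epochs):
        for x, target in examples:
            prediction = weight * x              # apply the model
            error = prediction - target          # check the output
            weight -= learning_rate * error * x  # adjust the parameter
    return weight

# Hypothetical domain examples: inputs paired with desired outputs.
domain_examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
tuned_weight = fine_tune(0.0, domain_examples)
```

After a few dozen iterations the parameter converges to the value the domain data implies (here, 2.0). Fine-tuning a pre-trained LLM works the same way at scale, which is why it needs far less data than training from scratch: most parameters are already close to right.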

Option 3. Enrich prompts. Companies also can enrich prompts by inserting domain-specific content such as product documentation, customer service records, books, and academic articles. We call this process retrieval-augmented generation, or RAG. RAG asks the LLM to find the answer within that trusted content, reducing the risk of hallucinations or other issues. RAG is known as a type of “grounding” because it creates a foundation of facts. It costs less money than fine-tuning because it does not require high data volumes or compute-intensive workloads.

Many, if not most, GenAI adopters will implement RAG as a cost-effective method of getting the facts straight. While RAG does not require extensive data science expertise, data teams must design and implement reliable data pipelines that validate, find, and insert the right content into LLM prompts. We should expect RAG to become a standard approach across sectors, often complementing the fine-tuning option.
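The RAG pipeline just described—find the right content, then insert it into the prompt—can be sketched as follows. The retrieval step here uses simple keyword overlap purely for illustration; production pipelines typically rank documents with vector embeddings and a vector database, but the shape of the pipeline is the same.

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question (toy retriever)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, documents):
    # Insert the retrieved, trusted content ahead of the question so the
    # LLM answers from that content rather than from memory alone.
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only this context:\n"
        f"{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical domain content, e.g. an insurer's policy documents.
docs = [
    "Our pet policy covers dogs and cats up to age 12.",
    "Home condition reports must be filed within 30 days.",
]
prompt = build_prompt("What animals does the pet policy cover?", docs)
```

The enriched prompt carries the relevant policy text with it, which is the “grounding” described above: the LLM is steered toward a foundation of facts the company controls.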

Generate the Future

As with many technologies of the past, generative AI represents an opportunity as well as a mandate. Companies must disrupt their own businesses with GenAI before their competitors do so. They can succeed in this effort by embracing domain-specific LLMs and selecting the right implementation approach to make it happen. This can mitigate governance risks, make workers more productive, and generate new business value.

To learn more, check out Eckerson Group’s recent webinar with Intel. In this webinar Ro Shah, AI Product Director, explores Intel’s Generative AI Platform for addressing all three implementation approaches described here. Intel offers Gaudi 2 clusters, Gaudi 2 systems, and Xeon SP systems that accelerate the training and inference process. It also integrates with a GenAI ecosystem that includes various open-source LLMs, the PyTorch ML framework, Hugging Face Transformers, and Deepset’s Haystack LLM development framework.

Kevin Petrie

Kevin is the VP of Research at BARC US, where he writes and speaks about the intersection of AI, analytics, and data management. For nearly three decades Kevin has deciphered...

More About Kevin Petrie