May 7, 2024 | By Anik Bose, General Partner at BGV & Katya Evchenko, Sr. Associate at BGV
This piece was crafted for enterprises and startups implementing Gen AI-based applications. The blog consists of two parts: in Part I, we explore the key challenges of Gen AI adoption; then, in Part II, we discuss strategies and best practices to mitigate these obstacles. Our insights are drawn from BGV’s direct experience with our portfolio companies and a broader set of engagements with the enterprise ecosystem.
Summary
Effective Gen AI implementation depends on actively incorporating human feedback and on defining and measuring KPIs for success. Strategies include rolling out Gen AI incrementally through pilot projects, aligning pricing models with customer value, and reducing AI infrastructure costs by using smaller, domain-specific models for enterprise use cases. Addressing poor data quality involves robust data governance and domain-specific datasets, while managing trust issues requires a focus on model interpretability and compliance.
Strategies for Mitigating Barriers
Whether a company is inherently Gen AI-native or is becoming Gen AI-enabled by incorporating LLM features into its products (e.g., co-pilots), certain successful practices are emerging from our portfolio companies as well as other enterprise Gen AI implementations. The overarching theme behind these strategies is human-centric: they emphasize active human participation and input rather than deploying Gen AI technology with minimal human interaction and feedback. Defining and measuring clear KPIs is the most effective way to gauge success in overcoming these barriers. The following table summarizes strategies and KPIs to help navigate these barriers successfully.
Adoption Barriers | Suggested Strategies | KPIs | What We Heard |
No Clear ROI | Use an incremental implementation approach. Begin with pilot projects to test Gen AI solutions in real-world scenarios, and iterate on the ROI impact based on feedback and lessons learned. This approach allows Gen AI applications to be refined while a clear ROI is established and risks and costs are minimized. | Productivity gains, measured as cost reductions or time savings (e.g., onboarding time), achieved through automation. | “The recipe for success in an enterprise is to have a niche specific use case, delivering 10x improvement at job-to-be-done.” “Start with a POC… many enterprises overestimate what Gen AI can do for them and learn that they need non-AI solutions and more investment, resulting in dropping the projects.” “Because change management takes time, first we see a drop in productivity… the POC has to be long enough to go beyond this phase.” |
Understand the value your customers perceive and explore outcome- or value-based pricing models. SaaS vendors and Gen AI startups must offer pricing models that enable enterprises to invest in Gen AI based on tangible ROI. | Tie pricing tiers to key ROI parameters. | “Consider putting a cap on your outcome-based pricing if you deploy at scale; it may discourage a large customer.” “While seat-based model can provide simplicity and predictability for pricing, it may not align with the actual value that customers derive from the product or service.” | |
To address the rising cost of AI infrastructure: Optimize model size by using smaller open-source language models and fine-tuning them with domain-specific datasets. This can reduce the cost structure by 3-6X or more compared with using large models through APIs. Consider hosting or owning GPUs as you scale. Utilize APIs during the validation phase for a faster ‘time to value.’ Use prompt reduction and prompt compression to improve latency and costs. | Number of parameters in the Gen AI model. Annotated domain-specific datasets. Cost benchmarks comparing a standard configuration of an NVIDIA A100 80GB hosted on Azure with an LLM (e.g., Mistral 7B derivative) vs. a customized configuration using an open-source small LLM (e.g., BERT 1B parameters). | “Think of the next 5-10 years of your solution… at scale running your own GPUs makes pricing work better without giving out margins… This is especially true if you have a high utilization rate of GPUs.” “We use an earlier version of Llama which is 10x cheaper than the latest one. Same can be applied to earlier versions of GPT.” | |
Low Data Quality and Poor Integration with Workflows | Implement robust data governance practices to ensure data quality and relevance, including for domain-specific data. This involves establishing processes for data collection, cleaning, and validation to maintain high-quality datasets. | F1 score: a commonly used accuracy metric that combines the precision and recall of the model’s responses into a single number (their harmonic mean). | “Invest in building high-quality metadata: either manually adding, or using LLM to infer this data.” |
To improve the quality of your data: Use domain-specific datasets. RAG and embeddings are a quick and cost-effective way to utilize your proprietary data; this approach yields more relevant results from LLMs while reducing the occurrence of hallucinations. Use more human input to improve the model; automation is a journey and should be achieved incrementally. | Rating of the output based on coherence, relevance, and factual accuracy in comparison to human-generated (e.g., expert) output. | “You can charge a premium when you are more accurate than public sources. You can either inject more tailored and domain-specific data – RAG comes in, or you deliver on security, so that the data never leaves your firewall.” “We leverage user-generated data by enabling our users to choose the best option instead of having the machine decide each time.” | |
Foster collaboration among data scientists, domain experts, and IT professionals to ensure seamless integration of Gen AI solutions into existing enterprise systems and workflows. Revise existing workflows where needed. If you build atop existing solutions, gradually add Gen AI features to the team’s daily tasks rather than suddenly introducing many new and unfamiliar tools. Starting with smaller tasks can show the tool’s value and get the team on board. | Engagement metrics (time spent on a job-to-be-done, interaction frequency, task success rate). Comparison of time spent using the product with the time spent using freemium tools (e.g., ChatGPT). | “Many verticals have a legacy vendor which is very hard to strip out, creating friction. Customers are rationalizing the need for applications, preferring one-stop-shop solutions.” “We took time to understand that our buyer and user needs are different… Behavior changes are hard. Now, we are doing one-hour user-shadow diagnostics sessions to understand and quantify current user workflows.” “Automating the whole workflow brings much more ROI to our customers than co-pilot only. We try to solve problems end-to-end.” | |
Low Level of Trust | Prioritize model interpretability and compliance. Document the model architecture, training data, and decision-making, along with compliance with safety frameworks such as NIST and with applicable regulations, to enhance understanding and trust among stakeholders. Implement mechanisms for continuous monitoring and evaluation of Gen AI models to detect biases, errors, and performance degradation over time. | Compliance with risk frameworks and responsible AI guidelines such as NIST, as well as EU AI regulations. | “You have to show you’ve done your homework, and you are checking boxes on all regulatory requirements to be a preferred vendor; proactively educating your customer is also important.” “…different types of data masking are used; for example, tokenization, when data is tokenized for a public LLM model, then detokenized.” |
To address data privacy, consider running your own GPUs so that sensitive data never leaves your firewall. | Cost of running your own GPUs. | “In [large enterprise] most Gen AI solutions are built in-house; your security team has power, and they are concerned about open-source projects or marrying the team to a specific vendor.” |
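To make the F1 score KPI from the table concrete, here is a minimal Python sketch of how precision, recall, and F1 could be computed for a single model response. It assumes the response can be reduced to a set of discrete items (for example, document IDs returned by a retrieval step) that are compared with an expert-labeled set. The function name and the sample IDs are hypothetical and are not taken from any portfolio company's tooling.

```python
def precision_recall_f1(predicted, relevant):
    """Return (precision, recall, F1) for one response.

    predicted: items the model returned (hypothetical example: document IDs)
    relevant:  items a human expert marked as correct
    """
    predicted = set(predicted)
    relevant = set(relevant)
    true_positives = len(predicted & relevant)

    # Precision: share of returned items that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of correct items that were returned.
    recall = true_positives / len(relevant) if relevant else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative data only: the model returned four items, two of which
    # appear in the expert's three-item answer set.
    p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["b", "c", "e"])
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
    # prints: precision=0.50 recall=0.67 f1=0.57
```

Tracked across pilot iterations, a number like this gives a simple way to check whether changes such as adding domain-specific data or RAG improve accuracy. For free-text answers, the human rating KPI in the table may be the more practical measure.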
In conclusion, although Gen AI adoption is advancing rapidly, we’re only beginning our journey toward complete autonomy. Recognizing the importance of human input is crucial for its success in the enterprise. Companies that skillfully integrate human supervision and involvement into the transition will be better equipped to navigate the rapidly evolving environment.
We want to thank all our colleagues and friends, including Julio Casal, Sujay Rao, and Chyngyz Dzhumanazarov, who contributed to this piece. If you are building in the space, or have ideas on the topic and want to discuss them, feel free to reach out to us.
Other sources used:
- Public Enterprise LLM Benchmarks – https://www.vals.ai/
- The Economist, “How Businesses Are Actually using Gen AI”, Feb 2024
- Wall Street Journal, “AI Is Taking On New Work. But Change Will Be Hard—and Expensive”, March 2024
- AI Index Report published by the Stanford Institute for Human-Centered Artificial Intelligence, April 2024
- Key Takeaways from “AI x Risk Management Meet-up” organized by BGV and Astorya VC, Paris, April 2024