LLM Benchmark for CRM
According to Salesforce, the LLM benchmark is intended to enable companies to evaluate generative AI models for business applications.
Companies that want to use generative AI in their business processes face the challenge of identifying the most suitable model for their purposes. “The largest, most powerful models are often too costly and deliver more than is actually needed,” says Clara Shih, CEO of Salesforce AI. For many tasks, open source and/or smaller, lower-cost models are at least as suitable. Speed and ease of use also cause headaches. The biggest stumbling block is data security when confidential data is fed into the LLM. Last but not least, the exponential growth of the model landscape makes it even harder to keep an overview.
Use cases in sales and customer service
The LLM Benchmark for CRM is tailored to use cases in sales and customer service. It covers tasks such as opportunity summaries, prospecting, incident reporting, knowledge-based recommendations for support responses and more. Salesforce argues that other LLM benchmarks are hardly relevant for companies because they focus on academic and consumer use cases, take too few expert evaluations into account and do not include criteria such as accuracy, speed, costs and trust. The benchmark also includes a publicly viewable leaderboard to help companies assess the effectiveness of generative AI-powered CRM solutions and make more informed decisions about which LLM is best suited to their CRM needs.
The criteria at a glance
Accuracy
This metric comprises four sub-categories: factual accuracy, completeness, comprehensibility and adherence to the instructions in the prompt. The reasoning is that only accurate and correct predictions and recommendations lead to informed actions, and thus to better business outcomes and customer experiences.
Costs
The cost metric refers to the estimated cost of ownership, which can vary depending on the CRM use case. Each model is categorised as high, medium or low based on percentiles.
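As a rough illustration of percentile-based categorisation, the sketch below assigns each model to a cost tier according to its rank among all models. The cut-offs (thirds), the function name `cost_tiers` and the example figures are assumptions made for this illustration only; the article does not state which percentiles or cost data Salesforce uses.

```python
from bisect import bisect_left


def cost_tiers(costs, low_cut=1 / 3, high_cut=2 / 3):
    """Label each model 'low', 'medium' or 'high' by its cost percentile rank.

    The cut-offs are hypothetical; the benchmark's real thresholds are not
    given in the article.
    """
    ordered = sorted(costs.values())
    n = len(ordered)
    tiers = {}
    for model, cost in costs.items():
        # Fraction of models that are strictly cheaper than this one.
        rank = bisect_left(ordered, cost) / n
        if rank < low_cut:
            tiers[model] = "low"
        elif rank < high_cut:
            tiers[model] = "medium"
        else:
            tiers[model] = "high"
    return tiers


# Invented cost estimates for six placeholder models (arbitrary units).
example_costs = {
    "model_a": 0.5,
    "model_b": 2.0,
    "model_c": 9.0,
    "model_d": 1.0,
    "model_e": 30.0,
    "model_f": 4.0,
}
example_tiers = cost_tiers(example_costs)
```

Because the tiers are relative, a model's label depends on the other models in the comparison: adding cheaper models to the list can move an existing model from "medium" to "high" without its own cost changing.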
Speed
This metric covers the responsiveness and efficiency of the LLM in processing and providing information. A fast model improves the user experience, for example by shortening waiting times for customers, and enables sales and service teams to handle inquiries and tasks promptly.
Trust and security
This metric assesses the LLM’s ability to protect sensitive customer data, comply with data protection regulations, secure information and avoid bias and toxicity. The result is a reliability score for LLMs in CRM that provides more transparency in terms of trust and security. With the Einstein Trust Layer, organisations can safely use their trusted data and metadata in any model, regardless of the score that model achieves, without the data being stored there or used for training purposes.