Mockingbird is a RAG-Specific LLM that Beats GPT-4 and Gemini 1.5 Pro in RAG Output Quality
In response to growing enterprise concerns over data security and the quality of retrieval-augmented generation (RAG), Vectara is proud to introduce Mockingbird, an LLM fine-tuned specifically for RAG. Mockingbird achieves the world’s leading RAG output quality and hallucination mitigation, making it perfect for enterprise RAG and autonomous agent use cases.
We are Proud to Launch Mockingbird, a Vectara LLM Fine-Tuned Specifically for Retrieval-Augmented Generation (RAG)
We are excited to announce the launch of Mockingbird, a Vectara LLM fine-tuned specifically for retrieval-augmented generation (RAG). Mockingbird achieves the world’s leading RAG output quality, with leading hallucination mitigation capabilities, making it perfect for enterprise RAG and autonomous agent use cases. It excels in ensuring data never leaves Vectara’s secure environment and consistently outperforms major models like OpenAI’s GPT-4 and Google’s Gemini 1.5 Pro in RAG output quality, citation accuracy, multilingual performance, and structured output accuracy.
“Vectara’s new Mockingbird took HuckAI (https://huckai.com) from being an overly polite librarian to giving answers I would expect from a senior co-worker. The responses are clearer, easier to follow, and provide direct answers to difficult questions, helping our users get more work done. I switched immediately,” said Sunir, founder of HuckAI.
(Note: If you want a more technical deep dive into Mockingbird with performance metrics, check out the technical blog post.)
Mockingbird: Addressing Data Security and RAG Performance Challenges
As large language models (LLMs) continue to evolve, enterprises are increasingly focusing on two pivotal concerns: data security and response quality in Retrieval-Augmented Generation (RAG) use cases.
Privacy and Security
According to a recent survey from Alteryx, 80% of enterprises cite data privacy and security concerns as the top challenges in scaling AI. Industry leaders have expressed reservations about entrusting their sensitive enterprise data to third-party providers like OpenAI. One significant concern is that OpenAI may use submitted data for training purposes, which raises the risk of sensitive information being embedded in the model itself and potentially exposed to others.
Additionally, quality issues such as inaccuracies, “hallucinations” in RAG, and suboptimal summarization capabilities have prompted enterprises to seek more reliable solutions.
A significant advantage of having our own, self-hosted LLM is the ability to deploy our platform on customers’ Virtual Private Clouds (VPCs) or on-premises. This is crucial for customers who don’t want their data being sent to the cloud, ensuring their data remains secure and private.
Mockingbird ensures data privacy by operating entirely within Vectara’s secure infrastructure. Unlike some providers who face accusations of training on customer data, Vectara guarantees that your data is never used to train or improve our models, ensuring compliance with the strictest security standards. Mockingbird is deployed alongside Vectara, ensuring that your data never gets sent to a third-party LLM provider. This approach addresses data and security concerns by keeping everything within your control.
RAG Performance
RAG is at the core of what Vectara does. A RAG-focused LLM excels at generating grounded answers to queries with citations, enhancing the reliability and trustworthiness of responses. Structured output enables connecting RAG to downstream tasks such as function calling, enabling agentic behavior. Because Mockingbird is tailored specifically to these tasks, it can be smaller than general-purpose LLMs, making it cheaper to operate while delivering higher quality.
Mockingbird outperforms GPT-4 and other major models on key metrics such as BERT F1 score in RAG output and citation precision/recall. It excels in multilingual RAG performance and structured output accuracy. This specialized focus on RAG-specific tasks leads to better performance compared to general-purpose models.
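To illustrate why structured output matters for downstream tasks, here is a minimal sketch of consuming a structured RAG response and routing a model-requested action to application code. The schema below (answer, citations, action) is purely illustrative — this post does not specify the exact shape Mockingbird emits, and the `create_report` handler is a hypothetical example:

```python
import json

# Hypothetical structured RAG output; the exact schema is an assumption
# for illustration, not Mockingbird's documented format.
raw_response = json.dumps({
    "answer": "Revenue grew 12% year over year [1].",
    "citations": [{"id": 1, "source": "q3-earnings.pdf"}],
    "action": {"name": "create_report", "arguments": {"quarter": "Q3"}},
})

def dispatch(action: dict) -> str:
    """Route a model-requested action to application code (the agentic step)."""
    handlers = {
        "create_report": lambda args: f"report created for {args['quarter']}",
    }
    return handlers[action["name"]](action["arguments"])

# Because the output is structured, it parses reliably and each field can be
# consumed programmatically: the answer for display, the citations for
# grounding, and the action for a downstream function call.
response = json.loads(raw_response)
print(response["answer"])
print(dispatch(response["action"]))
```

The point of the sketch: with free-form text you would be regex-scraping the model's reply, whereas structured output lets an agent loop hand fields straight to typed handlers.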
Mockingbird is Deployed Alongside Vectara, so Your Data Never Gets Sent to a Third-Party LLM Provider
Mockingbird is fully integrated into Vectara’s secure platform, ensuring that sensitive data never leaves our controlled environment on SaaS or your controlled environment on Virtual Private Cloud (VPC)/on-premise installs. This deployment addresses the key concerns about data privacy with third-party providers, offering clients robust security assurances. All components of our platform, from data ingestion to the query flow—including our custom-built vector database, ranking algorithms, and large language models—are developed to keep your data within Vectara’s ecosystem, using only publicly available datasets for training.
Vectara’s platform, including Mockingbird, can be deployed in your VPC or on-premise, providing flexible options that meet the stringent needs of enterprise security without compromising on functionality. This ensures that you can leverage advanced language model technology with full control over your data.
Mockingbird Outperforms GPT-4 and Other Major Models in RAG Output Quality
Mockingbird sets a new benchmark in RAG, outperforming major models like OpenAI’s GPT-4, Google’s Gemini-1.5-Pro, and RAG-specific models such as Cohere’s Command-R-Plus. Utilizing the BERT F1 score, a comprehensive metric for evaluating the accuracy and relevance of model outputs against a high-quality golden data set, Mockingbird achieves an impressive score of 0.86. This surpasses Command-R-Plus and significantly outstrips general-purpose models, with performance improvements of 26% over GPT-4 and 23% over Gemini-1.5-Pro across diverse datasets ranging from news to academic papers.
In addition, we conducted human evaluations comparing Mockingbird and GPT-4; evaluators rated Mockingbird’s responses higher, further validating that Mockingbird outperforms GPT-4 in RAG output quality.
For more performance metrics on citation quality, multilingual RAG performance, and structured output accuracy, please refer to the technical blog post.
To Use Mockingbird, Simply Switch Your Summarizer
Mockingbird is available now to all of our customers.
If you already have a Vectara account, switch your summarizer to:
mockingbird-1.0-2024-07-16
You can do this through the API or console:
API: In your API request, set the prompt_name parameter to the model name above.
Console: Go to your corpus, navigate to the query tab, and under “generation,” select the appropriate summarizer. Please note that changes made in the console will only be reflected there for preview purposes. For production use via the API, you will need to update your API requests accordingly.
Please read the API documentation for Mockingbird for full request details.
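As a rough sketch of what switching the summarizer in an API request might look like: only the prompt_name parameter and the mockingbird-1.0-2024-07-16 model name come from this post — the endpoint URL, header names, and all other fields below are illustrative assumptions, so check the Vectara API documentation for the exact request shape:

```python
import json

API_KEY = "YOUR_API_KEY"                       # placeholder credential
ENDPOINT = "https://api.vectara.io/v2/query"   # assumed endpoint; verify in docs

# Query body selecting Mockingbird as the summarizer. Field names other
# than prompt_name are assumptions for illustration.
payload = {
    "query": "What were the key findings?",
    "generation": {
        "prompt_name": "mockingbird-1.0-2024-07-16",  # switch summarizer here
        "max_used_search_results": 5,
    },
}

headers = {"x-api-key": API_KEY, "Content-Type": "application/json"}
print(json.dumps(payload, indent=2))

# To actually send the request (requires the `requests` package and a valid key):
# import requests
# resp = requests.post(ENDPOINT, headers=headers, json=payload)
# print(resp.status_code)
```

The only change relative to an existing query is the prompt_name value; the rest of your request can stay as-is.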
Sign Up for an Account and Connect With Us!
As always, we’d love to hear your feedback! Connect with us on our forums or on our Discord. Sign up for an account to see how Vectara can help you easily leverage retrieval-augmented generation in your GenAI apps.