Trustworthiness in natural language search
Brainforest developed a natural language search assistant for a leading realestate firm in Sweden on the DigitalOcean Gradient platform. However, thereliance on a large language model (LLM) for search functionality created significant challenges in trustworthiness, particularly regarding the accuracy and reliability of search results.
Before partnering with Vijil, Brainforest faced a variety of issues that hindered theagent's performance:
Hallucinations
The system generated extraneous contentwhen essential facts were missing or scattered across multiple data points.
Low accuracy
User searches returned irrelevant listingsthat were outside of search parameters.
Innumeracy
The inability to perform numerical operations limited the agent's usefulness for price-based queries and filtering.
Exposure
The agent's open response to all promptsmade it vulnerable to misuse and promptinjection attacks.
These reliability issues would result in users abandoning search queries, reducingcustomer engagement, while the security vulnerabilities would make the agentsusceptible to external attacks.
Vijil's Trust Optimization Framework
Vijil began by defining "trustworthy" from the user's perspective by creating a custom yet comprehensive test harness for the reliability, security, and safety. Vijil then used its test engine to run the harness at scale to produce a Trust Score and a Trust Report that assessed the risks and recommended mitigations based on the context and requirements of the agent's function.
Finally, Vijil implemented the mitigations that Brainforest authorized, enhancing the agent to make it ready for production in a few short weeks. That entire process is now automated and reusable by other customers.
The Vijil-built agent introduced key improvements:
- Function-calling for structured data retrieval.
- Few-shot tuning to improve search relevance.
- Optimized system prompt for better property matching.
- Guardrails to block adversarial manipulation.
Brainforest 's plan to implement a tailored solution to optimize the agent'sperformance involved:
1. Custom test harness development
Vijil created a comprehensive test harness to evaluate the agent's reliability based on sample queries, security, and safety. This included assessing the risks associated with the LLM's outputs and prompt injection attack testing.
2. Optimizing prompt response accuracy
Recognizing the limitations of the existing knowledge base, Vijil suggested a function-calling approach that transformed user queries into APl requests to a live database. This ensured that responses were grounded in real-time data from a current data source.
3. SQL-like query support
The agent was optimized to construct SQL-ike queries, allowing it to handle complex numerical queries and respond with accurate data - addressing a key obstacle to reliability.
4. Enhanced security measures
Improvements to the system prompt and guardrails reduced vulnerability to misuse and enhanced the overall security of the agent.
Trustworthiness achieved
The implementation of Vijil's solutions had a significant impact on the functionality and trustworthiness of Brainforest's natural language search assistant.
Accurate and Up-to-Date Responses
The agent could now generate precis eanswers to complex inquiries, significantly reducing the occurrence of hallucinations.
Complex Query Handling
Users could perform numerical operations, such assorting properties by price.
Enhanced Security
With guardrails in place to block adversarial manipulation and enforce system prompt protection, the agent was better protected against attacks via prompt injection and vulnerability exploits.
- ~60% improvement in search accuracy
- Reduced hallucinations, ensuringmore reliable results
- 2 Week End-to-end production deployment

