Evaluation Testing in Software Using LLMs
Dear readers, I hope that you are as curious as I am and will join me on this learning journey. So, get your curiosity ready and let’s get started. 🙂

Why do we need to approach testing differently in LLM-based software?

Software testing has been my passion for years, so I will dive deep and explore the options in the context of LLMs. Using LLMs introduces new challenges into how we approach testing, so it is important to know how to carry our existing testing knowledge and practices into this new era.

In traditional, non-LLM-based software, the output is predictable: for a concrete given input, we compare the actual output against a specific expected one, and such tests can be built with existing knowledge. Software with an LLM, in contrast, produces nondeterministic output: for the same input we may receive a different, yet still correct, response every time. This makes testing with a fixed input and comparing the result against a single expected output far more challenging. ...
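One common way around nondeterministic output is to assert on properties of the response rather than on exact string equality. Below is a minimal, framework-free Java sketch of that idea; the response strings stand in for real model calls, and the helper names (`containsAllFacts`, `withinLengthBounds`) are illustrative, not part of any testing library.

```java
import java.util.List;
import java.util.Locale;

// Sketch: property-based checks for nondeterministic LLM output.
// Instead of comparing against one fixed expected string, we check
// that required facts are present and that the length is sane.
public class LlmResponseChecks {

    // Passes if every required fact appears somewhere in the response,
    // regardless of wording, order, or surrounding phrasing.
    static boolean containsAllFacts(String response, List<String> requiredFacts) {
        String normalized = response.toLowerCase(Locale.ROOT);
        return requiredFacts.stream()
                .allMatch(fact -> normalized.contains(fact.toLowerCase(Locale.ROOT)));
    }

    // Guards against degenerate outputs (empty or runaway generations).
    static boolean withinLengthBounds(String response, int min, int max) {
        int len = response.trim().length();
        return len >= min && len <= max;
    }

    public static void main(String[] args) {
        // Two differently worded but equally correct answers:
        String answerA = "Paris is the capital of France.";
        String answerB = "The capital city of France is Paris.";
        List<String> facts = List.of("Paris", "France");

        System.out.println(containsAllFacts(answerA, facts)); // true
        System.out.println(containsAllFacts(answerB, facts)); // true
        System.out.println(withinLengthBounds(answerA, 10, 500)); // true
    }
}
```

Both answers pass the same test even though a character-by-character comparison would treat one of them as a failure, which is exactly the point.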
RAG with Spring AI
Dear readers, I hope that you are as curious as I am and will join me on this learning journey. So, get your curiosity and development environment ready and let’s get started. 🙂

To begin with, I will guide you through some hopefully not-so-boring terminology.

What is RAG?

R = Retrieval
A = Augmented
G = Generation

Flow without RAG

1. Original prompt, usually as a user-specific input.
2. The original prompt is sent directly to the LLM.
3. The LLM responds based on the large amount of generic data it was trained on, which is likely to be out of date.

Flow with RAG

1. Original prompt, usually as a user-specific input.
2. Retrieval of additional context and information based on the domain, for example company-specific data.
3. The original prompt is combined with the extra context retrieved in step 2, and both are sent to the LLM.
4. The LLM responds based on the up-to-date information it has been provided with.

What needs to be done to build RAG?

1. Collect and create the context-specific data. This can be achieved, for example, with another LLM technique called embedding language models, which converts textual data into a numerical representation and stores it in a vector database. Please consider that this is also a substantial transformation step: you need to have the data and, at best, store it in an LLM-friendly format, i.e. a format that the LLM understands. This way, retrieving the data later and passing it on to the LLM will be smoother and more time-efficient.
2. Retrieve the relevant context information. In this step, the original prompt, which is in text format, is transformed into a vector representation and matched against the vector database.
3. Augment the prompt for the LLM. In this step, RAG augments the original prompt by adding the retrieved specific information.

In addition, you need to keep your extra knowledge source up to date as well.
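The retrieve-and-augment steps above can be sketched in plain Java. This is a toy: real systems use an embedding language model and a vector database (for example via Spring AI abstractions), while the hash-based `embed()` below only stands in for that step so the flow is runnable end to end.

```java
import java.util.*;

// Toy RAG flow: embed documents and prompt, retrieve the best-matching
// document by cosine similarity, then augment the prompt with it.
public class RagSketch {

    // Stand-in embedding: bag-of-words hashed into a fixed-size vector.
    // A real embedding model would produce a semantic vector instead.
    static double[] embed(String text, int dims) {
        double[] v = new double[dims];
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) v[Math.floorMod(token.hashCode(), dims)] += 1.0;
        }
        return v;
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return (na == 0 || nb == 0) ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Step 2: retrieve the document whose vector best matches the prompt vector.
    static String retrieve(String prompt, List<String> documents, int dims) {
        double[] q = embed(prompt, dims);
        return documents.stream()
                .max(Comparator.comparingDouble(d -> cosine(q, embed(d, dims))))
                .orElseThrow();
    }

    public static void main(String[] args) {
        // Company-specific data that the base LLM was never trained on:
        List<String> companyDocs = List.of(
                "Our refund policy allows returns within 30 days of purchase.",
                "The cafeteria opens at 8 am on weekdays.");

        String prompt = "What is the refund policy?";
        String context = retrieve(prompt, companyDocs, 256);

        // Step 3: augment the original prompt with the retrieved context
        // before sending it to the LLM.
        String augmented = "Context:\n" + context + "\n\nQuestion: " + prompt;
        System.out.println(augmented);
    }
}
```

The augmented prompt now carries the company-specific context alongside the original question, which is exactly what gets handed to the LLM in step 3 of the flow.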
Technology Decisions and Use Case

I am going to demonstrate how to build RAG using Spring AI with text input. ...