These days Generative AI is being employed for everything from interpretation and summarisation of text to problem solving with a conversational natural language interface. You can now get output from a computer by using the same kind of language you use to speak to other people. Recent developments such as the release of tools like ChatGPT powered by Large Language Models have put Generative AI into the hands of anyone with an internet connection.
What sort of conceptual model should we have in mind when thinking about LLM systems? This question was on my mind a few weeks ago while attending TuringFest 2023. In this post I’ll share some highlights from the conference and attempt to pull together a conceptual model for generative AI systems based on what I learned there.
In his talk “Building Products in the Age of AI”, Fergal Reid highlighted the “accidental bundling” of two components in Generative AI systems. The first is a reasoning engine, which in my mind is the machine learning model trained on some input data to learn how to reason. The second is the database, which may or may not be used in addition to the model to generate output:
|Data used to generate output|Large Language Model|
It’s a bit like a traditional computer with processing applied to some data. Except now the processing is to generate reasoned answers rather than execute predetermined instructions.
And then there’s the input. How do we interact with the model? Bastian Grimm shared some tips for creating a well-structured ChatGPT prompt in his talk “The Rise of AI: Strategies and Tips to Drive Growth”. The suggested structure included the following information:
|Who is ChatGPT creating as?|What is the situation it is creating for?|What specifically do you want it to do?|
|How do you want it to return its response?|Samples of the output that you expect.|What should ChatGPT not do?|
This looks like we’re writing clauses in a SQL query. I’m suspicious of natural language as a good, precise interface for anything. I think we should consider writing prompts for LLMs more like structured programming, and this seems to be an active area of research.
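To make that concrete, here is a minimal Python sketch of a prompt treated as a structured object rather than free text. The class name `PromptSpec`, its field names and the section labels are all my own hypothetical choices for illustration; they map onto the six questions above but are not taken from the talk or from any library.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the six parts of the suggested prompt structure."""

    role: str           # Who is ChatGPT creating as?
    context: str        # What is the situation it is creating for?
    task: str           # What specifically do you want it to do?
    output_format: str  # How do you want it to return its response?
    examples: list[str] = field(default_factory=list)     # Samples of the expected output
    constraints: list[str] = field(default_factory=list)  # What should ChatGPT not do?

    def render(self) -> str:
        """Join the non-empty parts into one labelled prompt string."""
        sections = [
            ("Role", self.role),
            ("Context", self.context),
            ("Task", self.task),
            ("Format", self.output_format),
            ("Examples", "\n".join(f"- {e}" for e in self.examples)),
            ("Do not", "\n".join(f"- {c}" for c in self.constraints)),
        ]
        # Empty sections (for example, no examples given) are left out entirely.
        return "\n\n".join(f"{label}:\n{text}" for label, text in sections if text)
```

Calling `PromptSpec(role=..., context=..., task=..., output_format=...).render()` gives a single string that could be sent to a model. The appeal of this shape is that each "clause" can be validated, reused or swapped independently, which is hard to do with one block of prose.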
Now we have a large language model trained on some input data to be able to carry out reasoning tasks defined by a declarative input. So let’s bring it all together. The name that comes to my mind is “Programmable Reasoning Machine”, although I may regret writing that later!
Programmable Reasoning Machine
Fergal explained how LLMs tend to be better at interpolating between known data points than at extrapolating beyond them. Unbundling the system into a separate reasoning engine and database allows us to exploit this by constraining the knowledge the system uses, for example by restricting it to a well-curated set of documents.
We can unbundle a large language model system into three core components. These are a reasoning engine, a source of data to reason about and an interface to instruct the reasoning engine. I’m currently referring to this ensemble as a Programmable Reasoning Machine, but there may well be better labels out there.
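A minimal Python sketch of that three-part split might look like the following. Everything here is an assumption made for illustration: the class and method names are hypothetical, the "retrieval" is a naive word-overlap score rather than a real search index, and `echo_engine` is a stand-in that simply returns the prompt so the example runs without any model or API.

```python
from dataclasses import dataclass
from typing import Callable

# The reasoning engine is modelled as any function from prompt text to answer text.
ReasoningEngine = Callable[[str], str]


@dataclass
class ProgrammableReasoningMachine:
    engine: ReasoningEngine  # the reasoning engine (in practice, a call to an LLM)
    documents: list[str]     # the curated data the engine is allowed to reason about

    def retrieve(self, question: str, limit: int = 2) -> list[str]:
        """Pick documents by naive word overlap with the question."""
        words = set(question.lower().split())
        scored = [(len(words & set(doc.lower().split())), doc) for doc in self.documents]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in ranked[:limit] if score > 0]

    def build_prompt(self, question: str) -> str:
        """The interface: fixed instructions plus only the retrieved documents."""
        sources = "\n".join(f"- {doc}" for doc in self.retrieve(question))
        return (
            "Answer using only the sources below. "
            "If they do not contain the answer, say so.\n"
            f"Sources:\n{sources}\n"
            f"Question: {question}"
        )

    def ask(self, question: str) -> str:
        return self.engine(self.build_prompt(question))


def echo_engine(prompt: str) -> str:
    """Stand-in engine that returns the prompt, so the sketch runs without a model."""
    return prompt
```

With `echo_engine` plugged in, `ask` just shows the prompt a real model would receive, which makes it easy to see that only the matching curated documents reach the engine. Swapping in a real model should only mean replacing the `engine` function; the data and the interface stay under our control.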
Thinking about the system this way makes the importance of appropriate data and a clear interface apparent, and might even encourage us to be more imaginative than thinking of every solution as “just another chatbot”.
Is this a useful way of thinking about AI systems built on LLMs? Let me know what you think!