Large language models have shown an impressive ability to respond to user requests, completing a range of tasks from summarizing texts to writing programs to making lesson plans. They can even discuss these topics at different levels of complexity when prompted. Of course, they also state misinformation with the same confidence they bring to these other tasks. And they may share private information that was captured in training data (whether scraped from across the web or gathered from LLM interactions themselves).
The prompt LLMs work with has a significant effect on the output they provide and their success with a particular task. Minimal prompts can result in limited, error-prone responses. When given more context, though, the LLM can maintain a conversation across several exchanges and produce better results in line with expectations. LangChain provides several tools to help work with prompting and memory to facilitate productive exchanges with an LLM. In this tutorial, we'll look at how prompts can affect the output of an LLM by working with a Korean language learning chatbot. It'll be a simple bot that provides some key information about a given Korean vocabulary word.
First, we need to install LangChain.
$ pip install langchain
With LangChain installed, we’ll set up an LLM. We import the OpenAI class from LangChain and then instantiate our OpenAI LLM instance. For this tutorial, I load the OpenAI API key from a .env file, which isn’t shown here. By default, this OpenAI LLM instance uses `text-davinci-003`. If you’d like to work with another LLM (and possibly avoid paying to run these queries), you might try something like `google/flan-t5-xl`, which can be downloaded to your computer. Using another LLM, though, will probably result in significantly different output.
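For reference, here is a minimal sketch of that key setup, assuming the python-dotenv package and a local .env file containing OPENAI_API_KEY (the tutorial's actual setup isn't shown):
>>> from dotenv import load_dotenv
>>> load_dotenv()  # reads OPENAI_API_KEY from .env into the process environment
True
LangChain's OpenAI wrapper reads OPENAI_API_KEY from the environment, so we don't need to pass the key explicitly.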
>>> from langchain import OpenAI
>>> llm = OpenAI()
>>> llm("What does 건너편 mean?")
'\n\n건너편 means "across the street" or "on the other side".'
Great, our LLM has given us a pretty good response. It succinctly told us what "건너편" means. This word is fairly common and straightforward, but let’s push the LLM a bit more. We’ll try a slang word next.
>>> llm("What does 사빠딸 mean?")
'\n\n사빠딸 means "older brother\'s daughter", or the daughter of one\'s older brother.'
So imagine that a language learner heard this word while they were eating at an ice cream shop. From context clues, they may have been able to tell that the word related to ice cream but couldn’t quite grasp its meaning. That’s because this word is slang. Obviously, the LLM is offering some confusing information here. Let’s explore this word a bit more.
>>> llm("Wait a minute, doesn't that word relate to ice cream?")
'\n\nYes, the word "gelato" is an Italian word that is used to describe a type of ice cream. It is made with a base of milk, cream, and sugar, and usually contains fewer air bubbles than ice cream, giving it a more dense and rich texture.'
Now, the conversation has derailed. The LLM isn't returning anything related to the Korean word "사빠딸". In fact, it's switched to a random Italian word. The LLM has no memory of our previous two exchanges. Instead, it's looked at our request about a "word related to ice cream" and thought an explanation of "gelato" the most statistically likely response.
Let's think about the educational value of the response we’re getting, too. Translating words can be really great for helping someone get a quick gist of what something means. Often, though, a quick translation doesn’t give much help in understanding nuance and use.
Simple words like "cat" or "rice" are pretty easy to put into sentences. When we start thinking about more complex grammar and vocabulary, we likely want more than just a translation. Think about “건너편”. How would we use that to say “Go to the house across the street”, “Across the street, there is a pretty big park”, or “The restaurant on the other side of the bank is quite popular”? Each of these sentences has a different structure. Can we use 건너편 for each of these? Do we need to do anything special with it to make these sentences?
We might want some more information to help someone learn the words deeply and have confidence that they actually understand the word. An example sentence or two could help show how to use a word. Also, we could include synonyms and antonyms to help expand our vocabulary connections and learn adjacent words. This way we are working towards productive skills in addition to receptive skills. To get this better educational experience, we need to help the bot remember our conversation and get it to give us a particular response.
To start off with, let's make sure the LLM understands what it's supposed to be doing. We'll give it a more detailed prompt that explains what job it should focus on accomplishing. In a sense, we are giving the bot roleplay instructions about “who” it is.
>>> from langchain import PromptTemplate
>>> prompt = PromptTemplate(
...     template="The assistant helps people learn Korean words they encountered in class or daily life. The assistant provides a simple explanation in English to define a Korean word. Aside from the definition, the assistant provides at least one practical example sentence in Korean and a translation in English. Then, it will list one common Korean synonym and antonym for the given word. Query: {query} Answer:",
...     input_variables=['query']
... )
LangChain provides the PromptTemplate to help us create a structured prompt. The PromptTemplate takes a template and input_variables. The input_variables get injected into the template. Let’s check that with prompt.format. Below, you can see that “What does 건너편 mean?” gets injected in place of the {query} variable.
>>> prompt.format(query="What does 건너편 mean?")
'The assistant helps people learn Korean words they encountered in class or daily life. The assistant provides a simple explanation in English to define a Korean word. Aside from the definition, the assistant provides at least one practical example sentence in Korean and a translation in English. Then, it will list one common Korean synonym and antonym for the given word. Query: What does 건너편 mean? Answer:'
So let’s rebuild our setup using LLMChain, which combines our prompt and LLM instance. LLMChain manages these two components so that we don’t have to keep track of them with each query. We’ll run the same queries with the chain that includes our prompt!
>>> from langchain.chains import LLMChain
>>> chain = LLMChain(llm=llm, prompt=prompt)
>>> chain("What does 건너편 mean?")
{'query': 'What does 건너편 mean?',
'text': ' 건너편 (geon-neo-pyeon) means "across the street" or "on the other side." Example sentence in Korean: 나는 건너편에 있는 식당에서 점심을 먹었어요. Translation: I ate lunch at the restaurant across the street. Synonym: 반대편 (bandae-pyeon) Antonym: 이쪽 (i-jjok)'}
Now, we’re getting a lot more information about the word in our query. The LLM has returned everything we asked for - translations, an example sentence, and synonyms/antonyms. Even though we didn't ask for it, we’re also getting pronunciation support!
Let's try our other two queries again.
>>> chain("What does 사빠딸 mean?")
{'query': 'What does 사빠딸 mean?',
'text': ' 사빠딸 means "older sister". Example sentence in Korean: 나는 사빠딸이 있어요. Translation: I have an older sister. Common synonym: 형제 Common antonym: 남동생'}
>>> chain("Wait a minute, doesn't that word relate to ice cream?")
{'query': "Wait a minute, doesn't that word relate to ice cream?",
'text': ' No, the Korean word 일어나다 (il-eo-na-da) is not related to ice cream. It literally translates to "to rise" or "to occur" and is often used to describe someone coming out or an event happening. For example, "그 이벤트는 내일 일어나겠습니다." (Geu i-beon-teu-neun nae-il il-eo-na-ge-sseub-ni-da) translates to "That event will occur tomorrow." Common synonyms for 일어나다 include 발생하다 (bal-saeng-ha-da) and 나타나다 (na-ta-na-da), and antonyms include 감소되다 (gam-so-dwe-da) and 사라지다 (sa-ra-ji-da).'}
The query results for these last two exchanges have improved somewhat. Given the context that it is a Korean language learning assistant, the LLM has responded with some explanation about a Korean word. However, while the conversation seems to build, the LLM still has no memory of the first two exchanges. The third response is completely off. Confidently incorrect…
Before tackling the memory issue, let's work on our prompt a bit more. Sometimes, the LLM's output is inconsistent. It returns "Synonym:" and "Antonym:" in the first query, but the second two use "Common synonym:" or even a full sentence. Also, we probably don't need pronunciation support. Instead of focusing on our LLM’s role, we’ll provide examples of ideal exchanges. This technique is called few-shot prompting.
A few-shot prompt has more components than our first prompt. We could just dump all of these new components into our PromptTemplate. Doing so is not ideal, though, because 1) it’s harder to manage and 2) it’s helpful conceptually to divide up the parts. LangChain provides the FewShotPromptTemplate, which breaks down a prompt into prefix, examples, suffix, and input variables.
We already wrote the prefix. It just sets the role the bot should take on. However, instead of having a space for the query (i.e., the input_variables), we just prompt the LLM to pay attention to the examples we are going to give it.
>>> prefix = "The assistant helps people learn Korean words they encountered in class or daily life. The assistant provides a simple explanation in English to define a Korean word. Aside from the definition, the assistant provides at least one practical example sentence in Korean and a translation in English. Then, it will list one common Korean synonym and antonym for the given word. The following examples demonstrate the appropriate type of responses the assistant should provide to answer a user's query."
Next, we’ll build our examples up. The FewShotPromptTemplate splits examples up into two parts - the examples themselves and the PromptTemplate for them. For our exchanges, we want the user to ask about a word, and then the bot should give a reply. We’ll structure our examples as "query" and "response". Take a minute to review the structure of these examples and note the differences from the earlier LLM output.
>>> examples = [
...     {
...         "query": "What does 조용하다 mean?",
...         "response": "조용하다 means 'to be quiet'. A synonym is 한적하다, and an antonym is 시끄럽다. Example: 조용히해! 지금 공부하고 있어요. (Be Quiet! I'm studying now.)"
...     }, {
...         "query": "Can you tell me the meaning of 밝은?",
...         "response": "밝은 means 'bright'. A synonym is 선명한, and an antonym is 어두운. Example: 오늘 햇빛이 너은 밝은. (Today the sun is very bright.)"
...     }
... ]
Now, we'll build a simple PromptTemplate. This template will take in the examples we use and format them into the text that goes into our prompt. The user enters a query and the bot responds. We can represent that with the following template.
>>> example_template = """
... User: {query}
... AI: {response}
... """
With those two pieces, we can form a proper PromptTemplate. We'll just tell it what input values to expect when filling the template.
>>> example_prompt = PromptTemplate(
... input_variables=["query", "response"],
... template=example_template
... )
The last component we need is the suffix. It may seem redundant, but we need to make sure the bot understands the current question and where its answer fits in. So we need to write a place for the current query to go and a prompt for our bot to add its response.
>>> suffix = """
... User: {query}
... AI: """
All of these components together make up the FewShotPromptTemplate. While we have built one such prompt, it's worth remembering that we can have multiple prompts and even prompt types. As it becomes more integrated into an application, the LLM will require more prompts to help it achieve broader functionality. Each component of the FewShotPromptTemplate can be reused in other prompts. Additionally, it may be possible to dynamically set a specific component in response to a user’s request so that our FewShotPromptTemplate itself is responsive to the user’s goal.
>>> from langchain import FewShotPromptTemplate
>>> few_shot_prompt_template = FewShotPromptTemplate(
... examples=examples,
... example_prompt=example_prompt,
... prefix=prefix,
... suffix=suffix,
... input_variables=["query"],
... example_separator="\n\n"
... )
Calling format lets us preview the fully assembled prompt: the prefix, both examples, and then the suffix with our query filled in.
>>> print(few_shot_prompt_template.format(query="What does 유행 mean?"))
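Earlier, we noted that each component can be reused or even set dynamically. As a hypothetical illustration (this helper is not part of the tutorial), we could wrap the construction in a function and swap in a different prefix depending on the user's goal:
>>> def build_language_prompt(prefix):
...     # Reuse the shared examples, example prompt, and suffix; only the role changes
...     return FewShotPromptTemplate(
...         examples=examples,
...         example_prompt=example_prompt,
...         prefix=prefix,
...         suffix=suffix,
...         input_variables=["query"],
...         example_separator="\n\n"
...     )
>>> quiz_prompt = build_language_prompt("The assistant quizzes people on Korean words they are studying.")  # hypothetical alternate role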
Now, we’ll build out a new chain that uses our original LLM and the new FewShotPromptTemplate.
>>> language_bot_chain = LLMChain(
... llm=llm,
... prompt=few_shot_prompt_template
... )
Let’s run our questions again.
>>> language_bot_chain("What does 건너편 mean?")
{'query': 'What does 건너편 mean?',
'text': "건너편 means 'across the street'. A synonym is 반대편, and an antonym is 옆. Example: 건너편 카페가 있어요. (There is a cafe across the street.)"}
>>> language_bot_chain("What does 사빠딸 mean?")
{'query': 'What does 사빠딸 mean?',
'text': " 사빠딸 means 'daughter of a father's friend'. A synonym is 친엄마, and an antonym is 엄마. Example: 사빠딸을 방문해요. (I'm visiting my father's friend's daughter.)"}
>>> language_bot_chain("Wait a minute, doesn't that word relate to ice cream?")
{'query': "Wait a minute, doesn't that word relate to ice cream?",
'text': " Yes, 밝은 can also mean 'flavorful', usually used in the context of food and drinks. A synonym is 맛있는, and an antonym is 싱거운. Example: 이 아이스크림은 매우 밝아요. (This ice cream is very flavorful.)"}
The results now use a more consistent format. Our FewShotPromptTemplate seems to have helped the LLM write its responses in the particular format we gave it.
However, let’s take a minute to review all our exchanges so far, because there’s an important point to keep in mind with all this prompting. Across all of the queries we’ve run, the LLM’s results vary in what they return. For example, the LLM offers different antonyms for "건너편".
1st result - Synonym: 반대편 (bandae-pyeon) Antonym: 이쪽 (i-jjok)
3rd result - A synonym is 반대편, and an antonym is 옆.
It's worth remembering that LLMs generate text probabilistically, so there can be some variation in their responses. In this case, that means different students might get different results.
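One common way to reduce (though not eliminate) this variation is to lower the sampling temperature when constructing the LLM; we do exactly this later when building the ConversationChain. A quick sketch, with an illustrative variable name:
>>> consistent_llm = OpenAI(temperature=0)  # temperature=0 makes sampling greedy, so repeated queries return near-identical text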
Now, we should deal with our LLM’s memory issue. The second and third questions really aren’t working out well; we haven’t gotten a useful response to the third question, in particular. To start off, we need to change our suffix to include a “Current conversation” section.
>>> suffix = """
... Current conversation:
... {history}
... User: {input}
... AI: """
In our FewShotPromptTemplate, we’ll reuse all of our work from before. But to work with our new suffix, we’ll need to change the input_variables: we add “history”, and we rename “query” to “input” because those are the variable names ConversationChain expects by default.
>>> few_shot_prompt_template = FewShotPromptTemplate(
... examples=examples,
... example_prompt=example_prompt,
... prefix=prefix,
... suffix=suffix,
... input_variables=["history", "input"],
... example_separator="\n\n"
... )
We don’t have to manually inject the conversation turns into the prompt. LangChain can handle adjusting the prompt for us. We’ll import two parts of LangChain that help with this memory management. ConversationChain is a chain that helps facilitate a conversation by loading the conversational context from memory. ConversationBufferMemory tells the ConversationChain what type of memory to use. There are different kinds of memory in LangChain, and we can even set how many exchanges are kept in memory. We’ll just include all of them.
>>> from langchain.chains import ConversationChain
>>> from langchain.chains.conversation.memory import ConversationBufferMemory
>>> language_bot_chain2 = ConversationChain(
... llm=OpenAI(temperature=0),
... prompt=few_shot_prompt_template,
... verbose=True,
... memory=ConversationBufferMemory(),
... )
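As an aside, if we ever wanted to cap how many exchanges are kept instead of storing all of them, LangChain also provides a windowed buffer. A minimal sketch, assuming it lives in the same module as ConversationBufferMemory:
>>> from langchain.chains.conversation.memory import ConversationBufferWindowMemory
>>> windowed_memory = ConversationBufferWindowMemory(k=2)  # keep only the last 2 exchanges in the prompt
Passing windowed_memory as the memory argument would drop older turns as the conversation grows.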
So now we have an updated chain that incorporates the memory of the conversation. Let’s run through our queries again to see what happens.
>>> language_bot_chain2("What does 건너편 mean?")
{'input': 'What does 건너편 mean?',
'history': '',
'response': "건너편 means 'the other side'. A synonym is 반대편, and an antonym is 이쪽. Example: 건너편에 사람들이 많아요. (There are many people on the other side.)"}
>>> language_bot_chain2("What does 사빠딸 mean?")
{'input': 'What does 사빠딸 mean?',
'history': "Human: What does 건너편 mean?\nAI: 건너편 means 'the other side'. A synonym is 반대편, and an antonym is 이쪽. Example: 건너편에 사람들이 많아요. (There are many people on the other side.)",
'response': " 사빠딸 means 'father's younger sister'. A synonym is 삼촌, and an antonym is 이모. Example: 사빠딸이 우리 집에 왔어요. (My father's younger sister came to our house.)"}
>>> language_bot_chain2("Wait a minute, doesn't that word relate to ice cream?")
{'input': "Wait a minute, doesn't that word relate to ice cream?",
'history': "Human: What does 건너편 mean?\nAI: 건너편 means 'the other side'. A synonym is 반대편, and an antonym is 이쪽. Example: 건너편에 사람들이 많아요. (There are many people on the other side.)\nHuman: What does 사빠딸 mean?\nAI: 사빠딸 means 'father's younger sister'. A synonym is 삼촌, and an antonym is 이모. Example: 사빠딸이 우리 집에 왔어요. (My father's younger sister came to our house.)",
'response': '그렇습니다! 사빠딸은 빙수를 뜻합니다. 사빠딸은 빙수의 다른 말로도 알려져 있습니다. 예를 들어, 아이스크림이라고도 합니다.'}
Now, our LLM has maintained the conversation. The final response, roughly translated, says: "That's right! 사빠딸 means 빙수 (shaved ice). 사빠딸 is also known as another word for shaved ice. For example, it's also called ice cream." So the LLM has responded that “사빠딸” can relate to a type of ice cream. Not a perfect answer, but good enough. In fact, as you may be wondering by now, it’s the shortened form of the “Love-struck strawberry” flavor (사랑에 빠진 딸기) from Baskin-Robbins. This demonstrates that the LLM remembered the previous exchange and modified its output when we gave it additional context.
But it's a bit odd that the response is all in Korean, isn't it? Not really. We gave the LLM a lot of context for how to handle a specific type of exchange. We constrained the way it was supposed to respond. Now, we have given it a completely different kind of sentence with a different (but somewhat related) task. While it remembered the context and answered appropriately, we didn't give it a pattern for how to answer this kind of thing, so it did its best to answer. Let’s take a look at the full conversation that LangChain is keeping track of via the prompt and memory we’ve added.
> Entering new ConversationChain chain...
Prompt after formatting:
The assistant helps people learn Korean words they encountered in class or daily life. The assistant provides a simple explanation in English to define a Korean word. Aside from the definition, the assistant provides at least one practical example sentence in Korean and a translation in English. Then, it will list one common Korean synonym and antonym for the given word. The following examples demonstrate the appropriate type of responses the assistant should provide to answer a user's query.
User: What does 조용하다 mean?
AI: 조용하다 means 'to be quiet'. A synonym is 한적하다, and an antonym is 시끄럽다. Example: 조용히해! 지금 공부하고 있어요. (Be Quiet! I'm studying now.)
User: Can you tell me the meaning of 밝은?
AI: 밝은 means 'bright'. A synonym is 선명한, and an antonym is 어두운. Example: 오늘 햇빛이 너은 밝은. (Today the sun is very bright.)
Current conversation:
Human: What does 건너편 mean?
AI: 건너편 means 'the other side'. A synonym is 반대편, and an antonym is 이쪽. Example: 건너편에 사람들이 많아요. (There are many people on the other side.)
Human: What does 사빠딸 mean?
AI: 사빠딸 means 'father's younger sister'. A synonym is 삼촌, and an antonym is 이모. Example: 사빠딸이 우리 집에 왔어요. (My father's younger sister came to our house.)
User: Wait a minute, doesn't that word relate to ice cream?
AI:
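Looking at this assembled prompt, it's clear why the follow-up answer broke pattern: our examples only demonstrate definition requests, never follow-up questions. One hypothetical fix, not covered in this tutorial, is to add a follow-up exchange to the examples so the few-shot pattern covers that case; the example text below is our own illustration built from the flavor fact above:
>>> examples.append({
...     "query": "Wait a minute, doesn't that word relate to ice cream?",
...     "response": "Good catch! 사빠딸 is slang: it's short for 사랑에 빠진 딸기 ('love-struck strawberry'), a Baskin-Robbins ice cream flavor."
... })
>>> few_shot_prompt_template = FewShotPromptTemplate(
...     examples=examples,
...     example_prompt=example_prompt,
...     prefix=prefix,
...     suffix=suffix,
...     input_variables=["history", "input"],
...     example_separator="\n\n"
... )
We would then rebuild the ConversationChain with this updated template so the new example takes effect.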
Prompting and memory work together to help the LLM provide useful outputs in relation to conversational exchanges. These two techniques can shape the structure, tone, and other aspects of the LLM's output. However, they will not preclude the bot from deviating from the pattern we established when given new cases, as we just saw.
While we’ve improved this simple exchange, there is much more that would need to be done to integrate this into a production application. In particular, we would need to support more use cases than our current prompt can handle. We’d likely have several specific types of exchanges to provide practice with words or quiz our audience. We’d also likely need to think about what the LLM should not do or should not be responsive to (perhaps by using a moderation tool).
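On that last point, LangChain wraps OpenAI's moderation endpoint in OpenAIModerationChain. A minimal sketch (the query string is just an illustration; by default, the chain returns the text unchanged when nothing is flagged):
>>> from langchain.chains import OpenAIModerationChain
>>> moderation_chain = OpenAIModerationChain()
>>> moderation_chain.run("What does 건너편 mean?")
'What does 건너편 mean?'
We could run user input through a check like this before passing it to our language bot chain.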
The use of prompts with LLMs, and how to integrate them into applications, is still new, like LLMs themselves. Tools like LangChain can help us manage the memory and prompts. However, there is still much to learn and explore about effectively applying prompt engineering to applications. Here are some other great resources to check out as you consider how you might improve your prompts:
LangChain documentation - LangChain
Overview of Prompt Injection Techniques (aka security vulnerabilities) and a possible solution - Simon Willison’s Weblog
ChatGPT Prompt Engineering for Developers - DeepLearning.AI (apparently free for a limited time)
Prompt Engineering Techniques - Microsoft