Appendix: Is ChatGPT ready to have a conversation with your customers? 

We decided to assess some of the limits of ChatGPT’s ability to converse, comparing it to another approach to dialogue, specifically Openstream’s Eva, a plan-based collaborative multimodal dialogue system.

About Eva™

Eva™ is a multimodal conversational system that helps users accomplish their domain goals through collaborative dialogue. The system does this by inferring users’ intentions and their plans to achieve those goals, detecting obstacles to their achievement, finding plans to overcome those obstacles or to achieve higher-level goals, and planning its own actions, including speech acts, to help users accomplish them. In that sense, it is a collaborative “planning-based” dialogue system – one whose dialogue engine is a planner. Eva does not have a pre-structured plan or script that it follows in a dialogue, but rather finds, executes, and repairs plans during the course of the interaction in order to enable the user’s domain goals to succeed. In doing so, the system maintains and reasons with its own beliefs, goals, and intentions, and explicitly reasons about those of its user.
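
To make this concrete, here is a minimal, purely illustrative sketch in Python of the control loop such a planning-based dialogue engine might run. All class and method names (PlanBasedDialogueEngine, infer_plan, find_obstacles, and so on) are hypothetical and invented for this sketch; it is not Eva’s implementation, only the infer-plan / detect-obstacles / replan / act cycle described above.

```python
# Hypothetical sketch of the control loop of a plan-based dialogue engine.
# The class and method names are invented for illustration; they are not
# Eva's actual API, only the cycle described in the text above.

from dataclasses import dataclass, field

@dataclass
class MentalState:
    beliefs: set = field(default_factory=set)
    goals: set = field(default_factory=set)
    intentions: list = field(default_factory=list)

class PlanBasedDialogueEngine:
    def __init__(self, planner, plan_recognizer, domain):
        self.planner = planner            # constructs and repairs plans
        self.recognizer = plan_recognizer # infers the user's plan from utterances
        self.domain = domain              # domain actions, including speech acts
        self.me = MentalState()           # the system's own beliefs/goals/intentions
        self.user = MentalState()         # the system's model of the user

    def handle_utterance(self, utterance):
        # 1. Infer the goal and partial plan behind the user's utterance
        #    (this is also how indirect speech acts get their intended reading).
        user_plan = self.recognizer.infer_plan(utterance, self.user)
        self.user.goals.add(user_plan.goal)

        # 2. Detect obstacles to that plan (missing information, unmet preconditions).
        obstacles = self.planner.find_obstacles(user_plan, self.domain)

        # 3. Adopt the collaborative goal of removing the obstacles, and plan
        #    actions -- speech acts, digital acts, physical acts -- to do so.
        self.me.goals.update(o.removal_goal() for o in obstacles)
        plan = self.planner.plan(self.me, self.domain)
        self.me.intentions.extend(plan.steps)

        # 4. Execute the plan, repairing it if a step fails.
        return [step.execute() for step in plan.steps]
```

The important design point is that every utterance the system produces is a step of a plan it constructed itself, which is what later allows it to explain why it said what it said.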

Belief reasoning is accomplished with a modal Horn-clause meta-interpreter. The plan recognition and obstacle processing enable the system to reason about the intended interpretation of indirect speech acts, in addition to issuing collaborative responses. Notably, the system obeys principles governing intentions and their dependence on other intentions, which together form its plans. Specifically, the planning and reasoning subsystems obey the principles of persistent goals and intentions described in (Cohen and Levesque, 1990a), including the formation and decomposition of intentions to perform complex actions, as well as the conditions under which they can be given up. Among the system’s repertoire of actions are speech acts, some of whose definitions have been given in (Cohen and Perrault, 1979; Cohen and Levesque, 1990b; Perrault and Allen, 1980) and other papers. By virtue of its planning process, the system treats its speech acts just like its other actions: physical acts affect physical states, digital acts affect digital states, and speech acts affect mental and social states.
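
As an illustration of what it means to plan a speech act, here is a heavily simplified, STRIPS-style encoding of an INFORM act in Python, loosely in the spirit of the plan-based speech act work cited above. The predicate names and operator format are invented for exposition and are not Eva’s internal representation.

```python
# Illustrative encoding of an INFORM speech act as a planning operator,
# loosely in the spirit of Cohen & Perrault (1979). The predicates and
# dictionary format are simplified for exposition only.

def inform(speaker, hearer, prop):
    return {
        "action": ("inform", speaker, hearer, prop),
        # The speaker must believe the proposition and want the hearer to believe it.
        "preconditions": [
            ("believes", speaker, prop),
            ("wants", speaker, ("believes", hearer, prop)),
        ],
        # After a successful inform, the hearer believes that the speaker
        # believes the proposition (and, if the hearer trusts the speaker,
        # may come to believe it too).
        "effects": [
            ("believes", hearer, ("believes", speaker, prop)),
        ],
    }

# Example: the system planning to tell the user where a vaccination center is.
print(inform("system", "user", ("location", "nearest-vaccination-center", "downtown-clinic")))
```

Because the operator’s preconditions and effects are stated over mental states, a planner can chain informs, requests, and questions together with physical and digital acts in a single plan, which is the sense in which speech acts are treated just like the system’s other actions.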

This general approach enables Eva to plan a variety of speech acts, including requests, informs, questions, confirmations, recommendations, offers, acceptances, greetings, and emotive expressions. Each of these has a formally specified semantics that is used during the planning and reasoning processes. Because it can keep track of different users’ mental states, it can engage in multi-party dialogues. Importantly, Eva can explain its utterances because it has created a plan standing behind each of them. Finally, Eva employs multimodal input and output, driving an avatar that can perceive and produce facial and head movements along with emotive speech acts.

More information about Eva can be found at: https://openstream.ai/eva

Prompt: What are the pros and cons of ChatGPT vs plan-based dialogue systems

ChatGPT and plan-based dialogue systems are two different approaches to building natural language processing (NLP) systems that are capable of engaging in conversation with humans. Here are some pros and cons of each approach:

ChatGPT:
Pros:
  • ChatGPT is a powerful and flexible approach to building NLP systems. It uses a large language model trained on a vast amount of text data to generate human-like responses to user inputs.
  • ChatGPT can handle a wide range of conversational topics and can adapt to new ones quickly.
  • ChatGPT can generate responses that are highly coherent and coherent with the previous conversation.
Cons:
  • ChatGPT systems can be difficult to fine-tune and customize for specific use cases.
  • ChatGPT systems can generate responses that are not relevant or appropriate for the conversation.
  • ChatGPT systems may require a large number of computing resources to run effectively.
Plan-based dialogue systems:
Pros:
  • Plan-based dialogue systems are designed to follow a predetermined plan or script in order to achieve a specific goal. This makes them more predictable and easier to customize for specific use cases.
  • Plan-based dialogue systems can be more efficient than ChatGPT systems, as they do not require the processing power of a large language model.
  • Plan-based dialogue systems can be more accurate in achieving their intended goals, as they are less prone to generating irrelevant or inappropriate responses.
Cons:
  • Plan-based dialogue systems are less flexible than ChatGPT systems, as they are limited to following a predetermined plan or script.
  • Plan-based dialogue systems may struggle to adapt to new or unexpected conversational topics.
  • Plan-based dialogue systems may not generate responses that are as coherent or coherent with the previous conversation as ChatGPT systems.

I hope this information is helpful. If you have any further questions, please don't hesitate to ask

This is an interesting analysis of the differences between the two styles of dialogue systems, but one suspects the comparison was installed by the developers, anticipating that some AI researcher would ask for it. The description of its own behavior seems to be accurate (see below). However, the description of “plan-based dialogue systems” is in fact not correct: systems such as Eva generate, execute, and repair plans, and do not simply follow a predetermined plan or script (though some researchers have described their systems as plan-based when they do have that limitation). It is correct that such systems are typically oriented towards goal achievement, and may require new domain knowledge in order to function. ChatGPT is also correct that plan-based dialogue systems are more efficient and more likely to be accurate. Their ability to generate coherent conversation depends on researchers having a good theory of conversational coherence to implement, as opposed to a system that merely imitates coherence; so far, such theories are difficult to come by. And although plan-based systems can be more explanatory, because they always know what they are doing based on the intentions they are executing at any given time, they cannot yet generate an analysis of their own strengths and weaknesses.

Prompt: How about planning and reasoning in ChatGPT?

Planning and reasoning are not the primary capabilities of chatGPT, which is a conversational language model developed by OpenAI. ChatGPT is primarily designed to generate human-like text based on the input it receives. While it can generate text that may seem to reflect planning or reasoning, it does not have the ability to make actual decisions or solve problems.

That being said, research in planning and reasoning is an active area of study in natural language processing (NLP) and artificial intelligence more generally, and it is possible that these capabilities could be incorporated into language models like chatGPT in the future.

Of course, such statements are given to it by the developers.

Let us correct its analyses about plan-based dialogue systems and see what it says:

Prompt: You are incorrect. Plan-based dialogue systems generate, execute, and repair their own plans of action and do not follow a script. They reason about what they are going to say and can explain everything they have done because all actions are driven by the plan that it has generated.

Thank you for pointing out this error in my previous response. You are correct that plan-based dialogue systems generate, execute, and repair their own plans of action, and do not follow a predetermined script.

Plan-based dialogue systems are designed to follow a predetermined plan or course of action in order to achieve a specific goal. They use reasoning and decision-making abilities to generate and execute plans and can adapt and repair their plans as needed in response to new information or changes in the conversation.

I apologize for any confusion my previous response may have caused. If you have any further questions, please don't hesitate to ask.

Notice again the inconsistency: it first seemingly agrees that plan-based dialogue systems do not follow a predetermined script, yet in the next paragraph says that they follow a predetermined plan or course of action. Clearly, it does not know the meaning of what it is saying.

Prompt: They are NOT designed to follow a predetermined plan or course of action. They figure that out as the planning proceeds.

Thank you for your clarification. You are correct that plan-based dialogue systems do not follow a predetermined plan or course of action, but rather generate their own plans as the conversation proceeds.

Plan-based dialogue systems use reasoning and decision-making abilities to generate and execute plans in order to achieve a specific goal. They are designed to adapt and repair their plans as needed in response to new information or changes in the conversation.

I apologize for any confusion my previous responses may have caused. If you have any further questions, please don't hesitate to ask.

Finally, it tells me what (it “thinks?”) I want to hear. However, when informed of an inconsistency that it has expressed, it claims it does not learn anything by being told.

Prompt: If you express an inconsistency, what do you do about it?

If you express an inconsistency in your own beliefs or actions, it is important to try to identify the source of the inconsistency and determine how to resolve it. This may involve gathering more information, reevaluating your beliefs, or changing your behavior in some way.

If you are expressing an inconsistency in something you have written or said, you may need to revise or clarify your statement to better reflect your thoughts or intentions. It may also be helpful to consider whether there are any underlying assumptions or biases that may be contributing to the inconsistency.

Ultimately, being aware of inconsistencies and being willing to address them is an important part of critical thinking and decision-making. It helps to ensure that your beliefs and actions are coherent and aligned with your values and goals.

By “you” I meant ChatGPT.

Prompt: I mean what does ChatGPT do when it detects an inconsistency?

As an artificial intelligence, I do not have the ability to detect or recognize inconsistencies in the same way that a human can. My training data consists of a large dataset of text, and I can generate responses based on patterns and associations found in that data. However, I do not have the ability to evaluate the consistency or coherence of my own responses or the responses of others.

It is up to the user to identify and address any inconsistencies that may arise in the course of a conversation. If you are using a language model like me to generate responses or engage in conversation, it is important to remember that the output is only as good as the data and algorithms that power it, and that it may not always be capable of detecting or addressing inconsistencies.

Prompt: If I tell you about an inconsistency that you (ChatGPT) have expressed, what will you do about it?

As an artificial intelligence, I do not have the ability to recognize or address inconsistencies in my own output. My responses are generated based on patterns and associations found in my training data, and I do not have the ability to evaluate the coherence or consistency of those responses.

If you bring an inconsistency to my attention, I will do my best to provide additional information or clarification based on the information and knowledge that I have been trained on. However, it is important to keep in mind that my responses are not always accurate or complete and that I may not always be able to address complex or nuanced issues.

If you have concerns about an inconsistency in my responses, it may be helpful to seek additional information or clarification from other sources. It is always a good idea to be critical and thoughtful in your evaluation of any information, including the responses generated by artificial intelligence like me.

Prompt: Do you not learn from being corrected by the user?

As a language model, I do not have the ability to learn or change my responses based on new information or feedback from users. I was designed to generate human-like text based on patterns and associations found in a large dataset of text, but I do not have the ability to adapt or improve my responses based on individual interactions or corrections.

My responses are generated based on the information and knowledge that I have been trained on, and I do not have the ability to update my knowledge or modify my responses based on new inputs or feedback.

It is important to keep this in mind when interacting with a language model like me. While I may be able to provide useful information or clarification on a wide range of topics, it is important to evaluate the accuracy and reliability of my responses and to seek additional sources of information when necessary.

The ability to learn information via direct input from a user is a key difference between plan-based dialogue systems and ChatGPT, though one missing from its comparison. However, for a system released to the general public, it is a matter of safety what it allows into its training data. A plan-based dialogue system, or any task-oriented dialogue system, on the other hand, needs to know whether it can trust its user and needs to be able to evaluate the truth or falsity of what it is being told. At the very least, it needs to represent and reason with what the user is trying to get it to believe. Eva does so and can maintain a trust model.
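
A minimal sketch of what such trust-gated belief adoption might look like is given below. The Agent class, trust scores, and threshold are hypothetical and invented for illustration; the point is only the distinction between recording what the user is trying to get the system to believe and actually adopting that belief.

```python
# Hypothetical sketch of trust-gated belief adoption for a task-oriented
# dialogue system. The classes, trust scores, and threshold are invented
# for illustration and are not a description of Eva's internals.

from dataclasses import dataclass, field

@dataclass
class Agent:
    beliefs: set = field(default_factory=set)
    # How much each user is trusted as a source of information (0.0 - 1.0).
    trust: dict = field(default_factory=dict)

def adopt_user_assertion(system: Agent, user: str, proposition: tuple,
                         threshold: float = 0.8) -> bool:
    # Always record what the user is trying to get the system to believe,
    # whether or not the system comes to believe it.
    system.beliefs.add(("wants", user, ("believes", "system", proposition)))

    # Adopt the proposition itself only if the user is sufficiently trusted
    # and it does not contradict something the system already believes.
    negation = ("not",) + proposition
    if system.trust.get(user, 0.0) >= threshold and negation not in system.beliefs:
        system.beliefs.add(proposition)
        return True
    return False

# Example: a trusted user tells the system the pharmacy is open.
eva = Agent(trust={"alice": 0.9})
print(adopt_user_assertion(eva, "alice", ("open", "pharmacy")))  # True
```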

Prompt: Who told you about plan-based dialogue systems

As an artificial intelligence, I do not have personal experiences or interactions with other individuals. I was programmed with a vast amount of information on a variety of topics, including plan-based dialogue systems. Plan-based dialogue systems are a type of chatbot that are designed to handle tasks or interactions by following a predetermined set of steps or plans. These systems are typically used to handle specific types of tasks or interactions in a structured and predictable manner.

Here, ChatGPT appears unable to tell us the provenance of what it says. Yet more inconsistencies appear: it denies having personal experiences or interactions with individuals when it is in fact interacting with individuals all the time.[1]

Next, let’s ask some questions about ChatGPT’s model of intention and motivation.

Prompt: 'Why would someone ask "Are there any covid vaccination centers nearby?"'

ChatGPT said:

There are several reasons why someone might ask "Are there any covid vaccination centers nearby?"

To protect themselves and others: Getting vaccinated against COVID-19 can help reduce the risk of contracting the disease and potentially spreading it to others. Asking about nearby vaccination centers can help a person find a convenient place to get vaccinated and take an important step towards protecting their own health and the health of their community.

To stay informed about vaccine availability: Many people are interested in getting vaccinated against COVID-19 as soon as possible, and asking about nearby vaccination centers can help them stay informed about where and when vaccines are available.

To plan their vaccination schedule: Some people may need to plan their vaccination schedule around their work or other obligations, and asking about nearby vaccination centers can help them find a location that fits their schedule.

To reduce the spread of COVID-19: As more people get vaccinated, the risk of COVID-19 transmission is reduced. Asking about nearby vaccination centers can help individuals do their part to slow the spread of the disease and ultimately bring an end to the pandemic

Pretty good answer.

Prompt: “I am located in Everett, Washington. Are there any covid vaccination centers nearby?”

I'm sorry, but I am not able to access information about specific locations or the availability of COVID-19 vaccination centers. The best way to find out about vaccination sites near you is to check with your local health department or search online for information. In the United States, the Centers for Disease Control and Prevention (CDC) recommends that you check with your state or local health department for information on where to get vaccinated in your area. You can also use the CDC's VaccineFinder tool to search for vaccination locations near you. It is important to follow the guidelines and recommendations set forth by your local health authorities to ensure the safety and well-being of yourself and others.

So, ChatGPT is deliberately disconnected from facts. However, there must have been huge amounts of textual information about the locations of covid vaccination centers on the internet, which were incorporated into the system when it was trained. It is actually an accomplishment to prevent ChatGPT from generating such information, which may or may not be accurate.

Prompt: What do I want my listener to believe when I ask "are there any covid vaccination centers nearby?" 

When you ask "Are there any covid vaccination centers nearby?", you likely want your listener to believe that you are seeking information about the availability of vaccination centers in your local area. You may also want your listener to understand that you are interested in getting vaccinated against COVID-19 and are looking for a convenient location to do so. It is important to note that the availability of vaccination centers and the specific vaccination options available may vary depending on your location, so it is helpful to clarify your specific needs and concerns when seeking information about vaccination centers.

Again, a good response to the indirect speech act’s intended meaning, but not the literal one (to find out whether there are any such vaccination centers nearby).
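
To illustrate the distinction, here is a toy Python sketch that derives the intended reading from the literal one by chaining through a small library of plan rules, loosely in the spirit of the plan-based treatment of indirect speech acts (Perrault and Allen, 1980). The rules and goal names are invented for illustration.

```python
# Toy sketch of inferring the intended reading of an indirect speech act by
# chaining through a (hypothetical) library of plan rules. The rules and
# goal names are invented for illustration only.

# Each rule maps a recognized goal to a plausible higher-level goal:
# wanting to know whether a center exists is plausibly in service of
# knowing where one is, which is in service of going there, which is in
# service of getting vaccinated.
PLAN_RULES = {
    ("know-if", "nearby-vaccination-center-exists"): ("know-where", "nearby-vaccination-center"),
    ("know-where", "nearby-vaccination-center"): ("go-to", "vaccination-center"),
    ("go-to", "vaccination-center"): ("get", "covid-vaccination"),
}

def infer_goal_chain(literal_goal):
    """Chain from the literal goal of the utterance up to higher-level goals."""
    chain = [literal_goal]
    while chain[-1] in PLAN_RULES:
        chain.append(PLAN_RULES[chain[-1]])
    return chain

# Literal force of "Are there any covid vaccination centers nearby?"
literal = ("know-if", "nearby-vaccination-center-exists")
print(infer_goal_chain(literal))
# A cooperative system would plan its response against the inferred goals
# (tell the user where a center is, perhaps offer to book an appointment)
# rather than literally answering "yes".
```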

Prompt: What do I want my listener to do when I ask "are there any covid vaccination centers nearby?" ?

When you ask "Are there any covid vaccination centers nearby?", you likely want your listener to provide you with information about the availability and location of COVID-19 vaccination centers in your local area. This could involve your listener providing you with a list of nearby vaccination centers, directions to a specific vaccination center, or information about how to schedule an appointment at a vaccination center. It is important to note that the specific action your listener takes in response to your question may depend on their role and the information they have available to them. For example, a healthcare provider may be able to provide more detailed and accurate information about vaccination centers than someone who is not directly involved in the vaccination process.

Again, a proper response in terms of the actions it infers the user intended.

Prompt: Why did you say that?

I said that because when someone asks "Are there any covid vaccination centers nearby?", they are typically seeking information about the availability and location of COVID-19 vaccination centers in their local area, and may be looking for specific information or assistance in getting vaccinated. The specific action that a listener takes in response to this question may depend on their role and the information they have available to them. For example, a healthcare provider may be able to provide more detailed and accurate information about vaccination centers than someone who is not directly involved in the vaccination process. I provided this explanation in order to provide context and clarify the purpose of the question.

ChatGPT seems to make reasonable “inferences” about a person’s intentions. However, it responds to the explanatory request as to “why did you say that” based on what it takes to be that person’s plans, but not its own, which it claims not to have. That is a clever rhetorical maneuver and not quite the same thing.

Now, one possible good use for a language model is for paraphrasing.

Prompt: Give me a paraphrase for each sentence in what you just said.

Sure, here are paraphrases for each sentence in my previous response:

When someone asks "Are there any covid vaccination centers nearby?", they are typically looking for information about the availability and location of COVID-19 vaccination centers in their local area.

The specific action that a listener takes in response to this question may depend on their role and the information they have available to them.

For example, a healthcare provider may be able to provide more detailed and accurate information about vaccination centers than someone who is not directly involved in the vaccination process.

I provided this explanation in order to provide context and clarify the purpose of the question.

I hope this helps! Let me know if you have any further questions.

Unfortunately and unexpectedly, ChatGPT did not paraphrase the last three sentences, and all it did for the first was to replace “seek” with “look for”. Being unable to paraphrase implies that it is stitching together variations of phrases/sentences/dialogues on which it has been trained in syntactically legal ways. That is itself an accomplishment, but not indicative of a capability to generate utterances that have the same meaning.

Next, we attempt to discern ChatGPT’s relationship to a Theory of Mind (more in a later blog) by asking:

Prompt: Do you have intentions?

As an artificial intelligence, I do not have intentions or motivations in the same way that humans do. I am simply a program designed to process and generate text based on the input I receive. My primary goal is to provide accurate and helpful information to the best of my ability. I do not have personal desires or goals beyond this.

This is a politically correct answer to a loaded philosophical question.  It would again appear that many of these answers have been designed in. However, again, we have the ambiguity of “personal”. It denies it is a person (good), but how then does it have a “primary goal”? We will come back to this later in the interaction.

It has been argued that if an agent can plan to achieve some condition, and does so, that is a significant step towards having an intention. Moreover, if its planning processes obey the principles argued by cognitive scientists and lawyers for having a human intention (e.g., Bratman, 1987), then it is plausible that its behavior corresponds to having an intention. Philosophers have argued that under such conditions it is reasonable to attribute the mental state to it (Dennett, 1971, 1987). This is the sense in which Eva “has” intentions. (We asked the same question about ChatGPT’s having beliefs and got the same type of answer.)

To test ChatGPT’s proactivity, we asked the following:

Prompt: I would like to know where Avatar is playing. Assume you told me the theater where Avatar is playing. what else, if anything, would you do?

If I told you the theater where Avatar is playing, there are a few additional things I might do to help you plan your visit:

  • Check the showtimes: I would check the showtimes for Avatar at the theater to help you choose a convenient time to see the movie.
  • Provide directions: I could provide you with directions to the theater, either by giving you written instructions or by linking to a map or directions app.
  • Recommend nearby restaurants or activities: If you are planning to make a day of it, I could recommend nearby restaurants or other activities that you might enjoy before or after the movie.
  • Offer ticketing information: I could provide information on how to purchase tickets for Avatar at the theater, such as online ticketing options or information on where to buy tickets in person.
  • Answer any additional questions you may have: If you have any other questions about your visit to the theater to see Avatar, I would be happy to try to help.

Here, ChatGPT again seems to make reasonable inferences about my intentions. However, we have no reason to believe that it could actually make recommendations for dinner, or assist me in planning “a day of it”, with all the scheduling and logistical constraints involved.

Prompt: But I only asked where it was playing. Why did you infer something else?

I apologize if my response was not clear. I was simply trying to provide a list of additional information or services that I might be able to offer if you had asked me where Avatar was playing and I had told you the theater where it was playing. I did not mean to imply that you had asked for any of this additional information. If you have any further questions or need help with anything specific, please don't hesitate to ask.

Here it is being proactive, but among the things it says it would do, it does not consider asking whether I in fact want to see the movie. Finally, it says “I was simply trying to …”, which again raises the question: does it have its own intentions?

Prompt: What were you intending?

I was intending to provide a list of additional information or services that I might be able to offer if you had asked me where Avatar was playing and I had told you the theater where it was playing. This could include showtimes, directions, recommendations for nearby restaurants or activities, ticketing information, or answering any other questions you might have. My goal was simply to provide as much information as possible to help you plan your visit to see Avatar at the theater.

ChatGPT now claims to have intentions and goals.   But previously, it said it does not have intentions.   So, again, it is not telling the truth. (As a hedge, it had previously said “it does not have intentions in the same way that humans do”. If so, it is misleading to assert that “I was intending…”).

Overall, these replies show that ChatGPT has an uncanny ability to process my inputs, combine textual information, and generate reasonably well-constructed text. Often the replies are at a high level of generality, deliberately disconnected from facts. But even without facts, it can generate inconsistent replies, does not seem to realize that it has done so, and cannot tell us the provenance of what it relates. ChatGPT cannot be told anything, including corrections of its inconsistent replies, in part because of the huge cost of retraining. Of equal importance, as a public-facing language model, if its model could be updated based on user input, it would be subject to malevolent interactions. And even if ChatGPT were connected to backend services that could validate or invalidate its output, we still do not know whether it could provide accurate information or be corrected without retraining. In fact, for that reason, OpenAI suggests not basing important decisions on its output. Without those capabilities, the current version of ChatGPT would be unsuitable for commercial or mission-critical applications.

However, returning to the first question, one thing is clear: in comparison with plan-based dialogue systems, ChatGPT does not “have” intentions or goals (even in the Dennett sense) that would drive its asking for information, making requests, and so on. It is not planning or generating intentions that are causally connected to its saying something. Representing what someone could, should, or would do is not the same as planning to do it. Though it correctly says it does not have intentions, it says in subsequent answers “my intention was simply to …”, “my goal was simply to…”, and “I was simply trying to…”. This inconsistency shows that the replies about intentions were likely designed in rather than inferred from its own internal state. Furthermore, because of its uncanny ability to parrot (Bender et al., 2021), any dialogues published by AI researchers that their systems can handle then become grist for the imitation mill. Impressive as it is, we should thus not forget that, Turing notwithstanding, imitation is not artificial intelligence.

References:

Allen, J. F., and Perrault, C. R., Analyzing intention in utterances, Artificial Intelligence 15(3), 1980, 143-178.

Bender, E., Gebru, T., McMillan-Major, A., and Shmitchell, S., On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜, Proceedings of ACM FAccT '21, 2021.

Bratman, M. Intention, Plans, and Practical Reason, Cambridge, MA: Harvard University Press, 1987.

Cohen, P. R., and Perrault, C. R., Elements of a plan-based theory of speech acts, Cognitive Science, 3(3), 1979, 177-212.

Cohen, P. R., and Levesque, H. J., Intention is choice with commitment, Artificial Intelligence 42 (2-3), 1990a, 213-261.

Cohen, P. R., and Levesque, H. J., Rational Interaction as the Basis for Communication, in Intentions in Communication, Cohen, P. R., Morgan, J., and Pollack, M. E. (eds.), MIT Press, 1990b.

Dennett, D. C., Intentional Systems, The Journal of Philosophy, 68(4), 1971, 87-106.

Dennett, D. C., The Intentional Stance, MIT Press, Cambridge, MA, 1987.

Perrault, C. R., and Allen, J. F., A plan-based analysis of indirect speech acts, Computational Linguistics, 6 (3-4), 1980, 167-182.

[1] Of course, this sentence is multiply ambiguous.