S&B Ref: 84678114 (8500010-481)
SYSTEMS AND METHODS FOR PERFORMING AUTOMATED INTERACTIVE
CONVERSATION WITH A USER
FIELD
[1] The following relates to a computer-implemented dialogue system for
conversing
with a human.
BACKGROUND
[2] A dialogue system is a computer system that converses with a human via
a user
interface. Many dialogue systems utilize a computer program to conduct the
conversation using
auditory and/or textual methods. A common name for such a computer program is
a 'chatbot'. A
chatbot may be implemented using a natural language processing system.
[3] An organization may use a dialogue system to help support and scale
their
customer relation efforts. A dialogue system may be used to provide a wide
variety of
information to many different users. For example, the dialogue system may be
used to perform
automated interactive conversation with users in order to provide answers to
questions posed by
the users. Questions originating from different users may be very different in
nature, and the
questions may be received and answered at any time of day or night. The users
of the dialogue
system may be customers or potential customers of the organization.
[4] Current dialogue systems have a technical problem in that they are
often not
robust. 'Robustness' refers to the ability of the dialogue system to
satisfactorily answer a
question posed by a human user. Some current dialogue systems may provide less
than 50%
correct/satisfactory answers. If the dialogue system returns an incorrect or
unsatisfactory answer
too often, then the dialogue system will not be adopted by human users. Also,
the organization's
reputation may be negatively impacted.
SUMMARY
[5] One way to try to increase the robustness of a dialogue system is
to invest
significant resources in the writing of questions and answers, and/or to
invest significant
resources in enriching the ability of the dialogue system to recognize
intentions. Technical
CA 3026936 2018-12-07
implementations often focus less on linguistics and more on improvements to
algorithms /
models. Achieving satisfactory results may be expensive. In some cases, the
results are not even
satisfactory because of a technical challenge: the combinatory complexity of
human language is
boundless, and so it is difficult in technical implementation to predict the
natural language a user
could use to ask a particular question. The system may be highly based on
recursion, and manual
dialogue tree manufacture may not be a viable solution.
[6] Instead, in some embodiments disclosed herein, a dialogue system is
provided
that may increase the probability of finding a satisfactory response with
relatively little iteration
of dialogue between a user and the system. An interactive process is
introduced to help facilitate
the exchange between the user and the dialogue system to try to increase the
level of robustness
of the dialogue system.
[7] In some embodiments, the number of responses in the form of questions
may be
progressively increased during an interaction with a user. This may have the
effect of increasing
the overall robustness of the dialogue system. For example, the responses that
are progressively
increased may be questions that the system determines the user may be asking.
[8] Another problem with dialogue systems is that a response formulated by
a
dialogue system is not customized based on the user. For example, if two
different users ask the
exact same question, e.g. "What is the monthly fee for your savings account",
the answer would
be the same. However, one user may actually be entitled to a preferable
monthly fee compared to
another user, e.g. based on the volume of monthly financial transactions
associated with the
user's bank accounts or based on the number of accounts held by the user.
[9] In some embodiments, a technical solution is provided in which a
response
returned by a dialogue system is generated based on financial information
specific to the user.
The response may be in reply to a user's finance related question or finance
related action. The
response may be a question or an answer or an action.
BRIEF DESCRIPTION OF THE DRAWINGS
[10] Embodiments will be described, by way of example only, with reference
to the
accompanying figures wherein:
[11] FIG. 1 is a block diagram of a computer implemented system for
performing
automated interactive conversation with a user, according to one embodiment;
[12] FIG. 2 illustrates a flowchart of a computer-implemented method for
performing
automated interactive conversation with a user, according to one embodiment;
[13] FIGs. 3 and 4 illustrate example message exchanges on a user
interface;
[14] FIG. 5 illustrates a flowchart of a computer-implemented method for
interacting
with a user, according to one embodiment;
[15] FIG. 6 illustrates a flowchart of a computer-implemented method for
performing
automated interactive conversation with a user, according to another
embodiment; and
[16] FIG. 7 illustrates a flowchart of a computer-implemented method for
interacting
with a user, according to another embodiment.
DETAILED DESCRIPTION
[17] For illustrative purposes, specific embodiments and examples will be
explained in
greater detail below in conjunction with the figures.
[18] FIG. 1 is a block diagram of a computer implemented system 102 for
performing
automated interactive conversation with a user, according to one embodiment.
The system 102
implements a dialogue system.
[19] The system 102 includes a user interface 104 for receiving a natural
language
input originating from the user, and for providing a response to the user. The
attributes of the
user interface 104 are implementation specific and depend on how the user is
interacting with the
system 102. Two examples of a user interface 104 are illustrated in FIG. 1. In
one example, the
user interface 104 interfaces with a telephone handset belonging to the user.
The telephone
handset includes a transmitter through which the user speaks and a receiver
through which the
user hears the response. A speech recognition module 106 is included as part
of the system 102
in order to convert from speech to text. As another example, the user
interface 104 may interface
with a graphical user interface (GUI) on a computing device, such as on the
user's mobile
device. The user may use a keyboard or touchscreen to provide a text input,
and the response
would be presented as text on the display screen hosting the GUI. The user
interface 104 is the
component of the system 102 that interfaces with users, and is meant to refer
to the components
of the interface that belong to the system 102, rather than to the user
device.
[20] The system 102 further includes a data processing unit 110, which
may
implement a natural language processing system. The data processing unit 110
includes a
keyword extractor 112, an intent classifier 114, a response generator 116, and a
learning
component 118.
[21] The keyword extractor 112 receives a natural language input
originating from the
user. The input received at the keyword extractor 112 is a string of text. In
general, the string of
text includes multiple words, although in some cases it could be that the
string of text is only a
single word. The string of text may convey a question asked by the user, or a
user instruction, or
a user's answer to a question that was asked by the system 102. The keyword
extractor 112
attempts to extract words and/or phrases from the string of text. If any
keywords are extracted,
the extracted keywords are stored in a memory, e.g. memory 122.
[22] In some embodiments, the keyword extractor 112 may recognize
properties that
indicate a particular word in the string of text may be a keyword, such as the
use of a date,
capital letter, brand name, recognized phrase, etc. Examples of keyword
extraction algorithms
that may be implemented by the keyword extractor 112 are described in:
(1) Jean-Louis, L., Gagnon, M., and Charton, E., "A knowledge-base oriented
approach for
automatic keyword extraction" Computacion y Sistemas, 2013, vol. 17, no 2, p.
187-196; and
(2) Bechet, F., and Charton, E., "Unsupervised knowledge acquisition for
extracting named
entities from speech", in Acoustics Speech and Signal Processing (ICASSP),
2010 IEEE
International Conference on, pp. 5338-5341, March 2010.
[23] In some embodiments, a keyword extraction algorithm is used
involving named
entity recognition based on knowledge representation of the semantic domain
covered by the
dialog system application.
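By way of illustration only, the property-based recognition described above (dates, capital letters, brand names, recognized phrases) may be sketched as follows. This is a minimal rule-based sketch and not the algorithm of the cited references; the vocabularies `BRAND_NAMES` and `RECOGNIZED_PHRASES`, the date format, and the function name `extract_keywords` are assumptions introduced for the example.

```python
import re
from typing import List

# Hypothetical vocabularies; a deployed system might load these from a knowledge base.
BRAND_NAMES = {"mastercard", "visa"}
RECOGNIZED_PHRASES = {"savings account", "interest rate", "monthly fee"}
# Assumed date format (YYYY-MM-DD); other formats would need further patterns.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_keywords(text: str) -> List[str]:
    """Return candidate keywords found in the string of text."""
    keywords: List[str] = []
    lowered = text.lower()

    # Rule 1: recognized multi-word phrases.
    for phrase in sorted(RECOGNIZED_PHRASES):
        if phrase in lowered:
            keywords.append(phrase)

    # Rule 2: dates.
    keywords.extend(DATE_PATTERN.findall(text))

    # Rule 3: brand names, and capitalized words that do not start the input.
    words = re.findall(r"[A-Za-z]+", text)
    for index, word in enumerate(words):
        is_brand = word.lower() in BRAND_NAMES
        is_capitalized = index > 0 and word[0].isupper()
        if (is_brand or is_capitalized) and word not in keywords:
            keywords.append(word)

    return keywords
```

An empty list corresponds to the case in which no keywords are extracted.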
[24] The intent classifier 114 also receives the natural language input
originating from
the user in the form of a string of text, and analyses the string of text to
determine the intent of
the user. In some embodiments, the words in the string of text are compared to
a library of
intents and entity values. For example, if the user asked the question "What
is the rate on your
cashback Mastercard?", then the intent classifier 114 may match the word
"rate" to an intent "get
rate" that is stored in a library of intents. The intent classifier 114 may
determine that the entity
value relating to that intent is "cashback" by the presence of the word
"cashback". In such a
scenario, the intent classifier 114 therefore determines that the user is
asking for a cashback rate.
The presence of the word "Mastercard" may cause the intent classifier 114 to
determine that the cashback rate requested by the user is the cashback rate
for the Mastercard™ brand credit card.
The intent classifier 114 may associate a confidence value with the determined
intent. The
confidence value will be referred to as a "confidence score", and it
quantifies how confident the
intent classifier 114 is regarding the correctness of its determined intent.
For example, the intent
determined by the intent classifier 114 may be "get cashback rate for
Mastercard™ brand credit card". However, this intent is not necessarily
correct, e.g. there is some ambiguity from the string of text as to whether
the rate requested is cashback rate or another type of rate instead (e.g.
interest rate for the Mastercard™ brand credit card). Therefore, the
confidence score may not be 100%, but may instead have a lower value, e.g. 75%.
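By way of illustration only, the matching of words against a library of intents and entity values may be sketched as follows. The library contents, the function name `classify_intent`, and in particular the scoring rule (0.25 for the intent plus 0.25 for each of the entity value and the product that is found) are assumptions chosen so that the example above yields 75%; the description does not prescribe how the confidence score is computed.

```python
import re
from typing import Dict, Optional, Set, Tuple

# Hypothetical library of intents and entity values.
INTENT_LIBRARY: Dict[str, str] = {"rate": "get rate", "fee": "get fee"}
ENTITY_VALUES: Dict[str, str] = {"cashback": "cashback", "interest": "interest"}
PRODUCTS: Set[str] = {"mastercard", "visa"}


def classify_intent(text: str) -> Tuple[Optional[str], float]:
    """Return (intent label, confidence score) for the string of text."""
    words = re.findall(r"[a-z]+", text.lower())

    intent = next((INTENT_LIBRARY[w] for w in words if w in INTENT_LIBRARY), None)
    if intent is None:
        return None, 0.0

    entity = next((ENTITY_VALUES[w] for w in words if w in ENTITY_VALUES), None)
    product = next((w for w in words if w in PRODUCTS), None)

    # Assumed scoring: each resolved element adds a fixed amount of evidence.
    score = 0.25
    label = intent
    if entity is not None:
        score += 0.25
        label = label.replace("get ", f"get {entity} ", 1)
    if product is not None:
        score += 0.25
        label = f"{label} for {product}"
    return label, score
```

A trained statistical classifier would normally replace the fixed increments used here.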
[25] In other embodiments, the intent classifier 114 instead works by
simply looking
for matches between words in the natural language input and words in
prewritten questions that
are stored in memory 122.
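By way of illustration only, matching words in the input against prewritten questions may be sketched as follows. The Jaccard overlap used as the match score, the example questions, and the names `tokenize` and `rank_questions` are assumptions; the description only requires that matches between words be found.

```python
import re
from typing import List, Set, Tuple

# Hypothetical prewritten questions, standing in for those stored in memory 122.
PREWRITTEN_QUESTIONS = [
    "What is the cashback rate for the Mastercard credit card?",
    "What is the interest rate for the Mastercard credit card?",
    "What is the monthly fee for the savings account?",
]


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_questions(user_input: str, questions: List[str]) -> List[Tuple[str, float]]:
    """Rank prewritten questions by word overlap with the user input."""
    user_words = tokenize(user_input)
    scored = []
    for question in questions:
        question_words = tokenize(question)
        union = user_words | question_words
        # Jaccard overlap: shared words divided by all distinct words.
        score = len(user_words & question_words) / len(union) if union else 0.0
        scored.append((question, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The overlap value may then serve as a stand-in for the confidence score when a threshold is applied.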
[26] One example of an algorithm that may be implemented by the intent
classifier 114
is described in: Serban, I. V., Sordoni, A., Bengio, Y., Courville, A. C., and
Pineau, J., "Building
End-To-End Dialogue Systems Using Generative Hierarchical Neural Network
Models", in
Association for the Advancement of Artificial Intelligence (AAAI), Vol. 16,
pp. 3776-3784,
February 2016.
[27] The response generator 116 receives the intent from the intent
classifier 114,
determines the question that the user is possibly asking based on the intent,
and returns the
possible question to the user for verification. Once the user verifies the
question, the response
generator 116 formulates and returns the answer. In some embodiments, the
answer may be
stored in memory and simply retrieved using a mapping between the verified
question and the
answer. In other embodiments, the response generator 116 may need to send a
request over a
network to obtain the answer. For example, if the verified question is "what
is the cashback rate for Mastercard™ brand credit card", then the response
generator 116 may query a database storing the cashback rate in order to
obtain the cashback rate, and then formulate and send the response to the
user, e.g. "The cashback rate for our Mastercard™ is 1%".
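By way of illustration only, the two ways of obtaining an answer (a stored mapping, or a request to another source) may be sketched as follows. The mapping contents, the `remote_lookup` callable standing in for a network or database request, and the fallback message are assumptions introduced for the example.

```python
from typing import Callable, Dict, Optional

# Hypothetical mapping between verified questions and stored answers.
STORED_ANSWERS: Dict[str, str] = {
    "What is the monthly fee for the savings account?": "The monthly fee is $10.",
}


def formulate_answer(
    verified_question: str,
    remote_lookup: Callable[[str], Optional[str]],
) -> str:
    """Return the stored answer if one exists, else ask the remote source."""
    stored = STORED_ANSWERS.get(verified_question)
    if stored is not None:
        return stored

    # No stored answer: fall back to a request to another source.
    fetched = remote_lookup(verified_question)
    if fetched is None:
        return "Sorry, I could not find an answer to that question."
    return fetched
```

Passing the lookup as a parameter keeps the sketch independent of any particular database or network interface.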
[28] The learning component 118 adapts the keyword extractor 112 and/or
intent
classifier 114 based on information provided by the user, as discussed in more
detail later.
[29] Operation of the intent classifier 114, response generator 116, and
learning
component 118 will be explained in more detail below in relation to FIG. 2.
[30] The system 102 further includes a memory 122 for storing information
used by
the data processing unit 110. For example, the memory 122 may store a library
of intents, the
extracted keywords from the keyword extractor 112, responses or partial
responses
preprogrammed for use by the response generator 116, etc.
[31] The data processing unit 110 and its components (e.g. the keyword
extractor 112,
intent classifier 114, response generator 116, and learning component 118) may
be implemented
by one or more processors that execute instructions (software) stored in
memory. The memory in
which the instructions are stored may be memory 122 or another memory not
illustrated. The
instructions, when executed, cause the data processing unit 110 and its
components to perform
the operations described herein, e.g. extracting keywords from the user input,
classifying intent,
computing a confidence score, formulating the response to send to the user,
updating one or
more algorithms based on input from the user, etc. In some embodiments, the
one or more
processors consist of a central processing unit (CPU).
[32] Alternatively, some or all of the data processing unit 110 and its
components may
be implemented using dedicated circuitry, such as an application specific
integrated circuit
(ASIC), a graphics processing unit (GPU), or a programmed field programmable
gate array
(FPGA) for performing the operations of the data processing unit 110 and its
components.
[33] In some embodiments, in order to try to increase the robustness of the
dialogue
system, an interactive process is used in which the number of responses may be
progressively
increased during an interaction with a user. Example embodiments are provided
below.
[34] FIG. 2 illustrates a flowchart of a computer-implemented method for
performing
automated interactive conversation with a user, e.g. in order to provide an
answer to a question
from the user, according to one embodiment.
[35] In step 202, a natural language input originating from a user is
received via the
user interface 104. The natural language input is a string of text that
conveys a question.
[36] In step 204, the keyword extractor 112 attempts to extract keywords
from the
natural language input. If one or more keywords are extracted, then they are
stored in memory
122.
[37] In step 206 the intent classifier 114 determines an intent from the
natural language
input. The intent classifier 114 also determines the confidence score for its
determined intent. In
step 207, if the confidence score is below a threshold, then the method
proceeds to step 221,
explained later. Otherwise, if the confidence score is above the threshold, it
indicates that the
system 102 is confident enough in its determined intent to return a single
question for
verification, and the method proceeds to step 208.
[38] In step 208, the response generator 116 returns the question for
verification by the
user.
[39] In step 209, the data processing unit 110 determines whether the
question returned
in step 208 was verified as correct by the user. Step 209 may include
receiving a natural
language input from the user, and the intent classifier 114 determining from
the intent of the
natural language input whether or not the user has verified the correctness of
the question.
[40] For example, the original natural language input received in step 202
may ask
"What is the rate on your cashback MasterCard?". The intent classifier 114 may
determine with
high enough confidence that the intent is "get cashback rate for MasterCard",
and so in step 208
the response generator 116 returns "I think I understand your question. Can
you verify for me
that your question is: What is the cashback rate for the MasterCard credit
card?" The user replies
"Yes". The input "Yes" is determined in step 209 to be verifying that the
question is correct.
[41] If the question is verified as correct, then in step 210 the response
generator 116
returns the answer to the question, and the method ends. If the question is
not verified as correct,
then the method proceeds to step 211.
[42] When a question is derived by the intent classifier 114 from the
natural language
input, and the confidence score is above the threshold, then the derived
question is referred to as
a "likely question". A "likely question" is a question that the system 102
determines was likely
conveyed by the natural language input. In step 208, it is the likely question
that is returned.
However, it is only a "likely" question because it is not necessarily the
actual question that was
asked, e.g. if the intent determined by the intent classifier 114 does not
correctly reflect the
user's intent.
[43] If step 211 is reached, it means that the initial question presented
to the user for
verification in steps 208 and 209 is not verified as correct. In step 211, the
data processing unit
110 determines whether one or more keywords were recognized and extracted by
the keyword
extractor 112 in step 204. If no keywords were recognized, then the method
proceeds from step
211 to step 230. Step 230 is explained later. Otherwise, if one or more
keywords were
recognized and extracted, then the method proceeds from step 211 to 212.
[44] In step 212, the intent classifier 114 identifies n alternative
intents based on the
keywords extracted in step 204, where n is a natural number. n may vary
depending upon how
many alternative intents can be determined, and n may also be capped. For
example, if only one
alternative intent is determined by the intent classifier 114, then n is
limited to n = 1. As another
example, if five alternative intents are determined by the intent classifier
114, then n may be
capped at four, e.g. only the top four alternative intents are identified.
[45] An alternative intent is identified by the intent classifier 114 as
follows: the
keywords are processed, but instead of identifying the most likely intent
(identified in step 206),
a different intent is identified that is determined to be less likely, e.g.
has a lower confidence
score. For example, the user's question may be "What is the rate on your
cashback Mastercard?"
The intent classifier 114 determines two possible intents: (1) the user is
requesting the cashback rate for the Mastercard™ brand credit card, and the
confidence score of this determined intent is 75%; or (2) the user is
requesting the interest rate for the Mastercard™ brand credit card, and the
confidence score of this determined intent is 65%. The intent identified in
step 206 is the one
with the higher confidence score, which in this example is cashback rate. The
(n = 1) alternative
intent identified in step 212 is the one with the lower confidence score,
which in this example is
interest rate. Each alternative intent corresponds to an alternative question
the user might be
asking, which is derived from the natural language input conveying the
question that was
received at step 202.
[46] The n alternative intents correspond to n alternative questions, and
in step 214 the
n alternative questions are returned to the user via the user interface 104.
[47] In step 215 it is determined whether one of the n alternative
questions is identified
as correct by the user. Step 215 may be performed by determining the intent of
an input received
from the user after the n alternative questions are presented to the user. For
example, if the user
responds "First question", then the system determines that the first question
of the n alternative
questions is the correct question.
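By way of illustration only, interpreting a reply such as "First question" may be sketched as follows. The description performs this step by determining the intent of the reply; the simple ordinal and digit matching below, the word lists, and the name `parse_selection` are assumptions used in place of a full intent classifier.

```python
import re
from typing import Optional

# Hypothetical vocabularies for recognizing a selection or a rejection.
ORDINALS = {"first": 0, "second": 1, "third": 2, "fourth": 3}
NEGATIVE_WORDS = {"none", "neither", "no"}


def parse_selection(reply: str, n: int) -> Optional[int]:
    """Return the zero-based index of the selected question, or None."""
    words = re.findall(r"[a-z0-9]+", reply.lower())

    # A rejection means that none of the n questions is correct.
    if any(word in NEGATIVE_WORDS for word in words):
        return None

    for word in words:
        index = ORDINALS.get(word)
        if index is None and word.isdigit():
            index = int(word) - 1  # "2" selects the second question
        if index is not None and 0 <= index < n:
            return index
    return None
```

A return value of None corresponds to the "No" branch, in which the method proceeds to step 230.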
[48] If one of the n alternative questions is identified as correct, then
in step 216 the
response generator 116 returns the corresponding answer and the method ends.
If none of the n
alternative questions is identified as correct, then the method proceeds to
step 230. Step 230 is
described later.
[49] Returning to step 207, if the confidence score of the intent determined
by the intent classifier 114 in step 206 is below the threshold, then the
method proceeds to step 221. If step 221 is reached, it means
reached, it means
that an intent has been determined from the natural language input, but the
intent classifier 114 is
not particularly confident that the determined intent is correct.
[50] Therefore, in step 221, the data processing unit 110 determines
whether one or
more keywords were recognized and extracted by the keyword extractor 112 in
step 204. If no
keywords were recognized, then the method proceeds from step 221 to step 230.
Step 230 is
explained later. Otherwise, if one or more keywords were recognized and
extracted, then the
method proceeds from step 221 to step 222. In step 222, the intent classifier
114 identifies the k
most likely intents, where k is a natural number greater than or equal to one.
k does not need to
have any relation to n, but in some embodiments k = n or k = n + 1. k may vary
depending
upon how many intents can be determined, and k may also be capped. The k
intents returned
may be the k intents having the highest confidence scores. For example, the
user's question may
be "What is the big deal about your Mastercard?" The intent classifier 114
determines two
possible intents: (1) the user is requesting a summary of the features of the
Mastercard™ brand credit card, and the confidence score of this determined
intent is 45%; or (2) the user is requesting information on promotional
offers for signing up for the Mastercard™ brand credit card, and the
confidence score of this determined intent is 35%. Neither
intent has a high enough
confidence score to proceed to step 208, but in step 222 both intents (k = 2)
are identified. Each
intent corresponds to an alternative question the user might be asking, which
is derived from the
natural language input conveying the question that was received at step 202.
[51] The k intents correspond to k questions, and in step 224 the k
questions are
returned to the user via the user interface 104.
[52] In step 225 it is determined whether one of the k questions is
identified as correct
by the user. Step 225 may be performed by determining the intent of an input
received from the
user after the k questions are presented to the user. If one of the k
questions is identified as
correct, then in step 226 the response generator 116 returns the corresponding
answer and the
method ends. If none of the k questions is identified as correct, then the
method proceeds to step
230.
[53] If step 230 is reached in the method of FIG. 2 it means that the
system 102 is not
able to determine the question the user is asking. In step 230 the response
generator 116 sends a
reply to the user indicating this, e.g. "Sorry, I do not understand your
question. Please try to
rephrase your question".
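By way of illustration only, the branching of the method of FIG. 2 may be sketched as follows. The sketch assumes that the candidate questions and their confidence scores have already been computed and sorted in descending order, that a score equal to the threshold is treated as confident, and that a single cap applies to both n and k; the names `run_dialogue` and `ask_user` and the default values are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# (question, confidence score) pairs, sorted by descending confidence.
Candidates = List[Tuple[str, float]]


def run_dialogue(
    candidates: Candidates,
    keywords: List[str],
    ask_user: Callable[[List[str]], Optional[int]],
    threshold: float = 0.7,
    cap: int = 3,
) -> Optional[str]:
    """Return the question confirmed by the user, or None (step 230).

    ask_user presents the questions and returns the zero-based index of the
    one the user identifies as correct, or None if none is correct.
    """
    if not candidates:
        return None
    top_question, top_score = candidates[0]

    if top_score >= threshold:
        # Steps 208 and 209: return a single likely question for verification.
        if ask_user([top_question]) == 0:
            return top_question  # step 210: the answer can now be returned
        # Step 211: alternatives require at least one extracted keyword.
        if not keywords:
            return None
        # Step 212: up to `cap` less likely alternative questions.
        questions = [question for question, _ in candidates[1:1 + cap]]
    else:
        # Step 221: the same keyword check on the low-confidence branch.
        if not keywords:
            return None
        # Step 222: the k most likely questions, including the top one.
        questions = [question for question, _ in candidates[:cap]]

    if not questions:
        return None
    # Steps 214 and 215, or steps 224 and 225.
    choice = ask_user(questions)
    if choice is None or not 0 <= choice < len(questions):
        return None
    return questions[choice]
```

In both branches the number of questions presented grows only after a single question has failed or could not be offered with confidence, which is the progressive increase described above.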
[54] FIG. 3 illustrates an example message exchange on a user interface
104,
according to one embodiment. The message exchange corresponds to steps 202,
204, 206, 207,
208, 209, 211, 212, 214, 215, and 216 of FIG. 2. The number of responses (in
the form of
questions) is progressively increased during the interaction with the user. In
particular, initially
only one question is presented for verification at 382. However, upon
receiving user feedback
indicating that the initial question is incorrect, n = 3 alternative questions
are provided at 384.
The user indicates that the first one of the three alternative questions is
correct at 386, and the
answer corresponding to that question is returned at 388.
[55] FIG. 4 illustrates an example message exchange on a user interface
104,
according to another embodiment. The message exchange corresponds to steps
202, 204, 206,
207, 221, 222, 224, 225, and 226 of FIG. 2. The confidence score relating to
the most likely
intent does not exceed the threshold, and so the k = 3 most likely responses
are returned at 392.
The user indicates that the first one of the three questions is correct at
396, and the answer
corresponding to that question is returned at 398.
[56] Returning to FIG. 2, optionally, in step 234, the learning component
118 updates
the keyword extractor 112 and/or the intent classifier 114 to reflect the
user's response that
indicates which question is the correct question. For example, the learning
component 118 may
receive the output of the "Yes" branch of step 215 and/or step 225, which
indicates the correct
question, and the learning component 118 may use this indication to update or
train the intent
classifier 114 and/or the keyword extractor 112. Two examples follow.
[57] One example: The user initially asks the question "What is the big
deal about your
Mastercard?" The system does not determine an intent with a high enough
confidence score and
so three questions are returned to the user, as shown at 392 of FIG. 4. The
user replies that the
first question is correct, i.e. the correct question is "What are the
features of the Mastercard
credit card?". The learning component 118 then updates the keyword extractor
112 and/or intent
classifier 114 to add the vocabulary "big deal" and to indicate that "big
deal" is a synonym of
"features". Then, if in the future a user asks a question including "big
deal", e.g. "What is the big
deal regarding your savings account", then the intent classifier 114 will more
confidently
determine that the user intent is that the user wants to learn about the
features of the savings
account.
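By way of illustration only, recording a learned synonym and applying it to later inputs may be sketched as follows. The class name `SynonymLearner` and the approach of rewriting the input before classification are assumptions; the description does not specify how the added vocabulary is stored or used.

```python
from typing import Dict


class SynonymLearner:
    """Stores learned synonyms and rewrites inputs to use known vocabulary."""

    def __init__(self) -> None:
        self.synonyms: Dict[str, str] = {}

    def learn(self, phrase: str, canonical: str) -> None:
        """Record that `phrase` means the same as the known term `canonical`."""
        self.synonyms[phrase.lower()] = canonical.lower()

    def normalize(self, text: str) -> str:
        """Replace every learned phrase in the text with its known term."""
        normalized = text.lower()
        for phrase, canonical in self.synonyms.items():
            normalized = normalized.replace(phrase, canonical)
        return normalized
```

For example, after `learn("big deal", "features")`, an input containing "big deal" is rewritten to contain "features" before it reaches the intent classifier.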
[58] Another example: The user initially asks the question "What is the
rate on your
cashback Mastercard?" The system initially returns the incorrect question, as
shown at 382 of
FIG. 3, and so three alternative questions are returned to the user, as shown
at 384 of FIG. 3. The
user replies at 386 that the first question is correct, i.e. the correct
question is "What is the
interest rate on the Mastercard credit card". The learning component 118 then
updates the intent
classifier 114 to increase the confidence score of the entity value "interest
rate" when "rate" is
used in the user's question. Then, if in the future a user asks a similar
question, e.g. "What is the
rate on your Visa card", then the intent classifier will more confidently
determine that the user intent is that the user wants to know the interest
rate for the Visa™ brand credit card.
[59] An example of a learning algorithm that may be implemented by the
learning
component 118 is: Schatzmann, J., Weilhammer, K., Stuttle, M., & Young, S.
(2006), "A survey
of statistical user simulation techniques for reinforcement-learning of
dialogue management
strategies", The knowledge engineering review, 21(2), 97-126.
[60] In alternative embodiments, steps 208 and 209 of FIG. 2 may be
modified to
instead just return an answer to the determined question, and ask for
validation that the returned
answer is correct, in which case step 210 is not needed. For example, box 382
in FIG. 3 may
instead be: "The cashback rate on the MasterCard credit card is 1%. Did that
answer your
question?". If the user answers "Yes" then the method ends, whereas if the
user answers "No,
that did not answer my question", then the method proceeds to step 211.
[61] In some embodiments, the original question asked in the natural
language input
received at step 202 may actually consist of more than one question, in which
case the system
102 may extract and process each question separately, or the intent classifier
114 may try to
determine an overall intent. For example, if the natural language input from
the user in step 202
is "Does your bank offer multiple credit cards? What is the rate of each
one?", then the intent
classifier 114 may determine that the intent is that the user wants a
comparison of the rate of
each of the bank's credit cards.
[62] In some embodiments, the natural language input received at step 202
may not be
a question, but may instead be a request or an instruction to perform an
action. For example, the
input may be a request for information. The reply may then be a question that
confirms whether
particular information is being requested. For example, the natural language
input received in
step 202 may be "Provide me with the rate on your cashback MasterCard", and
the initial
question returned in step 208 may be "Please confirm that you are asking: What
is the cashback
rate on the MasterCard credit card?" Similarly, the alternative questions in
steps 214 and 224
may ask whether particular information is being requested.
[63] In some embodiments, the natural language input received at step 202
may be an
instruction to perform an action, e.g. "open a new account", in which case the
question(s)
returned may relate to clarification or confirmation before proceeding, e.g.
in step 214 "Do you
mean any one of the following actions: (1) Open a new savings account?; or (2)
open a new
chequing account?; or (3) open a new student account?".
[64] In some embodiments, when a user asks a question or requests an
action, the
response returned by the system 102 may be formulated based on information
specific to the
user. In some embodiments, the response may be in reply to a user's finance
related question or
finance related action. The response may be a function of the user's financial
information, e.g.
the user's prior financial transactions. The response may be a question or an
answer or an action.
[65] FIG. 5 illustrates a flowchart of a computer-implemented method for
interacting
with a user, according to one embodiment. In step 452, the data processing
unit 110 receives, in
text form, a natural language input originating from a user via the user
interface 104. The natural
language input conveys a finance related question or a finance related action
to be performed. As
an example, the user may be asking "What is the monthly fee for your savings
account?" (a
finance related question), or the user may be instructing "Please open a new
savings account" (a
finance related action).
[66] In step 454, an intent is determined from the natural language input,
possibly
using keywords extracted from the natural language input. In step 456, the
response generator
116 formulates a response (e.g. a question, an answer, or an action) based on
the intent.
However, the response formulated by the response generator 116 is based on
user-specific
financial information, as explained below.
[67] Stored in memory 122 is the identity of the user. The system 102 knows
and
stores the identity of the user because the identity of the user has been
previously provided to the
system 102. As one example, the user may have previously provided their bank
card number to
the system 102, which is used to uniquely identify the user. As another
example, the system 102
may be part of an online banking platform, and the user is signed into their
online banking, such
that the system 102 is aware of the identity of the user.
[68] Stored in a data structure, e.g. a database, is user-specific
financial information.
User-specific financial information is financial information that is specific
or unique to the user.
A non-exhaustive list of user-specific financial information includes any one,
some, or all of the
following: prior financial transactions performed by the user, e.g. a stored
record of previous
financial transactions; and/or quantity, quality, or type of financial
transactions performed by the
user; and/or user account balances; and/or number or type of accounts held by
a user (examples
of accounts include banking accounts, mortgage accounts, investment accounts,
etc.); and/or
credit information for the user; and/or information relating to banking
products utilized by the
user, e.g. whether the user has a mortgage, a credit card, investments, etc.
The data structure may
be stored in memory 122 or at another location, e.g. a database connected to
data processing unit
110 via a network.
[69] There are multiple candidate responses that may be returned to the
user, which are
selected or weighted based on the user-specific financial information. Some
examples are
provided below.
[70] Example: The natural language input originating from the user in step
452
conveys the following finance-related question: "What is the monthly fee for
your savings
account?" The intent determined in step 454 is that the user is requesting the
monthly fee for a
savings account. The response generator 116 determines the following, e.g. by
querying a
database: the standard monthly fee is $10 per month, but the fee is reduced to
$5 per month if the
user has a mortgage account or an investment account with the bank, and the
fee is reduced to $0
per month if the user has both a mortgage account and an investment account
with the bank.
Therefore, there are three candidate responses: $10, $5, or $0. The response
generator 116 uses
the user identification stored in memory 122 to query a database that lists
the accounts held by
the user. The accounts held by the user include a mortgage account, but not an
investment
account, and so the response returned to the user in step 456 is that the
monthly fee is $5, or the
response may be a question, e.g. "We can offer you a savings account for a
monthly fee of only
$5, are you interested?".
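As a non-limiting illustration, the fee selection in the example above can be sketched as follows. The function names, the account labels, and the in-memory dictionary standing in for the database of accounts held are all hypothetical and are used only for illustration; they are not part of the system 102 as described.

```python
from typing import Set

STANDARD_FEE = 10  # standard monthly fee in dollars, per the example above


def monthly_savings_fee(accounts_held: Set[str]) -> int:
    """Pick one of the three candidate fees ($10, $5, $0) from the accounts held."""
    has_mortgage = "mortgage" in accounts_held
    has_investment = "investment" in accounts_held
    if has_mortgage and has_investment:
        return 0
    if has_mortgage or has_investment:
        return 5
    return STANDARD_FEE


# Hypothetical stand-in for the database that lists the accounts held by each user.
ACCOUNTS_BY_USER = {"user-123": {"chequing", "mortgage"}}


def respond_to_fee_question(user_id: str) -> str:
    """Formulate the answer using the stored user identification."""
    fee = monthly_savings_fee(ACCOUNTS_BY_USER.get(user_id, set()))
    return f"The monthly fee for a savings account is ${fee}."
```

In this sketch, a user who is not found in the dictionary is treated as holding no accounts and is quoted the standard fee; that fallback is an assumption and is not specified above.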
[71] Another example: The natural language input originating from the user
in step
452 conveys the following finance-related question: "What is this month's fee
for my savings
account?" The intent determined in step 454 is that the user wants to know
this month's fee for
the user's savings account. The fee is a function of the number of financial
transactions
performed by the user involving the user's savings account, e.g. $1 fee for
every transfer into or
out of the savings account in the month. The response generator 116 uses the
user identification
stored in memory 122 to query a database that lists the number of transactions
that month. The
database returns a value indicating that there were three transfers since the
beginning of the
month, and so the response returned to the user in step 456 is that the fee
will be $3.
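As a non-limiting illustration, the fee computation in this example can be sketched as follows. The `Transfer` record and the function name are hypothetical; the sketch assumes the database returns one record per transfer into or out of an account, and it applies the $1 per transfer fee of the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Transfer:
    when: date    # date on which the transfer was performed
    account: str  # account the transfer moved money into or out of


def savings_fee_for_month(transfers: List[Transfer], year: int, month: int,
                          fee_per_transfer: int = 1) -> int:
    """Count the savings-account transfers in the given month and price them."""
    count = sum(
        1
        for t in transfers
        if t.account == "savings" and t.when.year == year and t.when.month == month
    )
    return count * fee_per_transfer
```

With three savings-account transfers recorded in the month, the function returns 3, matching the $3 fee of the example.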
[72] Another example: The natural language input originating from the user
in step
452 conveys the following finance-related action: "transfer $100 from my
savings account to my
chequing account". The intent determined in step 454 is that $100 is to be
transferred from the
user's savings account to the user's chequing account. The response generator
116 determines
that the user has two savings accounts ("A" and "B"), and so there are two
candidate responses:
either transfer the $100 from the user's savings account A or transfer the
$100 from the user's
savings account B. The response generator 116 uses the user identification
stored in memory 122
to query the account balances for savings accounts A and B and determines that
savings account
B has no money in it. In response, the response generator 116 performs the
transfer from savings
account A, perhaps after sending a question to the user confirming that the
money is to be
transferred from savings account A.
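As a non-limiting illustration, the choice between the two candidate source accounts can be sketched as follows. The function names are hypothetical, and the rule that an account is eligible only if its balance covers the requested amount is an assumption; the example above states only that savings account B has no money in it.

```python
from typing import Dict, List, Optional


def eligible_sources(balances: Dict[str, float], amount: float) -> List[str]:
    """Return the candidate accounts whose balance covers the amount (assumed rule)."""
    return [name for name, balance in balances.items() if balance >= amount]


def pick_source(balances: Dict[str, float], amount: float) -> Optional[str]:
    """Return the single eligible account, or None if the user must be asked."""
    eligible = eligible_sources(balances, amount)
    return eligible[0] if len(eligible) == 1 else None
```

For balances of $250 in savings account A and $0 in savings account B, `pick_source` returns "A" for a $100 transfer. When both accounts or neither account can cover the amount, the sketch returns `None`, leaving it to the caller to send a clarifying question to the user.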
[73] In some embodiments, the method of FIG. 2 may be modified to
incorporate
generating a response based on user-specific financial information. For
example, the answer
returned in step 210, 216, and/or 226 may be based on user-specific
financial information.
In a variation of FIG. 2, answers (instead of questions) may be returned in
steps 208/209,
214/215, and 224/225 (in which case steps 210, 216 and 226 are not needed).
The initial answer
returned in step 208/209 may be formulated based on financial information
specific to the user. If
in step 209 the user found the answer to be unsatisfactory (e.g. incorrect),
then the alternative
intents or answers (e.g. of step 214/215) may or may not be based on the
user's financial
information.
[74] An example: The natural language input originating from the user
conveys the
following finance-related question: "What is the rate of your savings
account?" The intent
determined is that the user is requesting the interest rate of a savings
account, and the confidence
score is high enough to immediately supply an answer to the question. The
standard interest rate
for a savings account is 1%, but can be offered at 1.5% if the user has a
mortgage account with
the bank. The response generator 116 uses the user identification stored in
memory 122 to query
a database that lists the accounts held by the user. The accounts held by the
user include a
mortgage account, and so the response returned to the user is that the
interest rate is 1.5%. It is
then determined that the user is not satisfied with the answer, e.g. the user
actually wanted to
know the fee for the savings account. A number n of alternative intents are therefore
identified, and n
corresponding alternative answers are returned to the user. However, the n
corresponding
alternative answers are not formulated based on user-specific financial
information because the
system 102 is now not as confident about whether the alternative answers even
reflect the
question actually asked by the user. This is because the confidence scores
associated with the
alternative intents are lower than the confidence score associated with the intent
initially determined.
[75] In some embodiments, the response may only be formulated based on
user-specific financial information if the confidence score of the intent
associated with the response is above a particular threshold. For example, if
the intent has a confidence score of 90% or above, then the corresponding
response is modified based on the user-specific financial information;
otherwise, the corresponding response is not modified based on the
user-specific financial information.
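As a non-limiting illustration, the confidence gating described above can be sketched as follows. The function and parameter names are hypothetical, and the confidence score is assumed to be expressed as a fraction between 0 and 1, so that the 90% threshold of the example is 0.90.

```python
PERSONALIZATION_THRESHOLD = 0.90  # the 90% threshold from the example above


def formulate_response(intent_confidence: float,
                       generic_response: str,
                       personalized_response: str,
                       threshold: float = PERSONALIZATION_THRESHOLD) -> str:
    """Use the user-specific response only when the intent is confident enough."""
    if intent_confidence >= threshold:
        return personalized_response
    return generic_response
```

For instance, an intent with a confidence score of 0.95 would yield the response modified by user-specific financial information, whereas an alternative intent with a score of 0.60 would yield the unmodified response.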
[76] FIG. 6 illustrates a flowchart of a computer-implemented method for
performing
automated interactive conversation with a user, according to one embodiment.
The automated
interactive conversation may be performed in order to provide an answer to a
question from the
user.
[77] In step 502, a user interface 104 is provided at which the user can
provide a
natural language input. The natural language input is processed by the data
processing unit 110.
The data processing unit 110 comprises at least one processor executing
instructions. The
instructions are configured to cause the data processing unit 110 to perform
the remaining steps
of FIG. 6.
[78] In step 504, the data processing unit 110 derives, from the natural
language input,
a possible question the user might be asking. An example of step 504 is
described earlier in
relation to steps 202 to 208 of FIG. 2.
[79] In step 506, the data processing unit 110 conveys the possible
question to the user
through the user interface 104 for verification by the user. An example of
step 506 is described
earlier in relation to steps 208 and 209 of FIG. 2.
[80] In step 508, the data processing unit 110 processes user input at the
user interface
104 indicating that the possible question is incorrect (e.g. the "No" branch
of step 209 of FIG. 2).
[81] In step 510, the data processing unit 110 derives a series of
alternate questions
that the user might be asking (e.g. step 212 of FIG. 2). In some embodiments,
step 510 may only
be performed if at least one keyword was recognized and extracted from the
natural language
input. In some embodiments, at least one keyword is recognized and extracted
from the natural
language input, and step 510 includes deriving the series of alternate
questions based on the at
least one keyword.
[82] In step 512, the data processing unit 110 presents the series of alternate
questions to
the user through the user interface 104.
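As a non-limiting illustration, steps 504 to 512 can be sketched as follows. The question bank, the keyword sets, and the keyword-overlap ranking are all assumptions made for illustration; the description above does not prescribe how the possible question and the alternate questions are derived.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical question bank: each known question with its associated keywords.
QUESTION_BANK = {
    "What is the interest rate of a savings account?": {"rate", "interest", "savings"},
    "What is the monthly fee for a savings account?": {"fee", "monthly", "savings"},
    "How do I open a savings account?": {"open", "savings"},
}


def extract_keywords(text: str) -> Set[str]:
    """Keep only the words of the input that appear in the bank's keyword vocabulary."""
    words = {w.strip("?.,!").lower() for w in text.split()}
    vocabulary = set().union(*QUESTION_BANK.values())
    return words & vocabulary


def rank_questions(keywords: Set[str]) -> List[str]:
    """Order questions by keyword overlap, dropping questions with no overlap."""
    scored = [(len(keywords & kws), q) for q, kws in QUESTION_BANK.items()]
    return [q for score, q in sorted(scored, key=lambda pair: -pair[0]) if score > 0]


def first_guess_and_alternates(text: str) -> Tuple[Optional[str], List[str]]:
    """Step 504 yields the first guess; step 510 yields the remaining alternates."""
    ranked = rank_questions(extract_keywords(text))
    if not ranked:
        return None, []
    return ranked[0], ranked[1:]
```

In use, the first element of the returned pair would be conveyed to the user for verification (step 506), and the list of alternates would be presented (step 512) only if the user indicates that the first guess is incorrect (step 508).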
[83] In some embodiments, deriving the possible question the user might be
asking in
step 504 includes determining a user intent from the natural language input.
In some
embodiments, deriving the possible question the user might be asking is
performed without an
extracted keyword.
[84] In some embodiments, an algorithm for extracting at least one keyword
and/or an
algorithm for determining user intent is modified based on an indication from
the user of which
one of the alternate questions is a correct question.
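As a non-limiting illustration, one simple way in which an intent-determination algorithm might be modified by the user's indication is sketched below. The additive weight update, the class name, and the method names are assumptions made for illustration only; the description above does not specify how the algorithm is modified.

```python
from typing import Dict, Iterable, Tuple


class KeywordIntentScorer:
    """Hypothetical scorer holding a weight for each (keyword, question) pair."""

    def __init__(self) -> None:
        self.weights: Dict[Tuple[str, str], float] = {}

    def score(self, keywords: Iterable[str], question: str) -> float:
        """Sum the learned weights linking the keywords to the question."""
        return sum(self.weights.get((k, question), 0.0) for k in keywords)

    def record_confirmation(self, keywords: Iterable[str], confirmed_question: str,
                            step: float = 1.0) -> None:
        """Strengthen the link between each keyword and the confirmed question."""
        for k in keywords:
            key = (k, confirmed_question)
            self.weights[key] = self.weights.get(key, 0.0) + step
```

After the user confirms that an alternate question is the correct one, later inputs containing the same keywords score higher for that question than for questions that were never confirmed.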
[85] In some embodiments, the method further includes receiving an
indication, from
the user, that a particular question of the series of alternate questions is
correct, and presenting to
the user through the user interface an answer to the particular question. In
some embodiments,
the method includes generating the answer to the particular question using
user-specific financial
information. In some embodiments, the particular question is a finance-related
question, and the
user-specific financial information relates to financial transactions
previously performed by the
user and/or accounts held by the user.
[86] FIG. 7 illustrates a flowchart of a computer-implemented method for
interacting
with a user, according to one embodiment.
[87] In step 552, a user interface 104 is provided at which the user can
provide a
natural language input conveying a finance related question or a finance
related action to be
performed. The natural language input is processed by the data processing unit
110. The data
processing unit 110 comprises at least one processor executing instructions.
The instructions are
configured to cause the data processing unit 110 to perform the remaining
steps of FIG. 7.
[88] In step 554, the data processing unit 110 derives, from the natural
language input,
a possible finance related question or possible finance related action. In
step 556, the data
processing unit 110 obtains a series of candidate responses, each of which is
a response to the
possible finance related question or the possible finance related action. In
step 558, the data
processing unit 110 selects one of the candidate responses on the basis of
user-specific financial
information. In step 560, the data processing unit 110 presents the selected
candidate response to
the user through the user interface 104. Examples are provided earlier when
describing FIG. 5
and related embodiments.
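As a non-limiting illustration, steps 556 and 558 can be sketched as follows, using the 1% and 1.5% savings interest rates of the earlier example. The `Candidate` structure, the predicate attached to each candidate, and the convention of listing the most specific candidate first are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class Candidate:
    text: str                            # response to present to the user
    applies: Callable[[Set[str]], bool]  # test against the accounts held by the user


# Hypothetical candidate responses (step 556), most specific first.
RATE_CANDIDATES = [
    Candidate("The interest rate is 1.5%.", lambda accounts: "mortgage" in accounts),
    Candidate("The interest rate is 1%.", lambda accounts: True),
]


def select_response(candidates: List[Candidate], accounts_held: Set[str]) -> str:
    """Step 558: return the first candidate that applies to this user."""
    for candidate in candidates:
        if candidate.applies(accounts_held):
            return candidate.text
    raise LookupError("no candidate response applies to this user")
```

Ending the list with a candidate that always applies means a response is selected for every user; the exception is raised only if a candidate list omits such a default.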
[89] In some embodiments, the candidate responses are a series of answers,
each
answer corresponding to a respective possible finance related question. In
other embodiments,
the candidate responses are a series of actions, each action corresponding to
a respective possible
finance related action instructed by the user. In other embodiments, the
candidate responses are a
series of questions. The questions may each correspond to a possible finance
related question
being asked. In some embodiments, the user-specific financial information
relates to financial
transactions previously performed by the user and/or accounts held by the
user. In some
embodiments, the user-specific financial information is retrieved using an
identifier of the user
stored in memory.
[90] Although the foregoing has been described with reference to certain
specific
embodiments, various modifications thereof will be apparent to those skilled
in the art without
departing from the scope of the claims appended hereto.
[91] Moreover, any module, component, or device exemplified herein that
executes
instructions may include or otherwise have access to a non-transitory
computer/processor
readable storage medium or media for storage of information, such as
computer/processor
readable instructions, data structures, program modules, and/or other data. A
non-exhaustive list
of examples of non-transitory computer/processor readable storage media
includes magnetic
cassettes, magnetic tape, magnetic disk storage or other magnetic storage
devices, optical disks
such as compact disc read-only memory (CD-ROM), digital video discs or digital
versatile discs
(DVDs), Blu-ray Disc™, or other optical storage, volatile and non-volatile,
removable and non-
removable media implemented in any method or technology, random access memory
(RAM),
read-only memory (ROM), electrically erasable programmable read-only memory
(EEPROM),
flash memory or other memory technology. Any such non-transitory
computer/processor storage
media may be part of a device or accessible or connectable thereto. Any
application or module
herein described may be implemented using computer/processor
readable/executable instructions
that may be stored or otherwise held by such non-transitory computer/processor
readable storage
media.