Eliza

From Real Software Documentation

Aim
In this lesson, we will put together a program to exercise our new programming muscles. We’re going to stick mostly to things we’ve been using for a while, not venturing into the new ADT and polymorphism material. We’re just trying to see what a slightly more complex program looks like.

What is Eliza?

Eliza is a classic computer program. The program pretends to chat with its user, acting as a psychiatrist or counselor. The user types plain English, and the program seems to respond intelligently in English.

The program does this by looking for particular words or phrases in the text entered by the user. If it finds a match, the program responds with a matching question or statement. If the program doesn’t find a match, it responds with a request for more information or with some other generic phrase.

Planning the Program

We have to implement several different things to make this program work. We need to provide:

  • A way to get the phrases searched for and the responses to them into the program;
  • Storage and retrieval of the search phrases and responses; and
  • The user interface and the logic that implements the actual questions and responses.

Try to decide how you would do these things before you go on.

Loading the Phrases from a File

We will load the phrases from a text file on the disk. We’ve seen how to read a file before, but this time, we have to store some structured information in the file.

The file can contain multiple entries. Each entry consists of one or more words or phrases to search for, and one or more suitable responses.

This means we need to separate:

  • Each entry from the next;
  • Within a given entry, the phrases to search for from the responses; and
  • Within the phrases and responses, the individual search terms or responses from each other.

We will do this by putting each entry on a separate line in the file. We will then separate the search phrases from the responses with a tab. Finally, we will separate the individual search phrases from each other, and the individual responses from each other, with a slash (/). So a single line from the file might look like this (the big space in the line is where the tab is):

mum/mom/mother        Tell me more about your mother./What does this tell us about your mother?

We will indicate the random responses, to be given when there is no match, by a line that is empty before the tab.
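
As a rough illustration of how such a line breaks apart, the following sketch applies NthField and CountFields (both used later in LoadFromFile) to the sample line above. The variable names are made up for this example, and the code could be pasted into any event handler, such as a button's Action event, to try it out.

```realbasic
// Illustrative only: take one sample line apart.
Dim SampleLine, Terms, Replies As String
Dim i As Integer

// Chr(9) is the tab that separates search phrases from responses
SampleLine = "mum/mom/mother" + Chr(9) + _
  "Tell me more about your mother./What does this tell us about your mother?"

Terms = NthField(SampleLine, Chr(9), 1)    // "mum/mom/mother"
Replies = NthField(SampleLine, Chr(9), 2)  // both responses, still joined by "/"

// Show each search phrase in turn: mum, mom, mother
For i = 1 To CountFields(Terms, "/")
  MsgBox NthField(Terms, "/", i)
Next
```

The same loop run over Replies would show the two responses one at a time.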

ASCII Characters and Unicode

Computers represent letters as numbers. There are two ways of doing this in Real Studio:

  • Using the ASCII (American Standard Code for Information Interchange) character set, a set of 128 characters (numbered 0 to 127), including upper and lower-case Roman letters, numbers, punctuation and a few others, including some non-printing “control” characters (like tab); and
  • Using the Unicode character set, a much larger character set, intended to encompass every symbol needed to represent every language on Earth, as well as a great many other symbols.

ASCII is an old standard. It is important for two reasons: (1) compatibility with other systems; and (2) because ASCII characters fit into a single byte[note 1].

Unicode is a newer standard for text on major computer systems. It is important because most people in the world need more than just the Roman character set to communicate. The main disadvantage of Unicode is that a character may need more than one byte (for example, two or four bytes in UTF-16, and one to four bytes in UTF-8), so text outside the ASCII range takes more space to store, and it can be a little slower to process as well.

Unicode has been designed so that its first 128 characters are the same as the ASCII character set.

Real Studio stores strings internally in ASCII if the operating system it is running on uses Roman characters, and in Unicode on other systems. Its regular string operations operate on ASCII or Unicode characters accordingly. There are also ‘B’ versions of all the operations (AscB and so on), which work on bytes only. Consult the built-in reference for more information.

For our purposes, we just need to be able to identify two non-printing characters, which we will do with the Chr function (which you should look up).
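
For instance, assuming the two characters meant here are the tab (code 9) and the carriage return (code 13) that appear later in this lesson, they could be produced and checked as in the sketch below. The variable names are only for illustration.

```realbasic
// Illustrative only: build non-printing characters from their codes.
Dim TabChar, ReturnChar As String
TabChar = Chr(9)      // tab
ReturnChar = Chr(13)  // carriage return (the Return key)

// Asc goes the other way, from a character to its code
If Asc(TabChar) = 9 And Asc(ReturnChar) = 13 Then
  MsgBox "Chr and Asc agree."
End If
```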

StringResponder

Let’s begin. Since this is a fairly simple program, we will put most of the features in a single class.

  1. Create a new Desktop Application project.
  2. Switch to the Project Editor and add a new class, with no Super and with its name set to “StringResponder”.
  3. Open its Code Editor.
  4. Add the following properties:
    Name              Data Type
    RandomResponse()  String
    RespondWith()     String
    SearchFor()       String


  5. Create a new method and name it “LoadFromFile”, with parameters “f As FolderItem, ElDelimiter As String, RespDelimiter As String”.
    It has no return type.
  6. Enter the following code for LoadFromFile.
//Pre: f is a valid text FolderItem, laid out as described below.
//Post: Each line of the file is read as one record.
//Within a record, RespDelimiter separates the strings
//to search for from the responses, and ElDelimiter
//separates multiple search strings or multiple
//responses from each other.
//Note that random responses are indicated
//by an empty search string.
Dim InputFrom As TextInputStream
Dim InputLine, searchForIn, RespondWithIn As String
Dim Counter1, Counter2 As Integer
InputFrom = TextInputStream.Open(f)
While Not InputFrom.EOF
  InputLine = InputFrom.ReadLine
  //Split searchForIn from RespondWithIn
  searchForIn = Trim(NthField(InputLine, RespDelimiter, 1))
  RespondWithIn = Trim(NthField(InputLine, RespDelimiter, 2))

  If RespondWithIn = "" Then
    //Blank or incomplete line: nothing to store, so skip it
  ElseIf searchForIn <> "" Then
    //Separate multiple searchForIns, RespondWithIns
    //and pair them up
    For Counter1 = 1 To CountFields(searchForIn, ElDelimiter)
      For Counter2 = 1 To CountFields(RespondWithIn, ElDelimiter)
        AddResponse NthField(searchForIn, ElDelimiter, Counter1), _
          NthField(RespondWithIn, ElDelimiter, Counter2)
      Next //Counter2
    Next //Counter1
  Else //Empty searchForIn, so do randoms
    //Append all RespondWithIns to RandomResponse
    For Counter2 = 1 To CountFields(RespondWithIn, ElDelimiter)
      AddRandom NthField(RespondWithIn, ElDelimiter, Counter2)
    Next //Counter2
  End If //searchForIn
Wend
InputFrom.Close


You should look up the string functions we haven’t seen before, such as NthField and Trim.
This method looks longish, but Real Studio’s string commands actually make it pretty straightforward: we separate the two major parts of the line, then just loop over the parts of each. We separate the storage of the information into methods such as AddResponse.

  7. Add the following methods:

AddResponse(S As String, R As String)

SearchFor.Append S
RespondWith.Append R


AddRandom(R As String)

RandomResponse.Append R


The last thing we need in our class is the method that fetches the response.
FindMatch(S as String) as String

//Pre: There is at least one element in RandomResponse
//Post: Find all elements of SearchFor that are in S,
//and return a random matching RespondWith string.
//If no matches are found,
//return a random RandomResponse string.
Dim Matches() As Integer
Dim Counter As Integer
//Build list of matches
For Counter = 0 To UBound(SearchFor)
  If InStr(S, SearchFor(Counter)) <> 0 Then
    Matches.Append Counter
  End If
Next
//Return a random match, or a random response
If UBound(Matches) <> -1 Then //Random Match
  Return RespondWith(Matches(RandomInteger(UBound(Matches))))
Else //Random Response
  Return RandomResponse(RandomInteger(UBound(RandomResponse)))
End If

This method searches the list of search words, and if one is found, its index is added to the Matches array. At the end of this process, we choose one of the indexes in the array at random, and return the matching response. If the matches array comes up empty, we return one of the RandomResponse strings.
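
To see FindMatch at work without a file, a quick check along the following lines could be placed in a temporary button's Action event. This is only a sketch: it assumes the class has been built exactly as described so far, and it will only compile once the RandomInteger function described next is in place. With a single search phrase and a single random response there is only one possible answer in each case, so the results are predictable.

```realbasic
// Illustrative only: exercise StringResponder by hand.
Dim R As StringResponder
R = New StringResponder

R.AddResponse "mother", "Tell me more about your mother."
R.AddRandom "Please go on."

// "mother" occurs in the text, so the matching response comes back
MsgBox R.FindMatch("My mother called today")

// Nothing matches, so the only random response comes back
MsgBox R.FindMatch("Hello there")
```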

Now we just need to wire in the rest. First, we need to write the RandomInteger function that FindMatch refers to. We will put RandomInteger into a module.

  1. Create a new module. Name it “RandomNumber”.
  2. Add the function “RandomInteger” with parameter “Top as Integer” with a return type of Integer.
  3. Enter the following code.
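//Return a random integer in the range 0..Top
Return Floor(Rnd * (Top + 1))


We needed a function that gives us a random whole number between 0 and a given number. The Real Studio function Rnd returns a fraction that is at least 0 and less than 1. Multiplying it by Top + 1 gives a value that is at least 0 and less than Top + 1, and Floor then drops the fractional part, so each whole number from 0 to Top is equally likely. (Rounding Rnd * Top to the nearest whole number would also stay in range, but 0 and Top would each come up only half as often as the values in between.)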
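
One way to get a feel for a function like RandomInteger is to call it many times and tally the results. The sketch below is only an illustration, with made-up variable names; it could be run from a temporary button's Action event. The counts differ on every run, so no particular output is shown, but every value from 0 to 5 should appear, and the tally makes it easy to see how evenly the values are spread.

```realbasic
// Illustrative only: tally 6000 calls to RandomInteger(5).
Dim Tally(5) As Integer   // elements 0..5, all starting at 0
Dim i, n As Integer
Dim Report As String

For i = 1 To 6000
  n = RandomInteger(5)
  Tally(n) = Tally(n) + 1
Next

// Build one line of text per value, then show the lot
For i = 0 To 5
  Report = Report + Str(i) + ": " + Str(Tally(i)) + EndOfLine
Next
MsgBox Report
```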

Now we need to build our user interface.

  1. Open Window1 and set up a user interface with two TextAreas and a button, as in the figure.
    The larger TextArea is named Discussion; the smaller is called Entry.
    [Figure: The Eliza window.]

  1. Open the window’s code editor. Add a property: Eliza As StringResponder.
  2. Turn off the Enabled property for Discussion in the Properties pane.
  3. Add the following methods:
    HandleEntry
//The user just entered something. Handle it.
Discussion.Text = Eliza.FindMatch(Entry.Text)
Entry.Text = ""


SetResponder(R as StringResponder)

Eliza = R


We put these into methods rather than directly into event handlers because the user will be able to submit what they typed in several ways.

  1. Enter the following as the KeyDown event handler for Entry:
If Key = Chr(13) Or Key = Chr(3) Then
  //Return (13) or Enter (3) was pressed
  HandleEntry
  Return True
Else
  Return False
End If

This lets the user enter what they’ve typed by pressing Enter or Return.

  1. Add the following code to the button’s Action event handler.
HandleEntry

Now we have just one thing to do: we have to create a StringResponder and load its contents when the application starts.

  1. In the Project Editor, double-click the App class and add the following code to its Open event handler:
//Initialize Eliza from file
Dim f As FolderItem
Eliza = New StringResponder
f = New FolderItem
f = f.Child("Responses.txt")
//Child can return an item for a file that does not exist,
//so check Exists as well as Nil
If f <> Nil And f.Exists Then
  Eliza.LoadFromFile f, "/", Chr(9)
Else
  MsgBox "Error: Responses.txt not found."
  Quit
End If
Window1.SetResponder Eliza

You should read up on anything new here, particularly the Child method of FolderItem.
A new FolderItem starts out pointing at the folder the application is in (or when you’re running in the IDE, the folder you saved the project in), so this code will try to load a file called Responses.txt from that same folder.

  1. Add the following property to the App class:
Eliza As StringResponder

Scope of Identifiers

You may have noticed that we have defined two properties with identical type and name (Eliza As StringResponder in both the App class and in Window1). It is important to understand that these are separate properties, which can potentially refer to different objects (although in this case they end up referring to the same object). This works because all the identifiers (the computer programming term for names in a computer program) in a program have a scope (meaning the part of the program where the identifier is valid). When you refer to a name in your code, the meaning of the name is looked for in the following order:

  • Local variables, defined and valid only within the same method, declared via a Dim statement;
  • Properties, methods or new events visible in the same code editor (meaning in the same class or window);
  • Properties or methods in a superclass (if they are not set to Private); and
  • Global or Public Properties or methods in a module or Public properties or methods in other windows and classes.

This means that an identifier defined further up in this list will prevent code from using an identifier that would otherwise be visible further down in the list. Also, remember that unless a property or method is declared as Protected or Private when the method is defined, you can always refer to it from anywhere in the program using the MyObject.SomeProperty or MyWindow.SomeMethod (and so on) dot notation.
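
A small sketch of the first two rules follows. It assumes a hypothetical method named ScopeDemo added to Window1 (it is not part of the Eliza project), and it assumes the Eliza properties in Window1 and in the App class have been left Public so that dot notation can reach them.

```realbasic
// Hypothetical method ScopeDemo in Window1, for illustration only.
Dim Eliza As StringResponder   // local variable; hides Window1's Eliza property here

Eliza = New StringResponder    // changes only the local variable

// Self.Eliza is Window1's property and App.Eliza is the App class's property;
// dot notation reaches both even while the local variable hides the bare name
If Self.Eliza Is App.Eliza Then
  MsgBox "The window and the application share one StringResponder."
End If
```

When the method ends, the local Eliza goes away, and the bare name Eliza in other Window1 methods means the window's property again.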

Test It

A suitable simple Responses.txt file should have been available in the place from which you got this lesson. If you want to write your own file, use a text editor like Notepad or BBEdit; make sure you save the new file as a Text file. That’s it. As usual, you should test the program out, and trace through any parts you are interested in.

Further Exercises

This project can obviously be improved in all sorts of ways. Here are some ideas:

  • The project should keep track of its responses better. At the moment, if you enter the same thing twice, the computer will sometimes give back the same response twice. How could this be fixed? (One suggestion: the responses should not be random from among the possible matches, but in order. Then rearrange the responses so the last one given goes to the bottom of the list. You could shuffle the list before you start to make it random between runs. There are other solutions, though, such as tagging the responses in some way). To shuffle a list, go through the list, and swap each element with another element selected at random.
  • The project could start out by asking the user’s name, and then incorporate the name into some of the responses (say, by replacing a special character such as * with the name).
  • If you feel more ambitious, allow for responses that display a dialog box and store the answer, then allow that answer to be used in other responses. The program will have to let the Responses.txt file name these answers, and it can’t know ahead of time what these names will be, so it will need another list, for storing named answers, and some way of indicating that a named answer should be inserted into a response. You’ve now got two pairs of arrays with the same role (letting you look up a value in the second array according to the value in the first array; the technical name for such a data structure is a symbol table); you might like to implement a SymbolTable class (remembering the benefits of ADTs) to take care of both tasks.
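
The shuffle mentioned in the first idea could be sketched as follows. This version is the Fisher-Yates variant, which swaps each element only with a position at or below it so that every ordering is equally likely. It is offered as one possible approach, not as part of the original project, and the method name ShuffleStrings is made up for this example. It assumes RandomInteger(Top) returns a whole number from 0 to Top.

```realbasic
// Hypothetical method: ShuffleStrings(List() As String)
// Arrays are passed by reference, so the caller's array is reordered.
Dim i, j As Integer
Dim Temp As String

For i = UBound(List) DownTo 1
  j = RandomInteger(i)   // random position in 0..i
  Temp = List(i)
  List(i) = List(j)
  List(j) = Temp
Next
```

RandomResponse can be shuffled on its own this way. SearchFor and RespondWith are paired by index, so they would need the same swaps applied to both arrays to stay in step.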

Notes