Kingston University
Arthur the Chatterbot

Overview of the Header File


The header file contains the define statements that assign integer values to the different classes of words and to the question identities. The word classes are adjective, noun, prepositional, interrogative, adverb, pronoun, possessive, verb and article. Lists of the words that the program recognises in each class are held in the lexicon within the header file.

The question identities are also given numerical values in order to relate the answers generated by the response function to the template file. Question identities recognised by Arthur are: "What - 1", "Who - 2", "Where - 3", "Which - 4", "When - 5", "Why - 6", "How - 7", "Are - 8", "Am - 9", "Were - 10", "Is - 11", "Was - 12", "Would - 13", "Can - 14", "Could - 15", "Do - 16", "Does - 17", "Did - 18", "Have - 19", "Has - 20", "Had - 21", "Must - 22", "Might - 23", "Shall - 24", "Should - 25".
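As an illustration of how such a header might be laid out, the following is a minimal sketch and not Arthur's actual header. The macro names, the values given to the word classes, and the question_identity() helper are all assumptions made for this example; only the question numbering (What = 1 through Should = 25) comes from the list above.

```cpp
#include <cstring>

// Word classes. The names and values are illustrative assumptions only;
// the real header may number them differently.
#define ADJECTIVE      1
#define NOUN           2
#define PREPOSITIONAL  3
#define INTERROGATIVE  4
#define ADVERB         5
#define PRONOUN        6
#define POSSESSIVE     7
#define VERB           8
#define ARTICLE        9

// Question words in the order listed above, so that
// identity = position in this table + 1 (What = 1 ... Should = 25).
static const char *QUESTION_WORDS[] = {
    "what", "who", "where", "which", "when", "why", "how",
    "are", "am", "were", "is", "was", "would", "can", "could",
    "do", "does", "did", "have", "has", "had",
    "must", "might", "shall", "should"
};
static const int QUESTION_COUNT =
    sizeof(QUESTION_WORDS) / sizeof(QUESTION_WORDS[0]);

// Hypothetical helper: returns the question identity of a lower-case
// word, or 0 if the word is not a recognised question word.
int question_identity(const char *word) {
    for (int i = 0; i < QUESTION_COUNT; ++i) {
        if (std::strcmp(QUESTION_WORDS[i], word) == 0) {
            return i + 1;
        }
    }
    return 0;
}
```

With this layout, question_identity("why") yields 6, matching the numbering in the list.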

The header file also contains two functions, removed() and tolower(). The main() function sends each character of each string first to removed() and then to tolower(). removed() checks whether the character it is passed lies in the ASCII range 65-90 or 97-122, which covers all upper- and lower-case alphabetical characters. If it does, the character is returned unchanged; if it does not, a space is returned in its place. In this way, all punctuation (and any other non-alphabetical character) is removed from the strings. main() then immediately passes the result to tolower(), which returns the lower-case value of the character to main().

The input strings have now been cleaned and lower-cased, and are separated by spaces, so they can be treated as a sentence that the program can "understand". Each string is identified through the argv[] array (the standard argument array that C++ passes to main()): each input string is one argument and has its own number. Because punctuation is replaced by spaces, a cleaned string may itself contain spaces. The sentence, however, is split at the spaces between the strings, as defined by their argument numbers.
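The cleaning step described above can be sketched as follows. This is a minimal reconstruction from the description, not the original source. The lower-casing function is named lower_case() here (an assumed name) to avoid clashing with the standard library's tolower(), and the clean() wrapper is a hypothetical convenience for trying both functions on a whole string.

```cpp
#include <string>

// Keeps ASCII letters (65-90 and 97-122); anything else becomes a space.
char removed(char c) {
    if ((c >= 65 && c <= 90) || (c >= 97 && c <= 122)) {
        return c;
    }
    return ' ';
}

// Returns the lower-case value of an upper-case ASCII letter;
// every other character is returned unchanged.
char lower_case(char c) {
    if (c >= 65 && c <= 90) {
        return static_cast<char>(c + 32);
    }
    return c;
}

// Hypothetical wrapper: applies removed() and then lower_case()
// to each character, in the order the text describes.
std::string clean(const std::string &input) {
    std::string output;
    for (char c : input) {
        output += lower_case(removed(c));
    }
    return output;
}
```

For example, clean("Hello, World!") gives "hello  world " (the comma and the exclamation mark each become a space).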


Overview of the Function main ()


In main(), three counters are declared and initialised. The first counts each argument of the argv[] array into the structure sent (the sentence used by the program). The second counts the contents of each argument into the strings used in the program. The third is used to send the characters of each string to the header file functions for cleaning. main() then sends the cleaned sentence to search_database(). If no match is found in the database, the sentence is sent to parse(). main() also creates a seed value from the system clock time, so that the program can generate random numbers.
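The argument-copying and seeding steps might look roughly like the sketch below. It is an illustration under several assumptions: build_sentence(), clean_char() and seed_random() are hypothetical names, a std::vector of strings stands in for the sent structure, and copying and cleaning are folded into one pass, so only two of the three counters appear.

```cpp
#include <cstdlib>
#include <ctime>
#include <string>
#include <vector>

// Stand-in for the header-file cleaning functions: non-letters become
// spaces and upper-case letters are lowered.
char clean_char(char c) {
    if (c >= 'A' && c <= 'Z') return static_cast<char>(c + 32);
    if (c >= 'a' && c <= 'z') return c;
    return ' ';
}

// Copies argv[1] .. argv[argc - 1] into a list of cleaned words.
// argv[0] is the program name, so it is skipped.
std::vector<std::string> build_sentence(int argc, char *argv[]) {
    std::vector<std::string> sent;
    for (int arg = 1; arg < argc; ++arg) {              // counts arguments
        std::string word;
        for (int ch = 0; argv[arg][ch] != '\0'; ++ch) { // counts characters
            word += clean_char(argv[arg][ch]);
        }
        sent.push_back(word);
    }
    return sent;
}

// Seeds the random number generator from the system clock, as main() does.
void seed_random() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));
}
```

A main() built on this sketch would call seed_random() once, build the sentence, pass it to search_database(), and fall back to parse() only if no match is reported.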


Overview of the Function search_database ()


The function int search_database(char *sentence) takes as its argument a pointer to sentence[]; in other words, the sentence is passed to it from main(). It returns an integer: 0 (true, a match was found) or 1 (false, no match). The database file is opened, and the integer "no_r" is declared and initialised. An array of 20 lines, each 100 characters long, is created to hold the possible responses. A buffer 80 characters long is also created to hold the keyword or phrase that the search will try to match in the sentence.

It is then merely a question of stepping through the database, reading each line into the buffer and using strstr() to check whether the string in the buffer occurs anywhere in the sentence. If a match is found, the next line is read; this holds the number of answers in the database for that query. A random number within that range is then generated, and the result is the number of the answer that is selected. search_database() is then true and returns "0". The response goes to the display function, through main(). If no match is found, however, the function returns "1" to main(), and the sentence is sent to parse().
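The search loop can be sketched as below. This is not the original function: it assumes a database layout in which each record is a keyword line, then a line holding the number of answers, then that many answer lines. It also reads from a std::istream and returns the chosen answer through an extra response parameter instead of using the fixed-size arrays described above, so that it can be tried without a file. The return convention (0 for a match, 1 for no match) follows the text.

```cpp
#include <cstdlib>
#include <cstring>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Returns 0 and stores a randomly chosen answer in `response` when a
// keyword from the database occurs in `sentence`; returns 1 otherwise.
int search_database(std::istream &db, const char *sentence,
                    std::string &response) {
    std::string key, line;
    while (std::getline(db, key)) {
        // The line after the keyword holds the number of answers.
        if (!std::getline(db, line)) break;
        int no_r = std::atoi(line.c_str());

        // strstr() reports whether the keyword occurs in the sentence.
        bool match = !key.empty() &&
                     std::strstr(sentence, key.c_str()) != nullptr;

        // On a match, pick one of the no_r answers at random.
        int pick = (match && no_r > 0) ? std::rand() % no_r : -1;

        // Read (or skip) this record's answers, keeping the chosen one.
        for (int i = 0; i < no_r; ++i) {
            if (!std::getline(db, line)) return 1;  // truncated record
            if (i == pick) response = line;
        }
        if (pick >= 0) return 0;  // match found
    }
    return 1;  // no match: the caller would hand the sentence to parse()
}
```

With a database such as "weather", "1", "It is sunny" on three successive lines, the sentence "what is the weather like" returns 0 and selects "It is sunny". When a record holds several answers, the choice depends on the seed set in main().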