Proceedings of the IASTED International Conference

Artificial Intelligence and Soft Computing

August 9-12, 1999 - Honolulu, Hawaii - USA

 

Building an Assessment Agent in Statistics

 

Harry Keeling

 

Department of Systems and Computer Science

Howard University

2300 6th Street, NW, Washington, DC 20059

phone: (202) 806-4830

hkeeling@scs.howard.edu

 


Abstract: Traditionally, a major obstacle in building knowledge-based systems has been the “knowledge acquisition bottleneck”: the acquisition of domain knowledge has significantly slowed the development of intelligent software. Nowhere has this been truer than in intelligent educational software, where the domain experts are teachers with little or no experience in knowledge engineering. Recent advances in artificial intelligence, particularly in machine learning and knowledge acquisition, have addressed this issue and have shown promise. This paper discusses some of these advances in the context of a methodology for developing agents that synergistically combines methods from machine learning and knowledge acquisition, concepts from intelligent tutoring research, and advances in skill assessment from educational research. This methodology uses an apprenticeship, multistrategy learning approach called Disciple, in which intelligent learning agents are taught by experts with examples and explanations in much the same way that one would teach a human apprentice. This research demonstrates solutions to the problems involved in building intelligent educational software and prescribes a new approach that draws from the fields of artificial intelligence and educational research.

 

KEYWORDS: Educational Agents, Intelligent Tutoring Systems, Machine Learning, Knowledge Acquisition

 

1    Introduction

 

     The building of knowledge-based systems has been impeded by the “knowledge acquisition bottleneck”.  Difficulties in acquiring expert knowledge have significantly limited the number of intelligent software systems in use today.  This is particularly evident in intelligent educational software, where very few systems have made it into the classroom. The problem stems from the fact that the domain experts are teachers with little or no experience in knowledge engineering. Recent advances in artificial intelligence, particularly in machine learning and knowledge acquisition, have addressed this issue and have shown promise. Specifically, the Disciple apprenticeship learning approach [12] has provided the foundation for the definition and application of an agent-building methodology.  This full life cycle methodology includes a software toolkit that facilitates its application. The methodology, derived from software engineering principles, also includes a specialized agent evaluation phase. This approach has been successfully applied to develop educational agents that act as indirect communication channels between the educator and the student [9,13].

 

     This paper presents a case study of building and training the latest of these educational agents. This assessment agent has been developed and applied in the area of statistics. It generates test questions and provides tutoring through intelligent hints and explanations dynamically generated from its knowledge base. This statistical agent seeks to assess and support the development of students' higher-order thinking skills [2,3,6]. Specifically, this agent assesses students’ knowledge in the area of inferential and descriptive statistics. Further, it tutors them on issues related to statistical analysis.  This approach demonstrates the benefits of integrating machine learning and intelligent tutoring systems [1].

 

2    Overview of the Methodology

 

     The methodology used to build these intelligent agents [4] is based on the Disciple multistrategy apprenticeship learning paradigm [11,12] instead of traditional knowledge engineering.  Domain experts have used this approach and the customized and expanded toolkit, called the Disciple Learning Agent Shell [14], to build intelligent agents [4,5] that tutor and assess a learner's thinking skills. These agents were trained in much the same way that a human apprentice would be taught. Just as a human apprentice is first taught an initial set of concepts and relationships in the particular problem domain, Disciple agents are provided with an initial knowledge base composed of declarative knowledge organized into a semantic net.  Likewise, just as a human apprentice is shown examples of correct task performance, the agents are given an initial example by the domain expert.  In these agents, an example is a problem/solution pair expressed in terms of the concepts and properties in the agent’s knowledge base.  Next, the agent proposes several feasible explanations for the solution and prompts the expert to select the relevant ones. The agent also solicits additional expert explanations.  The agent then uses the initial example and the set of expert-verified explanations to form an initial rule. Guided by this rule, the agent searches its semantic net for instances of this rule.  It then displays these instances as examples using the Disciple Learning Agent Shell and domain-specific interfaces. The expert can accept or reject these examples. In an interactive dialog, the expert continues to supervise the agent as it solves new problems, and validates its solutions, until the agent has been sufficiently trained. The resulting trained agent is then given its own interface, or is integrated into the target educational package, to provide teachers and learners with intelligent features such as skill assessment and intelligent feedback.

 

3    Sample Interaction with the Agent

 

     The Disciple Statistical Agent seeks to measure the entire array of higher-order thinking skills that are required for statistical analysis and problem solving.  So the agent takes one problem and tests the analysis skills required for that problem, in turn, following closely the intuitive and natural problem solving process. This agent goes over the complete analysis of a data set containing statistical data. It first tests whether the student understands the type of the data set so as to determine the kind of questions that should be asked about that data set. Second, it tests whether the student can formalize a question in the form of hypothesis testing or statistical measures. Third, it tests whether the student can identify the techniques and tools necessary to successfully complete the analysis.  The agent is designed to test and support students’ abilities defined in Table 1.

Table 1. Three Levels of Assessment

Level 1. Capture the type of the data set and determine the kind of questions that should be asked about that data set.

Level 2. Formalize these questions in the form of hypothesis testing (inferential statistics) or in the form of statistical measures (descriptive statistics).

Level 3. Identify the techniques and tools necessary to successfully complete the analysis.

 

     The first set of questions shown in Figure 1 attempts to assess whether the student has captured the type of the data set and whether he/she can determine the kind of questions that should be asked about the Cigarette data. This is the first level of the assessment (see Table 1).

 



                                            

Figure 1. Analysis of Cigarette Data - Level 1 of Assessment

 


In this sample interaction students are asked to analyze a data set which contains measurements of weight, tar, nicotine, and carbon monoxide contents for 25 brands of domestic cigarettes. The students must click on the button next to the question to be linked to a new page containing the agent’s response. Each set of questions of this type may have one or more correct answers, and possibly no correct answer at all.

 

4    Defining Agent Requirements

 

     The methodology for building this and the other Disciple educational agents began with the definition of the agent's functional requirements and its related knowledge requirements. In this phase, the need for the statistical agent was investigated and its main goal identified. It was concluded that, in order to achieve the three levels of assessment described in Table 1, the agent was required to generate a sequence of three sets of questions, each targeting one of the three levels. The agent's assessment function for a data set was required to start with general questions about the kind of information the learner could extract from this data set. The agent then asks questions about the formalization of an appropriate question about the data set. Finally, the agent must seek to assess the learner's ability to identify the appropriate mathematical tool and technique to answer the question he/she asked.

 

5    Building the Initial Knowledge Base

 

     The starting point in building the agent's initial knowledge base was to define the knowledge elements that needed to be represented.  A top-level ontology [7] was defined and a semantic net was employed to represent the concepts and instances.  For example, two concepts are needed for a “qualitative” description of a data set: the type of the data and the variables contained in the data set. The type of the data is described by the concept TYPE-OF-DATA (see Figure 2). There are three types of data: CASE-DATA, CATEGORICAL-DATA, and TIME-DATA. These are concepts in the agent’s knowledge base, sub-concepts of the concept TYPE-OF-DATA.

 

     To begin the agent building process, the agent developer, in cooperation with the domain expert, customized the Disciple Learning Agent Shell.  This software toolkit contains components for basic knowledge acquisition and learning, problem solving, and knowledge base management. The knowledge base of the Statistical Agent comprises 335 concepts and instances and knowledge about 30 data sets.

 

Figure 2. Sample of the agent’s semantic net
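
The concept hierarchy described in this section can be illustrated with a minimal sketch of a semantic net. This is not the Disciple Learning Agent Shell's representation; the class SemanticNet, its method names, and the instance SURVEY-DATA are assumptions introduced only for illustration. The three sub-concepts of TYPE-OF-DATA and the variables of the Cigarette data are taken from the text.

```python
class SemanticNet:
    """Tiny semantic net: IS-A links between nodes plus named relations."""

    def __init__(self):
        self.parents = {}    # node -> set of direct IS-A parents
        self.relations = {}  # (node, relation) -> set of target nodes

    def add_isa(self, child, parent):
        self.parents.setdefault(child, set()).add(parent)

    def add_relation(self, source, relation, target):
        self.relations.setdefault((source, relation), set()).add(target)

    def isa(self, node, concept):
        """True if node is the concept or reaches it through IS-A links."""
        if node == concept:
            return True
        return any(self.isa(p, concept) for p in self.parents.get(node, ()))


net = SemanticNet()

# Sub-concepts of TYPE-OF-DATA, as named in the text.
for sub in ("CASE-DATA", "CATEGORICAL-DATA", "TIME-DATA"):
    net.add_isa(sub, "TYPE-OF-DATA")

# Hypothetical instance, used only to show IS-A inheritance.
net.add_isa("SURVEY-DATA", "CATEGORICAL-DATA")

# Variables of the Cigarette data set mentioned in Section 3.
for var in ("WEIGHT", "TAR", "NICOTINE", "CARBON-MONOXIDE"):
    net.add_relation("CIGARETTE-DATA", "CONTAINS-VARIABLE", var)

print(net.isa("SURVEY-DATA", "TYPE-OF-DATA"))
print(sorted(net.relations[("CIGARETTE-DATA", "CONTAINS-VARIABLE")]))
```

Storing IS-A links separately from other relations keeps the inheritance test simple, which is the operation that rule matching over such a net relies on most.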

 

6    Teaching the Agent to Generate Questions

 

     Following the general Disciple apprenticeship approach, the starting point for training the agent was an initial example of a correct problem-solving episode given to the agent by a domain expert.  For training the Statistical Agent, the expert was Dr. Philippe Loustaunau of the Mathematics Department at George Mason University, Fairfax, VA, USA. This domain expert employed the Disciple Learning Agent Shell to create initial examples, each consisting of a data set and a relevant question to ask about that data. Each example was used to generate an initial plausible version space rule with an upper bound and a lower bound. The upper bound corresponds to the highest-level concepts that can possibly fit the example given, and the lower bound is exactly the initial example.  The exact rule lies somewhere between these two bounds.
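
The role of the two bounds can be sketched as follows. This is a simplified illustration, not the Disciple implementation: the IS-A hierarchy in PARENT, the functions isa and classify, and the three outcome labels are assumptions made for this example. A candidate binding that fits the lower bound is certainly covered by the rule, one that fits only the upper bound is merely plausible and would need the expert's judgment, and anything else is outside the rule.

```python
# Hypothetical IS-A hierarchy (child -> parent); the names are illustrative.
PARENT = {
    "TAR": "MEASUREMENT-VARIABLE",
    "NICOTINE": "MEASUREMENT-VARIABLE",
    "BRAND": "CATEGORICAL-VARIABLE",
    "MEASUREMENT-VARIABLE": "VARIABLE",
    "CATEGORICAL-VARIABLE": "VARIABLE",
}


def isa(node, concept):
    """Walk up the hierarchy from node, looking for concept."""
    while node is not None:
        if node == concept:
            return True
        node = PARENT.get(node)
    return False


def classify(binding, lower, upper):
    """Compare a variable binding against the two bounds of a rule."""
    if all(isa(binding[v], c) for v, c in lower.items()):
        return "covered"
    if all(isa(binding[v], c) for v, c in upper.items()):
        return "plausible"
    return "rejected"


# Lower bound: exactly the initial example. Upper bound: most general concept.
lower = {"?V1": "TAR"}
upper = {"?V1": "VARIABLE"}

for value in ("TAR", "NICOTINE", "UNKNOWN-THING"):
    print(value, classify({"?V1": value}, lower, upper))
```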

 

     The Shell's interfaces facilitate the interaction between the expert and the agent as the agent seeks to refine the current rule. Like a human apprentice attempting to refine his/her initial knowledge, the teaching process continues as the agent generates examples in the form of new relevant questions similar to the one formulated initially by the expert. The expert rejects or accepts each example until a refined rule is created.
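
One way to picture this refinement is as two bounds moving toward each other: an accepted example generalizes the lower bound just enough to cover it, and a rejected example specializes the upper bound just enough to exclude it. The sketch below follows that idea for a single rule variable over a single-parent hierarchy. It is a simplification in the spirit of version-space learning rather than Disciple's actual procedure, and the hierarchy and function names are assumptions.

```python
# Hypothetical IS-A hierarchy (child -> parent); the names are illustrative.
PARENT = {
    "TAR": "MEASUREMENT-VARIABLE",
    "NICOTINE": "MEASUREMENT-VARIABLE",
    "BRAND": "CATEGORICAL-VARIABLE",
    "MEASUREMENT-VARIABLE": "VARIABLE",
    "CATEGORICAL-VARIABLE": "VARIABLE",
}


def ancestors(node):
    """Return node followed by its ancestors, most specific first."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT.get(node)
    return chain


def generalize(lower, positive):
    """Smallest concept at or above lower that also covers positive."""
    covering = set(ancestors(positive))
    for concept in ancestors(lower):
        if concept in covering:
            return concept
    raise ValueError("no concept covers both examples")


def specialize(upper, lower, negative):
    """Most general concept between lower and upper that excludes negative."""
    excluded = set(ancestors(negative))
    chain = ancestors(lower)
    between = chain[: chain.index(upper) + 1]  # lower ... upper
    for concept in reversed(between):          # try most general first
        if concept not in excluded:
            return concept
    raise ValueError("negative example cannot be excluded")


lower, upper = "TAR", "VARIABLE"
lower = generalize(lower, "NICOTINE")      # expert accepts a NICOTINE example
upper = specialize(upper, lower, "BRAND")  # expert rejects a BRAND example
print(lower, upper)
```

In this toy run both bounds end at the same concept, which corresponds to the point at which the rule is considered refined.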

 

     Figure 3 shows the final learned rule with a few of the natural language patterns (see Figure 1) that the agent has automatically associated with it during the learning process. The expert augments and corrects the English of these agent-generated patterns.

 

IF
                ?W1     IS      ANALYZE
                                THE-DATA ?S1

                ?S1     IS      CATEGORICAL-DATA
                                CONTAINS-VARIABLE ?V1,
                                CATEGORICAL-WITH-RESPECT-TO ?V2

                ?V1     IS      MEASUREMENT-VARIABLE
                                HAS-VALUES-FOR ?H1 ?H2

                ?V2     IS      CATEGORICAL-VARIABLE

                ?H1     IS      DOMAIN-INFO

                ?H2     IS      DOMAIN-INFO
THEN
                RELEVANT-QUESTION IS-THE-QUESTION ?Q1

                ?Q1     IS      IS-STATISTICAL-DIFFERENCE-1,
                                BETWEEN ?V1
                                FOR ?H1
                                AND ?H2

Task Description
                the task is to analyze the ?S1

Operator Description
                a relevant question to ask about this data is whether there is a
                statistical difference between ?V1 for ?H1 and ?H2

Explanation Pieces
                ?S1 contains the variable ?V1
                ?S1 is categorical with respect to ?V2
                ?V1 has values for ?H1
                ?V1 has values for ?H2

                   Figure 3. The learned rule
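
To make the rule in Figure 3 concrete, the sketch below matches its IF part against a small set of facts and fills the Operator Description pattern with the resulting bindings. The facts (SALARY-DATA, SALARY, GENDER, MEN, WOMEN) are hypothetical and are not taken from the agent's knowledge base, and the matching code is an assumption about how such a rule could be applied, not the Shell's own matcher. Treating ?H1 and ?H2 as an unordered pair of distinct values is also an assumption.

```python
from itertools import combinations

# Hypothetical knowledge: direct IS-A classification and relation triples.
ISA = {
    "SALARY-DATA": "CATEGORICAL-DATA",
    "SALARY": "MEASUREMENT-VARIABLE",
    "GENDER": "CATEGORICAL-VARIABLE",
    "MEN": "DOMAIN-INFO",
    "WOMEN": "DOMAIN-INFO",
}
FACTS = {
    ("SALARY-DATA", "CONTAINS-VARIABLE", "SALARY"),
    ("SALARY-DATA", "CATEGORICAL-WITH-RESPECT-TO", "GENDER"),
    ("SALARY", "HAS-VALUES-FOR", "MEN"),
    ("SALARY", "HAS-VALUES-FOR", "WOMEN"),
}

# Operator Description from Figure 3, with ?V1, ?H1, ?H2 as placeholders.
PATTERN = ("a relevant question to ask about this data is whether there is a "
           "statistical difference between {V1} for {H1} and {H2}")


def targets(source, relation, concept):
    """Sorted targets of relation from source that are instances of concept."""
    return sorted(t for (s, r, t) in FACTS
                  if s == source and r == relation and ISA.get(t) == concept)


def generate_questions(dataset):
    """Instantiate the Figure 3 rule for every matching binding."""
    if ISA.get(dataset) != "CATEGORICAL-DATA":                      # ?S1
        return []
    if not targets(dataset, "CATEGORICAL-WITH-RESPECT-TO",
                   "CATEGORICAL-VARIABLE"):                         # ?V2
        return []
    questions = []
    for v1 in targets(dataset, "CONTAINS-VARIABLE",
                      "MEASUREMENT-VARIABLE"):                      # ?V1
        groups = targets(v1, "HAS-VALUES-FOR", "DOMAIN-INFO")
        for h1, h2 in combinations(groups, 2):                      # ?H1 ?H2
            questions.append(PATTERN.format(V1=v1, H1=h1, H2=h2))
    return questions


for question in generate_questions("SALARY-DATA"):
    print(question)
```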

 

     This agent has 30 unique rules in its knowledge base.  These rules, together with the accompanying semantic net and the domain-specific problem solver, are capable of generating over 1,000,000 unique test questions.

 

     In a similar way, the agent was taught to generate irrelevant questions, i.e., questions that are not relevant to the data set, for the purpose of assessment. The generation of rules for irrelevant questions was a significant aspect of this approach, since these rules were generated using the same techniques as the rules for correct questions and are designed to “make sense”. These irrelevant questions are based on pedagogical experience, reflecting the mistakes students might make and/or the subtle points of a statistical technique.

 

7    Building the Agent's Problem-Solving Engine

 

     After the agent was trained, an assessment engine was designed and developed.  Its software modules use specialized methods that extend the basic operations (rule instantiation and rule matching) underlying the problem-solving elements of the Disciple Learning Agent Shell.  In general, the problem-solving engine of the Disciple Statistical Agent was built and organized to facilitate the agent's question generation tasks and to meet its other functional requirements, such as providing intelligent hints and explanations.

 

8    Verification and Validation of the Agent

 

     The evaluation of the Disciple Statistical Agent is currently in progress. The traditional software testing phases have been completed. Currently, as was done for its predecessor, the History Agent [9,13], the adequacy of this agent's knowledge base is being verified. These evaluation activities focus on measuring the completeness and correctness of the agent’s knowledge base. Several evaluation experiments are an integral part of this methodology. For example, two experiments measure the predictive accuracy of the agent’s knowledge base [10]: one with respect to the training expert and another with respect to an independent domain expert. The first study measures the ability of the Disciple approach to acquire the expertise of the training expert, while the other seeks to determine how well the agent has acquired general knowledge in its problem domain. In other experimental studies, both the domain expert and the agent’s potential users provide subjective survey-based evaluations. In these experiments, the History Agent scored a predictive accuracy rating of 96% and very positive subjective ratings. Early indications are that the evaluation of the Disciple Statistical Agent will produce similar results and that the agent will soon be ready for operation in a classroom environment.

 

9    Conclusions and Future Research Directions
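
As a rough illustration of the predictive accuracy measure, the sketch below assumes it is computed as the fraction of test cases on which the agent's judgment agrees with the expert's. That definition, the function name, and the toy labels are assumptions made for this example; the labels are not experimental data from this work.

```python
def predictive_accuracy(agent_labels, expert_labels):
    """Fraction of cases where the agent's judgment matches the expert's."""
    if len(agent_labels) != len(expert_labels):
        raise ValueError("label lists must have the same length")
    if not agent_labels:
        raise ValueError("at least one test case is required")
    agree = sum(a == e for a, e in zip(agent_labels, expert_labels))
    return agree / len(agent_labels)


# Toy labels for four generated questions (illustrative only).
agent = ["relevant", "irrelevant", "relevant", "relevant"]
expert = ["relevant", "irrelevant", "irrelevant", "relevant"]
print(predictive_accuracy(agent, expert))  # 3 of 4 agree -> 0.75
```

The same function would serve both experiments described above, with the expert labels supplied by the training expert in one case and by the independent expert in the other.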

 

     This research demonstrates solutions to problems involved in building intelligent educational software and prescribes a new approach that draws from the fields of artificial intelligence and educational research. All the agents developed in this research can be used as multipurpose assistants to both the teacher and the student. This agent provides the educator with a flexible tool that can lift the burden of generating tests for large classes, tests that do not repeat themselves and that can also tutor the learner.

 

     The Disciple Statistical Agent acts as a tutor, using the same process as the one by which it was taught by the expert. The examples and explanations given to the agent by the educational expert are similar to the hints and intelligent feedback provided by the agent to the learner through their interaction. Since the agent is taught by the educator through examples and explanations, and is then able to provide similar examples and explanations to the students (as part of the generated tests), it can be considered a preliminary example of a new type of educational agent that can be taught by an educator to teach the students [8,13]. The agent replicates some part of the usual teacher/student or mentor/student interactions. It can therefore be concluded that this agent acts as an indirect communication medium between the educator and the students. This illustrates a significant benefit to be derived from using the Disciple approach to building educational agents. This work also shows an automated computer-based approach to the assessment of higher-order thinking skills, as well as an assessment that involves multimedia documents. Both of these represent very important goals in current educational research.

 

     This assessment agent will be further evaluated and integrated into an actual classroom for user acceptance testing. It is envisioned that additional roles for educational agents built with this methodology will be explored in future research. The functionality of these agents will increase substantially as they are built to deal with other issues. For instance, a Disciple educational agent could be integrated into the HELP modules of end-user software to provide hints and instruction about the use of the software.  

 

 Acknowledgements

 

     This work was carried out at the Learning Agents Laboratory at George Mason University, USA.  This research was supported by DARPA contract N66001-95-D-8653, as part of the Computer-Aided Education and Training Initiative, and by NSF grant No. CDA-9616478, as part of the program Collaborative Research on Learning Technologies. Support was also received from AFOSR grant F49620-97-1-0188, as part of DARPA’s High Performance Knowledge Bases Program.

 

References

 

[1] Aïmeur, E. and Frasson, C., Eliciting the learning context in co-operative tutoring systems. IJCAI-95 Workshop on Modeling Context in Knowledge Representation and Reasoning, 1995.

[2] Beyer, B., Practical Strategies for the Teaching of Thinking (Boston, MA; Allyn and Bacon, Inc., 1987).

[3] Beyer, B., Developing a Thinking Skills Program (Boston, MA; Allyn and Bacon, Inc., 1988).

 

[4] Bradshaw, J. M. (editor), Software Agents (Menlo Park, CA; AAAI Press, 1997).

 

[5] Buchanan, B. G. and Wilkins, D. C. (editors), Readings in Knowledge Acquisition and Learning: Automating the Construction and Improvement of Expert Systems, (San Mateo, CA;  Morgan Kaufmann, 1995).

 

[6] Fontana, L., Debe, C., White, C. and Cates, W., Multimedia: Gateway to Higher-Order Thinking Skills, a Work in Progress, Proceedings of the National Convention of the Association for Educational Communications and Technology, 1993.

 

[7] Gruber, T. R., Toward Principles for the Design of Ontologies used for Knowledge Sharing, in Guarino, N. and Poli, R. (editors), Formal Ontology in Conceptual Analysis and Knowledge Representation, Kluwer Academic, 1993.

 

[8] Hamburger, H. and Tecuci, G., Toward a Unification of Human-Computer Learning and Tutoring, Proceedings of the 4th International Conference, ITS ’98, San Antonio, TX; Springer-Verlag, 1998.

 

[9] Keeling, H.,  A Methodology for Building Verified and Validated Intelligent Educational Agents, Ph.D. Thesis, Learning Agent Lab, Department of Computer Science, School of Information Technology and Engineering, George Mason University, 1998.

 

[10] Kibler, D. and Langley, P., Machine Learning as an Experimental Science, Readings in Machine Learning, San Mateo, CA; Morgan Kaufmann Publishers, Inc., 1990, 38-43.

 

[11] Michalski, R.S. and Tecuci, G. (Eds.), Machine Learning: A Multistrategy Approach Volume 4, San Mateo, CA; Morgan Kaufmann Publishers,  1994.

 

[12] Tecuci, G. (with contributions from Dybala, T., Hieb, M., Keeling, H., Wright, K., Loustaunau, P., Hille, D., Lee, S. W.), Building Intelligent Agents: An Apprenticeship Multistrategy Learning Theory, Methodology, Tool and Case Studies (Academic Press, 1998).

 

[13] Tecuci, G. and Keeling, H., Teaching an Agent to Teach Students, Proceedings of the 15th International Conference on Machine Learning, Madison, Wisconsin; Morgan Kaufmann, 1998.

 

[14] Tecuci, G. and Keeling, H., Developing Intelligent Educational Agents with the Disciple Learning Agents Shell, Proceedings of the 4th International Conference, ITS '98, San Antonio, TX; Springer-Verlag, 1998.