The goal of this project is to build Bayes net models of academic advising processes for the University of Kentucky and elsewhere, and to research, design, and implement algorithms for advising.

In the process of this work, we will develop various automated tools for building Bayes nets from data and from expert opinions; design a Bayes net to probabilistic database interface; contribute several benchmark examples to the Bayes net planning community, and possibly develop useful and useable tools to support academic advising.

Contents

Introduction
Mathematical Models
Why This Is Important
Research Projects

Introduction

Academic advising is both necessary (for the students) but often tedious (for the advisor). Although the ideal advisor remembers her advisees as individuals, even she is unlikely to know their progress without consulting their transcripts. From the transcript and the discussion and her knowledge of the academic system, she recommends a next set of courses for the student.

We intend to build mathematical models of this process. They can never replace the human interaction of a faculty advisor, but could supplement the advising process both by generating an initial set of advice and as a tool for comparing alternative plans.

Back to the top

The Underlying Mathematical Models

Markov decision processes model controllable stochastic processes: there is a set of states; a controller chooses among some number of actions, each of which has an associated probability matrix of state transitions; associated with each state and action pair is a reward. The basic goal, given such a model, is to find a strategy or policy for choosing actions that maximizes the total expected reward over some fixed (finite or infinite) time horizon.

Dynamic Bayes nets are a potentially compact representation of Markov decision processes. Each state of the system is described by a vector of values called fluents. (Note that if each of n fluents is two-valued, then the system has 2^n states.) Actions are described by the effect they have on each fluent by means of two data structures. They are a dependency graph and a set of functions encoded as conditional probability tables, decision trees, arithmetic decision diagrams, or in some other data structure.

The dependency graph is a directed acyclic graph with nodes {v_1,...,v_n} and {v_1',...,v_n'}. The first set of nodes represents the state at time t, the second at time t+1. The edges are from the first set of nodes to the second (asynchronous) or within the second set (synchronous). The value of the k^{th} fluent at time t_1 under action a depends probabilistically on the values of the predecessors of v_k' in this graph. The probabilities are spelled out, for each action, in the corresponding data structure for v_k' and a.

Back to the top

The state of the system will be an individual transcript, plus some additional summary information. An action will be the set of courses that the student is advised to take next. The associated costs will be -1 per semester, at least in the initial model, leading to an optimization of the time needed to graduate. The goal state set will encode the various graduation requirements.

In order to model a transcript, we will include fluents for each possible course, where the fluents can take values from the possible grade set, including Pass, Fail, Withdrawn, and NotTaken; values will also indicate how recently the course was taken. (This remains a finite set of possible values, as courses "expire" after a fixed time.) There will be additional fluents added as needed, such as AP exams, "mathematical maturity" and "writing ability," and GRE or ACT and TOEFL scores.

Note that the initial model will be for the Masters program for the Computer Science Department. This will be considerably easier to implement that a general undergraduate advisor, both because of the magnitudes of the respective course offerings and because it will be easier to gather the information needed to design the conditional probability structures.

Later implementations will take into account a variety of constraints, including timing constraints (taking into account the timetable for each semester) and student preferences, as well as secondary optimizations such as grade point averages.

Back to the top

Why This Work Is Important

In the course of building the advising models, we are designing knowledge elicitation and knowledge discovery tools that put us on the cutting edge of stochastic modeling, probabilistic databases, and other research areas. The processes of gathering expert opinions and data are important; the process of combining these inputs and producing a Bayes net is important; the Bayes nets we produce will be used to test a variety of planning algorithms, both by us and by other researchers elsewhere. There are very few benchmark examples of DBNs generally available. Those that are used in standard AI/planning papers are usually "toy" examples. Thus, coding up our advising processes provides a real service to the AI/planning/DBN community.

Because real advising takes into account a wide variety of constraints (on scheduling, topic and instructor preferences, etc.), this system can also be used to model constrained optimization problems and to test related algorithms.

There are also clear benefits to having a working system, both for testing advising and for testing human advisors' assumptions about course dependencies.

Back to the top

Actual Projects

• Research and choose a DBN construction kit
• Build web-based knowledge elicitation tools to elicit probabilities and stochastic dependency data from advisors In progress
• Design XML interfaces for integrating structure from elicited and discovered data In progress
• Design and implement a semi-structured probabilistic database In progress
• Build a Bayes net implementation that is compatible with the semi-structured probabilitic database AVAILABLE
• Design and implement knowledge-discovery algorithms for working with the anonymous transcript data, given the elicited BN structure AVAILABLE
• Investigate and choose conflict-resolution strategies for knowledge/data fusion (combining possibly conflicting expert opinions and discovered knowledge) AVAILABLE
• Domain-specific knowledge acquisition:
• Design the initial domain (CS MS program) In progress
• Elicit expert opinions In progress
• Apply knowledge and data fusion to build model
• Research, design, and implement a policy evaluator for very large DBNs AVAILABLE
• Research, develop, implement, and test algorithms and heuristics for policy finding with very large DBNs AVAILABLE
• Research, develop, implement, and test heuristics for approximate state aggregation in BNs In progress

Back to the top

Alex Dekhtyar,
Devan Desai,
Judy Goldsmith,
Meesoon Han,
Carol Hannahs,
Sean Hawkes,
Ryan Hunt,
Jan Pearce
John Pickens,
Jaleh Rezaie,
Brett Young

Back to the top