1. Executive Summary
2. Summary of Research
2.1 MIT Roles DB
2.2 Stanford Space Dog Project
2.3 XML Forum for Education
2.4 Role Based Access Control
2.5 The eXtensible Access Control Markup Language
2.6 The Shibboleth Project
3. Role Architecture
3.1 Genesis of Role Architecture
3.2 An Example of a Person with Multiple Roles
4. Modeling Methodology
4.1 Analysis of Roles in Use of Applications
4.2 Use Cases for Roles and Applications
5. Namespace Architecture and Filesystem Organization
5.1 Namespace Architecture
5.2 Organization of Files in Filesystem
6. Models
6.1 Person
6.2 Student
7. Implementation in W3C XML Schema
7.1 Choice of Schema Language
7.2 Implementation
7.2.1 Person XSD
7.2.2 Student XSD
7.2.3 Course in Academic Record Context
7.3 Use of Schemas
8. Conclusion
Acknowledgements
Bibliography
The School of Information Management & Systems is collaborating with the e-Berkeley Program Office, IS&T, BAS, and other campus units to apply XML, enterprise data architecture, web services, and other "Document Engineering" concepts to the e-Berkeley initiative. This effort began informally a year ago as class assignments in "Document Engineering for e-Business" (IS 290-4), a graduate course taught by SIMS Adjunct Professor Robert Glushko. In the Fall semester, 2002, with the help of some funding provided by Jon Conhaim, e-Berkeley's Program Director, the SIMS activity expanded into two semester-long enterprise data modeling efforts to provide a stronger architectural foundation to e-Berkeley. One of these efforts was the Roles Project.
The motivation for the Roles Project was the recognition that without a common model for roles, such as student or instructor, developers of each new application end up creating a new model from scratch. The result is wasteful duplication of effort where existing models could be reused, and poor interoperability between applications, since one application's student might be incompatible with another's. In order to promote reuse of components, a common XML architecture, and common models for the use of roles in web services applications, and to apply the "Document Engineering" methodology to the e-Berkeley domain, the Roles Project has worked on creating a model for roles that builds on and is compatible with the emerging e-architecture of the university, and on illustrating how this model might be refined and applied to other domains.
Work on the Roles Project proceeded in several phases. First, in order to familiarize ourselves with industry best practices and ensure a sound foundation for our work, we looked at similar initiatives at other universities, and at various subjects of relevance to roles and access control. We looked at such topics as Role Based Access Control (RBAC), XML vocabularies for access control, and vendor solutions to roles management, and we summarized this research. Second, we looked at some existing applications to understand how roles were or were not currently implemented, and to develop requirements that our roles work should satisfy. Third, we proceeded to the modeling phase of the project, in which we designed a logical model for representing a person with one or more roles and provided a complete model for the student role in particular. These logical models were then translated into a schema language, which defines for an application the exact structure that a person or a student must have. Finally, we created sample documents that illustrate how the models could be implemented in an actual system.
Much work remains to be done on modeling the multitude of roles in the university context, and our work is preliminary to a more in-depth analysis of the roles problem at UC Berkeley, but we hope that our work has provided an architecture for representing a person with one or more roles, has illustrated how to proceed with other roles by providing a complete model for the student role, and has served as an example of a successful application of the "Document Engineering" methodology to the e-Berkeley domain.
This section describes the roles-related projects and research topics that informed our efforts.
The MIT Roles DB Project at the Massachusetts Institute of Technology [1] is a project to centrally maintain roles information for IT systems enterprise wide but allow for local management and updating of authorization information. The main principles of the Roles DB system are as follows:
While there was value in researching the Roles DB effort, and other roles-related work will certainly want to look more carefully at the Roles DB Project, we found that the Roles DB was concerned with much lower-level details than we were. The Stanford Space Dog project was more immediately useful to us.
The Space Dog Project at Stanford University [3] is an umbrella project to coordinate the planning, development and delivery of services concerning Sponsorship, Authority, Charging, Etc. for Departments, Organizations and Groups (SPACEDOG). The project is intended to guide efforts on a set of registry projects, namely a Person Registry, an Organization Registry, a Workgroup Registry, an Application Registry, an Account Registry, and a Course Registry [4] .
The most important registry for our purposes was the Person Registry, especially as we focused more and more on modeling a core person with roles, and the student role in particular. The Person Registry "brings together information from the University's source systems that deal with people," [5] including separate systems for student, faculty, and staff information. Some specific services that the Person Registry was intended to provide were basic demographic information, support for personal preferences, support for the application of business rules to Person information, and an online directory of persons. Of special relevance to our modeling efforts were the data models for Person and the DTD and XML samples. We used these sources, along with further models and schemas, as starting points for our own modeling efforts.
The XML Forum for Education is an industry group focused on XML standards in the higher education space [6] . Contributors to the effort are all members of the Postsecondary Electronic Standards Council, a consortium of universities, non-profit institutions, and corporate parties in the education space. Among their work products to date are three XML Schema files for a base data dictionary — containing items such as Student and AcademicSession — as well as a library for admissions and registrar components, and a schema for college transcript documents.
The logical models implicit in the XML Schema files were useful in our efforts, but due to style differences between the XML Forum schemas and the developing BABL library of schemas, we chose not to use any of the XML Forum components directly. Instead, we used their models as a starting point in developing our own models, which we then implemented in W3C XML Schema in accordance with our style guidelines. Leveraging the excellent analysis that the XML Forum had done on their core elements and high-level elements such as Student or Person saved us a great deal of time, and since the models were already the result of requirements analysis at multiple universities, they fit the UC Berkeley context quite well.
The roots of Role Based Access Control (RBAC) include the use of groups in UNIX and other operating systems, privilege groupings in database management systems, and separation of duty concepts. The modern concept of RBAC embodies all these notions in a single access control model in terms of roles and role hierarchies, role activation, and constraints on user/role membership and role set activation. Some of the benefits of RBAC implementations are described in the next paragraph.
RBAC models have been shown to be "policy-neutral" in the sense that by using role hierarchies and constraints, a wide range of security policies can be expressed. Security administration is also greatly simplified by the use of roles to organize access privileges. For example, if a user moves to a new function within the organization, the user can simply be assigned to the new role and removed from the old one, whereas in the absence of an RBAC model, the user's old permissions would have to be individually revoked, and new permissions would have to be granted.
Because RBAC is a relatively new technology and because products and models come from different commercial and academic backgrounds, little consensus exists on what to call the different parts. RBAC is also a rich and open-ended technology, which ranges from very simple at one extreme to fairly complex and sophisticated at the other. Treating RBAC as a single model is therefore unrealistic. A single model would either include or exclude too much, and would only represent one point along a spectrum of technologies and choices. This gave rise to a proposed NIST RBAC standard [7] .
This RBAC standard is organized into two main parts: the RBAC Reference Model and the RBAC Functional Specification. The RBAC Reference Model provides a rigorous definition of RBAC sets and relations. The reference model has two primary objectives: to define a common vocabulary of terms for use in consistently specifying requirements and to set the scope of the RBAC features included in the standard. The RBAC Functional Specification defines requirements over administrative operations for the creation and maintenance of RBAC element sets and relations; administrative review functions for performing administrative queries; and system functions for creating and managing RBAC attributes on user sessions and making access control decisions.
What RBAC has provided us is a set of concepts for implementing a robust, extensible framework on which to eventually build a complete set of e-Berkeley roles. Following this standardization effort allows us to leverage the experience of vendors and researchers on RBAC technologies.
When initially focusing on access control mechanisms and the architectural implications of various kinds of access control mechanisms, we considered how to actually represent access control information in XML. The eXtensible Access Control Markup Language (XACML) [8] was the focus of our efforts. XACML is an effort to define an XML Schema for representing authorization and entitlement policies in XML.
The motivation for XACML was to standardize a specification of how authorizations should be designed based on rules, policies and policy sets, and to provide a common language for expressing security policy. XACML supports multiple paradigms of authorization, specifically authorizations based upon access requestor characteristics, the protocols over which requests take place, classes of activities which the requestor is performing, and content introspection, which involves attribute values that are not known at the time of schema specification. One of the chief requirements that XACML was intended to satisfy was to support authorization decisions based on attributes of the subject and resource [9]. This is compatible with the role based access control (RBAC) paradigm. After conducting the preliminary research on these topics, our efforts increasingly focused on the large-scale architecture of roles, and the student role in particular, but we believe that XACML is a fruitful area for further research.
The Shibboleth Project is a joint project of Internet2/MACE (Middleware Architecture Committee for Education) and IBM [10] . It is charged with "investigating architectures, frameworks, and practical technologies to support inter-institutional sharing and controlled access to web available services. The project will produce an analysis of the architectural issues involved in providing such inter-institutional services, given current campus realities and the current state of relevant standards. It will also produce a pilot implementation to demonstrate the concepts" [11] . The motivation for the Shibboleth project is a recognition that within academia and business there is growing interest in resource sharing between institutions, and that existing solutions to resource sharing are unsatisfactory.
Shibboleth is a solution to the problem of user management, such that a user at one campus can access all and only those resources at another campus that (s)he should be able to access. This is achieved through the concept of federated administration, which means that administration of user identities and attributes is provided by the user's origin site, and not the resource provider [12] . Shibboleth, then, is "a system for securely transferring attributes about a user from the user's origin site to a resource provider site" [12] . The details of Shibboleth are the details of how this process occurs in a secure manner, with safeguards for user privacy.
While Shibboleth has no direct implications for our work on roles, access control based on attributes, which is how we envision our model being used, is one of the stated aims of the Shibboleth project, so our work is compatible with what we know of Shibboleth so far.
This section describes the large-scale architecture of a person with one or more roles.
After we decided in collaboration with Jon Conhaim to focus on an architecture for a person with one or more roles and at least the student role in particular, we turned our efforts to the large-scale architecture for a person — that is, how to model a person who can have one or more roles, and how to model the actual roles, such as student or instructor.
Our initial design for roles was that one or more roles would be attributes of a person. We considered the variety of roles that we had come across in our analysis of various systems and considered ways of classifying the various types of roles, and a classification scheme adopted by Queen's University [13] fit the Berkeley context well too.
This classification scheme offers a flexible and powerful five-fold classification of roles. It was adapted to the university context from another paper that analyzed the roles present in a large organization [14]. Briefly, the classification scheme divides roles into five kinds. Institutional roles consist of 1) an Organizational Role, such as the department that a staff person works for (e.g., SIS or SIMS); 2) an Affiliation Role, which specifies the type of relation that a person has with the university (e.g., student, staff, faculty); and 3) an Authority Role, which specifies a role that is special in some way in terms of authorization (e.g., Dean or manager). The business-unit or application-specific roles consist of 4) a Functional Role, which specifies a position such as instructor or administrative assistant; and 5) a Special Role, which could be a temporary role, such as project manager, or a role that doesn't fit anywhere else, such as an application role.
We checked this classification against the variety of roles that we had encountered, and were able to map the roles easily into one of the five categories. With this basic classification of the broad categories of roles, we proceeded to model a core person entity that could have one or more sets of roles. Our first model grouped the roles according to type, so that, for example, all affiliation roles were grouped together, but we later decided that this grouping was unimportant, and we modeled a person as having one or more roles, with all roles being grouped together, and an attribute for the role type. This roles part of the model for person is illustrated in the following diagram:

The multiple roles that a person may have are grouped into a Roles container, and one or more sets of Roles are grouped into a ListOfRoles container.
As an illustration of the flexibility of this system, consider the following person: Joe is a student in SIMS who is also an employee (an information architect with manager authorization privileges) for SIS. Inside the ListOfRoles, there would be two Roles containers. Inside the first Roles container, there would be an affiliation role of student and an organizational role of SIMS, and in the second Roles container, there would be an information architect functional role with an organizational role of SIS and an authority role of manager.
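The Joe example can be sketched as an XML instance. This is only an illustration under stated assumptions: `ListOfRoles` and `Roles` are the container names used in the text, while the `Identifier` and `Role` element names, the `roleType` attribute name, the identifier value, and the absence of namespaces are all hypothetical simplifications rather than the project's actual vocabulary.

```xml
<!-- Illustrative instance only. Names other than ListOfRoles and Roles
     are assumptions made for this sketch. -->
<Person>
  <Identifier>example-id-0001</Identifier>
  <ListOfRoles>
    <!-- First set of roles: Joe as a student in SIMS -->
    <Roles>
      <Role roleType="affiliation">student</Role>
      <Role roleType="organizational">SIMS</Role>
    </Roles>
    <!-- Second set of roles: Joe as an SIS employee -->
    <Roles>
      <Role roleType="functional">information architect</Role>
      <Role roleType="organizational">SIS</Role>
      <Role roleType="authority">manager</Role>
    </Roles>
  </ListOfRoles>
</Person>
```

Keeping each set of roles in its own container makes it unambiguous that the manager authority belongs with the SIS position and not with the student affiliation.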
After determining this basic structure for how to represent a role, the next step was to actually model the details of what a person is, and what the student role in particular is.
This section describes the modeling methodology that informed our efforts.
The first step in our modeling process was to look at a representative subset of campus applications, and come up with a matrix of which of our core roles were relevant to each application, and the type of use that each role had with each application. Tables 1 and 2 illustrate these uses:
| Role | BearFacts/eGrades: viewing | BearFacts/eGrades: administering | SOC: viewing | SOC: administering |
|---|---|---|---|---|
| admitted student | financial aid, personal info | address | search for classes | NA |
| current student | fin. aid, grades, current classes, payments | address, DARS | search for classes | NA |
| Extension student | address | | search for classes | NA |
| GSI | view students if delegated to | give grades if delegated to | search for classes | NA |
| Teaching faculty | class lists, wait lists, grades | give grades | search for classes | NA |
| non-teaching faculty | addresses? | NA | search for classes | NA |
| department staff | students' registration, grades, class schedules, fin. aid for students in your department | NA | search for classes | scheduler for own department's classes |
| registrar staff | students' registration, grades, class schedules, fin. aid for all students | maybe input grades if professors send in paper forms? | search for classes | master scheduler, all courses |
| application | NA | BearFacts uses info from eGrades | NA | schedule algorithm |
Table 1A
| Role | CourseWeb: viewing | CourseWeb: administering | LDAP: viewing | LDAP: administering |
|---|---|---|---|---|
| admitted student | yes | no | indirect | change address/email |
| current student | yes | no | indirect | change address/email |
| Extension student | yes | no | indirect | change address/email |
| GSI | yes | | indirect | change address/email |
| Teaching faculty | yes | yes | | |
| non-teaching faculty | yes | no | | |
| department staff | yes | | | |
| registrar staff | yes | | | |
| application | NA | NA | 35-40 different apps | update info from 4 main apps |
Table 1B
Once we had some idea of the type of use that the core roles had with each application, we focused on use cases for these systems. Our intention was that, from the use cases and the types of use that each role had with each application, we would be able to determine what information was important for each role relative to each application, and could generalize from this information to model the role. Below are some of the use cases that we determined for BearFacts:
| Usage Case | 1.1.1 Student Logs on to BearFacts |
|---|---|
| User Role | Admitted student, extension student, registered student |
| Pre-Conditions | Person attempting login is a registered student at Berkeley; student has navigated to login page |
| Post-Conditions | Student is at personal page in BearFacts |
| Purpose | To get access to personal information on BearFacts |
| Trigger | Student clicks "Login" button |
| Description | Student fills in CalNet ID and password and clicks the login button. Student is taken to their personal BearFacts page. |
| Frequency | Every time system is used by a student; often |
Table 2A
| Usage Case | 2.1 Student browses for information |
|---|---|
| User Role | Any kind of student |
| Pre-Conditions | Usage Case #1.1.1 was successful |
| Post-Conditions | Student has found information they are interested in |
| Purpose | For students to find personal information about themselves, including financial aid, grades, payments made, classes enrolled in, registration status, addresses and contact info, DARS, CARS. |
| Trigger | Student clicks one of the menu buttons |
| Description | Student decides what type of information they wish to see and clicks on that button in the menu. They are then shown a screen with the requested information. |
| Frequency | Often |
Table 2B
| Usage Case | 2.1.1 Student browses for financial aid information |
|---|---|
| User Role | Any kind of student |
| Pre-Conditions | Usage Case #1.1.1 was successful, student has applied for financial aid and the system has financial aid information for the student |
| Post-Conditions | Financial aid information successfully found |
| Purpose | For student to browse their current financial aid status and information |
| Trigger | Student clicks one of the menu buttons |
| Description | Student clicks on a financial aid menu button. S(he) is then shown a screen with the requested information. |
| Frequency | Often |
Table 2C
Tables 2A, 2B, and 2C illustrate just a few of the use cases for students using BearFacts. Usage case 1.1.1 is the student logging on to BearFacts using his/her CalNet ID. Many use cases in other systems also begin with authentication through CalNet using the Student ID. A generic identifier was therefore incorporated into the model for the base person, since identifiers are not confined to any one role. In this way, by observing the sorts of information that were required in different contexts, with different applications, we were able to determine what should be part of the model for, first, a person, which can have one or more roles, and later for the student role.
Because there was so much variability in what a person or a student was in different contexts and to different applications, we decided that the required core of each model should be minimal, the required fields being only what is required in all contexts. For example, the person model requires only an identifier, and other information is optional. This does not mean though that strict validation cannot occur, as we anticipate that stricter validation according to the business requirements of a particular context or application would be carried out using another validation method, such as Schematron [15] validation against a library of application-specific business rules.
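A layered rule of this kind might look like the following Schematron sketch. It is not taken from the project: the rule content is invented for illustration, the `Identifier` element name is an assumption, the elements are shown without namespaces for brevity, and the ISO Schematron namespace is used here even though the project may have worked with an earlier Schematron release.

```xml
<!-- Hypothetical application-specific business rules for Person.
     Element names and the rules themselves are illustrative assumptions. -->
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <title>Example rules for an application that needs a named Person</title>
  <pattern id="person-core">
    <rule context="Person">
      <!-- Mirrors the minimal core: at least one identifier -->
      <assert test="Identifier">A Person must carry at least one identifier.</assert>
      <!-- Stricter, application-specific requirement layered on top -->
      <assert test="PersonalName">This application requires a name for every Person.</assert>
    </rule>
  </pattern>
</schema>
```

The structural schema stays permissive, and each application adds only the assertions its own business rules call for.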
Now we turn to implementation issues, beginning with the architecture of the multiple roles-related namespaces and the filesystem organization.
This section outlines how we divided the Roles matter into multiple namespaces and mapped the namespaces to files on a filesystem.
The Roles XML components that we have developed are part of a larger document-engineering effort to design an XML vocabulary called the Berkeley Academic Business Language, BABL. Because BABL will eventually encompass many more projects than Course [18] and Roles, and in order to promote a modular architecture that provides for component reuse and ease of maintainability, we chose to divide BABL into multiple namespaces. A namespace is a way of partitioning what would be one vocabulary into multiple vocabularies.
We organized the namespaces hierarchically, putting components that are frequently reused in multiple contexts at the bottom of the hierarchy, and layering additional components and documents at ever-higher levels. This namespace structure is illustrated with the following diagram:

This diagram illustrates the various Roles namespaces, and how they relate to the Course namespaces and lower level namespaces. At the base of the hierarchy are components of the Universal Business Language, UBL. We chose to build BABL on top of UBL in order to have a sounder semantic foundation and for compatibility with other XML vocabularies, which we believe will increasingly use UBL as a foundation.
On top of the UBL types, there are two core namespaces, BABL Codes and the BABL Library. The Codes namespace is a repository for BABL codelists. It contains codelists such as the "Berkeley Degree Codelist," which lists all degrees available at Cal. In our design of codelists, we followed the same UBL-based approach as did the Course Project [19].
The BABL Library namespace is a repository for components that are used in more than one higher-level namespace. It houses components such as Addresses or PersonalName, which will be of use in many namespaces. Immediately above the BABL Codes and BABL Library namespaces are the namespaces corresponding to the work of the Course Project team [18] . These namespaces contain many useful course-related components, some of which we used in our Roles work. For example, we extended the core Course component as part of the academic record of a student.
The entire box at the top of the above diagram represents the Roles space. At the bottom of the Roles space is the core Roles Library namespace. The Roles Library namespace is a library of reusable roles-related components. It contains, for example, the definitions of Person and the abstract type Role from which all future roles will be derived. The dark arrow in the diagram indicates that Role is defined within the Roles Library namespace.
On top of this core Roles namespace will be layered individual namespaces for other roles. The only higher-level namespace presently defined is the Student namespace, but if there is eventually a hierarchy, or taxonomy, of roles, they will follow the same namespace architecture, leveraging components from the core Roles Library and building higher-level roles on top of this foundation.
Not illustrated in the diagram is the AcademicRecord namespace, which we left out for clarity. This is a namespace for an AcademicRecord and related components. At present, the AcademicRecord is fairly simple, and the namespace fairly small, but we chose to make it a separate namespace to allow for future development of more related components and more extensive models for a student's Academic Record.
After deciding on the namespace architecture of BABL, the question arises of how to map namespaces to files in a filesystem. Taking advantage of the hierarchical organization of the namespaces, we decided to store each namespace in its own file, and not to split one namespace up into multiple files. Apart from the obvious heuristic value of having a simple one-to-one mapping of namespaces to files, it also makes sense to group the relatively heterogeneous components within one namespace in one file. One reason to divide a namespace into multiple files is that the namespace is very large, but we chose our modular namespace structure so that individual namespaces would be relatively small and manageable.
The filesystem organization of the files corresponding to the Roles space is as follows, from the BABL root context:
| File | Contents |
|---|---|
| babl_0p01/CoreComponents/CommonComponentTypes.xsd | Definitions for the BABL Common Types |
| babl_0p01/Codes/Codelists.xsd | Enumerated codelist types |
| babl_0p01/AggregateComponents/Roles/Roles.xsd | The library of Roles components |
| babl_0p01/AggregateComponents/Roles/Student/Student.xsd | The Student role component |
| babl_0p01/AggregateComponents/AcademicRecord/AcademicRecord.xsd | Academic Record components |
Table 3
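With one namespace per file, a higher-level schema pulls in the lower-level namespaces it builds on through `xsd:import`. The sketch below shows how a file such as Student.xsd might begin. The relative paths follow from the layout in Table 3, but the namespace URIs (`urn:example:...`) and the `roles` prefix are placeholders invented for this example, not the project's real namespace names.

```xml
<!-- Sketch of the head of Student.xsd. All namespace URIs are hypothetical. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:roles="urn:example:babl:roles"
            xmlns:bablc="urn:example:babl:codes"
            targetNamespace="urn:example:babl:roles:student"
            elementFormDefault="qualified">

  <!-- Core Roles Library, one directory up -->
  <xsd:import namespace="urn:example:babl:roles"
              schemaLocation="../Roles.xsd"/>

  <!-- BABL Codes, three directories up from the Student directory -->
  <xsd:import namespace="urn:example:babl:codes"
              schemaLocation="../../../Codes/Codelists.xsd"/>

  <!-- Student components would be declared here -->
</xsd:schema>
```

Because imports only ever point down the hierarchy, the lower-level files can be reused by other projects without dragging in role-specific definitions.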
Now we turn to the actual models for person and student.
This section presents the models developed for the person entity and the student role.
The intention for the Person entity is that it serves as the base upon which roles can be layered. Person models the typical kinds of information that are relevant to any person, regardless of any role that they may have. Obviously, things like name and address are relevant to any person, but in the university context, there are other not so obvious attributes of a person that apply to more than one role, such as citizenship status. The UML model for Person is as follows:

As can be seen in the model, the only thing that is required is one or more identifiers. There are many scenarios in which just an identifier is sufficient, such as when accessing restricted resources through the Library Proxy. In contexts where business rules dictate that more information should be required, these application-specific business rules would be codified in a rule-based schema language, such as Schematron [15] , and the more restrictive validation would be performed.
Another advantage of the open cardinality constraints of Person is the flexibility of the model. The model supports role-based access control on the basis of role alone; for example, a Person with a Student role of type 'affiliation.' In cases where more information is required, and the mere role of Student is not sufficient, other parts of the Student role can be used. For example, a certain resource might be available only to a student in a particular program.
The only role that we have modeled so far is Student. As stated above, in the context of role-based access control, an access control decision might be made on the sole basis of the role of a person. However, there are many cases where much more information is required, and a fuller model of what a student is, beyond a place in a role hierarchy, is desirable. For these situations, the fuller optional model for the student role is important. The model for Student is as follows:

The UML represents that a Student is a type of Role, and that Student consists of optional information for the student's registration status (e.g., "Continuing"), the degree the student is studying for, her class standing (e.g., Freshman), the programs in which she is a student, an academic record, and an expected graduation date. The complexity of some of the components, such as AcademicRecord, obscures the quite simple and minimal Student role. We created Student with the expectation that the role might need to be extended in the future, or for contexts that we didn't take into account. By making the cardinality constraints minimal, we anticipate that others could easily derive from our Student type and create, for example, a GraduateStudent that was more restrictively defined.
This section outlines our experiences implementing the logical models for Person and Student in a schema language.
Choosing the schema language in which to implement our models was easy. The Universal Business Language is implemented only in W3C XML Schema (WXS) at present, so in order to leverage UBL, we also chose WXS. W3C XML Schema is a powerful object-oriented schema language, with which one can express complex type relationships and hierarchies, so it was a very good match for the roles work, which as it develops will include an ever larger type or role hierarchy.
As we moved from an implementation-neutral logical model to an implementation-specific representation in W3C XML Schema, there were several decisions that needed to be made. XSD is a very powerful schema language, and there are frequently multiple ways of doing the same thing. Some of the issues that we dealt with were whether to use global types and/or global elements, whether to use type derivation or substitution groups for variable content, and how to handle codelists.
When deciding how to organize the schemas in terms of types and/or elements as global components, we followed the recommendation of Eve Maler and the current best practice of UBL [16]. This approach is called the "Garden of Eden" approach, and consists of both global elements and global types for almost all components. While this approach is less efficient in terms of the number of bytes required to represent a given schema, it maximizes the possibilities of reuse, since all types and all elements are global and thus can be reused, and it makes type derivation easier in many cases.
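As a small illustration of the "Garden of Eden" style, the sketch below declares every element globally and gives every complex type a name, so that content models are built from `ref` references and no component is anonymous or local. The namespace URI and the `GivenName` and `FamilyName` components are assumptions made for this example; only `PersonalName` is named in this report.

```xml
<!-- "Garden of Eden" sketch: all elements global, all complex types named.
     The namespace URI and the child element names are hypothetical. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ex="urn:example:garden-of-eden"
            targetNamespace="urn:example:garden-of-eden"
            elementFormDefault="qualified">

  <!-- Named (global) type, available for reuse and derivation -->
  <xsd:complexType name="PersonalNameType">
    <xsd:sequence>
      <!-- Content model refers to global elements instead of declaring local ones -->
      <xsd:element ref="ex:GivenName" minOccurs="0"/>
      <xsd:element ref="ex:FamilyName" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Global elements, each usable from any other schema that imports this one -->
  <xsd:element name="PersonalName" type="ex:PersonalNameType"/>
  <xsd:element name="GivenName" type="xsd:string"/>
  <xsd:element name="FamilyName" type="xsd:string"/>
</xsd:schema>
```

The extra declarations are the byte cost mentioned above; the payoff is that any element or type can be referenced or extended from another namespace.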
When considering which WXS mechanism to use for variable content, we again followed the example of UBL and relied on type derivation as opposed to substitution groups. The nature of the roles problem (that is, the taxonomy of roles that will be developed) lends itself well to a hierarchy of types, and type derivation seemed like the natural solution. As an example of where we use type derivation, consider the Role element. This is a base element in Person that is declared abstract, meaning that it cannot itself appear in a Person. Instead, any XML document instance that contains a Person element must contain in the place of Role some kind of role that is derived from Role, such as Student. The base model for a Person just specifies that a Person has one or more roles, but since the Role element is declared abstract, any document instance that has a role is assured to have a derived role, such as Student or Faculty. For now, Student is the only role developed, but we anticipate that as Roles work develops, there will be a more complete taxonomy of all Cal Roles.
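One common way to express this pattern in W3C XML Schema is an abstract base type with concrete types derived by extension. The sketch below is an assumption-laden illustration, not the project's schema: the namespace URI, the type names, the `roleType` attribute, and the three optional child elements (loosely based on the Student model described earlier) are all hypothetical, and the real BABL schemas may declare the abstraction differently.

```xml
<!-- Sketch of an abstract role type with one derived role. All names are hypothetical. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ex="urn:example:roles-sketch"
            targetNamespace="urn:example:roles-sketch"
            elementFormDefault="qualified">

  <!-- Abstract base type: an instance may not use it directly -->
  <xsd:complexType name="RoleType" abstract="true">
    <xsd:attribute name="roleType" type="xsd:string"/>
  </xsd:complexType>

  <!-- Concrete role derived by extension; every child is optional -->
  <xsd:complexType name="StudentType">
    <xsd:complexContent>
      <xsd:extension base="ex:RoleType">
        <xsd:sequence>
          <xsd:element name="RegistrationStatus" type="xsd:string" minOccurs="0"/>
          <xsd:element name="ClassStanding" type="xsd:string" minOccurs="0"/>
          <xsd:element name="ExpectedGraduationDate" type="xsd:date" minOccurs="0"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Element typed by the abstract base: instances must name a derived type -->
  <xsd:element name="Role" type="ex:RoleType"/>
</xsd:schema>
```

In an instance, the derived type is then selected with `xsi:type`:

```xml
<ex:Role xmlns:ex="urn:example:roles-sketch"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:type="ex:StudentType"
         roleType="affiliation">
  <ex:ClassStanding>Freshman</ex:ClassStanding>
</ex:Role>
```

A validator rejects a `Role` that omits `xsi:type`, because the declared type is abstract, which gives the guarantee described above that every role in a document is a derived role.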
Our approach to codelists followed the recommendation of UBL's Eve Maler [17] and is described more fully in the Course Project's Final Report [19]. The following is an example of an externally maintained ISO Country Code codelist:
<xsd:complexType name="CountryCodeContentType">
  <xsd:sequence>
    <xsd:element name="iso3166Code" type="iso3166:CountryCodeType"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CountryCodeType">
  <xsd:complexContent>
    <xsd:extension base="bablc:CountryCodeContentType">
      <xsd:attribute name="codeListIdentifier"
          type="bablc:codeListIdentifierType" fixed="ISO3166-1"/>
      <xsd:attribute name="codeListAgencyIdentifier"
          type="bablc:codeListAgencyIdentifierType" fixed="ISO"/>
      <xsd:attribute name="codeListVersionIdentifier"
          type="bablc:codeListVersionIdentifier" default="0.3"/>
      <xsd:attribute name="languageCode" type="bablc:languageCodeType"/>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
When no standard codelist existed for a given purpose, we had to create one from scratch. The following is an example of a codelist that we defined ourselves because no externally maintained codelist was available:
<xsd:simpleType name="BerkeleyDegreeContentType">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="Bachelor of Science"/>
    <xsd:enumeration value="Bachelor of Arts"/>
    <xsd:enumeration value="Master of Science"/>
    <xsd:enumeration value="Master of Arts"/>
    <xsd:enumeration value="Master of Architecture"/>
    <xsd:enumeration value="Master of Fine Arts"/>
    <xsd:enumeration value="Master of Business Administration"/>
    <xsd:enumeration value="Master of Financial Engineering"/>
    <xsd:enumeration value="Master of City Planning"/>
    <xsd:enumeration value="Master of Engineering"/>
    <xsd:enumeration value="Master of Forestry"/>
    <xsd:enumeration value="Master of Information Management and Systems"/>
    <xsd:enumeration value="Master of Journalism"/>
    <xsd:enumeration value="Master of Landscape Architecture"/>
    <xsd:enumeration value="Master of Urban Design"/>
    <xsd:enumeration value="Master of Laws"/>
    <xsd:enumeration value="Master of Public Health"/>
    <xsd:enumeration value="Master of Public Policy"/>
    <xsd:enumeration value="Master of Social Welfare"/>
    <xsd:enumeration value="Juris Doctor"/>
    <xsd:enumeration value="Doctor of Medicine"/>
    <xsd:enumeration value="Doctor of Public Health"/>
    <xsd:enumeration value="Doctor of the Science of Law"/>
    <xsd:enumeration value="Doctor of Education"/>
    <xsd:enumeration value="Doctor of Engineering"/>
    <xsd:enumeration value="Doctor of Philosophy"/>
  </xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="BerkeleyDegreeType">
  <xsd:simpleContent>
    <xsd:extension base="bablc:BerkeleyDegreeContentType">
      <xsd:attribute name="codeListIdentifier"
          type="bablc:codeListIdentifierType" fixed="BerkeleyDegreeTypes"/>
      <xsd:attribute name="codeListAgencyIdentifier"
          type="bablc:codeListAgencyIdentifierType" fixed="Berkeley"/>
      <xsd:attribute name="codeListVersionIdentifier"
          type="bablc:codeListVersionIdentifier" default="1.0"/>
      <xsd:attribute name="languageCode" type="bablc:languageCodeType" fixed="en"/>
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>
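To illustrate how such a codelist type constrains a document instance, the following fragment is a minimal sketch. It assumes that the babll:BerkeleyDegree element is declared with the BerkeleyDegreeType shown above and that the codelist attributes are unqualified. Neither declaration is reproduced in this report, so both are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumes babll:BerkeleyDegree is of type bablc:BerkeleyDegreeType.
     The element content must be one of the enumerated degree names. -->
<babll:BerkeleyDegree
    xmlns:babll="urn:berkeley:sims:doc-eng:names:babl:CoreComponents:CommonComponentTypes:0.01"
    codeListIdentifier="BerkeleyDegreeTypes"
    codeListAgencyIdentifier="Berkeley"
    codeListVersionIdentifier="1.0"
    languageCode="en">Master of Information Management and Systems</babll:BerkeleyDegree>
```

Because the attributes carry fixed or default values, an instance may omit them and a validating parser will supply the declared values. If an attribute with a fixed value is present, it must match that value exactly. A degree name outside the enumeration would be rejected.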
The XSD for Person is as follows:
<xsd:element name="Person" type="roles:PersonType"/>
<xsd:complexType name="PersonType">
  <xsd:sequence>
    <xsd:element ref="babll:IDs"/>
    <xsd:element ref="babll:PersonalName" minOccurs="0"/>
    <xsd:element ref="babll:Addresses" minOccurs="0"/>
    <xsd:element ref="babll:Contacts" minOccurs="0"/>
    <xsd:element ref="babll:Gender" minOccurs="0"/>
    <xsd:element ref="roles:MaritalStatus" minOccurs="0"/>
    <xsd:element ref="roles:Ethnicity" minOccurs="0"/>
    <xsd:element ref="roles:Citizenship" minOccurs="0"/>
    <xsd:element ref="roles:Residency" minOccurs="0"/>
    <xsd:element ref="roles:Immigration" minOccurs="0"/>
    <xsd:element ref="roles:ListOfRoles" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
The babll prefix is an abbreviation for the BABL Library namespace identifier, urn:berkeley:sims:doc-eng:names:babl:CoreComponents:CommonComponentTypes:0.01. It specifies that, for example, the IDs element belongs to the BABL Library namespace and is declared in the BABL Library schema file. The roles prefix identifies the Roles namespace, urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:0.01. The Roles part of Person is as follows:
<xsd:element name="ListOfRoles" type="roles:ListOfRolesType"/>
<xsd:element name="Roles" type="roles:RolesType"/>
<xsd:element name="Role" type="roles:RoleType" abstract="true"/>
<xsd:complexType name="ListOfRolesType">
  <xsd:sequence>
    <xsd:element ref="roles:Roles" maxOccurs="unbounded"/>
  </xsd:sequence>
  <xsd:attribute name="primaryRoleName" type="roles:RoleNameType"/>
</xsd:complexType>
<xsd:simpleType name="RoleNameType">
  <xsd:restriction base="xsd:string"/>
</xsd:simpleType>
<xsd:complexType name="RolesType">
  <xsd:sequence>
    <xsd:element ref="roles:Role" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="RoleType">
  <xsd:attribute name="type" type="roles:RoleCategoryType"/>
</xsd:complexType>
<xsd:simpleType name="RoleCategoryType">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="Affiliation"/>
    <xsd:enumeration value="Function"/>
    <xsd:enumeration value="Special"/>
    <xsd:enumeration value="Authority"/>
    <xsd:enumeration value="Organization"/>
  </xsd:restriction>
</xsd:simpleType>
As discussed above, the Role element is declared abstract, so a particular role, which would be derived from Role, must be substituted instead.
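One detail of W3C XML Schema is relevant here. An element declared abstract can appear in an instance only through a member of its substitution group, whereas a type declared abstract lets the instance keep the element name and select a derived type with xsi:type. Because the instance in the "Use of Schemas" discussion selects the role with xsi:type, the following minimal sketch shows the variant that places abstract on the type instead of the element. It is an illustrative alternative under that assumption, not the project's published schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Variant sketch: abstractness moved from the Role element to RoleType -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:roles="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:0.01"
    targetNamespace="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:0.01"
    elementFormDefault="qualified">

  <!-- The element is concrete, so roles:Role may appear in an instance -->
  <xsd:element name="Role" type="roles:RoleType"/>

  <!-- The type is abstract, so each occurrence must name a derived,
       non-abstract type (such as student:StudentType) via xsi:type -->
  <xsd:complexType name="RoleType" abstract="true">
    <xsd:attribute name="type" type="roles:RoleCategoryType"/>
  </xsd:complexType>

  <!-- Same enumeration as in the Roles listing above -->
  <xsd:simpleType name="RoleCategoryType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Affiliation"/>
      <xsd:enumeration value="Function"/>
      <xsd:enumeration value="Special"/>
      <xsd:enumeration value="Authority"/>
      <xsd:enumeration value="Organization"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

Under this variant, a bare roles:Role element with no xsi:type is rejected, which preserves the guarantee that every role in a document is a derived role.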
The XSD for Student is as follows:
<xsd:element name="Student" type="student:StudentType"/>
<xsd:element name="RegistrationStatus" type="student:RegistrationStatusType"/>
<xsd:complexType name="StudentType">
  <xsd:complexContent>
    <xsd:extension base="roles:RoleType">
      <xsd:sequence>
        <xsd:element ref="student:RegistrationStatus" minOccurs="0"/>
        <xsd:element ref="babll:ClassStanding" minOccurs="0"/>
        <xsd:element ref="student:ExpectedGraduationSemester" minOccurs="0"/>
        <xsd:element ref="babll:Programs" minOccurs="0"/>
        <xsd:element ref="babll:BerkeleyDegree" minOccurs="0"/>
        <xsd:element ref="academ:AcademicRecord" minOccurs="0"/>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
<xsd:simpleType name="RegistrationStatusType">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="New"/>
    <xsd:enumeration value="Readmitted"/>
    <xsd:enumeration value="Continuing"/>
  </xsd:restriction>
</xsd:simpleType>
The elements that are declared and types that are defined in other namespaces are not shown here.
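Additional roles would presumably follow the same derivation pattern as Student. The following is a minimal sketch of a hypothetical Faculty role. The Faculty namespace URI (modeled on the Student one), the relative location of Roles.xsd, and the AcademicTitle component are all assumptions made for illustration; none of them comes from the project's schemas.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical Faculty.xsd: the namespace URI, the schemaLocation hint,
     and the AcademicTitle component are assumptions, not project artifacts -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:roles="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:0.01"
    xmlns:faculty="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:Faculty:0.01"
    targetNamespace="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:Faculty:0.01"
    elementFormDefault="qualified">

  <!-- Bring in the base RoleType from the Roles namespace -->
  <xsd:import
      namespace="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:0.01"
      schemaLocation="../Roles.xsd"/>

  <!-- Global elements and global types, as in the Student schema -->
  <xsd:element name="Faculty" type="faculty:FacultyType"/>
  <xsd:element name="AcademicTitle" type="faculty:AcademicTitleType"/>

  <!-- FacultyType inherits the "type" attribute from roles:RoleType
       and adds its own role-specific content -->
  <xsd:complexType name="FacultyType">
    <xsd:complexContent>
      <xsd:extension base="roles:RoleType">
        <xsd:sequence>
          <xsd:element ref="faculty:AcademicTitle" minOccurs="0"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:simpleType name="AcademicTitleType">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
</xsd:schema>
```

Because each role lives in its own namespace and file, a new role can be added without editing the Person or Roles schemas.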
When creating the various components that are required to represent the academic record of a Student, we realized that the various Course components that the Course Project defined were insufficient for our purposes, since a course in the contexts that they modeled did not have a grade, whereas a course that a student has taken will always have a grade. In such a case, the solution that the Course team intended is to apply context and extend their component with the extra information required for the particular context. We chose to extend the soc:Course (Schedule of Classes Course component) with the extra information required for the academic record context. The WXS is as follows:
<xsd:complexType name="CourseType">
  <xsd:complexContent>
    <xsd:extension base="soc:CourseType">
      <xsd:sequence>
        <xsd:element ref="course:GradingOption"/>
        <xsd:element ref="academ:CreditedUnits"/>
        <xsd:element ref="babll:Grade" minOccurs="0"/>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
This Course extends the Schedule of Classes Course with a grading option, credited units for the course, and an optional grade (optional because the course might be in progress).
We have tried to make the schemas as flexible as possible. In cases where Role Based Access Control on the basis of role alone is desirable, an XML document instance such as the following could be used:
<?xml version="1.0" encoding="UTF-8"?>
<roles:Person xmlns:roles="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:0.01"
    xmlns:student="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:Student:0.01"
    xmlns:babll="urn:berkeley:sims:doc-eng:names:babl:CoreComponents:CommonComponentTypes:0.01"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:0.01 Roles.xsd
        urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:Student:0.01 Student/Student.xsd">
  <babll:IDs>
    <babll:ID xsi:type="babll:StudentIDType" schemeName="StudentID"
        schemeAgencyName="SIS">12345678</babll:ID>
  </babll:IDs>
  <roles:ListOfRoles>
    <roles:Roles>
      <roles:Role xsi:type="student:StudentType" type="Affiliation"/>
    </roles:Roles>
  </roles:ListOfRoles>
</roles:Person>
In cases where a more complete model of a student is required (for example, where access control will be based on attributes of a Student, such as the student's home department), the full model for Person and Student can be used.
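A fuller instance might look like the following sketch. It populates only the Student children whose content is shown in this report: RegistrationStatus, plus BerkeleyDegree on the assumption that it takes a value from the degree codelist. ClassStanding, Programs, and AcademicRecord are omitted because their content models are not reproduced here. The primaryRoleName value "Student" is likewise an assumption, since RoleNameType is an unconstrained string.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the primaryRoleName value and the BerkeleyDegree content are assumptions -->
<roles:Person xmlns:roles="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:0.01"
    xmlns:student="urn:berkeley:sims:doc-eng:names:babl:AggregateComponents:Roles:Student:0.01"
    xmlns:babll="urn:berkeley:sims:doc-eng:names:babl:CoreComponents:CommonComponentTypes:0.01"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <babll:IDs>
    <babll:ID xsi:type="babll:StudentIDType" schemeName="StudentID"
        schemeAgencyName="SIS">12345678</babll:ID>
  </babll:IDs>
  <roles:ListOfRoles primaryRoleName="Student">
    <roles:Roles>
      <!-- Child elements follow the sequence order defined in StudentType -->
      <roles:Role xsi:type="student:StudentType" type="Affiliation">
        <student:RegistrationStatus>Continuing</student:RegistrationStatus>
        <babll:BerkeleyDegree>Master of Information Management and Systems</babll:BerkeleyDegree>
      </roles:Role>
    </roles:Roles>
  </roles:ListOfRoles>
</roles:Person>
```

An application that needs only the role can read the xsi:type and type attributes and ignore the children, while an attribute-based policy can inspect the nested elements.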
In summary, our work has consisted of several phases, and is itself a preliminary to a more sustained effort on Roles for e-Berkeley. We have surveyed many of the comparable projects that are relevant to Roles at Berkeley, and where appropriate, we have taken advantage of some of the insights of other projects. Furthermore, we have summarized several of the key technologies and methodologies that are relevant to Roles. In particular, we have applied the "Document Engineering" methodology to the Roles problem, and designed a logical model for the core Person entity. More generally, we have proposed a large-scale architecture for representing a Person with one or more Roles, and have illustrated how to follow this architecture by developing a model for one role in particular, Student.
Much work remains to be done on Roles at Berkeley, and we anticipate that work could continue in several areas. Some of the tasks that we envision being part of a continuing Roles effort are: first, the development of a complete hierarchy, or taxonomy, of roles, which would plug into the Person entity as we have illustrated; second, the actual modeling of additional roles, such as Professor and Staff; and third, research into the low-level access control technologies that complement the high-level models we have proposed. Though our work is tentative and preliminary to a more complete Roles effort, we hope we have illustrated the utility of the "Document Engineering" methodology as applied to the e-Berkeley domain, and that the work we have done will contribute to the ongoing Roles effort in e-Berkeley.
The Roles team would like to thank the following people who contributed to this project:
First and foremost, many thanks to our project leader and advisor, Bob Glushko.
Jon Conhaim, e-Berkeley.
J. R. Schulden, Helen Lee, and Randy Ballew, Student Information Systems.
Chris Hoffman, Graduate Division.
Mara Hancock, Course Web.
Rozanne Largent, Karen Denton, e-Grades.
Robert Chevalier, Lucia Tsai, LDAP.
Patrick Garvey, SIMS, for his assistance in the later phases of the project.