USING DATA WAREHOUSES IN UNIVERSITY INFORMATION SYSTEMS

Marko Bajec, Rok Rupnik, Marjan Krisper
University of Ljubljana, Slovenia
Faculty of Computer Engineering and Information Systems

Abstract

In this paper we first briefly describe a data warehouse concept in general. We discuss the benefits of using it, trying to point out that data warehousing is important, not only in business enterprises, but in the university environment as well. Next we present a case study of using a data warehouse to enhance an existing information system, which handles admissions to university. Activities that are carried out within the admission process insert and use data from the operational database, which is then re-created every scholastic year. Instead of linking all databases together to perform a complex analysis, we could use a data warehouse, loaded with complete data set. Many advantages could be achieved with such an approach.

1. What is a Data Warehouse

A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making [1]. Simply, a data warehouse is repository of data that is specifically designed to make analysis of data simple and efficient. It is being developed to provide end-users with a collection of administrative and business information from the organisation's existing operational systems.

A data warehouse is different from an operational system because:

Today many organisations are trying to develop a data warehouse to meet their business objectives.

1.1. Benefits of Data Warehousing

There are many benefits of data warehousing. The most important are listed below.

In the highly aggressive market, use of these techniques could be of great importance [2].

However, not only business enterprises can benefit from data warehousing. This concept of collecting data into separate, multidimensional repositories to handle complex decision making-activities, can be used in the university environment as well. Note that traditional administrative systems, such as Student information system (student's records, entrance examination, admission, examination records etc.) are designed to optimise transactional processing. However, data synthesis and analysis is rather difficult in such systems, causing low performance of operational databases.

By using a data warehouse we could overcome this kind of problems. Not only that we set operational data free from complex retrieval, there are also many other benefits, concerning the university management.

For example, consider these few questions, which could be answered easily, using a data warehouse:

Note that those questions are not very easy to answer, using relational databases, due to a simple fact: relational databases are good at retrieving small number of records quickly, but they do not retrieve a large number of records and summarise them on the fly [3].

In the next section we present a study case of using a data warehouse to enhance an existing information system, called VPIS [4]. VPIS is an admission-system that handles admissions to university study programmes. Basically it supports an algorithm which accepts or rejects applicants, according to their selected programmes, qualification rules, secondary-school grades, final exams etc. The fact that applicants are allowed to apply to three different study programmes makes this selecting algorithm very complicated.

There are mainly two reasons, which lead us into development of a data warehouse:

  1. It is difficult to perform complex queries now, because a new database is created every scholastic year. We must link all those databases together to access and retrieve required data.
  2. We could use a data warehouse to help applicants decide what kind of study programmes to choose. Bad choice of study programmes is the most common mistake of many applicants, mostly because they do not really comprehend the selection algorithm. We feel that we could help those applicants by giving them special tools and access to the required information.

2. Using a Data Warehouse to enhance the information system "VPIS"

Candidates who intend to study at the university level in Slovenia apply to the admission service. In the application form they name up to three different study programmes, choosing from several faculties and departments. However, they have to be careful, because the priority in which they name study programmes could be very important. As we will see latter, this is due to different types of study programmes that influence the selection algorithm.

2.1. Selection Algorithm

The admission process begins with a data capture from the application forms into a database. Once this is finished, the analysis is undertaken to find out how many times a specific study programme was chosen as the preferred option. Depending on this number, it could become either a limited or an unlimited type. The limited programmes are those with more applications then study places available. All others are unlimited programmes. In the next step additional activities are performed, including programme-specific testing, calculation of marks etc.

The core of the admission process is a selection algorithm [5] that tries to place the candidate on his preferred study programme, for which he fulfils all the qualifications and other required conditions. At that point the priority, and type of selected study programmes become important. Depending on this, candidates could be accepted directly, without any competition, or they have to compete with other candidates. Rules defining this decision process are stated below.

img

You can see the selection algorithm concerning unlimited/limited programmes in figures 1 and 2. Although the difference is obvious, we give you an example that explains the difference even more clearly.


Example:

Think of a candidate that would like to study most at the:

Suppose now that he has no preferred order between these three study programmes. Because he is unable to figure out which study programmes may become limited, he names them in random order. After the application analysis, it turns out that the physics study programme is the only one that is unlimited. Now observe what could happen.

In the selection algorithm the candidate is rejected, because of strong competition, both for the law study programme and for the computer science and informatics study programme. He is left with only the physics study programme. Although the programme is unlimited, he must still compete for the available places. It is important to note that in this case the available places are only those that were left by the applicants, which chose the programme of physics as the preferred programme. Unfortunately, he is rejected for that programme, too. In fact it turns out that he is not accepted anywhere.

A completely different scenario would happen if he selected the physics study programme as his preferred study programme. In that case, he would be accepted without any selection or competition.


2.2. Evaluation of an acceptance

The selection algorithm compares the totals of points that candidates get according to their success in the last two years of secondary school, final examinations and study programme specific testing. Calculations of totals are performed according to the formulas that are defined for every single study programme. Below you can see the example of a formula.


Example:

Department: Faculty of computer and information sciences.
Study programme: Computer science and informatics.

Qualification conditions:
a) Matura exam or
b) Internal final exam

In case of a limitation, the total of points must be calculated using this formula:
a) for candidates with the Matura exam: Total = 0,60(Sb) + 0,40(S34)
b) for candidates with the Internal final exam: Total = 0,20(SIf) + 0,40(S34) + 0,40(Sadd)

Legend: S_b number of points achieved by Matura exam
S_34 number of points achieved by success in last two years of secondary school
S_If number of points achieved by Internal final exam
S_add number of points achieved by mathematics and physics tests.

Note that candidates do not have equal possibilities to be accepted by any faculty. Study programmes like medicine and law, always have very strong competition. Hence, only candidates with very high totals can compete for those study places. Others should avoid naming a study programme of medicine or law in the application form. For that reason, it is important to know how to evaluate the possibility of acceptance for the specific study programme. Our former experience shows that not many candidates have the knowledge how to do that. The fact is that candidates do not know, at the time when they are applying, neither how successful they are going to be at the final examinations or at specific testing, nor which study programmes will become limited. They can only guess.

However, there is something that we could help them with. These are results of past admissions. If the candidate, from the example above, knew that the physics study programme was almost always unlimited, then he would certainly choose it, as the most preferred study programme. He would also avoid choosing the law study programme, if he knew how many points he needed for that programme (supposing that he was not one of the best students, either at the secondary school, or at the finals).

2.3. Publishing important information on the Internet using a Data Warehouse as a Source

As we stated earlier, candidates that apply for university study programmes would make fewer mistakes if they were provided with enough information. We think that publishing this information on the Internet would certainly make sense. Due to deep analysis that is necessary to get this kind of information, usage of a data warehouse would be highly recommended.

Here is an example of the historical information about the law study programme. We are sure applicants would find this information useful.

img

However, an even more exciting idea is to develop a tool that would be able to answer the questions like:

Of course, answers would be based on historical information. Although they were only estimations, we are sure they would be very useful.

3. Conclusion

We have shown that there is no particular reason why a data warehouse could not be used in a university environment. In fact, there are several features of data warehousing that could be beneficial, both for the university management and for applicants and students. However, the most important principle in data warehousing is that the project should be driven by a need to produce results for the end user and not by a desire to create a sophisticated IT system [6]. We believe that we have shown that too.

4. References

  1. W. H. Inmon, 1992, Building the Data Warehouse, John Wiley & Sons.
  2. Hans G. Dallenbach, 1995, Systems and Decision Making: A Management Science Approach, John Wiley & Sons.
  3. Friend David, An Introduction to Multidimensional Database Server Technology, Pilot Software Inc.
  4. R. Rupnik, M. Bajec, M. Krisper, 1997, Information System for Application Procedure for Registration in Higher Education in Slovenia, Proceedings of European Cooperation in Higher Education Information Systems, pg. 221-226, Grenoble, France, September 1997.
  5. Tomaž Mohoric, Izbirni postopek za vpis na fakultete in samostojne visokošolske zavode v Sloveniji, In Uporabna informatika, vol.1 - jan/feb/mar 1998, pg.32-40.
  6. Doreen Stevenson, 1997, Data Warehouses and Executive Information Systems - Ignoring the Hype, Proceedings of European Cooperation in Higher Education Information Systems, pg. 202-207, Grenoble, France, September 1997.

Address:

Marko Bajec, Rok Rupnik, Marjan Krisper
Trzaska 25, 1000 Ljubljana, Slovenia
e-mail:
marko.bajec@fri.uni-lj.si,
rok.rupnik@fri.uni-lj.si,
marjan.krisper@fri.uni-lj.si