EUNIS97, Grenoble (France) 9-11 September 1997

Ref: 030104

An analysis of the e-mail traffic of a server called TIGRIS.KLTE.HU

György Terdik and Péter Erdősi

Introduction

One of the first uses of the Internet was electronic mail. Messages are sent between users of computer systems, and the computer systems are used to hold and transport the messages. Electronic mail has several advantages: it is fast, cheap, convenient and so on. The number of Internet users is increasing exponentially, so more and more people are able to receive and send e-mail. There is no doubt that academic staff and university students have become very accustomed to it. It is a part not only of research work but of everyday life as well, and stopping the e-mail service would cause unimaginable difficulties in keeping in touch. Electronic mail is an important component of an office automation system. One problem to keep in mind when writing an e-mail is the use of special characters of languages other than English. The Hungarian umlaut Ű, say, has to be encoded before sending and decoded before reading. Most client software provides automatic coding services. The most popular client at our University is Pegasus Mail, which offers uuencoding and uudecoding among other possibilities. Newer systems support the composition and delivery of multimedia mail, which can combine text, graphics, voice, facsimile, and other forms of information in a single message.

The mail server called Tigris

In 1991 a VAX6000/510 was installed at our University. It was one of the most powerful servers in the country at that time, with one VAX processor, 128 MB RAM and a 6 Gigabyte DSSI Winchester disk; later an extra 6 Gigabyte SCSI Winchester disk was added. The main job of the server is to provide Internet services for more than 4000 users, and it also plays the role of mail gateway for some other servers. It can be reached by Ethernet (10 Mbit/sec) from the campus, by FDDI ring (100 Mbit/sec) from all the other institutions of higher education in our city, Debrecen, and by a 512 Kbit/sec leased line from Budapest. Besides the protocols TCP/IP and DECnet, the protocol POP3 has also been implemented because of the extensive use of the clients Exchange and Pegasus Mail.

Logging of e-mails

The logging of e-mails gives information about the e-mail traffic. On the basis of the log one may monitor the most popular mailing lists, the activity of users and so on. Because of the mail gateway function, the number of outgoing e-mails is not the same as the number of e-mails sent by the users of the server: the e-mail traffic of the server is the sum of its own users' mail and the e-mail traffic of some other servers. The e-mail traffic of the University is larger than the traffic of this server, because there are servers that do not use Tigris as a mail gateway. The public domain message transport system Message Exchange (MX) 4.2 is responsible for the mailing function of Tigris. It supports several protocols, such as SMTP, NJE, UUCP and so on. As the gateway, it carries out protocol conversion when necessary. It is reliable and has been running for several months without any problems. In 1996 two protocols were used by Tigris for transferring mail. During logging several data items are recorded, depending on the protocol. The choice of data to be recorded is up to the postmaster. In our case the data are collected according to the following table.
Protocol  SOURCE  HOST  USER  SENT  SIZE  DATE  TIME
LOCAL     +       -     +     -     +     +     +
SMTP      +       +     -     +     +     +     +

The MX agents register the data of an e-mail when it leaves the MX queue. Only outgoing mails are logged, so every mail is logged exactly once. The logged data are transferred to an Oracle V7 database and tables are produced by SQL queries. The MX system was installed in 1994, so the time series of the daily traffic of Tigris is available for three years: 1994, 1995 and 1996. The data are analyzed with MS Excel and SPSS using standard time series methods.
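
The aggregation of the log into a daily time series can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 in place of Oracle, and the table name mail_log, its columns and the sample rows are hypothetical, since the real schema is not given here.

```python
import sqlite3

# Hypothetical schema and made-up rows; the real log layout is not specified.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mail_log (protocol TEXT, size INTEGER, sent_date TEXT)"
)
rows = [
    ("SMTP", 1200, "1995-03-01"),
    ("LOCAL", 800, "1995-03-01"),
    ("SMTP", 4000, "1995-03-02"),
]
conn.executemany("INSERT INTO mail_log VALUES (?, ?, ?)", rows)

# Daily traffic: number of letters and average letter size per day.
daily = conn.execute(
    "SELECT sent_date, COUNT(*), AVG(size) "
    "FROM mail_log GROUP BY sent_date ORDER BY sent_date"
).fetchall()

for day, letters, avg_size in daily:
    print(day, letters, avg_size)
```

Adding protocol to the GROUP BY clause would give the same summary broken down by protocol.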

The following figure shows the daily traffic in 1995; the number of letters is plotted against days.

One can see the number of letters increasing at the beginning of each semester and decreasing towards its end. Unfortunately there is no periodicity by semester, because the winter and summer holidays differ in length.

We summarize the e-mail traffic by year and by protocol. In 1994 Bitnet SMTP and DECnet SMTP were also running. There is not much difference between the average sizes of the letters. The first step of the statistical analysis was a preliminary transformation to get rid of outliers. Outliers are caused by problems with the network, the server and the mailing system. Each outlier was replaced by a regression method using the neighboring data.
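
A minimal sketch of such an outlier replacement is given below. The detection rule and the exact regression are not specified in the text, so both are assumptions here: a day is flagged when it lies more than k median absolute deviations from the median, and it is replaced by the value of the straight line through its nearest non-outlying neighbours. The function name, the threshold k and the sample data are hypothetical.

```python
from statistics import median


def replace_outliers(series, k=5.0):
    """Replace flagged values using a line through the nearest good neighbours.

    Assumed rule: a value is an outlier if it is more than k median
    absolute deviations (MAD) away from the median of the series.
    """
    med = median(series)
    mad = median([abs(x - med) for x in series])
    is_out = [abs(x - med) > k * mad for x in series]
    cleaned = list(series)
    n = len(series)
    for i in range(n):
        if not is_out[i]:
            continue
        # Nearest non-outlying index on each side, if any.
        left = next((j for j in range(i - 1, -1, -1) if not is_out[j]), None)
        right = next((j for j in range(i + 1, n) if not is_out[j]), None)
        if left is None and right is None:
            continue  # nothing to interpolate from
        if left is None:
            cleaned[i] = series[right]
        elif right is None:
            cleaned[i] = series[left]
        else:
            # Straight line through the two neighbouring good points.
            slope = (series[right] - series[left]) / (right - left)
            cleaned[i] = series[left] + slope * (i - left)
    return cleaned


traffic = [300, 310, 5, 330, 340, 320, 315]  # made-up daily counts
cleaned = replace_outliers(traffic)
print(cleaned)
```

In this made-up example only the third day is flagged, and it is replaced by the value halfway between its two neighbours.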

The number of letters is increasing, of course, while the maximal value is decreasing; this is because the stability of the leased line and of the mailing system became better and better. The minimal value in 1996 was 37, showing that there was practically no fault in the delivery.

For the detection of periods the estimated and smoothed spectrum is considered.

The figure above contains the plot of the spectrum for the time series of the daily traffic in 1995. The values of the spectrum, i.e. the spectral density, are plotted against the period in days. It has a high peak at 7, which means, not surprisingly, that there is a weekly period.
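
How a weekly period shows up as a spectral peak can be illustrated with a raw periodogram. The sketch below is not the SPSS procedure used for the figure: it omits the smoothing step, uses a plain discrete Fourier transform, and runs on a synthetic series (lower traffic on two days of each week) rather than on the Tigris data. All names and numbers in it are hypothetical.

```python
import cmath
import math


def periodogram(series):
    """Return (period, power) pairs for Fourier frequencies k = 1 .. n/2."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    result = []
    for k in range(1, n // 2 + 1):
        coef = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        result.append((n / k, abs(coef) ** 2 / n))
    return result


# Synthetic daily series: 52 weeks, lower traffic on two days of each week.
synthetic = [300 if day % 7 < 5 else 100 for day in range(364)]

# The period with the largest power is the dominant cycle length in days.
period, power = max(periodogram(synthetic), key=lambda p: p[1])
print(period)
```

For real data the raw periodogram is noisy, which is why a smoothed estimate is normally plotted instead.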

The table of descriptive statistics for the seven-day period shows that in each year the working days are significantly different from the holidays; the minimum p-value is 0.4. The means of the working-day traffic are different; nevertheless these differences are not significant. The same holds for the holidays. This was checked by t-test.
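
A comparison of two groups of days by t-test can be sketched as follows. The text says only that a t-test was used, so the choice of Welch's unequal-variance form is an assumption, and the sample values are made up. The sketch computes the test statistic and the approximate degrees of freedom; turning these into a p-value needs the t distribution, which a statistics package provides.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


working = [310, 295, 320, 305, 300]  # made-up letters per working day
holidays = [110, 95, 120, 105]       # made-up letters per holiday
t, df = welch_t(working, holidays)
print(round(t, 2), round(df, 2))
```

A large absolute value of t relative to the degrees of freedom indicates that the two group means differ significantly.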

The time series of the weekly averages no longer contains the period and allows us to make further analysis. The correlation between the time series of 1994, 1995 and 1996 was calculated. The data of 1994 proved to be independent of both the 1995 and the 1996 series. Therefore the decision about trend in the series of weekly averages was based on the years 1995 and 1996. The question is whether the series contains a trend, i.e. whether it can be put into the form

$X_t = a + b\,t + Z_t$, where $Z_t$ is a trend-free series.

Testing the hypothesis b=0, we differenced the series once,

$X_t - X_{t-1} = b + (Z_t - Z_{t-1})$,

and used the one-sample t-test for testing $b = 0$. The estimated value for b is 3.8, with a two-tailed p-value of 0.94; therefore there is no reason to assume that there is any trend.
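
A minimal sketch of this trend test follows, assuming a linear trend with slope b per week, so that the mean of the first differences estimates b. The function names and the sample numbers are hypothetical, and only the t statistic is computed; the p-value would come from the t distribution with one fewer degree of freedom than the number of differences.

```python
from math import sqrt
from statistics import mean, stdev


def weekly_averages(daily):
    """Average consecutive 7-day blocks; a trailing partial week is dropped."""
    weeks = len(daily) // 7
    return [mean(daily[7 * w: 7 * w + 7]) for w in range(weeks)]


def trend_t_statistic(series):
    """One-sample t statistic for 'the mean first difference is zero'."""
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    b_hat = mean(diffs)  # estimated slope per time step
    t = b_hat / (stdev(diffs) / sqrt(len(diffs)))
    return b_hat, t


weekly = [280, 270, 295, 260, 290]  # made-up weekly averages
b_hat, t = trend_t_statistic(weekly)
print(b_hat, round(t, 3))
```

A t statistic close to zero, as in this made-up example, gives no reason to reject the hypothesis b=0.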

Now we are in a position to predict the weekly average series for 1997.

Denote the traffic in 96 and the prediction for 97. It is calculated by the formula

, t=1,2,….,52

where a = 279.5. The measured and the predicted values for the first 19 weeks are plotted above.


Center for Informatics and Computing,
Lajos Kossuth University of Debrecen,
4010 Debrecen, Pf. 58, Hungary
E-mail: terdik@cic.klte.hu, perdosi@cic.klte.hu

Copyright EUNIS 1997 Y.E.