EUNIS 99

Dynamic WWW Style Processing with SeSAMe

Thomas Fischer
Universität Gesamthochschule Kassel, Germany

Abstract

In the process of developing a new design concept for our university’s WWW service we compiled a list of specific formal and aesthetic requirements and finally worked out a problem-oriented software concept. This concept might be useful for those who work on university online services, and it may also be of hypertext-theoretical interest.

Introduction

At the time of writing, exactly one decade has passed since Tim Berners-Lee started the development of the World Wide Web with his article "Information Management: A Proposal". Half a decade later, in late 1993, the University of Kassel launched its first official WWW server (www.uni-kassel.de). What we gained was more than a new medium: we also found an object for study as well as for technical and scientific work. It was not hard to recognize that the new questions arising in our new collaborative hypertext writing could not be answered with our knowledge of traditional media alone. So we developed and learned new ways of cooperation, authoring and thought.

The initial strategy of WWW document publishing was to mark up hypertext logically. Layout and design were of minor importance, and it was on the client side that the presentation should be configured for maximum comprehensibility, independent of the actual output format. Nevertheless visual design manipulation was always possible and always practiced, and in the end it is one of the reasons for the popularity that the World Wide Web has today. Accordingly universities, political parties, companies and clubs began to reproduce on their websites the corporate identities they used in traditional media.

From the point of view of these organizations there are several reasons to set up a corporate identity: for example, it ensures identification from within and recognition from outside the organization. From the perspective of the user, i.e. the reader, recognition is a very valuable aspect of a unique online document appearance: because potentially every inter-document distance appears as just one mouse click, a corporate design can answer the question whether two documents are located in the same directory on the same server or on different continents. Thus, a corporate identity can be regarded as a useful hypertext navigation aid for complex online hypertexts such as university WWW services.

Our redesign development

When we began to work on a redesign for the WWW presence of our university we first analyzed the advantages and the problems of our then current design, and we found that most of the problems resulted from technical difficulties with our corporate identity. For organizational and of course for democratic reasons universities want their staff and students not only to receive online information but also to produce it. However, many of our WWW authors obviously had problems reproducing the HTML code required for our official website appearance: we found that source code was often copied and pasted into new projects, becoming more and more defective, with some common errors spreading just like viruses. Thus, from the author’s point of view, a technical gap hampers the production and the maintenance of stylish online material. We felt that bridging that gap could solve problems like missing or outdated documents and dead links.

We also felt that our hypertext structure should become more transparent through a greater emphasis on its logical structures. In the ideal case university WWW (sub-)hypertexts are not chaotic but structured. A simple model is a tree, starting with the university’s homepage and branching to various subsites at various levels. These subsites may consist either of a tree structure or of a crystal-like matrix structure, e.g. in timetables or staff overviews. In any case our documents can be found in contexts where a superior hypertext level is available which can provide an overview of the current content, along with links that refer to documents on the same level (’horizontally related’) or that branch to subordinate documents. In this situation a ’navigation bar’ with links to related documents and a link up to the superior hypertext level is a useful tool. It can be found on many online documents in various forms.
We decided to integrate a legible navigation bar in every document as well as links to online help documents and a search engine.

When we presented our prototypes to the public, our (future) authors found the more functional document source code even more complex and harder to reproduce than that of the old design, so they did not like it. Finally we found that the demand for a stylish and comfortable corporate design on the one hand and the demand for simple, easy-to-reproduce HTML code on the other hand contradicted each other, and we started to think about strategies of our own to fulfill all needs. What we have developed is a WWW server PlugIn and an authoring system. We call our concept SeSAMe (SErverbased Style and Authoring ManagemEnt).

The SeSAMe strategy

From other user software products such as text processors we know the concept of document templates. Integrating some content into a document template is a common technique, so offering a content-free central style template seemed a good idea. However, our authors had already used our homepage as a style template and, as we have seen, adapting it caused them problems concerning code correctness and maintainability. Since merging content and a style template is a trivial process, we decided to let the computer do this. Now the authors can concentrate on their content. As a side effect of a server technology based on a single design ’masterfile’ maintained by the public relations office, the central responsibility that exists for our print media design now also covers our online documents. Furthermore, when we need to redesign our WWW service again in a few years, we will only have to edit a single template and every document will be served in the new style automatically.

Today’s standard technique for describing HTML design elements centrally is the use of so-called style sheets. These are detailed specifications of sizes or colours of fonts, lists, screen coordinates and tables of layers, stored in one file to which any number of documents may refer. We found this solution not quite optimal for the following reasons: Style sheets can only be interpreted by the latest browser generation. But as a university we want our information to be read internationally. Therefore at a very early stage of our redesign process we decided not to require the modern high-standard client technology which is common among the well equipped readers in industrial nations. Older clients on slow machines should understand as much of our code as possible. Moreover, a design generated with style sheets does not only consist of the information in the style specification: it is also a result of the order of elements within the HTML documents. For example, when only using style sheets each file would require re-editing if we wanted our navigation bar to move from the document bottom to the document top in another redesign.

As mentioned above, the initial idea of HTML was to allow free output formats by marking up document elements logically. So why not invent a logical markup specification which allows a corporate design of any complexity to be added automatically? Analyzing the documents on our WWW server we found that all of them shared the same set of meta-information fields: for every document there was an author, his email address, his institution, the date of last modification, the title, the actual content and so on. In other words, we found a set of metadata (this is what the Dublin Core set consists of). Every document also contains a standard set of visible elements in a defined appearance. We started to work on a technique that allows us to store documents without design information and that interprets them on the server side at each request, where the design elements are added.

SeSAMe technology

Our redesign definitely required each of our WWW documents to be revised and edited. So we decided to take the opportunity to port them into another format: we converted *.html files to our own format, *.ghk files. These are plain text documents containing pairs of variables and contents in the following format:

=TITLE <SQD>Homepage</SQD>;
=AUTHOR <SQD>Webmaster</SQD>;
=MAILTO <SQD>www@hrz.uni-kassel.de</SQD>;

The <SQD> sequences stand for ’SeSAMe quoted data’. They are used as string delimiters and might be understood as a substitute for quotation marks. Variables like =CONTENT may contain contents of arbitrary length as well as HTML elements. The syntax shown above is SeL (SeSAMe Language). It was developed for our needs by Andreas Matthias, who is a member of our computer center. Our design masterfile, the central template for our corporate design, is also written in SeL. The SeL compiler is able to merge several input streams of characters dynamically into one output stream at very high speed, while all features of high-level programming languages can be used, for example variables of any type, loops, subroutines, branching, mathematical functions, the definition of custom functions, environment calls etc. We use it to process *.ghk file contents and our masterfile to produce HTML output every time a document is requested.
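The merging step described above can be illustrated with a minimal sketch. It is written in Python rather than SeL and is not the SeL compiler; it only assumes that every field has the form shown in the example above. The template string and the function names parse_ghk and render are hypothetical stand-ins for the real masterfile and compiler.

```python
import collections
import re

# One field as shown in the example above: =NAME <SQD>value</SQD>;
FIELD_RE = re.compile(r"=(\w+)\s*<SQD>(.*?)</SQD>\s*;", re.DOTALL)

# Hypothetical stand-in for the design masterfile (the real one is written in SeL).
MASTER_TEMPLATE = (
    "<html><head><title>{TITLE}</title></head>\n"
    "<body>\n{CONTENT}\n"
    '<hr><address>{AUTHOR}, <a href="mailto:{MAILTO}">{MAILTO}</a></address>\n'
    "</body></html>"
)

SAMPLE = """=TITLE <SQD>Homepage</SQD>;
=AUTHOR <SQD>Webmaster</SQD>;
=MAILTO <SQD>www@hrz.uni-kassel.de</SQD>;
=CONTENT <SQD><p>Welcome.</p></SQD>;
"""


def parse_ghk(text):
    """Return a dict mapping field names to their quoted contents."""
    return dict(FIELD_RE.findall(text))


def render(ghk_text, template=MASTER_TEMPLATE):
    """Merge the fields of one document into the template.

    Fields that the document does not define are rendered as empty strings.
    """
    fields = collections.defaultdict(str, parse_ghk(ghk_text))
    return template.format_map(fields)


if __name__ == "__main__":
    print(render(SAMPLE))
```

Because the design lives only in the template, changing MASTER_TEMPLATE changes the appearance of every document rendered this way, which is the property the masterfile is meant to provide.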

Fig. 1: SeSAMe WWW document service

The SeL compiler is an external program, called by our web server (Apache 1.3.2) when a request for a *.ghk file occurs. It can be understood as a PlugIn extending the server’s functionality: apart from SeL processing nothing was changed on the server, so the service of traditional static HTML files continues unchanged and nobody is forced either to use our technology or to use the university’s official design. On the contrary, we have documented how every author can set up his own design masterfile to make use of our system’s advantages without participating in the official design.

SeSAMe online document authoring

*.ghk files consist of a set of text elements which might be understood as ’fields’ or ’objects’. We developed a method that allows our writers to create their documents using their WWW clients. As is well known, the WWW protocol HTTP allows collecting information from the client side by making use of forms. Form elements are fields for text input, checkboxes etc. We use this method to gather the fields of our *.ghk files with a problem-oriented authoring system. The tool generates a user dialogue in dynamic HTML containing intelligent form elements by executing CGI scripts. It allows the creation, editing, renaming and deleting of WWW documents and directories. It is accompanied by online help material and an online handbook, but it also acts as a wizard: it generates dialog-based input forms with context-relevant help texts and hints. We use JavaScript to evaluate user input immediately in order to avoid logical input errors. For example, after the tool has asked for the language a document is written in, the document may contain links to translations in any language except its own.
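The translation rule mentioned above can be sketched as follows. The frontend itself performs this check in JavaScript; the sketch is in Python only for illustration, and the language list and the function name translation_choices are assumptions.

```python
# Hypothetical list of languages offered by the authoring dialogue.
LANGUAGES = ["de", "en", "fr"]


def translation_choices(document_language, languages=LANGUAGES):
    """Return the languages a document may link translations to.

    A document may link to translations in any language except its own.
    """
    return [lang for lang in languages if lang != document_language]


if __name__ == "__main__":
    print(translation_choices("de"))
```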

Finally, the authoring frontend collects the author’s UNIX account password, enciphers it and transmits it to a CGI-based FTP client on the server, which tries to use the identification data given by the author to connect to the FTP server and to transfer new and updated information. In developing this we found a way to use freeware viewer software as a composer and, finally, to use the essentially anonymous HTTP protocol to enforce access rights as configured on the UNIX file system.

Fig. 2: SeSAMe authoring

The first conception of a hypertext system, Memex by Vannevar Bush from 1945, already showed the possibility and the importance not only of reading but also of writing hypertext by integrating a microfiche camera system that enabled the user to add information to a knowledge base. Ever since then hypertext systems have attempted, more or less successfully, to provide authoring modules. Today, for example, Netscape comes with a so-called composer for visual HTML editing and Microsoft sells Frontpage with an integrated site management system. These composers are designed for general purposes. Therefore their user interfaces suffer from a very high load of options, and it appears arguable whether it is harder to learn how to write HTML by hand, keeping control of the syntax, or to learn to use a modern WYSIWYG editor. The SeSAMe authoring frontend is designed to handle a small number of typical tasks, excluding the solving of design problems. Therefore it can easily be understood and used as well as adapted to future requirements. It appears in the user’s WWW client and can be used to manage complex sites and group writing without requiring any knowledge of HTML programming or UNIX.

New possibilities

The strategy of marking up all information in WWW documents logically allows powerful services. For example, SeSAMe can play an active role in document maintenance. When we recorded the problems of our WWW service for our redesign, we found much online information obsolete and outdated because its authors had forgotten to maintain it. To solve this problem our authoring frontend can ask for a date and a short message and store them in the *.ghk files. Every morning all documents are scanned for stored dates that match the present day. Whenever that is the case, the short message is sent to the document’s author, reminding him to update it. This is possible because the authors’ email addresses are stored in another *.ghk file object in order to generate the notes at the bottom of each document.

Another useful utility we have implemented is the possibility of specifying a path to an author’s public PGP key. This can be used to sign documents electronically if an author wishes to do so, and with a small icon at the bottom of each document we can confirm whether a document is authentic or whether its content was manipulated by an unauthorized person.

Another service that our system provides is automatic banner integration. Our new design reserves space for two small banners on each document. During the authoring dialogue the authors can choose whether a document shall have two banners, one or none, and whether these banners are custom-made ones, possibly with author-defined links, or whether they should be integrated dynamically by the server. This can be used for commercial use of our WWW service, and as the server is able to log how often which banner was shown on which author’s documents, every one of our authors has the chance to make commercial use of his online documents.
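The daily reminder scan described above can be sketched as follows. This is an illustrative Python sketch, not the implemented service: the field names =REMINDDATE and =REMINDMSG, the ISO date format and the function name due_reminders are assumptions, and the actual sending of mail is left out.

```python
import re
from datetime import date

# One field in the format shown earlier: =NAME <SQD>value</SQD>;
FIELD_RE = re.compile(r"=(\w+)\s*<SQD>(.*?)</SQD>\s*;", re.DOTALL)


def due_reminders(documents, today):
    """Collect the reminders that fall on the given day.

    documents maps a file name to the text of that *.ghk file.
    Returns a list of (recipient, message, file name) tuples.
    """
    due = []
    for name, text in sorted(documents.items()):
        fields = dict(FIELD_RE.findall(text))
        # REMINDDATE and REMINDMSG are hypothetical field names.
        if fields.get("REMINDDATE") == today.isoformat() and "MAILTO" in fields:
            due.append((fields["MAILTO"], fields.get("REMINDMSG", ""), name))
    return due


if __name__ == "__main__":
    docs = {
        "timetable.ghk": (
            "=MAILTO <SQD>author@example.org</SQD>;\n"
            "=REMINDDATE <SQD>1999-06-07</SQD>;\n"
            "=REMINDMSG <SQD>Please update the timetable.</SQD>;\n"
        ),
    }
    for recipient, message, name in due_reminders(docs, date(1999, 6, 7)):
        print(recipient, name, message)
```

A scheduled job could call such a function once every morning and pass each result to the local mail system.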

Although SeSAMe’s *.ghk files are still stored in the file system, they already have much in common with database management systems. This leads us to the problems databases have to deal with. One of these problems is concurrent write access, which is usually addressed by so-called file locking. It occurs when more than one author is entitled to edit a file. If more than one author has loaded a file into his editor to manipulate the content, one writing process will overwrite the result of the other. A simple solution might be to mark a file to indicate that it is being edited. Yet on our multitasking timesharing computer architectures small loopholes are hard to avoid and the data remains insecure. We solved this problem by using the UNIX revision control system RCS, which includes a file locking system. As a side effect our authors can use a menu in our authoring frontend to restore any earlier state of their documents.
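The ’simple solution’ of marking a file as being edited can be sketched as follows. This is not how SeSAMe works, since SeSAMe relies on RCS; it is only an illustration of the marker idea, with hypothetical names try_lock and unlock. The loophole mentioned above typically arises when checking for the marker and creating it are two separate steps; the sketch lets the operating system do both in one step.

```python
import os
import tempfile


def try_lock(path):
    """Try to mark a document as being edited.

    Creates '<path>.lock' with O_CREAT | O_EXCL, so the check for an
    existing marker and its creation happen in one step.
    Returns True if the marker was created, False if it already existed.
    """
    try:
        fd = os.open(path + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def unlock(path):
    """Remove the marker after editing has finished."""
    os.remove(path + ".lock")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        document = os.path.join(directory, "index.ghk")
        print(try_lock(document))  # first author obtains the marker
        print(try_lock(document))  # second author is refused
        unlock(document)
```

A marker of this kind does not keep earlier versions of a document, which is one reason a revision control system is the more complete solution.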

Ideas and future developments

The possibilities of our concept seem to be endless. Because of our limited resources we had to stop, at a certain point, implementing all the ideas that came up during our implementation process in order to finish version number one. I want to mention two of the ideas we could not yet implement on our own, to show what has become possible and to inspire the reader of this article to improve and to contribute to SeSAMe.

We dream of online documents which contain a small icon for a print version. When this icon is clicked, the SeL compiler and another masterfile (executing some shell calls) could use the information contained in the *.ghk file to generate an output stream which can be compiled by TeX, then converted to a *.pdf file on the fly and finally be displayed by the Acrobat Reader on the client side. Much of the information contained in the online appearance of a document, like the navigation bar or the icons for online help and a search engine, is of no use when it is printed. The print version may even need additional information stored in additional objects, but more importantly it should appear on paper as the designer of the document wishes. With the proposed strategy a professional and very stable PostScript appearance can be delivered. In combination with our PGP signature this is a method to publish documents of truly secure content and design.

SeSAMe already allows the inclusion of local files into other local files, as Apache’s server side includes do. But we see a much greater potential: among others, *.ghk files have objects containing the author’s name, his email address, labels and URLs for a navigation aid, and the actual content. Imagine that in the head of a *.ghk file we could determine which objects are allowed to be exported to which external servers. Then a WWW server might provide a link to a document located on a distant server while displaying it in the local design. This technique could be used to export, to reuse and to share hypertext branches even inter-continentally, and we would finally gain a style-independent WWW counterpart of what Ted Nelson called ’transclusion’ and originally wanted to implement with his Xanadu project.

Database compatibility

The internal data representation of SeSAMe in our *.ghk files is designed not only to be understood by our SeL compiler but also to be readable for humans. The files can be written ’by hand’ with a text editor on any text console. Nevertheless the field orientation gives them database characteristics that are much more complex and more useful than those of the usual flat-file WWW server directories. We perceive a strong tendency among large and professional WWW services to run websites from databases. These are powerful tools which allow very complex and dynamic output and which are of interest not only in the commercial sector but for all publishers of large online resources. Regarding content and formal network structures, the free growth of HTML material in universities can result in rather chaotic networks. This is not a good starting point for porting WWW material to a database, because it complicates the porting, while formal relations are the key to powerful information handling through databases. In this situation SeSAMe might be the missing link between flat file systems and real databases, because it can be used to reorganize structures and to mark up contents logically in a transition period, while already providing features of large database management systems. In a later database-based service SeSAMe features like dynamic on-the-fly style processing or the authoring frontend will still be of good use.

Conclusion

SeSAMe is more an idea than a particular piece of software. Nevertheless we have developed some programs which might be useful for others who feel that their traditional WWW service might run into style or maintenance problems in future. As our software is based on cross-platform internet standards like HTML and JavaScript on the client side and on C and Perl on the server side, the system is also portable to other server systems. As pointed out above, there is a potential for further developments we could not finish so far. We hope that both the potential for the further development of our system and the common need for server software capable of handling the difficulties of university WWW services might move the reader to visit the SeSAMe homepage, to read about the latest developments and maybe to download, use or even contribute to the software. Now that software development has shifted more and more to the commercial market, we want our project to be understood as an experiment which might turn out to be only of educational interest but which could also be useful to solve the common problems mentioned above and to improve and support intra- and inter-university network authoring.

References

Berners-Lee, Tim (1989): Information Management: A Proposal. http://www13.w3.org/History/1989/proposal.html

Bush, Vannevar (1945): As We May Think. In: Atlantic Monthly, http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm

Nelson, Theodor H. (1995): The Heart of Connection: Hypermedia Unified by Transclusion. Communications of the ACM, 38(8): 31-33

Matthias, Andreas and Fischer, Thomas: The SeSAMe Handbook. Please visit the SeSAMe homepage to find the handbook’s location and latest version.

The SeSAMe homepage: http://www.uni-kassel.de/~sesame

Address

Thomas Fischer
Universität Gesamthochschule Kassel, Abt. VII
Moenchebergstrasse 19
34109 Kassel
Germany
tfischer@exp.psychologie.uni-kassel.de