heitml 2.0 - Object-oriented HTML
1. Introduction
heitml (extended interactive HTML) is a language extension of HTML [W3C98a]. HTML describes static hypertext documents. heitml extends this to server side dynamic documents and complete Web applications as needed for Web-database integration or e-commerce applications. HTML pages are usually downloaded verbatim from a Web server and displayed by a browser. heitml documents on the other hand are processed by the server and transformed into HTML before being sent to the browser.
heitml introduces server-side interactive objects. An interactive object within a Web page displays dynamically generated information, waits for user input, and processes it. Interactive objects are placed on Web pages using the well-known tag syntax of HTML, XML [W3C98b], and SGML. This allows even non-programmers to use server side objects on their pages, and it makes component editors like RADpage possible.
The biggest advantage of heitml is that interactive objects can be combined, nested, parameterized, and programmed very flexibly. These features are crucial to create applications by reusing cooperating interactive objects instead of rewriting everything from scratch each time.
There are class libraries of ready to use heitml objects. All are implemented using the programming language features of heitml. Library objects can be freely combined with user written objects. Also new objects can be created by inheriting from the already existing classes.
This document gives a quick and informal, yet technical introduction to the object-oriented features of heitml and the idea of interactive objects. For an introduction to the general design of heitml see http://www.heitml.com/heitml2.1/lref/lrtext01.hei and http://www.heitml.com/heitml2.1/lref/lrtext00.hei for the language reference. We first have a look at the classical approach of creating Web applications, then we'll introduce the most interesting concepts of heitml, and finally give you some technical details on heitml as a programming language.
2. Comparison to the Classical Approach
Classically Web applications are programmed in a page oriented way. This means programs or page templates with embedded programs are created by the programmer. Each program or template is started when the browser requests a page. Then this single HTML page is generated and sent to the browser.
Adding some server side functionality, for example a database form, requires additions in at least two places: The form needs to be added to the page program that shows the form and a processing routine needs to be added to the following page program.
This hampers the creation and especially the use of reusable components: A user, who just wants to add some server side functionality, must insert the right calls in several places of the program and so requires detailed knowledge of the program and the component.
3. Using Server Side Functionality
3.1. Using heitml Tags and Elements
Just as an HTML Web site consists of HTML pages, a heitml Web site consists of heitml pages. Since heitml is an extension of HTML, an HTML page is already a very simple heitml page. In addition to HTML tags, however, a heitml page can contain server side functionality in the form of so-called objects.
The goal of heitml is to specify the use of server side functionality the same way as browser functionality: by using HTML/XML like tags. This simplifies programming so that even HTML-designers without programming knowledge can use server functionality.
For example a counter can be used on a heitml page by writing

    The counter value is <counter name="test">

or for a counter with a reset button

    This is a <counter name="test" resetbutton>

The counter with the reset button is interactive, i.e. it reacts to a user's response (pressing the reset button). This requires a totally different implementation, since the server must be prepared to handle the user clicking on the reset button. Note, however, that there is no difference in the description: in both cases ordinary HTML syntax is used. Note also that in contrast to a clock or an animation (typical client side objects), the counter is server based, since the counter value must be stored on the server. So pressing the reset button requires an interaction with the server in order to reset the counter value.
Another example is a database display element and a database scroller element:

    This is the database content:
    <dbdisplay relation="customers">
      <dbformat> <dbfield "Name"> <dbfield "Address"> <br> </dbformat>
    </dbdisplay>
    <dbemptyquery> No data found. </dbemptyquery>

This displays a list of customer records on the Web page. The <dbformat> element holds the format of a database record, and the <dbemptyquery> contains the text to be displayed in case the database is empty.
If the tag <dbscroller> is used instead, not all records are shown at once; the representation automatically provides scroll buttons which make it possible to scroll through pages of data. Clearly the implementation of a scroller is much more complex, since it must react to the scroll buttons accordingly. It is, however, completely hidden from the Web author, who only has to know the tag name and parameters, as with standard HTML tags.
Thanks to heitml's libraries, there are many possible new elements like <counter> or <dbdisplay> . Examples are a <dbform> to modify database records or, more application-specific, a shopping basket or a discussion group.
It is very important, however, that tags can be freely combined. Only combination makes it possible to build complex applications. Have a look at the following example:

    <dbdisplay relation="customers"> ... </dbdisplay>
    <dbemptyquery> No data found <counter name="nodata" resetbutton> </dbemptyquery>

The counter now only counts if the database is empty, because it only shows up on the page if the database is empty. This means features of a page change dynamically based on information stored on the server/database and execution context.
It is even more interesting to place the counter inside the <dbformat>. In this case every database record shown is counted, so there are several counters on a single page, each of which could be individually reset. Again, this would be achieved by simply using tags, and no special programming is required. These things may sound simple and straightforward, but they require certain special implementation techniques.
3.2. Object Model
From a simple point of view a server side object looks just like another HTML tag. Indeed HTML users can identify server side objects with their HTML representation. From a programmer's side things are more complex, however, since objects also have a server side representation.
When a heitml page is requested from the server, the (server side representations of the) heitml objects on this page are created in the server. The objects then perform their specific functions, possibly interacting with other objects, and finally generate HTML code (the object's HTML representation) to display themselves in the browser. When the user fires an event, e.g. by clicking on a button or a link of an object, the (server side representation of the) object waiting on the server receives a message and reacts accordingly.
A heitml object that specifically waits for user responses is called interactive. Examples are the counter with a reset button and the <dbscroller>. (The server side representation of) interactive objects must be kept on the server as long as the corresponding page is shown in the browser, while (the server side representation of) non-interactive objects can be freed unless there are other reasons to keep them.
Technically, (server side representations of) heitml objects are similar to objects of other object-oriented programming languages, plus a state handling mechanism and a message passing and dispatch mechanism. State handling happens totally transparently for the programmer and simulates a persistent memory per session. The message passing mechanism is built atop standard CGI invocation, so that an event happening on the client is translated into a GET or POST request and delivered to the server. heitml, which is activated according to the CGI or server API standards, parses the request, retrieves the object, and activates its methods accordingly. However, these technical details are hidden from the user.
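The dispatch path described in this section can be modeled outside heitml. The following Python sketch is not heitml's implementation; the names Session, register, dispatch, and the obj and event request parameters are assumptions made purely for illustration. It shows an interactive object being stored in per-session state when a page is rendered and being found again when a later request names it.

```python
from urllib.parse import parse_qs


class Counter:
    """Hypothetical stand-in for an interactive counter with a reset button."""

    def __init__(self):
        self.value = 0

    def render(self, object_id):
        # Each display increments the counter and emits a link carrying the object id.
        self.value += 1
        return f'{self.value} <a href="?obj={object_id}&event=reset">reset</a>'

    def process(self, event):
        # Called when the user triggers an event on this object.
        if event == "reset":
            self.value = 0


class Session:
    """Simulated per-session persistent memory (an assumption, not heitml's storage)."""

    def __init__(self):
        self.objects = {}
        self.next_id = 0

    def register(self, obj):
        # Assign a unique identification and keep the object until an event arrives.
        object_id = f"o{self.next_id}"
        self.next_id += 1
        self.objects[object_id] = obj
        return object_id

    def dispatch(self, query_string):
        # Parse the request, retrieve the waiting object, and activate its method.
        params = parse_qs(query_string)
        target = self.objects[params["obj"][0]]
        target.process(params["event"][0])
        return target


session = Session()
counter = Counter()
counter_id = session.register(counter)
html = counter.render(counter_id)                    # first page view
session.dispatch(f"obj={counter_id}&event=reset")    # user clicks the reset link
```

A non-interactive object would simply never be registered, so nothing would keep it alive after the page has been generated.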
4. Basic heitml Concepts
So far we have explained how server side objects can be used on heitml pages. We will now explain how server side objects can be programmed in heitml. In fact heitml extends HTML by a full featured programming language. Syntactically this goes well beyond XML/SGML, while still being an integrated document description and programming language. There is a lot to say about heitml; however, we will just highlight the object-oriented features. For an introduction to the general design of heitml see http://www.heitml.com/heitml2.1/lref/lrtext01.hei and http://www.heitml.com/heitml2.1/lref/lrtext00.hei for the language reference.
4.1. Defining heitml Classes
As usual in object-oriented systems you can define classes in heitml. A class defines the functionality (including representation) of a group of objects sharing certain properties. Once a class such as counter is defined, objects of that class can be placed on pages by using the HTML/XML tag syntax.
heitml offers the possibility to define classes using the <def> tag.
A definition of <mytag> with the body "This is to be printed" causes subsequent uses of <mytag> to be replaced by "This is to be printed". More precisely, it defines a class named mytag so that objects of the class show up as "This is to be printed". So when the tag <mytag> occurs on a page, an object of the class mytag is created and displayed as "This is to be printed". Note that this is a minimal object, of course, which is not persistent and can't react to messages.
Below we give the syntax of a tag definition in EBNF. Terminal tokens are printed in bold letters. Nonterminals are written in italics. The Nonterminal heitml stands for any heitml text, Ident stands for an identifier enclosed in quotation marks.
Using this mechanism, text, parts of pages and page layouts, can be packed into objects and be reused several times. To reuse a class on several pages, heitml provides an inclusion mechanism. Definitions can be written into include files which can be reused by other pages. This assures consistency in Web development, a goal which pure HTML simply cannot attain.
Classes can have parameters and parameters might have default values. Default values are taken, if a parameter is not given, when a tag is used. Parameter values can be inserted into the text using the <?> tag:
Parameterization of objects encourages their reuse and makes Web design a lot easier, since one can define one's own high level, generic tags which replace tens of HTML tags in a seamless way.
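As a rough analogy in Python rather than heitml, a parameterized tag behaves like a function whose arguments may have default values and whose result is generated HTML. The function name product_link and its parameters are hypothetical and chosen only for this sketch.

```python
from html import escape


def product_link(name, href="shop.html", css_class=None):
    """Hypothetical parameterized tag: href and css_class have default values."""
    # Escape parameter values before inserting them into the generated HTML.
    class_attr = f' class="{escape(css_class)}"' if css_class else ""
    return f'<a href="{escape(href)}"{class_attr}>{escape(name)}</a>'


# Omitted parameters fall back to their defaults.
default_link = product_link("Shopping area")
custom_link = product_link("New products", href="new.html", css_class="hot")
```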
Definitions can be nested, i.e. there can be a tag definition inside another one. Nested definitions are called methods of the enclosing class and may not contain an <inherit> tag. An inner definition is visible only within the enclosing definition. (There are also nested class definitions in heitml, but they are not of concern in this paper.)
In fact the inherit tag can also be left out in non-nested definitions. In this case they define methods of an implicitly defined class named page. Practically they are often used like procedures of other languages.
Inheritance is the major object-oriented concept used for achieving reusability. Through inheritance it is possible to create several similar objects without specifying each one from scratch.
Classes can inherit the methods of another class. The <inherit> tag is used for this purpose.
It must be placed directly after the <def> tag, and the class given must be the name of a preceding class definition, which is then called the superclass. Inheritance means that all the methods of the superclass become visible, unless they are overridden by a new method definition.
The body of a def tag is also seen as a method named constructor. If a class is defined with the <defclass> tag (instead of <def> or <defenv> ) then it inherits the constructor.
The following example uses a fictional home page of an online computer store. Actually the computer store wants to have two home pages, one for new customers (left figure) and one for existing customers (right figure). Both pages differ slightly. The following code is used to specify this using inheritance. <homepage> shows the home page for new customers and <nexttime> shows the home page for existing customers.

    <def homepage><inherit object>
      <def welcome>Welcome to our home page.</def>
      <def news></def>
      <h1> Computer Shop Inc. </h1>
      <p> <welcome> Computer Shop offers a variety of computer products, well suited for any application.
      <p> Please <a href="intro.html">learn more about us</a><news> or go to our <a href="shop.html">shopping area</a>.
    </def>
    <defclass nexttime><inherit homepage>
      <def welcome>Thank you for visiting us again.</def>
      <def news>, <a href="new.html">view new products available</a>, </def>
    </defclass>

It should be immediately clear how the <homepage> object works. Since <nexttime> inherits the constructor method from <homepage>, it displays the same basic layout of the home page. However, <welcome> and <news> are overridden by the definitions inside <nexttime>, which is why <nexttime> shows up differently.
In this case, traditionally two homepages have to be maintained. Using inheritance the homepage must be worked out just once. This is a small example of how object orientation and inheritance reduce the code size by reuse. In larger systems the savings are much higher.
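For readers more familiar with conventional object-oriented languages, the same structure can be written as a template method in Python. This is only an analogy under the assumption that the inherited constructor body corresponds to a shared render method; the class and method names are chosen for this sketch and are not part of heitml.

```python
class HomePage:
    """Home page for new customers; welcome and news are overridable methods."""

    def welcome(self):
        return "Welcome to our home page."

    def news(self):
        return ""

    def render(self):
        # Plays the role of the inherited constructor body: one shared layout.
        return (
            "<h1>Computer Shop Inc.</h1>\n"
            f"<p>{self.welcome()} Computer Shop offers a variety of computer "
            "products, well suited for any application.\n"
            f'<p>Please <a href="intro.html">learn more about us</a>{self.news()} '
            'or go to our <a href="shop.html">shopping area</a>.'
        )


class NextTime(HomePage):
    """Home page for existing customers: only the two small methods change."""

    def welcome(self):
        return "Thank you for visiting us again."

    def news(self):
        return ', <a href="new.html">view new products available</a>,'


new_customer_page = HomePage().render()
returning_customer_page = NextTime().render()
```

The layout is written once in render; the subclass contributes only the two fragments that differ.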
4.4. Environments
We call an element that consists of a start-tag, the content, and an end-tag an environment. Environments are defined using the <defenv> tag. They can be classes when defined with <inherit> and methods otherwise.
The heitml text between <defenv> and </defenv> is called body and may contain the <defbody> tag. When the environment is used on a Web page, the whole environment is replaced by the body. Inside the body the <defbody> tag is replaced by the content.
Inheritance works for environment definitions the same way as for normal classes.
4.5. Environments using Methods
Additionally, inside the content of an environment, methods of the environment class can be used.

    <defenv gl>
      <def li></tr><tr><td><img src="mybullet.jpg"></td><td></def>
      <table ><tr ><td ></td><td ><defbody></td></tr> </table>
    </defenv>

    <gl> <li> Topic 1 <li> Topic 2 </gl>

The <gl> tag defines a list, like <ul>, but uses an image as bullet. As usual, the <li> tag can be used to create list entries inside the <gl> tag.
Executing the <gl> tag creates a table. The first column contains the bullet and the second column the text. The <li> tag is redefined to achieve this functionality. A <li> tag can occur several times in a document, sometimes within a <ul> or <ol> and sometimes within a <gl>. The redefinition of <li> as a method of <gl> ensures that <li> is redefined within a <gl> only.
This heitml language feature is very powerful, because different tag definitions can be used in each environment.
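The scoping rule can be illustrated with a small Python model. It is a sketch under the assumption that tag lookup searches the methods of the enclosing environment before the page-level definitions; the dictionaries and function names are hypothetical and the generated table markup is simplified.

```python
from collections import ChainMap

# Page-level definition: <li> renders as an ordinary HTML list item.
page_tags = {"li": lambda text: f"<li>{text}</li>"}

# Definitions that are methods of the hypothetical gl environment.
gl_methods = {
    "li": lambda text: f'<tr><td><img src="mybullet.jpg"></td><td>{text}</td></tr>'
}


def render_items(items, scope):
    # Resolve the li tag in the given scope and apply it to every item.
    return "".join(scope["li"](item) for item in items)


def render_gl(items, outer_scope):
    # Inside the environment its methods shadow outer definitions of the same name.
    inner_scope = ChainMap(gl_methods, outer_scope)
    return "<table>" + render_items(items, inner_scope) + "</table>"


inside = render_gl(["Topic 1", "Topic 2"], page_tags)
outside = "<ul>" + render_items(["Topic 1"], page_tags) + "</ul>"
```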
4.6. Interactive Objects
An interactive object stays on the server, waiting to process user input. For example, database forms or database scrollers are interactive objects. They are initially displayed on the page, and they contain links or buttons the user can click to request a certain function. This needs to be programmed in the form of a procedure that we call the processing routine.

    <defenv InteractiveLink; inherit Interactive>
      <def process x>
        <message>Hello World</message>
        <return null>
      </def>
      <registerme>
      <a href="...">Click here</a>
    </defenv>

<registerme> assigns the interactive object a unique identification. This identification must be passed on as a URL parameter in the <a> tag.
Normally users will not define interactive links themselves but derive a class from the library class <iaLink>:

    <defclass mylink><inherit iaLink>
      <def process x>
        <message>Hello World</message>
        <return null>
      </def>
    </defclass>

and then use the resulting <mylink> tag on their pages.
5. The heitml Programming Language
The heitml concepts presented so far extend HTML by the use and the definition of classes, methods, and inheritance. Interestingly, all these things can be used beneficially without actually programming (in the sense of control flow and state).
In addition, heitml is a complete object-oriented programming language. This means that not only adaptive Web pages but complete applications can be built. It is therefore a hybrid language combining document description and programming in one language. The following is a very brief introduction, and it is far from complete. Its purpose is to give you an impression of heitml and to make the examples later in the paper understandable. The complete language reference is at http://www.heitml.com/heitml2.1/lref/lrtext00.hei.
5.1. heitml Syntax Extensions
heitml features three important syntax changes that apply to heitml tags:
5.2. heitml Semantic Overview
A heitml page is seen as a program and is processed in textual order. Ordinary text is just displayed and HTML tags work as usual (when the output of heitml is sent to a browser).
heitml introduces a variety of new tags directly corresponding to common language constructs. There are control structures, e.g.
and expression handling tags to evaluate a variable, to insert the content of a variable into the document, and to assign some generated text to a variable.
In addition, expressions can be used as parameters for user-defined tags.
Finally, there are tags for accessing the database using SQL.
5.3. Expressions
heitml variables are dynamically typed. Possible types are Boolean, integer, real (double precision), string, and object. There are the usual operators and a lot of built-in functions.
Variables do not need to be declared. They are always local to the current page or the current definition. In addition there are global variables, which always have to be written as gl.variablename, and session variables, written as se.variablename. Session variables keep their values between page accesses. Finally, a special kind of predefined variable, the Server variables, allows the programmer to retrieve information about the server, the underlying operating system, and the CGI request which caused the execution of a page.
5.4. The Object Datatype
The object data type is most interesting in heitml since it provides the full functionality of associative arrays. It covers the record/struct/object datatype as well as the array datatype of other languages.
An object consists of several fields. Each field can contain a value of any type. The fields are ordered and numbered starting with 0. Each field can have a field name, which must be unique. If o is an object, then o.name accesses the field with the given name. o[i], where i is an expression leading to an integer, accesses the i-th field. o[s], where s is an expression leading to a string, accesses the field with the name given by s. Associative arrays are a powerful concept since they allow addressing of subobjects in a very flexible way.
Objects are fully dynamic, so fields can be added and deleted etc. Multidimensional arrays or arrays of objects can be realized by using objects as field values. Objects use reference semantics, as is common in object-oriented languages.
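The field model described above can be approximated in Python. The class FieldObject below is a hypothetical illustration, not heitml's data structure; it models only the bracket forms o[i] and o[s] (not the o.name form) and shows ordered fields, optional unique names, nesting, and reference semantics.

```python
class FieldObject:
    """Hypothetical model of an object with ordered, optionally named fields."""

    def __init__(self):
        self._values = []   # field values in order, numbered from 0
        self._names = {}    # unique field name -> field number

    def add(self, value, name=None):
        # Append a new field; a name, if given, must not be in use already.
        if name is not None:
            if name in self._names:
                raise KeyError(f"duplicate field name: {name}")
            self._names[name] = len(self._values)
        self._values.append(value)

    def _index(self, key):
        # An integer selects the i-th field; a string selects the field by name.
        return key if isinstance(key, int) else self._names[key]

    def __getitem__(self, key):
        return self._values[self._index(key)]

    def __setitem__(self, key, value):
        self._values[self._index(key)] = value

    def __len__(self):
        return len(self._values)


customer = FieldObject()
customer.add("Ada", name="Name")
customer.add("12 Example Road", name="Address")

orders = FieldObject()                  # unnamed fields behave like an array
orders.add(17)
orders.add(42)
customer.add(orders, name="Orders")     # objects as field values give nesting

alias = customer                        # reference semantics: one shared object
alias["Name"] = "Grace"
```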
Each object belongs to a class that is determined when the object is created. An object is created either by directly calling a constructor function (named like the class), or by using the class name as a tag. Object methods can be called using the usual syntax x.methodname(parameters), where x is an expression evaluating to an object.
5.5. Class Definitions revisited
The body of a class definition implicitly defines a display/create method. When the class is called (i.e. a tag with the class's name is processed), an object of the class is created and the display/create method is executed.
Inside each method, the object can be referred to by the keyword this, so object fields can be accessed by writing this.fieldname. The following example clarifies this. <al> works like <gl> from 4.5 but marks the list items with letters. <al> uses an object variable to keep track of the letter to be taken next.

    <defenv al>
      <def li></tr><tr><td><? chr(this.char); this.char=this.char+1>)</td><td></def>
      <let this.char=asc("a")>
      <table ><tr ><td ></td><td ><defbody></td></tr> </table>
    </defenv>

The object vanishes unless this is assigned to a global data structure. It is possible to assign the object to a session variable and to later call a method of the object using the programming language method call. This is the way interactive objects work.
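In Python terms, the role of this.char corresponds to an instance attribute that is initialized when the object is created and advanced by each call of the li method. The class AlphaList is a hypothetical counterpart written for this sketch, with simplified table markup.

```python
class AlphaList:
    """Hypothetical Python counterpart of the al environment."""

    def __init__(self):
        # Corresponds to <let this.char=asc("a")> in the body.
        self.char = ord("a")

    def li(self, text):
        # Emit the current letter, then advance the per-object counter.
        letter = chr(self.char)
        self.char += 1
        return f"<tr><td>{letter})</td><td>{text}</td></tr>"

    def render(self, items):
        return "<table>" + "".join(self.li(item) for item in items) + "</table>"


first = AlphaList().render(["Topic 1", "Topic 2"])
second = AlphaList().render(["Other"])   # a new object starts again at "a"
```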
5.6. Interactive Objects
The prototype of an interactive object is a database form that displays a database record and allows the user to update it. Such a form could look like

    <dbform ... > ... <fieldtext "Name"> <fieldtext "Email"> ... </dbform>

The form consists of the <dbform> environment with some parameters identifying the record to be shown. The content describes the layout of the form; <fieldtext> displays a field of the database record.
Simplified implementations of <dbform> and <fieldtext> are given below. No error checks are performed and the database record must always exist. <dbform> inherits the method registerme from the class Interactive. registerme saves the object in a session variable and makes sure that the process method of the object is called when the user clicks the submit button of the form. sesform provides for object persistency across HTTP requests of the page.

    <defenv dbform rel key; inherit Interactive;
      def process inp;
        dbupdate> update <? this.rel> set <? inp ",%Qn"> where <? this.key "AND%Qn"> < /dbupdate;
      /def
      def fieldtext name size;
        ><input name="<? this.bid>.<?name>" size=<?size> value=this.record[name]>< /def
      sesform; this.rel=rel; this.key=key;
      dbquery q> select * from <? rel> where <? key "AND%Qn"> < dbrow; this.record = q; /dbquery;
      registerme;
      defbody >
      <input name="<? this.bid>Submit" type="Submit" value="Submit">
      < /sesform; /defenv>

The class <dbform> is defined with two methods, fieldtext and process. The create/display routine performs a database select statement to read the record. It assigns the complete database record to a field of the object so that all following calls of fieldtext can access the values.
<fieldtext> creates an HTML input field. Its initial value is the field value contained in the database. The field name itself is prefixed by this.bid which is generated by registerme so that input fields of different objects receive different names. The process method gets one parameter. It contains all input fields belonging to the object. It performs the appropriate database update.
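The two mechanisms described here, prefixing input names per object and turning the submitted fields into an update statement, can be sketched in Python with the standard sqlite3 module. This is an illustration only: the helper names, the table layout, and the prefix b1 are assumptions, and the format strings in the listing are taken here to produce comma-separated and AND-separated name/value pairs.

```python
import sqlite3


def fields_for(bid, form):
    # Keep only the inputs whose names carry this object's prefix, e.g. "b1.Name".
    prefix = bid + "."
    return {k[len(prefix):]: v for k, v in form.items() if k.startswith(prefix)}


def quote_ident(name):
    # Quote a table or column name for SQLite by doubling embedded double quotes.
    return '"' + name.replace('"', '""') + '"'


def process(conn, rel, key, inp):
    # Build "UPDATE rel SET c1 = ?, ... WHERE k1 = ? AND ..."; values are bound separately.
    set_part = ", ".join(f"{quote_ident(c)} = ?" for c in inp)
    where_part = " AND ".join(f"{quote_ident(c)} = ?" for c in key)
    sql = f"UPDATE {quote_ident(rel)} SET {set_part} WHERE {where_part}"
    conn.execute(sql, [*inp.values(), *key.values()])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, Name TEXT, Email TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'ada@example.org')")

# Two form objects on one page submit their fields under different prefixes.
form = {
    "b1.Name": "Grace",
    "b1.Email": "grace@example.org",
    "b1Submit": "Submit",
    "b2.Name": "belongs to another form",
}
process(conn, "customers", {"id": 1}, fields_for("b1", form))
row = conn.execute("SELECT Name, Email FROM customers WHERE id = 1").fetchone()
```

Binding the values as parameters, rather than pasting them into the statement text, is a choice made in this sketch; the source does not say how heitml quotes values.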
The heitml system includes a browser-based WYSIWYG Component Editor called RADpage. A component is nothing other than a heitml object that can be edited by the Component Editor.
RADpage displays an ordinary heitml page in the browser and attaches handles to every component on the page. Then the user can click on the handle to modify the attributes of a component. It is also possible to select new components from a catalog and add them to an existing page.
To program a component first an ordinary heitml class needs to be programmed to perform the desired function. Then this class needs to inherit the class Component or SimpleComponent.
For SimpleComponents, this is all that needs to be done. For Components an additional component description file needs to be created, usually named 'com_Classname'. The component description file contains documentation and help texts to be displayed by RADpage. It also describes the possible values and the desired formatting of the component attributes.
Component editing is a very powerful example of what can be achieved using heitml. Web site maintenance can be done using just a browser and component editing: RADpage, in fact, changes the heitml pages as a user adds, edits, or deletes components. And programming becomes a real WYSIWYG experience, since a component inserted on a page immediately starts to function, even if the page is in editing mode.
6. Implementation Issues
heitml can work as a CGI program or as a Web server extension using the ISAPI or Apache API interfaces. heitml is first translated using state of the art compiler techniques [EmmGro90] into an intermediate representation and is then interpreted.
When a user requests several pages from a server, (some) variables in heitml keep their values, including all objects they refer to. Keeping state is a crucial feature for many applications. Interactive objects also depend on it, because such objects must be preserved so that they can later process the user's response.
In CGI applications and Web applications it is not possible to keep data from one page access to another. CGI scripts are started anew for each request and so lose all memory. When using the API of a multi-process server like Apache, page requests for the same user might be handled by different tasks. In multi-threaded servers like the MIIS, variables are kept, but all programs must be reentrant, since they must be able to process multiple requests in parallel.
Here comes the specific advantage of a language implementation for the Web. heitml abstracts from the underlying mechanisms and simulates a simple persistent memory for the author, although heitml is available for all the interfaces described above.
After each page access heitml performs a kind of garbage collection to find out which objects need to be kept. These are written to file and read in again when needed. This way user sessions can have very long timeouts and are persistent even beyond server crashes.
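The idea of keeping only what is still reachable and writing it to a file can be sketched as follows. This Python model is an assumption-laden illustration, not heitml's storage format: the Node class, the JSON layout, and the function names are invented for the example.

```python
import json
import os
import tempfile


class Node:
    """Hypothetical server-side object that may refer to other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []


def reachable(roots):
    # Mark phase: collect every object that can be reached from the session roots.
    found, stack = {}, list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) not in found:
            found[id(obj)] = obj
            stack.extend(obj.refs)
    return list(found.values())


def save(session_vars, path):
    # Number the surviving objects and write them, with their references, to a file.
    kept = reachable(session_vars.values())
    number = {id(obj): i for i, obj in enumerate(kept)}
    data = {
        "objects": [
            {"name": o.name, "refs": [number[id(r)] for r in o.refs]} for o in kept
        ],
        "roots": {var: number[id(obj)] for var, obj in session_vars.items()},
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)


def load(path):
    # Rebuild the object graph for the next request of the same session.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    objects = [Node(entry["name"]) for entry in data["objects"]]
    for obj, entry in zip(objects, data["objects"]):
        obj.refs = [objects[i] for i in entry["refs"]]
    return {var: objects[i] for var, i in data["roots"].items()}


form, record, scratch = Node("form"), Node("record"), Node("scratch")
form.refs.append(record)          # the form keeps its database record
session_vars = {"form": form}     # scratch is referenced by no session variable

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "session.json")
    save(session_vars, path)
    restored = load(path)

kept_names = sorted(o.name for o in reachable(restored.values()))
```

Because the state lives in a file rather than in process memory, it does not matter which server process handles the next request, which matches the motivation given above.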
7. Software Engineering Aspects
More and more Web sites turn out to be large software systems consisting of HTML pages and HTML generation programs. There are well-known principles to support the creation of large software systems, to enhance maintainability, and to reuse software parts. Encapsulation, information hiding, and abstraction are familiar mechanisms [Parnas72]. Currently, HTML, and most of its derivatives, lack these mechanisms. There is no way to structure a Web site in HTML in a top-down or bottom-up manner, because HTML supports neither encapsulation, nor information hiding, nor abstraction. There is no way to hide unnecessary technical details and no way to reuse parts several times.
Many applications require HTML pages to change dynamically. There are many approaches to generate pages or their parts from programs (the two-level language approach). This approach sacrifices the simplicity of HTML that makes it intuitive for thousands of non-programmers, graphic designers, and text authors. It also creates unnecessary complexity for programmers, since two different languages have to be used. This often enforces a bad program structure, scattering logically connected parts throughout the system, since the abstraction mechanisms of the programming language cannot be used for the HTML part.
On the other hand, object-oriented languages have had tremendous success because of their ability to create large, maintainable, event-driven systems [Meyer88]. In object-oriented languages, classes encapsulate state and provide clean interfaces. Together with inheritance and genericity this provides enormous potential for reuse.
heitml has been designed to enrich HTML with these advantages. heitml is an object-oriented programming language which smoothly combines the markup features of HTML with modern concepts such as inheritance, operational methods, and user-defined elements. Classes describe tags that can be used to place user-defined objects on Web pages. A heitml class generates HTML code, expanding its program to a final text. A heitml class can be reused and extended which allows designing Web page systems in a modular, class-based way.
8. Conclusion
We introduced heitml, an object-oriented HTML language extension. Web authors can use it to create Web sites in an object-oriented structured manner or to create complete Web-based application programs.
User-defined elements are seen as classes and can be used to place objects on Web pages. Classes and objects are a powerful mechanism to structure a Web site, to enable reuse by inheritance and enhance maintainability. Objects can react to user input and automatically perform actions on the Web-/database server, thereby replacing CGI functionality.
The paper demonstrated how HTML is extended on the server-side, without modification of the browser or the Internet protocols. heitml makes it possible to create well structured, interactive and maintainable Web sites right now without complex and slow CGI programming.
Copyright (C) 1996-2004 by H.E.I. Informationssysteme GmbH and suppliers, all rights reserved.