DPT Web Server Admin and Programming Guide

Database Programmer's Toolkit

Web Server Admin and Programming Guide



The terms "Model 204" and "204" are trademarks of Rocket Software Inc., and that fact is acknowledged wherever those terms are used in this document. The same applies to the term "Janus", which is a trademark of Sirius Software.



Overview

Features

Web application development on DPT?

Unlike most of DPT, the web server is not a direct emulation of a Model 204 feature. Having said that, there is great similarity with parts of Sirius's Janus Web suite, which is no real surprise as the purpose is the same. (And if you're going to copy someone, who better than the Sirius guys?) The DPT web server is somewhat more straightforward than Janus Web, which makes it easier to configure and write applications for. This is of course in addition to the fact that working with HTML, pictures, documents etc. is easier on the PC anyway.

In reality, although DPT web code is not 100% portable to Janus or any other platform, you'll probably find that much of the code you write in a web application is in standard formats such as HTML. While the exact mechanics of serving up that HTML differ between web servers, the general principles of HTTP messaging are the same, and web application code has a similar structure in any language on any platform. As well as User Language programming notes, this guide contains assorted background information which applies to web programming in general, for example a summary of HTTP message processing, which is fundamental to the whole exercise. So writing a web application on DPT is as good an introduction to this area as using any other platform, and probably easier than many because you can do everything together through the same user interface.

Static and dynamic web page content

The thing which means we can talk about "web-based applications" rather than just "web sites" is the ability for users to interact with the pages in their browser and see different information based on the actions they take. Ignoring the various client-side "DHTML" concepts for the moment, this usually means that instead of simply serving up HTML files held at its end, the web server runs some kind of user-written "scripts", and serves the output from these scripts. Whatever language they're written in, the scripts will usually be creating HTML on the fly, but they might generate output in any other form that the browser end of the application requested, such as plain text or applets. On DPT these scripts are written in User Language.

Server daemons

This is another issue common to all web servers, which must be prepared to handle huge numbers of potentially very small requests, where socket connection and server thread initialization might become a significant overhead. DPT addresses this concern by spawning special daemon users which are permanently initialized and ready to service web requests. These daemons can run with database files permanently open, globals permanently set, and so forth, which cuts down on per-request overheads. When a browser requests a dynamic web page, your script (User Language handler procedure) for the page is run by one of the server daemons.

In most respects these daemons are just regular user threads. The one big difference is that they have access to an extra output "destination" in addition to the normal terminal output and USE destinations. This is the network connection back to the browser, with data sent there becoming the content of the requested web page. The programming section below describes various ways to write information to the browser, and also various ways you can easily code, test and debug your scripts, with or without involving the server daemons.


System Configuration

Default and demo configurations

The chances are that as you read this your DPT installation is the one that comes with the demo download, in which case the web server is automatically enabled and should work straight away - just start the DPT host and point your browser at "localhost" or "127.0.0.1".

However, if you make a fresh installation by starting dpthost.exe in a clean directory, the default is for the web server not to be enabled. This section describes various system level actions you can take to enable and configure it.

Enabling the web server

For the web server to work, the following two parameters need to be changed from their default settings.

Daemon processing and parameters

All daemons on DPT run with a single input command line. In the case of the web server daemons this command is =PSTWEB, a custom command similar to the PSTxxx commands used for checkpointing etc. The functioning of the command can be summarised as follows:
  1. The daemon opens a standard procedure directory named by WEBPROC.
    This is the default directory for procedures generating dynamic content.
    It is allocated automatically (i.e. you do not have to code an ALLOCATE command for it anywhere) but with a file name which you can define.
  2. The daemon (first one only) includes an initialization procedure.
    This can contain any one-off commands required to configure your web application.
  3. The daemon includes a login procedure.
    In this proc you might typically place OPEN and RESET commands and perhaps set up some globals. This reduces the overhead of doing these things for every incoming request, since the daemon retains a continuous session between requests.
  4. The daemon then makes subtle alterations to its settings for routing line output etc. in preparation for creating dynamic content.
    This is discussed later in the programming section.
  5. Finally the daemon starts polling the queue of waiting browser connections, picking them up as they arrive and sending appropriate responses.
  6. Server daemons waiting for work show in the MONITOR command as custom wait state 94.
Fine control can be exerted over daemon operation using various parameters. Follow the hyperlinks for details about:

Managing "content type" reference data

In the HTTP protocol, it is usually considered the responsibility of a server to tell a client (e.g. browser) how to interpret the data inside each file it serves. For example, "here is the file you requested 'beach.gif', and it contains a GIF format picture". Clearly the browser could have guessed that itself from the file extension, and most browsers these days do that, and also have an array of other techniques they apply by looking inside the data (after all, being flexible on this issue is part of their competitive advantage over other browsers). However, the convention that servers describe served content is still valid, even if many browsers just ignore the information, so it is something worth mentioning.

When serving static file content, DPT performs a lookup of the file extension in a cross-reference table, and returns a standard content type description if it knows one. Otherwise it is left to the browser to decide. The =MIME command is used for maintenance of this reference information.

When serving dynamic content, the browser will have requested a resource ending in e.g. ".dptw", which does not have a standard meaning. The content will depend on the request-handler procedure that you write, and while it will usually be HTML (the default), it might be of any type, or even "multipart" data containing more than one type. Whatever it is, your code is responsible for sending an appropriate content type description unless you have confidence that all your clients will be using browsers that can make a correct guess. The method by which the handler procedure specifies the content type may or may not involve validation against the lookup table (see later).
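
The static-content lookup described above can be pictured with a short Python sketch. This is only an illustration of the idea, not DPT code: the table contents and the function name `content_type_for` are assumptions, and the real cross-reference table is whatever has been set up with the =MIME command.

```python
import os

# Hypothetical extension-to-content-type table. The real DPT table is
# maintained with the =MIME command; these entries are just examples.
CONTENT_TYPES = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
}

def content_type_for(filename):
    """Return a content type for a static file, or None if the extension
    is unknown (in which case the browser is left to decide)."""
    extension = os.path.splitext(filename)[1].lower()
    return CONTENT_TYPES.get(extension)

print(content_type_for("beach.gif"))
print(content_type_for("notes.xyz"))
```

An unknown extension yields no content type at all rather than a guess, which mirrors the "left to the browser" behaviour described for static files.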

Other web server config issues

Security
In the initial release (V2.0) the web server has no specific support for security-related features like HTTPS/SSL, or hooks into Windows login services, which is in line with the generally relaxed DPT security philosophy. However, since a web application is by nature public, you may wish to consider the following simple measures that can be taken.


Programming for Dynamic Page Content

Following on from the earlier notes, this section describes how to use a small set of specialized new User Language features to create dynamic web page content and control how it's sent to the browser. Various other issues to do with building web pages are also covered, and it's therefore quite a long section, as follows. (Note: There is a much shorter tutorial chapter in Appendix 3 if you're in a hurry ...)

Request start and completion points

In response to an HTTP request for a resource with the "dynamic" URI extension (default ".dptw"), a server daemon includes a procedure or executes a command. The command case is really a simple form of the procedure case, so the following discussion talks about procedures, and returns to commands briefly towards the end.

The name of the procedure that gets included is the URI minus the extension (on disk this will probably be a .txt file, since that is the default extension for procedures on DPT). The procedure is by default assumed to reside in WEBPROC, but you can request procedures in other directories too.

www.elsewhere.com/script1.dptw                   //assumed to be in WEBPROC
localhost/myprocs/script1.dptw                   //a different directory - MYPROCS
127.0.0.1/perm group g1/script1.dptw             //a directory group

The server daemon will need the specified file or group open, so this is a case for putting an OPEN command in the daemon login proc. WEBPROC will always be open if it exists. Note that since the inclusion of a procedure requires that it resides in an M204-style allocated proc directory, you can't run arbitrarily-located scripts like, say, "C:/proc.dptw" unless you allocate those directories as "DDs". Only one level of indirection is allowed, i.e. to name an alternate allocated directory from WEBPROC.
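
As a rough illustration of the naming rule, the following Python sketch splits a request path into a directory and a procedure name. It is not how DPT is implemented: the function name `resolve_handler` is hypothetical, and uppercasing the directory name is an assumption based on the "MYPROCS" comment in the examples above.

```python
def resolve_handler(uri_path, dynamic_ext=".dptw", default_dir="WEBPROC"):
    """Split a request path into (directory, procedure name).

    Returns None for paths that are not dynamic requests or that use
    more than one level of directory indirection.
    """
    path = uri_path.strip("/")
    if not path.lower().endswith(dynamic_ext):
        return None  # static content, not handled here
    path = path[: -len(dynamic_ext)]  # the URI minus the extension
    parts = path.split("/")
    if len(parts) == 1:
        return (default_dir, parts[0])
    if len(parts) == 2:
        # Assumption: directory names are treated as uppercase
        return (parts[0].upper(), parts[1])
    return None  # only one level of indirection is allowed

print(resolve_handler("/script1.dptw"))
print(resolve_handler("/myprocs/script1.dptw"))
```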

Once the procedure has been included, the web page request processing is deemed to be complete when the daemon returns to command level out of the procedure. You can also trigger early completion from partway through the procedure using $WEB_FINISH. In either case the results are returned to the browser and no further processing happens for that HTTP request. Everything that happens inside the procedure is up to you, and you can use any combination of commands, User Language, subsystem invocations etc.

The server daemons have a continuous session which they keep going while servicing many requests - the idea being to remove per-request overheads in the form of repeated file openings/closings, parameter and global settings, and so forth. However, note that some things *are* reinitialized for each HTTP request, such as the dynamic content buffer (obviously), and also the daemon's message history ($ERRMSG, $FSTERR, etc.) and other terminal output history (PRINT etc.).

Request parameterization methods

Dynamic content requests will often be parameterized in some way, to take account of user input or preferences. Parameters can be passed from HTML input forms in three main ways, with DPT providing similar $functions for retrieving the parameters in each case:

Parameterization method    Typical use                            Retrieval $function
1. URI parameters          Many simple retrieval request forms    $WEB_GET_URI_PARM
2. Data sent with POST     More complex forms                     $WEB_GET_POST_PARM
3. HTTP header fields      Control information                    $WEB_GET_HEADER

1. URI parameters
This is the simplest and easiest-to-understand way of passing parameters from an input form to a web server. When the user fills in the form, the browser appends the parameter names and values to the requested URI after a question mark, and then navigates to e.g.:

www.nerdstuff.com/search.dptw?product=compilers&language=M204

As well as being simple, this method is user-friendly from a browsing point of view, since an entire parameterized URI such as the one above can be bookmarked, and presents no problems to the browser's "forward" and "back" functions.

The host-side procedure retrieves the parameter values set like this using the $WEB_GET_URI_PARM function.
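
To make the mechanics concrete, here is a Python sketch of what any server has to do to pull a named value out of such a URI. The helper name `get_uri_parm` is only loosely modelled on the $function, and returning an empty string for a missing parameter is an assumption, not documented DPT behaviour.

```python
from urllib.parse import parse_qs, urlsplit

def get_uri_parm(uri, name, default=""):
    """Return the first value of a named URI parameter, or a default."""
    query = urlsplit(uri).query          # the text after the question mark
    values = parse_qs(query).get(name)   # name -> list of decoded values
    return values[0] if values else default

uri = "www.nerdstuff.com/search.dptw?product=compilers&language=M204"
print(get_uri_parm(uri, "product"))
print(get_uri_parm(uri, "language"))
```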

2. Data sent with a POST request
This technique comes into its own when the requirements in a particular situation do not suit the above. For example:

In such cases the input form can simply switch from using method="GET" to method="POST" in its <FORM> tag. The main difference in HTTP message terms is that a GET request consists of a URI and some headers, whereas a POST request additionally has an attached data section, containing the values entered by the user on a form, and/or file contents.

The parameters are retrieved in a similar way but with a different $function, $WEB_GET_POST_PARM. Despite the apparent complexity of this parameter-passing scheme compared to the previous one, your server-side handler code can look exactly the same apart from the name of the $function. The differences are handled automatically by the browser at one end and DPT at the other.

The data sent can be of an arbitrary size, which presents a problem to processing in User Language where the largest data object size is usually 255 bytes. If you expect data longer than 255 bytes the answer is to increase the STRINGMAX parameter value and declare the receiving variables with sufficient LEN to take the incoming values.
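
The difference between the two request styles is easiest to see in the raw messages. The Python sketch below builds a simplified GET and POST request for the same form values; the function name `build_request` and the host name are illustrative, and real browsers send many more headers than shown here.

```python
from urllib.parse import urlencode

def build_request(method, path, params, host="localhost"):
    """Build a simplified raw HTTP request carrying form parameters."""
    encoded = urlencode(params)
    if method == "GET":
        # Parameters travel in the URI; there is no data section.
        lines = [f"GET {path}?{encoded} HTTP/1.1", f"Host: {host}", "", ""]
        return "\r\n".join(lines)
    # Parameters travel in the data section after the blank line.
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(encoded)}",
        "",
        encoded,
    ]
    return "\r\n".join(lines)

get_message = build_request("GET", "/search.dptw", {"product": "compilers"})
post_message = build_request("POST", "/search.dptw", {"product": "compilers"})
print(get_message)
print(post_message)
```

In both cases the values are encoded the same way, which is why the handler code can look alike on the server side.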

3. HTTP header fields
Most of the header fields in HTTP requests and responses are for low-level control of the protocol by the browser and server. While applications can use custom header fields to parameterize requests in any way they see fit, the most common times a host-side script would access header information are when interacting with browser features such as content caching and cookies.

All the headers on an incoming request are accessible within User Language using the $WEB_GET_HEADER function. Cookies and caching are discussed in more detail later.
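
For background, header lines have the form "Name: value", and header names are not case-sensitive. The following Python sketch shows one way a server might store and look them up; `parse_headers` and `get_header` are hypothetical helpers, not DPT functions.

```python
def parse_headers(header_lines):
    """Turn 'Name: value' lines into a dict keyed by lowercased name."""
    headers = {}
    for line in header_lines:
        # Split on the first colon only, since values may contain colons
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

def get_header(headers, name, default=""):
    """Look up a header value without regard to the case of its name."""
    return headers.get(name.lower(), default)

headers = parse_headers([
    "Host: localhost",
    "If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT",
    "Cookie: user=jane",
])
print(get_header(headers, "HOST"))
print(get_header(headers, "if-modified-since"))
```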

Preparing a response

Now we've talked all about how our code is invoked and how we can see exactly what the browser asked for, let's move on to what we send back and how. In HTTP terms there are three main components to DPT's response.

The appendix contains some lower-level information about how DPT prepares its HTTP response message.

Writing to the HTTP data buffer

Output to be sent to the browser is assembled in a buffer before finally sending it in a single HTTP response message when the script finishes. There are several ways of writing to the buffer, as follows.

1. Line by line
A simple way to generate textual data such as HTML is to print it out line by line. To this end DPT provides a special variation of the PRINT statement, WEBPRINT, which works exactly like PRINT except that the output goes to the HTTP data buffer instead of the normal output destination. Here is an entire web page generated using this method:

B
WEBPRINT '<HTML>'
WEBPRINT '<TITLE>Employee Information</TITLE>'

%N = $WEB_GET_URI_PARM('NAME')
IN EMPLOYEE FR WHERE NAME = %N
    WEBPRINT '<P>' WITH FIRSTNAME AND SURNAME
    WEBPRINT '<BR><IMG SRC="photos/' SURNAME '_' FIRSTNAME '.jpg">'
END FOR

WEBPRINT '<! Standard page small print>'
WEBPRINT '<P><FONT SIZE = 1>The information on this page is copyright ...</FONT>'
WEBPRINT '<P><FONT SIZE = 1>blah yawn blah</FONT>'
WEBPRINT '</HTML>'
END
WEBPRINT statements append a newline separator sequence (CRLF) to the end of the string placed in the HTTP buffer, so the above code generates readable HTML. Note also that like AUDIT, SET TRAILER etc., if you end the expression with an ellipsis (...) the output line must be continued with actual PRINT statements rather than WEBPRINT.

2. Rerouting normal output to the HTTP data buffer
The WEBPRINT statement is provided to improve code readability by distinguishing it from regular PRINT, but you can use PRINT if you prefer. This is achieved using the $WEB_USE function, which redirects everything that would normally appear as "terminal" line output (UL PRINT, messages, command output, etc.) to the HTTP data buffer.

This method can be handy to generate error diagnostics with existing code which uses PRINT statements. For example:

...
ON ERROR
  $WEB_BUFFER_SIZE(0)                    /? discard anything built so far   ?/
  $WEB_USE('ON')                         /? reroute PRINT                   ?/
  INCLUDE STANDARD_ERROR_CODE            /? diagnostics using PRINT         ?/
  $WEB_PUT_DATATYPE('TXT')               /? assuming messages are not HTML  ?/
  $THREADCODE(0)                         /? stand down auto error handling  ?/
  $WEB_FINISH                            /? all done                        ?/
END ON
END
It is also one way of creating a "remote command line" handler, since the $WEB_USE setting remains in effect if the script drops out of User Language to issue commands. (See also the shorthand method for obtaining command output later).
...
  $WEB_USE('ON')
END
TABLEB RECLEN     /? output will go to HTTP buffer ?/

Notice the neater-than-usual $function calling syntax in these examples, which is new in V2.0, and applies to all $functions, not just these $WEBxxx ones.

Sometimes it may be inappropriate or impossible to build the output data in a line-wise fashion, so there are some more $functions that can be used...

3. Single items of text or data
The $WEB_PUT function takes a string expression and places it, with no translation, and no terminating CRLF, into the HTTP buffer. You could use this simply to build a page of HTML in a more piecemeal way than above, or if you didn't want the CRLF for some reason, or to insert non-textual or non-printable data into a page. In a slightly contrived example we could rewrite the start of the above code as:

B
%CRLF = $X2C('0D0A')
$WEB_PUT('<HTML>' WITH %CRLF)
$WEB_PUT('<TITLE>Employee Information</TITLE>' WITH %CRLF)
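
The difference between the line-by-line and single-item styles can be modelled in a few lines of Python. This is a toy model of the buffer, with made-up class and method names, intended only to show that the two approaches produce the same bytes when the CRLF is supplied by hand.

```python
CRLF = "\r\n"

class HttpDataBuffer:
    """Toy model of a response buffer that is sent once, at the end."""

    def __init__(self):
        self._chunks = []

    def print_line(self, text):
        """Line-wise output: a CRLF is appended automatically."""
        self._chunks.append(text + CRLF)

    def put(self, data):
        """Single-item output: data is stored exactly as given."""
        self._chunks.append(data)

    def contents(self):
        return "".join(self._chunks)

line_wise = HttpDataBuffer()
line_wise.print_line("<HTML>")

piecemeal = HttpDataBuffer()
piecemeal.put("<HTML>" + CRLF)

print(line_wise.contents() == piecemeal.contents())
```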

4. Incorporating data from a file
The $WEB_PUT_FILE function allows files of already-formatted HTML or other content to be incorporated into a piece of dynamic content, for example a multipart data block. Recoding the above script we could place the raw HTML for the fixed parts in separate files:

page_header.html:

<HTML>
<TITLE>Employee Information</TITLE>

page_trailer.html:

<! Standard page small print>
<P><FONT SIZE = 1>The information on this page is copyright ...</FONT>
<P><FONT SIZE = 1>blah yawn blah</FONT>
</HTML>
Then the original code would look like this:
B
$WEB_PUT_FILE('page_header.html')

%N = $WEB_GET_URI_PARM('NAME')
IN EMPLOYEE FR WHERE NAME = %N
    WEBPRINT '<P>' WITH FIRSTNAME AND SURNAME
    WEBPRINT '<BR><IMG SRC="photos/' SURNAME '_' FIRSTNAME '.jpg">'
END FOR

$WEB_PUT_FILE('page_trailer.html')
END

This technique is useful when there are smallish pieces of static content to be included in a relatively large amount of dynamic content. In many cases however the opposite is true, and it can make more sense to use a strategy based on the standard server side include method, where dynamic content is inserted into static, rather than the other way round. There are pros and cons of both methods, and in many cases, as illustrated in the sample web application, they can be used in combination.

Error handling

At the point where your dynamic content would normally be sent to the client (either return to command level or $WEB_FINISH), DPT makes a check against the highest MSGCTL error code encountered during the procedure. If this exceeds a certain value ($WEB_AUTO_ERROR), your content is discarded and an error report page returned instead. Depending on the WEBFLAGS parameter value the page may or may not contain the current $FSTERR, $ERRMSG, $STATUS and $STATUSD, and other detailed diagnostics.

In situations where your code catches an application-level error it would be most reasonable to prepare a formatted error page yourself, in which case you could then tell the automatic error handler not to run by calling e.g. $THREADCODE. Perhaps in other cases if the error was unlikely and/or you were feeling lazy, you might trick the automatic error processing into running when it would not otherwise have run, again with a call to $THREADCODE.

Incidentally the script auto-error page is an example of the DPT web server standard error page with HTTP response code 500 (general server error), even if the reason for the error was something that might reasonably correspond to another standard HTTP code, such as record-locking (409?) or nonexistent-record-referenced (410?). Note also that if server daemons are bumped, an in-flight request is sent a 503 response (Service Unavailable).
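
The decision made at the end of a request can be summarised in a small Python sketch. All names and numeric levels here are hypothetical; the sketch only captures the rule that content is discarded in favour of a 500 page when the highest error code exceeds the trigger level, with diagnostics included or not according to a flag.

```python
def finish_request(buffer, highest_error_code, auto_error_level,
                   show_diagnostics, last_error_message):
    """Return (HTTP status code, page body) for a completed script.

    Hypothetical model: the arguments stand in for the thread's highest
    error code, the auto-error trigger level, a WEBFLAGS-style switch
    and the most recent error message.
    """
    if highest_error_code > auto_error_level:
        # The script's own content is discarded in favour of an error page
        page = "<HTML><P>The script failed."
        if show_diagnostics:
            page += "<P>" + last_error_message
        page += "</HTML>"
        return 500, page
    return 200, buffer

print(finish_request("<HTML>ok</HTML>", 0, 4, False, ""))
print(finish_request("<HTML>partial", 8, 4, True, "Record locking conflict"))
```

Lowering or raising the recorded error code before this check runs is what lets a script suppress or force the automatic page.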

Specialized coding topics and ideas

Using cookies
Cookies are one way to add continuity to conversations between browsers and servers, since technically each HTTP request/response exchange starts afresh, with no state information preserved from one to the next. When a server sends a response message it can add information to the message and tell the browser to "remind me of this next time you request something". Such items of information are called cookies, and the browser stores them on its local disk. As the cookies go back and forth between the server and browser they effectively constitute a preserved session state.

Cookies are exchanged as specially-formatted HTTP header lines. The server tells the browser what cookies to store (and various other information about how to store them and for how long) using the "Set-Cookie" header. The values are returned when appropriate from the browser in the "Cookie" header.

For example on the home page of your website you could ask the user to log on, and send them a cookie with a pass key which gives them access to other private pages on the site. The browser would then automatically send this pass key when the user visits other pages, saving them having to log on individually to each page. When the user returns on some future occasion, the same key might or might not have expired, but the site could at least welcome the user by name. Another example might be on a shopping site where the server sets a cookie when the user purchases something. Then whenever the user visits the site in the future the server is reminded by the returning cookie of the user's preferences, and can make relevant product suggestions.

Using cookies on DPT is very simple. Your handler code sends cookies to the browser with $WEB_PUT_COOKIE and retrieves them on future occasions using $WEB_GET_COOKIE.
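
For readers curious about what travels in those headers, the Python sketch below formats a "Set-Cookie" value and reads a value back out of a "Cookie" header using the standard library. The helper names, the cookie name "passkey" and the one-hour lifetime are all illustrative, and this says nothing about the arguments the DPT $functions take.

```python
from http.cookies import SimpleCookie

def make_set_cookie(name, value, max_age_seconds):
    """Format the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds  # how long the browser keeps it
    cookie[name]["path"] = "/"                 # send it back for all pages
    return cookie[name].OutputString()

def read_cookie(cookie_header, name, default=""):
    """Extract one cookie value from a Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie[name].value if name in cookie else default

set_cookie_value = make_set_cookie("passkey", "abc123", 3600)
print("Set-Cookie: " + set_cookie_value)
print(read_cookie("passkey=abc123; theme=dark", "theme"))
```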

Control of browser caching
When serving static content, the DPT web server supports browser caching of non-changing data by co-operating with the browser's use of certain header fields. In the dynamic situation your handler procedure can avoid repeatedly sending the same application data by programmatically doing the same thing.

Firstly, to tell the browser to cache some content, set the "Last-Modified" header into a response - usually to the current date and time. Then the next time the browser requests the same URI (assuming the browser has caching enabled and the user didn't explicitly request a fresh copy) it will automatically put an "If-Modified-Since" header into the request, indicating that it is prepared to use its local copy. So your handler procedure can use the $WEB_GET_HEADER function to retrieve this header, compare it with application database time-stamps, and if appropriate send a code 304 response. The demo application contains an example of this.
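
The comparison a handler has to make can be sketched in Python as follows. The function name `respond` and its arguments are hypothetical; the standard HTTP response header that carries the timestamp is "Last-Modified", and the sketch assumes the application's change timestamp is held as a UTC date and time.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def respond(if_modified_since, data_changed_at, body):
    """Return (status, headers, body) for a cache-aware dynamic page.

    if_modified_since is the request header value or None;
    data_changed_at must be a timezone-aware UTC datetime.
    """
    if if_modified_since:
        try:
            cached_at = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            cached_at = None  # unparseable header: treat as no cached copy
        if cached_at is not None:
            if cached_at.tzinfo is None:
                cached_at = cached_at.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution
            if data_changed_at.replace(microsecond=0) <= cached_at:
                return 304, {}, ""  # browser's copy is still good
    headers = {"Last-Modified": format_datetime(data_changed_at, usegmt=True)}
    return 200, headers, body

changed = datetime(1994, 10, 29, 19, 43, 31, tzinfo=timezone.utc)
print(respond(None, changed, "<HTML>fresh</HTML>"))
print(respond("Sat, 29 Oct 1994 19:43:31 GMT", changed, "<HTML>fresh</HTML>"))
```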

In certain applications you could also take a more proactive approach to save the browser even having to ask if the data is updated. For example if you know there will be a database update in 3 days, additionally send the "Expires: {datetime}" header with the original response. The browser would then know that its cached copy is invalid after that period, and request a fresh copy straight away next time.

Note that browsers cache content based on URI. This means that a request for, say, some records from a database using POST-style parameters (see earlier) might or might not contain sufficient information for you to decide whether to send fresh content, since the URI would be the same whatever records were requested. It would depend on the level within your application database at which timestamps were held. You can supplement the automatic "If-Modified-Since" processing with your own cache control information if you like. For example there is an HTTP standard header called "Cache-Control" which gives finer control, and you can also set up schemes based on cookies or "hidden" input fields in HTML forms you generate. All these ideas are commonly used, but they are outside the scope of this document.

M204 command output
Incorporating the output of a command into a web page is a common requirement, and it can be handled in a number of ways using the previously-described facilities. For example the input form might generate a request for "handler.dptw" with some POST parameters identifying a sequence of commands to issue. The script could pick out the parameters and use $COMMBG, or set them into globals, and use ?& dummy strings to issue them as commands.

In simple cases, however, it can be neater to use a special built-in script provided for this purpose called "=COMMAND". This causes the daemon to issue the command(s) named in the URI parameter, and add the output to the HTTP buffer. Some examples:

dptoolkit.com/=command.dptw?MONITOR DISKBUFF
localhost/=command.dptw?openc refproc;in refproc display list
www.elsewhere.com/=COMMAND.DPTW?b;print 'The local time here is ' $time;end
When this method is used the "script" is considered to be complete when the supplied input runs out (e.g. in the second example above two commands will actually get issued). By default, the automatic error handling trigger level is set so as not to be invoked using this facility. So for example the output of "=command?FOOBAR" will not be a formatted error report but the "Invalid Model 204 command" message, which would perhaps be more useful in typical situations.

The WEBFLAGS parameter's X'00000004' and X'00020000' bits control whether this feature is enabled at all and how errors are handled.

Note that the command output is plain text, so if it's getting inserted into a page of HTML (for example commonly in an SSI scenario), wrap it with appropriate tags (e.g. <PRE> ... </PRE>) to get the correct tabulation and line breaks. Note also that the text of the input command is uppercased or not according to the WEBFLAGS X'100' bit, and not the value of the CASE parameter (*UPPER etc.).

Issues with lowercase/uppercase code and displays
Legacy mainframe applications are often all in uppercase as regards user I/O, and it has been common practice to write M204 code in an editor where everything is uppercased automatically. On the other hand, client-side parts of a web application such as HTML input forms are fundamentally written in mixed case, both code-wise and in the handling of user I/O. As a result, areas where the client and server meet could easily become a source of irritating misunderstandings. Things are also complicated by the fact that some parts of the overall web infrastructure are case-sensitive (e.g. cookies) and some aren't (URIs in many situations, HTTP header names).

Nobody wants to write an application where everything is in uppercase, especially when values are entered by, or echoed back to, an end user, so DPT addresses the issue in several ways:

Building and debugging web code

The methods you adopt are obviously up to you, but the following comments may be helpful.

Testing at the command line (">")
Firstly note that a web page handler procedure can simply be included at the chevron or F-fived in the editor like any other procedure. This is convenient, especially in the early stages of writing a script, because it saves you having to constantly switch over to your browser to initiate page requests. The traditional [GO, edit, modify]... cycle can be used, with or without the debugger active, and this is a speedy way to build code.

There are a couple of drawbacks to working purely at the command line, the most obvious being that when the procedure comes to an end (or you call $WEB_FINISH), any HTML built by the procedure has no browser to get sent back to and just gets shown on the terminal. This is often enough to see whether the script is working correctly though.

Some other issues are also worth mentioning.

Testing "as-live"
A second approach is to make changes to handler procedures and view the results immediately through a browser. That is, let the DPT web server daemons run your code and generate output as it's supposed to look. This is the most streamlined way to work when you have nearly-complete web pages that just need changes to presentational frippery like layout, colours and so on.

When making code changes, simply save (F6) in the editor and then refresh (F5 or Ctrl+F5) in the browser to invoke the changed script. (Unless the script is an input form handler with POSTed parameters, in which case you'd have to fill out the form again - see earlier notes).

When working like this it is more difficult to find errors in scripts, since they are being run by a daemon and you will not have immediate access to so much diagnostic information. The audit trail will contain some messages, which can also be enhanced with the procedure name and line number (set the MSGCTL parameter 32 bit in the daemon login proc). Normal "terminal" messages issued on the daemon are not discarded either, although they are somewhat less accessible than the audit trail (in the #DAEMON directory). With more subtle errors you may want to debug the script, which is the subject of the next paragraph.

Debugging as a daemon
A halfway house between the above two approaches is to pretend to become a server daemon and handle a page request on your own terminal session. This combines the advantages of proper browser interaction, and having the debugger available as the handler procedure runs, but it is a little more awkward to set up:

Coding and editing issues
When building a web application you'll probably find that in addition to what M204 programmers normally consider "procedures", containing M204 commands and User Language, you'll have a number of other files containing HTML, JavaScript, style sheets etc. which you are also actively making changes to. These might be static pages of your website, or fragments that you are including in dynamic pages using $WEB_PUT_FILE and/or Server Side Includes.

The DPT client is not a sophisticated multi-language editor, but even so it can be very convenient to have both User Language and other types of code accessible within the same IDE rather than using separate editor tools, and to support this kind of usage DPT has a couple of new features. Firstly the editor has some simple functions for handling files which don't contain User Language, and specifically HTML. Secondly you can allocate directories to the host with non-default extensions, which gives you the option of using commands and DPT GUI facilities to access them as if they were procedures. This is a neat way to work, and is how the demo installation is configured.

Miscellaneous daemon processing issues

Session corruption
In the normal course of events each daemon includes the login procedure once only. If your application "messes up" the daemon session so much that it seems simpler to reinitialize it somehow than try to tidy up, you might include the login proc again explicitly, or, probably better, code a LOGOFF, in which case a fresh daemon will start automatically.

One common issue might be when different scripts require different parameter settings, since parameter settings are part of the persistent state of a server daemon between web page requests. It is the programmer's responsibility to deal with this. For example install a set of parameters in the daemon login proc that caters for all requests, or alternatively have each script install all the parameter settings it needs.

Line IO
As with any other daemon (IODev 99) or robot thread (IODev 1 or 3), operations like $READ and ?? dummy strings, which require user input, can cause the daemon to terminate, since it has no user to prompt. So if you write web application code which uses $READ and it works when you test it at the terminal, the chances are it won't work in the real web situation where the code gets run by a daemon.

For the purposes of the WEBPRINT statement, OUTMRL is always 32K (i.e. just like the AUDIT statement).

It's also worth mentioning that server daemons do not enqueue procedures during script processing in the same way that terminal users do. This feature was specifically put in to allow you to have handler code open in the editor on your terminal session at the same time as testing it via a daemon (see earlier notes).

End-of-Procedure
An automatic error response page is sent if your handler procedure comes to an end and there are mismatched quotes, block comment delimiters or BEGIN/END pairs. On any other kind of thread this is OK as the command line interpreter just takes more input and lets the mismatches resolve naturally - most UL programmers are familiar with the informational message "Waiting for close quote" for example. The web server daemon on the other hand has no way of prompting for closure, so it treats the whole script as invalid.

There is always an implied database commit at the end of request handler procedures.



Appendixes


Appendix 1: Terminology


Appendix 2: HTTP Message Processing on DPT

Message structure summary

An HTTP message is a chunk of data sent across a TCP/IP connection. It is in human-readable form, broken into lines by line separator characters (CRLF), as follows.

Title lines
The request message title line identifies the resource required from the server. e.g.

GET homepage.html HTTP/1.1

The response title line gives a numerical success/failure code and simple descriptive text. e.g.

HTTP/1.1 200 OK

In this case (OK) the response would be followed by the page data. In other cases, commonly 404 (Not Found) or 500 (Server Error), there might be no page data following. Alternatively the server might send more detailed information in a small formatted error report, which is what the DPT web server does (although see the general comments on error response messages).

The full set of HTTP response codes is not listed here (surf them up) but the codes are grouped into standard ranges:
1xx: Information
2xx: Success
3xx: Redirections
4xx: Client errors
5xx: Server errors
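
As a small illustration of the grouping above, here is a sketch in Python (not User Language, and nothing to do with how DPT itself is coded) which maps a response code to its range. The function name status_class is made up for this example.

```python
def status_class(code: int) -> str:
    """Return the standard range name for an HTTP response code.

    status_class is a hypothetical helper name used only for this example.
    """
    classes = {
        1: "Information",
        2: "Success",
        3: "Redirection",
        4: "Client error",
        5: "Server error",
    }
    # Integer division by 100 gives the leading digit of a three-digit code.
    return classes.get(code // 100, "Unknown")


for code in (200, 304, 404, 500):
    print(code, status_class(code))
```

The point is simply that a client can decide how to react from the first digit alone, even if it has never seen the specific code before.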

Header lines
These are "name: value" pairs, and the message receiver knows it has seen the last one when it reaches a blank line. There may be no headers, in which case there will still be just the blank line. Many standard header names are used for HTTP protocol control reasons, but any free-format headers can also be inserted for application use. First consider a few standard headers - the above response title might have continued as follows:

HTTP/1.1 200 OK
Date: Fri, 08 Sep 2006 13:29:02 GMT
Content-Type: text/html
Content-Length: 107

...contents of file (107 bytes)...

The Date header is an important field for client/server synchronization, since the two ends of the conversation might be in different time zones. More on this later.

The content type tells the receiver how to interpret the upcoming data (is it an image, a video, a spreadsheet, etc.?). In many cases the file extension also carries this meaning, but other times the request may have been for a script to be run which could have generated various types of content, in which case this header is more important. In any case it is usually present. The two-part slash-separated value of this header is a "MIME" code (more later).

Finally in this example the content length header is essential so that the receiver will know how much data to wait for.
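
To make the structure concrete, the following Python sketch pulls a response message apart the way a receiver would: title line first, then headers up to the blank line, then exactly Content-Length bytes of data. It is a simplified illustration under stated assumptions (the whole message is already in memory, and parse_response is a name invented here), not a description of DPT's internals.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (version, code, reason, headers, body).

    parse_response is a hypothetical helper for illustration only.
    """
    # The first blank line (CRLF CRLF) separates title/header lines from the data.
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")

    # Title line: protocol version, numeric code, descriptive text.
    version, code, reason = lines[0].split(" ", 2)

    # Header lines: split on the first colon only, since values such as
    # dates contain colons themselves. Header names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Content-Length tells us how many bytes of data belong to this message.
    length = int(headers.get("content-length", 0))
    return version, int(code), reason, headers, rest[:length]


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Fri, 08 Sep 2006 13:29:02 GMT\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 11\r\n"
    b"\r\n"
    b"hello world"
)
version, code, reason, headers, body = parse_response(raw)
print(code, reason, headers["content-type"], body)
```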

A few more standard headers are mentioned in the following sections, and if you are interested there is of course a wealth of information out on the internet. Generally speaking however, user script code does not have to do too much interaction with the HTTP headers. Cookies and cache-control headers are the most common exceptions.

The data block
The data, or "message body" is present most commonly on response messages, in which case it contains either the data of the requested resource, or some kind of error report.

In the case of a "GET" style request such as the one shown above, there is no message body on the request message. However, with "POST" style requests (typically used with HTML input forms) the body contains the values of the input fields as entered by the user, which are effectively parameters to the request. Retrieving these is discussed elsewhere in this document. A third type of request, "HEAD", asks that the response message contains everything as per "GET" except the data, and is sometimes used by browsers to find out how large a particular file is before actually requesting the data.
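
For the curious, here is roughly what a simple form POST looks like on the wire, and how the field values can be recovered from the body. This is a Python sketch using the standard urllib library. The script name results.dptw and the field names are just examples, and it assumes the usual "application/x-www-form-urlencoded" encoding that browsers use for plain HTML forms.

```python
from urllib.parse import parse_qs

# The form fields travel in the message body, encoded like URI parameters.
body = "name=Luke&ship=X%20wing"
request = (
    "POST results.dptw HTTP/1.1\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    + body
)

# Everything after the blank line is the data block.
head, _, data = request.partition("\r\n\r\n")

# parse_qs returns a list of values per name; take the first of each here.
params = {name: values[0] for name, values in parse_qs(data).items()}
print(params)
```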

What DPT does automatically (all request types)

Request validation
Obviously to service a web request DPT must understand what's being requested. There can be various misunderstandings between client and server, but to summarise, DPT requires that the request

If any of the above conditions are not met, a message with the appropriate code (usually 400) is sent back, containing a small formatted error report as its message body.

Note that under no circumstances (in the current release) does DPT take any notice of information concerning the character set or encoding of incoming data, and if this is a concern it must be handled at the application level. (Strictly speaking this may be HTTP 1.1 non-compliance and rather naughty, but it keeps things simple and can easily be added later).

Server availability check
Under certain circumstances the DPT web server will be unable to even process requests. For information, the following responses may sometimes be seen:

URI standardization
Here are some examples of potential incoming request URIs. Note that the browser will have removed the web site name, so that at the server end we only see the requested resource name.

homepage.html                           * simple
/                                       * user wants default page
home+page.html                          * space replaced by + (common convention)
home%20page.html                        * special character given %hexhex encoding 
homepage.html?name=Luke                 * URI-encoded form parameter
homepage.html?name=Luke&ship=X%20wing   * two parms
/homepage.html                          * preceding slash ignored here

DPT will reformat these and try to locate the following resources:

homepage.html
index.html                              * or as per WEBHOME parameter
home page.html
home page.html
homepage.html                           * form parm ignored if static (see later)
homepage.html
homepage.html
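
The reformatting shown above can be sketched in a few lines of Python. This is an approximation written from the examples in this section rather than DPT's actual code. The function name standardize_uri is invented, and the home argument stands in for the WEBHOME parameter.

```python
from urllib.parse import unquote_plus


def standardize_uri(uri: str, home: str = "index.html") -> str:
    """Reduce an incoming request URI to a resource name.

    standardize_uri is a hypothetical helper; 'home' plays the role of WEBHOME.
    """
    # Split off any ?parameters first, so an encoded '?' inside the
    # resource name is not mistaken for the parameter separator.
    path = uri.split("?", 1)[0]
    # Turn '+' into a space and decode %hexhex sequences.
    path = unquote_plus(path)
    # A preceding slash is ignored.
    path = path.lstrip("/")
    # Nothing left means the user wants the default page.
    return path or home


for uri in ("homepage.html", "/", "home+page.html", "home%20page.html",
            "homepage.html?name=Luke", "/homepage.html"):
    print(f"{uri:30} -> {standardize_uri(uri)}")
```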

Response message headers
These headers are always included in all types of response, so user code never needs to set them up.

What DPT does automatically during static file request processing

This means requests for resources having a file extension which does *not* match WEBEXTN. In other words all the requests for .gif, .html, .doc etc.

Request validation
Static file requests must be *retrieval* requests for the resource, and may therefore not have the POST message type. Put another way, DPT does not support "blind" file transfers from client to server. To perform file transfers you must go via a host-side script you write, so you can then apply appropriate intelligence to the details of exactly what to store and where to store it.

URI parameters on requests for static content may sometimes be used by browsers for their own purposes, so DPT ignores these and simply issues an audit trail message.

Browser cache support
Most browsers keep a local copy of all files they get from web sites, so that if the user requests them again, the cached copy can be used, and network bandwidth and time saved. This is usually enabled by the browser asking the server "has this file been changed or can I use my cached copy?", and the server replying either "yes it's been changed, here's the new version", or "no, go ahead and use your copy". Put more technically, the browser might send the following header along with its GET request:

If-Modified-Since: Wed, 22 Nov 2006 21:32:07 GMT

DPT compares this with the last-modified date/time of the requested file and either sends the file as normal, or sends a 304 response (Not Modified), which is obviously quicker.
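
The decision itself is simple enough to sketch in Python. The exact rules DPT applies (for example what it does with an unparseable date) are not spelled out here, so treat the choices below as assumptions. The function name conditional_status is invented for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since, file_mtime: datetime) -> int:
    """Return 304 if the client's cached copy is still good, else 200.

    Hypothetical helper: file_mtime must be a timezone-aware datetime.
    """
    if if_modified_since is None:
        return 200  # no header sent, so send the file as normal
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # assumption: ignore a header we cannot parse
    if since.tzinfo is None:
        # Assumption: a date with no zone information is treated as GMT.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution, so drop the microseconds.
    if file_mtime.replace(microsecond=0) <= since:
        return 304
    return 200


mtime = datetime(2006, 11, 22, 21, 0, 0, tzinfo=timezone.utc)
print(conditional_status("Wed, 22 Nov 2006 21:32:07 GMT", mtime))
```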

Other request headers with special meanings

Special request types

Response message: Header values sent with static content

What DPT does automatically during dynamic (user script) processing

The mechanics of including the script procedure and routing its output are discussed in the programming section, as are some general points about request parameters, which DPT automatically parses from the various parts of the message and makes available via $functions.

Parameter names and textual values and header names and values can be automatically uppercased depending on the WEBFLAGS setting. Note that the case of the handler procedure name in the URI does not matter.

HTML input form element types
Most input elements send a simple text value as entered by the user. The following notes may also be of interest.

POST message data unpacking details
DPT attempts to extract information and make it available to the user script via $WEB_GET_POST_PARM. Different processing is applied depending on the Content-Type header of the message, as follows.

Posted file data
File data can be posted either from an HTML input form or as an arbitrary POST request from any HTTP client. The data is held in memory for use by a script but not stored as a host-local file unless application code explicitly stores it.

To cater for the situation where a client posts very large files, perhaps maliciously, you can set a limit on the size of a single file, or the aggregate size over a period using the WEBPMAX and WEBPMAXT parameters.

HTTP date handling

This section is here mainly to provide some background to the required format of dates if you ever want to put them into standard HTTP headers such as "Last-Modified". For all other purposes you can use dates and times in any formats appropriate for the application, although remember that when client and host might be in different time zones any date/time values passed between them should either specify the time zone they relate to, or better still use GMT.

The HTTP 1.1 protocol defined the following 3 formats as acceptable. However, the second and third were included for historical reasons, and the protocol specified that all new applications should use the first format. (Note that HTTP 1.1 has been surprisingly resilient, and the term "new applications" meant new back in the late 90s).

1. Fri, 31 Dec 1999 23:59:59 GMT      * recommended
2. Friday, 31-Dec-99 23:59:59 GMT
3. Fri Dec 31 23:59:59 1999

To facilitate working with the HTTP format if required for use with HTTP headers, or if desired for other general purposes, DPT comes with some new options for use with all the User Language $DATExxx functions.
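
Purely as background (this is Python, not the $DATExxx options just mentioned), here is a sketch showing that the three formats really do describe the same instant, and how to produce the recommended one. The helper names parse_http_date and to_http_date are invented for the example, and the parsing assumes English day and month names.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# The three acceptable formats, in the order listed above.
HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # 1. recommended
    "%A, %d-%b-%y %H:%M:%S GMT",   # 2. two-digit year
    "%a %b %d %H:%M:%S %Y",        # 3. no zone given, GMT assumed
)


def parse_http_date(text: str) -> datetime:
    """Try each format in turn and return a GMT datetime (hypothetical helper)."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"not an HTTP date: {text!r}")


def to_http_date(when: datetime) -> str:
    """Format a timezone-aware datetime in the recommended HTTP form."""
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


for text in ("Fri, 31 Dec 1999 23:59:59 GMT",
             "Friday, 31-Dec-99 23:59:59 GMT",
             "Fri Dec 31 23:59:59 1999"):
    print(to_http_date(parse_http_date(text)))
```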

Server side includes (SSI)

This is a simple and popular technique, supported by most web servers, allowing you to create web pages containing content from several different sources, including both static and dynamic sections. For example many good-looking web sites contain large amounts of standard graphics and containers which appear on every page, plus a smaller amount of content specific to each page. The main idea is very similar to the M204 INCLUDE directive, and is best illustrated by an example:

mypage.html:

<HTML>  
<!--#include virtual="graphics/banners.html?My Results Page" -->

<TABLE BORDER><TR><TD>
  <!--#include virtual="results.dptw" -->
</TABLE>

<!--#include file="smallprint.html" -->
</HTML>

In this case the host sends a single page in response to a request for mypage.html, which contains the contents of "banners.html", followed by the dynamic content generated by the script "results.dptw" in a box, followed by the contents of "smallprint.html". The calling-in of these elements is handled at the host end and requires no network traffic or interactions with the browser.

Note that SSI directives are always wrapped in HTML comments, since you don't want the browser to show the directives themselves. What's more their format is quite tightly defined. There must be two hyphens before the #xxxxx, and a space and two hyphens after the resource name, which may have optional quotes. Any mistakes in this format are liable to mess up the whole page, for example commonly forgetting the space. The server replaces the entire comment with the included resource, or a short error message if e.g. the resource could not be found.
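
To show why the format matters, here is a toy include processor in Python which follows the rules just described: two hyphens before the #include, optional quotes around the resource name, and a space plus two hyphens at the end. It is a sketch only. The "resources" dictionary stands in for the host's files, the error text is made up, and a real server such as DPT would run a script rather than just look up its name.

```python
import re

# <!--#include file="name" -->  or  <!--#include virtual=name -->
SSI_INCLUDE = re.compile(r'<!--#include\s+(?:file|virtual)="?([^"]*?)"?\s+-->')


def expand_includes(page: str, resources: dict) -> str:
    """Replace each include comment with the named resource (hypothetical helper)."""
    def replace(match):
        # Anything after a '?' is an argument for the resource, not part of its name.
        name = match.group(1).split("?", 1)[0]
        # The whole comment is replaced, either by the content or by a short error.
        return resources.get(name, f"[could not include {name}]")
    return SSI_INCLUDE.sub(replace, page)


resources = {"smallprint.html": "<SMALL>Terms apply</SMALL>"}
page = '<HTML>\n<!--#include file="smallprint.html" -->\n</HTML>'
print(expand_includes(page, resources))
```

Note that a directive with the space before the closing hyphens left out does not match the pattern at all, so in this sketch it is passed through to the browser untouched as an ordinary comment.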

DPT supports a small subset of the wide range of standard SSI directives, as follows.

Notes specifically on SSI-included scripts

DPT web server standard error page

DPT generates a very simple formatted HTML error page containing some diagnostic information, in two groups of situations:
  1. Resource location or conflict situations such as "not found" etc. as mentioned in various places above
  2. Problems with application code which trigger the automatic end-of-script error handling
By default these pages are sent back to the browser as HTTP messages with an appropriate HTTP response code. However, depending on how the browser and DPT are configured, you may or may not see the page. This is because some browsers have an option where they will display their own error page when certain "non-success" HTTP response codes are received, in preference to any page content the server might have sent with its response message. This can be nice as it puts a familiar face on various common server errors, but as a webmaster you might want to exert more control.

One way to affect this browser behaviour is to turn off the option in all your users' browsers, although this would obviously not always be possible. Another option is to set the DPT control flag which causes error page responses to be sent with an HTTP code of 200 (success), thus "tricking" the browser into displaying them. This is the setting in effect on the DPT demo installation, since we want to see as much information as possible there. It's up to you how you handle this in your application - perhaps leaving it up to the browser for general use and turning on the more informative server option when investigating error situations.


Appendix 3: Quick-Start Programming Tutorial

If you want to build dynamic web pages as soon as possible but don't feel like reading this entire document right now, you've come to the right place! Building a web application is a little bit more complicated than a terminal application, but not that much, and the good thing is that once you're an expert doing it in User Language on DPT you'll know a lot of what you need to work with PHP, ASP and all those other sexy things.

Step 1: A basic User Language script

Let's start with a simple User Language terminal program. At the DPT command line, enter 'OPEN WEBPROC' and then create a new procedure with Ctrl+Shift+N. Type in the following and hit F5.

B
PRINT 'DPT is awesome!'
END

If you've ever used DPT or M204 before you won't be surprised to see a single line of output come to the terminal. You may also not be surprised if the text got uppercased too (we'll come back to that).

DPT IS AWESOME!

This isn't quite a web page yet - for one thing the output has to be sent to a browser instead of your terminal. Re-edit the program (Ctrl+Tab) as follows:

B
$WEB_USE('ON')
PRINT 'DPT is awesome!'
END

There are several ways of sending data to a browser, and with this one we're just asking to redirect the result of the PRINT statement.

While you're in there, if your editor uppercased the text in quotes, why not select "is awesome" and lowercase it again (Ctrl+L). User I/O that's all in upper case is really not the done thing in a web application, even if you can usually get away with it in a legacy 3270 application! The issue is covered in more detail in the programming section elsewhere in this document.

Run it again and you should see something like this:

HTTP/1.1 200 OK
Server: Database Programmer's Toolkit/1.3a
Date: Tue, 28 Nov 2006 19:01:15 GMT
Content-Length: 17
Content-Type: text/html
--------------------------------------------------------------------------------
DPT is awesome!
--------------------------------------------------------------------------------

It's different but it's still not a web page, is it? Well, it's closer than you might think. By using one of the special web-related $functions we told DPT that this is a web page script, and it responded by creating a variety of extra "HTTP" control information that would be required by a browser. Unfortunately we still don't have a browser connected, so DPT mocked up a version of what it would have sent, and displayed it at the terminal instead.

Working purely at the terminal like this is a good way of doing the early construction of a web page, as it's quick, easy and familiar, and we can debug the program if we want to using the DPT debugger (Ctrl+F1).

Step 2: Running the script from a browser

Now we've got a rudimentary script built, let's run it from a web browser and see what we get. If you're working on the DPT demo installation, you should be able to open up your browser and type in the address:

localhost/yourproc.dptw

Depending on how your browser is configured, you may have to "persuade" it that you really want to look at a web site on your local machine, but this is simply a matter of clicking the appropriate buttons when offered. For example, in MS Explorer (Win XP and before) click "work offline" then "connect", or in Vista it may prompt you to enable "intranet" settings in the tools menu. It's also possible you'll get objections from a local firewall, which you can suppress by allowing localhost (loopback) connections on port 80 (the default WEBPORT parameter).

To make the page look a little more attractive, how about editing the script to look like this:

B
WEBPRINT '<HTML>'
WEBPRINT '<H2>This page was created in User Language</H2>'
WEBPRINT '<HR>'
WEBPRINT '<P>The local time is <TT><B>' $TIME '</B></TT>'
END

Here we've inserted some HTML tags into the page to lay it out more nicely. Also note that the PRINT statement has been replaced with WEBPRINT, which does the same thing, but saves us having to issue $WEB_USE beforehand since WEBPRINT automatically generates browser output lines.

Now either save it (F6), or better still save-and-run it (F5) to check it compiles, and then hit refresh in the browser. Every time you refresh you should see the time updating to show the script is being rerun each time.

Step 3: Techniques for more realistic pages

OK, that's pretty much all there is to it! Everything else is just elaborating on this theme, and now it's up to you. Elsewhere in this and the other DPT manuals lots of ideas are covered for building complex web pages, as follows. (Plus there is a fair amount of sample script code distributed with the download).