Designing and implementing an on-line database interface

Alan Vince

Contents

  1. How do on-line databases work?
  2. When is a database more appropriate than HTML?
  3. Custom building versus off-the-peg solutions
  4. Checklist
  5. Forms
  6. Database interfaces
  7. Useful links

How do on-line databases work?

Most online databases work by capturing requests sent by the user to the webserver. These requests are then processed by a program which uses the Common Gateway Interface (CGI). This program then retrieves records from the database, wraps HTML code around them, and supplies the results in HTML format to the user.

For very small datasets an alternative method would be to parcel up the entire database, together with a program which can access and display the data and send all of this to the user. Such programs can be written in Java or, for very simple cases, JavaScript or they may rely on the user having a plug-in which is called into play when files of the appropriate format are requested from the webserver.

A third option is to simply supply the raw data to the user in a form where it can be read and interrogated on the user's computer using a third party program.

When is a database more appropriate than HTML?

All but the simplest HTML documents impose a structure on their contents. In some cases this structure is too complex to be easily served up as a database whilst in others the structure is too simple to warrant the use of a database. In between these two extremes, however, there are many situations in which one could either use HTML files or a database to publish the document.

Examples are: catalogues, multi-part reports, repetitive descriptions, tables, appendices and lists. In short, anything that is not prose which must be read in a particular order to make sense.

One might want to publish data as an online database if you want to make repeated but varied calls to that data (as with an exhibition or museum catalogue) or if you want to provide the user with the ability to retrieve records based on complex criteria, such as a combination of keywords.

Custom building versus off-the-peg solutions

The most successful online databases are those which allow the user to retrieve the required records with the least effort. Effort can consist of having to navigate a series of menus or it can consist of having to learn how to use the system. For example, the search engines of Yahoo and Altavista have adopted different approaches to the retrieval of data. At Yahoo, the user's area of interest is gradually defined by selecting more and more detailed categories. One might start by selecting computers, then software, then Windows 95 operating system and end up with a series of links to relevant sites. At Altavista, by contrast, the user is simply faced with a single text box into which a string of keywords can be typed [recently, Altavista has added what appears to be a similar hierarchy to that found at Yahoo but the principles behind its classification are less intuitive than Yahoo's]. In both of these search engines it is possible to select either short or long records. Short records will consist of the HTML page title and a hyperlink whilst long records will give the first few lines of the file's text, or perhaps a summary or evaluation of the site's contents provided by a third party. You would choose the first option if you knew exactly what you were looking for but had forgotten the URL and the second option if you were looking to see if a website covering a particular topic existed.

Providing your webserver (and its webmaster) supports CGI scripts, then you may be able to find a software package that will allow records to be retrieved through a simple interface. A well-known package which does this is htgrep. Htgrep is a perl utility which can therefore probably be run on both Unix/Linux and Windows webservers. It retrieves records from simple ASCII files in which the records (by default) are separated by blank lines. The records can themselves contain HTML tags, and could include inline images, tables and HTML headings and styles.

At the other extreme are packages which interface with complex databases. These almost all use Standard Query Language (SQL). In these cases, the CGI script will be used to compose an SQL request, which is then sent to a separate piece of software which undertakes the retrieval. The same CGI script will probably format the retrieved records for display, and provide means to modify the request or to produce a separate request based on the results of the first, or a completely new request. It may also modify the display depending on the contents of the retrieved records.

The more options are provided for the user, the more complicated the coding to run the system will be. In particular, it is necessary to cater for the user supplying unexpected requests. This can be done by sanity checking, for example to make sure that a requested keyword is present in the database before sending the request. Such checking is often carried out on the user's computer using JavaScript.

Checklist

The following points should be borne in mind when designing an on-line database interface:

Forms

Requests for data are sent to the web server through the use of forms. The tags used with forms are <input>, <select> and <textarea>

<input> has the following attributes:

type="the type of input". The options are radio, text, password, submit and reset.

name="a unique name for the input value"

value="the default value for the input"

Types include checkbox (which has the additional attribute of "checked")

radio (which is used in groups of two or more, where an input has a limited and exclusive choice of values)

text(where you can set the width of the text box and the maximum length of entry that is allowed)

password (which has similar attributes to text but where the text is not displayed)

Two types of <input> are special: submit and reset.

submit sends all the values for the inputs in your form to the webserver, along with the name of the input. They are sent as a long string with each pair joined by ampersands thus:

program_name?radio1=radio_value&text1=text_value&password1=password_value&submit1=Submit&select+item=item+one&name=

reset clears all values from the form.

Select allows one to set up a drop-down menu from which, by default, a single item can be selected. This behaviour can be changed with the attribute "MULTIPLE" thus:

.

textarea is similar to the <input> text type but allows carriage returns to be sent to the server rather than triggering the submit input, which is the effect of typing a carriage return in a normal text or password input.

Database interfaces

Active Server Pages can be used to interface with Windows NT databases, such as Access and SQL server. They are documented online on the Microsoft website, plus various others. LearnASP, for example, gives worked examples.

WDBI is a perl-based interface to a variety of databases running on Unix/Linux computers. As of December 1999 these were:

Useful links

Perl: http://www.perl.com

The Perl Journal: http://www.tpj.com

mod_perl: http://perl.apache.org

MySQL database: http://www.mysql.com

mSQL database: http://www.Hughes.com.au

PostgreSQL database: http://www.postgresql.org