
Session Three
CGI Environment Variables

The defining characteristic of a CGI program is its ability to read and understand data submitted to it from a web form. When a remote client submits a form, the browser bundles it up in a special format and sends it back to the web server. The server then passes it on to your program. Your program must know how to acquire that bundle of data and then unbundle it. The CGI protocol is the language that specifies how the data is bundled and supplied to the CGI program.

In this Session we will discuss the environment variables that your web server will create when a CGI session is generated. Environment variables are a series of hidden values that the web server sends to every CGI program that it runs. Your CGI program (or the dBASE WebClass library) can parse them, and use the data they send.

In order to get a better understanding of the myriad of environment variables, we will begin by building a CGI program that displays many of these variables and their values. Run dBASE Plus and change to the "Source" folder in your server's document root. Then switch to the command window and enter the following two commands:

   
   compile showEnv
   build showEnv to ..\app\showEnv.exe WEB

You now have a CGI program ready to run. This program reads a set of CGI environment variables and returns their values in the response page. To see the environment variables, click the following link. (Be sure your web server is running, and use the browser's back button to return to this page.)

http://localhost/app/showEnv.exe

Your response page should look very similar to the following printout, except that your values will be slightly different. This is not a complete list of all the environment variables. But it includes most of the more commonly used variables.

   
   SERVER_SOFTWARE = Apache/2.0.28 (Win32) 
   SERVER_NAME = localhost
   SERVER_PROTOCOL = HTTP/1.1 
   SERVER_PORT = 80 
   SERVER_ADMIN = mike@home 
   GATEWAY_INTERFACE = CGI/1.1 
   DOCUMENT_ROOT = C:/Apache2/htdocs 
   
   REQUEST_METHOD = GET 
   SCRIPT_FILENAME = C:/Apache2/htdocs/App/showEnv.exe 
   SCRIPT_NAME = /app/showEnv.exe 
   CONTENT_TYPE = 
   CONTENT_LENGTH = 
   QUERY_STRING = 
   REQUEST_URI = /app/showEnv.exe 
   PATH_INFO = 
   PATH_TRANSLATED = 
    
   AUTH_TYPE = 
   REMOTE_HOST = 
   REMOTE_ADDR = 127.0.0.1 
   REMOTE_USER = 
   
   HTTP_HOST = localhost 
   HTTP_ACCEPT_CHARSET = 
   HTTP_ACCEPT_LANGUAGE = en-us 
   HTTP_CONNECTION = Keep-Alive 
   HTTP_REFERER = 
   HTTP_USER_AGENT = Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; DigExt) 
  
  

Notice that some environment variables give information about your server and will never change from one CGI request to the next (such as SERVER_NAME and SERVER_ADMIN), while others give information about the visitor and will be different every time someone accesses the script. Different web servers set their own environment variables as well, so you should check your server documentation for more information. In addition, some servers, like Apache, let you define your own custom environment variables.

Not all environment variables get set for every CGI program. For example, REMOTE_USER is only set for pages in a directory or subdirectory that's password-protected. And HTTP_REFERER is only set when a server request is made from another web page, rather than from a bookmark or from a URL typed directly into the location field.

In your dBASE Plus CGI Program, you can use the getenv() function to access any of the CGI environment variables.

   
   cRemoteAddr = getenv("REMOTE_ADDR")
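
As an illustration, the following is a minimal sketch of how a program like showEnv might print a few variables. It is not the actual showEnv source. The list of names is arbitrary, and the sketch assumes that standard output can be opened through a File object under the name "StdOut".

```dbl
// Hypothetical sketch, not the real showEnv program.
// Assumption: "StdOut" can be opened with a File object.
aNames = {"SERVER_NAME", "REQUEST_METHOD", "QUERY_STRING", "REMOTE_ADDR"}

fOut = new File()
fOut.open("StdOut", "RA")

// A CGI response starts with a header line followed by a blank line
fOut.puts("Content-type: text/html")
fOut.puts("")

fOut.puts("<html><body><pre>")
for i = 1 to aNames.size
   // getenv() returns an empty string when the variable is not set
   fOut.puts(aNames[i] + " = " + getenv(aNames[i]))
endfor
fOut.puts("</pre></body></html>")
fOut.close()
```

Looping over an array of names keeps the program short and makes it easy to add or remove variables from the printout.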

The following list describes many of the common environment variables. If the client sends any HTTP headers along with its request, then these headers are also placed into the environment. The names of these environment variables are the names of the HTTP headers, and are prefixed with HTTP_. (All letters in the name are changed to upper case and all hyphens are changed to underscore characters.)

SERVER_SOFTWARE
This environment variable contains the name and version of the software that your program is running under.

SERVER_NAME
This environment variable contains the domain name or IP address of the server machine.

SERVER_PROTOCOL
This environment variable contains the name and revision of the protocol being used by the client and server.

SERVER_PORT
This environment variable contains the number of the port to which this request was sent.

DOCUMENT_ROOT
This environment variable contains the server's document root directory, under which the current program is stored, as defined in the server's configuration file.

SERVER_ADMIN
The value given to the ServerAdmin (for Apache) directive in the web server configuration file. If the script is running on a virtual host, this will be the value defined for that virtual host.

GATEWAY_INTERFACE
This environment variable contains the revision of the CGI specification supported by the server software.

REQUEST_METHOD
This environment variable contains the name of the method (defined in the HTTP protocol) to be used when accessing URLs on the server. When a hyperlink is clicked, the GET method is used. When a form is submitted, the method used is determined by the METHOD attribute to the FORM tag. CGI programs do not have to deal with the HEAD method directly and can treat it just like the GET method.

SCRIPT_FILENAME
The absolute pathname of the currently executing script.

SCRIPT_NAME
This environment variable contains the name of the virtual path to your program. If your program needs to refer the remote client back to itself, or needs to construct anchors in HTML referring to itself, you can use this variable.

CONTENT_TYPE
If a form is submitted with the POST method, then this environment variable contains the type of data being sent by the client. While clients normally send only "application/x-www-form-urlencoded", this variable can contain any MIME type. To transfer binary data to your CGI program you must use "multipart/form-data".

CONTENT_LENGTH
This environment variable contains the number of bytes being sent by the client. You use this variable to determine the number of bytes you need to read.

QUERY_STRING
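This environment variable contains the data that follows the question mark (?) in the URL used to access your program, for example from a hyperlink or from a form submitted with the GET method. It is empty when the URL has no query string.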

REQUEST_URI
The Uniform Resource Identifier (URI) which was given in order to access the program. The URI points the server to the file that contains the CGI program you want to run (or the static document or image to be served).

PATH_INFO
This environment variable contains the extra path information that the server derives from the URL that was used to access the CGI program.

PATH_TRANSLATED
This environment variable contains the actual fully-qualified file name that was translated from the URL. Web servers distinguish between path names used in URLs and file system path names. It is often useful to make your PATH_INFO a virtual path so that the server provides a physical path name in this variable. This way, you can avoid giving file system path names to remote client software.

AUTH_TYPE
If the CGI script is protected by any type of authorization, this environment variable contains the authorization type. Apache web servers support HTTP basic and digest access authorization.

REMOTE_HOST
This environment variable contains the host name of the remote client software. This is a fully-qualified domain name such as www.dbase.com (instead of just www, which you might type within your intranet). If no host name information is available, your program should use the REMOTE_ADDR variable instead.

REMOTE_ADDR
This environment variable contains the IP address of the remote host. This information is guaranteed to be present.

REMOTE_USER
This environment variable is set to the name of the local HTTP user of the person using the browser software only if access authorization has been activated for this URL. Note that this is not a way to determine the user name of any person accessing your program.

HTTP_HOST
Contents of the Host: header from the current request, if there is one.

HTTP_ACCEPT
This environment variable enumerates the types of data the client can accept. For most client software, this protocol feature has become a bit convoluted and the information isn't always useful.

HTTP_ACCEPT_CHARSET
Contents of the Accept-Charset: header from the current request, if there is one.

HTTP_ACCEPT_LANGUAGE
Contents of the Accept-Language: header from the current request, if there is one. The user can change this value in the browser's options by choosing a preferred language.

HTTP_CONNECTION
Contents of the Connection: header from the current request, if there is one.

HTTP_REFERER
The address of the page (if any) which referred the browser to the current page. This is set by the user's browser; not all browsers will set this.

HTTP_USER_AGENT
Contents of the User-Agent: header from the current request, if there is one. This is a string denoting the browser software being used to view the current page.
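
The naming rule for the HTTP_ variables, described just before this list, can be sketched as a small function. The name HeaderToEnvName is hypothetical; it is not part of dBASE or of the WebClass library.

```dbl
// Hypothetical helper: convert an HTTP header name such as
// "Accept-Language" into the matching CGI variable name.
function HeaderToEnvName(cHeader)
   local cName, i, c
   cName = "HTTP_"
   for i = 1 to len(cHeader)
      c = substr(cHeader, i, 1)
      // hyphens become underscores, everything else is upper-cased
      cName += iif(c == "-", "_", upper(c))
   endfor
return cName
```

For example, HeaderToEnvName("Accept-Language") returns "HTTP_ACCEPT_LANGUAGE", which can then be passed to getenv().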

Submitting Data

There are two primary ways in which a remote client can submit data to a CGI program. The first way is to use an Anchor tag. This tag lets you define a hypertext link that the user can click to display a document. You define a hypertext link by using the <A> tag with an HREF attribute to indicate the start of the hypertext link, and use the </A> tag to indicate the end of the link. When the user clicks any content between the <A HREF> and </A> tags, the link is activated. The value of the HREF attribute must be a URL or a virtual path. If you want the link to open a new document, the value of HREF should be the URL or virtual path for the destination document.

   
   http://localhost/app/myApp.exe  
   /app/myApp.exe
   

Using a hyperlink to execute a CGI program utilizes the GET method and populates the QUERY_STRING environment variable. Using the query string is the simplest way of passing data to a CGI program. If you append a question mark (?) to the URL for your applet, then any characters after the question mark will be passed to your applet in the QUERY_STRING environment variable.

You can use the following hyperlink to run the ShowEnv applet and see the values of the environment variables. You will see that "GET" is the value for the REQUEST_METHOD environment variable and "name=value&name1=value1" is the value for the QUERY_STRING environment variable.

http://localhost/app/showEnv.exe?name=value&name1=value1


   REQUEST_METHOD = GET 
   SCRIPT_FILENAME = c:/Apache2/htdocs/app/showenv.exe 
   SCRIPT_NAME = /app/showEnv.exe 
   CONTENT_TYPE = 
   CONTENT_LENGTH = 
   QUERY_STRING = name=value&name1=value1 
   REQUEST_URI = /app/showEnv.exe?name=value&name1=value1
 

In a dBASE CGI program, you can obtain the data in the QUERY_STRING with code like the following.

   
   cMethod = upper(getEnv("REQUEST_METHOD"))
   if cMethod == "GET"
      cEnv = getEnv("QUERY_STRING")
   endif
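
The query string arrives as one string of name/value pairs. The following is a minimal sketch of splitting it apart, assuming the pairs are separated by & and each name is separated from its value by =, as in the example above. ParseQuery is a hypothetical helper, and the values it returns are still URL-encoded.

```dbl
// Hypothetical helper: split "name=value&name1=value1"
// into an associative array keyed by name.
function ParseQuery(cData)
   local aPairs, cRest, cPair, nAmp, nEq
   aPairs = new AssocArray()
   cRest = cData
   do while len(cRest) > 0
      // cut off the next pair at the "&" (or take what is left)
      nAmp = at("&", cRest)
      if nAmp == 0
         cPair = cRest
         cRest = ""
      else
         cPair = left(cRest, nAmp - 1)
         cRest = substr(cRest, nAmp + 1)
      endif
      // split the pair at the "="; pairs without one are skipped
      nEq = at("=", cPair)
      if nEq > 0
         aPairs[left(cPair, nEq - 1)] = substr(cPair, nEq + 1)
      endif
   enddo
return aPairs
```

A program could call aData = ParseQuery(getEnv("QUERY_STRING")) and then read aData["name"]. Note that the keys of an associative array are case-sensitive.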

Although passing data through the query string is among the easiest ways to submit CGI data, there are a few drawbacks to this approach. First, there is a limit on the length of the query string. The query string travels as part of the URL, and browsers and web servers each cap the length of the URL they will accept. The HTTP/1.1 specification cautions against depending on URLs longer than 255 bytes, because some older clients and proxies may not support them. So you can't rely on passing very much data this way.

Second, when you place data in the url yourself, you are responsible for url-encoding. As we noted in Session One, spaces must be converted to + signs, and punctuation characters must be escaped with the % sign and hexadecimal digits.
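
The encoding rule just described can be sketched as follows. UrlEncode is a hypothetical helper, not a built-in function. The sketch assumes single-byte characters (codes below 256) and escapes everything except letters and digits, which is more than strictly necessary but harmless.

```dbl
// Hypothetical helper: URL-encode a string for use in a query string.
// Assumption: every character code is below 256.
function UrlEncode(cText)
   local cOut, cHex, i, c, n
   cHex = "0123456789ABCDEF"
   cOut = ""
   for i = 1 to len(cText)
      c = substr(cText, i, 1)
      n = asc(c)
      do case
      case c == " "
         cOut += "+"            // spaces become plus signs
      case (n >= 48 and n <= 57) or (n >= 65 and n <= 90) or (n >= 97 and n <= 122)
         cOut += c              // digits and letters pass through
      otherwise
         // everything else becomes % plus two hexadecimal digits
         cOut += "%" + substr(cHex, int(n / 16) + 1, 1) + substr(cHex, mod(n, 16) + 1, 1)
      endcase
   endfor
return cOut
```

For example, UrlEncode("a b&c") returns "a+b%26c", so an ampersand inside a value is no longer mistaken for a pair separator.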

Third, the url, including the query string, is collected in the access logs maintained at the server. If your access logs are public, you may not object to having your hits recorded, but your data might contain information you'd prefer not to expose.

The CGI POST

Because of these limitations, another method of transmitting data to a CGI program was developed, and is now the most common and recommended method. The POST method sends data to a program's Standard Input. It's less public (it's not reported in the server logs), the web browser automatically encodes the data, and in principle there are no length limitations. On the other hand, you cannot supply POST data to a program directly in the url, as can be done with the query string or path info. You must use a web form to POST data.

You can submit the following form to the showEnv applet to see the values associated with the relevant environment variables. You will see that "POST" is the value for the REQUEST_METHOD environment variable and that the QUERY_STRING environment variable is empty. The form tag for this example is the following:

<FORM METHOD=POST ACTION="http://localhost/app/showEnv.exe">


   
   DOCUMENT_ROOT = C:/Apache2/htdocs
   REQUEST_METHOD = POST 
   SCRIPT_FILENAME = C:/Apache2/htdocs/app/showenv.exe 
   SCRIPT_NAME = /app/showEnv.exe 
   CONTENT_TYPE = application/x-www-form-urlencoded 
   CONTENT_LENGTH = 27 
   QUERY_STRING = 
   REQUEST_URI = /app/showEnv.exe 

HTML forms that use the POST method send their encoded information through standard input. To access this data in a dBASE program you must use a file object and read the data as if it were stored on the hard drive. To determine the number of bytes to read, you can use the CONTENT_LENGTH environment variable. In the above example this value is 27. The dBL code to read this information might look like the following:

   
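   cMethod = upper(getEnv("REQUEST_METHOD"))
   if cMethod == "POST"
      // CONTENT_LENGTH gives the number of bytes waiting on standard input
      nLen = val(getEnv("CONTENT_LENGTH"))
      fIn  = new file()
      fIn.Open("StdIn", "RA")
      cEnv = fIn.Read(nLen)   // the raw, still URL-encoded form data
      fIn.Close()
   endif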
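
The data read from standard input, like the query string, is still URL-encoded. The following sketch reverses that encoding. UrlDecode is a hypothetical helper; it relies on the built-in htoi() function to turn two hexadecimal digits into a number.

```dbl
// Hypothetical helper: decode "+" and "%XX" sequences in CGI data.
function UrlDecode(cText)
   local cOut, i, c
   cOut = ""
   i = 1
   do while i <= len(cText)
      c = substr(cText, i, 1)
      do case
      case c == "+"
         cOut += " "            // plus signs stand for spaces
      case c == "%" and i + 2 <= len(cText)
         // the two characters after "%" are a hexadecimal character code
         cOut += chr(htoi(substr(cText, i + 1, 2)))
         i += 2
      otherwise
         cOut += c
      endcase
      i++
   enddo
return cOut
```

For example, UrlDecode("Hello+World%21") returns "Hello World!". Decoding should happen after the data has been split into name/value pairs, so that an encoded & or = inside a value is not treated as a separator.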
		 

The WebClass Library

The dBASE WebClass library makes working with the data stuffed in environment variables quite easy. Although we will explore the details of this library in the Sessions that follow, here we should point out that this library does much of the work we have been reviewing. Normally, your CGI application will call a method named "Connect()" from the WebClass library. This method performs a number of important tasks.

First the method establishes a communication channel with Standard In. Then it determines whether the CGI request is a GET or POST method and, based on that determination, reads the incoming data. Because the data is URL encoded, the WebClass library then decodes the CGI escape characters embedded in the submitted data.

Next, all the name/value pairs passed in this data are added to an associative array, which you can easily access in your CGI program. Finally, the Connect() method opens a communication channel with Standard Out so that you have a route to send your response page.

There is no doubt that the WebClass library saves the developer a good deal of time by reading and formatting the data sent to your CGI program.
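
As a rough preview, the steps above might look like this from the calling program's side. This is only a sketch: the class name CGISession and the array-style access to the form data are assumptions about the library, and the field name "name" is hypothetical. The Sessions that follow are the authority on the details.

```dbl
// Sketch only. Assumes WebClass.cc is available to the program and that
// the submitted form has a field named "name" (hypothetical).
set procedure to WebClass.cc additive

oCGI = new CGISession()   // assumed class name; check your copy of WebClass.cc
oCGI.Connect()            // reads and decodes the GET or POST data

cName = ""
if oCGI.isKey("name")
   cName = oCGI["name"]   // decoded value of the form field
endif
```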

Path Info

Another method of sending data to a CGI program is through the PATH_INFO environment variable. Similar to the query string, path info is whatever comes after the program name in the url. You need to start the path info with a slash (/) to let the web server know where the program name ends.

Try the following hyperlink. It contains both path info and a query string.

http://localhost/app/showEnv.exe/uploads/dir1/?name=value&name1=value1

    
   REQUEST_URI = /app/showEnv.exe/uploads/dir1/?name=value&name1=value1
   PATH_INFO = /uploads/dir1/ 
   QUERY_STRING = name=value&name1=value1
   SCRIPT_NAME = /app/showEnv.exe 
   DOCUMENT_ROOT = C:/Apache2/htdocs 
   PATH_TRANSLATED = C:\Apache2\htdocs\uploads\dir1 

You can see from the output that "/uploads/dir1/" is listed as the value of the PATH_INFO environment variable, and that "name=value&name1=value1" is the query string. You must supply the path info first and the query string second in the URL.

Path info need not be a path to any particular file or directory. You can use it to convey constant information to your CGI programs independent of the information the client sends. Nonetheless, carrying a path was its original intent and is still its most common use. For example, many hit counter scripts are installed for system-wide use. If one program serves many users, the program must be told which page is to be counted. Path info will often be the (file system or URL) path to the particular file that contains the current count. Path info has the same disadvantages as the query string: it is not automatically URL-encoded, it is subject to the same length limitations, and it is reported in the server logs.

You can also use the extra path info to access the web server's virtual-to-physical path translation. You can send a virtual path as extra path information, so your CGI program can use the path information to access a file on your server machine. This means the server provides the physical path name corresponding to that virtual path in the environment variables by using path translation. Notice in the above printout that the PATH_TRANSLATED environment variable is the concatenation of the DOCUMENT_ROOT and PATH_INFO variables (with the slashes converted to backslashes on a Windows server).
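
To make the translation concrete, the following sketch rebuilds the physical path by hand so it can be compared with the value the server supplies. It assumes a Windows server, as in the printout above; the variable names are arbitrary.

```dbl
// Sketch: join DOCUMENT_ROOT and PATH_INFO, then convert the
// forward slashes to backslashes, as a Windows server would.
cVirtual  = getenv("DOCUMENT_ROOT") + getenv("PATH_INFO")
cPhysical = ""
for i = 1 to len(cVirtual)
   c = substr(cVirtual, i, 1)
   cPhysical += iif(c == "/", "\", c)
endfor

// The value computed by the server, for comparison
cFromServer = getenv("PATH_TRANSLATED")
```

With the values in the printout above, cPhysical becomes C:\Apache2\htdocs\uploads\dir1\ which differs from the server's PATH_TRANSLATED only by the trailing backslash. In a real program it is simpler and safer to use PATH_TRANSLATED directly.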

It turns out that you can add extra path info (and a query string) to a URL used to submit a form. The next example form uses METHOD=POST (so the form data is passed to the CGI program through standard input), but it also includes extra path info in the ACTION attribute of the FORM tag.

<FORM METHOD=POST ACTION="http://localhost/app/showEnv.exe/uploads/dir1/">


  
   DOCUMENT_ROOT = c:/Apache2/htdocs 

   REQUEST_METHOD = POST 
   SCRIPT_FILENAME = c:/Apache2/htdocs/app/showenv.exe 
   SCRIPT_NAME = /app/showEnv.exe 
   CONTENT_TYPE = application/x-www-form-urlencoded 
   CONTENT_LENGTH = 27 
   QUERY_STRING = 
   REQUEST_URI = /app/showEnv.exe/uploads/dir1/ 

   PATH_INFO = /uploads/dir1/ 
   PATH_TRANSLATED = c:\Apache2\htdocs\uploads\dir1


Using Server Authentication

The most common way to screen web clients with Apache is to use basic HTTP authentication. Server authentication can act as a wall to keep unauthorized people out of your sites. It can also be used to customize a site by adding user-specific information to the web pages. To control access to a folder on your web site you must include the Apache User Authentication Directives within a directory block. The core configuration directives are as follows:

   
   <Directory "c:/apache/htdocs/app">
      AuthType Basic 
      AuthName "My Important Site" 
      AuthUserFile "c:/apache/users/users.txt" 
      require valid-user 
   </Directory>
  

The AuthType directive tells Apache to use basic HTTP authentication. This system is pretty simple. Usernames and passwords are stored in a text file that resides on the same computer that is running the web server. Moreover, basic authentication uses no encryption, so passwords are sent essentially as plain text (they are only Base64-encoded).

The AuthName directive assigns a name to the area being protected. You can use any name you want for this area. This text will be sent to the web browser and displayed in the logon dialog box.

The AuthUserFile directive sets the path to the password file. This file must be a simple text file with one username and password per line, in the format "username:password". You should add a blank line after the list of names. Apache on Windows accepts the plain-text passwords shown here; on most other platforms the passwords must be hashed, for example with Apache's htpasswd utility. Also note that it is a good idea to locate this file outside of your server's web space. A sample file might look like this:

Alan:CTO
Romain:Master
Geoff:grasshopper

The require directive (note the case) lets you specify which users are allowed access to a protected area. The directive has three valid arguments: user, valid-user, and group. In the sample above, we use the valid-user argument which admits all users whose names are listed in the password file.

The Apache Basic Authentication system also supports organizing users into groups. To implement this feature, create a text file with the following format:

   
   groupname: username1 username2 username3
 

Use the AuthGroupFile directive to set the path to this file:

   
   AuthGroupFile "c:/apache/users/groups.txt"
 

Finally, use the group argument in the require directive:

   
   require group groupname
    

The following two examples show directives, placed inside a directory block, that assign groups. In the first example, only users who belong to the group "myGroup" can access the folder. In the second example, any member of "myGroup" can access the folder, and so can the user "nuwer", whether or not that user is a group member.

Example One

   
   AuthType Basic 
   AuthName "My Important Site"
   AuthUserFile "c:/apache/users/users.txt"    
   AuthGroupFile "c:/apache/users/groups.txt" 
   require group myGroup
   

Example Two

   
   AuthType Basic 
   AuthName "My Important Site" 
   AuthUserFile "c:/apache/users/users.txt"    
   require user nuwer 
   AuthGroupFile "c:/apache/users/groups.txt" 
   require group myGroup
   


Authorization Error Message. You can customize any of the error messages that Apache handles with the ErrorDocument directive. By including this directive within the directory block, you can create a different customized error page for each secure folder on your server. The following example returns /ErrorFiles/Error401.htm to the browser when the user fails to authenticate.

   
   ErrorDocument 401 /ErrorFiles/Error401.htm

Notice that this page is located in a non-protected folder on the server. Because the user has failed the authentication process, the web server will not serve any files or documents from the secure folder. This can, however, be confusing. Even though the user does not have access to the files in the protected folder, the browser still treats that folder as the base for relative links. So if your error document contains images or other objects that are not referenced with a full path, the HTML page will not be able to load them.

CGI Environment Variables. One of the standard CGI environment variables is REMOTE_USER. Once a user has been authenticated by the web server, this variable contains the user's username. Moreover, the web server passes this variable to your program each time it is called. Thus it is easy to grab this variable and use it to look up information about this user.

   
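   // REMOTE_USER holds the authenticated username
   cLookFor = getenv("REMOTE_USER") 
   q = new query() 
   q.sql = 'Select * from "c:\apache\data\users.dbf"'    
   q.active = true 
   // Assumes an index tag named "username" keyed on the upper-case username
   q.rowset.indexName := "username" 
   // findKey() returns true when a matching row is found
   lFound = q.rowset.findkey(upper(cLookFor))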
    


You can now customize your pages; for example, you can print the user's full name at the top of the page.
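
A minimal sketch of that customization follows. It repeats the lookup so that it can be read on its own. The field name "fullname" is hypothetical, and the table path and index are the same assumptions used in the lookup code above.

```dbl
// Sketch: build a greeting from the authenticated user's record.
// Assumption: users.dbf has a character field named "fullname" (hypothetical).
cLookFor = getenv("REMOTE_USER")
q = new query()
q.sql = 'Select * from "c:\apache\data\users.dbf"'
q.active = true
q.rowset.indexName := "username"

if q.rowset.findKey(upper(cLookFor))
   cFullName = trim(q.rowset.fields["fullname"].value)
else
   // No matching row: fall back to the login name itself
   cFullName = cLookFor
endif

cGreeting = "<h2>Welcome, " + cFullName + "</h2>"
q.active = false
```

The cGreeting string can then be written at the top of the response page, by whatever means your program uses to send its output.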



The Legal Stuff: This document is part of the dBASE onLine Training Program created by Michael J. Nuwer. This material is copyright © 2001, 2003 by Michael J. Nuwer. dBASE is copyrighted, trademarked, etc., by dBASE, Inc. This document may not be posted elsewhere without the explicit permission of the author, who retains all rights to the document.