- The Original Question
At 12:30 PM 8/8/96, Seldolivaw wrote:
Okay, now although I know when I ask these questions I'm going to get a lot
of opinion, I also want some hard facts. My questions:
We now have a large range of languages available to us with which to
process forms. I am no good in any of them and have started learning all of
them along with a couple friends of mine. They have reached the consensus
that now that Java and Javascript are here we no longer need CGI. This can't
be true can it? Isn't Java just another language in which to write CGI? Is
it still necessary to learn CGI techniques if you know J or JS?
- Apples vs. Oranges - Or Tuning Forks
I find it helpful when looking at this type of question to first consider what
each of the alternatives can do, so you know when you're comparing
"apples and apples." (Or in this case, an apple, an orange, a banana,
and a tuning fork...<smile>)
- The Two Sides of a Web Interaction: Client and Server
The first thing to remember is that every Web interaction has two sides: the
Client (the reader with a browser who is coming to your page) and the Server
(the place where your homepage lives, which includes both the machine and the
software that knows how to answer calls from the Internet, find your page, and
"serve it up").
In a plain vanilla Web page, the client "links to" your URL. When that request
comes to the Server, it responds by sending over the HTML file, and the client
displays it.
Where we get into all this other exciting stuff is when we want to make that
process more interactive. We want the Client to be able to talk
back to the Server: perhaps to send in information on a form and
have it stored in a file on the Server. Or we may want to get
information from the Server's database files, and display it back to the
Client.
So we have two machines, and two separate complex software environments. The
reader's machine is running a browser (this is the Client). The host's machine
is running a Web server (this is the Server).
Just as your machine can also run other programs - like wordprocessing, graphics,
etc. - and you may even have these up and running while you're doing your
browsing, the host machine can also be running other programs.
When we get into interaction, we usually mean we want the client to request that
something happen on the server. That "something" happens by running other
programs besides the Webserver itself.
OK, now that we see the two sides of the Web interaction, let's look
at where the languages you mentioned might fit in.
- Javascript: Client Side
Javascript is a scripting language which allows you to execute code on the
Client side of a Web interaction. That is, it allows you to do
some things on the reader's machine. You can't do too much with files, due to
concerns about security and resource access, but you can do lots with the
display of images. It's very useful, then, for controlling the
display of the screen, so it's good for animation and all that sort of thing.
You can also do some nice things with client-side validation in forms. It's
specific to Web applications. VBScript is very similar, but is based on Visual
Basic and is specific to Microsoft's Explorer. (Note that if the reader doesn't
have a browser that can run your script, it doesn't matter what you put on your
homepage: it's not going to work for them. This is because the
Javascript/VBScript runs on their machine, not on your host
machine.)
Server-Side Javascript
Netscape has recently added the ability to run some Javascript on the
Server-side for some of their Webservers. However, that's not what most Web
authors are talking about when they discuss Javascript - and the Javascript that
you write into a Webpage is most likely going to run on the Client-side.
- Java: Your Choice: Client-Side --Or-- Server-Side
Java is a platform-independent language which can run on either the client or
the server side of a Web interaction. It's a full programming language, but
again there are restrictions as to what you can do to the reader's files, for
security reasons. So on the client side, you can do all the same types of
display things you can do with Javascript, only with much more power. (You also
have a much steeper learning curve.) On the server side, if the Webserver offers
Java support, or on a client that runs its own Java Machine, you can do pretty
much anything you can do with C or C++--it's a full scale language. This is why
there are _browsers_ written in Java, as well as browsers that _run_ Java.
There are also a number of other platform-independent Java programs that have
nothing to do with the WorldWideWeb.
(Note that Java programs come in two flavors: applets, which run in Web
browsers and have their file access capabilities crippled for security reasons,
and Java applications, which are regular programs and would probably run on the
server.)
- CGI: Server-Side ("Mother May I?")
After the original Server HTTP specification was written, the authors found that
people wanted servers to do more than just serve up Web pages for display. In
particular, they wanted the Web Server to be able to access data files on the
server machine, and to save information there as well. However, almost all of
these activities are extremely platform-dependent. Moreover, the code and
algorithms for doing these things already existed in programming languages. So
CGI was born as a way for an HTTP Webserver to send information in a standard
format to a program that would run on the server, and then give information
back. (CGI stands for Common Gateway Interface).
CGI is a specification that says that the server will hand input to a program in
a specific format, and expect to get output returned in a specific format. What
the program does in between input and output is up to the capabilities and
limitations of its programming language.
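To make "input in a specific format, output in a specific format" concrete, here is a minimal sketch in Python (any language your server supports would do). It assumes a GET request, where the server places the text after the "?" in the QUERY_STRING environment variable. The field called "name" and the function build_response are made up for illustration.

```python
import html
from urllib.parse import parse_qs


def build_response(environ):
    """Build CGI output (header, blank line, body) from the server's variables."""
    # Input side: for a GET request the server puts the form data in QUERY_STRING.
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field, used here only as an example.
    name = fields.get("name", ["stranger"])[0]
    body = "<HTML><BODY><P>Hello, {}!</P></BODY></HTML>".format(html.escape(name))
    # Output side: a header line, one blank line, then the document itself.
    return "Content-Type: text/html\n\n" + body
```

A real script would finish by writing build_response(os.environ) to standard output, which is where the Webserver picks the answer up.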
Four Requirements For CGI
In order to use CGI so that the client can request that something
happen on the server, you need four things:
- A trigger, which is either a URL that points to a program that
understands the CGI format, or a Server Side Include which executes
a program which understands the CGI format;
- A Server that knows what to do with a CGI request, and will format
the information according to the CGI standard and then pass it to the
specified program;
- A program which has been written in such a way that it can
communicate with the Webserver (that is, it needs to recognize the
standard CGI way of getting input and output);
- Permission to run that program.
CGI allows your client to send a request to the server that a program on the
server be run. This program is the CGI script. The CGI Script is an executable
program (it can be written in any of several languages) that is capable of
understanding the CGI request passed to it from the Server.
Two Steps For Running A CGI Script
Note that even once you meet these four requirements, executing each CGI script
is still a two step process: first your browser asks the server to execute the
script, either through a URL reference or through an HTML Form's ACTION
attribute.
Then the server takes your request and passes it along to the CGI script.
Your server will also set the "environment variables" that your CGI script can
check as it is running.
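The server's half of that second step can be imitated in a few lines. This is only a sketch of the idea, not how any particular Webserver is written: a stand-in script is kept inline so the example is self-contained, and run_cgi is a made-up helper. REQUEST_METHOD, QUERY_STRING, and GATEWAY_INTERFACE are standard CGI variable names.

```python
import os
import subprocess
import sys

# A stand-in CGI script, kept inline so the sketch is self-contained.
SCRIPT = (
    "import os\n"
    "print('Content-Type: text/plain')\n"
    "print()\n"
    "print('method=' + os.environ['REQUEST_METHOD'])\n"
    "print('query=' + os.environ['QUERY_STRING'])\n"
)


def run_cgi(method, query):
    """Start the script roughly as a server would and split up what it prints."""
    env = dict(os.environ)  # keep the normal environment, then add CGI variables
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": query,
    })
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT],
        env=env, capture_output=True, text=True, check=True,
    )
    # The first blank line separates the headers from the body.
    headers, _, body = result.stdout.partition("\n\n")
    return headers, body
```

Calling run_cgi("GET", "color=red") hands the script its request through the environment, which is the same channel a real server uses.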
In this sense, CGI includes a "Mother May I?" step which is not included in a
typical standalone program. This is the reason so many HTML authors pull
their hair out when first running CGI scripts; the script is
there, it runs by itself in test, but for some reason the Webserver won't
execute it.
This is also important in terms of our discussion because
when using CGI in your page it means that:
- Your reader's browser must know what to do with the CGI request (critical for forms support);
- Your Webserver must know what to do with the CGI request;
- Your executable program must know how to handle CGI input and output formats;
- You must have permission to actually run the executable.
A program that can handle CGI formats is called a "CGI script," whether it is
written in a scripting language like Perl or a full programming language like
C.
Common Uses Of CGI: Form Actions
CGI's most common use would be for a client to ask a program that runs on the
Server to either retrieve or store data on the server, then take that
information and pass it back over for display. So a CGI script could, for
example, run a counter and _save_ the results on the server. It could then
format these for display.
(Since clientside Javascript runs on each client's machine, it can't
do a counter showing how many different machines have come to
the server: you need to save something on the server side for that.)
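A counter of that kind might look like the following sketch. The file location and the function names are assumptions for illustration, and the code does no file locking, so two visitors arriving at the same instant could be miscounted. A busy site would need something sturdier.

```python
import os
import tempfile

# Hypothetical location for the saved count; a real host would pick its own.
COUNT_FILE = os.path.join(tempfile.gettempdir(), "hits.txt")


def bump_counter(path=COUNT_FILE):
    """Read the saved count, add one, write it back, and return the new value."""
    try:
        with open(path) as f:
            count = int(f.read().strip() or "0")
    except FileNotFoundError:
        count = 0  # first visitor: nothing has been saved yet
    count += 1
    with open(path, "w") as f:
        f.write(str(count))
    return count


def counter_page(path=COUNT_FILE):
    """Format the count as CGI output: a header, a blank line, then HTML."""
    return "Content-Type: text/html\n\n<P>You are visitor number {}.</P>".format(
        bump_counter(path)
    )
```

The important part is that the number lives in a file on the server, so every client sees the same running total.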
A CGI script could ask a database on the Server machine (or another machine
accessible to it) for numbers from a company's information file, then it could
reformat that into a pretty display to pass back to the Webserver to give to
the Web client.
One of the most common uses of CGI is to take the information from an HTML
<FORM...> tag and pass it to a SERVER side program for processing, using
the ACTION attribute. For example:
<FORM METHOD=POST ACTION="http://www.foo.com/cgi-bin/post-query">
is an HTML tag which will cause the CGI script post-query to be run when the
user submits the form from the HTML page. Note that this requires action on the
server's part. The browser sends in the information in the form; the server
starts the CGI script named in the ACTION. The CGI script processes the
information.
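What "processes the information" means for a METHOD=POST form can be sketched like this. With POST, the server passes the form data to the script on standard input and reports its size in the CONTENT_LENGTH variable. The read_form helper is made up for this example; the field names in the usage note are hypothetical too.

```python
import io
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Return the submitted form fields as a dict mapping names to value lists."""
    if environ.get("REQUEST_METHOD") == "POST":
        # POST: the data arrives on standard input; CONTENT_LENGTH says
        # how many bytes to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        # GET: the data arrives in the QUERY_STRING variable instead.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


# Simulate what the server would hand to the script for a small form.
posted = b"name=Ada+Lovelace&topic=CGI"
fields = read_form(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(posted))},
    io.BytesIO(posted),
)
```

In a live script the two arguments would be os.environ and sys.stdin.buffer rather than the simulated values used here.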
Since CGI scripts can create files, they can also create HTML files which can
then be passed back to the original reader. This is a common method of getting
information off the server and passing it back to the client. It is also used to
create HTML pages "on the fly."
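Building a page "on the fly" is mostly string handling. The sketch below turns a list of label/value pairs (standing in for whatever the script pulled from a file or database) into an HTML table. The render_page name and the sample data in the usage note are assumptions for illustration.

```python
import html


def render_page(title, rows):
    """Build a complete HTML page from a title and (label, value) pairs."""
    lines = [
        "<HTML><HEAD><TITLE>{}</TITLE></HEAD><BODY>".format(html.escape(title)),
        "<TABLE>",
    ]
    for label, value in rows:
        # Escape everything so stray "<" or "&" characters can't break the page.
        lines.append(
            "<TR><TD>{}</TD><TD>{}</TD></TR>".format(
                html.escape(str(label)), html.escape(str(value))
            )
        )
    lines.append("</TABLE></BODY></HTML>")
    return "\n".join(lines)
```

For example, render_page("Sales", [("Q1", 120), ("Q2", 95)]) returns a small page that a CGI script could send back after its Content-Type header.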
- Server-Side Programming Languages
PERL
PERL (which stands for Practical Extraction and Report Language) is a scripting
language which runs on UNIX. It's especially good at handling text strings, and
the CGI method of handling input tends to make it look like one long text
string, so it's a good match for CGI. (Remember that CGI scripts run on the
server side of a Web interaction, so your readers don't have to have PERL on
their machines: it will run on the machine where your Web page resides.)
Unlike Java, Perl is not specifically designed to be platform-independent.
It's a nice text-handling language, quick to learn, very suitable for the types
of things that most Web applications want to have done on the Server side,
although there are some issues with speed. It runs better under UNIX than
WinNT, and doesn't run at all on some other platforms, but it's fairly easy to
learn, quite powerful, and useful. Note that when you run a PERL script for a
Web application, it's probably as a CGI script: your HTML file uses CGI to
issue a Mother May I? to the Webserver, which in turn starts the Perl program,
using CGI format to pass it the information from the client. The Perl program
runs as a CGI script, so it has to be able to handle its input and output in CGI
format. If it issues its output as an HTML file, it can be displayed back to
the user, or the HTML file may be written to use a Server Side Include to handle
the output.
In any case, Perl is a scripting language that is used for many things besides
the Web. When you include CGI format commands in it, it can talk to a Webserver
that supports CGI, and you can then write a CGI script in Perl.
Other Server-Side CGI Script Options: TCL, C, Visual Basic, Applescript
Perl isn't the only choice. Mac users have long favored Applescript for CGI
scripts. O'Reilly's server allows WinNT users to run Visual Basic as CGI
scripts. And C and Java are also sometimes implemented with CGI when full
programming power is needed. The only thing the programming language has to be
able to do is to handle input and output using the CGI format. Once that's
done, put it in the right directory with the right name and the right
permissions, and you will usually be able to run it as a CGI script. Compiled
programming languages like C++ tend to be harder to learn but usually run faster
and are often smaller than interpreted languages like Perl and Visual Basic.
- Server Side Includes
Server Side Includes (SSIs) are closely related to CGI in that they are
instructions to the Webserver for performing special functions. That means not
all Servers support them, and not all directories will have permission to use them.
Server Side Includes basically tell the Webserver to take your HTML file and
STOP when it gets to an SSI line, do something, and then continue.
In order to do this, the Webserver has to read your file line by line (really,
character by character) before it passes it on to the client. This is called
"parsing," and if it sounds slow, it is. For that reason, many systems are NOT
set up to parse every single web page. Instead, your webmaster will ask you to
use a special naming convention (most commonly .shtml instead of .html, although
it can be anything the webmaster wants to use), and then only files with that
particular kind of name will be parsed.
All SSIs are included inside an HTML comment, using a pound sign to clue the
Server that the SSI is coming up. Typical would be
<!--#exec cgi="/cgi-bin/countme.cgi" -->
or
<!--#include file="newfile.txt" -->
#INCLUDE
One of the most useful SSIs is the #INCLUDE, which lets you include a file
inside your HTML file. Remember, this will happen on the server side, before the
page is passed over to the client. This is an excellent way to improve the
management of large sites, or to handle things like "today's weather
report."
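To show what the server is doing when it parses a page for #INCLUDE, here is a toy version of that step. It is a sketch of the idea only: real Webservers handle more directives and apply their own security rules, and expand_includes is a made-up name.

```python
import os
import re
import tempfile

# Matches lines such as <!--#include file="newfile.txt" -->
INCLUDE = re.compile(r'<!--#include\s+file="([^"]+)"\s*-->')


def expand_includes(html_text, base_dir):
    """Replace each #include directive with the contents of the named file."""
    def replace(match):
        path = os.path.join(base_dir, match.group(1))
        with open(path) as f:
            return f.read()
    return INCLUDE.sub(replace, html_text)


# Set up a scratch directory holding "today's weather report".
site_dir = tempfile.mkdtemp()
with open(os.path.join(site_dir, "newfile.txt"), "w") as f:
    f.write("Sunny, 72F")

page = '<P>Weather: <!--#include file="newfile.txt" --></P>'
served = expand_includes(page, site_dir)
```

The client only ever receives the expanded text in served; the directive itself never leaves the server.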
#EXEC
This allows you to execute a CGI script at this point in your file. The Server
will wait for the results of the CGI script to run before continuing to parse
your page, which can make it possible to construct pages as you go. Note that
the use of #EXEC requires that you meet all the requirements for using SSI _and_
all the requirements for running a CGI script.
(Technical side note: security is implemented somewhat differently for SSIs and
the cgi-bin directory itself, so it is not uncommon to find a site that will
allow some CGI scripts, but NOT allow #EXEC. Check with the Webmaster at
your host to see what you're allowed to use, and how.)
ONE CAUTION WHILE LEARNING SSIs and CGI
For the Web author, it makes perfect sense that SSIs and CGI work together.
However, many texts that cover CGI will plunge you almost immediately into a
programming language. You need to remember that Server Side Includes are
special commands for use by a Webserver, and reside inside your HTML file, not
your program file. Many programming languages have their own commands like
"include," "echo," "exec"--it's important not to confuse these commands, which
are used within your executable CGI script file for the program's use, with the
SSIs like "#include," "#echo," and "#exec", which are placed within the HTML
file for the Webserver's use.
- How Client-Side And Server-Side Programs Can Work Together
Now, these things can all work together. Client-side Javascript can be used to
improve processing time and validation, as well as do animations, over on the
CLIENT side (in the reader's browser). It could pass along this prevalidated
form result to a CGI script which runs on the server, and hands off the results
to a PERL program to store in a file on the server. Then the server could get
the results back and pass them over to the Client, where another client-side
Javascript could put up some pretty processing messages. (Remember that
client-side Javascript doesn't touch the server, so you can't use it to save or
get things from the server: just to handle the collection of the information
from the client.)
Or, you could use Java on the Client side AND Java on the Server side, which may
be what's confusing your friends; but the two would be independent
implementations. If a client didn't have Java enabled on the browser, they
could still make use of the Java programs on the server. And it's NOT the Java
programs on the client that touch the server database: you just happened to use
the same language to handle both sides--NOT the same Java program.
You can use CGI as a format to do a Mother May I? on just about any
program that can run on your server. Common choices are Perl, C,
Visual Basic, Lisp, Ada, AppleScript and so on. Remember, these
programs must be installed and available on the server--not on the
client's machine. If you have a UNIX host, you can't run VB from a
CGI script, even though your browser runs on a PC--the programs using
CGI run on the server.
- The Tuning Fork: The ActiveX Environment
That brings us to the tuning fork that you didn't mention: ActiveX. ActiveX is
a wrapper environment that, like Java, has both a client side and a server side
implementation. ActiveX can do display type things; it can do CGI type things.
It can also use OLE objects to run actual programs on the client side. It can
use .asp files to cause events to happen on the server side.
This is both good and bad: it's wonderful functionality, and horrible security.
Great for intranets, and scary for the open Web. But it's a very different way
of doing things, because it's going to allow the server to request that actions
happen on the client. (Remember, up until now the main thing the Server did on
its side of the interaction was to pass things over for the Browser to display.
This started to change with plug-ins, which allowed actual programs to run on
the client side. ActiveX takes that and pushes it to the limit.)
The biggest issue with ActiveX aside from security is that in its initial form
it's going to be largely limited to high powered 32 bit WinTel environments; you
won't see it running on Macs or UNIX for a while, and it may never have full
functionality for Win 16. For developers, it lets you move a desktop to the
Web, but the cost of that is the loss of platform independence: you're going to
have to go back to developing a Win95 version, and a Mac version, and a UNIX (in
all its many flavors) version, and so on. This is a big difference from Java.
With Java, it's "write once, right everywhere." With ActiveX, it's write once
for Win95, write once for Mac, write once for each UNIX flavor, and wrong
everywhere else. The upside of the tradeoff is that ActiveX will be able to use
the full power of that one environment. So you can do cooler things, but for
fewer people (with much higher development and maintenance costs if you want to go
cross-platform).
- So Which Is Best For Forms?
Client-Side Limitations
As for which is best, who knows? Fast, efficient forms processing would ideally
include action on both the Client-side (for prevalidation, animation,
navigation controls and display messages) and Server-side (for data storage and
retrieval from the Server site).
A lot depends on what you already know, what you already have, what your Server
supports, and what you expect your readers to run.
CGI scripts, because they run on your server, will produce results that almost
all readers can see: you do the work, you control the security, you hand them
the answer.
Javascript, VBScript, and Java applets require your readers to have the right
software in place before they come to your page, or they won't see your nifty
new features.
ActiveX is even more restrictive. You're asking the Reader's machine to do the
work you tell it to: they may not want to, or they may not be able to, and you
run the risk of losing readers or limiting content.
CGI will continue to be around for a while because of this Server vs Client
issue: when you run the programs on YOUR machine, you know what the results are
going to be, and your page can serve them up to everyone. (You're also not
asking your readers to trust you with their hard drives.)
CGI has in one sense more power than client-based languages like clientside
Javascript simply because it's _not_ client-based. That's hard to get away
from. So I don't see Javascript or VBScript or Java applets replacing CGI--they
just don't do the same things.
The Future On The Server Side
Now, whether a Server-side Java application will replace CGI....hmmm, that's
tricky. Most Webservers don't run native Java--they use CGI or ActiveX or some
other wrapper/middleware to implement it. It's unlikely that you'll find a host
that provides Java as a substitute for CGI/ActiveX, so I wouldn't count on it
for now. The learning curve for Java is just too steep: CGI combined with a
simple language like Perl is a quick and easy way of implementing simple
programs on the Server.
On the other hand, I have no doubt that CGI will disappear in ActiveX
servers--it's just not needed there, while Java will continue to be available in
the ActiveX environment. But how many people will run ActiveX, and when? That
change won't come for at least 18 months, if ever. (Editor's note - for time
reference, this was originally posted in August of 1996.)
- Making Your Own Choice
As far as what you yourself will run for any one page: again, it depends on
what you have available on your server and what you think your readers have both
available AND running (a lot of people turn Java and Javascript off, for
security reasons).
If CGI is not available on your server, Javascript can do some nice things
with prevalidation and display controls--but only if your readers have the
software to run it on their machines.
If both sides have ActiveX, you have a lot of choices, but you open up some
significant security issues as well. If you're starting out to learn something
for "coolness," Javascript and VBscript are a lot easier to learn than Java. So,
it's your choice, but make it based on both parts of the Web transaction: the
client and the server.