CGI (the Common Gateway Interface) specifies the way in which a program receives its parameters from a client via a Web server.
How does the program get input from the client?
How does the output of the program get returned to the client?
CGI passes client data in Environment Variables or via standard input.
Data is returned to the client via standard output.
CGI programs can be written in any programming language.
Interpreted languages (Perl, TCL, Python, Ruby) are often preferred to compiled languages (C, C++).
Not CGI: Java is a common server-side language, usually as part of a larger Java application server (Tomcat, WebSphere).
Not CGI: Languages like PHP and ASP use a different interface technology between server and script (a server module).
Web server needs to differentiate between content and programs.
Common convention is to use the cgi-bin directory: files in here are programs.
Since CGI scripts are executable, anyone who can send a request to your server can run a CGI script installed on your system.
This is a security risk.
Servers often restrict who or what can have CGI scripts.
When a server receives a request for a CGI resource:
<html>
<head>
<title>Demo static HTML page</title>
</head>
<body>
<h1>Hello COMP249!</h1>
</body>
</html>
#!/usr/local/bin/python
print "Content-type: text/html\n\n"
print "<html>"
print "<head>"
print "<title>Demo static HTML page</title>"
print "</head>"
print "<body>"
print "<h1>Hello COMP249!</h1>"
print "</body>"
print "</html>"
The first print statement is the Content-type line that specifies the media type of the output that is generated. This line is actually part of the HTTP response header sent back to the client.
This line must be immediately followed by a blank line that contains no spaces or tabs (the '\n\n' bit). As in the HTTP protocol itself, the blank line separates the headers from the body.
CGI programs often fail because the programmer forgot the blank line.
After the blank line comes the HTML encoded text which is displayed on the user's browser.
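The header, blank line, body layout described above can be sketched as a small helper. This is only an illustration, assuming Python 3 (the scripts in these notes use Python 2 print statements); the function name cgi_response is hypothetical and not part of any library.

```python
import sys

def cgi_response(body, content_type="text/html"):
    """Return the full CGI output: header line, blank line, then the body."""
    header = "Content-type: " + content_type + "\n"
    blank_line = "\n"  # must be completely empty: no spaces or tabs
    return header + blank_line + body

if __name__ == "__main__":
    page = "<html><body><h1>Hello COMP249!</h1></body></html>\n"
    # A CGI program returns its result to the client on standard output.
    sys.stdout.write(cgi_response(page))
```

Keeping the header and the blank line in one place makes it harder to forget the blank line when writing several scripts.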
#!/usr/local/bin/python
print "Content-type: text/html\n\n"
print """
<html>
<head><title>Table Demo</title></head>
<body>
<h3>COMP249 Staff</h3>
<table>
<tr><th>Lecturer</th><th>Unit</th></tr>
"""
staff = {'Rolf':'COMP249','Steve':'COMP249'}  # avoid shadowing the built-in dict
for key, value in staff.items():
    print " <tr><td>", key, "</td><td>", value, "</td></tr>"
print """
</table>
</body>
</html>
"""
<html>
<head><title>A simple HTML Form Page</title></head>
<body>
<h3> A simple HTML Form Page</h3>
<hr>
<form action="cgi-bin/process.py">
<p><strong>Enter your name:</strong></p>
<p><input type=text name=user></p>
<p><input type=submit></p>
</form>
</body>
</html>
The CGI script receives the form data via the query string.
In the previous example:
user=Rona+Bates
Spaces are replaced with + or %20
Other special characters are escaped as % followed by the character's two-digit hexadecimal ASCII code (for example, & becomes %26)
Information from different form widgets is separated with &
user=Rona+Bates&gender=female
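The encoding rules above can be tried out with the standard library. This is a small sketch assuming Python 3, where the relevant functions live in urllib.parse; the value "fish&chips" is made up for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode

# Encode form fields the way a browser does when it builds a query string.
fields = {"user": "Rona Bates", "gender": "female"}
query_string = urlencode(fields)
print(query_string)          # user=Rona+Bates&gender=female

# Decode it again; each name maps to a list because a field may be repeated.
decoded = parse_qs(query_string)
print(decoded["user"][0])    # Rona Bates

# A space may also be written as %20, and an & inside a value becomes %26.
print(quote("Rona Bates"))   # Rona%20Bates
print(quote("fish&chips"))   # fish%26chips
```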
If a GET method:
The query string is encoded in the URL:
http://.../cgi-bin/process.py?user=Rona+Bates
The server places the information in the environment variable QUERY_STRING.
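If a POST method:
The query string is sent in the body of the HTTP request.
The server passes it to the script via standard input.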
GET encodes the query string in the URL
Easy to see what's being passed to the script
Can bookmark/link to a form submission, eg. a Google search
POST encodes query as a message in the HTTP request.
No information in the URL about the form data.
Data is not hidden, but it is not usually stored in server logs.
Allows larger payloads to be sent (eg. file upload).
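How a script might pick up the data in each case can be sketched as follows. This is a minimal illustration assuming Python 3 and a URL-encoded ASCII request body; REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH are standard CGI environment variables, while the function name read_form_data is hypothetical.

```python
import io
from urllib.parse import parse_qs

def read_form_data(environ, stdin):
    """Return the decoded form fields for a GET or POST request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the encoded data is in the request body, which the server
        # hands to the script on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the server copies the part of the URL after '?' into QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)

# Simulate both kinds of request without a Web server.
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "user=Rona+Bates"}
print(read_form_data(get_env, io.StringIO("")))

body = "user=Rona+Bates&gender=female"
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
print(read_form_data(post_env, io.StringIO(body)))
```

In a real script the two arguments would be os.environ and sys.stdin; passing them in explicitly keeps the function easy to test.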
The Python cgi module defines a class FieldStorage which provides an interface to submitted form data.
A FieldStorage object reads all form input and allows you to query it via a simple interface.
import cgi
form = cgi.FieldStorage()
if 'name' in form:
    print form.getvalue('name')
#!/share/bin/python
import cgi # imports cgi module
import cgitb; cgitb.enable() # traceback manager, displays
# errors in the Web browser
form = cgi.FieldStorage() # retrieves form input
# define start and end of page
pagehead = """
<html>
<head><title>Greetings</title></head>
<body>
<h3>Greetings</h3>"""
pagefoot = """ </body>
</html>
"""
print "Content-type: text/html\n\n"
print pagehead
if 'user' not in form:
    print "<p>Who are you?</p>"
else:
    print "<p>Hello ", form.getvalue('user'), "</p>"
print pagefoot
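The greeting logic can also be written as a function, which allows it to be tried without a Web server. The sketch below assumes Python 3 and uses urllib.parse.parse_qs in place of FieldStorage; it also escapes the submitted value with html.escape, which the script above does not do. The function name greeting is hypothetical.

```python
import html
from urllib.parse import parse_qs

def greeting(query_string):
    """Build the greeting paragraph for an encoded query string."""
    form = parse_qs(query_string)
    if "user" not in form:
        return "<p>Who are you?</p>"
    # Escape <, > and & so that a submitted value cannot inject markup.
    return "<p>Hello " + html.escape(form["user"][0]) + "</p>"

print(greeting("user=Rona+Bates"))             # <p>Hello Rona Bates</p>
print(greeting(""))                            # <p>Who are you?</p>
print(greeting("user=%3Cb%3ERona%3C%2Fb%3E"))  # <p>Hello &lt;b&gt;Rona&lt;/b&gt;</p>
```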
Much of the additional information needed by CGI programs is available via HTTP environment variables.
How is this information transmitted?
For example:
In Python, HTTP environment variables are available (when set) via the os.environ dictionary.
#!/share/bin/python
import os
print 'Content-type: text/html\n\n'
if 'SERVER_PORT' in os.environ:
    server_port = os.environ['SERVER_PORT']
    print '<p> SERVER_PORT:', server_port, '</p>'
else:
    print '<p> SERVER_PORT: unknown </p>'
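The same "use the value if it is set, otherwise fall back" pattern can be written with the dictionary get method. This is a sketch assuming Python 3; SERVER_NAME, SERVER_PORT and REQUEST_METHOD are standard CGI variables, and the function name describe_variable is hypothetical.

```python
import os

def describe_variable(name, environ=os.environ):
    """Return an HTML paragraph showing one environment variable."""
    value = environ.get(name, "unknown")  # fall back when the variable is not set
    return "<p> " + name + ": " + value + " </p>"

# Outside a Web server these variables are usually unset, so "unknown" is shown.
for name in ("SERVER_NAME", "SERVER_PORT", "REQUEST_METHOD"):
    print(describe_variable(name))
```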
The QUERY_STRING variable always appears in the os.environ dictionary, even when no query was sent, in which case its value is '' (the empty string).
#!/share/bin/python
import os
print 'Content-type: text/html\n\n'
if 'QUERY_STRING' in os.environ and \
   os.environ['QUERY_STRING'] != '':
    query_string = os.environ['QUERY_STRING']
    print '<h3> QUERY_STRING:', query_string, '</h3>'
else:
    print '<h3> QUERY_STRING: unknown </h3>'
Examples: name=Steve, name=Steve&age=21