This section aims to deal with basic questions, addressing the role and nature of CGI, and its place in Web programming. Questions/answers which just don't appear to 'fit' under any other section may also be included here.
[ from the CGI reference http://hoohoo.ncsa.uiuc.edu/cgi/overview.html ] The Common Gateway Interface, or CGI, is a standard for external gateway programs to interface with information servers such as HTTP servers. A plain HTML document that the Web daemon retrieves is static, which means it exists in a constant state: a text file that doesn't change. A CGI program, on the other hand, is executed in real-time, so that it can output dynamic information.
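As a rough illustration of "dynamic information", here is a minimal sketch of a CGI program in Python. The helper name build_response is made up for this example and is not part of any CGI library. The sketch assumes only what CGI itself requires: a header block, one blank line, and then the body written to standard output.

```python
#!/usr/bin/env python3
"""Minimal CGI sketch: the page is generated afresh on every request."""
import sys
import time


def build_response(now=None):
    # build_response is a hypothetical helper name used for illustration.
    if now is None:
        now = time.time()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    body = "<html><body><p>Server time (UTC): %s</p></body></html>\n" % stamp
    # A CGI response is a header block, one blank line, then the body.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    sys.stdout.write(build_response())
```

Unlike a static HTML file, the output changes each time the server runs the program.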
The distinction is semantic. Traditionally, compiled executables (binaries) are called programs, and interpreted programs are usually called scripts. In the context of CGI, the distinction has become even more blurred than before. The words are often used interchangeably (including in this document). Current usage favours the word "scripts" for CGI programs.
There are innumerable caveats to this answer, but basically any Web page containing a form will require a CGI script or program to process the form inputs.
[The answer to this non-question aims to reduce the noise level of the recurrent "CGI vs Java" threads.] CGI and Java are fundamentally different, and for most applications are NOT interchangeable. CGI is a protocol for running programs on a WWW server. Whilst Java can also be used for that, and even has a standardised API (the servlet, which is indeed an alternative to CGI), the major role of Java on the Web is for client-side programming (the applet). In certain instances the two may be combined in a single application: for example a Java applet to define a region of interest from a geographical map, together with a CGI script to process a query for the area defined.
CGI and SSI (Server-Side Includes) are often interchangeable, and the choice may be no more than a matter of personal preference. Here are a few guidelines:

1) CGI is a common standard agreed and supported by all major HTTPDs. SSI is NOT a common standard, but an innovation of NCSA's HTTPD which has been widely adopted in later servers. CGI has the greatest portability, if this is an issue.
2) If your requirement is sufficiently simple that it can be done by SSI without invoking an exec, then SSI will probably be more efficient. A typical application would be to include sitewide 'house styles', such as toolbars, netscapeised <body> tags or embedded CSS stylesheets.
3) For more complex applications - like processing a form - where you need to exec (run) a program in any case, CGI is usually the best choice.
4) If your transaction returns a response that is not an HTML page, SSI is not an option at all.

Many more recent variants on the theme of SSI are now available. Probably the best-known are PHP, which embeds server-side scripting in a pre-HTML page, and ASP, which is Microsoft's version of a similar interface.
APIs are proprietary programming interfaces supported by particular platforms. By using an API, you lose all portability. If you know your application will only ever run on one platform (OS and HTTPD), and it has a suitable API, go ahead and use it. Otherwise stick to CGI.
Too many to enumerate - but I'll try to summarise. Briefly, there are several decisions you have to make, including:

* Power. Is it up to a complex task?
* Complexity. How much programming manpower is it worth?
* Portability. Might you want to run your program on another system?

So here's an overview of the main options. It's inevitably subjective, but may be helpful to someone:

Basic SSI: Simple interface for basic dynamic content. Non-standard - read your server docs.
Enhanced SSI[1]: Suitable for more complex tasks within an HTML page.
CGI: The standardised, portable general-purpose API, not limited to working with HTML pages.
Enhanced CGI-like[2]: Typically gain efficiency but lose portability compared to standard CGI.
Servlets: An alternative API for Java, that overcomes the limitation of Java not supporting environment variables.
Server API: Generally the most powerful and most complex option.

[1] For example, PHP, ASP.
[2] For example, CGI adapted to mod_perl or fastcgi.
If you're already a programmer, CGI is extremely straightforward, and just three resources should get you up to speed in the time it takes to read them:

1) Installation notes for your HTTPD. Is it configured to run CGI scripts, and if so how does it identify that a URL should be executed? (Check your manuals, READMEs, ISP webpages/FAQs, and if you still can't find it ask your server administrator).
2) The CGI specification at NCSA tells you all you need to know to get your programs running as CGI applications.
   http://hoohoo.ncsa.uiuc.edu/cgi/interface.html
3) WWW Security FAQ. This is not required to 'get it working', but is essential reading if you want to KEEP it working!
   http://www.w3.org/Security/Faq/www-security-faq.html

If you're NOT already a programmer, you'll have to learn. If you would find it hard to write, say, a 'grep' or 'cat' utility to run from the command line, then you will probably have a hard time with CGI. Make sure your programs work from the command line BEFORE trying them with CGI, so that at least one possible source of errors has been dealt with.
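One way to follow the "command line first" advice is to keep the request handling in a function that takes the environment as an argument, so it can be exercised without a Web server. The following is only a sketch in Python: the function name handle, the form field "name" and the file name hello.py are assumptions made for illustration. QUERY_STRING is the standard CGI environment variable carrying the part of the URL after the '?'.

```python
#!/usr/bin/env python3
"""Sketch of a CGI script that can be exercised from the command line."""
import html
import os
import sys
from urllib.parse import parse_qs


def handle(environ):
    # The server passes request details in environment variables;
    # QUERY_STRING carries the part of the URL after '?'.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; default to "world" if absent.
    name = params.get("name", ["world"])[0]
    # Escape user input before echoing it into HTML.
    body = "<p>Hello, %s!</p>\n" % html.escape(name)
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    sys.stdout.write(handle(os.environ))
```

From a Unix shell this could be tried with something like QUERY_STRING='name=Alice' python3 hello.py before the script is installed on the server.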
Yes. Period. There is a lot you can do to minimise these. The most important thing to do is read and understand Lincoln Stein's excellent WWW security FAQ, at http://www.w3.org/Security/Faq/www-security-faq.html
No, but it helps. The Web, along with the Internet itself, C, Perl, and almost every other Good Thing in the last 20 years of computing, originated in Unix. At the time of writing, this is still the most mature and best-supported platform for Web applications.
No - you can use any programming language you please. Perl is simply today's most popular choice for CGI applications. Some other widely-used languages are C, C++, TCL, BASIC and - for simple tasks - even shell scripts. Reasons for choosing Perl include its powerful text manipulation capabilities (in particular regular expressions) and the fantastic WWW support modules available.
It isn't really that important. Use what you're comfortable with, or what you're constrained (e.g. by your manager) to use. If you're just dabbling with programming, Perl is a good choice, simply because of the wealth of ready-to-run Perl/CGI resources available. If you're serious about programming, you should be at home in a range of languages. C, the industry standard, is a must (at least to the level of comfortably reading other people's code). You'll certainly want at least one scripting language such as Perl, Python or Tcl. C++ is also a good idea.

In response to a Usenet newbie question:

> I am seriously wanting to learn some CGI programming languages

J.M. Ivler wrote some eloquent words of wisdom:

> If you want to learn a programming language, learn a programming language.
> If you want to learn how to do CGI programming, learn a programming
> language first.
>
> My book is one of the few that tackles two languages at the same time.
> Why? because it's not about languages (which are just syntax for logic).
> CGI programming is about programming, and how to leverage the experience
> for the person coming to the site, or maintaining the site, or in some way
> meeting some requirements. Language is just a tool to do so.
See next question.
Maybe. It depends on your server installation. These types of filenames are commonly used conventions - no more. It is up to the server administrator whether or not CGI scripts are enabled, and (if so) what conventions tell the server to run or to print them. If you are running your own server, read the manual. If you're using an ISP or other rented webspace, check their webpages for information or FAQs. As a last resort, ask the server administrator.
The CGI overhead is a consequence of HTTP being a stateless protocol. This means that a CGI process must be initialised for every "hit" from a browser.

Firstly, this usually means the server forking a new process. This in itself is a modest overhead, but it can become important on a heavily-used server if the number of processes grows to problem levels.

Secondly, the CGI program must initialise. In the case of a compiled language such as C or C++ this is negligible, but there is a small penalty to pay for scripting languages such as Perl.

Thirdly, CGI is often used as 'glue' to a backend program, such as a database, which may take some considerable time to initialise. This represents a major overhead, which must be avoided in any serious application. The most usual solution is for the backend program to run as a separate server doing most of the work, while the actual CGI simply carries messages.

Fourthly, some CGI scripts are just plain inefficient, and may take hundreds of times the resources they need. Programs using system() or `backtick` notation often fall into this category.

Note that there are ways to reduce or eliminate all these overheads, but these tend to be system- or server-specific. The best-supported server is probably Apache, as commercial server-vendors may prefer to push their proprietary solutions in preference to CGI.
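The fourth point can be illustrated with a small Python sketch. Both functions below count the lines in a file, but the first starts a whole extra process to do so (the Python analogue of system() or backticks), while the second does the same work inside the already-running script. The function names and the demo helper are invented for this example, and no timings are claimed. The point is only that the extra process start-up is paid on every request.

```python
import os
import subprocess
import sys
import tempfile


def count_lines_subprocess(path):
    # Spawns a second interpreter just to count lines: the process
    # start-up cost is paid again on every single request.
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; print(sum(1 for _ in open(sys.argv[1])))", path],
        capture_output=True, text=True, check=True)
    return int(result.stdout)


def count_lines_inprocess(path):
    # Same result, computed inside the running program.
    with open(path) as handle:
        return sum(1 for _ in handle)


def demo():
    # Hypothetical demonstration: write a three-line file and count it both ways.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("a\nb\nc\n")
        path = tmp.name
    try:
        return count_lines_subprocess(path), count_lines_inprocess(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```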
Unix systems are designed for multiple users, and include provision for protecting your work from unauthorised access by other users of the system. The file permissions determine who is permitted to do what with your programs, data, and directories. The command that sets file permissions is chmod.

Web servers typically run as user "nobody". That means that, setting aside serious bugs (such as those in certain versions of the Frontpage extensions), your files are absolutely secure from damage through the webserver. It also means that you may have to make explicit changes to enable the server to access them in a CGI context.

There are two ways to run CGI:

- by default they run as the webserver user (nobody)
  For most purposes this is safest, as your programs and data are protected by the operating system from unauthorised access through possible bugs in your CGI. However, when the CGI has to write to a file, that file must be writable to every web user on the system, and is therefore completely unprotected.
- setuid, they run under your own userid.
  This means that files written by your CGI can be secure. On the other hand, any bugs in your CGI could now compromise *all* your programs and data on the server. As an elementary security precaution, scripts (e.g. Perl) are prevented from running setuid by most OSs. The "cgiwrap" program offers a workaround for this.

A third way you should *never* permit CGI to be run is:

- as root or setuid root, they can run as any user.
  This is extremely dangerous, as any bugs could compromise the entire server, including every user's files. Fortunately only the system administrator can install setuid root programs. If you are *at all* concerned about security, make sure that no such programs (in particular Frontpage extensions) are installed, regardless of whether you use them yourself.

For a proper overview, "man chmod".

Some modes that may be useful in a typical CGI context are:

* CGI programs, 0755
* data files to be readable by CGI, 0644
* directories for data used by CGI, 0755
* data files to be writable by CGI, 0666 (data has absolutely no security)
* directories for data used by CGI with write access, 0777 (no security)
* CGI programs to run setuid, 4755
* data files for setuid CGI programs, 0600 or 0644
* directories for data used by setuid CGI programs, 0700 or 0755
* For a typical backend server process, 4750

Finally, if this answer tells you anything you didn't already know, don't even think about trying to set up a secure server!
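The first two modes in the list can also be applied from a program and not only with the chmod command. The Python sketch below sets them on two throwaway files and reads the permission bits back. The file names hello.cgi and guestbook.txt, the constant names and the demo helper are all made up for illustration, and the example assumes a Unix-like system.

```python
import os
import stat
import tempfile

CGI_PROGRAM_MODE = 0o755    # rwxr-xr-x: owner may edit, everyone may run
READABLE_DATA_MODE = 0o644  # rw-r--r--: owner may edit, everyone may read


def set_mode(path, mode):
    """Apply a permission mode, like running 'chmod <mode> <path>'."""
    os.chmod(path, mode)
    # Return only the permission bits so the caller can check them.
    return stat.S_IMODE(os.stat(path).st_mode)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "hello.cgi")    # hypothetical file names
        data = os.path.join(tmp, "guestbook.txt")
        for path in (script, data):
            open(path, "w").close()
        return (set_mode(script, CGI_PROGRAM_MODE),
                set_mode(data, READABLE_DATA_MODE))


if __name__ == "__main__":
    print([oct(mode) for mode in demo()])
```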
[ quoted from http://www.umr.edu/~cgiwrap/intro.html ]

> CGIWrap is a gateway program that allows general users to use CGI scripts
> and HTML forms without compromising the security of the http server.
> Scripts are run with the permissions of the user who owns the script. In
> addition, several security checks are performed on the script, which will not
> be executed if any checks fail.
>
> CGIWrap is used via a URL in an HTML document. As distributed, cgiwrap
> is configured to run user scripts which are located in the
> ~/public_html/cgi-bin/ directory.

See http://www.umr.edu/~cgiwrap/
The normal format for data in HTTP requests is URLencoded. All form data is encoded in a string, of the form

param1=value1&param2=value2&...paramn=valuen

Many non-alphanumeric characters are "escaped" in the encoding: the character whose hexadecimal number is "XY" will be represented by the character string "%XY". Decoding this string is a fundamental function of every CGI library.

Another format is "multipart/form-data", also known as "file upload". You will get this from the HTML markup <form method="POST" enctype="multipart/form-data"> (but note you must accept URLencoded input in any case, since not all browsers support multipart forms). Most(?) CGI libraries will handle this transparently.
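To make the URLencoded format concrete, here is a small hand-written decoder in Python. It is a sketch rather than a replacement for a CGI library: the function name decode_form is invented for this example, while unquote_plus and parse_qsl are the standard-library routines from urllib.parse. The sample string is made up.

```python
from urllib.parse import parse_qsl, unquote_plus


def decode_form(encoded):
    """Decode 'a=1&b=2' into a list of (name, value) pairs by hand."""
    pairs = []
    for field in encoded.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        # unquote_plus turns '+' into a space and '%XY' into the
        # character whose hexadecimal code is XY.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


if __name__ == "__main__":
    sample = "param1=value1&param2=a%26b+c"
    print(decode_form(sample))
    # The standard library does the same job in one call.
    print(parse_qsl(sample))
```

In practice the library call is the better choice. The hand-written version is shown only to make the escaping rules visible.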
Copyright © 1996 - 2006. All rights reserved.