Information on the World Wide Web is stored in pages. A page can contain any of the following:
- Style sheets
Web pages are constructed using a series of client-side technologies that are processed and displayed by Web browsers.
Web browsers are client programs used to access Web sites and pages. The Web browser has the job of processing received Web pages and displaying them to the user. The browser attempts to display graphics, tables, forms, formatted text, or whatever the page contains.
The most popular Web browsers now in use are Netscape Navigator and Microsoft Internet Explorer. Other lesser-used browsers exist too, for example Mozilla and Opera.
Web page designers have to pay close attention to the differences between browsers because different Web browsers support different HTML tags. Unfortunately, no one single browser supports every tag currently in use. Furthermore, the same Web page often looks different on two different browsers; every browser renders and displays Web page objects differently. Even the same browser running on different operating systems will often behave differently.
For this reason, most Web page designers use multiple Web browsers and test their pages in every one to ensure that the final output appears as intended. Without this testing, some Web site visitors will not see your published pages correctly.
Dreamweaver MX, used to create Web pages, has its own built-in browser that is neither Microsoft Internet Explorer, nor Netscape Navigator, nor any other browser. To help you test your Web pages in as many browsers as possible, Dreamweaver MX allows you to define external browsers that may be launched to view your creations.
Web pages are plain text files constructed via Hypertext Markup Language (HTML). HTML is implemented as a series of easy-to-learn tags. Web page authors use these tags to mark up a page of text. Browsers then use these tags to render and display the information for viewing.
HTML is constantly being enhanced with new features and tags. To ensure backward compatibility, browsers must ignore tags they do not understand. For example, if you use the <MARQUEE> tag in an effort to create a scrolling text marquee, browsers that do not support this tag display the marquee text but do not scroll the text.
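The "ignore what you do not understand" rule can be illustrated with a toy renderer. This is only a sketch using Python's standard html.parser module, not how any real browser is implemented; the KNOWN_TAGS table and the render function are hypothetical names invented for the illustration.

```python
from html.parser import HTMLParser

# Hypothetical table of the only tags this toy "browser" understands,
# mapped to the plain-text marker it emits for them.
KNOWN_TAGS = {"b": "*", "i": "_"}


class TolerantRenderer(HTMLParser):
    """Keeps all text, applies known tags, and silently skips unknown ones."""

    def __init__(self):
        super().__init__()
        self.output = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; unknown tags contribute nothing.
        self.output.append(KNOWN_TAGS.get(tag, ""))

    def handle_endtag(self, tag):
        self.output.append(KNOWN_TAGS.get(tag, ""))

    def handle_data(self, data):
        # Text is always kept, even when it sits inside an unknown tag.
        self.output.append(data)


def render(html):
    renderer = TolerantRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.output)


print(render("<p>Sale: <MARQUEE>50% off</MARQUEE> <b>today</b></p>"))
```

The marquee text survives even though the tag itself has no effect, which is the behavior described above.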
Web pages also can contain hypertext jumps, which are links to other pages or Web sites. Users can click links to jump to either other pages on the same site or any page on any site.
Pages on a Web server are stored in various directories. When requesting a Web page, a user might provide a full path (directory and filename) to specify a particular document.
With a Web server, you can specify a default Web page, which is the page sent back to the user when only a directory is specified. These default pages are often called index.html or default.htm (or index.cfm for ColdFusion pages). If no default Web page exists in a particular directory, you see either an error message or a list of all the available files, depending on how the server is set up.
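The default-page lookup can be sketched in a few lines. This is an illustration under stated assumptions rather than the logic of any particular Web server: the resolve function, the allow_listing flag, and the sample file set are all hypothetical, and real servers make the list of default names configurable.

```python
# Default document names mentioned in the text, tried in this order.
DEFAULT_DOCUMENTS = ("index.html", "default.htm", "index.cfm")


def resolve(url_path, files, allow_listing=False):
    """Decide what a hypothetical server sends back for url_path.

    files is a set of absolute paths that exist on the server.
    Returns a (kind, payload) tuple.
    """
    if not url_path.endswith("/"):
        # A specific file was requested.
        if url_path in files:
            return ("file", url_path)
        return ("error", "not found")

    # Only a directory was requested: look for a default page.
    for name in DEFAULT_DOCUMENTS:
        candidate = url_path + name
        if candidate in files:
            return ("file", candidate)

    # No default page: either list the directory or report an error,
    # depending on how the server is set up.
    if allow_listing:
        listing = sorted(
            f for f in files
            if f.startswith(url_path) and "/" not in f[len(url_path):]
        )
        return ("listing", listing)
    return ("error", "no default page")


files = {"/index.html", "/books/default.htm", "/books/topten.html", "/pub/a.zip"}
print(resolve("/books/", files))
print(resolve("/pub/", files, allow_listing=True))
```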
HTML is a page markup language. It enables the creation and layout of pages and forms but not much else. Building intuitive and sophisticated user interfaces requires more than straight HTML; client-side scripting is necessary, too. Scripting enables you to write code (small programs) that runs within Web browsers. Using scripting, you can do the following:
- Perform form field validation
- Pop open windows
- Animate text and images
- Create drop-down menus or navigation controls
- Perform rudimentary text and numeric processing
- ...and much more
Scripting enables developers to trap and process events, which are things that occur within the browser. For example, a page being loaded, a form being submitted, and the mouse pointer moving over an image are all events, and scripts can be automatically executed by the Web browser when these occur. It is this that facilitates the types of features I just listed.
Script code is either embedded in the HTML file or stored in an external file and linked within the HTML code. Either way, the script is retrieved and processed by the Web browser.
Writing client-side scripts is more difficult than writing simple HTML. Not only are scripting languages harder to learn than HTML, there is an additional complexity in that various browsers support various levels of scripting. Writing portable scripts is possible, but it is not trivial.
Other Client Technologies
Most new browsers also enable the use of add-on technologies that are supported either directly or via plug-in modules. Some of the most significant ones are
CSS (Cascading Style Sheets). Provide a means of separating presentation from content so that both can be more readily reused and managed.
DHTML (Dynamic HTML). A combination of HTML, scripting, and CSS that, when used together, provide extremely rich and powerful user-interface options.
Java applets. Small programs that run within the Web browser (actually, they run within a Java Virtual Machine, but we'll not worry about that just yet). Applets were popular in the late '90s but are seldom used now because they are difficult to write, slow to download, and tend to be terribly incompatible with all the computers, operating systems, and browsers in use.
Macromedia Flash. A technology that is now embedded in over 98% of all browsers in use. Flash provides a mechanism for creating rich and portable interactive user interfaces (complete with audio, video, and animation, if needed), and Flash is being ported to all sorts of new platforms and devices.
So, now you know what Web servers, Web browsers, and Web pages are. The piece that links them all together is the URL.
Every Web page on the World Wide Web has an address. This is what you type into your browser to instruct it to load a particular Web page.
These addresses are called Uniform Resource Locators (URLs). URLs are not just used to identify World Wide Web pages or objects. Files on an FTP server, for example, also have URL identifiers. World Wide Web URLs consist of up to six parts (see Figure 1.5) as explained in Table 1.3.
Figure 1.5 URLs consist of up to six parts.
Table 1.3 Anatomy of a URL
Protocol. The protocol used to retrieve the object. This is usually http for objects on the World Wide Web. If the protocol is specified, it must be followed by :// (which separates the protocol from the host name).
Host. The Web server from which to retrieve the object. This is specified as a DNS name or an IP address.
Port. The host machine port on which the Web server is running. If omitted, the specified protocol's default port is used; for Web servers, this is port 80. If specified, the port must be preceded by a colon (:).
Path. The path to the file to retrieve or the script to execute.
File. The file to retrieve or the script to execute.
Query string. Optional script parameters. If a query string is specified, it must be preceded by a question mark (?).
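To see the six parts in practice, here is a minimal sketch that pulls a URL apart using Python's standard urllib.parse module. The url_parts function, the dictionary key names, and the DEFAULT_PORTS table are labels chosen for this illustration, not part of any standard API.

```python
from urllib.parse import urlsplit

# Default ports used when a URL omits the port (only the two protocols
# that appear in this chapter are listed).
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def url_parts(url):
    """Split a URL into the six parts described in Table 1.3."""
    parts = urlsplit(url)
    # Everything up to and including the last slash is the path;
    # whatever follows it is the file.
    directory, slash, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": directory + slash,
        "file": filename,
        "query": parts.query,
    }


print(url_parts("http://www.forta.com:81/administration/index.html"))
print(url_parts("http://www.forta.com/cf/tips/browse.cfm?search=mx"))
```

Note that the first URL reports port 81 because it was given explicitly, while the second falls back to port 80.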
Look at some sample URLs:
http://www.forta.com. This URL points to a Web page on the host www.forta.com. Because no document or path was specified, the default document in the root directory is served.
http://www.forta.com/. This URL is the same as the previous example and is actually the correct way to specify the default document in the root directory (although most Web browsers accept the previous example and insert the trailing slash automatically).
http://www.forta.com/books/. This URL also points to a Web page on the host www.forta.com, but this time the directory /books/ is specified. Because no page name was provided, the default page in the /books/ directory is served.
http://220.127.116.11/books/. This URL points to the same file as the previous example, but this time the IP address is used instead of the DNS name.
http://www.forta.com/books/topten.html. Once again, this URL points to a Web page on the www.forta.com host. Both a directory and a filename are specified this time. This retrieves the file topten.html from the /books/ directory, instead of the default file.
http://www.forta.com:81/administration/index.html. This is an example of a URL that points to a page on a Web server assigned to a nonstandard port. Because port 81 is not the standard port for Web servers, the port number must be provided.
http://www.forta.com/cf/tips/syndhowto.cfm. This URL points to a specific page on a Web server, but not an HTML page. CFM files are ColdFusion templates, which are discussed later in this chapter.
http://www.forta.com/cf/tips/browse.cfm?search=mx. This URL points to another ColdFusion file, but this time a parameter is passed to it. A ? is always used to separate the URL itself (including the script to execute) from any parameter.
http://www.forta.com/cf/tips/browse.cfm?search=mx&s=1. This URL is the same as the previous example, with one additional parameter. Multiple parameters are separated by ampersands (the & character).
ftp://ftp.forta.com/pub/catalog.zip. This is an example of a URL that points to an object other than a Web page or script. The protocol ftp indicates that the object referred to is a file to be retrieved from an FTP server using the File Transfer Protocol. This file is catalog.zip in the /pub/ directory.
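The way ? and & divide up a query string can be checked with a short sketch, again using Python's standard urllib.parse module. The query_parameters function is a hypothetical helper written for this illustration; it assumes each parameter name appears only once, because a plain dictionary keeps just the last value for a repeated name.

```python
from urllib.parse import parse_qsl, urlsplit


def query_parameters(url):
    """Return the URL's parameters as a name-to-value dictionary."""
    # urlsplit isolates everything after the ?; parse_qsl then splits
    # that string on & into (name, value) pairs.
    return dict(parse_qsl(urlsplit(url).query))


print(query_parameters("http://www.forta.com/cf/tips/browse.cfm?search=mx"))
print(query_parameters("http://www.forta.com/cf/tips/browse.cfm?search=mx&s=1"))
```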
Links in Web pages are references to other URLs. When a user clicks a link, the browser processes whatever URL it references.
Hosts and Virtual Hosts
As already explained, the term host refers to a computer connected to the Internet. The host name is the DNS name by which that machine may be referred to.
A Web site is hosted on a host (which, if you think about it, makes perfect sense). So host www.forta.com (which has an IP address of 18.104.22.168) hosts my Web site. But that host also hosts many other Web sites (some mine and some belonging to other people). If a request arrives at a host that hosts multiple Web sites, how does the host know which Web site to route it to?
There are actually several ways that this can be accomplished:
Earlier I explained that IP addresses must be unique; that is, no two hosts may share the same IP address. But what I did not explain is that a single host may have more than one IP address (assuming the operating system allows this, and most in fact do). If a host has multiple IP addresses, each may be mapped in the Web server software to virtual hosts (which are exactly that, virtual hosts). Each virtual host has an associated Web root (the base directory for any and all content), and depending on the IP address that the request came in on, the Web server can route the request to the appropriate virtual host and directory structure.
Some Web servers allow multiple virtual hosts using the same IP address. How do they do this? By looking at the DNS name that was specified. You will recall that I earlier explained that multiple DNS names can resolve to the same IP address, and so Web servers may allow the mapping of virtual hosts by DNS name (rather than IP address). This is the instance I was referring to earlier when I said that there is a scenario in which DNS names must be used.
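The two mappings just described, by IP address and by DNS name, can be sketched as simple lookups. Everything here is hypothetical: the addresses, host names, and Web root directories are placeholders, and the functions are not taken from any real Web server. The sketch also assumes that the requested DNS name reaches the server in the HTTP Host header, and that the header holds a plain name with an optional port.

```python
# Hypothetical IP-based mapping: one Web root per IP address on the host.
IP_VIRTUAL_HOSTS = {
    "192.0.2.10": "/webroots/site-a",
    "192.0.2.11": "/webroots/site-b",
}

# Hypothetical name-based mapping: several DNS names share one IP address.
NAME_VIRTUAL_HOSTS = {
    "www.example.com": "/webroots/example-com",
    "www.example.org": "/webroots/example-org",
}


def web_root_by_ip(local_ip):
    """Pick the Web root from the IP address the request came in on."""
    return IP_VIRTUAL_HOSTS.get(local_ip)


def web_root_by_name(host_header):
    """Pick the Web root from the DNS name the browser asked for."""
    # Drop an optional :port suffix and ignore case before the lookup.
    name = host_header.split(":")[0].strip().lower()
    return NAME_VIRTUAL_HOSTS.get(name)


print(web_root_by_ip("192.0.2.11"))
print(web_root_by_name("WWW.Example.com:80"))
```

The second lookup only works when the request carries a DNS name, which is why this configuration cannot be reached by IP address alone.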
In both of these configurations, all of the hosts (including virtual hosts) are processed by the same Web server. There is another way to support multiple hosts without using different DNS names or IP addresses:
Depending on the Web server software being used, it may be possible to run multiple Web servers on the same computer. In this configuration, every instance of the Web server must be running on a different port (you will recall that no two applications may share a port at the same time). When requests are made, the port must be specified in the URL (or else, as previously explained, the request will default to port 80). Each Web server has its own Web root, which is then the root for a specific virtual host.
In this configuration multiple Web servers are used, one per host.
The difference may seem subtle, but it is very important, as you will soon see.