Advanced Sitemap Features
If you are already somewhat familiar with Cocoon, you will have noticed that we left out some features when we first introduced it. The main reason for this was to make it easier for first-time users to get started with Cocoon. Now that we have expanded on the first block of information with examples and the first version of the news portal application, we can complete the description of the sitemap features from a user perspective.
One of the most important functions in Cocoon is its ability to obtain data from various sources. This is done through different protocols. This section introduces some Cocoon-specific protocols. We will also explain some new sitemap component types and the views and resources sections of the sitemap. However, before we dive into the details, let's begin our look at the sitemap with a slightly different type of component: the action-set.
Action-Sets
Chapter 4 introduced the component type action, which can be used in any pipeline to fulfill a defined task. Cocoon also offers a more flexible approach to using actions: action-sets.
In contrast to other sitemap component types, an action-set is a combination of previously defined actions that can be used in a pipeline as though it were a single component. Defining an action-set is like defining a pipeline, which is itself a combination of sitemap components. Like a pipeline, an action-set is defined inside its own sitemap section, the map:action-sets section.
Each action-set is introduced with the map:action-set element, which receives a unique name via the attribute name. Inside this element you can enter as many actions as you like, as shown in Listing 6.8. You arrange a set of actions to form a group.
Listing 6.8 An Example of an Action-Set
<map:action-sets>
  <map:action-set name="myactionset">
    <map:act type="log-start-action"/>
    <map:act type="add-action" action="add"/>
    <map:act type="del-action" action="delete"/>
    <map:act type="log-end-action"/>
  </map:action-set>
</map:action-sets>
A defined action-set can be used in the pipeline just like a normal action via the tag <map:act set="myactionset"/>. The difference is that the attribute set is used instead of type.
If you use an action-set, all actions of this set are called in the order they are defined. In addition, it is possible to selectively call an action inside an action-set. To do this, you can define each action in the action-set to have an attribute action. If the current request being processed by the pipeline contains a request parameter called cocoon-action, the action with the corresponding action attribute in the action-set is called.
In Listing 6.8, if the action-set myactionset is used, log-start-action is invoked. If the request currently being processed contains a cocoon-action parameter with the value add, the action add-action is invoked. If instead the cocoon-action parameter has the value delete, del-action is invoked. Finally, log-end-action is always invoked. The cocoon-action parameter can contain only one value, so either add-action or del-action or neither is invoked, but never both at the same time.
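As a sketch of how this might look in practice, the following pipeline uses the action-set from Listing 6.8. The match pattern editlist and the file names list.xml and list2html.xsl are assumptions made for illustration.

```xml
<map:match pattern="editlist">
  <!-- Calls log-start-action, then add-action or del-action (depending on
       the cocoon-action request parameter), then log-end-action -->
  <map:act set="myactionset"/>
  <map:generate src="list.xml"/>
  <map:transform src="list2html.xsl"/>
  <map:serialize type="html"/>
</map:match>
```

With this in place, a request for editlist?cocoon-action=add would invoke log-start-action, add-action, and log-end-action, whereas a request for editlist without the parameter would invoke only the two logging actions.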
Do you remember value substitution, discussed in Chapter 4? An action can provide key-value pairs for other sitemap components. All components nested inside the action have access if they know the key's name.
Value substitution for action-sets is very similar, as shown in Figure 6.5. Whereas all values of an action are accessible using the key for nested components, all values of all called actions of the action-set are available inside the action-set element. Therefore, the value substitution algorithm collects all values from all actions. However, if two actions use the same key inside an action-set, only the value of the last action is available. It overrides the previous one.
Figure 6.5 Value substitution for action-sets.
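A minimal sketch of value substitution with an action-set follows. It assumes that at least one action in myactionset provides a value under the key page; this key, the match pattern, and the stylesheet name are hypothetical and chosen only for illustration.

```xml
<map:match pattern="editlist">
  <map:act set="myactionset">
    <!-- {page} is replaced by the value that the called actions provided
         under the key "page"; if several actions set this key, the value
         of the last one wins -->
    <map:generate src="{page}.xml"/>
    <map:transform src="list2html.xsl"/>
    <map:serialize type="html"/>
  </map:act>
</map:match>
```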
Using action-sets allows you to build modular components that can be used flexibly in pipelines. Often, actions are used to control the flow inside a pipeline and to determine such things as which data source needs to be accessed for the current request. Using the various protocols available in Cocoon allows a variety of different possibilities when it comes to retrieving data or calling internal functions as part of processing.
Protocols
A concept widely used inside sitemaps is the definition of URIs. On the one hand, the sitemap itself spans a virtual URI space, which is served by Cocoon. On the other hand, and more visibly, you use URIs to specify which resources are to be read by the various sitemap components. For example, the file generator needs an XML document as input, the xslt transformer processes a stylesheet, and so on.
As we discussed in Chapter 4, you can use any protocol supported by Cocoon to define your URIs and to access resources. For example, you can use an HTTP connection to retrieve an XML document from a remote server, an FTP connection to read a stylesheet, or the file protocol to read a file from the local hard drive.
In addition to these standard protocols, Cocoon offers additional protocols that can also be used inside the source definition of a generator, a transformer, or any other component. All these protocols follow the general pattern for building URIs: protocolname://path to the resource. Cocoon supports a resource protocol, a context protocol, a cocoon protocol, and a protocol that is used implicitly.
The Implicit Protocol
The most important protocol is the implicit protocol, which you have already used without noticing. As the name suggests, this protocol is used implicitly whenever a protocol definition is missing. For example, if you write something like <map:generate src="mydocument.xml"/>, Cocoon can handle it even though the protocol is missing.
How Cocoon handles this depends on how you deployed the web application. There are two ways of doing this. You can bundle everything into a web archive (WAR) file, or you can deploy everything as individual files. If your web application is not a WAR file, Cocoon implicitly adds the file protocol. All the references are then resolved relative to the location of the current sitemap using the file protocol. If you have a WAR file, Cocoon implicitly adds the protocol provided by the servlet engine to access these files, again relative to the location of the current sitemap.
This means that you don't need to worry about explicitly using a protocol when you define your pipelines and the resources they are to access. However, it is always better to add the protocol explicitly, because this makes your sitemap entries more readable to someone who is not as familiar with the inner workings of Cocoon.
The Context Protocol
The context protocol is used to access any resource belonging to the Cocoon web application. If you deployed the web application from a directory on your hard drive, the context protocol is mapped directly to the filesystem. So the resource definition context://mydocument.xml is translated to a file URI pointing to the Cocoon web application directory or, more precisely, to a file called mydocument.xml inside this directory.
If you have deployed your Cocoon web application as a WAR file, you access the resources inside the WAR file using the context protocol. The argument following the protocol is a path relative to the root of the WAR file. So again, context://mydocument.xml references a file named mydocument.xml stored at the root of the WAR file.
So, if you use the context protocol, you can abstract from how you deployed your Cocoon web application. Cocoon can determine whether to use the filesystem or the WAR file to resolve the resource you might want to load.
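To illustrate the difference from the implicit protocol, the two generators below might read different files even though they name the same document. The pattern names local and global are assumptions, and the sitemap is assumed to be stored in a subdirectory of the web application.

```xml
<!-- No protocol: resolved relative to the location of the current sitemap -->
<map:match pattern="local">
  <map:generate src="mydocument.xml"/>
  <map:serialize type="xml"/>
</map:match>

<!-- context protocol: resolved relative to the root of the web application,
     whether it was deployed as a directory or as a WAR file -->
<map:match pattern="global">
  <map:generate src="context://mydocument.xml"/>
  <map:serialize type="xml"/>
</map:match>
```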
Whereas the context protocol can be used to access resources inside a WAR file or in a filesystem, the resource protocol can locate resources inside Java archives (JAR files).
The Resource Protocol
Because Cocoon is implemented using Java, it consists of several JAR files that contain the various parts. A JAR file can contain more than Java code. It can hold any resource, such as images, XML documents, or stylesheets. All these JAR files are located in the WEB-INF/lib directory of your Cocoon context and are loaded automatically at startup by your servlet engine.
If you want to read such a resource, you can simply use the resource protocol followed by a path specifying the resource precisely. Cocoon then searches all loaded JAR files for this resource. For example, resource://org/apache/cocoon/components/language/markup/xsp/java/xsp.xsl specifies a file named xsp.xsl. This file is in one of the JAR files in the directory structure org/apache/cocoon/components/language/markup/xsp/java. So one JAR file has a root directory called org, which has a subdirectory named apache, and so on.
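Because a stylesheet is itself a well-formed XML document, the file generator can read it like any other XML source. The following sketch simply delivers the stylesheet named above as XML. The match pattern show-xsp-stylesheet is a hypothetical name.

```xml
<map:match pattern="show-xsp-stylesheet">
  <!-- Located by searching the loaded JAR files, not the filesystem -->
  <map:generate src="resource://org/apache/cocoon/components/language/markup/xsp/java/xsp.xsl"/>
  <map:serialize type="xml"/>
</map:match>
```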
So far, we have looked at protocols that allow you to access static resources. But what if you want to access resources that are not available as a unit but must be built by a process?
The Cocoon Protocol
Because Cocoon is a processing framework that can build documents using processing pipelines, sooner or later you might want to use a Cocoon resource as the input for a generator in another resource. Doing this lets you use the result of a resource as the starting point for a pipeline or as the input for any other component. So what you need is a way to access the result of one pipeline from another pipeline.
The cocoon protocol allows you to do exactly this. It accesses pipelines inside the sitemap. For example, <map:generate src="cocoon:/helloworld"/> uses the file generator to read the XML document that is created by a request for the document helloworld against the sitemap.
Whenever you use the cocoon protocol, Cocoon internally processes a new request for the specified document and uses this result for the ongoing processing of the original request.
The main use of this protocol is content aggregation, in which you can build a document from more than one source, as you will see in the next section. But you can, of course, use this protocol everywhere in the sitemap, for example, as an input to the xslt transformer.
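The following sketch uses the cocoon protocol both as generator input and as the stylesheet source of the xslt transformer. The names dynamic-style, style-source.xml, make-style.xsl, and report are hypothetical, and the helloworld pipeline is assumed to deliver XML.

```xml
<!-- Produces a stylesheet as the result of a pipeline -->
<map:match pattern="dynamic-style">
  <map:generate src="style-source.xml"/>
  <map:transform src="make-style.xsl"/>
  <map:serialize type="xml"/>
</map:match>

<map:match pattern="report">
  <!-- The input document is the result of the helloworld pipeline -->
  <map:generate src="cocoon:/helloworld"/>
  <!-- The stylesheet is the result of the dynamic-style pipeline -->
  <map:transform src="cocoon:/dynamic-style"/>
  <map:serialize type="html"/>
</map:match>
```

A request for report would therefore trigger two internal requests, one for each cocoon: URI, before the final document is serialized.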
All in all, the different protocols allow a very flexible mechanism for accessing data sources. You can also add your own new protocol if you like. We will show you how to do this in Chapter 9, "Developing Components for Cocoon." As soon as you have set up pipelines to access the various data sources, content aggregation allows you to combine them inside the sitemap.
Content Aggregation
When designing web applications, such as a portal, you often need to build complex documents consisting of several parts. Consider a typical information web site. The document consists of a header displaying, for example, the name of the company, a navigation bar, a block of content that was chosen from the navigation bar, and perhaps a footer displaying some static information.
Although this is a single document, it consists of four parts: header, navigation bar, content, and footer. Many documents follow this scheme. For each piece of content you display on your web site, you have exactly one document consisting of three static parts (header, navigation bar, and footer) and the content. How can documents like this be created easily?
One solution is to define a separate pipeline for each document. Each pipeline then reads an XML document containing not only the content but also XML information for the header, footer, and navigation bar. The XML information is then formatted by a stylesheet to present the complete page.
The problem with this solution is that you cannot access just the content. You would need to do this if you wanted to format the data into a PDF document, where you do not need the additional information on a header or footer.
Even worse, defining separate pipelines mixes concerns. The content should not need to know about the other parts, and vice versa. So the ideal solution would be to create the parts as separate documents and then be able to combine them.
That's where content aggregation comes in handy. You can define a document that is a combination or aggregation of other documents. To do this, you need to define a pipeline in the sitemap and use some tags specific to content aggregation, as shown in Listing 6.9.
Listing 6.9 An Example of Content Aggregation
<map:pipeline internal-only="true">
  <map:match pattern="header">
    <map:generate src="header.xml"/>
    <map:serialize type="xml"/>
  </map:match>
  <map:match pattern="footer">
    <map:generate src="footer.xml"/>
    <map:serialize type="xml"/>
  </map:match>
  <map:match pattern="navigation">
    <map:generate src="navigation.xml"/>
    <map:serialize type="xml"/>
  </map:match>
  <map:match pattern="*">
    <map:generate src="docs/{1}.xml"/>
    <map:serialize type="xml"/>
  </map:match>
</map:pipeline>

<map:pipeline>
  <map:match pattern="docs/*">
    <map:aggregate element="document">
      <map:part src="cocoon:/header" element="header"/>
      <map:part src="cocoon:/navigation" element="navigation"/>
      <map:part src="cocoon:/{1}" element="content"/>
      <map:part src="cocoon:/footer" element="footer"/>
    </map:aggregate>
    <map:transform src="all2html.xsl"/>
    <map:serialize type="html"/>
  </map:match>
</map:pipeline>
Listing 6.9 has some new elements we need to define before proceeding with our discussion. The most obvious one is the map:aggregate command. It is used inside an XML processing pipeline as a replacement for the map:generate instruction you would have in a normal pipeline. It defines a content aggregation of the parts, which are defined as nested map:part elements. In our example, we are building a complete document containing a header, a footer, navigation, and content. The attribute element of map:aggregate defines the root element of the generated XML document. Each part can have an element, under which you can find this part in the aggregated content. See Listing 6.10.
Listing 6.10 Aggregated Content
<?xml version="1.0"?>
<document>
  <header>
    <!-- here is the content of the header document -->
  </header>
  <navigation>
    <!-- here is the content of the navigation document -->
  </navigation>
  <content>
    <!-- here is the content of the content document -->
  </content>
  <footer>
    <!-- here is the content of the footer document -->
  </footer>
</document>
As you can see from Listing 6.10, the content is aggregated by the various parts. The following components in the pipeline, such as the xslt transformer, can transform this aggregated document into HTML or whatever format is required.
You do not need to define an element attribute for a part. If it is omitted, the part's content is directly included under the document's root node.
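As an illustration, suppose the footer part in Listing 6.9 were declared without an element attribute, as in <map:part src="cocoon:/footer"/>, and suppose footer.xml had a root element named footer-info (a hypothetical name). Under these assumptions, the aggregated document would look roughly like this:

```xml
<document>
  <header>
    <!-- here is the content of the header document -->
  </header>
  <navigation>
    <!-- here is the content of the navigation document -->
  </navigation>
  <content>
    <!-- here is the content of the content document -->
  </content>
  <!-- No wrapping element: the footer document appears directly
       under the root node -->
  <footer-info>
    <!-- here are the children of the footer document's root element -->
  </footer-info>
</document>
```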
The cocoon protocol is used for each part. Therefore, each part is defined by another pipeline somewhere in the sitemap. In this example, these pipelines are all inside their own map:pipeline section in the sitemap.
Normally, because the separate parts are pipelines in the sitemap, you would be able to access them individually using a browser. This is not what you want, however, because it would result in your receiving only part of a document.
Because you do not want to be able to receive only the document header or navigation or footer or the content itself without the surrounding parts, this map:pipeline section is protected with the attribute internal-only set to true. With this attribute set, all marked map:pipeline sections are skipped when Cocoon processes a request. These pipelines can only be invoked "internally" by using the cocoon protocol from within another pipeline.
You can control content aggregation using three more attributes for an aggregated part: prefix, ns, and strip-root. So, a full-featured part might look like this:
<map:part src="cocoon:/header" strip-root="true" prefix="header" ns="header://version/1.0" element="header"/>
The top-level element for the header part is called header. It gets the namespace defined by the attribute ns. The attribute prefix is used to define the prefix. So the top-level element looks like this:
<header:header xmlns:header="header://version/1.0"/>
You can leave out the attribute prefix.
In addition, you can use the attribute strip-root with a Boolean value. If it is set to true, the root element of the aggregated part is stripped off. So if the pipeline for the document header has the root element myheader, it is not included. All children of the myheader element are included under the root element of the part.
Although you might get the impression that you must use the cocoon protocol to aggregate parts, this is not true. You can use any protocol available. The simplest case is aggregating XML files.
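A minimal sketch of such an aggregation, reading two XML files directly instead of calling other pipelines, could look as follows. The pattern overview and the file names news.xml, weather.xml, and overview2html.xsl are assumptions for illustration.

```xml
<map:match pattern="overview">
  <map:aggregate element="overview">
    <!-- Implicit protocol: resolved relative to the current sitemap -->
    <map:part src="news.xml" element="news"/>
    <!-- context protocol: resolved relative to the web application root -->
    <map:part src="context://weather.xml" element="weather"/>
  </map:aggregate>
  <map:transform src="overview2html.xsl"/>
  <map:serialize type="html"/>
</map:match>
```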
Later you will see practical examples and tips and a real-world example of content aggregation. This example, the Cocoon online documentation, also uses some other features not explained yet. One of them is the concept of subsitemaps.
Subsitemaps
When you develop large web applications, or when more than one person is editing the sitemap, it can be very difficult to maintain, because it is a single big XML document.
To simplify sitemap editing and maintenance, Cocoon offers the concept of subsitemaps (see Figure 6.6). A subsitemap looks like a normal sitemap, but it is mounted into the main sitemap. By mounting, we mean that you usually define a URI prefix for a subsitemap. All incoming requests starting with this prefix are then handled by the subsitemap.
Figure 6.6 Subsitemaps.
The mount points allow you to cascade your sitemaps. This ensures more readability and supports sitemap editors managing the web application. Each subsitemap can then be maintained by a different person. After mounting, you can imagine the whole construction as a tree, with the main sitemap being the root.
When a request for a document enters Cocoon, it is always processed by the main sitemap first. If a mount point for a subsitemap is reached, the processing is passed to the subsitemap (see Listing 6.11).
Listing 6.11 A Basic Example of Mounting a Subsitemap
<map:match pattern="faq/*">
  <map:mount check-reload="yes" src="faq/sitemap.xmap" reload-method="synchron"/>
</map:match>
The src attribute defines the location of the subsitemap. If it ends in a slash, sitemap.xmap is automatically appended to find the sitemap. Otherwise, Cocoon assumes that the src attribute directly defines a file containing the subsitemap.
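Following this rule, the mount from Listing 6.11 could presumably also be written with a directory as its source. This variant is a sketch based on the description above:

```xml
<map:match pattern="faq/*">
  <!-- src ends in a slash, so Cocoon looks for faq/sitemap.xmap -->
  <map:mount check-reload="yes" src="faq/" reload-method="synchron"/>
</map:match>
```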
Like the root sitemap, subsitemaps can be configured with respect to reloading. The configuration is similar to that of the root sitemap in cocoon.xconf. The check-reload attribute, which defaults to true, defines whether changes to the subsitemap should be reflected.
If this reload checking is activated, reload-method specifies whether the subsitemap regeneration should be synchronous or asynchronous. Here the same rules apply as those explained for sitemap reloading at the beginning of this chapter.
The fourth attribute for map:mount is the uri-prefix attribute. As explained, when a request enters Cocoon, the root sitemap is processed with the incoming URI. Now, if a mount point for a subsitemap is reached and Cocoon processes this subsitemap, the same URI is passed in.
For example, if you request a document called faq/installation, and the mount defined in Listing 6.11 is reached, this URI is passed on to the subsitemap unchanged. Even though you mounted the sitemap under the path faq, you still have to match this prefix inside the subsitemap. If you want to mount your subsitemap under a different path, such as old-example, you have to update not only the root sitemap but also all matches inside your subsitemap to reflect this new location. The uri-prefix attribute avoids this (see Listing 6.12).
Listing 6.12 Mounting a Subsitemap with Prefix
<map:match pattern="faq/*">
  <map:mount uri-prefix="faq/" check-reload="yes" src="faq/sitemap.xmap" reload-method="synchron"/>
</map:match>
To avoid these problems and to make the subsitemap more independent from the root sitemap, you can use the uri-prefix attribute to pass only the important part into the subsitemap. In the example, you want to pass only installation into the subsitemap.
Because the subsitemap is mounted using the path faq/, you have to remove it from the URI that is passed to the subsitemap. And that's exactly what you do with the uri-prefix attribute. You define a string starting on the left side of the URI. It is removed from the original when processing is passed to the subsitemap. In the example, you want to remove faq/ and therefore give this value to the uri-prefix attribute. Cocoon automatically checks for a trailing slash, so writing either faq or faq/ is equivalent. However, we suggest that you add the slash to make it easier to read your entry.
A subsitemap can look the same as the main sitemap. It can have the same sections, starting with a components section and ending with a pipelines section.
In fact, these two sections are the ones required to make a subsitemap work, as you can see from Listing 6.13. But you can, of course, have all the other sections as well.
Listing 6.13 An Example Subsitemap
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:components>
    <map:generators default="file"/>
    <map:transformers default="xslt"/>
    <map:readers default="resource"/>
    <map:serializers default="html"/>
    <map:selectors default="browser"/>
    <map:matchers default="wildcard"/>
  </map:components>
  <map:pipelines>
    <map:pipeline>
      <map:match pattern="*">
        <map:generate src="{1}.xml"/>
        <map:transform src="faq2html.xsl"/>
        <map:serialize/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
All requests entering the main sitemap that start with the prefix faq/ are passed to the subsitemap. The prefix is removed from the URI, and the subsitemap receives only the part of the URI that comes after this prefix.
So a request for faq/installing is passed as a request for installing to the subsitemap. As defined in the subsitemap in Listing 6.13, the request reads an XML document named installing.xml, transforms it, and serializes it as HTML.
As you can see from this example, you can use all the sitemap components from the main sitemap without declaring them again, but in order to make the subsitemap work, you have to declare the default component for each component type.
However, in order to separate concerns, you can define specific sitemap components in the components section of your subsitemap. These components are then accessible only in this subsitemap, not in the parent sitemap. You can also redefine a component inherited from the parent sitemap but with another configuration. Again, this configuration is used only in the subsitemap.
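The following sketch shows how the components section of a subsitemap might redefine the inherited xslt transformer with its own configuration. It assumes that the xslt transformer is implemented by the class org.apache.cocoon.transformation.TraxTransformer and that this class accepts a use-request-parameters setting. Check both assumptions against the main sitemap of your Cocoon version.

```xml
<map:components>
  <map:generators default="file"/>
  <map:transformers default="xslt">
    <!-- Redefines the inherited xslt transformer; this configuration
         applies only inside this subsitemap -->
    <map:transformer name="xslt"
                     src="org.apache.cocoon.transformation.TraxTransformer">
      <use-request-parameters>true</use-request-parameters>
    </map:transformer>
  </map:transformers>
  <map:readers default="resource"/>
  <map:serializers default="html"/>
  <map:selectors default="browser"/>
  <map:matchers default="wildcard"/>
</map:components>
```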
Using subsitemaps helps you manage your web site. Each sitemap editor has his own separate sitemap that cannot interfere with the other sitemaps. Even if a subsitemap stops working due to a mistake made in the subsitemap, the main sitemap and all other subsitemaps still work.
The hierarchical structure of sitemaps is not limited to two levels (one main sitemap with several subsitemaps). Because a subsitemap is a full-featured sitemap that inherits from the parent (or main) sitemap, it can have its own subsitemaps. So you can build a big tree of sitemaps using this concept.
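As a sketch of such a third level, the faq subsitemap from Listing 6.13 could itself mount another subsitemap. The path advanced/ is a hypothetical name, and the match would have to be placed before the general * match so that it is reached first.

```xml
<!-- Inside faq/sitemap.xmap; src is resolved relative to the faq directory,
     so the mounted file is faq/advanced/sitemap.xmap -->
<map:match pattern="advanced/*">
  <map:mount uri-prefix="advanced/" check-reload="yes"
             src="advanced/sitemap.xmap" reload-method="synchron"/>
</map:match>
```

Note that with the wildcard matcher a single * does not match across a slash, so the mount in the main sitemap would probably need the pattern faq/** instead of faq/* for a request such as faq/advanced/tuning to reach the faq subsitemap at all. The nested subsitemap would then receive only tuning.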
Each subsitemap can have its own directory to store resources such as XML documents and stylesheets. All URIs that do not have an explicit protocol are resolved according to the sitemap's directory. In the example, the subsitemap is stored in the directory faq. The pipeline for a document reads an XML document that is resolved relative to this directory faq.
Apart from using the concept of subsitemaps to maintain your web site, you can also use views to organize what you send to the client application.
Views
Chapter 4 glossed over the explanation of the map:views and map:resources sections in the sitemap. Let's now fill in this gap, starting with views.
A request you send to Cocoon is mapped to a pipeline in the sitemap. That pipeline uses a combination of components to generate an end result, a document that is returned to you as a result of your request. You can think of the end result as being the default view of the document generated by that particular pipeline. However, Cocoon also lets you configure and request other views of a particular document.
Cocoon offers a wide variety of configurable views for its documents. You can request a document's content view, and you will get the content in that document's XML format. Or you can ask for a document's link view and get all the links to other documents contained in this document.
The views concept is complex. So we'll start our discussion of views by looking at some simple examples and examining some use-cases. The first thing you need to know is how to specify which view of the document you want when sending the request to Cocoon. You do so using the request parameter cocoon-view with the value of the view name you ask for. So if you ask for http://localhost:8080/cocoon/helloworld?cocoon-view=content, you receive that document's content view.
The more complex question is how Cocoon knows what to do when a view is requested. Generally speaking, a view is an alternative pipeline for a document. It starts like the original pipeline for the document, but it has a different ending.
Assume that you have a standard pipeline consisting of a file generator, an xslt transformer, and an html serializer. You can then define a different view using the same file generator but a different transformer and serializer.
A view definition consists of two parts, as shown in Listing 6.14. The first part specifies which parts or beginning of the original pipeline should be used for the view. The second part defines the alternative ending. The ending is defined in the map:views section of the sitemap.
Listing 6.14 Views
<map:views>
  <map:view name="content" from-label="content">
    <map:serialize type="xml"/>
  </map:view>
  <map:view name="links" from-position="last">
    <map:serialize type="links"/>
  </map:view>
</map:views>
For each possible view, you create a map:view element with the attribute name specifying the view's unique name. Inside this element, you define the pipeline's ending. Because this is only the ending, you must not define a generator. However, you can use transformers, and you must provide a serializer.
Listing 6.14 shows two defined views: the content view and the links view. Each new view contains only a serializer. Looking at the links view, you can see that the attribute from-position has the value last. This tells Cocoon where the new pipeline should take over from the original when the links view is requested. In this case, the alternative ending for this view starts at the last position of the original pipeline.
In other words, the serializer of the original pipeline is ignored, and instead, all sitemap components enclosed in this view are appended. So the links view differs from the original document in that it uses the links serializer (see Listing 6.15).
Listing 6.15 The Link Serializer
<map:serializers>
  <map:serializer name="links" src="org.apache.cocoon.serialization.LinkSerializer"/>
</map:serializers>
The link serializer is a special serializer that outputs plain text. It extracts all links and references from a document and puts each link in a separate line of the output text. These links and references are searched for in the original document by searching for the attributes src and href.
Another possibility is to define the value first for the view's from-position attribute. Then the alternative pipeline starts immediately after the original generator.
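A minimal sketch of such a view is shown here. The view name raw is an assumption chosen for illustration.

```xml
<map:views>
  <!-- Takes over immediately after the generator of the original pipeline -->
  <map:view name="raw" from-position="first">
    <map:serialize type="xml"/>
  </map:view>
</map:views>
```

With this definition, a request such as helloworld?cocoon-view=raw would return the XML produced by the generator, without any of the transformations of the original pipeline.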
But Cocoon wouldn't be Cocoon if these were the only possibilities for defining views! You can define more fine-grained views by using the attribute from-label on the view. The value of this attribute marks a label that can be used in the original pipeline for the sitemap components.
With this label attached to sitemap components such as generators and transformers, you define which components of the original pipeline should be used for the view. Listing 6.16 shows an example.
Listing 6.16 An Example of Labeled Views
<map:generators default="file">
  <map:generator name="file" label="content" src="org.apache.cocoon.generation.FileGenerator"/>
  <map:generator name="html" src="org.apache.cocoon.generation.HTMLGenerator"/>
  ...
</map:generators>
...
<map:pipeline>
  <map:match pattern="document_one">
    <map:generate src="document.xml"/>
    <map:transform src="document2html.xsl"/>
    <map:serialize/>
  </map:match>
  <map:match pattern="document_two">
    <map:generate src="page.html" type="html"/>
    <map:transform label="content" src="restructure.xsl"/>
    <map:transform src="document2html.xsl"/>
    <map:serialize/>
  </map:match>
</map:pipeline>
The component definition of the file generator is labeled with a label called content. This indicates that whenever a view is requested and this view uses the label content, the generator is included in the pipeline for this view. Similarly, you can mark other generators and transformers in the components section as well.
The pipeline for the first document, called document_one (see Figure 6.7), is assembled by the file generator, an xslt transformer, and the html serializer. When the content view is requested, Cocoon looks at the map:views section and finds the definition for this view. This view indicates that the label content is used. During the pipeline assembly, the components for this pipeline are checked for the label.
Figure 6.7 A simple example of using views.
The file generator is labeled, so it is used. If a component is labeled, it is added to the pipeline for the view, and the usual pipeline processing is passed to the views section. All other sitemap components of the original pipeline are ignored, and the components of the views section are appended.
The pipeline for the second example (document_two), shown in Figure 6.8, is assembled by the html generator, two xslt transformers, and the html serializer. Note that neither the html generator nor the xslt transformer is labeled in the components section. When the content view of this document is requested, the original pipeline is searched for the label.
Figure 6.8 An advanced example of using views.
In general, the xslt transformer is not labeled, so it usually isn't added to the pipeline for the view. But for this special pipeline, you can indicate that the transformer should be added by giving it an attribute label with the value content. The first xslt transformer is labeled using the attribute label with the given value.
The process here is the same as in the first example. All sitemap components are added to the pipeline until one component is labeled. This component is added as well, but the following ones are skipped. Then the view's sitemap components are appended. For this example, the view is assembled from the html generator, the first xslt transformer, and the xml serializer from the content view.
Regardless of whether the label is defined in the components or pipelines section of the sitemap, the original sitemap is left immediately after the first component containing the label. Even if you have more than one component in the pipeline marked with the required label, only the first component containing it is used.
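Because a view may contain transformers in addition to its serializer, a labeled view can also apply its own formatting. The following sketch assumes a hypothetical view name simple and a hypothetical stylesheet content2simple.xsl.

```xml
<map:views>
  <!-- Leaves the original pipeline after the first component labeled
       "content", then applies its own transformer and serializer -->
  <map:view name="simple" from-label="content">
    <map:transform src="content2simple.xsl"/>
    <map:serialize type="html"/>
  </map:view>
</map:views>
```

For document_two in Listing 6.16, this view would be assembled from the html generator, the first xslt transformer, the view's own transformer, and the html serializer.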
As you will see at the end of this chapter, the links view is important for the offline generation of documents using Cocoon's command-line interface.
Now that you know about Cocoon's views, you know about nearly all of a sitemap's sections. So, let's discuss the last one.
Sitemap Resources
The last section we have yet to explain is the map:resources section (see Listing 6.17). This section is very similar to the map:pipeline section. You can define XML processing pipelines containing a generator, transformers, and a serializer and give this pipeline a name for further use in the map:pipelines section of the sitemap.
Listing 6.17 An Example of a Sitemap Resource
<map:resources>
  <map:resource name="Not authorized">
    <map:generate src="notauthorized.xml"/>
    <map:transform src="tohtml.xsl"/>
    <map:serialize/>
  </map:resource>
</map:resources>
You can refer to this resource from the map:pipelines section using its unique name. So a sitemap resource can be compared to a macro or a placeholder.
Currently, the only place in Cocoon where you can use sitemap resources is for redirects.
Redirects
Basically, a redirect allows you to jump from one pipeline to another. You can redirect to a totally different URI or to a previously defined sitemap resource. Listing 6.18 shows two examples.
Listing 6.18 Examples of Redirects
<map:redirect-to uri="helloworld"/>
<map:redirect-to resource="Not authorized"/>
Unfortunately, the semantics of the map:redirect-to statement differ a bit from those of the other sitemap components. Usually, if you specify a source, such as for a generator, and you do not specify a protocol for the URI, Cocoon automatically adds the context protocol.
However, for a redirect to a relative URI, this is not the case. Cocoon implicitly adds the same protocol used to request the original document. For example, if you request a document with http://localhost:8080/cocoon/original_document, and this results in the execution of the previous redirect to helloworld as shown in Listing 6.18, Cocoon generates a new URI using the old one as a base. The redirect then references http://localhost:8080/cocoon/helloworld. So a relative URI is translated into an absolute URI.
Cocoon does not process a redirect to a URI itself. Instead, it generates an HTTP response to the client. This response signals that a redirect is required and contains the redirect URI. The client recognizes the redirect and starts a new request with the new URI. So whenever you use such a redirect, the result is at least two requests to your server: the first one yields the redirect, and the second requests the redirected document.
If you redirect to a sitemap resource, the processing flow is continued in the new sitemap resource. Thus, the sitemap components defined in this resource are executed.
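To show where the two kinds of redirect would sit in a sitemap, here is a minimal sketch. The resource is the one from Listing 6.17; the match patterns moved and admin/* are hypothetical and serve only as examples, and the components section is left out.

```xml
<!-- Sketch only: the match patterns are hypothetical. -->
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:resources>
    <map:resource name="Not authorized">
      <map:generate src="notauthorized.xml"/>
      <map:transform src="tohtml.xsl"/>
      <map:serialize/>
    </map:resource>
  </map:resources>
  <map:pipelines>
    <map:pipeline>
      <!-- Redirect to a URI: the client receives a redirect response
           and sends a second request for helloworld. -->
      <map:match pattern="moved">
        <map:redirect-to uri="helloworld"/>
      </map:match>
      <!-- Redirect to a resource: processing continues inside the
           resource, so its generator, transformer, and serializer run. -->
      <map:match pattern="admin/*">
        <map:redirect-to resource="Not authorized"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```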
Now that you know about all the additional sitemap features and some Cocoon configuration points, it is time to bring in two new components and show you some examples that use them and the concepts described in this chapter.
Connecting to a Database
You can use the sql transformer in a pipeline to integrate a database as one of the data sources in a Cocoon application. Using this transformer, you can send any SQL command to a database. The transformer is controlled by commands contained in the XML stream processed by the transformer. If the SQL command fetches data from the database, the data is converted into XML.
You might wonder why this is a transformer and not a generator. The key point is usability. In general, SQL statements can have many options and parameters. This starts with specifying the database to use, the tables, and the rows, and it ends with complex information such as search phrases. If you want to use a generator, you have to specify all this in the sitemap as parameters for the generator. Changing a simple value would then require changing the sitemap.
Using a transformer allows you to build more-complex pipelines in which the information on what to fetch from the database is determined at runtime using the file generator, for example. When the request is processed, the file generator reads an XML file that contains the actual parameters for the transformer. Because the file generator can request the XML file via a protocol such as HTTP, this allows the dynamic generation of those commands.
Listing 6.19 shows the configuration of the sql transformer in the sitemap and how to use it in a pipeline.
Listing 6.19 SQL Transformer
<map:transformers>
  <map:transformer name="sql" src="org.apache.cocoon.transformation.SQLTransformer"/>
</map:transformers>
...
<map:pipeline>
  <map:match pattern="test">
    <map:generate src="document.xml"/>
    <map:transform type="sql"/>
    <map:transform src="tohtml.xsl"/>
    <map:serialize/>
  </map:match>
</map:pipeline>
You can send any valid SQL command to the database. This is triggered by your XML document. Listing 6.20 shows an XML document that is read by the file generator and then is transformed by the sql transformer.
Listing 6.20 A Simple SQL Example
<document>
  <sql:execute-query xmlns:sql="http://apache.org/cocoon/SQL/2.0">
    <sql:use-connection>personnel</sql:use-connection>
    <sql:query>
      select id,name from department_table
    </sql:query>
  </sql:execute-query>
</document>
The sql transformer is triggered by XML elements that have the transformer's namespace, http://apache.org/cocoon/SQL/2.0. Each command is started by the element execute-query. Nested inside this element is all the information for the sql transformer, a combination of elements and text information.
The element use-connection defines which connection (or database) should be used for the SQL command. The following example will show you how you can configure database connections. For now, just assume you have defined a database connection named personnel.
Inside the query element, you can see the actual SQL command to be sent to the database. When the sql transformer receives such an XML block, it removes it from the XML document. If the SQL command fetches some data, this data is converted to XML and is inserted instead of the XML block controlling the sql transformer.
How is this data converted? The transformer creates a rowset element. Inside it, a row element is created for each fetched row. Inside each row element, one element is created for each fetched column; it is named after the column and contains a text node with the value of that column from the database. All these elements are in the namespace of the sql transformer.
You could then simply add a stylesheet to the XML processing pipeline, converting the rowset to an HTML table or whatever you like. The output displayed in Listing 6.21 is an intermediate XML document that is created during the pipeline processing. Because you will receive HTML in your browser, you will never notice this document; you will see only the starting XML document and the final output.
Listing 6.21 The Document after a SQL Transformer Run
<document xmlns:sql="http://apache.org/cocoon/SQL/2.0">
  <sql:rowset>
    <sql:row>
      <sql:name>Matthew</sql:name>
      <sql:id>1</sql:id>
    </sql:row>
    <sql:row>
      <sql:name>Carsten</sql:name>
      <sql:id>2</sql:id>
    </sql:row>
  </sql:rowset>
</document>
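A stylesheet along the following lines could turn the rowset of Listing 6.21 into an HTML table. It is only a sketch of what a file such as tohtml.xsl might contain, not the stylesheet used in the example; it assumes the document root and the sql namespace shown in the listing.

```xml
<!-- Sketch only: one possible stylesheet for the rowset, not the book's tohtml.xsl. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:sql="http://apache.org/cocoon/SQL/2.0"
    exclude-result-prefixes="sql">

  <!-- Wrap the result in a minimal HTML page -->
  <xsl:template match="/document">
    <html>
      <body>
        <xsl:apply-templates select="sql:rowset"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each rowset becomes a table -->
  <xsl:template match="sql:rowset">
    <table>
      <xsl:apply-templates select="sql:row"/>
    </table>
  </xsl:template>

  <!-- Each row becomes a table row with one cell per column element -->
  <xsl:template match="sql:row">
    <tr>
      <xsl:for-each select="*">
        <td><xsl:value-of select="."/></td>
      </xsl:for-each>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

Because the row template iterates over all child elements, it does not depend on the column names, so the same stylesheet would work for other queries as well.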
But what if your resulting document does not display the data you wanted? You need to know what the sql transformer has output in order to see if your SQL statement is working. You can, of course, change your document's pipeline definition. Instead of using a stylesheet to produce HTML and the html serializer, you can simplify the pipeline by removing the stylesheet and using the xml serializer. This shows you the data delivered by the sql transformer directly in your browser.
Another answer to this problem is to use the log transformer to see what is happening in the pipeline.
Logging
Usually pipelines consist of three or more sitemap components, starting with a generator, going to some transformers, and ending with a serializer. In the case of the file generator, you can see the starting XML document that is read by this component and the end result of the pipeline processing.
But what can you do if your output document doesn't look as you expected? One simple solution is to change your pipeline. Just remove all transformers after the component you want to test, and add the xml serializer. You will get the output of the transformer you want to test directly in XML.
If this stage of your pipeline looks right, you can then remove the next transformer in the chain and look at that output, and so on until you know where the fault is.
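Applied to the pipeline from Listing 6.19, such a debugging variant might look like the sketch below. It assumes that an xml serializer is configured under the name xml in the components section, which is omitted here.

```xml
<!-- Sketch only: debugging variant of the test pipeline; components section omitted. -->
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <map:match pattern="test">
        <map:generate src="document.xml"/>
        <map:transform type="sql"/>
        <!-- The tohtml.xsl transformer is removed while debugging, so the
             output of the sql transformer reaches the browser as XML. -->
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```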
Another possibility is the log transformer (see Listing 6.22), which can be chained between two sitemap components. As the name suggests, this transformer logs the output of the sitemap component before the log transformer.
Listing 6.22 The Log Transformer
<map:transformers>
  <map:transformer name="log" src="org.apache.cocoon.transformation.LogTransformer"/>
</map:transformers>
...
<map:pipeline>
  <map:match pattern="test">
    <map:generate src="document.xml"/>
    <map:transform type="sql"/>
    <map:transform type="log">
      <map:parameter name="logfile" value="logfile.log"/>
      <map:parameter name="append" value="no"/>
    </map:transform>
    <map:transform src="tohtml.xsl"/>
    <map:serialize/>
  </map:match>
</map:pipeline>
In Listing 6.22, the output of the sql transformer is logged. If no parameters are set for the log transformer, it writes everything to the servlet log of your servlet engine. But you can, of course, redirect the output to a file on your local hard drive. The sitemap parameter logfile defines the location of that file. With the parameter append, you can specify whether a new log file is written each time or the output is appended to an existing file.
But be careful when using the log transformer in a servlet environment. It is not safe for concurrent requests: if more than one client requests a document whose pipeline contains the log transformer at the same time, the output of these pipelines is mixed in the log. So for debugging, make sure that only one client invokes the request at a time.
This section covered the advanced features of the sitemap. You saw that a Cocoon application is not limited to just one sitemap, but that sitemaps can be cascaded. This feature is particularly useful when the application consists of separate parts. Using the available protocols and components such as the sql transformer, you can integrate existing data sources into your application. Content aggregation allows configured information sources to be flexibly combined into a single document. The document you receive as a pipeline's output is only one of the views Cocoon can provide. Using the views concept, you can define alternative pipelines that can return, for example, only the content or the links of a particular document. You can use the logging mechanism to check on what is happening in your pipeline, which is important if things do not work as expected.
Although the most common form of running Cocoon is as a servlet, this is only one way of using the framework. In fact, it is only a very small part of Cocoon that is servlet specific. This part is only one of the interfaces Cocoon provides to the outside world. Another important interface that allows Cocoon to be used in different environments is the Command-Line Interface.