Advanced Sitemap Features

If you are already somewhat familiar with Cocoon, you will have noticed that we left out some features when we first introduced it. The main reason for this was to make it easier for first-time users to get started with Cocoon. Now that we have expanded on the first block of information with examples and the first version of the news portal application, we can complete the description of the sitemap features from a user perspective.

One of the most important functions in Cocoon is its ability to obtain data from various sources. This is done through different protocols. This section introduces some Cocoon-specific protocols. We will also explain some new sitemap component types and the views and resources sections of the sitemap. However, before we dive into the details, let's begin our look at the sitemap with a slightly different type of component—the action-set.

Action-Sets

Chapter 4 introduced the component type action, which can be used in any pipeline to fulfill a defined task. Cocoon also offers a more flexible approach to using actions: action-sets.

In contrast to other sitemap component types, an action-set is a combination of previously defined actions that can be used in a pipeline as though it were a single component. Defining an action-set is like defining a pipeline, which is also a combination of sitemap components. An action-set is defined inside its own sitemap section, the map:action-sets section.

Each action-set is introduced with the map:action-set element, which receives a unique name via the attribute name. Inside this element you can enter as many actions as you like, as shown in Listing 6.8. You arrange a set of actions to form a group.

Listing 6.8 An Example of an Action-Set

<map:action-sets>
 <map:action-set name="myactionset">
  <map:act type="log-start-action"/>
  <map:act type="add-action" action="add"/>
  <map:act type="del-action" action="delete"/>
  <map:act type="log-end-action"/>
 </map:action-set>
</map:action-sets>

A defined action-set can be used in the pipeline just like a normal action via the tag <map:act set="myactionset"/>. The difference is that the attribute set is used instead of type.

If you use an action-set, all actions of this set are called in the order they are defined. In addition, it is possible to selectively call an action inside an action-set. To do this, you can define each action in the action-set to have an attribute action. If the current request being processed by the pipeline contains a request parameter called cocoon-action, the action with the corresponding action attribute in the action-set is called.

In Listing 6.8, if the action-set myactionset is used, log-start-action is invoked. If the request currently being processed contains a cocoon-action parameter with the value add, the action add-action is invoked. If instead the cocoon-action parameter has the value delete, del-action is invoked. Finally, log-end-action is always invoked. The cocoon-action parameter can contain only one value, so either add-action or del-action or neither is invoked, but never both at the same time.
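As a minimal sketch, the action-set from Listing 6.8 could be placed in a pipeline as follows. The pattern edit and the file names entries.xml and entries2html.xsl are assumptions made for illustration only.

```xml
<!-- Hypothetical pipeline: "edit", entries.xml, and entries2html.xsl
     are illustrative names, not part of the original example -->
<map:match pattern="edit">
  <!-- Calls log-start-action, then add-action or del-action depending on
       the cocoon-action request parameter, then log-end-action -->
  <map:act set="myactionset"/>
  <map:generate src="entries.xml"/>
  <map:transform src="entries2html.xsl"/>
  <map:serialize/>
</map:match>
```

With this sketch, a request for edit?cocoon-action=add would invoke log-start-action, add-action, and log-end-action, whereas a request for edit without the parameter would invoke only the two logging actions.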

Do you remember value substitution, discussed in Chapter 4? An action can provide key-value pairs for other sitemap components. All components nested inside the action have access if they know the key's name.

Value substitution for action-sets is very similar, as shown in Figure 6.5. Whereas all values of an action are accessible using the key for nested components, all values of all called actions of the action-set are available inside the action-set element. Therefore, the value substitution algorithm collects all values from all actions. However, if two actions use the same key inside an action-set, only the value of the last action is available. It overrides the previous one.

Figure 6.5 Value substitution for action-sets.
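The following sketch shows how a nested component might use a value collected from an action-set. The key name page and the pattern edit are assumptions; the keys actually available depend on what each action in the set provides.

```xml
<!-- Hypothetical example: the key "page" is assumed to be provided
     by at least one action in myactionset -->
<map:match pattern="edit">
  <map:act set="myactionset">
    <!-- If several called actions provide "page", the value from the
         action called last is the one substituted here -->
    <map:generate src="{page}.xml"/>
    <map:serialize type="xml"/>
  </map:act>
</map:match>
```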

Using action-sets allows you to build modular components that can be used flexibly in pipelines. Often, actions are used to control the flow inside a pipeline and to determine such things as which data source needs to be accessed for the current request. Using the various protocols available in Cocoon allows a variety of different possibilities when it comes to retrieving data or calling internal functions as part of processing.

Protocols

A concept widely used inside sitemaps is the definition of URIs. On the one hand, you define the sitemap to spawn a virtual URI space, which is served by Cocoon, but more obviously, you use URIs to specify which resources are to be read by the various sitemap components. For example, the file generator needs an XML document as input; the xslt transformer processes a stylesheet, and so on.

As we discussed in Chapter 4, you can use any protocol supported by Cocoon to define your URIs and to access resources. For example, you can use an HTTP connection to retrieve an XML document from a remote server, an FTP connection to read a stylesheet, or the file protocol to read a file from the local hard drive.

In addition to these standard protocols, Cocoon offers additional protocols that can also be used inside the source definition of a generator, a transformer, or any other component. All these protocols follow the general pattern for building URIs: protocolname://path to the resource. Cocoon supports a resource protocol, a context protocol, a cocoon protocol, and a protocol that is used implicitly.

The Implicit Protocol

The most important protocol is the implicit protocol, which you have already used without noticing. As the name suggests, this protocol is used implicitly whenever a protocol definition is missing. For example, if you write something like <map:generate src="mydocument.xml"/>, Cocoon can handle it even though the protocol is missing.

How Cocoon handles this depends on how you deployed the web application. There are two ways of doing this. You can bundle everything into a web archive (WAR) file, or you can deploy everything as individual files. If your web application is not a WAR file, Cocoon implicitly adds the file protocol. All the references are then resolved relative to the location of the current sitemap using the file protocol. If you have a WAR file, Cocoon implicitly adds the protocol provided by the servlet engine to access these files, again relative to the location of the current sitemap.

This means that you don't need to worry about explicitly using a protocol when you define your pipelines and the resources they are to access. However, it is always better to add the protocol explicitly, because this makes your sitemap entries more readable to someone who is not as familiar with the inner workings of Cocoon.

The Context Protocol

The context protocol is used to access any resource belonging to the Cocoon web application. If you deployed the Web application from a directory on your hard drive, the context protocol is directly mapped to the filesystem. So the resource definition context://mydocument.xml is translated to a file URI pointing to the Cocoon web application directory—more precisely, to a file called mydocument.xml inside this directory.

If you have deployed your Cocoon web application as a WAR file, you access the resources inside the WAR file using the context protocol. The argument following the protocol is a path relative to the root of the WAR file. So again, context://mydocument.xml references a file named mydocument.xml stored at the root of the WAR file.

So, if you use the context protocol, you can abstract from how you deployed your Cocoon web application. Cocoon can determine whether to use the filesystem or the WAR file to resolve the resource you might want to load.
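A pipeline using the context protocol might look like the following sketch. The pattern welcome is a hypothetical name; the source URI is the one used in the text above.

```xml
<!-- Hypothetical pattern name; the source is resolved against the root
     of the web application, whether deployed as a directory or as a WAR file -->
<map:match pattern="welcome">
  <map:generate src="context://mydocument.xml"/>
  <map:serialize type="xml"/>
</map:match>
```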

Whereas the context protocol can be used to access resources inside a WAR file or in a filesystem, the resource protocol can locate resources inside Java archives (JAR files).

The Resource Protocol

Because Cocoon is implemented using Java, it consists of several JAR files that contain the various parts. A JAR file can contain more than Java code. It can hold any resource, such as images, XML documents, or stylesheets. All these JAR files are located in the WEB-INF/lib directory of your Cocoon context and are loaded automatically at startup by your servlet engine.

If you want to read such a resource, you can simply use the resource protocol followed by a path specifying the resource precisely. Cocoon then searches all loaded JAR files for this resource. For example, resource://org/apache/cocoon/components/language/markup/xsp/java/xsp.xsl specifies a file named xsp.xsl. This file is in one of the JAR files in the directory structure org/apache/cocoon/components/language/markup/xsp/java. So one JAR file has a root directory called org, which has a subdirectory named apache, and so on.
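Because a stylesheet is itself an XML document, the resource named above can serve as a small illustration. The sketch below simply reads it from the JAR file and returns it unchanged; the pattern show-xsp-stylesheet is an assumption.

```xml
<!-- Hypothetical pattern name; the resource URI is the one from the text -->
<map:match pattern="show-xsp-stylesheet">
  <map:generate src="resource://org/apache/cocoon/components/language/markup/xsp/java/xsp.xsl"/>
  <map:serialize type="xml"/>
</map:match>
```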

So far, we have looked at protocols that allow you to access static resources. But what if you want to access resources that are not available as a unit but must be built by a process?

The Cocoon Protocol

Because Cocoon is a processing framework that can build documents using processing pipelines, sooner or later you might want to use a Cocoon resource as the input for a generator in another resource. Doing this lets you use the result of a resource as the starting point for a pipeline or as the input for any other component. So what you need is a way to access the result of one pipeline from another pipeline.

The cocoon protocol allows you to do exactly this. It accesses pipelines inside the sitemap. For example, <map:generate src="cocoon:/helloworld"/> uses the file generator to read the XML document that is produced by a request for the document helloworld against the sitemap.

Whenever you use the cocoon protocol, Cocoon internally processes a new request for the specified document and uses this result for the ongoing processing of the original request.
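To make this concrete, here is a minimal sketch of two pipelines, where the second one consumes the result of the first through the cocoon protocol. The file names helloworld.xml and helloworld2html.xsl and the pattern helloworld.html are assumptions for illustration.

```xml
<map:pipeline>
  <!-- Produces the XML that the second match consumes -->
  <map:match pattern="helloworld">
    <map:generate src="helloworld.xml"/>
    <map:serialize type="xml"/>
  </map:match>
  <!-- Hypothetical second document: its generator triggers an internal
       request for "helloworld" and reads the resulting XML -->
  <map:match pattern="helloworld.html">
    <map:generate src="cocoon:/helloworld"/>
    <map:transform src="helloworld2html.xsl"/>
    <map:serialize type="html"/>
  </map:match>
</map:pipeline>
```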

The main use of this protocol is content aggregation, in which you can build a document from more than one source, as you will see in the next section. But you can, of course, use this protocol everywhere in the sitemap—for example, as an input to the xslt transformer.

All in all, the different protocols allow a very flexible mechanism for accessing data sources. You can also add your own new protocol if you like. We will show you how to do this in Chapter 9, "Developing Components for Cocoon." As soon as you have set up pipelines to access the various data sources, content aggregation allows you to combine them inside the sitemap.

Content Aggregation

When designing web applications, such as a portal, you often need to build complex documents consisting of several parts. Consider a typical information web site. The document consists of a header displaying, for example, the name of the company, a navigation bar, a block of content that was chosen from the navigation bar, and perhaps a footer displaying some static information.

Although this is a single document, it consists of four parts: header, navigation bar, content, and footer. Many documents follow this scheme. For each piece of content you display on your web site, you have exactly one document consisting of three static parts—header, navigation bar, and footer—and the content. How can documents like this be created easily?

One solution is to define a separate pipeline for each document. Each pipeline then reads an XML document containing not only the content but also XML information for the header, footer, and navigation bar. The XML information is then formatted by a stylesheet to present the complete page.

The problem with this solution is that you cannot access just the content. You would need to do this if you wanted to format the data into a PDF document, where you do not need the additional information on a header or footer.

Even worse, defining separate pipelines mixes concerns. The content should not need to know about the other parts, and vice versa. So the ideal solution would be to create the parts as separate documents and then be able to combine them.

That's where content aggregation comes in handy. You can define a document that is a combination or aggregation of other documents. To do this, you need to define a pipeline in the sitemap and use some tags specific to content aggregation, as shown in Listing 6.9.

Listing 6.9 An Example of Content Aggregation

<map:pipeline internal-only="true">
  <map:match pattern="header">
    <map:generate src="header.xml"/>
    <map:serialize type="xml"/>
  </map:match>
  <map:match pattern="footer">
    <map:generate src="footer.xml"/>
    <map:serialize type="xml"/>
  </map:match>
  <map:match pattern="navigation">
    <map:generate src="navigation.xml"/>
    <map:serialize type="xml"/>
  </map:match>
  <map:match pattern="*">
    <map:generate src="docs/{1}.xml"/>
    <map:serialize type="xml"/>
  </map:match>
</map:pipeline>
<map:pipeline>
  <map:match pattern="docs/*">
    <map:aggregate element="document">
      <map:part src="cocoon:/header"   element="header"/>
      <map:part src="cocoon:/navigation" element="navigation"/>
      <map:part src="cocoon:/{1}"    element="content"/>
      <map:part src="cocoon:/footer"   element="footer"/>
    </map:aggregate>
    <map:transform src="all2html.xsl"/>
    <map:serialize type="html"/>
  </map:match>
</map:pipeline>

Listing 6.9 has some new elements we need to define before proceeding with our discussion. The most obvious one is the map:aggregate command. It is used inside an XML processing pipeline as a replacement for the map:generate instruction you would have in a normal pipeline. It defines a content aggregation of the parts, which are defined as nested map:part elements. In our example, we are building a complete document containing a header, a footer, navigation, and content. The attribute element of map:aggregate defines the root element of the generated XML document. Each part can have an element, under which you can find this part in the aggregated content. See Listing 6.10.

Listing 6.10 Aggregated Content

<?xml version="1.0"?>
<document>
  <header>
    <!-- here is the content of the header document -->
  </header>
  <navigation>
    <!-- here is the content of the navigation document -->
  </navigation>
  <content>
    <!-- here is the content of the content document -->
  </content>
  <footer>
    <!-- here is the content of the footer document -->
  </footer>
</document>

As you can see from Listing 6.10, the content is aggregated from the various parts. The following components in the pipeline, such as the xslt transformer, can transform this aggregated document into HTML or whatever format is required.

You do not need to define an element attribute for a part. If it is omitted, the part's content is directly included under the document's root node.
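As a short sketch of this case, the aggregation below wraps the header in its own element but omits the element attribute for the content part. It is a shortened variant of Listing 6.9, not a replacement for it.

```xml
<map:aggregate element="document">
  <map:part src="cocoon:/header" element="header"/>
  <!-- No element attribute: this part's content is included
       directly under the <document> root node -->
  <map:part src="cocoon:/{1}"/>
</map:aggregate>
```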

The cocoon protocol is used for each part. Therefore, each part is defined by another pipeline somewhere in the sitemap. In this example, these pipelines are all inside their own map:pipeline section in the sitemap.

Normally, because the separate parts are pipelines in the sitemap, you would be able to access them individually using a browser. This is not what you want, however, because it would result in your receiving only part of a document.

Because you do not want to be able to receive only the document header or navigation or footer or the content itself without the surrounding parts, this map:pipeline section is protected with the attribute internal-only set to true. With this attribute set, all marked map:pipeline sections are skipped when Cocoon processes a request coming from outside, such as one sent by a browser. These pipelines can only be invoked "internally" by using the cocoon protocol from within another pipeline.

You can control content aggregation using three more attributes for an aggregated part: prefix, ns, and strip-root. So, a full-featured part might look like this:

<map:part src="cocoon:/header" strip-root="true" 
 prefix="header" ns="header://version/1.0" element="header"/>

The top-level element for the header part is called header. It gets the namespace defined by the attribute ns. The attribute prefix is used to define the prefix. So the top-level element looks like this:

<header:header xmlns:header="header://version/1.0"/>

You can leave out the attribute prefix.

In addition, you can use the attribute strip-root with a Boolean value. If it is set to true, the root element of the aggregated part is stripped off. So if the pipeline for the document header has the root element myheader, it is not included. All children of the myheader element are included under the root element of the part.
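The effect of strip-root can be sketched with a small example. Assume, purely for illustration, that the header pipeline returns the following document; the title element and its text are made up.

```xml
<!-- Assumed output of the header pipeline -->
<myheader>
  <title>My Company</title>
</myheader>
```

With a part defined as <map:part src="cocoon:/header" element="header" strip-root="true"/>, the aggregated document would then contain the children of myheader directly under the part's element:

```xml
<document>
  <header>
    <!-- myheader itself has been stripped; only its children remain -->
    <title>My Company</title>
  </header>
</document>
```

Without strip-root, the myheader element would appear between header and title.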

Although you might get the impression that you must use the cocoon protocol to aggregate parts, this is not true. You can use any protocol available. The simplest case is aggregating XML files.
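A minimal sketch of this simplest case follows. The pattern overview and the file docs/overview.xml are assumptions; the other file names are taken from Listing 6.9.

```xml
<!-- Hypothetical pattern and content file -->
<map:match pattern="overview">
  <map:aggregate element="document">
    <!-- No protocol is given, so these sources are resolved
         through the implicit protocol -->
    <map:part src="header.xml" element="header"/>
    <map:part src="docs/overview.xml" element="content"/>
    <map:part src="footer.xml" element="footer"/>
  </map:aggregate>
  <map:transform src="all2html.xsl"/>
  <map:serialize type="html"/>
</map:match>
```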

Later you will see practical examples and tips and a real-world example of content aggregation. This example—the Cocoon online documentation—also uses some other features not explained yet. One of them is the concept of subsitemaps.

Subsitemaps

When you develop large web applications, or when more than one person is editing the sitemap, the sitemap can become very difficult to maintain, because it is a single big XML document.

To simplify sitemap editing and maintenance, Cocoon offers the concept of subsitemaps (see Figure 6.6). A subsitemap looks like a normal sitemap, but it is mounted into the main sitemap. By mounting, we mean that you usually define a URI prefix for a subsitemap. All incoming requests starting with this prefix are then handled by the subsitemap.

Figure 6.6 Subsitemaps.

The mount points allow you to cascade your sitemaps. This improves readability and makes the web application easier to manage, because each subsitemap can be maintained by a different person. After mounting, you can imagine the whole construction as a tree, with the main sitemap being the root.

When a request for a document enters Cocoon, it is always processed by the main sitemap first. If a mount point for a subsitemap is reached, the processing is passed to the subsitemap (see Listing 6.11).

Listing 6.11 A Basic Example of Mounting a Subsitemap

<map:match pattern="faq/*">
 <map:mount check-reload="yes" src="faq/sitemap.xmap" reload-method="synchron"/>
</map:match>

The src attribute defines the location of the subsitemap. If it ends in a slash, sitemap.xmap is automatically appended to find the sitemap. Otherwise, Cocoon assumes that the src attribute directly defines a file containing the subsitemap.

Like the root sitemap, subsitemaps can be configured with respect to reloading. The configuration is similar to that of the root sitemap in cocoon.xconf. The check-reload attribute, which defaults to true, defines whether changes to the subsitemap should be reflected.

If this reload checking is activated, reload-method specifies whether the subsitemap regeneration should be synchronous or asynchronous. Here the same rules apply as those explained for sitemap reloading at the beginning of this chapter.

The fourth attribute for map:mount is the uri-prefix attribute. As explained, when a request enters Cocoon, the root sitemap is processed with the incoming URI. Now, if a mount point for a subsitemap is reached and Cocoon processes this subsitemap, the same URI is passed in.

For example, if you request a document called faq/installation, and the mount defined in Listing 6.11 is reached, this URI is passed on to the subsitemap unchanged. Even though you mounted the sitemap under the path faq, you still have to match this prefix inside the subsitemap. If you want to mount your subsitemap under a different path, such as old-example, you have to update the root sitemap to add a prefix and also all matches inside your subsitemap to reflect this new location (see Listing 6.12).

Listing 6.12 Mounting a Subsitemap with Prefix

<map:match pattern="faq/*">
 <map:mount uri-prefix="faq/" check-reload="yes" src="faq/sitemap.xmap"
  reload-method="synchron"/>
</map:match>

To avoid these problems and to make the subsitemap more independent from the root sitemap, you can use the uri-prefix attribute to pass only the important part into the subsitemap. In the example, you want to pass only installation into the subsitemap.

Because the subsitemap is mounted using the path faq/, you have to remove it from the URI that is passed to the subsitemap. And that's exactly what you do with the uri-prefix attribute. You define a string starting on the left side of the URI. It is removed from the original when processing is passed to the subsitemap. In the example, you want to remove faq/ and therefore give this value to the uri-prefix attribute. Cocoon automatically checks for a trailing slash, so writing either faq or faq/ is equivalent. However, we suggest that you add the slash to make it easier to read your entry.
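To illustrate the benefit, the sketch below mounts the same subsitemap under the path old-example mentioned earlier. Only the root sitemap changes; this assumes the subsitemap's own matches contain no prefix, as in the example discussed here.

```xml
<!-- Root sitemap: same subsitemap file, different mount path -->
<map:match pattern="old-example/*">
  <map:mount uri-prefix="old-example/" check-reload="yes"
   src="faq/sitemap.xmap" reload-method="synchron"/>
</map:match>
```

A request for old-example/installation would then reach the subsitemap as installation, exactly as a request for faq/installation does with the mount in Listing 6.12.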

A subsitemap can look the same as the main sitemap. It can have the same sections, starting with a components section and ending with a pipelines section.

In fact, these two sections are the ones required to make a subsitemap work, as you can see from Listing 6.13. But you can, of course, have all the other sections as well.

Listing 6.13 An Example Subsitemap

<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:components>
    <map:generators default="file"/>
    <map:transformers default="xslt"/>
    <map:readers default="resource"/>
    <map:serializers default="html"/>
    <map:selectors default="browser"/>
    <map:matchers default="wildcard"/>
  </map:components>
  <map:pipelines>
    <map:pipeline>
      <map:match pattern="*">
        <map:generate src="{1}.xml"/>
        <map:transform src="faq2html.xsl"/>
        <map:serialize/>
      </map:match>
    </map:pipeline>
  </map:pipelines>

</map:sitemap>

All requests entering the main sitemap that start with the prefix faq/ are passed to the subsitemap. The prefix is removed from the URI, and the subsitemap receives only the part of the URI that comes after this prefix.

So a request for faq/installing is passed as a request for installing to the subsitemap. As defined in the subsitemap in Listing 6.13, the request reads an XML document named installing.xml, transforms it, and serializes it as HTML.

As you can see from this example, you can use all the sitemap components from the main sitemap without declaring them again, but in order to make the subsitemap work, you have to declare the default component for each component type.

However, in order to separate concerns, you can define specific sitemap components in the components section of your subsitemap. These components are then accessible only in this subsitemap, not in the parent sitemap. You can also redefine a component inherited from the parent sitemap but with another configuration. Again, this configuration is used only in the subsitemap.
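As a sketch, a subsitemap's components section could declare an additional serializer like this. The example reuses the link serializer class named elsewhere in this chapter; any other component would be declared in the same way.

```xml
<map:components>
  <map:generators default="file"/>
  <map:transformers default="xslt"/>
  <map:readers default="resource"/>
  <map:serializers default="html">
    <!-- Declared only in this subsitemap, so the parent sitemap
         cannot use a serializer of type "links" from here -->
    <map:serializer name="links"
     src="org.apache.cocoon.serialization.LinkSerializer"/>
  </map:serializers>
  <map:selectors default="browser"/>
  <map:matchers default="wildcard"/>
</map:components>
```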

Using subsitemaps helps you manage your web site. Each sitemap editor has his own separate sitemap that cannot interfere with the other sitemaps. Even if a subsitemap stops working due to a mistake made in the subsitemap, the main sitemap and all other subsitemaps still work.

The hierarchical structure of sitemaps is not limited to two levels (one main sitemap with several subsitemaps). Because a subsitemap is a full-featured sitemap that inherits from the parent (or main) sitemap, it can have its own subsitemaps. So you can build a big tree of sitemaps using this concept.
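A second level could be sketched as follows, placed inside a pipeline of the faq subsitemap. The subdirectory advanced and its sitemap are hypothetical.

```xml
<!-- Inside faq/sitemap.xmap; "advanced" is a hypothetical
     subdirectory of faq containing its own sitemap.xmap -->
<map:match pattern="advanced/*">
  <map:mount uri-prefix="advanced/" check-reload="yes"
   src="advanced/sitemap.xmap" reload-method="synchron"/>
</map:match>
```

Assuming the faq subsitemap is mounted as in Listing 6.12, a request for faq/advanced/tuning would arrive at the innermost sitemap as tuning.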

Each subsitemap can have its own directory to store resources such as XML documents and stylesheets. All URIs that do not have an explicit protocol are resolved relative to the sitemap's directory. In the example, the subsitemap is stored in the directory faq. The pipeline for a document reads an XML document that is resolved relative to this directory faq.

Apart from using the concept of subsitemaps to maintain your web site, you can also use views to organize what you send to the client application.

Views

Chapter 4 glossed over the explanation of the map:views and map:resources sections in the sitemap. Let's now fill in this gap, starting with views.

A request you send to Cocoon is mapped to a pipeline in the sitemap. That pipeline uses a combination of components to generate an end result, a document that is returned to you as a result of your request. You can think of the end result as being the default view of the document generated by that particular pipeline. However, Cocoon also lets you configure and request other views of a particular document.

Cocoon offers a wide variety of configurable views for its documents. You can request a document's content view, and you will get the content in that document's XML format. Or you can ask for a document's link view and get all the links to other documents contained in this document.

The views concept is complex. So we'll start our discussion of views by looking at some simple examples and examining some use-cases. The first thing you need to know is how to specify which view of the document you want when sending the request to Cocoon. You do so using the request parameter cocoon-view with the value of the view name you ask for. So if you ask for http://localhost:8080/cocoon/helloworld?cocoon-view=content, you receive that document's content view.

The more complex question is how Cocoon knows what to do when a view is requested. Generally speaking, a view is an alternative pipeline for a document. It starts like the original pipeline for the document, but it has a different ending.

Assume that you have a standard pipeline consisting of a file generator, an xslt transformer, and an html serializer. You can then define a different view using the same file generator but a different transformer and serializer.

A view definition consists of two parts, as shown in Listing 6.14. The first part specifies which parts or beginning of the original pipeline should be used for the view. The second part defines the alternative ending. The ending is defined in the map:views section of the sitemap.

Listing 6.14 Views

<map:views>
  <map:view name="content" from-label="content">
    <map:serialize type="xml"/>
  </map:view>
  <map:view name="links" from-position="last">
    <map:serialize type="links"/>
  </map:view>
</map:views>

For each possible view, you create a map:view element with the attribute name specifying the view's unique name. Inside this element, you define the pipeline's ending. Because this is only the ending, you must not define a generator. However, you can use transformers, and you must provide a serializer.

Listing 6.14 shows two defined views: the content view and the links view. Each new view contains only a serializer. Looking at the links view, you can see that the attribute from-position has the value last. This tells Cocoon where the new pipeline should take over from the original when the links view is requested. In this case, the alternative ending for this view starts at the last position of the original pipeline.

In other words, the serializer of the original pipeline is ignored, and instead, all sitemap components enclosed in this view are appended. So the links view differs from the original document in that it uses the links serializer (see Listing 6.15).

Listing 6.15 The Link Serializer

<map:serializers>
  <map:serializer name="links"
   src="org.apache.cocoon.serialization.LinkSerializer"/>
</map:serializers>

The link serializer is a special serializer that outputs plain text. It extracts all links and references from a document and puts each one on a separate line of the output text. It finds these links and references by looking for the attributes src and href in the document.

Another possibility is to define the value first for the view's from-position attribute. Then the alternative pipeline starts immediately after the original generator.
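A view of this kind could be sketched as follows. The view name raw is an assumption chosen for illustration.

```xml
<map:views>
  <!-- Hypothetical view name "raw": the alternative ending takes over
       immediately after the generator of the original pipeline -->
  <map:view name="raw" from-position="first">
    <map:serialize type="xml"/>
  </map:view>
</map:views>
```

Requesting a document with cocoon-view=raw would then return the generator's output serialized as XML, bypassing the transformers of the original pipeline.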

But Cocoon wouldn't be Cocoon if these were the only possibilities for defining views! You can define more fine-grained views by using the attribute from-label on the view. The value of this attribute marks a label that can be used in the original pipeline for the sitemap components.

With this label attached to sitemap components such as generators and transformers, you define which components of the original pipeline should be used for the view. Listing 6.16 shows an example.

Listing 6.16 An Example of Labeled Views

<map:generators default="file">
  <map:generator name="file" label="content"
  src="org.apache.cocoon.generation.FileGenerator"/>
  <map:generator name="html" src="org.apache.cocoon.generation.HTMLGenerator"/>
  ...
</map:generators>
...
<map:pipeline>
  <map:match pattern="document_one">
    <map:generate src="document.xml"/>
    <map:transform src="document2html.xsl"/>
    <map:serialize/>
  </map:match>
  <map:match pattern="document_two">
    <map:generate src="page.html" type="html"/>
    <map:transform label="content" src="restructure.xsl"/>
    <map:transform src="document2html.xsl"/>
    <map:serialize/>
  </map:match>
</map:pipeline>

The component definition of the file generator is labeled with a label called content. This indicates that whenever a view is requested and this view uses the label content, the generator is included in the pipeline for this view. Similarly, you can mark other generators and transformers in the components section as well.

The pipeline for the first document, called document_one (see Figure 6.7), is assembled by the file generator, an xslt transformer, and the html serializer. When the content view is requested, Cocoon looks at the map:views section and finds the definition for this view. This view indicates that the label content is used. During the pipeline assembly, the components for this pipeline are checked for the label.

Figure 6.7 A simple example of using views.

The file generator is labeled, so it is used. If a component is labeled, it is added to the pipeline for the view, and the usual pipeline processing is passed to the views section. All other sitemap components of the original pipeline are ignored, and the components of the views section are appended.

The pipeline for the second example (document_two), shown in Figure 6.8, is assembled by the html generator, two xslt transformers, and the html serializer. Note that neither the html generator nor the xslt transformer is labeled in the components section. When the content view of this document is requested, the original pipeline is searched for the label.

Figure 6.8 An advanced example of using views.

In general, the xslt transformer is not labeled, so it usually isn't added to the pipeline for the view. But for this special pipeline, you can indicate that the transformer should be added by giving it an attribute label with the value content. The first xslt transformer is labeled using the attribute label with the given value.

The process here is the same as in the first example. All sitemap components are added to the pipeline until one component is labeled. This component is added as well, but the following ones are skipped. Then the view's sitemap components are appended. For this example, the view is assembled from the html generator, the first xslt transformer, and the xml serializer from the content view.
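Written out as sitemap elements, the components assembled for the content view of document_two correspond to the following sequence. This is only an illustration of the result of the assembly described above, not something you would add to the sitemap.

```xml
<!-- Illustration only: the effective component sequence for
     document_two?cocoon-view=content -->
<map:generate src="page.html" type="html"/>
<!-- First component carrying the label "content"; everything after it
     in the original pipeline is skipped -->
<map:transform src="restructure.xsl"/>
<!-- Ending taken from the content view in Listing 6.14 -->
<map:serialize type="xml"/>
```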

Regardless of whether the label is defined in the components or pipelines section of the sitemap, the original pipeline is left immediately after the first component containing the label. Even if you have more than one component in the pipeline marked with the required label, only the first component containing it is used.

As you will see at the end of this chapter, the links view is important for the offline generation of documents using Cocoon's command-line interface.

Now that you know about Cocoon's views, you know about nearly all of a sitemap's sections. So, let's discuss the last one.

Sitemap Resources

The last section we have yet to explain is the map:resources section (see Listing 6.17). This section is very similar to the map:pipeline section. You can define XML processing pipelines containing a generator, transformers, and a serializer and give this pipeline a name for further use in the map:pipelines section of the sitemap.

Listing 6.17 An Example of a Sitemap Resource

<map:resources>
  <map:resource name="Not authorized">
    <map:generate src="notauthorized.xml"/>
    <map:transform src="tohtml.xsl"/>
    <map:serialize/>
  </map:resource>
</map:resources>

You can refer to this resource from the map:pipelines section using its unique name. A sitemap resource can thus be compared to a macro or a placeholder.

Currently, the only place in Cocoon where you can use sitemap resources is for redirects.

Redirects

Basically, a redirect allows you to jump from one pipeline to another. You can redirect to a totally different URI or to a previously defined sitemap resource. Listing 6.18 shows two examples.

Listing 6.18 Examples of Redirects

<map:redirect-to uri="helloworld"/>
<map:redirect-to resource="Not authorized"/>

Unfortunately, the semantics of the map:redirect-to statement differ a bit from the semantics of the other sitemap components. Usually if you specify a source, such as for a generator, and you do not specify a protocol for the URI, Cocoon automatically adds the context protocol.

However, for a redirect to a relative URI, this is not the case. Cocoon implicitly adds the same protocol used to request the original document. For example, if you request a document with http://localhost:8080/cocoon/original_document, and this results in the execution of the previous redirect to helloworld as shown in Listing 6.18, Cocoon generates a new URI using the old one as a base. The redirect then references http://localhost:8080/cocoon/helloworld. So a relative URI is translated into an absolute URI.

Cocoon does not process a redirect to a URI itself. Instead, it generates an HTTP response for the client. This response tells the client to perform a redirect and carries the redirect URI. The client recognizes the redirect and starts a new request with the new URI. A redirect of this kind therefore always results in at least two requests to your server: the first one triggers the redirect, and the second requests the redirected document.

If you redirect to a sitemap resource, the processing flow is continued in the new sitemap resource. Thus, the sitemap components defined in this resource are executed.
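As a minimal sketch of how the two pieces might fit together, the following fragment combines the resource from Listing 6.17 with a pipeline that redirects to it. The match pattern admin/** is a hypothetical example and is not taken from the news portal application.

```xml
<map:resources>
  <map:resource name="Not authorized">
    <map:generate src="notauthorized.xml"/>
    <map:transform src="tohtml.xsl"/>
    <map:serialize/>
  </map:resource>
</map:resources>
...
<map:pipelines>
  <map:pipeline>
    <!-- Hypothetical pattern: everything below admin/ is refused -->
    <map:match pattern="admin/**">
      <!-- Processing continues inside the resource; the client is
           not asked to send a second request -->
      <map:redirect-to resource="Not authorized"/>
    </map:match>
  </map:pipeline>
</map:pipelines>
```

In contrast to a redirect to a URI, the client still sees the URI it originally requested, because the resource is executed within the same request.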

Now that you know about all the additional sitemap features and some Cocoon configuration points, it is time to bring in two new components and show you some examples that use them and the concepts described in this chapter.

Connecting to a Database

You can use the sql transformer in a pipeline to integrate a database as one of the data sources in a Cocoon application. Using this transformer, you can send any SQL command to a database. The transformer is controlled by commands contained in the XML stream processed by the transformer. If the SQL command fetches data from the database, the data is converted into XML.

You might wonder why this is a transformer and not a generator. The key point is usability. In general, SQL statements can have many options and parameters. This starts with specifying the database to use, the tables, and the rows, and it ends with complex information such as search phrases. If you want to use a generator, you have to specify all this in the sitemap as parameters for the generator. Changing a simple value would then require changing the sitemap.

Using a transformer allows you to build more-complex pipelines in which the information on what to fetch from the database is determined at runtime using the file generator, for example. When the request is processed, the file generator reads an XML file that contains the actual parameters for the transformer. Because the file generator can request the XML file via a protocol such as HTTP, this allows the dynamic generation of those commands.

Listing 6.19 shows the configuration of the sql transformer in the sitemap and how to use it in a pipeline.

Listing 6.19 SQL Transformer

<map:transformers>
  <map:transformer name="sql"
         src="org.apache.cocoon.transformation.SQLTransformer"/>
</map:transformers>
...
<map:pipeline>
  <map:match pattern="test">
    <map:generate src="document.xml"/>
    <map:transform type="sql"/>
    <map:transform src="tohtml.xsl"/>
    <map:serialize/>
  </map:match>
</map:pipeline>
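To illustrate the point about determining the SQL commands at runtime, here is a sketch of a variation in which the file generator fetches its input over HTTP instead of reading a local file. The pattern name and the URL are hypothetical; the sketch assumes that the remote address returns an XML document containing the commands for the sql transformer.

```xml
<map:match pattern="departments">
  <!-- Hypothetical URL that delivers the XML with the SQL commands -->
  <map:generate src="http://localhost:8080/queries/departments.xml"/>
  <map:transform type="sql"/>
  <map:transform src="tohtml.xsl"/>
  <map:serialize/>
</map:match>
```

Because the sitemap only names where the commands come from, the query itself could change without any change to the sitemap.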

You can send any valid SQL command to the database. This is triggered by your XML document. Listing 6.20 shows an XML document that is read by the file generator and then is transformed by the sql transformer.

Listing 6.20 A Simple SQL Example

<document>
  <sql:execute-query xmlns:sql="http://apache.org/cocoon/SQL/2.0">
    <sql:use-connection>personnel</sql:use-connection>
    <sql:query>
      select id,name from department_table
    </sql:query>
  </sql:execute-query>
</document>

The sql transformer is triggered by XML elements that have the transformer's namespace, http://apache.org/cocoon/SQL/2.0. Each command is started by the element execute-query. Nested inside this element is all the information for the sql transformer, a combination of elements and text information.

The element use-connection defines which connection (or database) should be used for the SQL command. The following example will show you how you can configure database connections. For now, just assume you have defined a database connection named personnel.

Inside the query element, you can see the actual SQL command to be sent to the database. When the sql transformer receives such an XML block, it removes it from the XML document. If the SQL command fetches some data, this data is converted to XML and is inserted instead of the XML block controlling the sql transformer.

How is this data converted? The transformer creates a rowset element. Inside this element, a row element is created for each fetched row. Inside each row element, one element is created for each fetched column and is named after that column. It contains a text node with the value of that column from the database. All these elements get the namespace of the sql transformer.

You could then simply add a stylesheet to the XML processing pipeline, converting the rowset to an HTML table or whatever you like. The output displayed in Listing 6.21 is an intermediate XML document that is created during the pipeline processing. Because you will receive HTML in your browser, you will never notice this document; you will see only the starting XML document and the final output.

Listing 6.21 The Document after a SQL Transformer Run

<document xmlns:sql="http://apache.org/cocoon/SQL/2.0">
  <sql:rowset>
    <sql:row>
      <sql:id>1</sql:id>
      <sql:name>Matthew</sql:name>
    </sql:row>
    <sql:row>
      <sql:id>2</sql:id>
      <sql:name>Carsten</sql:name>
    </sql:row>
  </sql:rowset>
</document>
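A stylesheet that turns such a rowset into an HTML table could look roughly like the following sketch. It is not the tohtml.xsl used in this chapter. It only assumes the document structure shown in Listing 6.21 and renders every column of every row as a table cell.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:sql="http://apache.org/cocoon/SQL/2.0"
    exclude-result-prefixes="sql">

  <!-- Wrap the result in a minimal HTML page -->
  <xsl:template match="/document">
    <html>
      <body>
        <xsl:apply-templates select="sql:rowset"/>
      </body>
    </html>
  </xsl:template>

  <!-- One table per rowset -->
  <xsl:template match="sql:rowset">
    <table>
      <xsl:apply-templates select="sql:row"/>
    </table>
  </xsl:template>

  <!-- One table row per fetched row, one cell per column element -->
  <xsl:template match="sql:row">
    <tr>
      <xsl:for-each select="*">
        <td><xsl:value-of select="."/></td>
      </xsl:for-each>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

Iterating over all child elements of a row keeps the stylesheet independent of the column names, so it would also work for other queries.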

But what if your resulting document does not display the data you wanted? You need to know what the sql transformer has output in order to see if your SQL statement is working. You can, of course, change your document's pipeline definition. Instead of using a stylesheet to produce HTML and the html serializer, you can simplify the pipeline by removing the stylesheet and using the xml serializer. This shows you the data delivered by the sql transformer directly in your browser.

Another answer to this problem is to use the log transformer to see what is happening in the pipeline.

Logging

Usually pipelines consist of three or more sitemap components, starting with a generator, going to some transformers, and ending with a serializer. In the case of the file generator, you can see the starting XML document that is read by this component and the end result of the pipeline processing.

But what can you do if your output document doesn't look as you expected? One simple solution is to change your pipeline. Just remove all transformers after the component you want to test, and add the xml serializer. You will get the output of the transformer you want to test directly in XML.

If this stage of your pipeline looks right, you can then remove the next transformer in the chain and look at that output, and so on until you know where the fault is.
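Applied to a pipeline with the sql transformer, this debugging step might look like the following sketch. It assumes that a serializer named xml is configured in the components section of your sitemap.

```xml
<map:match pattern="test">
  <map:generate src="document.xml"/>
  <map:transform type="sql"/>
  <!-- tohtml.xsl is removed while debugging -->
  <map:serialize type="xml"/>
</map:match>
```

The browser then shows the XML exactly as it leaves the sql transformer.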

Another possibility is the log transformer (see Listing 6.22), which can be chained between two sitemap components. As the name suggests, this transformer logs the output of the sitemap component before the log transformer.

Listing 6.22 The Log Transformer

<map:transformers>
  <map:transformer name="log"
         src="org.apache.cocoon.transformation.LogTransformer"/>
</map:transformers>
...
<map:pipeline>
  <map:match pattern="test">
    <map:generate src="document.xml"/>
    <map:transform type="sql"/>
    <map:transform type="log">
      <map:parameter name="logfile" value="logfile.log"/>
      <map:parameter name="append" value="no"/>
    </map:transform>
    <map:transform src="tohtml.xsl"/>
    <map:serialize/>
  </map:match>
</map:pipeline>

In Listing 6.22, the output of the sql transformer is logged. When no parameters are set for the log transformer, it writes everything to the servlet log of your servlet engine. But you can, of course, redirect the output to a file on your local hard drive. The sitemap parameter logfile defines the location of that file. With the parameter append, you can specify whether a new log file should always be written or whether the output should be appended to an existing file.
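Because the log transformer can be placed between any two components, you could also log more than one stage of the same pipeline. The following sketch writes the input and the output of the sql transformer to two separate files. The file names are hypothetical.

```xml
<map:match pattern="test">
  <map:generate src="document.xml"/>
  <!-- Logs what the file generator delivers -->
  <map:transform type="log">
    <map:parameter name="logfile" value="before-sql.log"/>
    <map:parameter name="append" value="no"/>
  </map:transform>
  <map:transform type="sql"/>
  <!-- Logs what the sql transformer delivers -->
  <map:transform type="log">
    <map:parameter name="logfile" value="after-sql.log"/>
    <map:parameter name="append" value="no"/>
  </map:transform>
  <map:transform src="tohtml.xsl"/>
  <map:serialize/>
</map:match>
```

Comparing the two files shows what the sql transformer changed in the document.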

But be careful when using the log transformer in a servlet environment. It is not safe for concurrent requests. If more than one client requests a document whose pipeline contains the log transformer at the same time, the output of these pipelines is mixed in the log. For debugging, you should therefore be sure that only one client invokes the request at a time.

This section covered the advanced features of the sitemap. You saw that a Cocoon application is not limited to just one sitemap, but that sitemaps can be cascaded. This feature is particularly useful when the application consists of separate parts. Using the available protocols and components such as the sql transformer, you can integrate existing data sources into your application. Content aggregation allows configured information sources to be flexibly combined into a single document. The document you receive as a pipeline's output is only one of the views Cocoon can provide. Using the views concept, you can define alternative pipelines that can return, for example, only the content or the links of a particular document. You can use the logging mechanism to check on what is happening in your pipeline, which is important if things do not work as expected.

Although the most common form of running Cocoon is as a servlet, this is only one way of using the framework. In fact, it is only a very small part of Cocoon that is servlet specific. This part is only one of the interfaces Cocoon provides to the outside world. Another important interface that allows Cocoon to be used in different environments is the Command-Line Interface.
