Publishers of technology books, eBooks, and videos for creative people

Home > Articles

This chapter is from the book

This chapter is from the book

Practical Examples and Tips

This chapter has covered a lot of topics so far. Hopefully you have been able to use some of these new features to extend an application you already have built. We will now look at some examples and give you a few tips on getting the most out of Cocoon when you use it to build applications that other people might also use.

The two following examples help you understand the components and concepts presented so far. The first one is a small example showing you how to use the sql transformer to fetch data from a database. You might need to use the log transformer in this example if you have any problems connecting to the database. The second example is a bigger real-world example: the Cocoon Documentation System. This system uses nearly all the concepts explained so far.

We will then look at how you can make sure that your Cocoon application is set up to handle all the requests it might receive when you release it into a production environment.

A SQL Example

The following example requires a database that can be used from Java, so you need a JDBC driver. Instead of using your own database, you can use the included HSQLDB shipped with the Cocoon distribution. This database is completely written in Java and can be started automatically when Cocoon is run.

However, if you want to use your own database, you have to include a suitable driver. Put this driver class either in a JAR file in Cocoon's WEB-INF/lib directory or as a class file in the WEB-INF/classes directory. In order to make the driver available, you have to add it to the list of loaded classes in the web application deployment descriptor (web.xml), as shown in Listing 6.24. The parameter load-class gets a list of classes that are automatically loaded at startup.

Listing 6.24 Adding Drivers

<!--
   This parameter is used to list classes that should be loaded
   at initialization time of the servlet.
   Usually these classes are JDBC Drivers used
-->
  <init-param>
    <param-name>load-class</param-name>
    <param-value>
      <!-- For HSQLDB: -->
      org.hsqldb.jdbcDriver
      <!-- ODBC -->
      sun.jdbc.odbc.JdbcOdbcDriver
    </param-value>
  </init-param>

Next you have to add a connection to your database in cocoon.xconf. Listing 6.25 is an excerpt from cocoon.xconf that shows a custom connection called personnel.

Listing 6.25 Configuring Data Sources

<datasources>
  <jdbc name="personnel">
    <dburl>jdbc:hsqldb:hsql://localhost:9002</dburl>
    <user>sa</user>
    <password></password>
  </jdbc>
</datasources>

For this connection, you can define the URL to the database, the username, and the password. These three settings depend on which database you use. The user and password might be optional. If you want to use the HSQLDB, the values shown here should work right out of the box.

After you have defined your database connection, you can use it in the sql transformer by specifying the use-connection element for the transformer. Save the XML document shown in Listing 6.26 to the Cocoon context directory, and name it sqlexample.xml.

Listing 6.26 A Simple SQL Example

<document>
  <sql:execute-query xmlns:sql="http://apache.org/cocoon/SQL/2.0">
    <sql:use-connection>personnel</sql:use-connection>
    <sql:query>
      select id,name from department
    </sql:query>
  </sql:execute-query>
</document>

If you are using your own database, you might need to adjust the select statement. A stylesheet for the SQL data, transforming it to a simple HTML table, could look like Listing 6.27. Save this stylesheet, and name it sqlexample.xsl.

Listing 6.27 A Simple SQL Stylesheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:sql="http://apache.org/cocoon/SQL/2.0">

<xsl:template match="document">
  <html><body><table>
    <xsl:apply-templates select="sql:rowset/sql:row"/>
  </table></body></html>
</xsl:template>

<xsl:template match="sql:row">
  <tr>
    <xsl:apply-templates/>
  </tr>
</xsl:template>

<xsl:template match="sql:id|sql:name">
  <td>
    <xsl:value-of select="."/>
  </td>
</xsl:template>

</xsl:stylesheet>

Again, if you use a custom database and table, you might have to adjust the stylesheet to reflect different column names. The pipeline for this example is very simple. It is shown in Listing 6.28.

Listing 6.28 A Sample SQL Pipeline

<map:pipeline>
  <map:match pattern="sqldocument">
    <map:generate src="sqlexample.xml"/>
    <map:transform type="sql"/>
    <map:transform src="sqlexample.xsl"/>
    <map:serialize/>
  </map:match>
</map:pipeline>

Start your browser, and request http://localhost:8080/cocoon/sqldocument. You will get the XML data from the database displayed as an HTML table. If you are using your custom database and you face any problems, add the log transformer after the sql transformer to see what data is coming from your database.

Using databases with Cocoon is very easy, as you can see from this example. To demonstrate more of the features introduced in this chapter, we will now look at a larger working example.

The Cocoon Documentation System

One of the best sample applications using many of the features we have described in this and the previous chapters is the Cocoon Documentation System. Because Cocoon itself is an XML publishing framework, the documentation is, of course, generated by Cocoon. Some of the features the documentation system uses include content aggregation, subsitemaps, the cocoon protocol, and image generation using SVG. All these features allow the documentation to be written in a fashion that separates the content from the layout.

Because this example is rather complex and uses many resources, we will examine only the basic idea behind this system. In addition, we will look at some excerpts from each of the files. If you're interested in seeing more than what's presented here, the whole system can be found inside the Cocoon distribution.

The Cocoon documentation (see Figure 6.9) is served by a subsitemap that is independent of the main sitemap. (You will find the subsitemap and all other resources in the documentation directory of your Cocoon context directory.)

Figure 6.9Figure 6.9 The Cocoon documentation.

The documentation is currently available in HTML. Each HTML page consists of a static header, a navigation bar on the left side, and the content for the current document on the right.

The navigation bar is created by an index, which is called a book in Cocoon. The documentation is arranged in several hierarchically nested books. There is one main book, and it contains documents and subbooks. You can compare this to a directory structure such as your filesystem. A book is similar to a directory: It has a name, and it contains documents (files) or other books (directories).

As you might have already guessed, each HTML document is created using content aggregation and the cocoon protocol. Let's have a look at the sitemap entries, shown in Listing 6.29.

Listing 6.29 An Excerpt from the Cocoon Documentation Sitemap

<map:pipeline>

  <map:match pattern="*.html">
    <map:aggregate element="site">
      <map:part src="cocoon:/book-{1}.xml"/>
      <map:part src="cocoon:/body-{1}.xml"/>
    </map:aggregate>
    <map:transform src="stylesheets/site2xhtml.xsl">
      <map:parameter name="use-request-parameters" value="true"/>
      <map:parameter name="header" value="graphics/{1}-header.jpg"/>
    </map:transform>
    <map:serialize/>
  </map:match>

  <map:match pattern="**book-**.xml">
    <map:generate src="xdocs/{1}book.xml"/>
      <map:transform src="stylesheets/book2menu.xsl">
        <map:parameter name="use-request-parameters" value="true"/>
        <map:parameter name="resource" value="{2}.html"/>
      </map:transform>
      <map:serialize type="xml"/>
    </map:match>

    <map:match pattern="body-**.xml">
      <map:generate src="xdocs/{1}.xml"/>
      <map:transform src="stylesheets/document2html.xsl"/>
      <map:serialize/>
    </map:match>

</map:pipeline>

The document names available from these pipelines do not follow our recommendation: They use explicit endings such as .xml and .html. The HTML document is aggregated by two parts—a part called book, and a part called body. The book part reads the current book and creates the navigation bar from it. This navigation bar is transformed by a stylesheet to partial XHTML.

The body part reads the real content from the XML document and transforms it into partial XHTML as well. The main pipeline for the document aggregates these two parts and combines the XHTML fragments using a stylesheet. It also adds the constant header.

The navigation bar and title displayed in the document's header are actually images. These images are rendered using SVG. We left out the pipelines for the images, but they are specified in the installed Cocoon application. Inside the Cocoon context directory is a directory called documentation. This directory contains a subsitemap named sitemap.xmap that contains all pipelines for the whole documentation system.

This example of a real application shows how a web site can be built very easily with Cocoon. By using content aggregation, you separate the different parts of one document and can maintain them more easily. Just take your time and have a look at this application and how it works. It will help you understand the concepts you have learned so far. You will also get a look behind the scenes of Cocoon's documentation system. Of course, one of the most important features of any Internet application, such as the documentation system or a portal built with Cocoon, is how fast the required information is returned. After all, no one wants to wait around for minutes until the browser displays the requested document. Cocoon provides two methods of speeding up the application: pipeline caching and component pooling.

The Cocoon Caching Mechanism

As you have seen, Cocoon generates documents using pipelines that contain a variety of components. You have seen that each time a request reaches a pipeline, the required document is generated and returned to the calling application. Using Cocoon's caching mechanism, you can control whether the document is actually generated or whether it can be returned from a cache. This speeds up the time it takes to return the document, because the pipeline does not have to be processed completely. Cocoon's caching algorithm is very flexible, but fortunately it is also very easy to handle. Let's start with a description of the caching algorithm.

Cocoon generates a stream pipeline for each request. This stream pipeline either is a reader or consists of an event pipeline and a serializer. The event pipeline in turn is assembled by a generator and the used transformers (if any).

Cocoon's caching algorithm can cache the result of a stream pipeline and/or an event pipeline. The caching for such a pipeline is turned on or off in cocoon.xconf (see Listing 6.30). Because everything in Cocoon is implemented using Avalon components, you simply specify which implementation for an event or stream pipeline should be used: the caching or the noncaching one. You will learn more about these components when we explain Cocoon from the developer perspective in Chapter 8.

Listing 6.30 Turning on Caching in cocoon.xconf

<event-pipeline class=
"org.apache.cocoon.components.pipeline.CachingEventPipeline"/>
<stream-pipeline class=
"org.apache.cocoon.components.pipeline.CachingStreamPipeline"/>

These lines turn on caching for both pipelines. The code shown in Listing 6.31 turns it off. Of course, you can mix it and turn on caching for event pipelines but not for stream pipelines. If you want to change your setting, locate the lines for event-pipeline and stream-pipeline in your cocoon.xconf and change the class attribute.

Listing 6.31 Turning off Caching

<event-pipeline class=
"org.apache.cocoon.components.pipeline.NonCachingEventPipeline"/>
<stream-pipeline class=
"org.apache.cocoon.components.pipeline.NonCachingStreamPipeline"/>

But what does it mean if caching is turned on? The following explanation is simplified for the user perspective. We will look at the full power of the caching algorithm in Chapter 8.

But for now, let's start with the stream pipelines. The result of a stream pipeline, for example, can be cached if it is a reader, which can cache. So we can redefine the question: When can a reader cache?

A reader (and this is also true for the other sitemap components, as you will soon see) can cache if it can detect that the content has changed since it was last read. For example, the resource reader reads a file. It can detect whether the file has changed by looking at what time the file was last changed.

So the first time the resource reader reads a document, the caching algorithm stores this document, along with the current time. The next time this document is requested, the caching algorithm provides this time to the reader, which simply checks whether the cached content is still valid. If it is, the cache serves the document. If it is not valid, the cached content is discarded, the reader reads the file again, and the cache stores this along with the current time.

But there are cases in which the reader cannot detect content changes, such as if it gets the read file via HTTP or any other connection. In this case, the reader can't support caching, so nothing is cached. This means that even though Cocoon provides a means of caching pipelines, it is still dependent on the data source to provide a means of determining whether the content has changed since it was last accessed.

If the stream pipeline consists of an event pipeline and a serializer, both parts must support caching. Most serializers in Cocoon support caching, because they are only dependent on the XML they receive from the event pipeline.

The question of whether an event pipeline can be cached is more complex, because the pipeline consists of several components. It is cacheable only if all the components are themselves cacheable. In the event pipeline, the caching algorithm asks each component if it supports caching, starting with the generator. For each component that supports it, a unique key is generated. Then the next pipeline component is queried. This process continues until either all components are queried or one component is not cacheable.

The keys of all cacheable components are chained, and together they build the cache key. The request is processed, and the document is built. The cache stores the result of the last component, indicating cacheability. The next time this document is requested, the key is built, and the cached content is fetched from the cache.

Next, the cache asks all components of the event pipeline if their input has changed since the time the content was cached. For example, the generator checks this by looking at the last modification date of the XML document, the xslt transformer checks the date of the stylesheet, and so on. Only if all state that the content is still valid is it used from the cache. Otherwise, the document is generated from scratch. So the event pipeline tries to cache as much of the XML processing pipeline as possible.

Caching the pipeline results and being able to return them as fast as possible is perhaps the key factor to whether an Internet application built with Cocoon will be successful and whether people will like using it. Cocoon's built-in caching already provides a powerful mechanism for doing this and should be used whenever possible. Another important factor in any component-based system is the performance at which new components are created when they are needed.

Pooling Your Components

Nearly everything inside Cocoon is an Avalon component. Without going into too much detail about the Avalon component model and the life cycle of components, we'll explain how you can fine-tune your application in this area.

For each request received by Cocoon, a lot of Avalon components are generated—one event pipeline, one stream pipeline, one generator, one or more transformers, and a serializer. (In fact, there are more, but these will do for the moment.)

If several documents are requested at the same time, this set of components is created for each request. For example, if 50 documents are requested simultaneously, you end up with 50 event pipelines, 50 stream pipelines, 50 generators, and so on.

One of the most time-consuming operations in Java is the creation and destruction of new objects. Therefore, the Avalon component model supports the pooling of objects. This means that a component is created once, locked when used inside a request processing, and released for further use after the request is processed. It is not destroyed and can be reused for the next request.

If only one request at a time is processed, such a pooled component is created once, locked for this request, used for this request, and released afterwards. When the next request arrives, the same process starts again.

If more than one request is processed at the same time, a pooled component must be created for each request. If 50 requests arrive simultaneously, 50 components must be created. If they all can be pooled, the pool grows to 50 components. At first glance, this seems desirable, but imagine that one day 1000 requests are processed simultaneously. You end up having 1000 components in your pool, although the average of simultaneous requests is less.

In order to adjust your application to the load you might have, you can control the pooling of the Avalon components. You can define how many components are to be stored inside the pool by specifying a minimum and maximum number, as well as how the pool should grow if no free component is available from the pool. If your pool reaches the maximum, but there are more requests to serve, Avalon creates new components to process the request, but these components are discarded afterwards and are not added to the pool.

The configuration of this pooling is on a per-component basis. So you set the values separately for each component—for the stream pipeline, for the event pipeline, for the file generator, and so on. Listing 6.32 shows a sample pooling configuration.

Listing 6.32 An Example of a Pooling Configuration

<stream-pipeline class=
"org.apache.cocoon.components.pipeline.CachingStreamPipeline"
          pool-max="32" pool-min="16" pool-grow="4"/>

<generator name="file" src="org.apache.cocoon.generation.FileGenerator"
          pool-max="64" pool-min="16" pool-grow="4"/>

In Listing 6.32, you see the configuration for the stream pipeline, which is done in cocoon.xconf, and for the file generator, taken from the sitemap. Remember that both the sitemap and cocoon.xconf contain components that are based on Avalon and therefore can be pooled.

Both configurations are similar in that they use three special attributes. pool-min defines the minimum number of components in the pool. When the pool is instantiated, this number of components is created at startup. pool-max defines the maximum number of components to hold in the pool. pool-grow gives the number by which the pool increases each time no free component is available.

If you set the log level to DEBUG, you can see if your pools are too small by searching for a message containing the phrase "decommissioning instance of." This message is output each time a poolable instance is created when the pool has reached maximum capacity. The component's class name follows the phrase, so it is possible to adjust the setting for exactly this component.

With the tips on caching and component pooling, we have covered the two most important ways to make a Cocoon application as fast as possible. These features are provided by Cocoon and can be used in different application scenarios. Depending on the type of application being built, other factors can influence the application's performance. We will cover some further aspects when we talk about different types of applications in Chapter 11, "Designing Cocoon Applications."

Peachpit Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from Peachpit and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about Peachpit products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites; develop new products and services; conduct educational research; and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email ask@peachpit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by Adobe Press. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.peachpit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020