Real World Adobe InDesign CS4: XML

XML is as simple or complicated as you care to make it. Olav Martin Kvern and David Blatner show you how to keep it simple, at least at first.

XML stands for “Extensible Markup Language.” What the heck does that mean? It’s easy to be scared off—XML is usually mentioned in the same breath as SOAP, DTDs, metadata, structured content, and schema. With all of the buzzwords and jargon surrounding the topic, it’s easy to lose track of something very basic: XML is simple.

XML is a way to mark up (or tag) information in a text file. Any application that can write text files can be used to write XML. Like HTML, XML uses tags, such as “<h1>” to mark a piece of text. Unlike HTML, XML doesn’t have a limited set of predefined tags. That’s what the “extensible” part of the acronym means. You’re not limited to <h1>, <h2>, <p>, and so on, as you are in HTML.

It would be easy to fall into the trap of thinking that XML is something like an expanded version of HTML. This isn’t the case—the two markup schemes are different in kind. HTML is about what things look like; XML is about what elements are. The formatting of XML data—if that data ever appears in a document—is up to you.

While we said that XML is simple (as opposed to complex), we never said that it couldn’t be complicated. Indeed, the biggest problem of implementing an XML publishing workflow is figuring out the design of the XML data structures you want to work with. You can easily end up paralyzed by the sheer number of possibilities—there are a limitless number of ways to accomplish a given task.

In short, XML is as simple or complicated as you care to make it. Keep it simple, at least at first.

When XML Isn’t XML

When you import XML into an InDesign document, it isn’t XML anymore. It gets turned into an InDesign representation of the XML structure. If you’re familiar with XML, this can be a major conceptual stumbling block.

Working with XML outside InDesign, one becomes accustomed to certain things: finding a given element by its content using XPath, for example, or being able to use the content of a given XML element in multiple places in a web page generated from the XML. You can’t do those things in InDesign, at least not directly.
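For readers who have not met XPath, here is a minimal sketch of what “finding an element by its content” looks like outside InDesign. It uses Python and its standard-library xml.etree.ElementTree module, both chosen here purely for illustration; the sample catalog, its element names, and the function name find_titles_by_text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element names are illustrative only.
SAMPLE = """<catalog>
  <book><title>Revolutionary Girl Utena</title><format>DVD</format></book>
  <book><title>Real World InDesign</title><format>Print</format></book>
</catalog>"""

def find_titles_by_text(xml_text, wanted):
    """Return every <title> element whose text equals `wanted`."""
    root = ET.fromstring(xml_text)
    # ElementTree supports only a subset of XPath; the [.='text'] predicate
    # needs Python 3.7 or later. This naive quoting breaks if `wanted`
    # itself contains an apostrophe.
    return root.findall(f".//title[.='{wanted}']")

matches = find_titles_by_text(SAMPLE, "Revolutionary Girl Utena")
print(len(matches), matches[0].text)
```

The search is driven by the element's content rather than by its position in the file, which is the habit the paragraph above describes.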

XML Vocabulary

When we talk about XML, we’ll be using a standard set of terms, for which we offer the following non-standard set of definitions.

Tag.

A tag is a label for a piece of XML data; a tag is not an XML element. Tags are marked with angle brackets (also known as less than and greater than symbols), like “<title>”. Tag names cannot contain space characters. XML must be “well formed,” which means, among other things, that any “start” tag must be matched by an “end” tag. Something like this:

<title>Revolutionary Girl Utena</title>
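One quick way to see what “well formed” means in practice is to hand the text to an XML parser and see whether it complains. The following is a minimal sketch using Python's standard-library xml.etree.ElementTree (an illustrative choice, not part of InDesign); the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The start tag is matched by an end tag, so this parses.
print(is_well_formed("<title>Revolutionary Girl Utena</title>"))
# The end tag is missing, so the parser rejects it.
print(is_well_formed("<title>Revolutionary Girl Utena"))
```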

XML Element.

XML elements are the fundamental building blocks of an XML file. The title just shown is an example of an XML element (containing the data “Revolutionary Girl Utena”). XML elements may contain other elements, as shown in the following example.

<author>
	<name>Olav Martin Kvern</name>
	<address>5207 Phinney Avenue North</address>
	<city>Seattle</city>
	<state>Washington</state>
	<zip>98103</zip>
</author>

Here, the “author” XML element contains the other elements. Each element, in turn, can contain other elements. We could easily change the structure to change the way that the name information is stored, for example.

<author>
	<name>
		<first>Olav</first>
		<middle>Martin</middle>
		<last>Kvern</last>
	</name>
	<address>5207 Phinney Avenue North</address>
	<city>Seattle</city>
	<state>Washington</state>
	<zip>98103</zip>
</author>

XML elements are sometimes referred to as “nodes.” Every InDesign document includes at least one XML element—by default, that’s the “Root” element.
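To see why elements are also called “nodes,” it can help to walk the tree. This sketch (Python with the standard-library xml.etree.ElementTree, chosen only for illustration) parses the second author example and prints each element indented by its depth. The function name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

AUTHOR = """<author>
  <name>
    <first>Olav</first>
    <middle>Martin</middle>
    <last>Kvern</last>
  </name>
  <address>5207 Phinney Avenue North</address>
  <city>Seattle</city>
  <state>Washington</state>
  <zip>98103</zip>
</author>"""

def outline(element, depth=0):
    """Return one line per element, indented two spaces per nesting level."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(ET.fromstring(AUTHOR))))
```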

Using “White Space” Characters.

Both of the examples above use tabs to show the nesting of elements in the file and returns to separate each XML element. This is not necessarily the best way to write XML for import into InDesign; it depends on the XML import options you choose. If your import options keep white space characters, every character in the XML file, including those tabs and returns, will be imported as text (which only makes sense if you really do want tab characters in those locations).
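The effect of “pretty” indentation is easy to demonstrate with any XML parser that preserves white space. The sketch below uses Python's standard-library xml.etree.ElementTree as a stand-in; it says nothing about how InDesign itself handles the import. It compares the text content of an indented file with a compact one. The names INDENTED, COMPACT, and all_text are hypothetical.

```python
import xml.etree.ElementTree as ET

# The same data, once with tabs and returns, once without.
INDENTED = "<author>\n\t<name>Olav Martin Kvern</name>\n\t<city>Seattle</city>\n</author>"
COMPACT = "<author><name>Olav Martin Kvern</name><city>Seattle</city></author>"

def all_text(xml_text):
    """Concatenate every piece of character data in document order."""
    return "".join(ET.fromstring(xml_text).itertext())

# repr() makes the tab and return characters visible.
print(repr(all_text(INDENTED)))
print(repr(all_text(COMPACT)))
```

The indented version carries the tabs and returns along as character data between the elements; the compact version does not.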

XML Attribute.

One way to attach data to an XML element is to add an element inside the element—an XML attribute is another way of doing the same thing. In general, you use attributes to add information about the element (or “metadata”).

In our example, we might want to store the last time that the XML element was updated. We can do that by adding an XML attribute to the “author” tag. XML attributes do not appear in the layout when you place the containing XML element on a page.

<author last_update="12/29/08">
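As an illustration of the difference between an element's data and its attributes, this sketch (Python, standard-library xml.etree.ElementTree, used here only as an example parser) reads both from a completed version of the tag above. The name child element is added so that the fragment is well formed.

```python
import xml.etree.ElementTree as ET

author = ET.fromstring(
    '<author last_update="12/29/08"><name>Olav Martin Kvern</name></author>'
)

# Attributes live in a separate dictionary on the element...
print(author.attrib)
# ...and are not part of the element's text content.
print("".join(author.itertext()))
```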

XML Structure.

The structure of an XML document is nothing more than the way that the elements fit together. Don’t let anyone tell you otherwise.

DTD.

This is simply a description of what elements can appear, and in which order, in a defined XML structure. DTDs are not required by InDesign, but you might want to use them.
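To give a feel for the kind of rule a DTD expresses, here is a deliberately simplified sketch in Python. It is not DTD validation: Python's standard library does not validate against DTDs, so the rule table EXPECTED_ORDER and the function follows_expected_order are hypothetical stand-ins that check only one thing, namely which child elements appear and in which order.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: an <author> must contain exactly these children, in order.
EXPECTED_ORDER = {"author": ["name", "address", "city", "state", "zip"]}

def follows_expected_order(element, rules=EXPECTED_ORDER):
    """True if the element's child tags match the rule for its tag (if any)."""
    expected = rules.get(element.tag)
    if expected is None:
        return True  # no rule recorded for this element
    return [child.tag for child in element] == expected

good = ET.fromstring(
    "<author><name>Olav Martin Kvern</name>"
    "<address>5207 Phinney Avenue North</address>"
    "<city>Seattle</city><state>Washington</state><zip>98103</zip></author>"
)
bad = ET.fromstring("<author><city>Seattle</city><name>Olav Martin Kvern</name></author>")

print(follows_expected_order(good), follows_expected_order(bad))
```

A real DTD can also express optional and repeating elements, attributes, and more, none of which this sketch attempts.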

XML Comments.

These just give you a way to add a comment to your XML. Like XML attributes, comments will not appear in a layout. You can think of them as notes for XML.

XML Processing Instructions.

These tell an application to do something with the XML data, but you don’t really need to worry about them—InDesign does not use XML processing instructions in any way. Actually, very few applications use processing instructions. If, however, you have one of these rare beasts, you can add, edit, and delete any processing instructions in the XML structure.

Where Should You Work with XML?

Given that InDesign gives you a way to add, delete, and rearrange XML elements in the XML structure of a document, you might think that you could take care of all XML creation and editing tasks without ever leaving the friendly confines of the program.

We urge you, however, not to do this. While InDesign’s XML editing tools can be used in this fashion, they’re really intended more for quick-and-dirty touch-up work than for serious XML editing.

Instead, if you must create XML files from scratch, find yourself a good XML editing program. Ole likes Oxygen for cross-platform XML editing and XSLT debugging.

If you can’t find an XML editor you like, you can always use a text editor, such as BBEdit, Notepad, or even InDesign, to enter and then save the text of an XML file (as we said, it’s just a simple text file).

At the moment, the system platforms (Windows and the Mac OS) are adding to their XML support. There are far more system tools for working with XML (such as the MSXML parser for Windows from Microsoft) outside InDesign than there are inside InDesign.

But we expect that, most of the time, you probably won’t be writing the XML yourself. Instead, you’ll be getting your XML documents from some automated process, such as an Excel VBA macro, an export from FileMaker or Access, or from an InDesign document you’ve exported as XML.
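As a rough idea of what such an automated process does, the following sketch turns rows of data (as they might come out of a spreadsheet or database export) into XML. It is written in Python with the standard-library xml.etree.ElementTree purely as an illustration; the function name rows_to_xml and the element names authors and author are assumptions, and the field names must already be legal XML names (no spaces, for example).

```python
import xml.etree.ElementTree as ET

def rows_to_xml(rows, root_tag="authors", row_tag="author"):
    """Build one element per row and one child element per field."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for field, value in row.items():
            ET.SubElement(item, field).text = str(value)
    return ET.tostring(root, encoding="unicode")

rows = [{"name": "Olav Martin Kvern", "city": "Seattle", "zip": "98103"}]
print(rows_to_xml(rows))
```

Note that the output contains no tabs or returns between elements, so no stray white space comes along with the data.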

About XML Workflow

Before we dive into the details of working with the XML structure, we’d better explain how we think XML fits into a page layout process. As usual, we risk getting ahead of ourselves by presenting a conceptual overview before we talk about the details of the feature, but there’s just no other way to do it. The following is an outline of one approach we see for working with XML in an InDesign document. This approach does not include validation of the imported XML using a DTD (see the section on DTDs, later in this chapter).

  1. Create an InDesign document. You can use empty placeholder frames, dummy text, fixed text (text you don’t expect to have in the XML data file), or you can mark up an existing document.
  2. Load XML tags from an XML file. This doesn’t have to be the file containing your data, and it doesn’t even have to be an XML file with the same structure as you’ll be using. All it needs to include are the names of the elements you expect to have in the XML data you plan to import.
  3. Alternatively, you can create XML tags from scratch. You’ll have to remember to make sure that the XML tag names match the element names for the XML files you’ll be importing.

  4. Apply XML tags to frames and text in your template document.
  5. Map styles to XML tags using the Map Tags to Styles dialog box.
  6. Import XML into the document. When you do this, the data in your XML file (including any graphics specified in the XML structure) will appear in your layout.
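Because the tag names in the document have to match the element names in the incoming data, it can be worth checking a data file before importing it. This is a minimal sketch of such a check, done outside InDesign in Python with the standard-library xml.etree.ElementTree; the function names, the KNOWN set of tag names, and the phone element in the sample are all hypothetical.

```python
import xml.etree.ElementTree as ET

def tag_names(xml_text):
    """Return the set of element names used anywhere in the XML."""
    root = ET.fromstring(xml_text)
    return {element.tag for element in root.iter()}

def unexpected_tags(data_xml, known_tags):
    """List element names in the data that have no matching tag."""
    return sorted(tag_names(data_xml) - set(known_tags))

# Tag names assumed to exist already in the template document.
KNOWN = {"author", "name", "address", "city", "state", "zip"}
DATA = (
    "<author><name>Olav Martin Kvern</name>"
    "<city>Seattle</city><phone>555-0100</phone></author>"
)

print(unexpected_tags(DATA, KNOWN))
```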

When you import a new XML file and choose to replace the existing structure, InDesign will apply the formatting you’ve already applied. This makes this workflow particularly useful for setting up a document with a repeating publication schedule (newsletters, product data sheets, and so on). To make certain that new XML files match the layout, you might want to export the XML from the document to use as a template (see “Exporting XML Tags”) for the next iteration of the publication, or for use in a Web site or database.

Inside the Structure View

It’s all about structure. No matter how crazy and free-form your layout, your brain—and the brains of your audience—impose a structure on the content of your document. This is true, whether you’re conscious of it or not. So don’t be afraid of the word “structure” or try to deny that your documents have it. It’s there. You can choose to work with it or not. Once you’re out of your “denial” phase, you’ll find InDesign’s Structure view a powerful ally.

Using the options in the Structure view, you can create XML elements and attributes, associate elements with InDesign page items or text, rearrange XML elements, and delete XML elements (see Figure 14-1). Even if you don’t work with XML, you’ve probably already found the Structure view; it’s all too easy to expand it by accident when you’re trying to add a ruler guide or reposition the zero point.

Figure 14.1 Structure View

The Structure view uses icons to give you clues about the elements in the document’s XML structure, as shown in Table 14-1.

Table 14-1: Structure View Icons

Representing: What it means

  • The Root XML element: The Root element is the base, or top-level, XML element in your XML structure. All XML elements are contained by the Root element.
  • Story element: An InDesign story.
  • Text element: A range of text.
  • Graphic element: A graphic.
  • Unplaced text element: A text element that has not yet been associated with a page item.
  • Unplaced graphic element: A graphic element that has not yet been associated with a frame.
  • Empty element: An element associated with an empty frame.
  • Attribute: An attribute of an element. Attributes are always optional, and are only visible in the Structure window.
  • Collapse/Expand: Click this icon to collapse or expand an element. If you hold down Command/Ctrl as you click this icon, InDesign will expand all elements contained within the element.

Showing/Hiding Text Snippets.

To see a short passage of the text associated with the XML elements, choose Show Text Snippets from the Structure view menu (see Figure 14-2). To hide text snippets, choose Hide Text Snippets.

Figure 14.2 Showing Text Snippets

Tagged Frames and Tag Markers.

Want to see which frames are associated with XML elements? Turn on the Show Tagged Frames option on the View menu (see Figure 14-3). To see text that’s been associated with an XML element, choose Show Tag Markers from the View menu (see Figure 14-4).

Figure 14.3 Viewing Frame Tags

Figure 14.4 Viewing Tag Markers

XML Tags in the Story Editor

When you’re working with text you’ve tagged with XML tags, editing the text in the layout can become something of a chore. It can be very hard to see where one range of tagged text ends, where another tag begins, and the ways that the tags relate to each other. It can also be far too easy to delete tags inadvertently. The solution? View and edit the text in the Story Editor, which provides a better view of the text elements (see Figure 14-5).

Figure 14.5 Viewing XML Tag Markers in the Story Editor

Adding XML Elements

There are many ways to add a new XML element to a document. Our favorite method is to use the Context menu in the Structure view. Select the element you want as the container for the new element, then choose New Element from the Context menu (or click the New Element button). Select a tag in the associated submenu, and InDesign adds an empty element inside the element you selected (see Figure 14-6).

Figure 14.6 Adding an XML Element

Changing XML Element Data

When you add an XML element using the Structure view, it’s natural to assume that you can somehow enter the data for the element in that window. You can’t. Remember: the data for an XML element in InDesign is stored in the page item or text object that the element is associated with. The only case in which you can have element data that’s not associated with a frame on your page is when you’ve imported XML and have not yet assigned an element to a page item, or when you’ve deleted the object the element was originally associated with.

To change the data in an XML structure, simply edit the text or the frame that the element is associated with.

Duplicating XML Elements

To duplicate an XML element (and any elements it contains), select the element in the Structure view and copy the element. Select another XML element (such as the Root element) and paste. InDesign pastes the copied element into the selected element.
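In terms of the underlying tree, duplicating means making a deep copy (the element plus everything it contains) and attaching the copy to another element. The sketch below shows that idea in Python with the standard-library xml.etree.ElementTree and copy modules; it is an analogy for what happens to the structure, not InDesign's own mechanism, and the function name duplicate_into is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

root = ET.fromstring("<Root><author><name>Olav Martin Kvern</name></author></Root>")

def duplicate_into(element, new_parent):
    """Deep-copy an element (and its contents) and append the copy."""
    clone = copy.deepcopy(element)
    new_parent.append(clone)
    return clone

original = root.find("author")
clone = duplicate_into(original, root)

print(len(root.findall("author")))
print(clone is original, clone.find("name").text)
```

Because the copy is deep, later changes to the clone's contents leave the original element untouched.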

Moving XML Elements

To move an element in the XML structure (including all of the elements it contains), simply drag the element up or down in the Structure view (see Figure 14-7). To move the element inside another element (and thereby change the hierarchy of elements), drag the element inside another element.

Figure 14.7 Moving an XML Element
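Expressed as tree operations, a move is a detach followed by a re-attach, either at a new position in the same parent or inside a different parent. Here is a minimal sketch in Python with the standard-library xml.etree.ElementTree (illustrative only; the element names and the helper names find_parent and move_element are hypothetical).

```python
import xml.etree.ElementTree as ET

def find_parent(root, element):
    """Return the element that directly contains `element`, or None."""
    for candidate in root.iter():
        if any(child is element for child in candidate):
            return candidate
    return None

def move_element(root, element, new_parent, index=None):
    """Detach `element` (with everything it contains) and re-attach it."""
    old_parent = find_parent(root, element)
    if old_parent is None:
        raise ValueError("element has no parent under this root")
    old_parent.remove(element)
    if index is None:
        new_parent.append(element)          # becomes the last child
    else:
        new_parent.insert(index, element)   # lands at a specific position

root = ET.fromstring(
    "<Root><story><title>Utena</title><para>Text</para></story><sidebar/></Root>"
)
story, sidebar = root.find("story"), root.find("sidebar")

# Move <para> out of <story> and into <sidebar>, changing the hierarchy.
move_element(root, story.find("para"), sidebar)
print([child.tag for child in story], [child.tag for child in sidebar])
```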

Deleting XML Elements

To delete an element in the XML structure, select the element and choose Delete Element from the Context menu (or click the Delete Element button).

Applying a Parent XML Element

To move an existing XML element into a new parent XML element, select an element or series of elements and choose New Parent Element from the Context menu. Select a tag from the associated submenu, and InDesign moves the selected element (or elements) into a new element (see Figure 14-8).

Figure 14.8 Creating a Parent XML Element
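The same operation on a plain XML tree amounts to inserting a new element where the selection starts and moving the selected elements inside it. The sketch below (Python, standard-library xml.etree.ElementTree, for illustration only) wraps the three name parts from the earlier author example in a new name element; the function name wrap_in_parent is hypothetical.

```python
import xml.etree.ElementTree as ET

def wrap_in_parent(parent, children, new_tag):
    """Move `children` (direct children of `parent`) into a new element."""
    position = list(parent).index(children[0])  # where the wrapper will go
    wrapper = ET.Element(new_tag)
    for child in children:
        parent.remove(child)
        wrapper.append(child)
    parent.insert(position, wrapper)
    return wrapper

author = ET.fromstring(
    "<author><first>Olav</first><middle>Martin</middle>"
    "<last>Kvern</last><city>Seattle</city></author>"
)
selected = [author.find("first"), author.find("middle"), author.find("last")]
wrap_in_parent(author, selected, "name")

print(ET.tostring(author, encoding="unicode"))
```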

Working with XML Attributes

XML attributes are what’s called “metadata”—they’re information about the information in the XML element they’re associated with. You can’t really do much with attributes in InDesign, but you might want to add an attribute if the XML is destined to appear in a situation in which attributes are necessary.

To add an attribute to an element, select the element and choose New Attribute from the Structure view menu. InDesign displays the New Attribute dialog box. Enter a name (like XML element names, attribute names cannot contain spaces) and value for the attribute, then click OK to save the attribute (see Figure 14-9).

Figure 14.9 Adding an Attribute to an XML Element

To change an attribute, double-click the attribute. InDesign displays the Edit Attribute dialog box. Change the name or text of the attribute and click the OK button.

To delete an attribute, select the attribute and press Delete, or choose Delete Attribute from the Structure view menu.
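For comparison, here are the same three operations (add, change, delete) performed on an XML element outside InDesign. This is a sketch in Python with the standard-library xml.etree.ElementTree; the replacement date and the attribute name updated are made up for the example.

```python
import xml.etree.ElementTree as ET

author = ET.Element("author")

# Add an attribute: a name (no spaces) and a value.
author.set("last_update", "12/29/08")
after_add = dict(author.attrib)

# Change the value, then change the name (remove under the old name,
# store under the new one).
author.set("last_update", "01/05/09")
author.set("updated", author.attrib.pop("last_update"))
after_change = dict(author.attrib)

# Delete the attribute.
del author.attrib["updated"]
after_delete = dict(author.attrib)

print(after_add, after_change, after_delete)
```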

Adding an XML Comment

To add an XML comment to an element, select the element and choose New Comment from the Context menu (or from the Structure menu). InDesign displays the New Comment dialog box. Enter the text of your comment in the dialog box and click the OK button. InDesign adds the XML comment to the structure (see Figure 14-10). To edit a comment, double-click the comment in the Structure view. To delete a comment, select the comment and press Delete.

Figure 14.10 Adding an XML Comment

Adding an XML Processing Instruction

To add an XML processing instruction to an element, select the element and choose New Processing Instruction from the Context menu (or from the Structure menu). InDesign displays the New Processing Instruction dialog box. Enter the name of the target application in the Target field. Enter the text of the instruction in the Data field. Click the OK button to close the dialog box. InDesign adds the XML processing instruction to the structure (see Figure 14-11). Double-click a processing instruction to edit it, or select the processing instruction and press Delete to remove it from the structure.

Figure 14.11 Adding an XML Processing Instruction
