Getting Started with Filtering

In my last post about filtering, I tried to explain the theory of filtering, so you’d understand which kinds of problems filtering might solve. In this post, I’ll attempt to show how to get started on an actual project. This post is intended for folks who are just getting started with filtering.

Overall Process for a Filtering Project

Generally speaking, I find my process to be something like this:

  1. Create a matrix of the product’s variations and the associated publications so that I have a reference to consult for my sanity. Basically, I’m trying to sketch out which topics I’ll include and how to filter them in or out (in other words, show them or not show them in the final publication). This might be organized by features, audience, revision number, platform, or a combo of any of these.
  2. Use the matrix to define the values I’m going to need in my filter files. A “value” is one specific option within a category such as feature, audience, or platform. For example, the audience values might be “beginner,” “intermediate,” or “advanced,” and the platform values might be “Mac,” “Windows,” or “Linux.” Defining the values is basically figuring out who the audience is and the content they’ll need in the publication.
  3. Create the actual filter files (based on the values in the matrix) so that the computer knows how to show or not show tagged topics and inline text appropriately at build/publish time.
  4. Tag the content in the source files for the different features.
  5. Create the final publications (.doc, HTML, PDF, etc.) using each filter.

Create a Matrix

In my last post, I used the robot product as an example. The robot came in different models, so the matrix I created for that was based on features. But I find that changing the example sometimes produces a different “aha” moment for readers. So this time around, I’m going to use a cookbook example that’s based on audience values (beginner and advanced).

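I need to create a cookbook for two culinary school courses: Beginning Baking and Advanced Baking. I’ve determined the chapters I’m going to need and which ones I’ll be able to share between the two different cookbooks. Here’s how it looks:

  • Introduction: Beginner and Advanced
  • Cookies: Beginner only
  • Cakes and Pies: Beginner and Advanced
  • Basic Breads: Beginner and Advanced
  • Advanced Breads: Advanced only
  • French Pastry: Advanced only
  • Croissants: Advanced only
  • Conclusion: Beginner and Advanced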

Define the Values

So in this case, we’re going to need two audience values:

  • Beginner
  • Advanced

Create the Filters

A filter is the actual file that says what to do with each value (include it or exclude it). The details vary depending on which tool you’re using (Frame, Word, DITA, whatever), but the approach is generally the same across tools. As I mentioned in my last post, filtering is largely a mind-shift and less about which tool you’re using to accomplish the goal.

In my shop, we use DITA, so I would create a ditaval file for each of the cookbooks. (The filter files are called “ditavals” because they have a .ditaval filename extension.)

  • Beginner
  • Advanced


Note that each of the ditavals includes/excludes the different audience values, like so:

  • Beginner — includes beginner, but excludes advanced.
  • Advanced — includes advanced, but excludes beginner.
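As a rough sketch of what the Beginner filter might contain (assuming the tagging uses the standard DITA audience attribute, and borrowing the beginning_baking.ditaval file name mentioned later in this post), the ditaval could look something like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- beginning_baking.ditaval: a sketch, not necessarily the exact file from my project -->
<val>
  <!-- Keep anything tagged audience="beginner" -->
  <prop att="audience" val="beginner" action="include"/>
  <!-- Drop anything tagged audience="advanced" -->
  <prop att="audience" val="advanced" action="exclude"/>
</val>
```

The Advanced ditaval would be the mirror image: the same two prop lines with the include and exclude actions swapped.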

The include/exclude action tells the computer which content to include and which to exclude when you publish your guide. So, for example, when you create the Beginning Guide, you want to see the following chapters:

  • Introduction
  • Cookies
  • Cakes and Pies
  • Basic Breads
  • Conclusion

You do not want to see:

  • Advanced Breads
  • French Pastry
  • Croissants

And when you create the Advanced Guide, you want to see:

  • Introduction
  • Cakes and Pies
  • Basic Breads
  • Advanced Breads
  • French Pastry
  • Croissants
  • Conclusion

You do not want to see:

  • Cookies

Tag at the Topic or Chapter Level

So, to accomplish this, you would “tag” the chapters so the computer knows what to show and what not to show at build/publish time. Again, in my shop, we use DITA, so my parent Cookbook file (called a “ditamap”) would look like this:
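Here is a minimal sketch of such a ditamap. The topic file names are hypothetical, chosen only for illustration, and I’m assuming the chapters are tagged with the audience attribute on each topicref:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<!-- Sketch of the parent Cookbook map; all href values are made-up file names -->
<map>
  <title>Cookbook</title>
  <topicref href="introduction.dita"/>
  <topicref href="cookies.dita" audience="beginner"/>
  <topicref href="cakes_and_pies.dita"/>
  <topicref href="basic_breads.dita"/>
  <topicref href="advanced_breads.dita" audience="advanced"/>
  <topicref href="french_pastry.dita" audience="advanced"/>
  <topicref href="croissants.dita" audience="advanced"/>
  <topicref href="conclusion.dita"/>
</map>
```

Chapters with no audience attribute are never touched by the filter, so they land in every guide.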


Breaking that Down

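Introduction, Cakes and Pies, Basic Breads, and Conclusion are not tagged. That is because you want these to show (include) in both guides. Cookies is tagged with the beginner audience value, and Advanced Breads, French Pastry, and Croissants are tagged with the advanced audience value, so each of those shows in only one guide.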

Here’s how the guides will look based on the include/exclude information in their associated filter file (the .ditaval):

  • Beginning Guide
    Will include Introduction, Cookies, Cakes and Pies, Basic Breads, Conclusion
    Will exclude Advanced Breads, French Pastry, Croissants
  • Advanced Guide
    Will include Introduction, Cakes and Pies, Basic Breads, Advanced Breads, French Pastry, Croissants, Conclusion
    Will exclude Cookies

If you don’t fully grasp this right now, try not to freak out. You’ll test it! You WILL get this to work. It just takes some time for your mind to adapt to this way of thinking.

Tag at the Inline Level

In addition to the topic-level tagging that you’ve done, you may have some text that you need to tag inline (e.g., adding “audience=(value)” to specific elements). For example, in your Introduction chapter, you could tag a single paragraph or phrase that only one audience should see.
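A minimal sketch of inline tagging in the Introduction topic might look like this. The topic type, id, and all of the wording are invented for illustration; only the audience values come from the cookbook example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<!-- Sketch only: the text below is placeholder content -->
<concept id="introduction">
  <title>Introduction</title>
  <conbody>
    <!-- Untagged paragraph: shows in both guides -->
    <p>Welcome to the course cookbook.</p>
    <!-- Whole paragraph tagged for one audience -->
    <p audience="beginner">If you have never baked before, start with the Cookies chapter.</p>
    <!-- Phrase-level tagging inside a shared sentence -->
    <p>This guide covers <ph audience="beginner">the basics of baking</ph><ph audience="advanced">advanced breads and French pastry</ph>.</p>
  </conbody>
</concept>
```

With the Beginner filter, the last sentence would read “This guide covers the basics of baking.” With the Advanced filter, it would read “This guide covers advanced breads and French pastry.”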


For the xml-phobic, here’s a different view of that same content with some nice red arrows to explain what’s going on:


Refer back to the “Breaking that Down” section above to understand how this tagged content will show or not show in your different cookbook versions.

QA the Tagging

Here are a few ideas for QAing your tagging:

  • Your tool may have a handy way to validate your tagging. In other words, if you have tagged something incorrectly, your tool may have a way to let you know that and show you exactly where you messed up. In my shop, we use Oxygen, which has a great validator.
  • You can also design in your own quick QA checks. For example, one thing I’ve done in the past is to create a topic that lists the PDF guides. Here’s an example where the table rows are tagged for either “sys_admin” or “end_user” audience value. In the final publication, I can quickly glance at this topic and see if the correct row is showing (in this case, my ditaval filters are such that only the end user row or only the system administrator row should be showing). Here’s how the XML looks:
    Documentation PDFs file in XML

And again for the xml-phobic, here is the same file in the WYSIWYG-ish view. The green indicates content that is tagged:
Documentation PDFs file in WYSIWYG
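For the quick QA check described above (the topic that lists the PDF guides), a sketch of the tagged table might look like the following. The guide names and column headings are placeholders; only the “sys_admin” and “end_user” audience values come from my example:

```xml
<!-- Sketch of a QA table: exactly one body row should survive any given filter -->
<table>
  <title>Documentation PDFs</title>
  <tgroup cols="2">
    <thead>
      <row>
        <entry>Guide</entry>
        <entry>Audience</entry>
      </row>
    </thead>
    <tbody>
      <row audience="sys_admin">
        <entry>System Administrator Guide</entry>
        <entry>System administrators</entry>
      </row>
      <row audience="end_user">
        <entry>End User Guide</entry>
        <entry>End users</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```

If both rows (or neither) show up in the published output, that is a quick signal that the ditaval or the tagging needs another look.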

  • Compare the TOC of the final output against the filter. Is anything showing that shouldn’t be, or vice versa?
  • Look at your build output for any error messages. We’ll explore filtering challenges in another post, but for now, here’s a quick example of a filtering-related error message:
    [FATAL] Failed to parse the input file ‘AdministeringIdeation.ditamap’ because all of its content has been filtered out. Please check the input file ‘AdministeringIdeation.ditamap’ and the ditaval file, and ensure that the input is valid.

Create the Cookbooks

Now that you have tagged everything and you’re ready to create the guides, you’ll push the magic button in whatever tool you’re using (for example, running the build on the command line, in Oxygen, or via your CMS). In Oxygen, I would choose my filter (e.g., beginning_baking.ditaval) and my desired outputs (HTML, PDF, .doc, whatever), and then run the build. I would expect the result to be the Beginning Guide.

Start Small! But Start!

I tried to pick an example that was complicated enough to describe the power of filtering, but still accessible enough that it wasn’t completely overwhelming. My advice would be to pick a small project and practice on it until you get the hang of filtering.

If you have any questions, feel free to ask in the comments. Thanks for reading!
