ClickHelp Documentation

Finding Content with Regular Expressions and Other Techniques

To perform more sophisticated operations with Global Find and Replace, you can use regular expressions. Here are some use cases.

Regular Expressions Syntax

Information

Before starting, keep in mind a few things about the syntax of regular expressions:

  • The Global Find and Replace engine supports the .NET regular expression syntax. Here is an article by Microsoft to learn more: Regular Expression Language - Quick Reference.
  • When using the Search and Replace dialog in the Topic Source editor, HTML editor, JavaScript editor, CSS editor, and Code Samples editor, ClickHelp uses the JavaScript regular expression syntax. To learn more, refer to this article: JavaScript RegExp Reference. Keep in mind that you enter regular expressions in ClickHelp without surrounding quotes or slashes.
  • In the Global Find and Replace page and the Search and Replace dialogs in editors, you can use the $ symbol with a number (like $1, $2, and so on) in the replacement string to refer to the text matched by a specific group in parentheses.
  • Some symbols like ?*+.\$^{}()| may have special meaning in regular expressions. If you need to use those symbols when searching, make sure that you escape them in your search string by adding a backslash before the symbol. For example, \? or \*.
  • If you're unsure about the correct use of regular expressions, we recommend using REGEX TESTER to test and fix your regular expressions.
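
As a rough illustration of the escaping rule and of numbered group references, here is a small sketch in Python. It uses Python's re module rather than the ClickHelp engine, so it is only an approximation: Python writes group references in the replacement as \1 or \g<1>, where ClickHelp uses $1. The sample strings are made up for this example.

```python
import re

text = "Is it ready? Price: $5 (approx.)"

# Escaped with a backslash, "?" matches a literal question mark
# instead of acting as a quantifier.
literal_question = re.findall(r"ready\?", text)

# re.escape adds the backslashes automatically for plain search text.
escaped = re.escape("$5 (approx.)")
found = re.search(escaped, text) is not None

# Numbered group reference: swap the two captured words.
# ClickHelp would use "$2 $1" in the 'Replace with' field; Python uses \2 \1.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

print(literal_question, found, swapped)
```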

Scenario #1 - Replace Text with a Variable

Let's suppose you added your company name to all your topics.

Now you'd like to use a Variable in all those places, so you can update the company name more easily if required. You can perform this change as described below:

  • Fill in the search string with the company name.

  • Check the relevant filter boxes. 

  • Click FIND ALL.
  • Fill in the 'Replace with' field.

  • Select topics where you want to replace your company name with a variable.
  • Click REPLACE SELECTED (or REPLACE ALL if you need to replace it in all topics).
  • Here is the result:

Scenario #2 - Find Some HTML and Replace it with a Snippet

Let's suppose you have an instruction:

Its code looks like this in the Source view:

To find it in all topics, use \s*? before and after opening and closing tags. This way, the search matches the markup regardless of line breaks and indentation between tags. So, your search query will look like this:

HTML
<p>\s*?Here is how to open a file:\s*?</p>\s*?<ul>\s*?<li>\s*?Click "File > Open" in the main menu\.\s*?</li>\s*?<li>\s*?Select a file from the Open File dialog\.\s*?</li>\s*?</ul>

It may look complicated, but it takes only a few seconds. Just remember that you should add \s*? before every < and after every > of a tag.
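
To see why \s*? helps, the query above can be tried outside ClickHelp. The sketch below uses Python's re module as a stand-in for the ClickHelp engine, and both HTML samples are assumed markup written for this example: one without whitespace between tags and one with line breaks and indentation.

```python
import re

# The search query from this scenario, split across lines for readability.
pattern = re.compile(
    r'<p>\s*?Here is how to open a file:\s*?</p>\s*?<ul>\s*?<li>\s*?'
    r'Click "File > Open" in the main menu\.\s*?</li>\s*?<li>\s*?'
    r'Select a file from the Open File dialog\.\s*?</li>\s*?</ul>'
)

# Assumed sample 1: everything on one line.
compact = ('<p>Here is how to open a file:</p><ul>'
           '<li>Click "File > Open" in the main menu.</li>'
           '<li>Select a file from the Open File dialog.</li></ul>')

# Assumed sample 2: the same markup with line breaks and indentation.
indented = """<p>Here is how to open a file:</p>
<ul>
    <li>Click "File > Open" in the main menu.</li>
    <li>
        Select a file from the Open File dialog.
    </li>
</ul>"""

compact_matches = pattern.search(compact) is not None
indented_matches = pattern.search(indented) is not None
print(compact_matches, indented_matches)
```

Both samples match, because each \s*? accepts zero or more whitespace characters between a tag and the neighboring text.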

If you need to replace this code with a snippet, do the following:

  • Fill in the search string with code according to the above sample.
  • Check the relevant filter boxes.
  • Click FIND ALL.
  • Fill in the 'Replace with' field with a snippet reference (you need to create a snippet topic beforehand).

  • In the search results, select the topic where you want to replace code with a snippet reference.
  • Click REPLACE SELECTED, or click REPLACE ALL if you need to replace it in all topics.
  • Here is the result - the static content has been replaced with a snippet.

To learn more about snippets, refer to this topic: Content Snippets.

Scenario #3 - Clean Up Formatting After Content Importing

Let's suppose you want to delete this header that got imported with external content:


If we switch to the Source mode, we'll see that the HTML code of the header looks like this:

  • In order to find it, you can use its ID:
    HTML
    <div[^>]*?id="pnlHeader"[^>]*?>\s*?Created by a third-party tool\s*?</div>

    or its class:
    HTML
    <div[^>]*?class="header"[^>]*?>\s*?Created by a third-party tool\s*?</div>
    As you can see, we added [^>]*? to ignore all other attributes of the current tag. You can use this regular expression in other similar cases.
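
A minimal sketch of the ID-based query, again using Python's re module as an approximation of the ClickHelp engine. The imported markup is hypothetical: the header div carries extra class and style attributes around the id.

```python
import re

by_id = re.compile(
    r'<div[^>]*?id="pnlHeader"[^>]*?>\s*?Created by a third-party tool\s*?</div>'
)

# Hypothetical imported markup: the id sits between other attributes.
page = ('<div class="header" id="pnlHeader" style="color: gray">\n'
        '  Created by a third-party tool\n'
        '</div><p>Real content</p>')

# Replacing the match with an empty string deletes the header.
cleaned = by_id.sub("", page)
print(cleaned)
```

The [^>]*? parts absorb the class and style attributes, so the query does not need to list them.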

Scenario #4 - Looking for a Tag with Varying Attributes

You can use this scenario for different situations but, in our case, we want to find and delete the following header: 

  
Its code looks like this in the Source mode:

In this situation, you can use its ID to find it, as shown in the previous scenario, so the approach is the same.

However, if the markup you are looking for has varying attributes and tags, you may need to ignore the varying part and search by the static part. For example, you have the following formatting in one topic:


and this one in another topic (the navigation link text is different):

You can match both versions using something that they have in common, like this:

HTML
<div id="pnlHeader">
<div[^>]*?class="headerText"[^>]*?>[^<]*?</div>
<a[^>]*?>[^<]*?</a>
</div>

So, your search query will be the following:

HTML
<div id="pnlHeader">\s*?<div[^>]*?class="headerText"[^>]*?>[^<]*?</div>\s*?<a[^>]*?>[^<]*?</a>\s*?</div>

This is the result of the search. As you can see, both topics are shown:
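
The query can also be checked against two header variants in a short Python sketch. Python's re module only approximates the ClickHelp engine, and both headers below are hypothetical markup that differs in the link target and the link text.

```python
import re

query = re.compile(
    r'<div id="pnlHeader">\s*?<div[^>]*?class="headerText"[^>]*?>[^<]*?</div>'
    r'\s*?<a[^>]*?>[^<]*?</a>\s*?</div>'
)

# Two hypothetical headers: only the navigation link differs.
topic_a = ('<div id="pnlHeader">\n<div class="headerText">My Manual</div>\n'
           '<a href="next.htm">Next topic</a>\n</div>')
topic_b = ('<div id="pnlHeader">\n<div class="headerText">My Manual</div>\n'
           '<a href="prev.htm">Previous topic</a>\n</div>')

matches = [bool(query.search(t)) for t in (topic_a, topic_b)]
print(matches)
```

Here [^>]*? skips the varying attributes inside a tag, and [^<]*? skips the varying text between tags.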

Scenario #5 - Clean Up Headers and Footers with Unknown Tag Structure

Sometimes you don't know the tag structure of all headers and footers. In this case, you can use the previous method and delete groups of headers/footers one by one, or you can take the opposite approach: identify the content you need to keep and remove everything around it. Below, we will show the latter method:

  • For example, you have a header that looks like this in the Design view.

  • And it looks like the following in the Source view:

  • We can specify the beginning and the end of the content, so the search query will look like this:
    HTML
    ^.*?<div[^>]*?id="pnlContent"[^>]*?>(.+)</div>\s*?<div[^>]*?class="footer".*?$

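  • Fill in the 'Replace with' field with the following replacement pattern: $1. It stands for the text captured by the parentheses in the search query, that is, the content you want to keep.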
  • Then proceed with the steps.
  • Here is the result:

This method, and the Global Find and Replace engine in general, are powerful. You can use them not only for headers and footers but also for other cases when you need to find (and replace) parts of formatting.
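
The keep-only-the-content idea can be sketched in Python as follows. This is an approximation under stated assumptions: the topic markup is hypothetical, Python writes the group reference as \1 instead of $1, and the re.DOTALL flag is used so that the dot also matches line breaks. Whether ClickHelp needs an equivalent option is not stated in this topic.

```python
import re

# re.DOTALL lets "." match line breaks (an assumption for this sketch).
query = re.compile(
    r'^.*?<div[^>]*?id="pnlContent"[^>]*?>(.+)</div>\s*?'
    r'<div[^>]*?class="footer".*?$',
    re.DOTALL,
)

# Hypothetical imported topic with an unknown header and a footer.
topic = ('<table><tr><td>Some header</td></tr></table>\n'
         '<div id="pnlContent"><p>Keep this text.</p></div>\n'
         '<div class="footer">Some footer</div>')

# \1 plays the role of $1 in the 'Replace with' field:
# the whole match is replaced with the captured content only.
content_only = query.sub(r"\1", topic)
print(content_only)
```

Everything before the content panel and everything from the footer onward is part of the match but outside the group, so it is dropped by the replacement.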

Scenario #6 - Find a Particular CSS Class

This can be useful for cleaning up content imported from MS Word, because the import can create CSS classes with automatically generated names for repeated styles. To learn more, refer to this topic: Import from Microsoft Word. Here are some use cases:

  • Automatic CSS Class Names

    In this case, use plain text. Because these names are unique, you don't even need to check the 'Match whole word' filter box.

  • Other Class Names

    Class names can vary, and they can even coincide with tag names. In this case, a two-step replacement is required:

For example, we need to find the "myClass" class name in all CSS styles. We will use this regular expression:

CSS
\.myClass\b

As you can see, we use \. to match a literal period and \b to match a word boundary.

For HTML, looking for all places where myClass is used will look like this:

HTML
(<[^>]+?class=['"][^'"]*?)\bmyClass\b

  • This expression first matches the beginning of a tag and its class attribute, together with any other classes that precede the target class. This part is captured in a group because it must not be deleted or changed. Next comes the target class, surrounded by word boundaries. The ['"] part matches a single or double quote, and [^'"] matches any character except a single or double quote, so that the match does not go beyond the class attribute.
  • Fill in the 'Replace with' field with $1myNewClass. Then proceed with the steps.
  • So, for example, for the part:
    HTML
    <div class="firstClass myClass otherClass">

  • The result will be the following:
    HTML
    <div class="firstClass myNewClass otherClass">
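
Both steps can be tried in a short Python sketch. Python's re module is used here only as an approximation of the ClickHelp engine, and it writes the group reference as \g<1> where the 'Replace with' field uses $1. The CSS sample with the similar .myClassic class is made up to show the effect of \b.

```python
import re

# Step 1: the CSS query. \b prevents a match inside ".myClassic".
css = ".myClass { color: red; } .myClassic { color: blue; }"
css_hits = re.findall(r"\.myClass\b", css)

# Step 2: the HTML query. Group 1 keeps the tag start and preceding classes.
html = '<div class="firstClass myClass otherClass">'
renamed = re.sub(r"""(<[^>]+?class=['"][^'"]*?)\bmyClass\b""",
                 r"\g<1>myNewClass", html)

print(css_hits)
print(renamed)
```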

Scenario #7 - Find an Output Tag

Let's assume you need to find the PrintedDoc output tag.

Using regular expressions, the search query will look like this:

Note: this search covers topic content only. If you need to find the tag among publication settings, use TOC filters. To learn more, refer to this topic: TOC Filters.

HTML
(<ch:[^>]*?tags=['"][^'"]*?)\bPrintedDoc\b

In order to replace it, fill in the 'Replace with' field with $1NewOutputTag. Then proceed with the steps.

So, if we had the following Exclude Block markup:

HTML
<ch:exclude tags="AdminGuide,PrintedDoc">

We'll get the following result after the replace operation:

HTML
<ch:exclude tags="AdminGuide,NewOutputTag">
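
The same replacement can be reproduced in a Python sketch, with Python's re module standing in for the ClickHelp engine and \g<1> standing in for $1. The second sample, with a hypothetical PrintedDocs tag, is added to show that the word boundaries leave similar tag names untouched.

```python
import re

query = r"""(<ch:[^>]*?tags=['"][^'"]*?)\bPrintedDoc\b"""

markup = '<ch:exclude tags="AdminGuide,PrintedDoc">'
result = re.sub(query, r"\g<1>NewOutputTag", markup)

# Hypothetical similar tag name: not matched because of the trailing \b.
similar = '<ch:exclude tags="PrintedDocs">'
untouched = re.sub(query, r"\g<1>NewOutputTag", similar)

print(result)
print(untouched)
```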

Scenario #8 - Find a Topic or Image Link

In this case, you don't need regular expressions; you can use filters.

For example, to find where a topic is used, just fill in the search string with the topic title, ID, or URL.

Scenario #9 - Find and Rename Variables and Output Tags from Design View

For example, you need to rename the companyName variable:

  • Here is how it looks in the Design view:
  • Here is how it looks in the Source view:

So, here are the steps to follow in order to find and replace it:

  • In this case, you need to use regular expressions, so your search query will look like this: 

HTML
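(<ch:var[^>]+?name=['"])companyName(['"][^>]*?>)

As you can see, ['"] matches a double or single quote, [^>]+? covers the space after the tag name, and [^>]*? covers the space and slash before the closing >. The two pairs of parentheses capture the parts before and after the variable name, so that the replacement can refer to them as $1 and $2.

  • In order to rename it, fill in the 'Replace with' field with $1NewName$2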

  • Select a topic where you want to replace the variable, then click REPLACE SELECTED (or, if it's needed, click REPLACE ALL).
     
  • As a result, you'll get the variable with the new name, NewName.
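
A sketch of this rename in Python, under two assumptions: the parts before and after the variable name are captured in two groups, so that $1 and $2 have something to refer to, and the variable markup looks like the hypothetical sample below. Python's re module writes the references as \g<1> and \g<2>.

```python
import re

# Group 1: everything up to the opening quote of the name.
# Group 2: the closing quote and the rest of the tag.
query = re.compile(r"""(<ch:var[^>]+?name=['"])companyName(['"][^>]*?>)""")

# Hypothetical Source view markup of a topic that uses the variable.
source = '<p>Welcome to <ch:var name="companyName" />!</p>'

# Equivalent of $1NewName$2 in the 'Replace with' field.
renamed = query.sub(r"\g<1>NewName\g<2>", source)
print(renamed)
```

Only the name between the quotes changes; the rest of the tag is copied back from the two groups.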

Content migration may require using regular expressions to perform smart updates of your content. If you need help from our migration experts, please contact sales@clickhelp.com