Grep Console How-to #3: Capture Groups

Note: Grep Console 3.0 and 3.0.1 had a bug that caused some expressions to be created not quite as described here. This was fixed in version 3.0.2. If you still have an older version, please update before continuing with this how-to.

After the last two how-tos described the basics of Grep Console, it’s time for some advanced stuff. This time, I’ll be showing how to use capture groups to highlight only certain keywords in a line.

Once more, we’ll reuse the code from the previous examples for our demo program:

public static void main(String[] args)
{
  System.out.println("This is the first line.");
  System.out.println("Some more text...");
  System.out.println("This is the second line.");
  System.out.println("Some more text...");
  System.out.println("This is the third line.");
}

Also, here’s the settings file containing the grep expressions created during the first two how-tos, in case you don’t have them on your system. Simply load the file in the Grep Console dialog.

We now want to add an expression to highlight only the single word “line”. The easiest way to do this is to run our program once, select the word “line” in the output and right-click to open the context menu, then select “Add expression” and the “How-to 1” folder.

Take a look at the grep expression this shows in the expression dialog:

As you can see, the text “line” is surrounded by parentheses. In regular expressions, parentheses denote a capture group. Capture groups can be used to select certain parts of a line matched by the regular expression as a whole. Remember that Grep Console automatically wraps every expression with “.*” strings at the beginning and end. The full expression in this case therefore looks like this:

.*(\Qline\E).*

The “\Q…\E” part just means that any special characters between the “\Q” and the “\E” should not be treated as regular expression characters – we could write “.*” inside this part and it would only match a dot character followed by an asterisk, not any number of arbitrary characters as “.*” usually does in a regular expression. In this case, “line” is a plain string containing no special characters, so we could just as well drop the “\Q” and “\E” without changing what the expression matches.
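If you want to try the quoting behaviour outside of Grep Console, here is a minimal sketch using the standard java.util.regex classes. The class and method names (QuoteDemo, matches) are made up for this illustration and are not part of the plugin.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Returns true if the entire input matches the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Unquoted: ".*" matches any number of arbitrary characters.
        System.out.println(matches(".*", "anything at all"));        // true
        // Quoted: "\Q.*\E" only matches a literal dot followed by an asterisk.
        System.out.println(matches("\\Q.*\\E", "anything at all"));  // false
        System.out.println(matches("\\Q.*\\E", ".*"));               // true
        // Pattern.quote builds the same kind of quoted expression for us.
        System.out.println(Pattern.quote("line"));                   // \Qline\E
    }
}
```

Note the doubled backslashes: they are only needed because the expression is written as a Java string literal. In the Grep Console expression field you type a single backslash.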

The whole expression therefore means: Match lines that contain any number of arbitrary characters followed by the string “line” and any number of additional arbitrary characters, and make “line” a capture group.
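The following sketch runs the full expression against two lines of our demo output in plain Java and reports which part of the line the capture group covers. It only illustrates how the regular expression behaves; the names LineGroupDemo and groupOneSpan are my own, and I am not claiming that Grep Console is implemented this way.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineGroupDemo {
    // The full expression, including the ".*" wrappers described above.
    static final Pattern LINE = Pattern.compile(".*(\\Qline\\E).*");

    // Returns {start, end} of capture group 1, or null if the line does not match.
    static int[] groupOneSpan(String consoleLine) {
        Matcher m = LINE.matcher(consoleLine);
        if (!m.matches()) {
            return null;
        }
        return new int[] { m.start(1), m.end(1) };
    }

    public static void main(String[] args) {
        String first = "This is the first line.";
        int[] span = groupOneSpan(first);
        // The span marks exactly the text a "Group 1" style would apply to.
        System.out.println(first.substring(span[0], span[1]));  // line
        // Lines without the word "line" do not match at all.
        System.out.println(groupOneSpan("Some more text..."));  // null
    }
}
```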

Grep Console recognises capture groups and allows you to assign a distinct style to each capture group. So far, we’ve only used the “whole line” group, which is present for every regular expression and applies, as the name says, to the entire matched line. Now that we have a capture group, the table below the expression field shows a second line, labelled “Group 1”. This corresponds to our capture group, and a style applied to this group will only be used on the part of the line described by the group.

Create a new style for the capture group by double-clicking on the “Group 1” line, and enable the bold font in the style dialog. The preview should now look like this:

As you can see, every line containing the word “line” has that word highlighted in a bold font.

But what about the expressions we already created earlier to highlight those lines? Click “Ok” to close the dialog and take a look at what the console output looks like now:

Grep Console combines all styles matching a line. In this case, we’ve specified that three of our console lines should be styled with a background colour, and that the word “line” should be bold. The result is just that: Three lines with background colour and a bold font for the word “line”.

But you can also assign different styles to different parts of a line with a single expression. Open the Grep Console dialog and create a new expression using the following regular expression:

(Some)(.*)(text)

If you type this text into the expression field, you may notice the text sometimes turning red as you type. If the expression is shown in a red font, it means that the current expression is invalid. Hovering the mouse cursor over the text field will open a tool tip showing a description of the error, as provided by the Java API. For example, if you type an opening parenthesis without a matching closing one, the expression will be invalid.
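To see the kind of error description the Java API produces for an incomplete expression, you can compile it yourself. This is only a sketch: the helper name describeError is hypothetical, and which part of the exception Grep Console actually shows in its tool tip is an assumption on my part.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ExpressionValidator {
    // Returns null if the expression compiles, otherwise the error
    // description reported by java.util.regex.
    static String describeError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // Half-typed expression: the opening parenthesis is never closed.
        System.out.println(describeError("(Some"));
        // Complete expression: compiles fine, so no error is reported.
        System.out.println(describeError("(Some)(.*)(text)"));
    }
}
```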

The above expression contains three capture groups, and the table below the expression field will therefore have four entries: “Whole line”, “Group 1”, “Group 2” and “Group 3”. Create a new background colour style for the “whole line” group, assign the “bold” style we created above to groups 1 and 3, and assign the background colour style used for the second line (named “Line 2” in my example) to group 2.

The result should look like this:

As you can see, the second capture group spans all the text between the first (“Some”) and third (“text”) group, including white space. Capture group styles take precedence over styles applied to the whole line, so the middle group is shown with a blue background, while the rest of the line uses a pink one.
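Here is a small Java sketch that shows what each of the three groups captures for the line “Some more text...”. I have written out the “.*” wrappers that Grep Console adds; the class and method names are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreeGroupsDemo {
    // The expression from above, with the ".*" wrappers written out.
    static final Pattern P = Pattern.compile(".*(Some)(.*)(text).*");

    // Returns the text of each capture group, or null if the line does not match.
    static String[] groups(String consoleLine) {
        Matcher m = P.matcher(consoleLine);
        if (!m.matches()) {
            return null;
        }
        String[] result = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            result[i - 1] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Brackets make the white space captured by group 2 visible.
        for (String g : groups("Some more text...")) {
            System.out.println("[" + g + "]");
        }
        // Prints:
        // [Some]
        // [ more ]
        // [text]
    }
}
```

The number of capture groups (three here) is what Matcher.groupCount() reports. Together with the whole match, that corresponds to the four entries in the table.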

Capture groups are an excellent tool to highlight relevant parts of log output lines. For example, if your program regularly outputs a certain numeric value of interest, you can use a regular expression to identify all lines containing this value, and then use a capture group to apply a text style to only the numeric value. This allows you to easily find the relevant information in your log output.
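As a sketch of the numeric value case, assume a program that logs lines such as “Request took 42 ms”. That log format, and the names used below, are made up for this example. An expression like “took (\d+) ms” matches those lines and puts only the number into group 1, which is the part you would then style:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumericValueDemo {
    // Hypothetical log format; the ".*" wrappers are written out explicitly.
    static final Pattern TOOK = Pattern.compile(".*took (\\d+) ms.*");

    // Returns the digits captured by group 1, or null if the line does not match.
    static String capturedValue(String logLine) {
        Matcher m = TOOK.matcher(logLine);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(capturedValue("Request took 42 ms"));  // 42
        System.out.println(capturedValue("Starting up..."));      // null
    }
}
```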

In closing, here’s a look at the fully styled console output:

The next how-to will introduce the Grep View.
