In a previous post I presented some general-purpose wrappers for adding regular expressions functionality to VBA and SQL in Microsoft Access.

These wrappers work well, but we can make them much more efficient for situations where the same pattern is used many times, such as in a SQL query where the regular expression will be executed on each row of the input table or query. This article describes how to use a simple factory pattern to cache and reuse RegExp objects, dramatically increasing performance when the same pattern is used more than once during the lifetime of a program.

Caching Compiled Regular Expressions

What we’ll use is the fact that, behind the scenes, every regular expression pattern gets compiled into a tiny little program that executes in order to actually apply the regular expression. The first time you call any of the methods on a RegExp object there will be a short pause while the expression is compiled. Once the regular expression has been compiled, though, it can execute right away.

The pause for compilation is usually on the order of a few tens of milliseconds. That’s small enough that you might never notice it even when stepping through code. But it can really add up when the expression is going to be applied to a few hundred thousand rows.

If you’re using a RegExp object in a procedure while looping through a recordset programmatically you can just keep using the same RegExp object. But what about when the same pattern is used in independent calls to the same function — or even to different functions entirely? The solution is to cache all of your regular expressions objects, and serve them up using a simple factory pattern. In this article I’ll show how to do just that.

Creating the Factory

We will store all the RegExp objects in a dictionary, where the keys will be a concatenation of the pattern and some symbols to represent the three options properties (Global, IngoreCase, and Multiline).

You will need the following references for this to code work:

  • Microsoft VBScript Regular Expressions 5.5
  • Microsoft Scripting Runtime

Now anywhere in your code that you need a RegExp object, you call GetRegex instead of declaring a New RegExp. This also makes it really fast to create new RegExp objects in a single line.

Updating the Regular Expressions Wrappers

See Regular Expressions in VBA for an in-depth discussion of these methods. Here we’ll just update them to take advantage of the new RegExp factory we just built:

RxTest

RxMatch

RxMatches

RxReplace

And that’s that! In a future post I’ll show examples of when and how you might actually use such functions, as well as performance tests delving deeper into the benefits of caching your RegExp objects.

Conclusion

Behind the scenes, RegExp objects generate tiny compiled programs that are executed in order to apply regular expressions. Using a simple caching mechanism you can dramatically increase the performance of using regular expressions in Microsoft Access by saving each unique compiled RegExp object, as identified by the pattern and the three options properties of the object.