This is the next installment of a series of deep-dives into the structure and implementation of variables in Visual Basic for Applications. For the previous posts, see the following:

In this post, I will cover the details of string variables and pointers. See Scalar Variables and Pointers in Depth for additional background and for the code for the utility functions HexPtr and Mem_ReadHex.

Pointers and memory for string variables

Even though string variables are treated semantically as value types, they are implemented as reference types. A string variable actually contains a pointer to another memory location where the string's characters are stored. In VBA we can either get the address of the variable itself using VarPtr, or go straight to the start of the character buffer using StrPtr. For a variable declared as a String, then, directly reading the memory at the address returned by VarPtr should give you the same pointer value as calling StrPtr.
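As a minimal sketch of that claim, the snippet below reads the pointer stored in the variable and compares it with StrPtr. It calls the Windows API function RtlMoveMemory directly instead of the series' HexPtr and Mem_ReadHex helpers, and it assumes VBA7 (Office 2010 or later) on Windows. The names CopyMemory, PTR_SIZE and CompareVarPtrAndStrPtr are chosen here purely for illustration.

```vb
Option Explicit

' RtlMoveMemory copies raw bytes from one address to another.
' CopyMemory is an alias chosen for this sketch.
Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" ( _
    ByRef Destination As Any, ByVal Source As LongPtr, ByVal Length As LongPtr)

' A pointer is 8 bytes in 64-bit VBA and 4 bytes in 32-bit VBA.
#If Win64 Then
    Private Const PTR_SIZE As Long = 8
#Else
    Private Const PTR_SIZE As Long = 4
#End If

Public Sub CompareVarPtrAndStrPtr()
    Dim strVar As String
    Dim ptrBSTR As LongPtr

    strVar = "Hello"

    ' Copy the contents of the variable itself (one pointer) into ptrBSTR.
    CopyMemory ptrBSTR, VarPtr(strVar), PTR_SIZE

    Debug.Print "VarPtr(strVar): "; Hex$(VarPtr(strVar))
    Debug.Print "Pointer stored: "; Hex$(ptrBSTR)
    Debug.Print "StrPtr(strVar): "; Hex$(StrPtr(strVar))
    Debug.Print "Same pointer?   "; (ptrBSTR = StrPtr(strVar))
End Sub
```

The addresses printed will differ from run to run; only the equality of the last two values matters.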

Strings are BSTR structures

As noted in VBA Internals: What’s in a variable, strings in VBA are implemented using the COM BSTR structure. A BSTR starts with an unsigned 32-bit integer that gives the length of the character buffer. Note that this length is in bytes, not characters, and it does not include the two bytes of the terminating null character. However, the BSTR specification requires that implementers pass around the pointer to the start of the character buffer (rather than to the preceding length field), so that a BSTR can be passed directly to functions expecting a pointer to a C-style null-terminated wide-character string. To read the length field directly, then, we need to take the pointer returned by StrPtr and back up 4 bytes.

In the example below, I show the full BSTR structure: I get the length in bytes of the string buffer using LenB, back up 4 bytes from the address returned by StrPtr to include the length field, and read a total of 6 extra bytes to cover both the length field at the start and the null character at the end.
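A minimal sketch of these steps might look like the following. It uses RtlMoveMemory directly rather than the HexPtr and Mem_ReadHex helpers from the earlier post, and it assumes VBA7 on Windows. DumpBstr and the CopyMemory alias are illustrative names, not part of the series' utility code.

```vb
Option Explicit

' RtlMoveMemory copies raw bytes from one address to another.
Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" ( _
    ByRef Destination As Any, ByVal Source As LongPtr, ByVal Length As LongPtr)

Public Sub DumpBstr()
    Dim strVar As String
    Dim buf() As Byte
    Dim totalBytes As Long
    Dim i As Long
    Dim dump As String

    strVar = "Hello"

    ' 4-byte length prefix + character buffer + 2-byte null terminator
    totalBytes = LenB(strVar) + 6
    ReDim buf(0 To totalBytes - 1)

    ' Start reading 4 bytes before the address returned by StrPtr.
    CopyMemory buf(0), StrPtr(strVar) - 4, totalBytes

    ' Format each byte as two hex digits.
    For i = 0 To totalBytes - 1
        dump = dump & Right$("0" & Hex$(buf(i)), 2) & " "
    Next i

    Debug.Print Trim$(dump)
    ' For "Hello": 0A 00 00 00 48 00 65 00 6C 00 6C 00 6F 00 00 00
End Sub
```

Reading memory this way is only safe while strVar is in scope and unchanged, since assigning a new value to the string can move the buffer.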




The variable table in this case is pretty simple:

Name     Type     Address
strVar   String   0x0039F4F0

The functions used and the memory layout revealed take a little more explaining. First, when we directly read the memory at the address returned by VarPtr, we get the bytes of the pointer to the character buffer. Since my machine is little-endian, the raw bytes appear in reverse order. The printout shows that calling StrPtr returns exactly the same pointer value as the one stored in ptrBSTR.

Finally, we display the bytes of the BSTR itself. It starts with the 4-byte length field. Again, this is little-endian, so we have to reverse the bytes to interpret it correctly. When we do, we indeed see a value of 10, for the 10 bytes of the 5-character Unicode string “Hello”. Next is the character buffer. The characters are in the expected order, but the two bytes within each 16-bit UTF-16 code unit are once again little-endian. Finally, there is a two-byte null character at the end.

Address      0  1  2  3
0x0039F4Fx   E4 3A 35 08
             = 0x08353AE4

Address      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x08353AEx   0A 00 00 00 48 00 65 00 6C 00 6C 00 6F 00 00 00

Length prefix   0A 00 00 00   = 0x0000000A = 10 (decimal)
Characters      48 00         = 0x0048 = H
                65 00         = 0x0065 = e
                6C 00         = 0x006C = l
                6C 00         = 0x006C = l
                6F 00         = 0x006F = o
Null term       00 00
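The byte-reversal arithmetic in this breakdown can be sketched in plain VBA with no API calls. The bytes are hard-coded from the dump above purely for illustration, and DecodeBstrBytes is a hypothetical name.

```vb
Option Explicit

Public Sub DecodeBstrBytes()
    Dim raw As Variant
    Dim byteLen As Long
    Dim decoded As String
    Dim i As Long

    ' The 16 bytes of the BSTR, copied from the dump above.
    raw = Array(&HA, &H0, &H0, &H0, &H48, &H0, &H65, &H0, _
                &H6C, &H0, &H6C, &H0, &H6F, &H0, &H0, &H0)

    ' Little-endian: the least significant byte comes first.
    ' (Assumes the length is small enough to fit in a Long.)
    byteLen = raw(0) + raw(1) * 256& + raw(2) * 65536 + raw(3) * 16777216

    ' Each character is a 2-byte UTF-16 code unit, low byte first.
    For i = 4 To 4 + byteLen - 1 Step 2
        decoded = decoded & ChrW$(raw(i) + raw(i + 1) * 256&)
    Next i

    Debug.Print byteLen    ' 10
    Debug.Print decoded    ' Hello
End Sub
```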