VBA Internals: Variant Variables and Pointers in Depth

July 5, 2013 in VBA

Pointers and memory for Variant variables

In the Component Object Model (COM) Automation framework, the VARIANT structure provides a wrapper for passing around any type of data, and a suite of manipulation functions facilitates using the VARIANT as a platform-level dynamically typed variable. I say platform-level because the structures, enumerations, and functions that implement VARIANTs exist at the Windows API level. Any language, including those that are not dynamically typed, can use the API to accomplish something like dynamic types.

VBA does provide dynamically typed variables, and calls them Variants, just like the supporting structures in the COM API. When writing VBA code you never have to call the API functions like VarAdd or VarXor. The compiler and runtime do it for you behind the scenes. But when you pop the hood and start directly working with the bits and bytes of Variant variables and pointers it’s important to know what you’re really dealing with — namely, a COM VARIANT structure.

The layout of the 16 bytes in the VARIANT structure is covered in detail in What’s in a variable. The full code of the memory utility functions used in the examples in this post is included in Scalar Variables and Pointers in Depth.

Variants that contain numeric scalars, strings, or arrays

Everything described in my previous posts about pointers to numeric scalar variables, strings, and arrays applies to the same data types stored in Variants. The only difference is that the content stored at the pointer to a statically typed variable of these types is instead found 8 bytes after the pointer to the start of the Variant. The first 8 bytes of the Variant hold the flag that indicates what type of data the Variant is currently storing, followed by some unused space. I’ll try to clarify this with a few examples:
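Before the full examples, the type flag itself is easy to inspect. The following is a minimal sketch rather than code from the earlier posts: it assumes VBA7 (Office 2010 or later) and declares its own CopyMemory alias for RtlMoveMemory instead of using the Mem_Copy helper. The Sub name is made up for illustration. It reads the first two bytes of a Variant and prints them next to what VarType reports.

```vba
Option Explicit

' Assumption: VBA7 or later, matching the LongPtr usage elsewhere in this post.
Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByRef Destination As Any, ByRef Source As Any, ByVal Length As LongPtr)

' Hypothetical demo: compare the raw vt field with VarType for a few types.
Public Sub VariantTypeFlagSketch()
    Dim v  As Variant
    Dim vt As Integer

    v = CLng(42)
    CopyMemory vt, ByVal VarPtr(v), 2   ' first 2 bytes of the Variant
    Debug.Print vt, VarType(v)          ' 3 (VT_I4) and 3 (vbLong)

    v = "text"
    CopyMemory vt, ByVal VarPtr(v), 2
    Debug.Print vt, VarType(v)          ' 8 (VT_BSTR) and 8 (vbString)

    v = 1.5
    CopyMemory vt, ByVal VarPtr(v), 2
    Debug.Print vt, VarType(v)          ' 5 (VT_R8) and 5 (vbDouble)
End Sub
```

For these simple cases the raw flag and the VarType result agree, which is a quick way to confirm that VarPtr on a Variant really does point at the start of the structure.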

Example 1: Static Long vs Variant Long
Code
Sub LongVarExample()
    Dim myLong       As Long
    Dim myVar        As Variant
    Dim ptrToLong    As LongPtr
    Dim ptrToVar     As LongPtr
    Dim ptrToVarLong As LongPtr
    
    myLong = 12345678
    myVar = 12345678
    ptrToLong = VarPtr(myLong)
    ptrToVar = VarPtr(myVar)
    ptrToVarLong = ptrToVar + 8
    
    Debug.Print " ptrToLong    = 0x" & HexPtr(ptrToLong)
    Debug.Print "*ptrToLong    = 0x" & Mem_ReadHex(ptrToLong, 4)
    Debug.Print " ptrToVar     = 0x" & HexPtr(ptrToVar)
    Debug.Print "*ptrToVar     = 0x" & Mem_ReadHex(ptrToVar, 16)
    Debug.Print " ptrToVarLong = 0x" & HexPtr(ptrToVarLong)
    Debug.Print "*ptrToVarLong = 0x" & Mem_ReadHex(ptrToVarLong, 4)
End Sub
Output
 ptrToLong    = 0x0031F070
*ptrToLong    = 0x4E61BC00
 ptrToVar     = 0x0031F060
*ptrToVar     = 0x03000000000000004E61BC0000000000
 ptrToVarLong = 0x0031F068
*ptrToVarLong = 0x4E61BC00

Explanation

Here’s what the variable table looks like:

Variables
Name Type Address
myLong Long 0x0031F070
myVar Variant 0x0031F060

The layout for the statically-typed Long variable is simple:

Address 0 1 2 3
0x0031F07x 4E 61 BC 00
 
= 0x00BC614E = 12,345,678 (base 10)

The layout of the variant is almost as simple. The first two bytes are the flag, the next 6 bytes are empty, and then bytes 8-11 have the actual content — the four bytes containing the 32-bit integer.

Address     0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x0031F06x  03 00 00 00 00 00 00 00 4E 61 BC 00 00 00 00 00

Field       Bytes        Meaning
vt          03 00        0x0003 ⇒ VT_I4 ⇒ 4-byte signed integer (see VARENUM)
wReserved1  00 00        [unused]
wReserved2  00 00        [unused]
wReserved3  00 00        [unused]
lVal        4E 61 BC 00  0x00BC614E ⇒ 12,345,678 (base 10)
[empty]     00 00 00 00  [unused]

We can see that the Long value is stored starting at byte offset 8. The “real” content of a Variant always starts at byte offset 8 (except for a DECIMAL, which is covered later). Because of this, we can do some pointer arithmetic and get a pointer directly to the Long inside the Variant. This is what we do in the following line of code:

    ptrToVarLong = ptrToVar + 8

Now ptrToVarLong is in fact a pointer to a Long. That Long just happens to be inside a Variant. The value we get by directly reading from this address (the number 12,345,678) is the exact same as the value we get from directly reading from the address of the variable actually declared as a Long, which also has the value 12,345,678.
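Because the address at offset 8 is a genuine pointer to a Long, it can be written through as well as read. Here is a minimal sketch under the same assumptions as the rest of this post's examples (VBA7, a Variant currently holding a Long); CopyMemory is declared locally as an alias for RtlMoveMemory, and the Sub name is hypothetical.

```vba
Option Explicit

' Assumption: VBA7 or later, matching the LongPtr usage elsewhere in this post.
Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByRef Destination As Any, ByRef Source As Any, ByVal Length As LongPtr)

' Hypothetical demo: overwrite the Long payload of a Variant in place.
Public Sub WriteThroughVariantPointer()
    Dim v        As Variant
    Dim newValue As Long

    v = CLng(1)            ' vt becomes VT_I4, payload is 1
    newValue = 12345678

    ' Copy 4 bytes over the lVal field at offset 8; the vt flag is untouched.
    CopyMemory ByVal (VarPtr(v) + 8), newValue, 4

    Debug.Print v          ' 12345678
    Debug.Print VarType(v) ' 3 (still vbLong)
End Sub
```

This is only safe because the new bytes are a valid value for the type the flag already advertises. Writing a payload that does not match the vt flag would leave the Variant in an inconsistent state.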

Example 2: Static String vs Variant String
Code
Sub StringVarExample()
    Dim myString        As String
    Dim myVar           As Variant
    Dim ptrToStringVar  As LongPtr
    Dim ptrToStringBSTR As LongPtr
    Dim ptrToVar        As LongPtr
    Dim ptrToVarContent As LongPtr
    Dim ptrToVarBSTR    As LongPtr
    Dim lngBSTRLen      As Long
    
    myString = "Hello world!"
    myVar = "Hello world!"
    lngBSTRLen = LenB(myString)
    ptrToStringVar = VarPtr(myString)
    ptrToStringBSTR = StrPtr(myString)
    ptrToVar = VarPtr(myVar)
    ptrToVarContent = ptrToVar + 8
    ptrToVarBSTR = StrPtr(myVar)
    
    Debug.Print " ptrToStringVar   = 0x" & HexPtr(ptrToStringVar)
    Debug.Print "*ptrToStringVar   = 0x" & Mem_ReadHex(ptrToStringVar, PTR_LENGTH)
    Debug.Print " ptrToStringBSTR  = 0x" & HexPtr(ptrToStringBSTR)
    Debug.Print "*ptrToStringBSTR  = 0x" & Mem_ReadHex(ptrToStringBSTR, lngBSTRLen)
    Debug.Print " ptrToVar         = 0x" & HexPtr(ptrToVar)
    Debug.Print "*ptrToVar         = 0x" & Mem_ReadHex(ptrToVar, 16)
    Debug.Print " ptrToVarContent  = 0x" & HexPtr(ptrToVarContent)
    Debug.Print "*ptrToVarContent  = 0x" & Mem_ReadHex(ptrToVarContent, PTR_LENGTH)
    Debug.Print " ptrToVarBSTR     = 0x" & HexPtr(ptrToVarBSTR)
    Debug.Print "*ptrToVarBSTR     = 0x" & Mem_ReadHex(ptrToVarBSTR, lngBSTRLen)
End Sub
Output
 ptrToStringVar   = 0x001CF010
*ptrToStringVar   = 0x3CA6E007
 ptrToStringBSTR  = 0x07E0A63C
*ptrToStringBSTR  = 0x480065006C006C006F00200077006F0072006C0064002100
 ptrToVar         = 0x001CF000
*ptrToVar         = 0x080000000000000064A6E00700000000
 ptrToVarContent  = 0x001CF008
*ptrToVarContent  = 0x64A6E007
 ptrToVarBSTR     = 0x07E0A664
*ptrToVarBSTR     = 0x480065006C006C006F00200077006F0072006C0064002100

Explanation

For brevity I won’t break out all the variables and byte layout for this example. Just note that adding 8 to the Variant pointer in this case yields a pointer that behaves exactly like a pointer to a statically typed String variable. In other words, the content of the String variable is a pointer to a BSTR structure, and the typed content of the Variant String is also a pointer to a BSTR structure; the only difference is that the typed (String) content of the Variant starts at byte offset 8 of the Variant as a whole. This parallels the previous example, where the content (a 32-bit integer value) of a statically typed Long was the same as the content of a Variant Long, except that the typed (Long) content of the Variant started at byte offset 8 of the Variant as a whole.
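To make the double indirection concrete, here is a small sketch that follows the pointer stored at offset 8 and copies the character bytes back out. It is an illustration only: PTR_SIZE is a local stand-in for the PTR_LENGTH constant used above, CopyMemory is a locally declared alias for RtlMoveMemory, the Sub name is hypothetical, and VBA7 is assumed.

```vba
Option Explicit

' Assumption: VBA7 or later, matching the LongPtr usage elsewhere in this post.
Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByRef Destination As Any, ByRef Source As Any, ByVal Length As LongPtr)

' Local stand-in for the PTR_LENGTH constant from the earlier posts.
#If Win64 Then
    Private Const PTR_SIZE As Long = 8
#Else
    Private Const PTR_SIZE As Long = 4
#End If

' Hypothetical demo: recover a Variant's string by following its BSTR pointer.
Public Sub VariantBstrSketch()
    Dim v       As Variant
    Dim ptrBSTR As LongPtr
    Dim byteLen As Long
    Dim buf()   As Byte
    Dim s       As String

    v = "Hello world!"

    ' Step 1: the pointer to the character data lives at offset 8 of the Variant.
    CopyMemory ptrBSTR, ByVal (VarPtr(v) + 8), PTR_SIZE

    ' Step 2: copy the UTF-16 bytes that the pointer refers to.
    byteLen = LenB(v)                  ' 24 bytes for 12 characters
    ReDim buf(0 To byteLen - 1)
    CopyMemory buf(0), ByVal ptrBSTR, byteLen

    s = buf                            ' a Byte array assigns directly to a String
    Debug.Print s                      ' Hello world!
End Sub
```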

Example 3: Static Array vs Variant Array
Code
Public Sub ArrayVarExample()

    Dim aLongs() As Long
    Dim ptrToaLongsVar As LongPtr
    Dim ptrToaLongsSAFEARRAY As LongPtr
    Dim ptrToaLongsData As LongPtr
    Dim ptrArrCursor As LongPtr
    
    Dim vLongs As Variant
    Dim ptrTovLongsVar As LongPtr
    Dim ptrTovLongsContent As LongPtr
    Dim ptrTovLongsSAFEARRAY As LongPtr
    Dim ptrTovLongsData As LongPtr
    Dim ptrVarCursor As LongPtr
    
    Dim i As Long, ub As Long
    Dim lValue As Long, vValue As Long
    Dim strFmt As String
    Dim pvDataOffset As Long
    
#If Win64 Then
    pvDataOffset = 16
#Else
    pvDataOffset = 12
#End If
    
    ub = 6
    ReDim aLongs(ub)
    For i = 0 To ub
        aLongs(i) = (i * (i + 1)) / 2
    Next
    
    ' Copy via value-type VBA array semantics
    vLongs = aLongs
    
    ptrToaLongsVar = VarPtrArray(aLongs)
    Mem_Copy ptrToaLongsSAFEARRAY, ByVal ptrToaLongsVar, PTR_LENGTH
    Mem_Copy ptrToaLongsData, ByVal (ptrToaLongsSAFEARRAY + pvDataOffset), PTR_LENGTH
    
    ptrTovLongsVar = VarPtr(vLongs)
    ptrTovLongsContent = ptrTovLongsVar + 8
    Mem_Copy ptrTovLongsSAFEARRAY, ByVal ptrTovLongsContent, PTR_LENGTH
    Mem_Copy ptrTovLongsData, ByVal (ptrTovLongsSAFEARRAY + pvDataOffset), PTR_LENGTH
    
    Debug.Print "Statically typed array"
    Debug.Print "  ptrToaLongsVar       = 0x"; HexPtr(ptrToaLongsVar)
    Debug.Print "  ptrToaLongsSAFEARRAY = 0x"; HexPtr(ptrToaLongsSAFEARRAY)
    Debug.Print "  ptrToaLongsData      = 0x"; HexPtr(ptrToaLongsData)
    
    Debug.Print "Variant array"
    Debug.Print "  ptrTovLongsVar       = 0x"; HexPtr(ptrTovLongsVar)
    Debug.Print "  ptrTovLongsContent   = 0x"; HexPtr(ptrTovLongsContent)
    Debug.Print "  ptrTovLongsSAFEARRAY = 0x"; HexPtr(ptrTovLongsSAFEARRAY)
    Debug.Print "  ptrTovLongsData      = 0x"; HexPtr(ptrTovLongsData)
    
    Debug.Print "Verify data"
    Debug.Print " Index  Longs() Variant"
    Debug.Print "------- ------- -------"
    
    ptrArrCursor = ptrToaLongsData
    ptrVarCursor = ptrTovLongsData
    strFmt = "@@@@@@  "
    For i = 0 To ub
        Mem_Copy lValue, ByVal ptrArrCursor, 4
        Mem_Copy vValue, ByVal ptrVarCursor, 4
        Debug.Print Format(i, strFmt); Format(lValue, strFmt); Format(vValue, strFmt)
        ptrArrCursor = ptrArrCursor + 4
        ptrVarCursor = ptrVarCursor + 4
    Next
    
End Sub
Output
Statically typed array
  ptrToaLongsVar       = 0x00427F40
  ptrToaLongsSAFEARRAY = 0x0042C110
  ptrToaLongsData      = 0x0C685800
Variant array               
  ptrTovLongsVar       = 0x00427F08
  ptrTovLongsContent   = 0x00427F10
  ptrTovLongsSAFEARRAY = 0x0042BE00
  ptrTovLongsData      = 0x0C685FE0
Verify data
 Index  Longs() Variant
------- ------- -------
     0       0       0  
     1       1       1  
     2       3       3  
     3       6       6  
     4      10      10  
     5      15      15  
     6      21      21  

Explanation

Once again I won’t explain each point in the example code and output. For a much more thorough discussion of pointers to arrays, see Array Variables and Pointers in Depth. The point here is that stepping 8 bytes forward into the VARIANT structure gives us the address of the Variant’s array content: a location that holds a pointer to a SAFEARRAY. That address behaves exactly like the pointer obtained by calling VarPtrArray on a statically typed array variable.
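One quick way to check that the pointer found at offset 8 really leads to a SAFEARRAY is to read a couple of header fields through it. The sketch below assumes VBA7 and the standard Windows SAFEARRAY layout, in which cDims is the first 2 bytes and cbElements is the 4 bytes at offset 4. PTR_SIZE is a local stand-in for PTR_LENGTH, CopyMemory is a locally declared alias for RtlMoveMemory, and the Sub name is hypothetical.

```vba
Option Explicit

' Assumption: VBA7 or later, matching the LongPtr usage elsewhere in this post.
Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByRef Destination As Any, ByRef Source As Any, ByVal Length As LongPtr)

' Local stand-in for the PTR_LENGTH constant from the earlier posts.
#If Win64 Then
    Private Const PTR_SIZE As Long = 8
#Else
    Private Const PTR_SIZE As Long = 4
#End If

' Hypothetical demo: read SAFEARRAY header fields through a Variant.
Public Sub VariantArrayHeaderSketch()
    Dim aLongs(0 To 6) As Long
    Dim v              As Variant
    Dim ptrSafeArray   As LongPtr
    Dim cDims          As Integer
    Dim cbElements     As Long

    v = aLongs                         ' the Variant receives its own copy

    ' Offset 8 of the Variant holds the pointer to the SAFEARRAY.
    CopyMemory ptrSafeArray, ByVal (VarPtr(v) + 8), PTR_SIZE

    ' SAFEARRAY header: cDims at offset 0 (2 bytes), cbElements at offset 4.
    CopyMemory cDims, ByVal ptrSafeArray, 2
    CopyMemory cbElements, ByVal (ptrSafeArray + 4), 4

    Debug.Print VarType(v)             ' 8195 = vbArray + vbLong
    Debug.Print cDims, cbElements      ' 1 dimension, 4 bytes per element
End Sub
```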

More on variant pointers…

We’re not done with pointers to Variants yet! Two categories remain: Decimals and ByRef Variants. Those topics are big enough for their own posts, which will follow in the near future.