VBA Internals: Array Variables and Pointers in Depth

June 5, 2013 in VBA

This is the next installment of a series of deep-dives into the structure and implementation of variables in Visual Basic for Applications. For the previous posts, see the following:

In this post, I will cover the details of array variables and pointers. See Scalar Variables and Pointers in Depth for additional background and for the code for the utility functions HexPtr and Mem_ReadHex.

Pointers and memory for array variables

Like strings, arrays in VBA are treated semantically like value types but are implemented as reference types. Also like strings, arrays in VBA are implemented using a COM automation structure. For arrays the supporting COM type is the safe array, which comes with a large group of utility functions.

The SAFEARRAY structure

To be specific, safe arrays are implemented with a SAFEARRAY structure (which itself contains one or more SAFEARRAYBOUND structures) and a data vector. The SAFEARRAY is a short structure which contains information about the array data, but no actual array content. When I write SAFEARRAY in all capitals I am referring specifically to the SAFEARRAY structure, which is just the “header” or metadata for the safe array object as a whole. When I write “safe array” in lower case I am referring to the array object as a whole.

SAFEARRAYs are somewhat tricky because their size depends both on processor architecture (32- or 64-bit) and on the number of dimensions in the array. The VBA declaration for a 1-dimensional SAFEARRAY structure looks like this:

Private Type SAFEARRAYBOUND
    cElements    As Long
    lLbound      As Long
End Type

Private Type SAFEARRAY
    cDims        As Integer
    fFeatures    As Integer
    cbElements   As Long
    cLocks       As Long
    pvData       As LongPtr
    rgsabound(0) As SAFEARRAYBOUND
End Type

Note that the second-to-last field of the SAFEARRAY is a LongPtr, which is 4 bytes on a 32-bit system but 8 bytes on a 64-bit system. The last field is a fixed array of SAFEARRAYBOUND structures. The real COM type can have any number of SAFEARRAYBOUND elements, one for each dimension of the safe array. To lay the bounds out inline in a VBA user-defined type, however, the array must be declared with a fixed number of elements. This is because fixed-size arrays inside user-defined types are not “real” arrays. That is, they are not implemented as safe arrays, but are instead simple value vectors like C-style arrays. The size of a UDT must be fixed at compile time, so the size of any inline array in a UDT must also be fixed. Which is all to say, if you want to examine the SAFEARRAY headers of arrays with more than one dimension, you will either have to do some manual pointer arithmetic or declare multiple versions of a SAFEARRAY UDT, each with a different number of elements in the rgsabound field.
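To make the size dependence concrete, here is a minimal sketch that computes the header size from the pointer size and the number of dimensions. The name SafeArrayHeaderSize is my own, not part of VBA or COM, and the arithmetic assumes the usual layout in which pvData is aligned to the pointer size (which adds 4 bytes of padding after cLocks on 64-bit).

```vba
Option Explicit

' Hypothetical helper, not part of VBA or the COM API.
' dimensionCount: number of array dimensions (cDims)
' pointerSize:    4 on 32-bit Office, 8 on 64-bit Office
Public Function SafeArrayHeaderSize(ByVal dimensionCount As Long, _
                                    ByVal pointerSize As Long) As Long
    ' cDims (2) + fFeatures (2) + cbElements (4) + cLocks (4)
    Const FIXED_FIELDS_SIZE As Long = 12
    ' cElements (4) + lLbound (4)
    Const BOUND_SIZE As Long = 8
    Dim pvDataOffset As Long

    ' Assumption: pvData starts at the next multiple of the pointer size
    pvDataOffset = ((FIXED_FIELDS_SIZE + pointerSize - 1) \ pointerSize) * pointerSize
    SafeArrayHeaderSize = pvDataOffset + pointerSize + BOUND_SIZE * dimensionCount
End Function

Public Sub SafeArrayHeaderSizeDemo()
    Debug.Print SafeArrayHeaderSize(1, 4)   ' 24: 1 dimension, 32-bit
    Debug.Print SafeArrayHeaderSize(1, 8)   ' 32: 1 dimension, 64-bit
    Debug.Print SafeArrayHeaderSize(3, 4)   ' 40: 3 dimensions, 32-bit
End Sub
```

For the 1-dimensional UDT declared above, the result should agree with LenB of a SAFEARRAY variable on the same platform.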

VBA Arrays: Pointers to pointers to pointers

Technically, a SAFEARRAY header can be pointed at any vector of actual data just by changing the value of the pvData field. But VBA adds yet another layer of indirection: the content of an array variable is not a SAFEARRAY header, but a pointer to a SAFEARRAY header. This is distinct from UDT variables, which contain the entire structure in the variable’s content. Put another way, calling VarPtr on a UDT-typed variable will get you the address of the start of the UDT structure, but calling VarPtr on an array variable will get you the address of yet another pointer.

I think diagrams make this all a lot clearer, so here we go with the sample code and diagrams for a simple array of Longs. Note that in order to get the pointer to an actual array variable, you need to manually declare a different signature for the VarPtr function, traditionally aliased as “VarPtrArray”. See Getting Pointers for more details.

Private Declare PtrSafe Function VarPtrArray Lib "VBE7" Alias _
    "VarPtr" (ByRef Var() As Any) As LongPtr

Private Type SAFEARRAYBOUND
    cElements    As Long
    lLbound      As Long
End Type

Private Type SAFEARRAY
    cDims        As Integer
    fFeatures    As Integer
    cbElements   As Long
    cLocks       As Long
    pvData       As LongPtr
    rgsabound(0) As SAFEARRAYBOUND
End Type

Sub ArrayPtrExample()
    Dim aLongs() As Long, i As Long
    Dim ptrToArrayVar As LongPtr
    Dim ptrToSafeArray As LongPtr
    Dim ptrToArrayData As LongPtr
    Dim ptrCursor As LongPtr
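    Dim lngValue As Long
    ' Receives a copy of the SAFEARRAY header
    Dim uSAFEARRAY As SAFEARRAY
    ' CopyMemory (RtlMoveMemory) and PTR_LENGTH (4 on 32-bit, 8 on 64-bit)
    ' are assumed to be declared elsewhere in the module
    ReDim aLongs(3 To 12)
    For i = 3 To 12
        ' Triangular number: 1 + 2 + ... + i
        aLongs(i) = i * (i + 1) / 2
    Next i
    ' Get pointer to array *variable*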
    ptrToArrayVar = VarPtrArray(aLongs)
    ' Get the pointer to the *SAFEARRAY* by directly
    ' reading the memory at the variable's address
    CopyMemory ptrToSafeArray, ByVal ptrToArrayVar, PTR_LENGTH
    ' Read the SAFEARRAY struct
    CopyMemory uSAFEARRAY, ByVal ptrToSafeArray, LenB(uSAFEARRAY)
    ' Get the pointer to the actual vector of longs
    ptrToArrayData = uSAFEARRAY.pvData
    Debug.Print " ptrToArrayVar  : 0x"; HexPtr(ptrToArrayVar)
    Debug.Print "*ptrToArrayVar  : 0x"; Mem_ReadHex(ptrToArrayVar, PTR_LENGTH)
    Debug.Print " ptrToSafeArray : 0x"; HexPtr(ptrToSafeArray)
    Debug.Print "*ptrToSafeArray : 0x"; Mem_ReadHex(ptrToSafeArray, LenB(uSAFEARRAY))
    Debug.Print " ptrToArrayData : 0x"; HexPtr(uSAFEARRAY.pvData)
    Debug.Print "*ptrToArrayData : 0x"; Mem_ReadHex(uSAFEARRAY.pvData, 40)
    ' Demonstrate pointer arithmetic on value vector
    ptrCursor = ptrToArrayData
    For i = 0 To 9
        ' Fetch the Long value
        CopyMemory lngValue, ByVal ptrCursor, 4
        ' Print the pointer and its dereferenced value
        Debug.Print "ptrToArrayData[" & i & "] : 0x"; HexPtr(ptrCursor); _
                    " : 0x"; Hex$(lngValue); " = "; lngValue
        ' Increment the pointer
        ptrCursor = ptrCursor + 4
    Next i
End Sub

Running this prints the following on my machine (the addresses will differ on yours):

 ptrToArrayVar  : 0x0036EEF0
*ptrToArrayVar  : 0x80DB4600
 ptrToSafeArray : 0x0046DB80
*ptrToSafeArray : 0x010080000400000000000000A0DB46000A00000003000000
 ptrToArrayData : 0x0046DBA0
*ptrToArrayData : 0x060000000A0000000F000000150000001C000000240000002D00000037000000420000004E000000
ptrToArrayData[0] : 0x0046DBA0 : 0x6 =  6 
ptrToArrayData[1] : 0x0046DBA4 : 0xA =  10 
ptrToArrayData[2] : 0x0046DBA8 : 0xF =  15 
ptrToArrayData[3] : 0x0046DBAC : 0x15 =  21 
ptrToArrayData[4] : 0x0046DBB0 : 0x1C =  28 
ptrToArrayData[5] : 0x0046DBB4 : 0x24 =  36 
ptrToArrayData[6] : 0x0046DBB8 : 0x2D =  45 
ptrToArrayData[7] : 0x0046DBBC : 0x37 =  55 
ptrToArrayData[8] : 0x0046DBC0 : 0x42 =  66 
ptrToArrayData[9] : 0x0046DBC4 : 0x4E =  78 
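
Mem_ReadHex prints bytes in memory order, which is why *ptrToArrayVar reads 0x80DB4600 while ptrToSafeArray is 0x0046DB80: on a little-endian machine the least significant byte comes first. A small sketch of the conversion follows. ReverseHexBytes is a hypothetical helper of my own, and it assumes the input has an even number of hex digits.

```vba
Option Explicit

' Hypothetical helper: reverses the byte order of a hex string, so a
' little-endian memory dump can be read as a normal number.
Public Function ReverseHexBytes(ByVal memoryOrderHex As String) As String
    Dim i As Long
    Dim result As String

    ' Walk backwards two hex digits (one byte) at a time.
    ' Assumes Len(memoryOrderHex) is even.
    For i = Len(memoryOrderHex) - 1 To 1 Step -2
        result = result & Mid$(memoryOrderHex, i, 2)
    Next i
    ReverseHexBytes = result
End Function

Public Sub ReverseHexBytesDemo()
    Debug.Print "0x" & ReverseHexBytes("80DB4600")   ' 0x0046DB80
    Debug.Print "0x" & ReverseHexBytes("A0DB4600")   ' 0x0046DBA0
End Sub
```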


The variable table in this case is simple:

Name     Type     Address
aLongs   Long()   0x0036EEF0

But we have to take a number of hops to get from the variable to the actual Long values that make up the array. Along the way, we get the content of the SAFEARRAY structure. Here’s how it all maps out in detail. As always, the byte order and pointer size depend on the architecture; in my case it’s 32-bit Office on a little-endian Intel processor:

Content of the aLongs variable (a pointer to the SAFEARRAY):

Address      0  1  2  3
0x0036EEFx   80 DB 46 00
             = 0x0046DB80

SAFEARRAY header at 0x0046DB80:

Address      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x0046DB8x   01 00 80 00 04 00 00 00 00 00 00 00 A0 DB 46 00
0x0046DB9x   0A 00 00 00 03 00 00 00

Field        Bytes         Value
cDims        01 00         = 0x0001 ⇒ 1 dimension
fFeatures    80 00         = 0x0080 ⇒ Feature flags (see note 1)
cbElements   04 00 00 00   = 0x00000004 ⇒ 4 bytes per element
cLocks       00 00 00 00   = 0x00000000 ⇒ No locks on array
pvData       A0 DB 46 00   = 0x0046DBA0 ⇒ Pointer to data vector
rgsabound[0] (SAFEARRAYBOUND):
cElements    0A 00 00 00   = 0x0000000A = 10 (decimal) ⇒ 10 elements in bound
lLbound      03 00 00 00   = 0x00000003 ⇒ Lower bound is 3

Data vector at 0x0046DBA0:

Address      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x0046DBAx   06 00 00 00 0A 00 00 00 0F 00 00 00 15 00 00 00
0x0046DBBx   1C 00 00 00 24 00 00 00 2D 00 00 00 37 00 00 00
0x0046DBCx   42 00 00 00 4E 00 00 00
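
The same hops work for arrays with more than one dimension if you walk the bounds with pointer arithmetic instead of a fixed UDT. The following is a minimal sketch under stated assumptions: DescribeBounds, PTR_SIZE and BOUNDS_OFFSET are names I made up for this example, the CopyMemory declaration is the usual RtlMoveMemory alias, and the offsets assume the header layout described earlier (pvData aligned to the pointer size). As far as I know, the header stores the bounds in reverse order of declaration, so check the printed values rather than assuming the order.

```vba
Option Explicit

' Assumed declarations for this sketch
Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByRef Destination As Any, ByRef Source As Any, ByVal Length As LongPtr)
Private Declare PtrSafe Function VarPtrArray Lib "VBE7" Alias _
    "VarPtr" (ByRef Var() As Any) As LongPtr

#If Win64 Then
    Private Const PTR_SIZE As Long = 8
    Private Const BOUNDS_OFFSET As Long = 24   ' 12 bytes + 4 padding + 8-byte pvData
#Else
    Private Const PTR_SIZE As Long = 4
    Private Const BOUNDS_OFFSET As Long = 16   ' 12 bytes + 4-byte pvData
#End If

' Hypothetical helper: lists cDims and every SAFEARRAYBOUND of a Long array
Public Function DescribeBounds(ByRef values() As Long) As String
    Dim ptrToSafeArray As LongPtr
    Dim ptrToBound As LongPtr
    Dim dimensionCount As Integer
    Dim elementCount As Long
    Dim lowerBound As Long
    Dim i As Long
    Dim result As String

    ' Hop 1: array variable -> pointer to the SAFEARRAY header
    CopyMemory ptrToSafeArray, ByVal VarPtrArray(values), PTR_SIZE
    If ptrToSafeArray = 0 Then
        DescribeBounds = "unallocated array"
        Exit Function
    End If

    ' cDims is the first 2 bytes of the header
    CopyMemory dimensionCount, ByVal ptrToSafeArray, 2
    result = "cDims = " & dimensionCount

    ' Hop 2: walk the rgsabound entries, 8 bytes each
    ptrToBound = ptrToSafeArray + BOUNDS_OFFSET
    For i = 0 To dimensionCount - 1
        CopyMemory elementCount, ByVal ptrToBound, 4
        ptrToBound = ptrToBound + 4
        CopyMemory lowerBound, ByVal ptrToBound, 4
        ptrToBound = ptrToBound + 4
        result = result & vbCrLf & "rgsabound(" & i & "): cElements = " & _
                 elementCount & ", lLbound = " & lowerBound
    Next i
    DescribeBounds = result
End Function

Public Sub DescribeBoundsDemo()
    Dim grid() As Long
    ReDim grid(1 To 2, 5 To 9)
    Debug.Print DescribeBounds(grid)
End Sub
```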

1 See the fFeatures section of the SAFEARRAY structure documentation.
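
If you want to decode fFeatures yourself, it is a bit field built from the FADF_* constants in the COM headers. Here is a minimal sketch. DescribeFeatures is a hypothetical helper of my own, and it covers only the documented flags, ignoring the reserved bits. For example, 0x0080, the value I would expect for a dynamic array of Longs, decodes to FADF_HAVEVARTYPE.

```vba
Option Explicit

' Hypothetical helper: turns an fFeatures value into a list of FADF_* names
Public Function DescribeFeatures(ByVal fFeatures As Integer) As String
    Dim flagNames As Variant
    Dim flagValues As Variant
    Dim i As Long
    Dim result As String

    flagNames = Array("FADF_AUTO", "FADF_STATIC", "FADF_EMBEDDED", _
                      "FADF_FIXEDSIZE", "FADF_RECORD", "FADF_HAVEIID", _
                      "FADF_HAVEVARTYPE", "FADF_BSTR", "FADF_UNKNOWN", _
                      "FADF_DISPATCH", "FADF_VARIANT")
    flagValues = Array(&H1, &H2, &H4, _
                       &H10, &H20, &H40, _
                       &H80, &H100, &H200, _
                       &H400, &H800)

    ' Test each documented bit in turn
    For i = LBound(flagNames) To UBound(flagNames)
        If (fFeatures And flagValues(i)) <> 0 Then
            If Len(result) > 0 Then result = result & " | "
            result = result & flagNames(i)
        End If
    Next i

    If Len(result) = 0 Then result = "(none)"
    DescribeFeatures = result
End Function

Public Sub DescribeFeaturesDemo()
    Debug.Print DescribeFeatures(&H80)   ' FADF_HAVEVARTYPE
    Debug.Print DescribeFeatures(&H92)   ' FADF_STATIC | FADF_FIXEDSIZE | FADF_HAVEVARTYPE
End Sub
```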