Sunday, 19 August 2012

Join and Split functions for 2-Dimensional arrays

Here's something I fished out of the attic and posted into StackOverflow...

The code's trivial, in the sense that anyone can do a bit of string-concatenation and a Redim() statement and you've probably done some kind of 'Join' and 'Split' already. But there are a couple of points about efficient string-handling in the comments; or rather, overcoming the inefficiencies of a language which has no string-builder class.

Someday, you are going to find out that concatenating strings slows down *severely* for long strings, and you'll need to know how to work around that.

So, without further ado:

Join2d: A 2-Dimensional Join function in VBA with optimised string-handling

Coding notes:
  1. This 'Join' function does not suffer from the 255-char limitation that affects most (if not all) of the native Concatenate functions in Excel, and the Range.Value code sample above will pass in the data, in full, from cells containing longer strings.
  2. This is heavily optimised: we use string-concatenation as little as possible, as the native VBA string-concatenations are slow and get progressively slower as a longer string is concatenated.
If you want to look more deeply into optimising string-handling in VBA and the VB family of languages, advanced techniques are listed in parts I, II and II of this web article: http://www.aivosto.com/vbtips/stringopt3.html

The biggest performance gain available in native VBA is to avoid allocation and concatenation ( here's why: http://www.aivosto.com/vbtips/stringopt2.html#huge ) - so I use join, split, and replace instead of myString = MyString & MoreString

Bigger gains are available if you use the Kernel string functions directly: after that, you're Googling for LightningStrings and taking the big step into pointer arithmentic... Which I consider a step too far: if you need that kind of performance, you need another platform.



Public Function Join2d(ByRef InputArray As Variant, _ 
                       Optional RowDelimiter As String = vbCr, _ 
                       Optional FieldDelimiter = vbTab,_ 
                       Optional SkipBlankRows As Boolean = False _ 
                       ) As String

' Join up a 2-dimensional array into a string. Works like the standard
'  VBA.Strings.Join, for a 2-dimensional array.
' Note that the default delimiters are those inserted into the string
'  returned by ADODB.Recordset.GetString

On Error Resume Next

' Coding note: we're not doing any string-handling in VBA.Strings - 
' allocating, deallocating and (especially!) concatenating are SLOW.
' We're using the VBA Join & Split functions ONLY. The VBA Join,
' Split, & Replace functions are linked directly to fast (by VBA
' standards) functions in the native Windows code. Feel free to 
' optimise further by declaring and using the Kernel string functions
' if you want to.

' ** THIS CODE IS IN THE PUBLIC DOMAIN **
'   Nigel Heffernan   Excellerando.Blogspot.com

Dim i As Long
Dim j As Long

Dim i_lBound As Long
Dim i_uBound As Long
Dim j_lBound As Long
Dim j_uBound As Long

Dim arrTemp1() As String
Dim arrTemp2() As String

Dim strBlankRow As String

i_lBound = LBound(InputArray, 1)
i_uBound = UBound(InputArray, 1)

j_lBound = LBound(InputArray, 2)
j_uBound = UBound(InputArray, 2)

ReDim arrTemp1(i_lBound To i_uBound)
ReDim arrTemp2(j_lBound To j_uBound)

For i = i_lBound To i_uBound

    For j = j_lBound To j_uBound
        arrTemp2(j) = InputArray(i, j)
    Next j

    arrTemp1(i) = Join(arrTemp2, FieldDelimiter)

Next i

If SkipBlankRows Then

    If Len(FieldDelimiter) = 1 Then
        strBlankRow = String(j_uBound - j_lBound, FieldDelimiter)
    Else
        For j = j_lBound To j_uBound
            strBlankRow = strBlankRow & FieldDelimiter
        Next j
    End If

    Join2d = Replace(Join(arrTemp1, RowDelimiter), strBlankRow, RowDelimiter, "")
    i = Len(strBlankRow & RowDelimiter)

    If Left(Join2d, i) = strBlankRow & RowDelimiter Then
        Mid$(Join2d, 1, i) = ""
    End If

Else

    Join2d = Join(arrTemp1, RowDelimiter)    

End If

Erase arrTemp1

End Function
For completeness, here's the corresponding 2-D Split function:

Split2d: A 2-Dimensional Split function in VBA with optimised string-handling



Public Function Split2d(ByRef strInput As String, _ 
                        Optional RowDelimiter As String = vbCr, _ 
                        Optional FieldDelimiter = vbTab, _ 
                        Optional CoerceLowerBound As Long = 0 _ 
                        ) As Variant

' Split up a string into a 2-dimensional array. 

' Works like VBA.Strings.Split, for a 2-dimensional array.
' Check your lower bounds on return: never assume that any array in
' VBA is zero-based, even if you've set Option Base 0
' If in doubt, coerce the lower bounds to 0 or 1 by setting 
' CoerceLowerBound
' Note that the default delimiters are those inserted into the
'  string returned by ADODB.Recordset.GetString

On Error Resume Next

' Coding note: we're not doing any string-handling in VBA.Strings -
' allocating, deallocating and (especially!) concatenating are SLOW.
' We're using the VBA Join & Split functions ONLY. The VBA Join,
' Split, & Replace functions are linked directly to fast (by VBA
' standards) functions in the native Windows code. Feel free to 
' optimise further by declaring and using the Kernel string functions
' if you want to.

' ** THIS CODE IS IN THE PUBLIC DOMAIN **
'    Nigel Heffernan   Excellerando.Blogspot.com

Dim i   As Long
Dim j   As Long

Dim i_n As Long
Dim j_n As Long

Dim i_lBound As Long
Dim i_uBound As Long
Dim j_lBound As Long
Dim j_uBound As Long

Dim arrTemp1 As Variant
Dim arrTemp2 As Variant

arrTemp1 = Split(strInput, RowDelimiter)

i_lBound = LBound(arrTemp1)
i_uBound = UBound(arrTemp1)

If VBA.LenB(arrTemp1(i_uBound)) <= 0 Then  
    ' clip out empty last row: a common artifact in data 
     'loaded from files with a terminating row delimiter
    i_uBound = i_uBound - 1
End If

i = i_lBound
arrTemp2 = Split(arrTemp1(i), FieldDelimiter)

j_lBound = LBound(arrTemp2)
j_uBound = UBound(arrTemp2)

If VBA.LenB(arrTemp2(j_uBound)) <= 0 Then 
 ' ! potential error: first row with an empty last field...
    j_uBound = j_uBound - 1
End If

i_n = CoerceLowerBound - i_lBound
j_n = CoerceLowerBound - j_lBound

ReDim arrData(i_lBound + i_n To i_uBound + i_n, j_lBound + j_n To j_uBound + j_n)

' As we've got the first row already... populate it
' here, and start the main loop from lbound+1

For j = j_lBound To j_uBound
    arrData(i_lBound + i_n, j + j_n) = arrTemp2(j)
Next j

For i = i_lBound + 1 To i_uBound Step 1

    arrTemp2 = Split(arrTemp1(i), FieldDelimiter)

    For j = j_lBound To j_uBound Step 1

        arrData(i + i_n, j + j_n) = arrTemp2(j)

    Next j

    Erase arrTemp2

Next i

Erase arrTemp1

Application.StatusBar = False

Split2d = arrData

End Function
Share and enjoy... And watch out for unwanted line breaks in the code, inserted by your browser (or by Blogger's helpful formatting functions)

Wednesday, 1 August 2012

Adding a month to the end of the month: another Excel annoyance:

Ever see a column of month-end payment days do this in a spreadsheet?
 31/08/2005  30/11/2005  28/02/2006  28/05/2006  28/08/2006  28/11/2006  28/02/2007  28/05/2007  28/08/2007  28/11/2007  28/02/2008

Sigh. All VBA developers eventually face the weary task of correcting the VBA.DateTime function library because of this loathsome miscoding by Microsoft:



' Special handling required for adding months at EOM:
' VBA.DateAdd("m", 1, "28 Feb 2006") = 28/03/2006 (!)
' Business logic is ALWAYS that adding a month to EOM
' gives the end of the following month - 31 Mar 2006.

Here's my solution: I guess you've all got one of your own by now.

The usual health warning applies to ATTRIBUTE statements: they are not recognised by the VBA editor, you have to drag-and-drop the entire module out of the VB IDE, insert the statements manually in notepad, and drag the object back. In case you didn't know, the VB_ProcData attribute places AddDate in the 'Date & Time' category of the Spreadsheet Function Wizard, instead of letting it languish in obscurity under 'User-Defined'.


Public Function AddDate( _
        ByVal DateString As String, _
        Optional ByVal ReferenceDate As Date _
        Optional Subtract As Boolean = False _
        ) As Date
'ATTRIBUTE AddDate.VB_Description="Add a datestring of the form '1m', '10d' or '5y' to the reference date. \r\nBy default the reference date is the current date. \r\nInteger dates only: time expressed as fractional days is discarded. \r\nAll addition and subtraction uses Actual/Actual: no other date convention is implemented.
'ATTRIBUTE AddDate.VB_ProcData.VB_Invoke_Func = " \n2"

'Nigel Heffernan 2001

'THIS CODE IS IN THE PUBLIC DOMAIN

'Add a datestring of the form '1m', '10d' or '5y' to the reference date.
'By default the reference date is the current date.
'Integer dates only: time expressed as fractional days is discarded.
'All addition and subtraction uses Actual/Actual: no other date convention is implemented.

Const VB_HELPFILE As String = "C:\PROGRA~1\COMMON~1\MICROS~1\VBA\VBA6\1033\VbLR6.chm"
' I'm too lazy to do the proper registry lookup for this help file.

On Error GoTo ErrSub

Dim sNum As String
Dim iLen As Integer
Dim i As Long
Dim strLabel As String

If ReferenceDate = 0 Then
    ReferenceDate = Date
End If

DateString = Trim(UCase(DateString))
DateString = Left(DateString, 16)

If DateString = "SPOT" Then

    DateString = "2" 'Spot price - 'zero-day' plus settlement lag
    strLabel = "d"

ElseIf DateString = "OVERNIGHT" Then

    DateString = "1"
    strLabel = "d"

ElseIf DateString = "O/N" Then

    DateString = "1"
    strLabel = "d"

ElseIf DateString = "DAILY" Then

    DateString = "1"
    strLabel = "d"

ElseIf DateString = "WEEKLY" Then

    DateString = "7"
    strLabel = "d"

ElseIf DateString = "ANNUAL" Then

    DateString = "1"
    strLabel = "yyyy"  ' Year

ElseIf DateString = "YEARLY" Then

    DateString = "1"
    strLabel = "yyyy"  ' Year

ElseIf DateString = "MONTHLY" Then

    DateString = "1"
    strLabel = "m"

ElseIf DateString = "QUARTERLY" Then

    DateString = "3"
    strLabel = "m"

ElseIf DateString = "SEMI-ANNUAL" Then

    DateString = "6"
    strLabel = "m"

ElseIf DateString = "SEMIANNUAL" Then

    DateString = "6"
    strLabel = "m"

ElseIf InStr(DateString, "MONTH") Then

    iLen = InStr(DateString, "M")
    strLabel = "m"      ' Month"

ElseIf InStr(DateString, "YEAR") Then

    iLen = InStr(DateString, "Y")
    strLabel = "yyyy"  ' Year"

ElseIf InStr(DateString, "DAY") Then

    iLen = InStr(DateString, "D")
    strLabel = "d"      ' Day"

ElseIf InStr(DateString, "M") Then

    iLen = InStr(DateString, "M")
    strLabel = "m"      ' Month"

ElseIf InStr(DateString, "Y") Then

    iLen = InStr(DateString, "Y")
    strLabel = "yyyy"  ' Year"

ElseIf InStr(DateString, "D") Then

    iLen = InStr(DateString, "D")
    strLabel = "d"      ' Day"

ElseIf InStr(DateString, "Q") Then

    iLen = InStr(DateString, "Q")
    strLabel = "q"      ' Quarter"

ElseIf InStr(DateString, "W") Then

    iLen = InStr(DateString, "W")
    strLabel = "ww"     ' Week"

ElseIf IsNumeric(DateString) Then

    iLen = Len(DateString)
    strLabel = "d"      ' Day"

Else

    GoTo ErrSub

End If

sNum = Trim(Left(DateString, iLen - 1))

If Not IsNumeric(sNum) Then

    'Trim down until we reach a number

    Do Until IsNumeric(sNum) Or Len(sNum) < 1

        sNum = Left(sNum, Len(sNum) - 1)
        sNum = Trim(sNum)

        'Do not read "5-Year" as "Minus five years"

        If Right(sNum, 1) = "-" Then

            sNum = Left(sNum, Len(sNum) - 1)
            sNum = Trim(sNum)

        End If

    Loop

End If

If Len(sNum) < 1 Then
    GoTo ErrSub

End If

If Not IsNumeric(sNum) Then

    GoTo ErrSub

End If

i = CLng(sNum)

If Subtract Then

    i = -1 * i

End If



' Special handling required for adding months at EOM:
' VBA.DateAdd("m", 1, "28 Feb 2006") = 28/03/2006 (!)
' Business logic is ALWAYS that adding a month to EOM
' gives the end of the following month - 31 Mar 2006.

If strLabel = "m" Then

    If Month(ReferenceDate) <> Month(ReferenceDate) + 1 Then 'EOM detected

        ReferenceDate = ReferenceDate + 1
        AddDate = DateAdd(strLabel, i, ReferenceDate)
        AddDate = AddDate - 1

    Else

        AddDate = DateAdd(strLabel, i, ReferenceDate)

    End If

Else

    AddDate = DateAdd(strLabel, i, ReferenceDate)

End If

ExitSub:

    Exit Function

ErrSub:

    If Len(Dir(VB_HELPFILE)) > 0 Then

        Err.Raise 13, "AddDate Function", _
         "'" & DateString & "'" & " was not recognised as date interval." & vbCrLf _
         & vbCrLf _
         & "Try typing '10d', '3m' or '5y', or the date " & vbCrLf _
         & "interval as a number of calendar days.", _
         VB_HELPFILE, 1000013

    Else

        Err.Raise 13, "AddDate Function", _
         "'" & DateString & "'" & " was not recognised as date interval." & vbCrLf _
         &  _
        vbCrLf _
         & "Try typing '10d', '3m' or '5y', or the date " &  _
        vbCrLf _
         & "interval as a number of calendar days."

    End If

     

End Function