-1

In an another post, the user Excellll provided a macro address the aforementioned question.

I have a worksheet which has data as below:

A                      B           C
abc,def,ghi,jkl      1,2,3     a1,e3,h5,j8

The following solution turns it into

abc  1  a1
abc  2  a1
abc  3  a1
abc  1  e3
abc  2  e3
abc  3  h5

However, I wanted to know how the macro can be modified as the number of columns of data grow from 3 columns of data to 10 columns of data.

I tried modifying the macro a number of times based upon the patterns in the code that I saw, but I kept getting an error.

Community
  • 1
  • 1
user1657410
  • 1
  • 1
  • 1
  • What you want to do is theoretically possible, but since you have made your own attempt, I would recommend you post your example, then identify the error and maybe even see where it is occurring. It is very likely someone will be able to help. – psubsee2003 Oct 06 '12 at 01:29

2 Answers2

1

I am a fan of recursion but only if I believe it provides the simpliest solution. I do not believe it is appropriate for this problem.

In the original question, UJ9 had:

Column    A                B         C
Row 1     abc,def,ghi,jkl  1,2,3     a1,e3,h5,j8

and wanted:

Column    A    B   C
Row  1    abc  1   a1
Row  2    abc  2   a1
Row  3    abc  3   a1
Row  4    abc  1   e3
Row  5    abc  2   e3
Row  6    abc  3   h5
 :
Row 48    jkl  3   j8

user1657410 wants the same but with 10 columns.

The solutions for the original problem use three (one per column) nested for-loops. Adapting those solutions for ten nested for-loops is possible but not an easy implementation. Let us consider the principle behind those solutions and then look for a different implementation strategy.

If we index the values in each column we get:

Column    A                B         C
Row 1     abc,def,ghi,jkl  1,2,3     a1,e3,h5,j8
Index     0   1   2   3    0 1 2     0  1  2  3

What the solutions do is generate every combination of index: 000 001 002 003 010 011 012 013 020 021 021 023 100 ... 323 and use the digits to select the appropriate substring from the appropriate string.

To adapt this approach for a larger number of columns we need to switch from nested for-loops to arrays with one entry per column. One array hold the maximum value of the index for the column and the other holds the currently selected index. The initial state would be something like:

Column               A    B    C    D    E    F    G    H    I    J
Maximum index array  4    3    4    4    3    2    6    3    4    2
Current index array  0    0    0    0    0    0    0    0    0    0

We now need a loop that will increment the Current index array like a speedometer except each column has its own maximum. That is, we want to add one to the rightmost element of the Current index array unless it is already at its maximum value. If it is at its maximum value, it is reset to zero and the next column to the left is incremented unless it is at its maximum value. This continues until the loop wants to increment the leftmost index past its maximum value. That is, we need a loop which will set the Current index array to the following values:

Column               A    B    C    D    E    F    G    H    I    J
Maximum index array  4    3    4    4    3    2    6    3    4    2
Current index array  0    0    0    0    0    0    0    0    0    0
                     0    0    0    0    0    0    0    0    0    1
                     0    0    0    0    0    0    0    0    0    2
                     0    0    0    0    0    0    0    0    1    0
                     0    0    0    0    0    0    0    0    1    1
                     0    0    0    0    0    0    0    0    1    2
                     0    0    0    0    0    0    0    0    2    0
                     0    0    0    0    0    0    0    0    2    1
                     0    0    0    0    0    0    0    0    2    2
                     0    0    0    0    0    0    0    0    3    0
                     0    0    0    0    0    0    0    0    3    1
                     0    0    0    0    0    0    0    0    3    2
                     0    0    0    0    0    0    0    1    0    0
       :      :
                     4    3    4    4    3    2    6    3    4    2

For each different value of the Current index array, you select the appropriate substring from each column and generate a row containing the substrings.

Before we go any further, are you sure you want to generate a row per combination of sub-string? With the maximum index values I selected for my example, you would get 2,520,000 rows.

The code below assumes the source row is row 1. It outputs the generated rows starting at row 3. This code generates a table like the one above so you can properly understand how the code works. Below this code are instructions to amend it to output substrings. The code adjusts to the number of columns in the source row. The code does not check that your version of Excel can support the number of rows generated.

Sub Combinations()

  Dim ColCrnt As Long
  Dim ColMax As Long
  Dim IndexCrnt() As Long
  Dim IndexMax() As Long
  Dim RowCrnt As Long
  Dim SubStrings() As String
  Dim TimeStart As Single

  TimeStart = Timer

  With Worksheets("Combinations")

    ' Use row 1 as the source row.  Find last used column.
    ColMax = .Cells(1, Columns.Count).End(xlToLeft).Column

    ' Size Index arrays according to number of columns
    ' Use one based arrays so entry number matches column number
    ReDim IndexCrnt(1 To ColMax)
    ReDim IndexMax(1 To ColMax)

    ' Initialise arrays
    For ColCrnt = 1 To ColMax
      SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
      ' SubStrings is a zero-based array with one entry
      ' per comma separated value.
      IndexMax(ColCrnt) = UBound(SubStrings)
      IndexCrnt(ColCrnt) = 0
    Next

    RowCrnt = 3     ' Output generated values starting at row 3

    Do While True

      ' Use IndexCrnt() here.
      ' For this version I output the index values
      For ColCrnt = 1 To ColMax
        ' This will generate an error if RowCrnt exceeds the maximum number
        ' of columns for your version of Excel.  
        .Cells(RowCrnt, ColCrnt).Value = IndexCrnt(ColCrnt)
      Next
      RowCrnt = RowCrnt + 1

      ' Increment values in IndexCrnt() from right to left
      For ColCrnt = ColMax To 1 Step -1
        If IndexCrnt(ColCrnt) < IndexMax(ColCrnt) Then
          ' This column's current index can be incremented
          IndexCrnt(ColCrnt) = IndexCrnt(ColCrnt) + 1
          Exit For
        End If
        If ColCrnt = 1 Then
          ' Leftmost column has overflowed.
          ' All combinations of index value have been generated.
          Exit Do
        End If
        IndexCrnt(ColCrnt) = 0
        ' Loop to increment next column
      Next

    Loop

  End With

  Debug.Print Format(Timer - TimeStart, "#,###.##")

End Sub

If you are happy that you understand the above code, replace:

      ' For this version I output the index values
      For ColCrnt = 1 To ColMax
        .Cells(RowCrnt, ColCrnt).Value = IndexCrnt(ColCrnt)
      Next

by:

      For ColCrnt = 1 To ColMax
        SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
        .Cells(RowCrnt, ColCrnt).Value = SubStrings(IndexCrnt(ColCrnt))
      Next

This revised code output the appropriate substring for each combination but it will be slow with large numbers of combination because it extracts the required substring from the source cell for every generated row. For example, it generates 27,648 rows in 12.66 seconds. The code below takes 9.15 seconds but uses a more advanced technique.

Step 1, replace:

  Dim SubStrings() As String

by:

  Dim SubStrings() As Variant

With Dim SubStrings() As String, SubString(N) can only contain a string. With Dim SubStrings() As Variant, SubString(N) can contain a string or an integer or a floating-point value. This is not good in most situations because a variant is slower to process than a string or a long and you will not be warned if you set it to the wrong sort of value for your code. However, I am going to store an array in SubString(N). I will be using what is called a ragged array because each row has a different number of columns.

Step 2, replace:

    ReDim IndexCrnt(1 To ColMax)
    ReDim IndexMax(1 To ColMax)

by:

    ReDim IndexCrnt(1 To ColMax)
    ReDim IndexMax(1 To ColMax)
    ReDim SubStrings(1 To ColMax)

Step 3, replace:

    ' Initialise arrays
    For ColCrnt = 1 To ColMax
      SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
      ' SubStrings is a zero-based array with one entry
      ' per comma separated value.
      IndexMax(ColCrnt) = UBound(SubStrings)
      IndexCrnt(ColCrnt) = 0
    Next

by:

    ' Initialise arrays
    For ColCrnt = 1 To ColMax
      SubStrings(ColCrnt) = Split(.Cells(1, ColCrnt).Value, ",")
      IndexMax(ColCrnt) = UBound(SubStrings(ColCrnt))
      IndexCrnt(ColCrnt) = 0
    Next

With the first version, I overwrite the array SubStrings everytime I split a cell. With the second version, I save each column's substrings. With the values used by UJ9 in the original question, the new SubString looks like:

        ---- Columns -----
Row     0    1    2    3  
  1     abc  def  ghi  jkl
  2     1    2    3
  3     a1   e3   h5   j8

Step 4: replace:

      For ColCrnt = 1 To ColMax
        SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
        .Cells(RowCrnt, ColCrnt).Value = SubStrings(IndexCrnt(ColCrnt))
      Next

by:

      For ColCrnt = 1 To ColMax
        .Cells(RowCrnt, ColCrnt).Value = SubStrings(ColCrnt)(IndexCrnt(ColCrnt))
      Next

With the revised code I do not split a source cell for every generated value. I extract the substring I require from the array.

Note: if you have ever used two dimensional arrays, you will have written something like MyArray(Row,Column). Ragged arrays are different; you write MyArray(Row)(Column).

Tony Dallimore
  • 12,335
  • 7
  • 32
  • 61
0

Here's a generalised solution that uses Recursion to handle any number of columns (greater than 1)

Sub Combinations()
    Dim aSrc As Variant

    ' Get Data into an array
    '  This section is an example to get the source data into an array
    '  Replace this section if your data is sourced differently.
    '  The required format of aSrc is Array(1 To NumberOfColumnsOfData)
    '  where each element aSrc(n) is Array(1 To NumberOfRowsInColumnN, 1 To 1) of Variant
    Dim rSrc As Range, colR As Range
    Dim sh As Worksheet
    Dim a As Variant
    Dim i As Long
    Set sh = ActiveSheet ' <-- Adjust to suit
    Set rSrc = sh.[A:D]  ' <-- Adjust to suit
    ReDim aSrc(1 To rSrc.Columns.Count)
    With sh
        For i = 1 To rSrc.Columns.Count
            Set colR = rSrc.Columns(i)
            aSrc(i) = .Range(colR.Cells(1, 1), colR.Cells(.Rows.Count, 1).End(xlUp))
        Next
    End With

    ' Generate output
    '  This populates aDst(1 To lSize, 1 To NumberOfSourceColumns)
    '  where lSize is total number of combinations
    Dim aDst As Variant
    Dim lSize As Long
    Dim n As Long
    Dim aBase() As String
    lSize = 1
    For i = 1 To UBound(aSrc)
        lSize = lSize * UBound(aSrc(i), 1)
    Next
    ReDim aDst(1 To lSize, 1 To UBound(aSrc))
    ReDim aBase(0 To UBound(aSrc) - 1)
    n = 1
    aBase = Split(String(UBound(aSrc) - 1, ","), ",")
    aBase(0) = aSrc(1)(1, 1)
    Generate aSrc, aDst, aBase, 1, n

    ' Place output into sheet
    '   Starting at cell rDst
    Dim rDst As Range
    Set rDst = [E1]  ' <-- Adjust to suit
    Set rDst = rDst.Resize(UBound(aDst, 1), UBound(aDst, 2))
    rDst = aDst

End Sub

Private Sub Generate(ByRef aSrc As Variant, ByRef aDst As Variant, ByRef aBase As Variant, ByVal pCol As Long, ByRef pDst As Long)
    Dim i As Long, j As Long
    If pCol = UBound(aSrc) Then
        ' If iterating the last source column, output to aDst
        For i = 1 To UBound(aSrc(pCol), 1)
            For j = 1 To UBound(aBase)
                aDst(pDst, j) = aBase(j - 1)
            Next
            aDst(pDst, j) = aSrc(pCol)(i, 1)
            pDst = pDst + 1
        Next
    Else
        ' If NOT iterating the last source column, aBase and call Generate again
        For i = 1 To UBound(aSrc(pCol), 1)
            aBase(pCol - 1) = aSrc(pCol)(i, 1)
            Generate aSrc, aDst, aBase, pCol + 1, pDst
        Next
    End If
End Sub
chris neilsen
  • 52,446
  • 10
  • 84
  • 123
  • I'm not saying I don't understand it, but just in case you are not sufficiently distant from this code to know this... the comments and variable names are not sufficiently clear for me to have any idea what your program does without reading the code. Particularly the Generate sub. – Daniel Sep 09 '12 at 02:53