0

I have variables and the values of the variables as per the table below.

Required variables and values of the variables

I would like to write code that produces the desired output below. I.e. a single row for each combination of values.

I need code that works for a variable number of variables. In this case there are 3 variables but I need something that works when the user specifies the number of variables and the name/values of each variable.

Desired output

Zain
  • 95
  • 6
  • 1. can you show us what you tried by yourself? Even where and how you searched for... Then, I cannot understand the logic you try expressing in the example suggesting what you need. What 0 and 1 should be on the "Smoker" variable (which does not have any one between its values - in the definition table). Can you better explain **in words** what you try accomplishing and **based on what logic**? Please, clarify this aspect. – FaneDuru Jul 12 '23 at 08:46
  • I don't know where to start. My only idea is to loop through each variable but as I said there will be a a variable number of variables. I can't do a variable number of loops. I want to produce output for each COMBINATION of the input values. Does the "Desired Output" not explain what I need? Don't worry about the name of the 3 variables - they could be A, B and C. – Zain Jul 12 '23 at 09:34
  • You appear to be trying to replicate a "cross-join" or "Cartesian Product". This starting point might help https://superuser.com/questions/106156/how-can-i-create-a-cross-join-in-excel – CHill60 Jul 12 '23 at 10:18
  • Previously: https://stackoverflow.com/questions/19780016/vba-write-all-possible-combinations-of-4-columns-of-data/19780307#19780307 – Tim Williams Jul 12 '23 at 15:19

3 Answers3

3

Please, test the next code and send some feedback. It uses arrays and should be fast enough for reasonable range to be processed. It is able to process as columns you need, up to the CPU processing limitation:

Sub combinations()
  Dim ws As Worksheet, rng As Range, arr2D1(1 To 1, 1 To 1)
  Dim arrCol, arrNo, arrFin, countR As Long, i As Long, j As Long, k As Long
  
  Set ws = ActiveSheet
  Set rng = ws.Range("B2:D5") 'it may contain a bigger (variable)  number of columns...
  
  ReDim arrCol(rng.Columns.count - 1)
  ReDim arrNo(UBound(arrCol))
  
  For i = 1 To rng.Columns.count
    'place each column content as a 2D array, in an array:
    If Application.CountA(rng.Columns(i)) = 1 Then 'for only one value in the column first cell:
        arr2D1(1, 1) = rng.Columns(i).cells(1).Value: arrCol(i - 1) = arr2D1
        arrNo(i - 1) = 1
    Else
        arrCol(i - 1) = ws.Range(rng.Columns(i).cells(1), rng.Columns(i).cells(1).End(xlDown)).Value
        arrNo(i - 1) = UBound(arrCol(i - 1))  'place in another array each column number of rows
    End If

    countR = IIf(countR = 0, arrNo(i - 1), countR * arrNo(i - 1)) 'calculate the necessary total number of rows of the array to return
  Next i

  ReDim arrFin(1 To countR, 1 To rng.Columns.count) 'redim the array to return
 
  Dim noRep As Long        'number of each value repeating per each column
  Dim multipliedNo As Long 'cummulated previous columns numbers to repeat
  Dim cummulatedNo As Long 'cummulated previous columns arrays elements
  Dim m As Long, c As Long, ii As Long
  
        j = 1  'initialize the columns variable
        For j = 1 To UBound(arrCol) + 1 'Iterate between the 2D array elements of arrCol
                multipliedNo = 1: cummulatedNo = 0 '(re)initialize the variables
                For m = 1 To UBound(arrFin, 2) 'iterate between the columns
                    multipliedNo = multipliedNo * arrNo(m - 1) 'multiplies the number of arrays/columns elements UP TO the processed column
                    noRep = countR / multipliedNo              'number of repetition for each element of the processed column
                    If m > 1 Then cummulatedNo = IIf(cummulatedNo = 0, arrNo(m - 2), cummulatedNo * arrNo(m - 2)) 'number of global repetition per column
                    If m = j Then Exit For 'cummulate only up to the present column...
                Next m
                
                c = 1 'initialize the final array rows variable
                For ii = 1 To IIf(j = 1, 1, cummulatedNo)     'global repetitions
                    For i = 1 To UBound(arrCol(j - 1))        'iterate between the column (array) elements
                        For k = 1 To noRep                    'repeat the same element from noRep times
                            arrFin(c, j) = arrCol(j - 1)(i, 1)'place the necessary value on processed column
                            c = c + 1
                        Next k
                    Next i
                Next ii
        Next j
             
        'drop the process result array, at once:
        Range("I2").Resize(UBound(arrFin), UBound(arrFin, 2)).Value = arrFin

End Sub

I think (now) it is not prepared for the case of only one element per column, which will not be placed in an array, but this can be solved after you confirm that it works as you need.

Please, send some feedback after testing it.

FaneDuru
  • 38,298
  • 4
  • 19
  • 27
  • Thanks! I don't really know how, but this seems to do the job for my data. – Zain Jul 13 '23 at 19:28
  • @Zain Glad I could help! Yes, it does not follow the classic algorithm of calculating the combinations on rows. It calculates the necessary repetitions of each column element, of each group and so on. I will place a version better commented... – FaneDuru Jul 14 '23 at 10:18
  • @Zain Adapted the code to also process the columns containing only one value in the first cell of the column... – FaneDuru Jul 14 '23 at 11:03
2

I realised that the need to cater for a variable number of variables would make this quite complicated, so I had a go at it in my lunch break. The following is a complete solution - but it is not tidy!! It also cannot be re-run - if you want to re-run it you have to convert all the tables back to Ranges and delete the Power Queries that have been created. I've included a couple of my standard helper functions too.

Public Sub SetUpTablesForCrossJoin()
    
    Dim lastColumn As Long, thisColumn As Long, lastRowNo As Long
    Dim rng As Range, rng1 As Range
    lastColumn = lastCol(shtUserData)

    'I used your data, so 1st Column just says "Values" and 1st Row is "Variable Name"

    For thisColumn = 2 To lastColumn
        Set rng = shtUserData.Columns(thisColumn)
        lastRowNo = lastRow(rng)
        Set rng1 = Range(rng.Cells(1), rng.Cells(lastRowNo))

        'Turn each of the data sets into Excel Tables
        shtUserData.ListObjects.Add(xlSrcRange, rng1, , xlYes).Name = "Table" & CStr(thisColumn - 1)
    Next
    
    'Now turn each of those tables into a Power Query
    Dim PQName As String, colHeader As String
    For thisColumn = 2 To lastColumn
        PQName = "Table" & CStr(thisColumn - 1)
        colHeader = shtUserData.ListObjects(PQName).HeaderRowRange(1, 1)

        ThisWorkbook.Queries.Add Name:=PQName, Formula:= _
            "let Source = Excel.CurrentWorkbook(){[Name=" & Chr(34) & PQName & Chr(34) & "]}[Content], #""Changed Type"" = " & _
                    "Table.TransformColumnTypes(Source,{{""" & colHeader & """, type text}}) in #""Changed Type"""
        ThisWorkbook.Connections.Add2 "Query - " & PQName, _
            "Connection to the '" & PQName & "' query in the workbook.", _
            "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=" & PQName & ";Extended Properties=""""" _
            , "SELECT * FROM [" & PQName & "]", 2
    Next

    'Now use Power Query to Cross join them all
    Dim lfcr As String: lfcr = lfcr
    Dim queryFormula As String, addTable As String, expandTable As String
    
    queryFormula = "let" & lfcr & "    Source = Table1" & lfcr
    addTable = "    ,#""Added Custom%s"" = Table.AddColumn(%s, ""Custom"", each %s)"
    expandTable = "    ,#""Expanded Custom%s"" = Table.ExpandTableColumn(#""%s"", ""Custom"", {""%s""}, {""%s""})"
    
    Dim prevstep As String, prevstep1 As String
    For thisColumn = 2 To lastColumn - 1    'N.B. We've already captured Table1
        PQName = "Table" & CStr(thisColumn)
        colHeader = shtUserData.ListObjects(PQName).HeaderRowRange(1, 1)
        prevstep = IIf(thisColumn = 2, "Source", sPrintf("#""Expanded Custom%s""", thisColumn - 2))
        prevstep1 = sPrintf("Added Custom%s", thisColumn - 1)
        
        queryFormula = queryFormula & sPrintf(addTable, thisColumn - 1, prevstep, PQName) & _
                lfcr & sPrintf(expandTable, thisColumn - 1, prevstep1, colHeader, colHeader) & lfcr
    Next
    queryFormula = queryFormula & " in" & lfcr & sPrintf("    #""Expanded Custom%s""", thisColumn - 2)
    ThisWorkbook.Queries.Add Name:="CrossJoin", Formula:=queryFormula

    With shtCrossJoin.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=CrossJoin;Extended Properties=""""" _
        , Destination:=Range("CrossJoin!$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [CrossJoin]")
        .ListObject.DisplayName = "CrossJoin"
        .Refresh BackgroundQuery:=False
    End With

End Sub
Public Function lastCol(rg As Variant) As Long
'Find the last column used in a worksheet - regardless of where the data starts and ends
'and regardless of whether the data is contiguous or jagged
'Note use of Variant for the input range parameter
'This allows both worksheets and ranges to passed into the function
    On Error Resume Next
    Dim lc As Long
    lc = rg.Cells.Find(What:="*" _
                    , After:=rg.Cells(1, 1) _
                    , LookAt:=xlPart _
                    , LookIn:=xlFormulas _
                    , SearchOrder:=xlByColumns _
                    , SearchDirection:=xlPrevious).Column
                    
    If Err.Number <> 0 Then
        lc = 1  'cater for a completely empty sheet
    End If
    On Error GoTo 0
    lastCol = lc
End Function
Public Function lastRow(rg As Variant) As Long
'Find the last row used in a worksheet - regardless of where the data starts and ends
'and regardless of whether the data is contiguous or jagged
'Note use of Variant for the input range parameter
'This allows both worksheets and ranges to passed into the function
    On Error Resume Next
    Dim lr As Long
    lr = rg.Cells.Find(What:="*" _
                    , LookAt:=xlPart _
                    , LookIn:=xlFormulas _
                    , SearchOrder:=xlByRows _
                    , SearchDirection:=xlPrevious).Row
    If Err.Number <> 0 Then
        lr = 1  'cater for a completely empty sheet
    End If
    On Error GoTo 0
    lastRow = lr
End Function
Public Function sPrintf(ByVal inputStr As String, ParamArray replaceStrs() As Variant)
'Mimic the C++ sprintf function, replace place holders in a string with an array of values
'This version can handle input strings with %s - the array must be in the correct order
'It also handles the more flexible %1, %2 etc in which case the correct array cell will be used
'Don't be tempted to call this with sPrintf(inputStr, Array("a", "b")) as this will fail.
'Use a single dimension array with sPrintf(inputStr, "a", "b")

    Dim i As Long, res As String, bOld As Boolean, sRep As String
    
    bOld = IIf(InStr(1, inputStr, "%s", vbBinaryCompare) > 0, True, False)
    
    res = inputStr

    For i = LBound(replaceStrs) To UBound(replaceStrs)
        sRep = IIf(bOld, "%s", "%" & CStr(i + 1))
        res = Replace(res, sRep, replaceStrs(i), 1, 1, vbBinaryCompare)
    Next
    
    sPrintf = res
    
End Function

Basically the code is converting your data into Excel tables so they can be easily queried with Power Query. It then creates one Connection-only Power Query for each of those tables (aka Variables) Then it builds a Cross Join power query "on the fly" to merge all of the table queries into one big cartesian product. That bit could definitely do with a tidy up!

CHill60
  • 1,180
  • 8
  • 14
1

With the Sub below you can create combinations with the column values of any range. Empty cells are not considered valid values, and a column without any valid cells is not considered when creating combinations. I also considered it reasonable that the values of each column should be unique in the column and during preprocessing I create an array of arrays that have unique values. In my example the value "AA" in the first column is considered only once. If you need any further clarification, ask me.

Private Sub CommandButton1_Click()
   Call createCombinations(Me.Range("C5:F9"), Me.Range("G5"))
End Sub

Public Sub createCombinations(source As Range, destCell As Range)
   Dim colArr() As Variant, nrow() As Variant, ipos() As Long, clen() As Long
   Dim clcnt As Long, cntrws As Long, c As Long, nrcntr As Long, cc As Long
   Dim tmp  As Variant, tlb As Long, tub As Long, tclcnt As Long
   
   clcnt = source.Columns.CountLarge
   ReDim colArr(0 To clcnt - 1)
   ReDim ipos(0 To clcnt - 1)
   ReDim clen(0 To clcnt - 1)
   cntrws = 1

   ' CREATE AN ARRAY OF ARRAYS
   For c = 0 To clcnt - 1
      ' unique data of one column in array
      tmp = ColToUniq1DArray(source.Columns(c + 1))
      tlb = LBound(tmp):  tub = UBound(tmp)
      ' ONLY IF HAVE DATA CONSIDER IN COMBINATION
      If tub >= tlb Then
         ipos(tclcnt) = 0
         colArr(tclcnt) = tmp
         clen(tclcnt) = tub - tlb + 1
         cntrws = cntrws * (clen(tclcnt))
         tclcnt = tclcnt + 1
      End If
   Next
   ' The resulting combinations are the product of the cells in each column
   If tclcnt = 0 Then
      MsgBox ("There is no data to create combinations")
      Exit Sub
   End If
   'Debug.Print "Created rows> " & cntrws & "  Created columns> " & tclcnt

   ' create the output array to hold combinations
   ReDim nrow(1 To cntrws, 1 To tclcnt)
   Do 
      ' write combination in array
      nrcntr = nrcntr + 1
      For c = 0 To tclcnt - 1
         nrow(nrcntr, c + 1) = colArr(c)(ipos(c))
      Next

      ' Here is the heart of the algorithm
      ' Increment the index of column by one, AND
      ' reset all indexes of columns to the right to zero
      For c = tclcnt - 1 To 0 Step -1
         If ipos(c) < clen(c) - 1 Then
            ipos(c) = ipos(c) + 1
            For cc = c + 1 To tclcnt - 1
               ipos(cc) = 0
            Next
            GoTo Lnext
         End If
      Next
      ' END OF COMBINATIONs WHEN NO ONE INDEX CAN INCREMENT BY ONE
      Exit Do
Lnext:
   Loop
   If Not destCell Is Nothing Then
      destCell.Resize(UBound(nrow, 1), UBound(nrow, 2)).Value2 = nrow
   End If
End Sub

' HELPER FUNCTION TRANFORM COLUMN TO UNIQUE ARRAY OF VALUES
Public Function ColToUniq1DArray(columnRange As Range) As Variant
   Dim d As Scripting.Dictionary, i As Long, ub As Long, myArray As Variant
   With columnRange
      myArray = Range(columnRange(1), columnRange(.Cells(.rows.CountLarge, columnRange(1).Column).End(xlUp).row)).Value2
   End With
   Set d = CreateObject("Scripting.Dictionary")
   ub = UBound(myArray, 1)
   For i = LBound(myArray, 1) To ub
      If myArray(i, 1) <> vbNullString Then
         d(myArray(i, 1)) = False
      End If
   Next i
   ColToUniq1DArray = d.Keys
   'Set d = Nothing
End Function

enter image description here