0

I need to merge two data schema into one. I have Schema1 and Schema2. I need to join these two into Schema3. Furthermore, I have a Select statement which queries a data set but I can't figure out how to use the select statement on the data set containing both tables (both Schema1 and Schema2) and combine them into the new table schema3 which is a table in the same data set comprised of both table's fields.

Example

Schema 1 ID, Food, Book, Rice, Cave

Schema 2 ID , Carpet, Strings, Run

Schema 3 ID, Food, Book, Rice, Cave, Carpet, Strings, Run

Fill the Schema3 Table with this command

Sql Command:

Select * Schema1 [except ID] and all fields from Schema2 [exceptID] Inner Join 
Schema2 ON Schema1.ID = Schema2.ID
Where ID = {dynamically defined variable 'X'}

Please excuse the lack of proper syntax. The main issue here again is querying a dataset with the select statement and filling up a table with the results. Im not exactly connecting to my DB because I already filled a dataset locally.

------Edit ------ I really just need a way to create an array of data rows from a query of two tables.

Ccorock
  • 892
  • 12
  • 37
  • Do you need a new merged DataTable or would an anonymous type suffice? – Tim Schmelter Nov 01 '12 at 00:27
  • I don't see that it would matter either way, as long as I can supply the table's contents to the report viewer at the end of the process. – Ccorock Nov 01 '12 at 00:30
  • The database will do it too, you can use the UNION keyword. – le3th4x0rbot Nov 01 '12 at 00:31
  • @BaileyS: But the data is already in memory. Ccorock: it's easy to join tables with `Linq-To-DataSet`, what .NET framework version are you using? – Tim Schmelter Nov 01 '12 at 00:32
  • Thats why Im having such trouble, because I want to use a Sql style query that I would normally use on on the DB but instead Im having to use the loaded datasets. – Ccorock Nov 01 '12 at 00:37
  • @Ccorock: Switch to at least 2008, but keep the framework version - should not be a big deal to upgrade. 2005 is not a good IDE. It's like Vista in OS. – Victor Zakharov Nov 01 '12 at 01:41
  • @Neolisk, yeah I've come to curse it a few times. Its unfortunate that I don't have that option. – Ccorock Nov 01 '12 at 01:43
  • I suggested UNION as an alternative to this entire situation, maybe to think of in the future or whatever. – le3th4x0rbot Nov 01 '12 at 06:15

2 Answers2

2

You could use this extension method here which i've written from scratch recently for another question. It enables to merge multiple tables by a common key. If no key is specified it will just use the default DataTable.Merge method:

public static DataTable MergeAll(this IList<DataTable> tables, String primaryKeyColumn)
{
    if (!tables.Any())
        throw new ArgumentException("Tables must not be empty", "tables");
    if(primaryKeyColumn != null)
        foreach(DataTable t in tables)
            if(!t.Columns.Contains(primaryKeyColumn))
                throw new ArgumentException("All tables must have the specified primarykey column " + primaryKeyColumn, "primaryKeyColumn");

    if(tables.Count == 1)
        return tables[0];

    DataTable table = new DataTable("TblUnion");
    table.BeginLoadData(); // Turns off notifications, index maintenance, and constraints while loading data
    foreach (DataTable t in tables)
    {
        table.Merge(t); // same as table.Merge(t, false, MissingSchemaAction.Add);
    }
    table.EndLoadData();

    if (primaryKeyColumn != null)
    {
        // since we might have no real primary keys defined, the rows now might have repeating fields
        // so now we're going to "join" these rows ...
        var pkGroups = table.AsEnumerable()
            .GroupBy(r => r[primaryKeyColumn]);
        var dupGroups = pkGroups.Where(g => g.Count() > 1);
        foreach (var grpDup in dupGroups)
        { 
            // use first row and modify it
            DataRow firstRow = grpDup.First();
            foreach (DataColumn c in table.Columns)
            {
                if (firstRow.IsNull(c))
                {
                    DataRow firstNotNullRow = grpDup.Skip(1).FirstOrDefault(r => !r.IsNull(c));
                    if (firstNotNullRow != null)
                        firstRow[c] = firstNotNullRow[c];
                }
            }
            // remove all but first row
            var rowsToRemove = grpDup.Skip(1);
            foreach(DataRow rowToRemove in rowsToRemove)
                table.Rows.Remove(rowToRemove);
        }
    }

    return table;
}

You can call it in this way:

var tables= new[] { Schema1, Schema2};
DataTable Schema3 = tables.MergeAll("ID");

Edit: If you don't need a new DataTable with the merged schema you could also use Linq-To-DataSet (now VB.NET):

Dim schema3 = From r1 In schema1
          Join r2 In schema2 On r1.Field(Of Int32)("ID") Equals r2.Field(Of Int32)("ID")
          Select New With {
                .ID = r1.Field(Of Int32)("ID"),
                .Food = r1.Field(Of String)("Food"),
                .Book = r1.Field(Of String)("Book"),
                .Rice = r1.Field(Of String)("Rice"),
                .Cave = r1.Field(Of String)("Cave"),
                .Carpet = r2.Field(Of String)("Carpet"),
                .Strings = r2.Field(Of String)("Strings"),
                .Run = r2.Field(Of String)("Run")
            }
Community
  • 1
  • 1
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
0

Try this:

  ''' <summary>
  ''' Merge two datatables that have a 1:1 relationship
  ''' </summary>
  ''' <param name="dtb1">Required Datatable.</param>
  ''' <param name="dtb2">Required Datatable.</param>
  ''' <param name="dtb1MatchField">Required String. Field name in dtb1 to use to match records</param>
  ''' <param name="dtb2MatchField">Required String. Field name in dtb2 to use to match records</param>
  ''' <remarks></remarks>'
  Private Function MergeDataTables(ByVal dtb1 As DataTable, ByVal dtb2 As DataTable, ByVal dtb1MatchField As String, ByVal dtb2MatchField As String) As DataTable
    Dim dtbOutput As DataTable = dtb1.Copy
    Dim lstSkipFields As New List(Of String)
    For Each dcl As DataColumn In dtb2.Columns
      Try
        dtbOutput.Columns.Add(dcl.ColumnName, dcl.DataType)
      Catch ex As DuplicateNameException
        lstSkipFields.Add(dcl.ColumnName)
      End Try
    Next dcl
    'Merge dtb2 records that match existing records in dtb1'
    Dim dtb2Temp As DataTable = dtb2.Copy
    For int2 As Integer = dtb2Temp.Rows.Count - 1 To 0 Step -1
      Dim drw2 As DataRow = dtb2Temp.Rows(int2)
      Dim o2 As Object = drw2(dtb2MatchField)
      For Each drw1 As DataRow In dtbOutput.Rows
        Dim o1 As Object = drw1(dtb1MatchField)
        If o1.ToString = o2.ToString Then
          For Each dcl As DataColumn In dtb2Temp.Columns
            If Not lstSkipFields.Contains(dcl.ColumnName) Then
              drw1(dcl.ColumnName) = drw2(dcl.ColumnName)
            End If
          Next dcl
          dtb2Temp.Rows.Remove(drw2)
        End If
      Next drw1
    Next int2
    'add rows that weren not in dtb1'
    For Each drw2 As DataRow In dtb2Temp.Rows
      Dim drw1 As DataRow = dtbOutput.NewRow
      For Each dcl As DataColumn In dtb2Temp.Columns
        drw1(dcl.ColumnName) = drw2(dcl.ColumnName)
      Next dcl
      dtbOutput.Rows.Add(drw1)
    Next drw2
    Return dtbOutput
  End Function
SSS
  • 4,807
  • 1
  • 23
  • 44