1

Sorry for strange title of the question, but I don't know how to formulate it more short. If you know how to formulate it better, I will be glad if you edit my question. So, I have the following table:

enter image description here

I'm tolking about CustomerId and EventType fields. The rest is not important. I think you understand that this table is something like log by customers events. Some customer make event - I have event in the table. Simple.

I need to choice all customers events where each customer had event with type registration and type deposit. In other words, customer had registration before? The same customer had deposit? If yes and yes - I need to select all events of this customer.

How I can do that with the help of LINQ?

So I can write SQL like

select *
From "CustomerEvents"
where "CustomerId" in (
    select distinct "CustomerId"
    from "CustomerEvents"
    where "EventType" = 'deposit'

    intersect

    select distinct "CustomerId"
    from "CustomerEvents"
    where "EventType" = 'registration'
)

It works, but how to write it on LINQ?

And second question. SQL above works, but not it is not universal. What if tomorrow I will need to show events of customers who have registration, deposit and - new one event - visit? I have to write new one query. Like:

select *
From "CustomerEvents"
where "CustomerId" in (
    select "CustomerId"
    from "CustomerEvents"
    where "EventType" = 'deposit'

    intersect

    select distinct "CustomerId"
    from "CustomerEvents"
    where "EventType" = 'registration'

     intersect

    select distinct "CustomerId"
    from "CustomerEvents"
    where "EventType" = 'visit'
)

Uncomfortable :( As source data, I have List with event types. Is there some way to make it dynamically? I mean, I have new one event in the list - I have new one intersect.

P.S I use Postgres and .NET Core 3.1

Update

I pine here a scheme enter image description here

Aleksej_Shherbak
  • 2,757
  • 5
  • 34
  • 71
  • Why just don't use plain `select` query with `where EventType IN (..)` clause without nested queries and distinct? – Pavel Anikhouski Apr 22 '20 at 11:34
  • @PavelAnikhouski I'm not sure that it will work. In this case I will just show all `registration` and `deposit` events, for all customers. But I need to show events of customers where each customer 1) had `registration`, 2) had `deposit`, Am I wrong? – Aleksej_Shherbak Apr 22 '20 at 11:40
  • Not sure, that I get your point correct, but grouping may solve you problem. Please also share your linq query and DB entities – Pavel Anikhouski Apr 22 '20 at 11:47
  • @PavelAnikhouski could you check my update? – Aleksej_Shherbak Apr 22 '20 at 11:57
  • 1
    For the SQL, you dont need the `distinct` since `intersect` is a set operation it will remove any duplicates. – Magnus Apr 22 '20 at 12:33

2 Answers2

1

I haven't tested to see if this will translate to SQL correctly, but if we assume ctx.CustomerEvents is DbSet<CustomerEvent> you could try this:

var targetCustomerIds = ctx
  .CustomerEvents
  .GroupBy(event => event.CustomerId)
  .Where(grouped => 
    grouped.Any(event => event.EventType == "deposit")
    && grouped.Any(event => event.EventType == "registration"))
  .Select(x => x.Key)
  .ToList();

and then select all events for these customers:

var events = ctx.CustomerEvents.Where(event => targetCustomerIds.Contains(event.CustomerId));

To get targetCustomerIds dynamically with a variable number of event types, you could try this:

// for example
var requiredEventTypes = new [] { "deposit", "registration" };

// First group by customer ID
var groupedByCustomerId = ctx
  .CustomerEvents
  .GroupBy(event => event.CustomerId);

// Then filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);

// Then select the target customer IDs
var targetCustomerIds = filtered.Select(x => x.Key).ToList();

// Finally, select your target events
var events = ctx.CustomerEvents.Where(event => 
  targetCustomerIds.Contains(event.CustomerId));

You can define the GetFilteredGroups method like this:

private static IQueryable<IGrouping<int, CustomerEvent>> GetFilteredGroups(
    IQueryable<IGrouping<int, CustomerEvent>> grouping,
    IEnumerable<string> requiredEventTypes)
{
    var result = grouping.Where(x => true);
    foreach (var eventType in requiredEventTypes)
    {
        result = result.Where(x => x.Any(event => event.EventType == eventType));
    }

    return result;
}

Alternatively, instead of selecting the target customer IDs, you can try to directly select your target events from the filtered groupings:

// ...
// Filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);

// Select your events here
var results = filtered.SelectMany(x => x).Distinct().ToList();

Regarding the inability to translate the query to SQL Depending on your database size and particularly on the size of CustomerEvents table, this solution may or may not be ideal, but what you could do is load the optimized collection to memory and perform the grouping there:

// for example
var requiredEventTypes = new [] { "deposit", "registration" };

// First group by customer ID, but load into memory
var groupedByCustomerId = ctx
  .CustomerEvents
  .Where(event => requiredEventTypes.Contains(event.EventType))
  .Select(event => new CustomerEventViewModel 
    { 
      Id = event.Id, 
      CustomerId = event.CustomerId, 
      EventType = event.EventType 
    })
  .GroupBy(event => event.CustomerId)
  .AsEnumerable();

// Then filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);

// Then select the target customer IDs
var targetCustomerIds = filtered.Select(x => x.Key).ToList();

// Finally, select your target events
var events = ctx.CustomerEvents.Where(event => 
  targetCustomerIds.Contains(event.CustomerId));

You will need to create a type called CustomerEventViewModel like this (so you don't have to load the entire CustomerEvent entity instances to memory):

public class CustomerEventViewModel
{
  public int Id { get; set; }
  public int CustomerId { get; set; }
  public string EventType { get; set; }
}

And change the GetFilteredGroups like this:

private static IEnumerable<IGrouping<int, CustomerEvent>> GetFilteredGroups(
    IEnumerable<IGrouping<int, CustomerEvent>> grouping,
    IEnumerable<string> requiredEventTypes)
{
    var result = grouping.Where(x => true);
    foreach (var eventType in requiredEventTypes)
    {
        result = result.Where(x => x.Any(event => event.EventType == eventType));
    }

    return result;
}

It should now work fine.

Dejan Janjušević
  • 3,181
  • 4
  • 41
  • 67
  • Thank you!!! It looks like very good! I will try your solution as soon as I can, and I will report here about the results! Thank you again!!! – Aleksej_Shherbak Apr 22 '20 at 12:30
  • Could you explain what is `Test` here `private static IQueryable> GetFilteredGroups(..` ? – Aleksej_Shherbak Apr 22 '20 at 12:55
  • And one more problem `EventType` is not Enum (( It's string. So, this `foreach (var eventType in requiredEventTypes) { result = result.Where(x => x.Any(e => e.EventType == eventType)); }` form this solution gives me the following error: `ArgumentException: Expression of type 'System.String' cannot be used for parameter of type 'Domains.CustomerEvent' of method 'Boolean Contains[CustomerEvent](System.Collections.Generic.IEnumerable`1[Domains.CustomerEvent], Domains.CustomerEvent)' (Parameter 'arg1')`. Somebody knows what is wrong? – Aleksej_Shherbak Apr 22 '20 at 13:12
  • Ok, after some research I have doubts will this solution work or not. See this answer https://stackoverflow.com/a/57807780/7542937 grouped and any can not be translated in sql. Am I right? – Aleksej_Shherbak Apr 22 '20 at 13:33
  • 1
    @Aleksej_Shherbak sorry, Test was a leftover of the code I tried, I fixed the code so you can try again (also removed enums and used strings instead so you should be able to just copy and paste). As for the Linq query, not sure if it can be translated to SQL - see my edit. – Dejan Janjušević Apr 22 '20 at 14:30
  • thank you! This table contains approximately 2 000 000 rows. After 5 months it will ~ 5 000 000. – Aleksej_Shherbak Apr 22 '20 at 14:36
0

Thank you for @Dejan Janjušević. He is excpirienced developer. But it seems EF can't translate him solution to SQL (or just my hands grow from wrong place). I publish here my solution for this situation. It's simple stuped. So. I have in the table EventType. It is string. And I have from the client the following filter request:

List<string> eventType

Just list with event types. So, in the action I have the following code of the filter:

if (eventType.Any())
{
    List<int> ids = new List<int>();

    foreach (var e in eventType)
    {
        var customerIdsList =
            _context.customerEvents.Where(x => x.EventType == e).Select(x => x.CustomerId.Value).Distinct().ToList();

        if (!ids.Any())
        {
            ids = customerIdsList;
        }
        else
        {
            ids = ids.Intersect(customerIdsList).ToList();
        }
    }

    customerEvents = customerEvents.Where(x => ids.Contains(x.CustomerId.Value));
}

Not very fast, but works.

Aleksej_Shherbak
  • 2,757
  • 5
  • 34
  • 71