
I think I am missing a key piece of MongoDB aggregation pipeline knowledge to close the loop on what I am trying to do here.

Here is my data:

[
   {frameId: <some unique guid>, videoSessionId: 1, subject: "John", "createdBy": "personA", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 1, subject: "John", "createdBy": "personA", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 1, subject: "John", "createdBy": "personA", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 2, subject: "John", "createdBy": "personA", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": true},
   {frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": true},
   {frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 4, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 4, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 5, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 5, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 6, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 6, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
   {frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
]

I want to query the data and receive this:

[
       {frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": false},
       {frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": false},
       {frameId: <some unique guid>, videoSessionId: 4, subject: "Mary", "createdBy": "personB", "blurry": false},
       {frameId: <some unique guid>, videoSessionId: 4, subject: "Mary", "createdBy": "personB", "blurry": false},
       {frameId: <some unique guid>, videoSessionId: 5, subject: "Mary", "createdBy": "personB", "blurry": false},
       {frameId: <some unique guid>, videoSessionId: 5, subject: "Mary", "createdBy": "personB", "blurry": false},    
]

Here is what I want, in plain English:

"Out of all the non-blurry frames created by personB, give me a max number of 2 videoSessions per subject."

A little deeper:

  • The above queried data only contains one videoSession's worth of data since John only has 1 videoSession created by personB.
  • The above queried data only contains 2 out of 4 of Mary's video sessions. Although she has 4 video sessions that were created by personB and are not blurry, we want to only gather a max of 2 video sessions per subject.
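To pin the rule down, here is a minimal sketch of the desired selection in plain JavaScript, run in memory rather than in MongoDB. The function name `limitSessionsPerSubject`, the `sample` array, and the `f1`..`f7` frame ids are hypothetical stand-ins, and "first sessions seen win" is an assumption; the question does not say which 2 sessions should be kept.

```javascript
// Hypothetical helper (not a MongoDB API): spells out the selection rule in memory.
function limitSessionsPerSubject(frames, createdBy, maxSessions) {
  // Keep only the non-blurry frames created by the requested person.
  const candidates = frames.filter(
    (f) => f.createdBy === createdBy && f.blurry === false
  );

  // subject -> Set of videoSessionIds kept for that subject.
  // Assumption: the first sessions encountered are the ones kept.
  const kept = new Map();
  for (const f of candidates) {
    if (!kept.has(f.subject)) kept.set(f.subject, new Set());
    const sessions = kept.get(f.subject);
    if (sessions.size < maxSessions) sessions.add(f.videoSessionId);
  }

  // Return every candidate frame that belongs to a kept session.
  return candidates.filter((f) => kept.get(f.subject).has(f.videoSessionId));
}

// Reduced, made-up sample in the same shape as the data above.
const sample = [
  { frameId: "f1", videoSessionId: 1, subject: "John", createdBy: "personA", blurry: false },
  { frameId: "f2", videoSessionId: 3, subject: "John", createdBy: "personB", blurry: true },
  { frameId: "f3", videoSessionId: 3, subject: "John", createdBy: "personB", blurry: false },
  { frameId: "f4", videoSessionId: 4, subject: "Mary", createdBy: "personB", blurry: false },
  { frameId: "f5", videoSessionId: 5, subject: "Mary", createdBy: "personB", blurry: false },
  { frameId: "f6", videoSessionId: 6, subject: "Mary", createdBy: "personB", blurry: false },
  { frameId: "f7", videoSessionId: 7, subject: "Mary", createdBy: "personB", blurry: false },
];

// Keeps f3 (session 3), f4 (session 4) and f5 (session 5).
const result = limitSessionsPerSubject(sample, "personB", 2);
```

John contributes one session because only session 3 was created by personB, and Mary is capped at two of her four sessions.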

I am currently working around this with the steps below. This seems inefficient to me, since it requires a $lookup, an $unwind, a $replaceRoot, and another $match.

  1. $match createdBy=personB and blurry=false
  2. $group by subject and $addToSet of videoSessionId to get an array of videoSessionIds
  3. a $project that does a $slice of the maxNumber to limit the videoSessionIds per subject
  4. a $lookup and $unwind by videoSessionId on the same collection to fill back up the original data
  5. a $replaceRoot, as the $lookup and $unwind leave each original record nested in a sub-document
  6. another $match for createdBy=personB and blurry=false, because the $lookup and $unwind gave me back all the records for the videoSessionIds I had identified, not just the ones I wanted.
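For reference, one possible reading of the six steps above as a mongo shell pipeline. This is a sketch, not the exact code in use: the collection name `frames`, the field names `videoSessionIds` and `frame`, and the extra `$unwind` before the `$lookup` are assumptions. Note that `$addToSet` does not guarantee element order, so which 2 sessions survive the `$slice` is not deterministic.

```javascript
const MAX_SESSIONS = 2;        // the "maxNumber" from step 3
const COLLECTION = "frames";   // assumed collection name

// Filter shared by the first and the last stage.
const wanted = { createdBy: "personB", blurry: false };

const pipeline = [
  // 1. Only non-blurry frames created by personB.
  { $match: wanted },
  // 2. One document per subject, holding its distinct videoSessionIds.
  { $group: { _id: "$subject", videoSessionIds: { $addToSet: "$videoSessionId" } } },
  // 3. Cap the number of sessions kept per subject.
  { $project: { videoSessionIds: { $slice: ["$videoSessionIds", MAX_SESSIONS] } } },
  // 4. Join back to the same collection, one kept session id at a time.
  //    (Unwinding before the $lookup is an assumption of this sketch.)
  { $unwind: "$videoSessionIds" },
  {
    $lookup: {
      from: COLLECTION,
      localField: "videoSessionIds",
      foreignField: "videoSessionId",
      as: "frame",
    },
  },
  { $unwind: "$frame" },
  // 5. Promote the joined frame back to the top level.
  { $replaceRoot: { newRoot: "$frame" } },
  // 6. The join returned every frame of the kept sessions, so filter again.
  { $match: wanted },
];

// In the mongo shell this would be run as: db.frames.aggregate(pipeline)
```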
Rao
  • Your question is not clear. Please [edit] the question to make it clearer and more understandable, so that we can help you better. – Ravi Shankar Bharti Sep 17 '17 at 06:02
  • @RaviShankar -- edited to be more explicit. However, the data I have and the data I want to receive out of the query are the same. Let me know what else is unclear to you. – Rao Sep 17 '17 at 07:32
  • Rather than your "explicit description", it would be better to show the actual code and a small sample of data (notably the "related" data is missing) in order to reproduce. It seems on general perusal that you are looking for a "limit to grouping", of which [mongodb group values by multiple fields](https://stackoverflow.com/a/22935461/2313887) explains the general problem and approaches. You probably want parallel queries for each "grouping", or indeed wait for the new features as demonstrated there. – Neil Lunn Sep 17 '17 at 23:44
