I think I am missing a key piece of MongoDB aggregation pipeline knowledge to close the loop on what I am trying to do here.
Here is my data:
[
{frameId: <some unique guid>, videoSessionId: 1, subject: "John", "createdBy": "personA", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 1, subject: "John", "createdBy": "personA", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 1, subject: "John", "createdBy": "personA", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 2, subject: "John", "createdBy": "personA", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": true},
{frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": true},
{frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 4, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 4, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 5, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 5, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 6, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 6, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 7, subject: "Mary", "createdBy": "personB", "blurry": false},
]
I want to query the data and receive this:
[
{frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 3, subject: "John", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 4, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 4, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 5, subject: "Mary", "createdBy": "personB", "blurry": false},
{frameId: <some unique guid>, videoSessionId: 5, subject: "Mary", "createdBy": "personB", "blurry": false},
]
Here is the plaintext version of what I desire:
"Out of all the non-blurry frames created by personB, give me a max number of 2 videoSessions per subject."
A little deeper:
- The above queried data only contains one videoSession's worth of data since John only has 1 videoSession created by personB.
- The above queried data only contains 2 out of 4 of Mary's video sessions. Although she has 4 video sessions that were created by personB and are not blurry, we want to only gather a max of 2 video sessions per subject.
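To make the intended selection concrete, here is a minimal in-memory sketch in plain JavaScript (no MongoDB involved). The function name `selectFrames`, the shortened `sample` array, and the `f1`..`f6` frame ids are hypothetical stand-ins for the data above. It also assumes that "max of 2" means the first 2 sessions per subject in order of appearance, which matches the desired output shown but is not stated explicitly.

```javascript
// Shortened, hypothetical stand-in for the sample data above.
const sample = [
  { frameId: "f1", videoSessionId: 1, subject: "John", createdBy: "personA", blurry: false },
  { frameId: "f2", videoSessionId: 3, subject: "John", createdBy: "personB", blurry: true },
  { frameId: "f3", videoSessionId: 3, subject: "John", createdBy: "personB", blurry: false },
  { frameId: "f4", videoSessionId: 4, subject: "Mary", createdBy: "personB", blurry: false },
  { frameId: "f5", videoSessionId: 5, subject: "Mary", createdBy: "personB", blurry: false },
  { frameId: "f6", videoSessionId: 6, subject: "Mary", createdBy: "personB", blurry: false },
];

// Keep non-blurry frames by `createdBy`, limited to the first
// `maxSessions` distinct videoSessionIds seen for each subject.
function selectFrames(frames, createdBy, maxSessions) {
  const kept = frames.filter((f) => f.createdBy === createdBy && !f.blurry);

  // subject -> list of allowed videoSessionIds (at most maxSessions)
  const sessionsBySubject = new Map();
  for (const f of kept) {
    if (!sessionsBySubject.has(f.subject)) sessionsBySubject.set(f.subject, []);
    const sessions = sessionsBySubject.get(f.subject);
    if (!sessions.includes(f.videoSessionId) && sessions.length < maxSessions) {
      sessions.push(f.videoSessionId);
    }
  }

  return kept.filter((f) =>
    sessionsBySubject.get(f.subject).includes(f.videoSessionId)
  );
}

const result = selectFrames(sample, "personB", 2);
console.log(result.map((f) => f.videoSessionId));
```

With this shortened sample, John contributes his single qualifying session and Mary is capped at two sessions, so her third session is dropped.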
I am currently working around this with the following pipeline. It seems inefficient to me, since it requires a $lookup, an $unwind, a $replaceRoot, and a second $match.
- $match createdBy=personB and blurry=false
- $group by subject and $addToSet of videoSessionId to get an array of videoSessionIds
- a $project that does a $slice of the maxNumber to limit the videoSessionIds per subject
- a $lookup and $unwind by videoSessionId on the same collection to fill back up the original data
- a $replaceRoot, as the $unwind leaves each record nested in a subdocument
- another $match for createdBy=personB and blurry=false, since the $lookup and $unwind gave me back all the records for the videoSessionIds I had selected, not only the ones that passed the first $match.
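Written out as stages, the steps above look roughly like the sketch below. The collection name `frames`, the field names `videoSessionIds` and `frames`, and the constant `MAX_SESSIONS` are assumptions made for illustration, and the $lookup here matches the array of ids directly against the scalar `videoSessionId` field before unwinding the results. Note that $addToSet does not guarantee element order, so which 2 sessions the $slice keeps is not deterministic.

```javascript
// Hypothetical names: the "frames" collection and MAX_SESSIONS are
// illustrative assumptions, not taken from the real schema.
const MAX_SESSIONS = 2;
const filter = { createdBy: "personB", blurry: false };

const pipeline = [
  // 1. Keep only non-blurry frames created by personB.
  { $match: filter },
  // 2. Collect the distinct videoSessionIds per subject.
  { $group: { _id: "$subject", videoSessionIds: { $addToSet: "$videoSessionId" } } },
  // 3. Cap the number of sessions per subject.
  { $project: { videoSessionIds: { $slice: ["$videoSessionIds", MAX_SESSIONS] } } },
  // 4. Self-join to pull back every frame of the selected sessions.
  {
    $lookup: {
      from: "frames",
      localField: "videoSessionIds",
      foreignField: "videoSessionId",
      as: "frames",
    },
  },
  // 5. One output document per joined frame.
  { $unwind: "$frames" },
  // 6. Promote the joined frame back to the top level.
  { $replaceRoot: { newRoot: "$frames" } },
  // 7. Re-apply the filter, since the join returns blurry frames too.
  { $match: filter },
];

// In mongosh this would be run as: db.frames.aggregate(pipeline)
console.log(pipeline.length);
```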