So I've gotten by with Mongo till now, without having to do anything that complex. But now I'm up against something.

I've got a Publisher model and a User model.

  • Users have an array of publisherIds.
  • so user.following = [1,2,3,4];

I'm building an admin table, and I need to show all publishers, and their number of followers.

Obviously I can't loop over each publisher and run a mongo query there, so what approach should I take?

Collection 1: Users

{
    id: 1,
    name: 'fred',
    following: [1,2,3,4],
},{
    id: 2,
    name: 'andy',
    following: [1,2]
},{
    id: 3,
    name: 'stephen',
    following: [1]
}

Desired output, collection 2: Publishers

{
    publisherId: 1,
    numberOfFollowers: 3
},{
    publisherId: 2,
    numberOfFollowers: 2
},{
    publisherId: 3,
    numberOfFollowers: 1
}
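
To make the expected transformation concrete, here is a minimal in-memory sketch of the counting on the sample users above. `countFollowers` is a hypothetical helper written only for illustration, not a MongoDB API, and it runs in plain JavaScript without a database.

```javascript
// Sample users copied from the question.
const users = [
  { id: 1, name: 'fred', following: [1, 2, 3, 4] },
  { id: 2, name: 'andy', following: [1, 2] },
  { id: 3, name: 'stephen', following: [1] },
];

// Hypothetical helper: one pass over all users, tallying each followed publisher id.
function countFollowers(userDocs) {
  const counts = new Map();
  for (const user of userDocs) {
    for (const publisherId of user.following) {
      counts.set(publisherId, (counts.get(publisherId) || 0) + 1);
    }
  }
  // Turn the tally into documents shaped like the desired output, sorted by id.
  return [...counts.entries()]
    .sort(([a], [b]) => a - b)
    .map(([publisherId, numberOfFollowers]) => ({ publisherId, numberOfFollowers }));
}

const followerCounts = countFollowers(users);
```

With this sample data, publisher 4 also ends up with one follower (fred), in addition to the three publishers listed in the desired output.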
Sean Clark
  • This is a bit broad without being specific about what you want to achieve. The "seemingly" implied result is the "inverse", where each publisher has a list of users rather than the current case. Whether that is practical as an array of users, or as separate objects, depends entirely on your data. As for the difference between methods, the approach to the "presumed" result is basically the same: the initial goal is to "emit" a singular "user/publisher" combo and do something with it. Both will explode into more data than the source, but aggregate will generally be faster at it. – Blakes Seven Aug 21 '15 at 07:06
  • So, my end result should be a list of publishers where every publisher has a follower count. The follower count being the number of users that have their id set in each of their own arrays of publishers they follow. – Sean Clark Aug 21 '15 at 13:22
  • That point is apparent from your question. However, if you read what I responded with, the main question you have not answered is "how you expect to accumulate this data". In all honesty, consider the quality of what you have been given as an "answer" to date. Not great, is it? The better and clearer your question is, the more it will help you. – Blakes Seven Aug 21 '15 at 14:13
  • I'm trying to articulate what I'm looking for, but because I'm used to SQL it's hard to translate. I'll update my question with expected inputs and outputs. – Sean Clark Aug 21 '15 at 17:31

1 Answer

Depending on the size of the data, a map-reduce strategy can be a little slower or a lot faster than the aggregation framework. MapReduce runs JavaScript in a separate thread and uses the code you provide to emit and reduce parts of your documents, aggregating on certain fields. You can certainly look at this exercise as aggregating over each "fieldValue". The aggregation framework can do this as well, and would be much faster here, because the aggregation runs on the server in C++ rather than in a separate JavaScript thread. Also, from the MongoDB documentation:

The MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through the pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some stages may generate new documents or filter out documents. Pipeline stages can appear multiple times in the pipeline.

More reading: Is Mongodb Aggregation framework faster than map/reduce?
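
For comparison, the map-reduce route described above could be sketched roughly as follows. The function names `mapFollowing` and `reduceCounts` are made up for illustration, and the small `emit` harness and `sampleUsers` array are assumptions that stand in for MongoDB so the sketch can run outside the database.

```javascript
// Hypothetical stand-in for MongoDB's built-in emit(): collects values per key.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}

// Map step: runs once per user document, with the document bound to `this`.
// It emits a 1 for every publisher id the user follows.
function mapFollowing() {
  this.following.forEach(function (publisherId) {
    emit(publisherId, 1);
  });
}

// Reduce step: sums the 1s emitted for a single publisher id.
function reduceCounts(publisherId, counts) {
  return counts.reduce(function (total, n) { return total + n; }, 0);
}

// Local stand-in for the users collection, using the question's sample data.
const sampleUsers = [
  { id: 1, name: 'fred', following: [1, 2, 3, 4] },
  { id: 2, name: 'andy', following: [1, 2] },
  { id: 3, name: 'stephen', following: [1] },
];
sampleUsers.forEach(function (user) { mapFollowing.call(user); });

// Shape the result like map-reduce output: one { _id, value } document per key.
const mapReduceResult = [...emitted.entries()].map(function ([key, values]) {
  return { _id: key, value: reduceCounts(key, values) };
});

// Against a real server, the mongo shell call would be roughly:
// db.users.mapReduce(mapFollowing, reduceCounts, { out: { inline: 1 } });
```

Note that newer MongoDB releases deprecate `mapReduce` in favor of the aggregation pipeline, so this is mainly useful for understanding the comparison.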

And to your example, here you go:

[Image: example for the question's data]
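
As a minimal sketch of what such an aggregation might look like for the question's data, assuming the collection is named `users` and each document has a `following` array of publisher ids (the constant name `followerCountPipeline` is made up for illustration):

```javascript
// Assumes a `users` collection whose documents carry a `following` array.
const followerCountPipeline = [
  // Produce one document per (user, followed publisher id) pair.
  { $unwind: '$following' },
  // Group those pairs by publisher id and count them.
  { $group: { _id: '$following', numberOfFollowers: { $sum: 1 } } },
  // Rename _id to publisherId to match the desired output shape.
  { $project: { _id: 0, publisherId: '$_id', numberOfFollowers: 1 } },
  // Stable ordering for the admin table.
  { $sort: { publisherId: 1 } },
];

// In the mongo shell this would be run as:
// db.users.aggregate(followerCountPipeline)
```

This only returns publishers that at least one user follows. Publishers with no followers would still need to be filled in with a count of 0, for example by merging this result with the publishers list in application code.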

Abhishek Dey
  • That's interesting. Do either of those apply to solving my issue? – Sean Clark Aug 21 '15 at 03:25
  • Do you need any matching or regex in the query? If so, aggregation could take a long time; otherwise, always go for the aggregation strategy. – Abhishek Dey Aug 21 '15 at 03:26
  • You're 2 steps further than me already. I'm not worried about speed; I'm trying to figure out what Mongo thing to use to do what would otherwise be a simple SQL join. – Sean Clark Aug 21 '15 at 03:27
  • This seems like the opposite effect. I have 2 collections: users and publishers. I want to get a list of publishers as my final result. But every user has an array of publisher IDs that they follow. So in my publishers output, I need to count all users that have my publisherID in their following array. – Sean Clark Aug 21 '15 at 03:40