169

I am at the start of building a Facebook style notification system for our page (social gaming type) and I'm now researching what would be the best way to design such system. I'm not interested in how to push notifications to the user or anything like that (for now even). I am researching how to build the system on the server (how to store notifications, where to store them, how to fetch them etc...).

So ... some requirements that we have:

  • at peak times we have about 1k concurrent logged-in users (and many more guests, but they don't matter here as they will not have notifications) that will generate many events
  • there will be different types of notifications (user A has added you as a friend, user B has commented on your profile, user C has liked your image, user D has beaten you on game X, ...)
  • most events will generate 1 notification for 1 user (user X has liked your image), but there will be cases where one event will generate many notifications (it's user Y's birthday for instance)
  • notifications should be grouped together; if for instance four different users like some image, the owner of that image should get one notification stating that four users have liked the image and not four separate notifications (just like FB does)

OK so what I was thinking is that I should create some sort of queue where I would store events when they happen. Then I would have a background job (gearman?) that would look at that queue and generate notifications based on those events. This job would then store notifications in the database for each user (so if an event affects 10 users, there would be 10 separate notifications). Then when user would open a page with the list of notifications I would read all those notifications for him (we ware thinking to limiting this to 100 latest notifications) and group them together and then finally display them.

Things I'm concerned about with this approach:

  • complex as hell :)
  • is database the best storage here (we are using MySQL) or should I use something else (redis seems like a good fit too)
  • what should I store as a notification? user ID, user ID who initiated the event, type of event (so that I can group them and display appropriate text) but then I kinda don't know how to store the actual data of the notification (for instance URL&title of the image that was liked). Should I just "bake" that info when I generate the notification, or should I store the ID of the record (image, profile, ...) being affected and pull the info out of the DB when displaying the notification.
  • performance should be OK here, even if I have to process 100 notifications on-the-fly when displaying the notifications page
  • possible performance problem on every request because I would have to display the number of unread notifications to the user (which could be a problem in its own since I would group notifications together). This could be avoided though if I generated the view of notifications (where they are grouped) in the background and not on-the-fly

So what do you think about my proposed solution and my concerns? Please comment if you think I should mention anything else that would be relevant here.

Oh, we are using PHP for our page, but that shouldn't be a big factor here I think.

Jan Hančič
  • 53,269
  • 16
  • 95
  • 99
  • How much time it took you to build this notification system as one man efforts. I just want to have an estimation to make the timelines accordingly. – Shaharyar Nov 24 '15 at 10:55
  • @Shaharyar I think it depends on the complexity of the notification system. –  Apr 20 '16 at 06:43
  • I used the same system with MySQL to build a priority based notification system. The good thing is that it scales to a few thousand users, if it goes more than that, it blows up, specially with Android and GCM. I would like to know alternatives to MySQL like redis, rabbitMQ, Kafka which naturally exhibit a message queue, kind of functionality. – Ankit Marothi Jan 23 '17 at 13:57

4 Answers4

167

A notification is about something (object = event, friendship..) being changed (verb = added, requested..) by someone (actor) and reported to the user (subject). Here is a normalized data structure (though I've used MongoDB). You need to notify certain users about changes. So it's per-user notifications.. meaning that if there were 100 users involved, you generate 100 notifications.

╔═════════════╗      ╔═══════════════════╗      ╔════════════════════╗
║notification ║      ║notification_object║      ║notification_change ║
╟─────────────╢      ╟───────────────────╢      ╟────────────────────╢
║ID           ║—1:n—→║ID                 ║—1:n—→║ID                  ║
║userID       ║      ║notificationID     ║      ║notificationObjectID║
╚═════════════╝      ║object             ║      ║verb                ║
                     ╚═══════════════════╝      ║actor               ║
                                                ╚════════════════════╝

(Add time fields where you see fit)

This is basically for grouping changes per object, so that you could say "You have 3 friend requests". And grouping per actor is useful, so that you could say "User James Bond made changes in your bed". This also gives ability to translate and count notifications as you like.

But, since object is just an ID, you would need to get all extra info about object you want with separate calls, unless object actually changes and you want to show that history (so for example "user changed title of event to ...")

Since notifications are close to realtime for users on the site, I would tie them with nodejs + websockets client with php pushing update to nodejs for all listeners as change gets added.

Artjom Kurapov
  • 6,115
  • 4
  • 32
  • 42
  • +1 @Artjom, this is an awesome post. I was wondering, you mention `object` is just an id. How do you know what type of object it is? e.g it could be a friend request, or anything else, so how do you link the notification to the appropriate event? Many thanks – Mick Apr 29 '13 at 01:14
  • 1
    notification_object.object identifies change type, like a string "friendship" The actual reference to changed object with its extra data that I talk about is in notification_change.notificationObjectID – Artjom Kurapov Apr 29 '13 at 07:35
  • I see, that's very smart. So, do you ever need to store the class of the object anywhere in the database then? Or do you have a way of retrieving the class from the string stored in `notification_object.object` (like 'friendship)? because, if we wish to get some information on the notification object, we need to load it using its class and its identifiers. Am I right? Do you have a trick for this? like an event listener, or something that converts `notification_object.object` and its identifier into the actual object? Many thanks @Artjom, this is huge help :) – Mick Apr 29 '13 at 08:45
  • This is just a schema for DB, if you want data to be used as objects with some class inheritance or active record, naturally you have to somehow map what class corresponds to what string value. In php, I was fine with plain arrays and loading extra information for each object as I saw fit in simple switch with that string. Eg. if its a "friendship", then I needed to load user data to make up proper message with a name, link to profile etc. – Artjom Kurapov May 10 '13 at 09:04
  • what could be possible criteria for aggregation of "notification_changes"? – machete Oct 04 '13 at 20:41
  • 2
    This may be a dumb question but with this set up what do you do once the user has seen or acted on the notification? Do you just remove it from the database or just use dates to see if the user has logged in since the notification was created? – Jeffery Mills Feb 17 '14 at 03:16
  • 4
    I know this topic is already quite old, however i'm a bit puzzled about the first table, what excactly is the purpose of this table? what is the advantage of having this as a separate table versus putting the userID in the notification_object table? In other words when will you create a new entry in notification and when will you just add an object and change to an existing notification with this structure? – Bas Goossen Feb 27 '14 at 18:48
  • 3
    @JefferyMills You could have a status field like `is_notification_read` in the `notification` table and mark it appropriately if it is `unread`, `read` or `deleted`. – Kevin Jun 15 '14 at 14:42
  • 1
    This is quite old but I don't really understand. Do you have any example of schema (foobar)? – Vinz243 Aug 16 '14 at 13:49
  • I don't understand either. How is notification_change linked to the other two tables? – Mark Murphy Feb 25 '15 at 18:43
  • 2
    I also have struggled to understand some aspects of this solution, and made a separate question about it: http://dba.stackexchange.com/questions/99401/understanding-a-notification-system – user45623 May 04 '15 at 06:37
  • btw. if you are wondering about how to send/store notifications across browsers/users and you are not fond of setting up socket.io/nodejs on your site, you could use pusher.com to send events or firebase.com to send/store them – Artjom Kurapov Jun 08 '16 at 07:22
  • 1
    We've designed similar system and documented it here - http://blog.osmosys.asia/2017/04/26/design-notification-system/ – Abijeet Patro Jul 18 '17 at 07:30
  • @AbijeetPatro I'm referring your blog, nice detailed explanation, my question is **how/where** you have mapped `action_type/s` based on `entity_type_id` ? e.g:- action_type can be known from `entity_type_id`, i.e. `action_type` can be ‘added a’, ‘updated the’, ‘deleted the’, Thanks a lot – Shantaram Tupe Nov 17 '17 at 06:33
  • This link is broken blog.osmosys.asia/2017/04/26/design-notification-system try with this one http://blog.osmosys.asia/design-notification-system/ – Jason Glez Jul 19 '18 at 21:40
27

This is really an abstract question, so I guess we are just going to have to discuss it instead of pointing out what you should or shouldn't do.

Here's what I think about your concerns:

  • Yes, a notification system is complex, but not as hell though. You can have many different approaches on modeling and implementing such systems, and they can have from a medium to a high-level of complexity;

  • Pesonally, I always try to make stuff database-driven. Why? Because I can guarantee having full control of everything that's going on - but that's just me, you can have control without a database-driven approach; trust me, you are gonna want control on that case;

  • Let me exemplify a real case for you, so you can start from somewhere. In the past year I've modeled and implemented a notification system in some kind of a social network (not like facebook, of course). The way I used to store notifications there? I had a notifications table, where I kept the generator_user_id (the ID of the user that is generating the notification), the target_user_id (kind of obvious, isn't it?), the notification_type_id (that referenced to a different table with notification types), and all that necessary stuff we need to fill our tables with (timestamps, flags, etc). My notification_types table used to have a relation with a notification_templates table, that stored specific templates for each type of notification. For instance, I had a POST_REPLY type, that had a template kind of like {USER} HAS REPLIED ONE OF YOUR #POSTS. From there, I just treated the {} as a variable and the # as a reference link;

  • Yes, performance should and must be ok. When you think of notifications you think of server pushing from head to toe. Either if you are going to do it with ajax requests or whatever, you are gonna have to worry about performance. But I think that's a second time concern;

That model that I've designed is, of course, not the only one that you can follow, neither the best as well. I hope my answer, at least, follows you into the right direction.

Daniel Ribeiro
  • 10,156
  • 12
  • 47
  • 79
  • Why wouldn't I have control with some other data store? – Jan Hančič Mar 16 '12 at 11:40
  • Well, I didn't say that. What I said is that I can only guarantee data control with a database-driven approach; but that's just me. I'm gonna rephrase that. – Daniel Ribeiro Mar 16 '12 at 11:44
  • @DanielRibeiro the placeholders ({...}) in the notification template need to replace data of placeholders from the different set of tables in the database for the different type of notifications. E.g. one template is "{user} has liked your photo.", another template is "Your {Pagename} has new like." Etc. {PageName} and {user} and other placeholders will map from the different database table, so what should be the schema to get the placeholders value dynamically. – Ashish Shukla Mar 27 '17 at 14:43
  • DanielRibeiro how you replaced placeholders as asked by @Ashish Shukla, – Shantaram Tupe Nov 16 '17 at 06:36
  • @AshishShukla have you used or replaced placeholders, and how ? – Shantaram Tupe Nov 16 '17 at 06:36
  • I don't think placeholders/templates in a specific language are the way to go as it doesn't allow for localisation. – CaptRisky Sep 12 '18 at 05:57
8
╔════════════════════╗
║notification        ║
╟────────────────────╢
║Username            ║
║Object              ║
║verb                ║
║actor               ║
║isRead              ║
╚════════════════════╝

This looks a good answer rather than having 2 collections. You can query by username, object and isRead to get new events(like 3 pending friend requests, 4 questions asked etc...)

Let me know if there is problem with this schema.

Kaphy
  • 459
  • 4
  • 10
4

I personally don't understand very well the diagram for the accepted answer, So I'm going to attach a database diagram base on what I could learn from the accepted answer and other pages.

enter image description here

Improvements are well received.

Jason Glez
  • 1,254
  • 1
  • 16
  • 16
  • Seems like message_template would be in the NotificationType table. Also seems like the main_url would be in the notifications table, then you could eliminate the Notification_Message table. Can you explain the reason you have the NotificationMessage table on its own? – Jeff Ryan Sep 25 '18 at 21:14