I'm looking for an easy way to save the browsing history of each user who visits my website, but only once they have left the site. While saving it, I want to link together the pages that were visited during the same session, so that I can measure the correlation between different pages.

The reason I want to save the data at the end of the session is performance: I don't want the overhead of writing to tables (possibly heavily indexed) on every page load. Saving at the end of the session (or after the user leaves the site) also has the advantage that search engine spiders, which keep no state or session, can easily be filtered out: they will never have more than one page in their session history.

Sidenote: I'm using the Yii framework in PHP 5.3 for my website.

I thought of a couple of solutions for my problem:

  1. Make an Ajax request in onbeforeunload while the user is leaving the site.

    I check whether the user stays on the site by setting a variable in the onclick event of every <a> whose href attribute links to an internal page. Unfortunately this JavaScript approach is not foolproof, because the browser's Back and Forward buttons do not set this variable. It does have the added benefit that the session data is normally still available during this "last (Ajax) request".

  2. Write my own Session Handler

    This would be the more cumbersome solution. I haven't used a custom session handler in Yii yet, and I am a little unsure whether I can (for example) easily load and process expired session data before it is deleted in the overridden gcSession() function. This method could also be fired when a new user enters the website, leaving them with the added loading time of processing multiple session histories at once (which I would like to avoid).

  3. Use a scheduled task / cron job

    This has the benefits of both, but carries the risk of using a lot of CPU resources. I also suspect that it would need a lot of extra handling. For this solution I would still want to use the Yii framework and (if possible) the same application context to execute the cron job.

I would really appreciate any information that can help me choose between the above solutions. Maybe I've overlooked a solution or a possibility? I would really like to fire an event after a set timeout in PHP without making the user wait, as sleep() would for example. All in all, I believe asynchronous execution in PHP (initiated from a single synchronous request) is what I'm looking for. Any suggestions?

Tommy Bravo

2 Answers

  1. Absolutely not. You're going to run into browser compatibility issues: no-fires (misfires) with certain browsing operations, like closing tabs vs. closing the application, or the Back/Forward buttons. No, just no. Not to mention the security implications... NEVER TRUST THE CLIENT

  2. I really don't think this is wise in general. You're going to go through mayhem just to keep your sessions organized, especially when it comes to linking up sessions. It might be an interesting deep-dive into Yii innards, but personally I don't feel it fits.

  3. Ultimately this is the only solution in my eyes. I don't see why you think this will be a big CPU load. You can run this through Yii by using a Yii command-line (console) command (Docs), which should be able to piggyback on your web code.

    Simply set up a cron job to go through active sessions periodically and write them to the DB.
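
As a rough sketch of what the core of such a cron job might do, here is some plain PHP (not Yii-specific, and written in PHP 5.3-compatible syntax). It assumes the page history of each session has already been loaded as an array of paths; how you load it depends on your session storage. `countPagePairs` is a hypothetical helper name, not part of Yii or PHP. It counts how often two pages show up in the same session and skips single-page sessions, which is how the question proposes to filter out spiders.

```php
<?php
/**
 * Count how often each unordered pair of pages was visited in the same session.
 * Hypothetical helper for illustration only.
 *
 * @param array $sessions list of sessions, each a list of page paths
 * @return array map of "pageA|pageB" => number of sessions containing both
 */
function countPagePairs(array $sessions)
{
    $pairs = array();
    foreach ($sessions as $pages) {
        // A page visited several times in one session counts once.
        $pages = array_values(array_unique($pages));
        if (count($pages) < 2) {
            continue; // single-page sessions (e.g. spiders) carry no pair information
        }
        sort($pages); // fixed order, so A|B and B|A share one key
        $n = count($pages);
        for ($i = 0; $i < $n - 1; $i++) {
            for ($j = $i + 1; $j < $n; $j++) {
                // Assumes paths never contain '|'; pick another separator if they can.
                $key = $pages[$i] . '|' . $pages[$j];
                $pairs[$key] = isset($pairs[$key]) ? $pairs[$key] + 1 : 1;
            }
        }
    }
    return $pairs;
}

// Made-up example data: three sessions, the last one a single-page visit.
$sessions = array(
    array('/home', '/products', '/contact'),
    array('/home', '/products'),
    array('/home'),
);
$pairs = countPagePairs($sessions);
print_r($pairs);
```

The resulting counts could then be written to the DB in one batch per cron run, instead of touching the tables on every page load.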

That's just my 2c

Aren
  • Thanks! I guess I just needed to hear that. Any idea how to access the active session data in a cron job without the need for a custom session handler? – Tommy Bravo Sep 14 '11 at 06:54
  • I think I found the solution to my problem. The cron job is the way to go. The session data can be loaded by using [this method](http://stackoverflow.com/questions/1699423/can-php-cron-jobs-access-session-variables-cookies/1699471#1699471). To avoid loading every session each time the cron job runs, I will save the timestamp of the last activity for each session in a `MYISAM` table. By using the `INSERT DELAYED` statement for this table, it won't directly cost me any page-loading time. – Tommy Bravo Sep 14 '11 at 12:01
  • Or you could just mark the session as stale by setting a value in the session; then the cron job just finds all sessions that are stale and persists them in one go. – Aren Sep 14 '11 at 17:45
  • Yes, but then I still wouldn't know the session IDs in my cron job (even though I need all of them). – Tommy Bravo Sep 16 '11 at 09:26

Instead of all this stuff, I would just store in the session every page that the user visited.

When the session expires (how do you capture this event? I don't know, but I guess it is feasible), you store the visited pages in the DB. If you fear too much load on the server, you can also write them to a temp file, which a cron job then processes during the night.
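
To make the "store every visited page in the session" part concrete, here is a minimal sketch in plain PHP (5.3-compatible). The function name `recordPageVisit` and the session keys `history` and `lastActivity` are made up for illustration. The session array is passed by reference, so in a real request you would hand in `$_SESSION` after `session_start()`; here a plain array stands in for it.

```php
<?php
/**
 * Append the current page to a per-session history list.
 * Hypothetical helper: 'history' and 'lastActivity' are illustrative keys.
 *
 * @param array  $session session data, passed by reference (e.g. $_SESSION)
 * @param string $path    path of the page being requested
 * @param int    $now     current Unix timestamp
 */
function recordPageVisit(array &$session, $path, $now)
{
    if (!isset($session['history'])) {
        $session['history'] = array();
    }
    // Skip immediate reloads of the same page; end() returns false on an empty list.
    $last = end($session['history']);
    if ($last !== $path) {
        $session['history'][] = $path;
    }
    // Remember when the session was last active, so a later batch job
    // could tell which sessions have gone idle.
    $session['lastActivity'] = $now;
}

// Stand-in for $_SESSION, with made-up paths and timestamps.
$session = array();
recordPageVisit($session, '/home', 1000);
recordPageVisit($session, '/home', 1005);     // reload, not recorded twice
recordPageVisit($session, '/products', 1010);
print_r($session['history']);
```

This only touches the session on each page load, so the per-request cost stays small; the expensive write to the DB or temp file is left to whatever runs when the session ends.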

I agree with Aren's reaction to the HTML/Ajax approach: it is way too crazy and unreliable. I think you can do this better server-side.

Matthieu Napoli