Garbage collecting sessions in PHP

In PHP, sessions are by default stored in a file in a directory. Sessions can be specifically destroyed from within the code, for example when users logout explicitly, but frequently they do not. As a result session files tend to hang around, and cause the problem of how to clean them up. The standard way is to use PHP’s own garbage collection which is normally enabled out of the box. In this, we define constants that specify the maximum lifetime for the session, and essentially the probability of clean up.

To make things more interesting, Debian, out of the box doesn’t do garbage collection in this way. It has a cron job that regularly erases session files in the defined session directory. But, if like me, and many others, you put your session files in a different directory for each application to avoid clashes on namespaces for two applications running under the same browser from the same server, you have a problem. If you forget Debian’s behaviour the session files will just grow indefinitely. I had forgotten this issue and found over a year’s worth of session files in a directory recently.

Solving this problem is actually quite difficult to do optimally. I mean, I could create a cron job to mirror Debian’s own, but then I’d have to put the maximum lifetime in a cron job somewhere, out of the way, and difficult for the average sys admin I’m working with to find and deal with. (That is, away from the main configuration of the project). Or I could parse this value out of the main configuration. But this leads to another problem. For some users, a 30 minute maximum idle time is acceptable (although in my case where actually a suite of applications are being used as a single gestalt entity that can even be a problem), but for many of my administrator users you need huge idle times, since they are used to logging in first thing, and periodically working at the application through the day.

In the end I settled on changing our framework to make it easy to pass through garbage collection values. This makes an interface to the configuration really easy, but it doesn’t solve the problems of long session times that not all users need, and huge delays in garbage collection.

In my last article I talked about a Munin plugin for OPUS, but when you look at it you’ll see these kind of cliff fall drops, which are caused by the garbage collection finally kicking in and removing sessions where users have not explicitly logged out. Currently, every ten minutes, OPUS runs through its user database and finds users who are allegedly online but have no active session file and then marks them offline. Then it updates the file with the online user count that Munin reads.

I suspect eventually, I will write a more sophisticated script that actually kills sessions depending upon idle time and the user class, which would make for a more accurate picture here. Any brighter ideas gratefully accepted.

My first Munin plugin

Munin is a great, really useful project for monitoring all sorts of things on servers over short and long term periods, and can help identify and even warn of undue server loads. It is also appropriately and poetically named for one of Odinn’s crows (so I suppose I should have written this on a Wednesday).

We’ve been running Munin on one of our production servers at work for quite some time, and it gives us a lot of confidence that, to say the least, the server is running in its comfort zone around the clock. Among other bits and pieces, we run OPUS and the PDSystem on this box, two of our home grown projects that are available to the students. For some time now I’ve considered writing a plugin for OPUS to show logged in users, and I finally did this, albeit the counts are not nearly so reliable as I’d like for two reasons, but I’ll probably discuss that in another post. Anyway, I arranged for OPUS to drop a simple text file which simply contains counts of online users with the syntax

student: 10
admin: 2

and so on, for each of the categories of users. Then I needed a plugin to deal with this. I decided to write it simple shell script, since its portable and I’m not much of a perl fan.


# Munin plugin for OPUS showing online users
# Copyright Colin Turner
# GPL V2+

# Munin plugins, at their simplest, are run either with "config" or
# no parameters (I plan to add auto configuration later).
case $1 in
  # In config mode, we spout out details of the graphs we will have
  # I want one graph, with lots of stacked values. The first one is
  # an AREA, and the others are stacked above them. I also (-l 0)
  # make sure the graph shows everything down to zero.
	cat <<'EOM'
graph_title OPUS online users
graph_args -l 0
graph_vlabel online users
graph_info The number of online users on OPUS is shown.
student.label student
student.min 0
student.draw AREA
staff.label academic
staff.min 0
staff.draw STACK
company.label hr staff
company.min 0
company.draw STACK
supervisor.label supervisor
supervisor.min 0
supervisor.draw STACK
admin.label admin
admin.min 0
admin.draw STACK
root.label root
root.min 0
root.draw STACK
application.label application
application.min 0
application.draw STACK
	exit 0;;

# Now the plugin is being run for data. Bail if the file is unavailable
if [ ! -r /var/lib/opus/online_users ] ; then
     echo Cannot read /var/lib/opus/online_users >&2
     exit -1

# Otherwise, a quick sed converts the default format to what Munin needs
cat /var/lib/opus/online_users | sed -e "s/:/.value/"

The plugin has now been running for several days, and you can see its output here. There are problems with it, but that’s more to do with PHP, Debian and user choice, and I’ll comment on that another time. However, already it gives me a useful feel for a lot of user behaviour.

Writing Munin plugins is easy, and Munin does so much of the hard work of turning your creation into something useful.