Tag Archives: python

Implementing configurable work-flow patterns in Python Django

In my previous article, I discussed some of the changes I’ve made to my WAM software to handle assessment and work-flow. I thought I’d have a look at this from the technical side for those interested in doing something similar. The approach obviously extends to general workflow management, where you might want to tweak the workflow later without diving into code.

My challenge was to consider how not to hard code a work-flow, but to have something that would be configurable, in my case in a SQL layer because I’m using Python and Django.

I had an idea about the work-flow I wanted, and it looked a bit like this (carefully sketched on my tablet). These nodes are particular states, so this isn’t really a flow chart, as decisions aren’t shown. What is shown is what states can progress to the next ones. But I wanted to be able to change the pattern of nodes in the future, or rather, I wanted users to be able to do this without altering the code. I also wanted to work out who could do what, and who should know about what.

Workflow Example

Understanding States

The first thing I did was to create a State model class, and I guess in my head I was thinking of Markov Models.

As you can see, I created variables that told me the name of the state, and an opportunity for a more detailed description. I then wanted to be able to specify who could do certain things, and be notified. So, rather than a long series of Booleans, I went for a text field – the work-flow won’t be edited very often, and when it is, it should be by someone who knows what they are doing. So it’s just a comma-separated text field. For instance.

will indicate that the Module Coordinator and Moderator should be involved (this is an HE example, but the principle is quite extensible).

So the actors field will specify which kinds of people can invoke this state, and the notify field those who should get to hear about it.
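A minimal sketch of how a comma-separated field like this could be read. The role names and the helpers `parse_roles` and `can_invoke` are hypothetical, chosen for illustration rather than taken from WAM:

```python
def parse_roles(field_value):
    """Split a comma-separated text field into a list of clean role names."""
    return [role.strip() for role in field_value.split(",") if role.strip()]


def can_invoke(state_actors, user_roles):
    """Return True if any of the user's roles appears in the state's actors field."""
    allowed = set(parse_roles(state_actors))
    return any(role in allowed for role in user_roles)


# Hypothetical field values for a single state.
actors = "coordinator, moderator"
notify = "coordinator,moderator, examiner"

print(parse_roles(notify))
print(can_invoke(actors, ["moderator"]))
```

Stripping whitespace and skipping empty items makes the field forgiving of stray spaces or a trailing comma, which matters when it is edited by hand.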

I want to draw your attention to this bit:

What on earth does this do? It allows a Django model to have a Many to Many relationship with itself. In other words, it lets me associate a number of states with this one. Please also note the presence of

This is most easily explained by comparison to the Facebook and Twitter friendship model. Both of these essentially link a User model in a many to many relationship with itself.

Facebook friends are symmetrical, once the link is established, it is two way. Twitter followers are not symmetrical.

I wanted to establish which successor states could be invoked from any given one, and this should not be symmetrical by default. You can see in my example graph above that I want it to be possible to move from state A to either B or C, but this is not entirely symmetric: it is possible to move from B to A, but it should not be possible to go from C to A. Without symmetrical=False, each link will create an implied link back (all arrows in my state diagram would be bi-directional), which would be problematic. By establishing the relationship as asymmetric we can allow a reciprocal link (as is possible in Twitter, and our A and B example), but we don’t enforce it, so that we prevent back-tracking in the work-flow where it should not be allowed (as in our A and C example).
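In Django this kind of relationship is declared with `models.ManyToManyField('self', symmetrical=False)`. The effect can be illustrated without a database as a plain directed graph. The sketch below assumes the A, B and C states from the diagram, and the helper names are hypothetical:

```python
# Successor states as a directed graph: a link from A to B does not
# imply a link from B to A (the same idea as symmetrical=False).
successors = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": set(),
}


def can_move(current, target):
    """Return True if `target` is an allowed successor of `current`."""
    return target in successors.get(current, set())


def make_symmetrical(graph):
    """Show what implied back-links would do: every arrow becomes two-way."""
    result = {state: set(targets) for state, targets in graph.items()}
    for state, targets in graph.items():
        for target in targets:
            result.setdefault(target, set()).add(state)
    return result


print(can_move("A", "C"))  # allowed
print(can_move("C", "A"))  # not allowed in the asymmetric graph
print("A" in make_symmetrical(successors)["C"])  # the unwanted back-link
```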

Invoking States

I then created another model to keep track of which states were invoked.

This model allows me to work out who (the signed_by field) invoked a particular AssessmentState, when, and with any particular notes.

I also added a field to record when a notification (notified) had been sent. On creation, I leave that field as null. One of the many glorious things about Django is that its infrastructure for custom management commands allows you to easily build command line tools for doing cron tasks while your web front end runs without interruption. I found this rather awkward, but not impossible, in PHP, but in Django the whole thing is very organic, and you get access to all your models. If you have pushed plenty of your logic into the Model layer and not the View layer, this can really help.

In my new custom command I can easily work out which signoffs have not been notified yet:

I can then act upon those, send notifications, and if that’s successful, set the notified field to the time at which I sent them.
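The pattern of finding un-notified sign-offs, sending, and only then stamping the time can be sketched in plain Python. The `SignOff` dataclass and `send_pending` function are hypothetical stand-ins, not the WAM models; in the Django ORM the selection step would presumably be a filter such as `notified__isnull=True`:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SignOff:
    """Hypothetical stand-in for a sign-off record."""
    signed_by: str
    notes: str
    notified: Optional[datetime] = None  # None means "not yet notified"


def send_pending(signoffs, send):
    """Send a notification for each un-notified sign-off.

    `send` is any callable returning True on success; only then is the
    `notified` timestamp recorded, so failures are retried next run.
    """
    sent = 0
    for signoff in signoffs:
        if signoff.notified is not None:
            continue
        if send(signoff):
            signoff.notified = datetime.now()
            sent += 1
    return sent
```

Because the timestamp is only set after a successful send, running the command again from cron picks up anything that failed last time.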

Further Reading

In this article I have concentrated on the Model layer, with a few other observations, and in particular the relationship from a State model to itself.

All of the Forms and Views are available within my GitHub repository for the project. They aren’t a work of art, but if you have any questions feel free to look there, or get in touch.

I hope that might be helpful to someone facing the same challenge, and do feel free to suggest how I could have solved the problem more elegantly.

Assessment handling and Assessment Workflow in WAM

Some time ago I began writing a Workload Allocation Modeller aimed at Higher Education, and I’ve written some previous blog articles about this.

As is often the way, the scope of the project broadened and I found myself writing in support for handling assessments and the QA processes around them. At some point this necessitates a new name for WAM to something more general (answers on a post card please) but for now, development continues.

Last year I added features to allow Exams, Coursework, and their Moderation and QA documents to be uploaded to WAM. This was generally reasonably successful, but a bit clunky. We gave several External Examiners access to the system and they were able to look in at the modules for which they were an examiner and the feedback was pretty good.

What Worked

One of the things that worked best about last year’s experiment was that we put in information about the Programmes (Courses) each Module was on. It’s not at all unusual for many Programmes to have the same Module within them.

This can cause a headache for External Examination since an External Examiner is normally assigned to a Programme. In short, the same Module can end up being looked at by several Examiners. While this is OK, it can be wasteful of work, and creates potential problems when two Examiners have a different perspective on the Module.

So within WAM, I put in code an assumption of what we should be doing in paper based systems – that every Module should have a “Lead Programme”. The examiner for that Programme should be the one that has primacy, and furthermore, where they are presented with other Modules on the Programme for which they aren’t the “lead” Examiner, they should know that this is for information, and they may not be required to delve into it in so much detail – unless they choose to.

This aspect worked well, and the External Examiners have a landing screen that shows which Modules they are examining, and for which they are the lead Examiner.

What Didn’t Work

I had written code that was intended to look at what assessment artefacts had been uploaded since a user’s last login, and email them the relevant stuff.

This turned out to be problematic, partly because one had to unpick who should get what, but mostly because I’m using remote authentication with Django (the Python framework in which WAM is written), and it seems that the last login time isn’t always updated properly when you aren’t using Django’s built in authentication.

But the biggest problem was a lack of any workflow. This was a bit deliberate since I didn’t want to hardcode my School or Faculty’s workflow.

You should never design your software product for HE around your own University too tightly. Because your own University will be a different University in two years’ time.

So, I wanted to ponder this a bit. It made visibility of what was going on a little difficult. It looked a bit like this (not exactly, as this is a screenshot from a newer version of an older module):

Old view of Assessment Items

with items shown from oldest at the bottom to newest at the top. You can kind of infer the workflow state by the top item, and indeed, I used that in the module list.

But staff uploaded files they wanted to delete (and that was previously disallowed for audit reasons) and the workflow wasn’t too clear and that made notifications more difficult.

What’s New

So, in a beta version of 2.0 of the software I have implemented a workflow model. I did this by:

  • defining a model that represented the potential states a Module could be in, each state defines who can trigger it, and what can happen next, and who should be notified;
  • defining a model that shows a “sign off” event.

Once it became possible to issue a “sign off” of where we were in the workflow, a lot of things became easier. This screenshot shows how it looks now.

Example of new assessment workflow

Ok, it’s a bit of a dumb example, since I’m the only user triggering states here (and I can only do that in some cases since I’m a Superuser, otherwise some states can only be triggered by the correct stakeholder – the moderator or examiner).

However, you can see that now we can still have all the assessment resources, but with sign offs at various stages. The sign off could (and likely would) have much more detailed notes in a real implementation.

This in turn has made notification emails much easier to create. Here is the email triggered by the final sign off above.

The detailed notes aren’t shown in the email, in case other eyes are on it and there are sensitive comments.

All of this code is available at GitHub. It’s working now, but I’ll probably do a few more bits before an official 2.0 release.

I will be demoing the system at the Royal Academy of Engineering in London next Monday, although that will focus entirely on WAM’s workload features.

Migrating Django Migrations to Django 2.x

Django is a Python framework for making web applications, and it’s impressive in its completeness, flexibility and power for speedy prototyping.

It’s also an impressive project for forward planning: it has a kind of built in “lint” functionality that warns about deprecated code that will be disallowed in future versions.

As a result when Django 2.0 was released I didn’t have to make many changes to my app code base to get it to work successfully. However, today when I tried to update my oldest Django App (started in Django 1.8x) I hit an unexpected snag. The old migrations were sometimes invalid. Curiously I don’t think this problem emerged the last time I tried.

Django uses migrations to move the database schema from one version to the next. Most of the time it’s a wonderful system. In the rare case it goes wrong it can be … tricky. Today’s problem is quite specific, and easier to fix.

Django 2.0 enforces that ForeignKey fields explicitly specify a behaviour to follow on deletion of the object pointed to by the key. In general, whether we Cascade the deletion or set the field to Null, getting the behaviour right can be important, particularly on fields where a Null value has a legitimate meaning.

But a bit of a sting in the tail is that an older Django project may have migrations created automatically by Django which don’t obey this. I discovered this today and found I couldn’t proceed with my project unless I went back and modified the old migrations to be 2.0 compliant.

So if this happens to you, here are some suggestions on fixing the problem.

You will know if you have a problem if when you try to run your test server, or indeed replace runserver by check

you get an error and output like this

I would suggest you try runserver whatever you did before as it will continue to try each time you save a file.

Open your code with your favourite editor, and open your models.py file (you may have several depending on your project), and the migration file that’s broken as above.

Looking in your migration file you’ll find the offending line. In this case it’s the last (non trivial) line below.

To ensure that your migrations will be applied consistently with your final model (well, as long as nobody tries to migrate to an intermediate state), look carefully in the correct model (Activity in this case), and see what decision you made for deletion there. In my case I want deletion of the ActivitySet to kill all linked Activity(s). So replicate the “on_delete” choice from there.
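If a project has many old migration files, a small script can list candidate lines before you start editing. This is only a rough sketch: the function name and the sample app label `loads` are hypothetical, and the parenthesis matching is naive (it ignores parentheses inside string literals), so treat the output as hints to check by hand:

```python
import re


def foreign_keys_missing_on_delete(source):
    """Return line numbers of ForeignKey(...) calls with no on_delete argument."""
    missing = []
    for match in re.finditer(r"\bForeignKey\(", source):
        depth = 1
        pos = match.end()
        # Walk forward to the parenthesis that closes this call.
        while pos < len(source) and depth > 0:
            if source[pos] == "(":
                depth += 1
            elif source[pos] == ")":
                depth -= 1
            pos += 1
        call_text = source[match.start():pos]
        if "on_delete" not in call_text:
            missing.append(source.count("\n", 0, match.start()) + 1)
    return missing


# Hypothetical fragment of an old migration file.
sample = (
    "field=models.ForeignKey(to='loads.ActivitySet', null=True),\n"
    "field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, "
    "to='loads.Activity'),\n"
)
print(foreign_keys_missing_on_delete(sample))
```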

Each time you save your new migration file the runserver terminal window will re-run the check, hopefully moving on to the next migration that needs to be fixed. Work your way through methodically until your code checks clean. Commit to source control, and you’re done.

Semi Open Book Exams

A few years ago, I switched one of my first year courses to use what I call a semi-open-book approach.

Open-book exams of course allow students to bring whatever materials they wish into them, but they have the disadvantage that students will often bring in materials that they have not studied in detail, or even at all. In such cases, sifting through materials to help them answer a question could be counter productive.

On the other hand, the real world is now an increasingly “open-book” environment, with huge amounts of information available to those in the workplace, which is now almost always Internet connected.

So I decided to look at another approach. Students are allowed to bring in a single, personalised, A4 sheet, on which they can write whatever they wish on both sides. There are a few rules:

  • the sheet must be written on “by hand”, that is to say, it cannot be printed to from a computer, or typed;
  • the sheet must be “original”, that is to say, it cannot be a photocopy of another sheet (though students may of course copy their original for reference);
  • the sheet must be the student’s own work, and they must formally declare as much (with a tick box);
  • the sheet must be handed in with the exam paper, although it is not marked.

The purpose of these restrictions is to ensure that each student takes a lead in producing an individual sheet, and to inhibit cottage industries of copied sheets.

In terms of what can go on the sheet? Well anything really. It can be sections from notes, important formulae, sample questions or solutions. The main purpose here is to prompt students to work out what they would individually distill down to an A4 page. So they go through all the module notes, tutorial problems and more, and work out the most valuable material that deserves to go on one A4 page. I believe that this process itself is the greatest value of the sheet, its production rather than its existence in the exam. I’m working on some research to test this.

So I email them each an A4 PDF, which they can print out at home, and on whatever colour paper they may desire. The sheet is individual and has their student number on it with a barcode, for automated processing and analysis afterwards for a project I’m working on, but this is anonymised. The student’s name in particular does not appear, since in Ulster University, it does not appear on the exam booklet.

The top of my sheet looks like this:

The top of a sample guide sheet.

So, if you would like to do the same, I am enclosing the Python script, and LaTeX that I use to achieve this. You could of course use any other technology, or not individualise the sheet at all.

For convenience the most recent code will also be placed on a GitHub repository here, feel free to clone away.

My script has just been rewritten for Python 3.x, and I’ve added a lot of command line parameters to decouple it from my own use at Ulster University. It opens a CSV file from my University which contains student id numbers, student names, and emails in specific columns. These are the defaults for the script but can be changed. For each student it uses LaTeX to generate the page. It actually creates inserts for each student containing the name and student number; you can then edit open-book.tex to make the page as you wish it. You don’t need to know much LaTeX to achieve this, but ping me if you need help. I am also using a LaTeX package to create the barcodes automatically.
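The step of turning CSV rows into per-student LaTeX inserts could look something like the sketch below. The column positions, the function name and the macro names `\studentnumber` and `\studentname` are assumptions for illustration, not what the actual script uses, and names containing LaTeX special characters would need escaping:

```python
import csv
import io


def latex_inserts(csv_text, id_column=0, name_column=1, has_header=True):
    """Yield (student_id, LaTeX snippet) pairs from CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the header row
    for row in reader:
        if not row:
            continue
        student_id = row[id_column].strip()
        name = row[name_column].strip()
        # Hypothetical macros that a template such as open-book.tex could use.
        snippet = (
            rf"\newcommand{{\studentnumber}}{{{student_id}}}" "\n"
            rf"\newcommand{{\studentname}}{{{name}}}" "\n"
        )
        yield student_id, snippet


data = "id,name,email\nB001,Ada Lovelace,ada@example.org\n"
for student_id, snippet in latex_inserts(data):
    print(student_id)
    print(snippet)
```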

I’ve spent a bit of time adding command line parameters to this script, but you can try using

for information. If you run it without parameters it will enter interactive mode and prompt you.

I’d strongly recommend running with the --test-only option at first to make sure all looks good, and opening open-book.pdf will show you the last generated page so you can see it’s what you want.

Anyway, feel free to do your own thing, or mutilate the code. Enjoy!

I use a LaTeX template for the base information; this can be easily edited to taste.

Workload Allocation Modelling Update – Scalability

I have been doing some more work on my software to handle Academic Workload Modelling, developing a roadmap for two future versions, one being modifications needed to run real allocations for next year without scrapping existing data, and another being code to handle the moderation of exams and coursework (which isn’t really anything to do with workload modelling, there’s some more mission creep going on).

Improvements to Task Handling

Speaking of mission creep, I noted in the last article that I’d added some code to capture tasks that staff members would be reminded of and could self-certify as complete. I improved this a lot, with richer detail about when tasks were overdue and UI improvements.

I wanted to automate some batch code to send emails from the system periodically. I discovered that a Django management command provided an elegant way to bring the batch mode code into the project, so that it could be called from cron through the usual manage.py script that Django creates to handle its own project-related tasks from the command line.

It was easy to use this framework to add command switches and configuration of verbosity (you might note I haven’t disabled all output at the moment so I can monitor execution at this stage). I have set this up to email folks on a Monday morning with all the tasks, but also on Wednesday and Friday if there are urgent tasks still outstanding (less than a week to deadline).
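The schedule described above is simple enough to sketch as a pure function. The function name and the `(name, deadline)` task shape are hypothetical; the rule itself (everything on Monday, only urgent tasks on Wednesday and Friday) follows the description:

```python
from datetime import date, timedelta

URGENT_WINDOW = timedelta(days=7)  # "less than a week to deadline"


def tasks_to_email(today, tasks):
    """Pick the open tasks to remind people about on a given day.

    `tasks` is a list of (name, deadline) pairs for tasks still open.
    """
    weekday = today.weekday()  # Monday is 0, Sunday is 6
    if weekday == 0:
        return list(tasks)
    if weekday in (2, 4):  # Wednesday and Friday
        return [(name, due) for name, due in tasks if due - today < URGENT_WINDOW]
    return []
```

Overdue tasks have a negative time to deadline, so under this sketch they count as urgent too.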

I’ve been using this functionality live and it has worked very well. I used Django templates to help provide the email bodies, both in HTML and plain text.

Sample Task Reminder Email

Issues of Scale

My early prototype handled data for one academic year, albeit with fields in the schema to try and solve this at a later stage. It also suffered from a problem in that if other Schools wanted to use the system, how would I disaggregate the data both for security and convenience?

In the end I hit upon a solution for both issues: a WorkPackage model that allows a range of dates (usually one academic year) and a collection of Django User Groups to be specified. This allows all manually allocated activities and module data to be associated with a package, and therefore to be invisible to other packages (users in other Schools, or in other Academic Years). I was also able to put the constants I’m using to model workload into the Django model, making it easier to tweak year on year.
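The idea of scoping data by date range and user group can be sketched without Django. This dataclass only mirrors the description above; the field and method names are assumptions, not the real WorkPackage model:

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class WorkPackage:
    """Hypothetical stand-in: a date range plus the groups allowed to see it."""
    name: str
    start: date
    end: date
    groups: set = field(default_factory=set)

    def covers(self, day):
        """Return True if `day` falls inside the package's date range."""
        return self.start <= day <= self.end

    def visible_to(self, user_groups):
        """Return True if the user shares at least one group with the package."""
        return bool(self.groups & set(user_groups))


def packages_for(user_groups, packages, day=None):
    """List the packages a user can see, optionally limited to one date."""
    return [
        package for package in packages
        if package.visible_to(user_groups)
        and (day is None or package.covers(day))
    ]
```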

I’m pretty much ready to use the system for a real allocation now without having to purge the test data I used this year. I can simply create a new WorkPackage.

I need to write some functionality to allow one package’s allocations to be automatically rolled over to the next as a starting point, but I reckon that’s maybe two or three more hours.

Future Plans for the Application

The next part of planned functionality is an ability to handle coursework and examination and the moderation process. It will be quite a big chunk of new functionality and moving the system again to something quite a bit bigger than just a workload allocation system.

This of course means I need a better Application name, (WAM isn’t so awesome anyway). Suggestions on a post card.

Django Issues

I think I’m getting more to grips with Django all the time – although I often have the nagging feeling I’m writing several lines of code that would be simpler if I had a better feel for its syntax for dealing with QuerySets.

The big problem I hit, again, was issues in migrations. I created and executed migrations on my (SQLite) development system, but when I moved these over to production (MySQL) it barfed spectacularly.

Once again the lack of idempotent execution means you have to work out what part of the migration worked and then tag the migration as “faked” in order to move onto the next. This was sufficient this time, and I didn’t have to write custom migrations like last time, but it’s really not very reassuring.

Further Details

As before, the code is on GitHub, and the development website on foss.ulster.ac.uk, if you want more details.

Manually completing a botched django migration

I wrote a lot of code for my Workload Allocation system on Friday, and had been developing it on the machine with django’s built in lightweight web server, and a (default) sqlite database backend. In production I decided to use a MySQL backend in case sqlite was, well, too lite.

One of the things that is really neat about django, but which also profoundly scares me, is that it handles changes to the database schema automatically. I am used to doing all of this by hand. It has been a pleasant change, but I wondered what would happen if it went wrong.

Which it did on Friday. The migrations had worked perfectly well on the development server and after some testing I decided to roll the code into production, whereupon the migration failed. I’m still not sure why, but something in the django deep magic failed. To make things worse the process is, I have discovered, not idempotent, and trying to run the migration again caused it to fail in new places because some of the database schema changes had been successful; so it was now bailing out with “already exists” kind of errors.

Removing some tables and trying again didn’t quite do the trick. I thought about trying to fix the schema manually, since with the mysql command line tool I could see what fields needed to be added, but upon inspection the constraints added by django were complex and I was unsure how important they were.

So this is my clumsy workaround, that will no doubt come back to haunt me.

I used the following commands from the top of the django app directory to find the name of the migration that was failing, and then used --fake to force django to forget about having to apply it.

I then created a “manual” django migration that added the new fields.

It turns out that getting the dependency right at the top is very important: it needs to be the previous migration.

The name of this script is important: follow the naming convention of your most recent failed migration, changing auto to custom and the timestamp appropriately. I discovered that django would not run this migration. It detected a conflict with the previous migration that should have created the fields and wanted me to try and merge them. That would be pointless since the previous migration failed. I also discovered to my surprise there was no --force command line switch to override this logic, though Google perhaps suggests that previous versions of django allowed this.

So, I used the sqlmigrate django command to output the SQL that it would produce if this migration did run. Once I got it showing in the shell, I redirected this to a file.

Finally I used the mysql command line tool

to get access to the database, and then used the following command to import and run the SQL produced above.

And so far so good. I had been getting Server Errors on pages relating to the botched model before and at the moment they seem to be behaving correctly. Hopefully this may help you and not come back to haunt me.

Workload Allocation Monitoring (WAM) Prototype

I decided to start writing a workload allocation monitoring system for Higher Education. I found one written as part of a JISC project at Cambridge, but despite my experience with PHP I found it difficult to set-up, a bit crude (sorry) and hard to maintain. It was clearly very flexible, and I wanted something flexible, simple and clean.

So I decided I’d try writing something quickly using the Python django framework. This is my first web-app written in Python and so I dare say I would do some things differently with more experience, but I have now reached the point where I have a workable prototype that I can start to use myself. I’ve got to say, I found django to be pretty neat.

At its heart is a list of the loads against Academic Staff in a department or school. The idea is to try and increase transparency. There are problems with this approach: some known irregularities of loading can be for confidential reasons; small numbers of staff with key skills can cause issues as well, but it is intended to provide a basis.

Overall loads for staff.

While classically the word semester implies that there are two of them, most Universities operate a three semester system with the third covering the Summer. Unevenness in loading over the Summer is another cause of potential trouble, so the system tries to show loading as spread across semesters. A scaled column accounts for staff who do not have a 100% FTE contribution but their hours are up-scaled for comparison.
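One plausible reading of the scaled column is that a part-time member of staff’s hours are divided by their FTE fraction, so that they can be compared with full-time colleagues. That formula is an assumption rather than something confirmed from the WAM code:

```python
def scaled_hours(hours, fte):
    """Up-scale hours for a part-time member of staff to a full-time equivalent.

    `fte` is the contract fraction, for example 0.5 for half time.
    """
    if not 0 < fte <= 1:
        raise ValueError("fte must be greater than 0 and at most 1")
    return hours / fte


# A half-time member of staff carrying 800 hours compares with a
# full-time colleague carrying twice as many.
print(scaled_hours(800, 0.5))
```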

Naturally staff will want to see some granularity of these loads and they are broken into individual activities that are allocated to given members of staff.

Breakdown of activities for a staff member.

An individual activity can be specified as occupying a number of hours, or alternatively a percentage of a staff member’s time. It can occupy one or more semesters (in which case it is spread evenly across them). Types can be allocated for activities to help track contributions of different types. It might be that an activity is related to a module or study, or not.
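As a worked sketch of that arithmetic: an activity given as a percentage is first converted to hours, and the total is then split evenly over the semesters it occupies. The annual hours figure and both function names are hypothetical placeholders:

```python
HOURS_PER_YEAR = 1600  # hypothetical full-time annual hours


def activity_hours(hours=None, percentage=None, annual_hours=HOURS_PER_YEAR):
    """Return an activity's hours, given either hours or a percentage of time."""
    if (hours is None) == (percentage is None):
        raise ValueError("give exactly one of hours or percentage")
    if hours is not None:
        return float(hours)
    return annual_hours * percentage / 100


def spread(total_hours, semesters):
    """Spread hours evenly across the chosen semesters (numbered 1 to 3)."""
    chosen = set(semesters)
    if not chosen:
        raise ValueError("an activity must occupy at least one semester")
    share = total_hours / len(chosen)
    return {number: (share if number in chosen else 0.0) for number in (1, 2, 3)}


print(spread(activity_hours(hours=90), [1, 2]))
print(spread(activity_hours(percentage=10), [1, 2, 3]))
```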

Activities are long term parts of work allocated hours or a percentage of time.

Speaking of modules, basic information is stored for these, along with something else I think will help: tracking the submission of exams and coursework through various QA processes.

At a glance the most recent information about the exam and coursework status can be seen.

While activities are considered to be events with long engagements, another issue for staff is tasks that are allocated to them, usually of comparatively short duration. It can be hard for staff to remember all of these tasks, and hard for managers to follow up their completion, especially without annoying staff who have completed them already.

Tasks can be allocated against individual members of staff or groups or both.

The web-app will allow tasks to be defined against one person, many people, categories of people and so on.

A list of tasks and their deadlines.

It is possible to easily see which tasks are still open and whether their deadline has come and gone.

The staff required to complete a task are shown, and those that have indicated completion. The system politely nags those still outstanding.

A look at a given task will show who has completed it and who still needs to.

A given staff member can sign off their own task.

It is often the case that admin and clerical staff check off colleagues who have responded to a given call, so the system allows for staff with given permissions to indicate someone as having completed the task. Alternatively the member of staff can do this for themselves.

So while it is still a bit rough and ready I’ve reached the point where the system is stable enough for use. Of course the challenge comes when we consider the assumptions to come up with the hours and percentage loading in the first place. So I hope to pick the brains of some colleagues about this and start testing the system.

I’ve yet to make a formal release, but the code is Affero GPL (you can use the code free of restrictions (and charge) but cannot deprive others of the same freedom on derivative works) so feel free to have a look at it.

My roadmap for an initial release can be found on foss.ulster.ac.uk, where I will eventually host the code as well, but at the moment it can be found at GitHub. My previous post detailed how to get the app to work with a central authentication system your University likely has, or something similar.

Yeah… design and CSS is not my strongest skill, more work to be done on that.

Share and enjoy.

Django, CAS authentication and Apache

I am certainly no stranger to Web Development, but I decided to really look at the Python web framework django in some detail last week to write a small web application for Workload Modelling for Academic Staff.

Yes, this is a geeky, programming post.

In doing so I ran into some trouble trying to get CAS authentication to work with the app. I tried using a django-cas client I found, having found no direct CAS support in django. This took a reasonable number of code modifications, in several source files (really only a pain because I would have to maintain both development code and production code on different authentication). However the critical problem was that while I could get authentication into the “userland” parts of the app, I was getting redirect issues with the django generated administration interface.

So, I found a totally different approach. Django does have generic remote user support built-in which I hadn’t initially found. There are some details here. As you can see there are only two lines of code needed to enable this support.

I found this worked without any drama when I used Apache to force the CAS authentication. So the code required (in version 1.8 of django) is simply as follows, in the settings.py file.
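Going by the Django documentation for `REMOTE_USER` authentication, the settings in question would look something like this sketch. The two remote user entries are the documented ones; the other middleware entries shown are an assumption about a typical project, and a real settings.py will list more:

```python
# settings.py (fragment)

MIDDLEWARE_CLASSES = [
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    # RemoteUserMiddleware must come after AuthenticationMiddleware.
    'django.contrib.auth.middleware.RemoteUserMiddleware',
]

# Trust the REMOTE_USER value set by the web server (Apache with CAS here).
AUTHENTICATION_BACKENDS = [
    'django.contrib.auth.backends.RemoteUserBackend',
]
```

Note that `MIDDLEWARE_CLASSES` is the setting name in Django 1.8; later versions renamed it to `MIDDLEWARE`.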

The Apache Configuration looks something like this.

You will need to ensure you have Apache’s CAS and wsgi modules installed and enabled too.

I wasted a couple of hours going around the houses on this one, so hopefully it may save you. I will be hosting the project for my modeller on foss.ulster.ac.uk along with the code once I move it from GitHub.

Python script to randomise an m3u playlist

While I’m blogging scripts for playlist manipulation here is one I use in a nightly cron job to shuffle our playlists so that various devices playing from them have some daily variety. All disclaimers apply, it’s rough and ready but WorksForMe (TM).

I have an entry in my crontab like this

which takes a static playlist and produces a nightly shuffled version.
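The core of such a shuffle can be sketched as follows. This is not the original script: the function name is hypothetical, and it assumes the extended m3u layout in which each `#EXTINF` line is immediately followed by its filename, so the pair has to move together:

```python
import random


def shuffle_m3u(text, rng=random):
    """Shuffle the entries of an m3u playlist, keeping #EXTINF lines with their files."""
    header = []
    entries = []
    pending = []  # directive lines waiting for their filename
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("#EXTM3U"):
            header.append(line)
        elif line.startswith("#"):
            pending.append(line)
        else:
            entries.append(pending + [line])
            pending = []
    rng.shuffle(entries)
    lines = header + [line for entry in entries for line in entry]
    return "\n".join(lines) + "\n"
```

Passing a seeded `random.Random` instance as `rng` makes the result repeatable for testing; any directive lines left at the end with no filename after them are dropped.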

Python script to add a file to a playlist

I have a number of playlists on Gondolin, which is a headless machine. I wanted to be able to easily add a given mp3 file to the playlists which are in m3u format. That means that each entry has both the filename and an extended line with some basic metadata, in particular the track length in seconds, the track artist and name. I wanted a script that could extract this information from the mp3 file and make adding the entry easy. So I wrote this in Python. It’s rough and ready and it is probably not very Pythonic but it’s working for me. The script should create a playlist if it doesn’t currently exist, and check for a newline at the end of the file so that the appended lines are really on a new line. ItWorksForMe (TM).
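The file handling described above (create the playlist if missing, make sure the file ends with a newline, then append the two-line entry) can be sketched like this. To keep it self-contained the metadata is passed in as arguments rather than read from the mp3, and the function names are hypothetical:

```python
import os


def extinf_entry(path, seconds, artist, title):
    """Build the two lines of an extended m3u entry."""
    return f"#EXTINF:{int(seconds)},{artist} - {title}\n{path}\n"


def append_to_playlist(playlist, path, seconds, artist, title):
    """Append an entry to `playlist`, creating the file if needed."""
    new_file = not os.path.exists(playlist) or os.path.getsize(playlist) == 0
    needs_newline = False
    if not new_file:
        # Check whether the existing file already ends with a newline.
        with open(playlist, "rb") as handle:
            handle.seek(-1, os.SEEK_END)
            needs_newline = handle.read(1) != b"\n"
    with open(playlist, "a", encoding="utf-8") as handle:
        if new_file:
            handle.write("#EXTM3U\n")
        elif needs_newline:
            handle.write("\n")
        handle.write(extinf_entry(path, seconds, artist, title))
```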

This uses the eyeD3 Python library, which on Debian is provided in python-eyed3.

My basic usage is

The last parameter is the path relative to which the mp3 filename should be written. This is useful for me because I rsync the whole tree between machines. As you will see, there are options for writing an absolute pathname if you prefer. I should probably rewrite the script to do it relative to the playlist, but that’s another day.