Customising Django Entra ID Authentication
I recently had need to add Azure / Entra ID authentication to my Workload and Assessment Modelling (WAM) app, which is written in the rather excellent Django Python framework. After a little bit of desk research it seemed clear that django-auth-adfs looked like a suitable package to enable this, and it was pretty quick to add this to my app and get basic authentication working.
Previously WAM was using a legacy CAS method to authenticate University users - which worked, but never provided much more than a REMOTE_USER variable containing a username and the fact that the user was authenticated. I had hoped that this transition would allow much richer user data to be brought into WAM - especially when auto creating new users. In a large organisation, having new users simply created in a system with no context or home is a bit problematic.
In theory, django-auth-adfs supports this through the CLAIM_MAPPING stanza in its configuration. The relevant bit looks something like this.
AUTH_ADFS = {
    # There is some config above this
    'CLAIM_MAPPING': {'first_name': 'given_name',
                      'last_name': 'family_name',
                      'email': 'upn',
                      'staff': {'staff_number': 'onPremisesSamAccountName',
                                'job_title': 'jobTitle'},
                      },
    # And some config below
}
This reads values from the Entra ID "claims" (the names on the right of each pair) and maps them into fields in the Django User model (the names on the left). It even has support for secondary models that extend the User class (like the Staff example above). Hopefully that is all you will need.
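To make the mapping concrete, here is a minimal sketch of the idea in plain Python. It is not the django-auth-adfs implementation: apply_claim_mapping is a made-up helper, the claim values are invented, and SimpleNamespace objects stand in for Django model instances.

```python
from types import SimpleNamespace

# Same shape as the CLAIM_MAPPING stanza: model field -> claim name,
# with a nested dict for a related model.
CLAIM_MAPPING = {
    "first_name": "given_name",
    "last_name": "family_name",
    "email": "upn",
    "staff": {
        "staff_number": "onPremisesSamAccountName",
        "job_title": "jobTitle",
    },
}


def apply_claim_mapping(obj, claims, mapping):
    """Hypothetical helper: copy claim values onto obj, recursing into nested mappings."""
    for field, claim in mapping.items():
        if isinstance(claim, dict):
            # Nested dict: update the related object (e.g. user.staff)
            apply_claim_mapping(getattr(obj, field), claims, claim)
        elif claim in claims:
            setattr(obj, field, claims[claim])


# Invented claims, purely for illustration
claims = {
    "given_name": "Ada",
    "family_name": "Lovelace",
    "upn": "a.lovelace@example.ac.uk",
    "onPremisesSamAccountName": "e1234567",
    "jobTitle": "Lecturer",
}

# Stand-ins for a User instance and its related Staff instance
user = SimpleNamespace(
    first_name="", last_name="", email="",
    staff=SimpleNamespace(staff_number="", job_title=""),
)
apply_claim_mapping(user, claims, CLAIM_MAPPING)
print(user.first_name, user.email, user.staff.job_title)
```

The point to notice is that each value is copied across unchanged, so every claim you want needs a matching field on the model.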
However, I had a few concerns when I looked at doing this with my own University claims:
- the claims were a bit unintuitive in places, for instance requiring checks on several obscure values to determine what type of member of staff was logging in;
- django-auth-adfs needs fields in the Django Model layer that mirror the claims you want to map, and I didn't want to hardcode the Database Layer around legacy claim design at one institution;
- even if you could neatly map things like School data, I didn't want long strings that are the names of Schools in many Models when a primary key would be more efficient.
In short, I needed some way to pre-process the incoming claims before the mapping process into the database layer.
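As a rough illustration of what such a pre-processing step might look like, here is a small pure function. Everything specific in it is an assumption made for the example: the claim names employeeType, extensionAttribute1 and department, the values they hold, and the SCHOOL_PKS lookup (which in a real app would come from the database).

```python
# Hypothetical lookup from School name to primary key; in a real app this
# would be a database query rather than a hard-coded dict.
SCHOOL_PKS = {"School of Computing": 1, "School of Engineering": 2}


def preprocess_claims(claims):
    """Return a copy of claims with extra, tidier keys added."""
    tidy = {}

    # Several obscure values may be needed to decide the type of staff.
    # "employeeType" and "extensionAttribute1" are invented claim names.
    if claims.get("employeeType") == "STAFF":
        is_academic = claims.get("extensionAttribute1") == "ACAD"
        tidy["staff_type"] = "academic" if is_academic else "professional"
    else:
        tidy["staff_type"] = "unknown"

    # Swap a long School name for a compact key (None if the School is unknown)
    tidy["school_pk"] = SCHOOL_PKS.get(claims.get("department"))

    # New values take priority over the originals
    # (same result as claims | tidy on Python 3.9 or later)
    return {**claims, **tidy}


example = preprocess_claims({
    "employeeType": "STAFF",
    "extensionAttribute1": "ACAD",
    "department": "School of Computing",
})
print(example["staff_type"], example["school_pk"])
```

Keeping this step as a plain function of the claims dict means it can be tested without Django or Entra ID in the loop.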
Extending django-auth-adfs
I decided the best way to approach this was to write a new class to extend the authentication backend in django-auth-adfs, but I had a lot of false starts in doing this, and couldn't find much help online, which is my main reason for writing this for anyone trying to do the same thing. Looking at the source code, you'll need to create a custom version of what is in backend.py, or really just one method in it. There is a method in the backend class that handles the claim mapping and user manipulation. I thought I could override that, as it is called after user creation. In a perfect world, it would look a bit like this.
from django_auth_adfs.backend import AdfsAuthCodeBackend

# You need to extend not from AdfsBaseBackend (which I discovered painfully)
# but from one of the backends that actually undertakes the authentication.
class CustomAdfsBackend(AdfsAuthCodeBackend):
    def update_user_attributes(self, user, claims, claim_mapping=None):
        # My theory was simple: create a new_claims dict that was more
        # sane than the remote supplied ones
        new_claims = dict()
        foo = claims.get("foo")
        if foo:
            # do_some_transformation() is a placeholder for your own logic
            new_claims["bar"] = do_some_transformation(foo)
        # Then merge the dicts with the new one having preference
        # (the | operator on dicts needs Python 3.9 or later)
        merged_claims = claims | new_claims
        # And then, finally call the super() function to do the hard work
        # The claim mapping stanza in the configuration should be written against
        # the merged claims.
        return super().update_user_attributes(user, merged_claims)
It took quite a bit of trial and error to get to this point. I had initially derived from AdfsBaseBackend, and that caused OAuth errors (essentially because the base class doesn't provide authentication). So you need to use one of the two derived classes that django-auth-adfs provides in its backend.py file as your base. For me, AdfsAuthCodeBackend worked, or it seemed to, but with some problems.
The subtle secondary problem
However, this didn't quite work for me. It imported correctly into the User object, but (in my case) made a mess of the Staff object. Close examination of the original update_user_attributes() function in the base class reveals the culprit. The function is called recursively to handle secondary Model objects (configured by nested dictionaries). Unfortunately that calls our doctored version in the derived class a second time and the transformation logic just ends up breaking things. I suppose one could try to add a variable to the derived class so the transformation logic is only called once, but in the end, I opted for a more direct solution. Not least because I wanted to do all sorts of housekeeping, such as creating new Faculty, School etc. entities when a new user logs in from an unknown School.
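For anyone who does want to try the "only transform once" idea, here is a sketch of how it might work. FakeBaseBackend is a stand-in written for this example, not the library's real code; it only imitates the recursive behaviour described above. The guard relies on an assumption: that claim_mapping is None only on the outermost call. I have not tested this against django-auth-adfs itself.

```python
from types import SimpleNamespace

CLAIM_MAPPING = {"first_name": "given_name",
                 "staff": {"job_title": "jobTitle"}}


class FakeBaseBackend:
    """Stand-in for the library backend, NOT its real code."""

    def update_user_attributes(self, user, claims, claim_mapping=None):
        if claim_mapping is None:
            claim_mapping = CLAIM_MAPPING
        for field, claim in claim_mapping.items():
            if isinstance(claim, dict):
                # The recursion goes through self, so it reaches any override
                self.update_user_attributes(getattr(user, field), claims, claim)
            else:
                setattr(user, field, claims.get(claim))


class GuardedBackend(FakeBaseBackend):
    def __init__(self):
        self.transform_calls = 0

    def transform(self, claims):
        # Hypothetical transformation: tidy up the job title
        self.transform_calls += 1
        return {**claims, "jobTitle": claims["jobTitle"].title()}

    def update_user_attributes(self, user, claims, claim_mapping=None):
        # Assumption: claim_mapping is None only on the outermost call
        if claim_mapping is None:
            claims = self.transform(claims)
        # Pass claim_mapping through so nested calls keep their sub-mapping
        return super().update_user_attributes(user, claims, claim_mapping)


user = SimpleNamespace(first_name="", staff=SimpleNamespace(job_title=""))
backend = GuardedBackend()
backend.update_user_attributes(
    user, {"given_name": "Ada", "jobTitle": "senior lecturer"}
)
print(backend.transform_calls, user.first_name, user.staff.job_title)
```

Note that the override also forwards claim_mapping to super(). Dropping it would make the nested call fall back to the top-level mapping in this stand-in.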
The not so subtle solution
The most direct solution I could find is to deliberately not call the function in the base class with super().update_user_attributes(). You will need to do the hard work yourself. At least in my code, I left the merged claims logic intact should django-auth-adfs allow this in the future (for example, by including a trivial transformation function in the base class that can be overridden, and which allows the transformation step to be separated from manipulating the user).
So here's some pseudo-code with a bit more detail about imports and the like that you may need. Call this local_adfs.py or something similar (use an underscore rather than a hyphen, so Python can import it). Unfortunately this is a long snippet for a blog, but I hope this might help someone.
import logging
from django_auth_adfs.config import *
from django_auth_adfs.backend import AdfsBaseBackend, AdfsAccessTokenBackend, AdfsAuthCodeBackend
# Import the models from your app you will need to manipulate
from app.models import Staff, Campus, Faculty, School
# Get an instance of a logger
logger = logging.getLogger(__name__)
logger.debug("loading custom adfs backend")
class CustomAdfsBackend(AdfsAuthCodeBackend):
    def update_user_attributes(self, user, claims, claim_mapping=None):
        """
        For brevity, I've removed these comments, but you should comment your claims
        and the mappings you need.
        """
        logger.debug("fetch claims for mapping")
        logger.debug("Incoming claims are {claims}".format(claims=claims))

        # Get the faculty string name (and all the other claims you need, this is one example)
        faculty_string = claims.get("faculty")

        # Make a new dict for the transformed data
        new_claims = dict()
        logger.debug("remapping claims")
        # do_some_transformation() is a placeholder for your own logic. The code
        # further down expects new_claims to end up holding first_name, last_name,
        # email and ulster_username, so populate those here too.
        new_claims['new_name'] = do_some_transformation(claims.get('old_name'))

        # I used the incoming data to make new structures in the database, for instance
        # I put a function in the Faculty model layer to create an object if it didn't
        # exist, and return the object (or just fetch the existing one next time).
        faculty_object = Faculty.get_or_create(faculty_string)
        # Your new claim could, now, for instance, have a primary key and not a string
        new_claims['faculty_pk'] = None if faculty_object is None else faculty_object.pk
        logger.debug("outgoing claims are {new_claims}".format(new_claims=new_claims))

        # Merge the claims, with new versions taking priority
        merged_claims = claims | new_claims
        logger.debug("merged claims: {}".format(merged_claims))

        # Can't call super() because it recurses over sub-dicts which breaks our remapping <sigh>
        # return super().update_user_attributes(user, merged_claims)

        # <Thanos> Fine. I'll do it myself. </Thanos>
        user.first_name = new_claims.get('first_name')
        logger.debug("updated %s first name to %s" % (user, new_claims.get('first_name')))
        user.last_name = new_claims.get('last_name')
        logger.debug("updated %s last name to %s" % (user, new_claims.get('last_name')))
        user.email = new_claims.get('email')
        logger.debug("updated %s email to %s" % (user, new_claims.get('email')))
        user.save()
        logger.debug("saving data to User model %s" % user)

        try:
            staff = Staff.objects.get(staff_number=new_claims.get('ulster_username'))
        except Staff.DoesNotExist:
            logger.debug("Staff model object for %s not found, could not update" % user)
            return

        staff.faculty = faculty_object
        logger.debug("updated %s faculty to %s" % (user, faculty_object))
        staff.save()
        logger.debug("saving data to Staff model %s" % staff)
You will now need to edit your settings.py to change your authentication backend to your new derived version, or it will still call the base class logic. If you put the file in a directory called local under your app, it should look something like this.
# We need the ADFS authentication, but also the other backend for admins
AUTHENTICATION_BACKENDS = (
    'app.local.local_adfs.CustomAdfsBackend',
    #'django_auth_adfs.backend.AdfsAuthCodeBackend',
    'django.contrib.auth.backends.ModelBackend',
)
Remember, this means the claim mapping in your AUTH_ADFS stanza is not used: you are doing it manually. I also added code to create groups for each department and add users into them automatically by type. The world is your oyster. If you had the same problem as me, I hope this helps and saves you some of the research hours I spent on it.
