Thoughts Are My Own: 2016

Tuesday, 23 February 2016

Leveraging the SIMS Active Directory Provisioning Service to support Single-Sign-On

Like a great many UK schools, we use Capita's SIMS as our school MIS. It's where the vast majority of our data lives, and it is generally considered to be the 'source of truth' for all kinds of data about our students and staff.

Our computer network, again like most of the rest of the world's, is based on Microsoft Active Directory. For our IT needs, this is another 'source of truth'. It provides the authentication database for all kinds of services - from standard desktop logons, to web proxy authentication, to 802.1X authentication to wireless networks, to managed print accounting, to e-mail, to Moodle and more. Active Directory security groups determine all kinds of policies and memberships across our environment and we try hard to tie as much as we can into this central infrastructure - with well over 1000 active accounts, managing all of this manually would be impossible to stay on top of.

The problem, historically, has been maintaining a relationship between SIMS and Active Directory. SIMS already knows who is on-roll and employed at our school, it knows what classes those people are in, it knows who has responsibility for what, it knows a lot of stuff! SIMS is also the 'source of truth' that a number of popular 'hosted', 'cloud' or 'SaaS' (depending on your love of buzzwords) applications use to automatically discover that same information and populate their own databases. Third-party extraction services exist, such as GroupCall, that the providers of external services can use to discover names of students on roll, names of staff, classes and who is in them, contact details for parents, behaviour and achievement information and more. This lets them make their products work without having to ask school staff to manually import thousands of pieces of information somehow.

What these hosted applications can't do, however, is create any kind of link to Active Directory - SIMS just doesn't have the information available to it. The result is that schools end up trying to manually maintain separate sets of usernames and passwords for a myriad of applications. This is difficult for schools - larger secondaries will see students starting and leaving most weeks. It's also difficult for users - trying to remember 10+ different username formats and maintain passwords to go with them is hard, and doubly so if you joined a school mid-year and missed the bit where everyone else was given the relevant information. The answer is obviously single sign-on, with externally hosted applications supporting authentication against providers such as Active Directory. However, the challenge is still how to create and maintain that 'link' between SIMS and Active Directory, so a computer can know 'The person with Active Directory username bob is actually Bob Smith, in 7GD; we should show Bob Smith's homework, rather than Bob Jones's, when someone logs into our website with his username'. You also want that 'link' to exist automatically and "just work" as people come and go, or you're just moving your administrative overhead around rather than reducing it.

The most sensible answer I've found so far is the aptly named 'Active Directory Provisioning Service' (ADPS). Capita make this add-on product for SIMS and, as the name suggests, it's designed to provision accounts in Active Directory straight out of SIMS for students, staff and 'contacts' (usually parents). This product appears to be primarily designed to provide accounts for other Capita products such as the SIMS Learning Gateway (SLG) (in fact, that's why we have it in the first place), but it does just happen to open up a load of other integration options with a bit of work.

On its own, the ADPS can be handy, but doesn't change the world. It creates user objects, and it will even put those user objects into some handy security groups (including by class), but that's about it. In order to make these user objects useful for people to actually use for more than just simple web services, most schools will need to 'do stuff' with them (such as move them to suitable OUs, add them to existing security groups, create mailboxes behind them, create home shares, set profile paths - that kind of thing). We've created some handy scripts that automate most of this for us now, perhaps more on that another time - but the point is, most people probably won't have got much more out of the ADPS than provisioning accounts for SLG and then consolidating those accounts with an existing user you'd already made with some other method. However, the clever stuff starts to become more apparent when you take a look at how ADPS actually works.

The ADPS application makes, at install time, a few schema changes to Active Directory. The one I'm most interested in is the attribute called 'capitachildrensservicesClientEntityGuids'. When we take a look at this field for a user object that has either been created by, or consolidated from, the ADPS we can see that it has been populated with a value.

This particular field, when populated, contains 2 unique IDs, separated by a pipe character. I'm not sure what they are derived from exactly, but the important things are:
1: They are unique per person
2: You can report on one of them using the SIMS reporting engine

The fact that one of the IDs in 'capitachildrensservicesClientEntityGuids' can be reported on means there is now a known, fixed link between Active Directory and SIMS. ADPS will maintain that link for you, and you can ask SIMS to produce reports that include it alongside any other data of your choice.

The specific service I was looking to provision SSO for while researching this was something our school uses called ShowMyHomework (SMHW). As the name suggests, it's a website that shows you your homework. Initially, these guys couldn't seem to understand why SSO might be a thing that schools would want, but in recent years it seems they've finally seen the light and now provide support for talking LDAPS to their schools to authenticate users, rather than relying solely on their own disparate authentication database - and it seems to work well! SMHW pull data on students and classes from SIMS using GroupCall - and the unique identifier from SIMS that they have chosen to use to identify users within their platform is the SIMS 'ID' field. This seems to be a fairly common unique identifier for 3rd party products to use, so that's the one I wanted to automatically populate Active Directory with in this case - but it would be trivial to extend that to basically any value that the SIMS reporting engine will spit out.

So, now that we've discovered the link that ADPS creates, let's use that to populate our Active Directory with the SIMS 'ID' value for everyone.

1: Design a SIMS report that includes both the SIMS Person_ID and the 'External ID' for all our students on roll. You could do a similar one for staff or contacts, but this example focuses on students. The report should output to XML.
Tips: Find the 'External ID' data type under the 'CESThirdPartyFields' category. The SIMS user that runs the report will also need to be a member of the 'Third Party Reporting' user group to be able to report on the Person_ID object - otherwise SIMS will just report null values.

2: Either create a custom Active Directory field for your users by making a small edit to the Schema (mine was called hwcsSimsId) or designate an existing field to re-purpose.

3: Create a script of some kind to loop through all your Active Directory users, look for ones that have a 'capitachildrensservicesClientEntityGuids' value set, but not your custom SIMS ID set, parse the XML that SIMS will provide you to match the two values up and set the SIMS ID on missing users.

My attempt at this, written in PowerShell, is here (MISID-Import-Students.ps1).

This script, which requires PowerShell 3 and the Microsoft Active Directory command-line utilities to be available, will:

Read in the SIMS XML Report
Discover all the user accounts in your Active Directory, starting at the base path of $SearchLDAPBase
For each account, look for the ones that have a 'capitachildrensservicesClientEntityGuids' value, but not a value under the $ADAttribute field
For those accounts, take the 'capitachildrensservicesClientEntityGuids' string, split it apart at the '|' symbol and store the second value (this is the 'External ID' value)
Try to find a match in the SIMS XML data
If it finds one, take the value for 'ID' and write it to the $ADAttribute Active Directory field
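
For anyone who would rather read the matching logic than a description of it, here is a minimal sketch of the steps above. It is written in Python rather than the PowerShell the real script uses, and it rests on a few assumptions that are not from the script itself: the XML element names (Record, ID, ExternalID) are invented stand-ins for whatever your SIMS report actually emits, the users are plain dictionaries standing in for Active Directory objects, hwcsSimsId is used as the example $ADAttribute, and the comparison ignores case. It only works out which accounts need which SIMS ID; writing the value back to Active Directory is left out.

```python
import xml.etree.ElementTree as ET

GUID_ATTRIBUTE = "capitachildrensservicesClientEntityGuids"

# Invented sample data: the real SIMS report layout will differ.
SAMPLE_REPORT = """\
<Report>
  <Record><ID>1001</ID><ExternalID>aaaa-1111</ExternalID></Record>
  <Record><ID>1002</ID><ExternalID>bbbb-2222</ExternalID></Record>
</Report>
"""


def load_report(xml_text):
    """Build a lookup of External ID -> SIMS ID from the report XML."""
    mapping = {}
    for record in ET.fromstring(xml_text).iter("Record"):
        external_id = (record.findtext("ExternalID") or "").strip().lower()
        sims_id = (record.findtext("ID") or "").strip()
        if external_id and sims_id:
            mapping[external_id] = sims_id
    return mapping


def external_id_from_guids(guids):
    """Return the second of the two pipe-separated IDs, or None if malformed."""
    parts = guids.split("|")
    if len(parts) != 2:
        return None
    return parts[1].strip().lower() or None


def plan_updates(users, mapping, ad_attribute="hwcsSimsId"):
    """Return {username: SIMS ID} for accounts that have GUIDs but no SIMS ID yet."""
    updates = {}
    for user in users:
        guids = user.get(GUID_ATTRIBUTE)
        if not guids or user.get(ad_attribute):
            continue  # not an ADPS account, or already populated
        external_id = external_id_from_guids(guids)
        sims_id = mapping.get(external_id) if external_id else None
        if sims_id:
            updates[user["sAMAccountName"]] = sims_id
    return updates


if __name__ == "__main__":
    users = [
        {"sAMAccountName": "bob", GUID_ATTRIBUTE: "0000-xxxx|AAAA-1111"},
        {"sAMAccountName": "alice", GUID_ATTRIBUTE: "0000-yyyy|bbbb-2222",
         "hwcsSimsId": "1002"},
        {"sAMAccountName": "carol"},
    ]
    print(plan_updates(users, load_report(SAMPLE_REPORT)))
```

In this made-up data only bob gets an update: alice already has a SIMS ID set and carol has no ADPS value at all, so both are skipped.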

The end result, all being well, is values appearing in your $ADAttribute field.

4: When this all works, create a Scheduled Task that uses the SIMS CommandReporter to run your SIMS report and produce new XML, then parse it and update your Active Directory as appropriate, all automatically. This runs once a day, overnight, for me - but the schedule could be anything. An example of how this whole process might work is in the same GitHub repo (MISID-ExportFromSIMS.ps1) along with some example XML output (SIMS ID To External ID.xml).

The end result is that as new users arrive with us:

GroupCall does its thing, sends updates to SMHW (or the service of your choice) who provision things their end, automatically
We run a SIMS report and update Active Directory with the SIMS ID of our new starter overnight, automatically
The user attempts to log into SMHW using their normal school username and normal school password. SMHW make an LDAPS query to us over the internet, authenticate the user and request their SIMS ID from our Active Directory
The user sees their environment on the SMHW platform - with zero human involvement required (beyond ensuring they exist in SIMS - which is work we need to do anyway).
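
As a rough illustration of what a lookup like the one in the third step might involve, here is a small Python sketch that builds an LDAP search filter for a username and names the attribute to request. This is not SMHW's actual implementation, which isn't described here. The function names are hypothetical, hwcsSimsId is the example custom attribute from earlier, and the use of sAMAccountName as the logon attribute is an assumption. The escaping follows the rules for filter values in RFC 4515, which matters because the username comes from a login form.

```python
# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_ldap_filter_value(value):
    """Escape a user-supplied string so it is safe inside a search filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def build_user_lookup(username, sims_id_attribute="hwcsSimsId"):
    """Return (search filter, attributes to request) for one user.

    Hypothetical helper: assumes accounts are found by sAMAccountName
    and that the SIMS ID lives in the given custom attribute.
    """
    safe_username = escape_ldap_filter_value(username)
    search_filter = "(&(objectClass=user)(sAMAccountName={}))".format(safe_username)
    return search_filter, [sims_id_attribute]


if __name__ == "__main__":
    search_filter, attributes = build_user_lookup("bob")
    print(search_filter, attributes)
```

The filter and attribute list would then be handed to whatever LDAP client library the service uses, over an LDAPS connection, once the user's password has been checked with a bind.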

The same also becomes true for leavers - their accounts are de-provisioned, again, without anyone having to manually process them.

This seems to work well - and could obviously be extended to cover a number of similar applications where a tighter integration with authentication services is advantageous.

Thursday, 21 January 2016

Preventing "Online Radicalisation" in Schools

So, decided to start one of these blog things. It's like it's 2001 again! After years of sharing my thoughts mostly internally, on all kinds of topics, a number of people have suggested I start one of these - so let's give it a go! And what easier topic to begin with than 'radicalisation' and politics I guess. :/

The UK Government, in its infinite wisdom, has decided that the risk of the nation's children becoming "groomed" and "radicalised" via the internet while in school is in fact so great that they are going to start making formal requirements that schools use 'online filters and monitoring' to definitely ensure that this can absolutely never happen. This follows one incident in London, which the school in question maintain likely had nothing to do with them anyway. *sigh*. It's another strand of the Government's 'Prevent' strategy, which in my humble opinion is one of the more insane, ridiculous and divisive ideas they've had recently - producing such gems as a 14 year old being questioned by school officials for using the term “L’ecoterrorisme” in a French discussion, parents being warned that their children may in fact be "extremists" if they dare to question Government or the media, and a young child being referred to the authorities after making reference to the "history of the Caliphate" in a presentation on British foreign policy, amongst others.

Now, parts of these requirements are actually reasonably sane. It's perfectly reasonable (and in fact already expected) for schools to keep an eye on what their students are doing online, just as they should keep an eye on what happens in the playground and in the classroom. It's also perfectly reasonable to expect schools to work to keep their students free from bullying, safe from adults who may pose a risk to them and to help them avoid accidentally falling over material that they might lack the maturity or emotional capacity to properly process yet. All good so far.

However, just as in the case of the 'Investigatory Powers Bill' that UK.Gov is desperately trying to push through (again), the technical measures that they want implemented in order to "keep us safe" are deeply misguided, ill thought through, a massive invasion of an individual's privacy and quite simply won't work - all at the same time. Basically, the Government has no clue whatsoever about anything technical but won't let a thing like that stop it making all kinds of insane technical demands.

History Time

Internet filtering in UK schools is nothing new. It all started back in the 90s when articles about this newfangled thing called 'The Internet' started becoming more common, some people even got 'The Internet' at home and schools started to take small steps to do likewise. Unfortunately, a lot of the articles people were reading touched on the dangers of pornography and other adult content, so it was necessary for the Government, local authorities and schools to "do something" to stop this awful stuff making its way into our schools when they "Got The Internets". Doing this wasn't really all that difficult - the Internet was many orders of magnitude smaller, the only bit of it anyone was really concerned about was the World Wide Web and this itself was almost entirely unencrypted, very simple and mostly just pages of static content. Commercial vendors could, with relative ease, categorise the majority of the web pages on the Internet and the people that provided Internet access to schools could use that database alongside caching proxies to return error pages to users instead of "bad stuff" if it was requested. Even back then of course it was trivial to bypass this filtering (I still remember a few of the ways that worked for me when I was of school age!) and there were of course errors and omissions from the filtering databases - but access to the internet was far less critical back in the late 90s than it is now, was typically available on only a small number of desktop computers in a school and I don't think it was really enough of a big deal for people to get too worked up about.

The Modern Classroom

These days of course, things have moved on massively. The internet is everywhere and used for everything, all of the time, and schools are no exception to this. Schools often have many hundreds (if not thousands) of internet connected devices, from traditional desktop computers, to laptops, to racks of tablets, big screens and projectors everywhere, online applications and storage and of course, major WiFi deployments that students and staff alike can connect their own equipment to and get online anywhere they happen to be. "The Internet" is no longer just the WWW either and all kinds of proprietary "apps" exist and interconnect to meet demand for just about anything that involves communication. The infrastructure behind this has naturally become more complex and is often delivered in far more layers of abstraction than it used to be. Network connectivity is critical these days, just as in most workplaces, and not only do people jump and shout if it's unavailable for 10 minutes - they jump and shout if one particular service, like YouTube or Google Image search, isn't available for 10 minutes. In amongst all of this however, the internet filtering thing from the 90s still hangs on in very much the same form. It does, however, have very different effects on things. For a start, it's simply not possible anymore for anyone, even a commercial entity with massive resources, to categorise the entire internet. Despite what many people who work in edu assume, the "science" behind web filtering is still just human beings, looking at websites and deciding what categories they fall into at the time they happen to have looked at them. The database of websites that results from this manual categorisation is then the one that is used as the decision maker by commercial web filtering products.
Unfortunately for the users of these products though, to take just one example, it's estimated that 300 hours of video are added to YouTube every minute, so expecting a machine to somehow "know" if each one of those videos is "good" or "bad" is simply impossible - so schools are left with only an elephant-gun approach of "YouTube OK" or "YouTube BAD" (because it might have a tiny proportion of content that some might consider 'not ok'). Personally, I'm of the opinion that restricting access to platforms like this, which provide access to an unlimited, never ending and freely available supply of amazing resources and ways of learning, just because someone found something nasty is cutting off your (educational) nose to spite your face - but it's something that I know a good number of schools do - and the same is true of a colossal number of other amazing internet resources. Similarly, whilst the school I work in does still employ this same 90s-style web filtering, I try to err on the side of not crippling huge swathes of the internet 'just in case' and the benefit we see from not doing that is immediately obvious.

On a technical level though, even if we did decide that the risk was just "too great" and that we wanted to cripple our tech in the name of safety, the technical challenges of "restricting" the internet in 2016 are, as in many ways they should be, subject to a law of ever diminishing returns. More and more of the internet is now encrypted and delivered securely over SSL. This is a "good thing" for the Internet and its users as it means that people's communications are better protected from hackers, fraudsters and nefarious Governments across the world. It also means, however, that short of using some highly dubious methods that dramatically weaken the security of people's communications over our network even when they are technically feasible (they aren't when people are using their own devices), it's no longer possible for those web filtering boxes to take a peek at the content you're accessing. Therefore, unless I want to act like King Canute and try to hold back the tide of technology by insisting that users don't use their own devices on my network, by breaking huge parts of the Internet and by taking some very ethically dubious steps to break into my users' encrypted communications, it is in fact exceptionally difficult for me to, at a technical level, perform the necessary snooping to meet these ridiculous requirements. And even if I do try to act like King Canute and do all of these things that will undermine the trust of my students and staff and make life generally a lot harder for them, and even if the "magic technology" manages to filter all of the "bad stuff" (which it can't), anyone can of course bypass all of it anyway with a few seconds' help from Google, or by just turning on their 3G service or using a computer at home, or WiFi in a coffee shop, thus rendering the whole effort pointless.

Taking things a stage further, solutions are available that, for the small price of a hefty licence fee, signing your soul over to the devil and installing a load of software with potentially highly dubious security practices of its own, will turn your IT infrastructure into some kind of Orwellian nightmare where oodles of screenshots of every swear word, policy violation or in fact a hint of anyone mentioning the names of various political activists are immediately dispatched for analysis. This not only sends the clear message that we simply don't trust our young people to start making their own decisions, but also tries to find a technical solution to what is in fact a social problem.

What should be done?

This brings me to what the 'common sense' solution to this might be. Students at my school enjoy a relative amount of freedom with technology. They use this to hunt out solutions to their school problems that other people have made help videos about, they make use of Twitter, they e-mail each other and their teachers. Our desktop computers are not locked down to within an inch of existence so they can write code, write scripts, take to Linux in their own virtualised environment, poke around the internals of operating systems without worrying they're about to be hauled over the coals for it and can generally geek-out with relative freedom. We have students who are highly technically competent and help me and the IT technicians out imaging computers, replacing hardware and the physical parts of refreshes - which benefits both them and us. The result of all this is a very low level of "malicious" behavioural problems when it comes to IT. When it comes to the other kinds of behaviour incidents that can result from computer use and Internet access, these are treated like any other behavioural matter. We do employ web filtering for certain categories of content and we do maintain a certain amount of logging data, but the expectation is that these technical measures are there to make the jobs of teachers easier and help them keep their classes on track, rather than "do their jobs for them" and mean they somehow don't have to worry about what students are doing online. I will of course always help where I can when incidents do occur (which is thankfully rare), but the IT Office is not the place you run to because someone typed something a little risqué into Google, just as presumably the librarian would not be the person a member of staff ran to because someone brought a dirty magazine into school either. Ultimately, student conduct is not a problem looking for a technical solution - it's a social problem.
I'm very lucky that this is an ethos that my SLT understand and also that the overwhelming majority of our students respond very well to being given a little more personal responsibility for their behaviour almost all of the time as well.

When incidents of concern do occur, as of course they do from time to time, my school has a very well defined behaviour policy and protocols, and staff who will discuss the issue with the child, attempt to work out what went wrong and how mistakes can be learned from. To my mind this seems far more likely to provide learning and development opportunities, and to encourage young people to have a clearer understanding of right from wrong and how to keep themselves safer, than crippling the technology in some massive pretence that we can somehow control this massive online world and shouting at those young people who demonstrate what a farce it all is. Is this "education" not, after all, kind of the reason schools exist in the first place, and is it not far more likely to help "Prevent" radicalisation from toxic ideologies than faceless monitoring and an assumption that trying to hide information will make it go away?

I knew I should have just ranted about group policy, roaming profiles and sandwiches shoved inside CD trays.