A stupid way of exporting icals from SIS to Google Calendar

| Tags: Articles in English

EDIT: It has been brought to my attention (thanks @Vojta) that SIS in fact supports calendar export links (Schedule NG → My schedule → Export), which is a much better solution than whatever I made in the following paragraphs. There are still some interesting individual tricks, which are the main point of this blog post, so you still might want to read along.

A few days back I found out my university’s IS allows exporting your timetable into ical. I thought “huh, cool”, exported mine, imported it into Google Calendar and forgot about it. Some time later, however, the timetable changed and my calendar (obviously) didn’t update, so if I wanted to keep my calendar up-to-date, I’d have to import the timetable again, which is annoying. I could automate this.

Firstly, there’s no such thing as an API or anything of the sort for our IS. That means I spent a few minutes looking at the traffic between my browser and the backend and wrote a simple Python function that signs in using a username and password:

import re
import requests

def auth(s: requests.Session, user: str, pwd: str):
	# Load the login page first to get the hidden form fields the backend expects back.
	r = s.get("https://is.cuni.cz/studium/index.php")
	accode = re.findall(r'<input type="hidden" name="accode" value="([^"]*)">', r.text)[0]
	timestamp = re.findall(r'<input type="hidden" name="tstmp" value="([^"]*)">', r.text)[0]
	r = s.post("https://is.cuni.cz/studium/verif.php", data={'login':user, 'heslo':pwd, 'all':'Přihlásit+se', 'tstmp':timestamp, 'accode':accode})
	# "odhlásit se" (log out) appears only when signed in, "přístup byl odmítnut" (access denied) only on failure.
	logged_in = 'odhlásit se' in r.text
	denied = 'přístup byl odmítnut' in r.text
	if logged_in and not denied:
		# The session id is the 32-character hex string in the URL we got redirected to.
		return re.findall(r'([a-f0-9]{32})', r.url)[0]
	elif denied and not logged_in:
		return False
	else:
		raise Exception("Uncertain state, the regexes don't work.")

It has some HTML parsing, but it’s not too bad. The only painful thing is that it keeps one of the session strings in the URL to which it redirects, so I have to use that session id in all subsequent requests. Other than that, fetching the ical is very simple.

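import sys

s = requests.Session()
# Username and password come from the command line; session_id avoids shadowing the built-in id().
session_id = auth(s, sys.argv[1], sys.argv[2])
r = s.get(f'https://is.cuni.cz/studium/rozvrhng/roz_muj_micro.php?id={session_id}&tid=&rezim=vse&strict=1&skr=2022&sem=1&fak=11320&ical=1')
print(r.text)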
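
The long query string is easier to tweak when it is built from a dictionary. A minimal sketch, assuming the parameters mean what their names suggest (skr is presumably the academic year, sem the semester, fak the faculty code). The helper names `extract_session_id` and `build_export_url` are made up for illustration, and the sample redirect URL is invented:

```python
import re
from urllib.parse import urlencode

EXPORT_URL = "https://is.cuni.cz/studium/rozvrhng/roz_muj_micro.php"


def extract_session_id(redirect_url: str) -> str:
    """Return the first 32-character hex string in the URL (same regex as in auth)."""
    match = re.search(r"[a-f0-9]{32}", redirect_url)
    if match is None:
        raise ValueError("no session id found in URL")
    return match.group(0)


def build_export_url(session_id: str, year: int = 2022, semester: int = 1, faculty: int = 11320) -> str:
    """Build the ical export URL with the same parameters as the hard-coded one."""
    params = {
        "id": session_id,
        "tid": "",
        "rezim": "vse",
        "strict": 1,
        "skr": year,       # assumed: academic year
        "sem": semester,   # assumed: semester number
        "fak": faculty,    # assumed: faculty code
        "ical": 1,
    }
    return f"{EXPORT_URL}?{urlencode(params)}"


# Hypothetical redirect URL, only to show the shape of the input.
sample = "https://is.cuni.cz/studium/index.php?id=0123456789abcdef0123456789abcdef&tid=1"
print(build_export_url(extract_session_id(sample)))
```

With the defaults this produces the same URL as the f-string version, so next semester only the arguments would have to change.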

Now, I have a Python script that returns an ical. How to give it to Google Calendar in the simplest possible way? A normal person might say that Google Calendar has an API, so why not use that? Well, first of all, it’s almost midnight and I’m way too lazy to read documentation or fiddle with OAuth and second of all, the API doesn’t support ical import.

Hmm, but the GUI supports “subscribing to a calendar URL”, whatever that is supposed to mean, so I could just publish the icals on some URL and Google Calendar will grab them automatically? Right, but how do I do that in a way that my sleep-deprived brain won’t have to think about too much? Yeah, let’s use GitLab CI, that sounds legit.

I quickly realised I’d have to make my own Dockerfile, since the default Python Docker images don’t have requests, so I came up with this intensely complicated piece of code:

FROM python:alpine

RUN pip install requests

ENTRYPOINT ["ash"]

I then manually uploaded that to GitLab’s container registry and after like ten force-pushes I had a finished working .gitlab-ci.yml:

scrape:
  image:
    name: registry.gitlab.com/***/***
    entrypoint: [""]
  script:
    - python3 main.py "$USERNAME" "$PASSWORD" > rozvrh.ics
  artifacts:
    paths:
      - rozvrh.ics

And sure enough, this saves the correct ics into build artifacts, nice. But there’s another catch: I have to somehow give this to Google Calendar as a single URL, and I don’t want to make the whole thing publicly available.

If the repository were public, I could use a URL like this one:

https://gitlab.com/<username>/<repo_name>/-/jobs/artifacts/master/raw/rozvrh.ics?job=scrape

After a bit of digging I found out that the GitLab API also supports this, with a bit weirder URL:

https://gitlab.com/api/v4/projects/<project_id>/jobs/artifacts/master/raw/rozvrh.ics?job=scrape

The project_id is fairly easy to get: you can find it in the source code of your project’s homepage. I could authenticate against this API using a personal access token, but that normally has to be supplied in a header, and Google Calendar only takes a URL. At this point I started thinking about making a query-parameter-to-header proxy, but luckily, I found out you can also supply the token with the private_token query parameter, how convenient!
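
Put together, the URL handed to Google Calendar can be assembled like this. A small sketch: the function name `artifact_url`, the project id 12345678 and the token value are placeholders, not real values, while the branch, file and job names are the ones used in this post:

```python
from urllib.parse import urlencode


def artifact_url(project_id: int, token: str, ref: str = "master",
                 path: str = "rozvrh.ics", job: str = "scrape") -> str:
    """Build the GitLab API URL for one artifact file of a job on a branch.

    Assumes plain branch and file names that need no URL escaping.
    """
    base = f"https://gitlab.com/api/v4/projects/{project_id}/jobs/artifacts/{ref}/raw/{path}"
    # The token goes into the query string because the calendar client cannot send headers.
    query = urlencode({"job": job, "private_token": token})
    return f"{base}?{query}"


# Placeholder project id and token, for illustration only.
print(artifact_url(12345678, "glpat-EXAMPLE"))
```

Anyone who gets hold of this URL also gets the token, so it is probably worth giving the token as narrow a scope as possible.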

And now the only step left is to set up a scheduled run for the job that triggers every hour and my Google Calendar will (surely) never go out of sync.
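
One thing an hourly job can do is quietly publish garbage: if the login ever fails, the script would happily print an HTML page into rozvrh.ics. A minimal sketch of a guard, assuming it is enough to check the BEGIN:VCALENDAR and END:VCALENDAR wrapper lines that every iCalendar file has. `looks_like_ical` is a made-up helper name and the sample strings are invented:

```python
def looks_like_ical(text: str) -> bool:
    """Cheap sanity check: the first and last non-empty lines wrap a VCALENDAR object."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        return False
    return lines[0].upper() == "BEGIN:VCALENDAR" and lines[-1].upper() == "END:VCALENDAR"


# Invented samples: a tiny calendar and the kind of HTML page a failed login might return.
good = "BEGIN:VCALENDAR\r\nVERSION:2.0\r\nEND:VCALENDAR\r\n"
bad = "<html><body>přístup byl odmítnut</body></html>"
print(looks_like_ical(good), looks_like_ical(bad))
```

In main.py this could wrap the final print: exit with a non-zero status when the check fails, so the CI job fails instead of uploading a broken file.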