A stupid way of exporting icals from SIS to Google Calendar
EDIT: It has been brought to my attention (thanks @Vojta) that SIS in fact supports calendar export links (Schedule NG → My schedule → Export), which is a much better solution than whatever I made in the following paragraphs. There are still some interesting individual tricks, which are the main point of this blog post, so you might still want to read along.
A few days back I found out my university’s IS allows exporting your timetable into ical. I thought “huh, cool”, exported mine, imported it into Google Calendar and forgot about it. Some time later however, a change in said timetable happened and my calendar (obviously) didn’t update, so if I wanted to keep my calendar up-to-date, I’d have to import the timetable again, which is annoying. I could automate this.
Firstly, there’s no such thing as an API or anything of the sort for our IS. That meant I spent a few minutes looking at the traffic between my browser and the backend and wrote a simple Python function that signs in using a username and password:
import re
import requests

def auth(s: requests.Session, user: str, pwd: str):
    # Load the login page to get the hidden form fields the backend expects.
    r = s.get("https://is.cuni.cz/studium/index.php")
    accode = re.findall(r'<input type="hidden" name="accode" value="([^"]*)">', r.text)[0]
    timestamp = re.findall(r'<input type="hidden" name="tstmp" value="([^"]*)">', r.text)[0]
    r = s.post("https://is.cuni.cz/studium/verif.php", data={'login':user, 'heslo':pwd, 'all':'Přihlásit+se', 'tstmp':timestamp, 'accode':accode})
    logged_in = 'odhlásit se' in r.text          # "log out" link is shown
    denied = 'přístup byl odmítnut' in r.text    # "access denied" message is shown
    if logged_in and not denied:
        # The session id only appears in the URL we were redirected to.
        return re.findall(r'([a-f0-9]{32})', r.url)[0]
    elif denied and not logged_in:
        return False
    else:
        raise Exception("Uncertain state, the checks don't work.")
It has some HTML parsing, but it’s not too bad. The only painful thing is that it keeps one of the session strings in the URL to which it redirects, so I have to use that session id in all subsequent requests. Other than that, fetching the ical is very simple.
import sys

s = requests.Session()
# Avoid shadowing the built-in id() by using a descriptive name.
session_id = auth(s, sys.argv[1], sys.argv[2])
r = s.get(f'https://is.cuni.cz/studium/rozvrhng/roz_muj_micro.php?id={session_id}&tid=&rezim=vse&strict=1&skr=2022&sem=1&fak=11320&ical=1')
print(r.text)
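Since whatever this script prints ends up being published, it might be worth checking that the response really is a calendar and not, say, an HTML login page. Here's a minimal sketch, assuming the export is a standard iCalendar file wrapped in `BEGIN:VCALENDAR` / `END:VCALENDAR` lines. The helper name `looks_like_ical` and the sample data are made up for this example:

```python
def looks_like_ical(text: str) -> bool:
    """Rough check that a response body is an iCalendar file, not an error page."""
    # Drop blank lines and surrounding whitespace (including \r from CRLF endings).
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        return False
    # Assumption: the file starts and ends with the VCALENDAR wrapper lines.
    return lines[0].upper() == "BEGIN:VCALENDAR" and lines[-1].upper() == "END:VCALENDAR"


# Hypothetical sample data, not real SIS output.
sample = "BEGIN:VCALENDAR\r\nVERSION:2.0\r\nEND:VCALENDAR\r\n"
print(looks_like_ical(sample))
print(looks_like_ical("<html>login</html>"))
```

In the script, the final `print` could then be guarded with something like `if not looks_like_ical(r.text): sys.exit(1)`, so that a failed login makes the run fail instead of quietly producing a broken file.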
Now, I have a Python script that returns an ical. How to give it to Google Calendar in the simplest possible way? A normal person might say that Google Calendar has an API, so why not use that? Well, first of all, it’s almost midnight and I’m way too lazy to read documentation or fiddle with OAuth, and second of all, the API doesn’t support ical import.
Hmm, but the GUI supports “subscribing to a calendar URL”, whatever that is supposed to mean, so I could just publish the icals on some URL and Google Calendar will grab them automatically? Right, but how do I do that in a way that my sleep-deprived brain won’t have to think about too much? Yeah, let’s use GitLab CI, that sounds legit.
I quickly realised I’d have to make my own Dockerfile, since the default Python Docker images don’t have requests, so I came up with this intensely complicated piece of code:
FROM python:alpine
RUN pip install requests
ENTRYPOINT ["ash"]
I then manually uploaded that to GitLab’s container registry and after like ten force-pushes I had a finished working .gitlab-ci.yml:
scrape:
  image:
    name: registry.gitlab.com/***/***
    entrypoint: [""]
  script:
    - python3 main.py $USERNAME $PASSWORD > rozvrh.ics
  artifacts:
    paths:
      - rozvrh.ics
And sure enough, this saves the correct ics into build artifacts, nice. But there’s another catch: I have to somehow give this to Google Calendar as a single URL, but I also don’t want to make the whole thing publicly available.
If the repository was public, I could use a URL like this one:
https://gitlab.com/<username>/<repo_name>/-/jobs/artifacts/master/raw/rozvrh.ics?job=scrape
After a bit of digging I found out that the GitLab API also supports this, with a bit weirder URL:
https://gitlab.com/api/v4/projects/<project_id>/jobs/artifacts/master/raw/rozvrh.ics?job=scrape
The project_id is fairly easy to get, you can find it in the source code of your project’s homepage. I could authenticate this API using a personal access token, but I have to supply that with a header. At this point I started thinking about making a query-parameter-to-header-proxy, but luckily, I found out you can also supply the token with the private_token query parameter, how convenient!
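To make the final URL less error-prone to type by hand, here's a small sketch that assembles it from the pieces mentioned above. The function name `artifact_url` is made up for this example, and the project id and token in the usage line are placeholders, not real values:

```python
from urllib.parse import urlencode


def artifact_url(project_id: int, token: str, ref: str = "master",
                 path: str = "rozvrh.ics", job: str = "scrape") -> str:
    """Build the GitLab API artifact URL with the token passed as a query parameter."""
    # ref and path are inserted as-is; this sketch assumes they need no escaping.
    location = f"https://gitlab.com/api/v4/projects/{project_id}/jobs/artifacts/{ref}/raw/{path}"
    # urlencode takes care of escaping the query values, including the token.
    query = urlencode({"job": job, "private_token": token})
    return f"{location}?{query}"


# Placeholder values: 12345 and "glpat-example" are not a real project id or token.
print(artifact_url(12345, "glpat-example"))
```

The resulting string is what would get pasted into Google Calendar's "From URL" field. Keep in mind that anyone who sees this URL also sees the token.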
And now the only step left is to set up a scheduled run for the job that triggers every hour and my Google Calendar will (surely) never go out of sync.