Matomo (formerly Piwik) analytics#
Matomo is a self-hosted free & open source alternative to Google Analytics.
Why?#
Matomo gives us better control of what is tracked, how long it is stored & what we can do with the data. We would like to collect as little data as possible & share it with the world in safe ways as much as possible. Matomo is an important step in making this possible.
How it is set up?#
Matomo is a PHP+MySQL application. We use the apache based upstream
docker image to run it. We can
improve performance in the future if we wish by switching to nginx+fpm
.
We use Google CloudSQL for MySQL
to provision a fully managed, standard mysql database. The
sidecar pattern
is used to connect Matomo to this database. A service account with appropriate
credentials to connect to the database has been provisioned & checked-in
to the repo. A MySQL user with name matomo
& a MySQL database with name matomo
should also be created in the Google Cloud Console.
Initial Installation#
Matomo is a PHP application, and this has a number of drawbacks. The initial
install `must <https://github.com/matomo-org/matomo/issues/10257>`_ be completed
with a manual web interface. Matomo will error if it finds a complete config.ini.php
file (which we provide) but no database tables exist.
The first time you install Matomo, you need to do the following:
Do a deploy. This sets up Matomo, but not the database tables
Use
kubectl --namespace=<namespace> exec -it <matomo-pod> /bin/bash
to get shell on the matomo container.Run
rm config/config.ini.php
.Visit the web interface & complete installation. The database username & password are available in the secret encrypted files in this repo. So is the admin username and password. This creates the database tables.
When the setup is complete, delete the pod. This should bring up our
config.ini.php
file, and everything should work normally.
This is not ideal.
Admin access#
The admin username for Matomo is admin
. You can find the password in
secret/staging.yaml
for staging & secret/prod.yaml
for prod.
Security#
PHP code is notoriously hard to secure. Matomo has had security audits, so it’s not the worst. However, we should treat it with suspicion & wall off as much of it away as possible. Arbitrary code execution vulnerabilities often happen in PHP, so we gotta use that as our security model.
We currently have:
A firewall hole (in Google Cloud) allowing it access to the CloudSQL instance it needs to store data in. Only port 3307 (which is used by the OAuth2+ServiceAccount authenticated CloudSQLProxy) is open. This helps prevent random MySQL password grabbers from inside the cluster.
A Kubernetes NetworkPolicy is in place that limits what outbound connections Matomo can make. This should be further tightened down - ingress should only be allowed on the nginx port from our ingress controllers.
We do not mount a Kubernetes ServiceAccount in the Matomo pod. This denies it access to the KubernetesAPI.