Matomo (formerly Piwik) analytics#
Matomo gives us better control of what is tracked, how long it is stored & what we can do with the data. We would like to collect as little data as possible & share it with the world in safe ways as much as possible. Matomo is an important step in making this possible.
How it is set up?#
Matomo is a PHP+MySQL application. We use the apache based upstream
docker image to run it. We can
improve performance in the future if we wish by switching to
We use Google CloudSQL for MySQL
to provision a fully managed, standard mysql database. The
is used to connect Matomo to this database. A service account with appropriate
credentials to connect to the database has been provisioned & checked-in
to the repo. A MySQL user with name
matomo & a MySQL database with name
should also be created in the Google Cloud Console.
Matomo is a PHP application, and this has a number of drawbacks. The initial
install `must <https://github.com/matomo-org/matomo/issues/10257>`_ be completed
with a manual web interface. Matomo will error if it finds a complete
file (which we provide) but no database tables exist.
The first time you install Matomo, you need to do the following:
Do a deploy. This sets up Matomo, but not the database tables
kubectl --namespace=<namespace> exec -it <matomo-pod> /bin/bashto get shell on the matomo container.
Visit the web interface & complete installation. The database username & password are available in the secret encrypted files in this repo. So is the admin username and password. This creates the database tables.
When the setup is complete, delete the pod. This should bring up our
config.ini.phpfile, and everything should work normally.
This is not ideal.
The admin username for Matomo is
admin. You can find the password in
secret/staging.yaml for staging &
secret/prod.yaml for prod.
PHP code is notoriously hard to secure. Matomo has had security audits, so it’s not the worst. However, we should treat it with suspicion & wall off as much of it away as possible. Arbitrary code execution vulnerabilities often happen in PHP, so we gotta use that as our security model.
We currently have:
A firewall hole (in Google Cloud) allowing it access to the CloudSQL instance it needs to store data in. Only port 3307 (which is used by the OAuth2+ServiceAccount authenticated CloudSQLProxy) is open. This helps prevent random MySQL password grabbers from inside the cluster.
A Kubernetes NetworkPolicy is in place that limits what outbound connections Matomo can make. This should be further tightened down - ingress should only be allowed on the nginx port from our ingress controllers.
We do not mount a Kubernetes ServiceAccount in the Matomo pod. This denies it access to the KubernetesAPI.