Scaling Synapse via workers

For small instances it is recommended to run Synapse in the default monolith mode. For larger instances where performance is a concern, it can be helpful to split out functionality into multiple separate Python processes. These processes are called 'workers', and are (eventually) intended to scale horizontally and independently of one another.

Synapse's worker support is under active development and subject to change as we attempt to rapidly scale ever larger Synapse instances. However, we are documenting it here to help admins who need a highly scalable Synapse instance similar to the one running matrix.org.

All processes continue to share the same database instance, and as such, workers only work with PostgreSQL-based Synapse deployments. SQLite should only be used for demo purposes and any admin considering workers should already be running PostgreSQL.

See also the Matrix.org blog post for a higher-level overview.

Main process/worker communication

The processes communicate with each other via a Synapse-specific protocol called 'replication' (analogous to MySQL- or Postgres-style database replication) which feeds streams of newly written data between processes so they can be kept in sync with the database state.

When configured to do so, Synapse uses a Redis pub/sub channel to send the replication stream between all configured Synapse processes. Additionally, processes may make HTTP requests to each other, primarily for operations which need to wait for a reply, such as sending an event.

Redis support was added in v1.13.0 and became the recommended method in v1.18.0. It replaced the old direct TCP connections to the main process, which are deprecated as of v1.18.0. With Redis, rather than all the workers connecting to the main process, all the workers and the main process connect to Redis, which relays replication commands between processes. This can give a significant CPU saving on the main process and will be a prerequisite for upcoming performance improvements.
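To illustrate the hub topology described above, here is a minimal in-memory sketch. It is not Synapse code and does not talk to Redis: `ToyPubSub`, `ToyProcess`, the channel name `replication` and the command string are all hypothetical names chosen for the example. It only shows the idea that every process subscribes to one shared broker, so a command published by any process reaches all the others without the main process relaying it.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class ToyPubSub:
    """In-memory stand-in for a pub/sub broker (hypothetical, not Redis)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        callbacks = list(self._subscribers[channel])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


class ToyProcess:
    """Stands in for the main process or a worker (hypothetical)."""

    def __init__(self, name: str, broker: ToyPubSub, channel: str = "replication") -> None:
        self.name = name
        self.received: List[str] = []
        self._broker = broker
        self._channel = channel
        # Every process connects to the broker, not to the main process.
        broker.subscribe(channel, self.received.append)

    def send(self, command: str) -> int:
        return self._broker.publish(self._channel, f"{self.name}: {command}")


broker = ToyPubSub()
main = ToyProcess("main", broker)
worker1 = ToyProcess("worker1", broker)
worker2 = ToyProcess("worker2", broker)

# worker1 publishes once; the broker fans the command out to all subscribers.
delivered = worker1.send("example-command")
```

In this toy model the publisher also receives its own message, because it is subscribed to the same channel; a real deployment would need to decide how to handle that.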

If Redis support is enabled, Synapse will use it as a shared cache as well as a pub/sub mechanism.

See the Architectural diagram section at the end for a visualisation of what this looks like.

Setting up workers

A Redis server is required to manage the communication between the processes. The Redis server should be installed following the normal procedure for your distribution (e.g. apt install redis-server on Debian). It is safe to use an existing Redis deployment if you have one.

Once installed, check that Redis is running and accessible from the host running Synapse, for example by executing echo PING | nc -q1 localhost 6379 and seeing a response of +PONG.
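The same check can be scripted, which may be handy where nc is not available. The sketch below is one possible approach using only the Python standard library; the function name `redis_ping` and its default timeout are choices made for this example. It assumes the server accepts an unauthenticated inline PING.

```python
import socket


def redis_ping(host: str = "localhost", port: int = 6379, timeout: float = 2.0) -> bool:
    """Send an inline PING and report whether the server answered +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\r\n")
            reply = sock.recv(64)
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
    return reply.startswith(b"+PONG")
```

Calling `redis_ping()` with no arguments checks localhost on port 6379, matching the nc command above. If the Redis server requires a password, it will reply with an error rather than +PONG, and this function will return False even though the server is reachable.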

The appropriate dependencies must also be installed for Synapse. If using a virtualenv, these can be installed with:

pip install "matrix-synapse[redis]"

Note that these dependencies are included when Synapse is installed with pip install matrix-synapse[all]. They are also included in the Debian packages from matrix.org and in the Docker images at https://hub.docker.com/r/matrixdotorg/synapse/.

To make effective use of the workers, you will need to configure an HTTP reverse proxy such as nginx or haproxy, which will direct incoming requests to the correct worker or to the main Synapse instance. See the reverse proxy documentation for information on setting up a reverse proxy.
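The routing decision a reverse proxy makes can be sketched as a lookup table from path patterns to upstream addresses, with the main process as the fallback. The Python below is only a model of that logic, not a replacement for nginx or haproxy. The patterns, ports and the name `pick_upstream` are illustrative assumptions; the actual endpoints each worker can handle should be taken from the worker documentation.

```python
import re
from typing import List, Pattern, Tuple

# Assumed address of the main Synapse process (illustrative).
MAIN_PROCESS = "http://localhost:8008"

# Illustrative routing table: (path pattern, worker address). Order matters,
# since the first matching pattern wins.
ROUTES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"^/_matrix/client/[^/]+/sync$"), "http://localhost:8083"),
    (re.compile(r"^/_matrix/federation/"), "http://localhost:8084"),
]


def pick_upstream(path: str) -> str:
    """Return the worker that should handle this path, else the main process."""
    for pattern, upstream in ROUTES:
        if pattern.search(path):
            return upstream
    return MAIN_PROCESS
```

Falling back to the main process means that any request not explicitly assigned to a worker is still served.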

When using workers, each worker process has its own configuration file which contains settings specific to that worker, such as the HTTP listener that it provides (if any), logging configuration, etc.

Normally, the worker processes are configured to read from a shared configuration file as well as the worker-specific configuration files. This makes it easier to keep common configuration settings synchronised across all the processes.
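The benefit of layering a worker-specific file over a shared one can be seen in a small model. The sketch below uses plain dictionaries in place of parsed configuration files and assumes a simple rule where later files override earlier ones key by key at the top level; how Synapse actually combines the files should be checked against its configuration documentation. The function `merge_configs` and the example keys and values are illustrative.

```python
from typing import Any, Dict


def merge_configs(*configs: Dict[str, Any]) -> Dict[str, Any]:
    """Combine configs left to right; later top-level keys override earlier ones."""
    merged: Dict[str, Any] = {}
    for config in configs:
        merged.update(config)
    return merged


# Settings every process needs (illustrative values).
shared_config = {
    "server_name": "example.com",
    "redis": {"enabled": True},
}

# Settings that differ per worker (illustrative values).
worker_config = {
    "worker_app": "synapse.app.generic_worker",
    "worker_name": "worker1",
}

effective = merge_configs(shared_config, worker_config)
```

With this arrangement, changing a common setting such as the Redis options means editing one shared file rather than every worker's file.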