workers.md



Scaling synapse via workers
For small instances it recommended to run Synapse in the default monolith mode.
For larger instances where performance is a concern it can be helpful to split
out functionality into multiple separate python processes. These processes are
called 'workers', and are (eventually) intended to scale horizontally
independently.
Synapse's worker support is under active development and subject to change as
we attempt to rapidly scale ever larger Synapse instances. However we are
documenting it here to help admins needing a highly scalable Synapse instance
similar to the one running matrix.org.
All processes continue to share the same database instance, and as such,
workers only work with PostgreSQL-based Synapse deployments. SQLite should only
be used for demo purposes and any admin considering workers should already be
running PostgreSQL.
See also Matrix.org blog post
for a higher level overview.

Main process/worker communication
The processes communicate with each other via a Synapse-specific protocol called
'replication' (analogous to MySQL- or Postgres-style database replication) which
feeds streams of newly written data between processes so they can be kept in
sync with the database state.
When configured to do so, Synapse uses a
Redis pub/sub channel to send the replication
stream between all configured Synapse processes. Additionally, processes may
make HTTP requests to each other, primarily for operations which need to wait
for a reply ─ such as sending an event.
Redis support was added in v1.13.0 with it becoming the recommended method in
v1.18.0. It replaced the old direct TCP connections (which is deprecated as of
v1.18.0) to the main process. With Redis, rather than all the workers connecting
to the main process, all the workers and the main process connect to Redis,
which relays replication commands between processes. This can give a significant
cpu saving on the main process and will be a prerequisite for upcoming
performance improvements.
If Redis support is enabled Synapse will use it as a shared cache, as well as a
pub/sub mechanism.
See the Architectural diagram section at the end for
a visualisation of what this looks like.

Setting up workers
A Redis server is required to manage the communication between the processes.
The Redis server should be installed following the normal procedure for your
distribution (e.g. apt install redis-server on Debian). It is safe to use an
existing Redis deployment if you have one.
Once installed, check that Redis is running and accessible from the host running
Synapse, for example by executing echo PING | nc -q1 localhost 6379 and seeing
a response of +PONG.
The appropriate dependencies must also be installed for Synapse. If using a
virtualenv, these can be installed with:

pip install "matrix-synapse[redis]"


Note that these dependencies are included when synapse is installed with pip install matrix-synapse[all]. They are also included in the debian packages from
matrix.org and in the docker images at
https://hub.docker.com/r/matrixdotorg/synapse/.
To make effective use of the workers, you will need to configure an HTTP
reverse-proxy such as nginx or haproxy, which will direct incoming requests to
the correct worker, or to the main synapse instance. See
the reverse proxy documentation for information on setting up a reverse
proxy.
When using workers, each worker process has its own configuration file which
contains settings specific to that worker, such as the HTTP listener that it
provides (if any), logging configuration, etc.
Normally, the worker processes are configured to read from a shared
configuration file as well as the worker-specific configuration files. This
makes it easier to keep common configuration settings synchronised across all
the processes.