Deploy MsPASS with Docker Compose
Prerequisites
Docker Compose is a tool to deploy and coordinate multiple Docker containers. To install Docker Compose on machines where you have root access, please refer to the installation guide here and follow the instructions for your specific platform. Docker Compose uses YAML files to define and run multiple containers. Please refer to its documentation for more details on using the tool.
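If Docker Compose is already installed, you can verify it from a terminal. The exact command depends on whether you have the standalone docker-compose binary or the Compose plugin that ships with recent Docker releases:

docker-compose --version
docker compose version

Either command printing a version number indicates the tool is available.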
Configure MsPASS Containers
The MsPASS runtime environment is composed of multiple components, each serving a specific function: frontend, scheduler, worker, db, dbmanager, and shard. The MsPASS container can be launched in any one of these roles by setting the MSPASS_ROLE environment variable (a minimal single-role launch example follows the list). The options include:

- frontend: Uses Jupyter Notebook to provide users an interactive development environment that connects to the other components.
- scheduler: The Dask scheduler or Spark master that coordinates all of its corresponding workers.
- worker: A Dask or Spark worker that carries out the computation in parallel.
- db: Runs a standalone MongoDB daemon process that manages data access.
- dbmanager: Runs MongoDB's config server and router server to provide access to a sharded MongoDB cluster.
- shard: Each shard holds a subset of the sharded data to provide horizontal scaling of the database.
- all: Equivalent to frontend + scheduler + worker + db. This is the default role, and it is how the container is launched in the previous section.
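For reference, here is a minimal sketch of launching a single MsPASS container in one specific role with a plain docker run command. This is only an illustration; the container name and mounted directory are arbitrary choices, not MsPASS requirements:

# run one container as a standalone MongoDB (the db role)
docker run -d --name mspass-db \
    -p 27017:27017 \
    -v ${PWD}:/home \
    -e MSPASS_ROLE=db \
    mspass/mspass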
The following environment variables also need to be set for the different roles to communicate (a sketch of how they fit together follows the list):

- MSPASS_SCHEDULER: Either dask or spark can be used to run parallel computations. dask is the default.
- MSPASS_SCHEDULER_ADDRESS: The IP address or hostname of the scheduler. The worker and frontend rely on this to communicate with the scheduler.
- MSPASS_DB_ADDRESS: The IP address or hostname of the db or dbmanager. The frontend relies on this to access the database.
- MSPASS_SHARD_LIST: A space-delimited string of the format $HOSTNAME/$HOSTNAME:$MONGODB_PORT for all the shards. The dbmanager relies on this to build the sharded database cluster.
- MSPASS_SHARD_ID: Assigns each shard a unique name so that it can write to its own data_shard_${MSPASS_SHARD_ID} directory under the /db directory (in case the shards run on a shared filesystem).
- MSPASS_JUPYTER_PWD: In the frontend, the user can optionally set a password for Jupyter Notebook access. If set to an empty string, Jupyter can be accessed with no password, which may cause security issues. If unset (the default behavior), the Jupyter Notebook will generate a random token and print it to stdout.
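To make the relationships between these variables concrete, the sketch below starts a worker and a frontend that point at a scheduler and a database running elsewhere. The hostnames scheduler.example.org and db.example.org are placeholders used only for illustration:

# hypothetical worker attached to an existing Dask scheduler
docker run -d --name mspass-worker \
    -e MSPASS_ROLE=worker \
    -e MSPASS_SCHEDULER=dask \
    -e MSPASS_SCHEDULER_ADDRESS=scheduler.example.org \
    mspass/mspass

# hypothetical frontend attached to the same scheduler and a database host
docker run -d --name mspass-frontend -p 8888:8888 \
    -e MSPASS_ROLE=frontend \
    -e MSPASS_SCHEDULER=dask \
    -e MSPASS_SCHEDULER_ADDRESS=scheduler.example.org \
    -e MSPASS_DB_ADDRESS=db.example.org \
    -e MSPASS_JUPYTER_PWD=mspass \
    mspass/mspass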
Users may change the default ports of all the underlying components by setting the following variables (an override example follows below):

- JUPYTER_PORT: The default is 8888.
- DASK_SCHEDULER_PORT: The default is 8786.
- SPARK_MASTER_PORT: The default is 7077.
- MONGODB_PORT: The default is 27017.

These variables are for experienced users only; the deployment can break if mismatched ports are set.
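As an illustration only, overriding a default port means setting the variable and the published port consistently. A frontend serving Jupyter on port 9999 instead of 8888 might be started like this (assuming port 9999 is free on the host; the other variables required by the frontend role are omitted here for brevity):

docker run -d --name mspass-frontend \
    -p 9999:9999 \
    -e MSPASS_ROLE=frontend \
    -e JUPYTER_PORT=9999 \
    mspass/mspass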
Deploy MsPASS Containers
Docker Compose can deploy multiple MsPASS containers of different roles, which simulates a distributed environment. Below, we provide two example Docker Compose configurations that set up both the computation (with Dask or Spark) and the database (with a sharded MongoDB cluster) in a distributed fashion. The first configuration uses Dask and defines Docker healthchecks for its services; the second uses Spark and waits for its dependencies with the dockerize utility.
version: '3.7'

services:

  mspass-dbmanager:
    image: mspass/mspass
    volumes:
      - "${PWD}/:/home"
    ports:
      - 27017:27017
    depends_on:
      - mspass-shard-0
      - mspass-shard-1
    environment:
      MSPASS_ROLE: dbmanager
      MSPASS_SHARD_LIST: mspass-shard-0/mspass-shard-0:27017 mspass-shard-1/mspass-shard-1:27017
      MONGODB_PORT: 27017
    healthcheck:
      test: echo 'db.runCommand("ping").ok' | mongosh localhost:27017/test --quiet
      interval: 10s
      timeout: 60s
      retries: 5
      start_period: 5s

  mspass-shard-0:
    hostname: mspass-shard-0
    image: mspass/mspass
    volumes:
      - "${PWD}/:/home"
    environment:
      MSPASS_ROLE: shard
      MSPASS_SHARD_ID: 0
      MONGODB_PORT: 27017
      SHARD_DB_PATH: scratch
    healthcheck:
      test: echo 'db.runCommand("ping").ok' | mongosh localhost:27017/test --quiet
      interval: 10s
      timeout: 60s
      retries: 5
      start_period: 5s

  mspass-shard-1:
    hostname: mspass-shard-1
    image: mspass/mspass
    volumes:
      - "${PWD}/:/home"
    environment:
      MSPASS_ROLE: shard
      MSPASS_SHARD_ID: 1
      MONGODB_PORT: 27017
      SHARD_DB_PATH: scratch
    healthcheck:
      test: echo 'db.runCommand("ping").ok' | mongosh localhost:27017/test --quiet
      interval: 10s
      timeout: 60s
      retries: 5
      start_period: 5s

  mspass-scheduler:
    image: mspass/mspass
    volumes:
      - "${PWD}/:/home"
    ports:
      - 8786:8786
    environment:
      MSPASS_ROLE: scheduler
      MSPASS_SCHEDULER: dask
      DASK_SCHEDULER_PORT: 8786
    healthcheck:
      test: wget --no-verbose --tries=1 --spider http://localhost:8786
      interval: 10s
      timeout: 60s
      retries: 5
      start_period: 5s

  mspass-worker:
    image: mspass/mspass
    volumes:
      - "${PWD}/:/home"
    depends_on:
      - mspass-scheduler
    environment:
      MSPASS_ROLE: worker
      MSPASS_SCHEDULER: dask
      MSPASS_SCHEDULER_ADDRESS: mspass-scheduler

  mspass-frontend:
    image: mspass/mspass
    volumes:
      - "${PWD}/:/home"
    ports:
      - 8888:8888
    depends_on:
      - mspass-dbmanager
      - mspass-scheduler
    environment:
      MSPASS_ROLE: frontend
      MSPASS_SCHEDULER: dask
      MSPASS_SCHEDULER_ADDRESS: mspass-scheduler
      MSPASS_DB_ADDRESS: mspass-dbmanager
      MSPASS_JUPYTER_PWD: mspass
      JUPYTER_PORT: 8888
version: '3'

services:

  mspass-dbmanager:
    image: mspass/mspass
    volumes:
      - "${PWD}:/home"
    command: dockerize -wait tcp://mspass-shard-0:27017 -wait tcp://mspass-shard-1:27017 -timeout 240s /usr/sbin/start-mspass.sh
    ports:
      - 27017:27017
    depends_on:
      - mspass-shard-0
      - mspass-shard-1
    environment:
      MSPASS_ROLE: dbmanager
      MSPASS_SHARD_LIST: mspass-shard-0/mspass-shard-0:27017 mspass-shard-1/mspass-shard-1:27017
      MONGODB_PORT: 27017

  mspass-shard-0:
    hostname: mspass-shard-0
    image: mspass/mspass
    volumes:
      - "${PWD}:/home"
    environment:
      MSPASS_ROLE: shard
      MSPASS_SHARD_ID: 0
      MONGODB_PORT: 27017

  mspass-shard-1:
    hostname: mspass-shard-1
    image: mspass/mspass
    volumes:
      - "${PWD}:/home"
    environment:
      MSPASS_ROLE: shard
      MSPASS_SHARD_ID: 1
      MONGODB_PORT: 27017

  mspass-scheduler:
    image: mspass/mspass
    volumes:
      - "${PWD}:/home"
    ports:
      - 7077:7077
    environment:
      MSPASS_ROLE: scheduler
      MSPASS_SCHEDULER: spark
      SPARK_MASTER_PORT: 7077

  mspass-worker:
    image: mspass/mspass
    volumes:
      - "${PWD}:/home"
    command: dockerize -wait tcp://mspass-scheduler:7077 -timeout 240s /usr/sbin/start-mspass.sh
    depends_on:
      - mspass-scheduler
    environment:
      MSPASS_ROLE: worker
      MSPASS_SCHEDULER: spark
      MSPASS_SCHEDULER_ADDRESS: mspass-scheduler

  mspass-frontend:
    image: mspass/mspass
    volumes:
      - "${PWD}:/home"
    command: dockerize -wait tcp://mspass-dbmanager:27017 -wait tcp://mspass-scheduler:7077 -timeout 240s /usr/sbin/start-mspass.sh
    ports:
      - 8888:8888
    depends_on:
      - mspass-dbmanager
      - mspass-scheduler
    environment:
      MSPASS_ROLE: frontend
      MSPASS_SCHEDULER: spark
      MSPASS_SCHEDULER_ADDRESS: mspass-scheduler
      MSPASS_DB_ADDRESS: mspass-dbmanager
      MSPASS_JUPYTER_PWD: mspass
      JUPYTER_PORT: 8888
To test out the multi-container setup, we can use the docker-compose
command, which will deploy all the components locally.
First, save the content of one of the two code blocks above to a file called docker-compose.yml
, and make sure that you are running in a directory where you want to keep the files created by the containers (i.e., the db, logs, and work directories).
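For example (the directory name below is purely illustrative), you might prepare a fresh working directory and place the compose file in it before starting the containers:

mkdir -p mspass_deploy    # the db, logs, and work directories will be created here
cd mspass_deploy
# save one of the two configurations above as docker-compose.yml in this directory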
Then, run the following command to start all the containers:
docker-compose -f docker-compose.yml up -d
This command will start all the containers as services running in the background, and you should see that all the containers started correctly, with output like this:
$ docker-compose -f docker-compose.yml up -d
Creating network "mspass_default" with the default driver
Creating mspass_mspass-shard-1_1 ... done
Creating mspass_mspass-scheduler_1 ... done
Creating mspass_mspass-shard-0_1 ... done
Creating mspass_mspass-worker_1 ... done
Creating mspass_mspass-dbmanager_1 ... done
Creating mspass_mspass-frontend_1 ... done
You can then open http://127.0.0.1:8888/
in your browser to access the Jupyter Notebook frontend.
Note that it may take a minute for the frontend to be ready.
You can check the status of the frontend with this command:
docker-compose -f docker-compose.yml logs mspass-frontend
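As an additional check, you can also list all of the services and their current state with the ps subcommand:

docker-compose -f docker-compose.yml ps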
The notebook will ask for a password; type in mspass there, since we have set the MSPASS_JUPYTER_PWD environment variable to that value for the mspass-frontend service in the docker-compose.yml file.
When you are done with MsPASS, you can bring down the containers with:
docker-compose -f docker-compose.yml down
You should see output similar to the following, indicating that all the containers are correctly cleaned up:
$ docker-compose -f docker-compose.yml down
Stopping mspass_mspass-frontend_1 ... done
Stopping mspass_mspass-dbmanager_1 ... done
Stopping mspass_mspass-worker_1 ... done
Stopping mspass_mspass-shard-0_1 ... done
Stopping mspass_mspass-scheduler_1 ... done
Stopping mspass_mspass-shard-1_1 ... done
Removing mspass_mspass-frontend_1 ... done
Removing mspass_mspass-dbmanager_1 ... done
Removing mspass_mspass-worker_1 ... done
Removing mspass_mspass-shard-0_1 ... done
Removing mspass_mspass-scheduler_1 ... done
Removing mspass_mspass-shard-1_1 ... done
Removing network mspass_default