mongodb¶
https://www.mongodb.com/en-us
https://www.mongodb.com/docs/
https://github.com/mongodb/mongo
backup and restore¶
https://www.mongodb.com/docs/manual/tutorial/backup-and-restore-tools/
These steps were confirmed to work with MongoDB version 6 running as a Docker container.
# the container name is "mongodb6" in this example
# backup one database
docker exec mongodb6 mkdir -p /opt/backup
docker exec mongodb6 /usr/bin/mongodump -u username_here --password="password_here" --authenticationDatabase=admin --db="db_name_here" --gzip --out=/opt/backup
# copy the backup file out of the container to the local machine ("work" is the collection name in this example)
docker cp mongodb6:/opt/backup/{db_name_here}/work.bson.gz ~/db_bak/.
# to restore, assuming the same version of database service is running
# prepare the backup directory and store the backup file there
docker exec mongodb6 mkdir -p /opt/backup/{db_name_here}
cd ~/db_bak
docker cp work.bson.gz mongodb6:/opt/backup/{db_name_here}/.
# and restore using the backup file
docker exec mongodb6 /usr/bin/mongorestore -u username_here --password="password_here" --authenticationDatabase=admin --gzip /opt/backup/{db_name_here}/work.bson.gz
# note that the parent directory name of the backup file is important
# because that becomes the name of the database
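The naming rule in the note above can be mirrored in a few lines of Python. This is a minimal sketch and not part of mongorestore: `infer_target` is a hypothetical helper that only reproduces the convention described here (the parent directory gives the database name, the file name gives the collection name), so you can check where a file would land before restoring it.

```python
from pathlib import PurePosixPath


def infer_target(backup_file: str) -> tuple[str, str]:
    """Return (database, collection) implied by a dump file path.

    Hypothetical helper: it mirrors the naming convention noted above
    and does not call mongorestore.
    """
    path = PurePosixPath(backup_file)
    name = path.name
    # Strip the optional .gz suffix first, then require the .bson suffix.
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    if not name.endswith(".bson"):
        raise ValueError(f"not a BSON dump file: {backup_file}")
    collection = name[: -len(".bson")]
    database = path.parent.name
    return database, collection


print(infer_target("/opt/backup/db_name_here/work.bson.gz"))
```

If the database name printed here is not the one you expect, rename the parent directory inside the container before running mongorestore.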
MongoDB Community Kubernetes Operator¶
https://github.com/mongodb/mongodb-kubernetes-operator/blob/master/README.md
note¶
The community edition does not support changing the volume size.
installation¶
https://github.com/mongodb/mongodb-kubernetes-operator/blob/master/docs/install-upgrade.md
# add the repository to helm
helm repo add mongodb https://mongodb.github.io/helm-charts
# helm repo update
# see the available items in the mongodb repo
helm search repo mongodb
# see the list of available versions of mongodb community operator
helm search repo -l mongodb/community-operator
# get the values file of the version you want to use
helm show values mongodb/community-operator --version=0.11.0 > mongodb-community-operator-0.11.0-values.yaml
# keep this one and generate a copy to edit and use
cp mongodb-community-operator-0.11.0-values.yaml mongodb-community-operator-values.yaml
# edit mongodb-community-operator-values.yaml file
# generate the flux helm source and helm release manifests so that flux gitops can process them
# create the namespace before passing the manifests to flux
# flux should then deploy the mongodb community operator
values file¶
These are the changes I made to the values file.
- operator
  - watch namespace: all
  - extraenv
    - set my custom cluster domain name
- database
  - namespace: mongo
script to generate flux source and helm release¶
#!/bin/bash
# add flux helmrepo to the manifest
flux create source helm mongodb \
  --url=https://mongodb.github.io/helm-charts \
  --interval=1h0m0s \
  --export > ../mongodb-community-operator.yaml
# add flux helm release to the manifest including the customized values.yaml file
flux create helmrelease mongodb-community-operator \
  --interval=10m \
  --target-namespace=mongo \
  --source=HelmRepository/mongodb \
  --chart=community-operator \
  --chart-version=0.11.0 \
  --values=../values/mongodb-community-operator-values.yaml \
  --export >> ../mongodb-community-operator.yaml
namespace¶
I prefer to create the namespace independently of the helm release.
---
apiVersion: v1
kind: Namespace
metadata:
  name: mongo
  labels:
    service: mongo
    type: infrastructure
creating a database¶
This is my test mdbc (MongoDBCommunity) manifest. Create the user secret separately before applying it. In this case, I created a secret named "mdbadmin-secret" in the same namespace.
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: testmongo
  namespace: mongo
spec:
  members: 3
  type: ReplicaSet
  version: "4.4.29"
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: mdbadmin
      db: admin
      passwordSecretRef: # a reference to the secret that will be used to generate the user's password
        name: mdbadmin-secret
      roles:
        - name: clusterAdmin
          db: admin
        - name: userAdminAnyDatabase
          db: admin
        - name: dbAdminAnyDatabase
          db: admin
        - name: readWriteAnyDatabase
          db: admin
      scramCredentialsSecretName: my-scram
  additionalMongodConfig:
    storage.wiredTiger.engineConfig.journalCompressor: zlib
  statefulSet:
    spec:
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: "directpv-min-io"
            resources:
              requests:
                storage: 20Gi
        - metadata:
            name: logs-volume
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: "directpv-min-io"
            resources:
              requests:
                storage: 2Gi
      template:
        spec:
          nodeSelector:
            role: storage-node
          containers:
            - name: mongod
              resources:
                limits:
                  cpu: "0.2"
                  memory: 250M
                requests:
                  cpu: "0.2"
                  memory: 200M
            - name: mongodb-agent
              resources:
                limits:
                  cpu: "0.2"
                  memory: 250M
                requests:
                  cpu: "0.2"
                  memory: 200M
          initContainers:
            - name: mongodb-agent-readinessprobe
              resources:
                limits:
                  cpu: "2"
                  memory: 200M
                requests:
                  cpu: "1"
                  memory: 100M
user secret¶
https://github.com/mongodb/mongodb-kubernetes-operator/blob/master/docs/users.md#create-a-user-secret
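As a rough illustration of what that user secret looks like, here is a minimal Python sketch that builds the manifest as JSON. It assumes, based on the linked page, that the password is stored under the key `password` in `stringData`. The helper name `user_secret` and the sample values are hypothetical.

```python
import json


def user_secret(name: str, namespace: str, password: str) -> dict:
    """Build a Secret manifest holding the database user's password.

    Assumption: the operator reads the password from the "password" key,
    as described in the linked users.md page.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # stringData takes plain text; Kubernetes base64-encodes it on write
        "stringData": {"password": password},
    }


manifest = user_secret("mdbadmin-secret", "mongo", "password_here")
print(json.dumps(manifest, indent=2))
```

kubectl accepts JSON manifests, so the printed output can be saved to a file and applied with `kubectl apply -f`. Avoid committing the plain-text password to a git repository.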
deploying mdbc in a namespace other than the one the mdbc operator is running in¶
The required service account, role, and rolebinding are not created automatically when you deploy an mdbc manifest in another namespace.
Create the set below in whichever namespace you deploy the mdbc database to, and the operator can then create the necessary statefulset successfully.
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: mongodb-database
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: mongodb-database
subjects:
  - kind: ServiceAccount
    name: mongodb-database
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: mongodb-database
rules:
  - apiGroups:
      - ""
    resources:
      - secrets
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - patch
      - delete
      - get
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mongodb-database
accessing the database¶
Various secrets are created by the operator.
NAME                         TYPE     DATA   AGE
mdbadmin-secret              Opaque   1      125m
my-scram-scram-credentials   Opaque   6      23m
testmongo-admin-mdbadmin     Opaque   4      17m
testmongo-agent-password     Opaque   1      23m
testmongo-config             Opaque   1      23m
testmongo-keyfile            Opaque   1      23m
The name of the database resource (mdbc, MongoDBCommunity) is "testmongo", the db name for the credentials is "admin", and the username created is "mdbadmin". You can find the username, password, and connection strings inside the secret. For example, run the command below to get the decoded mongodb connection string.
kubectl -n mongo get secret testmongo-admin-mdbadmin -o jsonpath='{.data.connectionString\.standard}' | base64 -d
The secret itself has the following shape.
{
  "apiVersion": "v1",
  "data": {
    "connectionString.standard": "base64 string here",
    "connectionString.standardSrv": "base64 string here",
    "password": "base64 string here",
    "username": "base64 string here"
  },
  "kind": "Secret",
  "metadata": {
    "creationTimestamp": "2024-08-29T00:58:24Z",
    "name": "testmongo-admin-mdbadmin",
    "namespace": "mongo",
    "ownerReferences": [
      {
        "apiVersion": "mongodbcommunity.mongodb.com/v1",
        "blockOwnerDeletion": true,
        "controller": true,
        "kind": "MongoDBCommunity",
        "name": "testmongo",
        "uid": "2134b34c-edd2-4117-8220-cfcf959d8199"
      }
    ],
    "resourceVersion": "54265198",
    "uid": "032abf28-7537-4bee-a14e-f9114d5dd8b7"
  },
  "type": "Opaque"
}
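Every value under `data` is base64-encoded, so the whole secret can also be decoded in one go. The sketch below is a minimal, standard-library-only example. The function name `decode_secret_data` and the sample values are made up for illustration, and the sample dictionary stands in for real `kubectl ... -o json` output.

```python
import base64


def decode_secret_data(secret: dict) -> dict:
    """Base64-decode every value under the Secret's "data" key."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in secret.get("data", {}).items()
    }


# Stand-in for a parsed secret; the values are illustrative only.
sample = {
    "kind": "Secret",
    "data": {
        "username": base64.b64encode(b"mdbadmin").decode("ascii"),
        "password": base64.b64encode(b"password_here").decode("ascii"),
    },
}
decoded = decode_secret_data(sample)
print(decoded["username"])
```

With real output, load the JSON first (for example with `json.load(sys.stdin)`) and pass the resulting dictionary to the function.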
using pymongo¶
pip install "pymongo[srv]"
Then try the lines below in an interactive Python session.
from pymongo import MongoClient

username = "username_here"
password = "password_here"
db_server = "server_part_of_connection_string_here"
# example in case of my testmongo mdbc in mongo namespace
# db_server = "testmongo-svc.mongo.svc.cluster.local/admin?replicaSet=testmongo&ssl=false"
client = MongoClient(
    "mongodb+srv://%s:%s@%s" % (username, password, db_server),
    serverSelectionTimeoutMS=4000,
)
# check the connection and the server version
client.admin.command("ping")
client.server_info().get("version")
# create database & collection, and add a record
db = client["testdb"]
col = db["testcol"]
dct = {"work_id": "123", "title": "testtitle"}
col.find_one_and_update({"work_id": "123"}, {"$set": dct}, upsert=True)
# confirm that the new database is created
for c in client.list_databases():
    print(c)
client.close()
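One thing to watch with a connection string assembled this way: a username or password containing characters such as `@`, `:` or `/` has to be percent-escaped, or the URI will not parse. A minimal sketch using only the standard library follows. `build_uri` is a hypothetical helper, and the password shown is made up.

```python
from urllib.parse import quote_plus


def build_uri(username: str, password: str, db_server: str, srv: bool = True) -> str:
    """Assemble a MongoDB URI with percent-escaped credentials."""
    scheme = "mongodb+srv" if srv else "mongodb"
    return "%s://%s:%s@%s" % (
        scheme,
        quote_plus(username),
        quote_plus(password),
        db_server,
    )


uri = build_uri(
    "mdbadmin",
    "p@ss/word:1",  # illustrative password with reserved characters
    "testmongo-svc.mongo.svc.cluster.local/admin?replicaSet=testmongo&ssl=false",
)
print(uri)
```

The resulting string can be passed to `MongoClient` in place of the hand-formatted one.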
deleting the mdbc¶
Since I use flux, I just removed the manifest from the kustomization, and the database-related workloads were removed. The PVCs were left behind, so I had to delete them manually.
$ kubectl -n mongo get pvc
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      VOLUMEATTRIBUTESCLASS   AGE
data-volume-testmongo-0   Bound    pvc-4650f7d7-2b00-41b2-927c-5b5efb8d80c0   20Gi       RWO            directpv-min-io   <unset>                 22m
data-volume-testmongo-1   Bound    pvc-0ebecb90-01a5-4e2a-a582-776c54a7bbdf   20Gi       RWO            directpv-min-io   <unset>                 21m
data-volume-testmongo-2   Bound    pvc-af616099-b7c9-4c73-a01e-f711305478ae   20Gi       RWO            directpv-min-io   <unset>                 21m
logs-volume-testmongo-0   Bound    pvc-2d6fd006-28ac-460b-9a65-d2dc4a997f72   2Gi        RWO            directpv-min-io   <unset>                 22m
logs-volume-testmongo-1   Bound    pvc-2118b5d5-a6b0-4229-ba13-95a07a70cde8   2Gi        RWO            directpv-min-io   <unset>                 21m
logs-volume-testmongo-2   Bound    pvc-f183b82e-22f5-4278-a7fd-19cb7d65d1c5   2Gi        RWO            directpv-min-io   <unset>                 21m
$ kubectl -n mongo delete pvc logs-volume-testmongo-0
persistentvolumeclaim "logs-volume-testmongo-0" deleted
$ kubectl -n mongo delete pvc logs-volume-testmongo-1
persistentvolumeclaim "logs-volume-testmongo-1" deleted
$ kubectl -n mongo delete pvc logs-volume-testmongo-2
persistentvolumeclaim "logs-volume-testmongo-2" deleted
$ kubectl -n mongo delete pvc data-volume-testmongo-2
persistentvolumeclaim "data-volume-testmongo-2" deleted
$ kubectl -n mongo delete pvc data-volume-testmongo-1
persistentvolumeclaim "data-volume-testmongo-1" deleted
$ kubectl -n mongo delete pvc data-volume-testmongo-0
persistentvolumeclaim "data-volume-testmongo-0" deleted
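The PVC names above follow the usual StatefulSet pattern of claim template name, StatefulSet name, and pod ordinal, so the list of delete commands can be generated instead of typed out. This is a small sketch under that assumption. `pvc_names` is a hypothetical helper, and it only prints the commands rather than running them.

```python
def pvc_names(statefulset: str, templates: list[str], replicas: int) -> list[str]:
    """List PVC names as <template>-<statefulset>-<ordinal>."""
    return [
        f"{template}-{statefulset}-{ordinal}"
        for template in templates
        for ordinal in range(replicas)
    ]


names = pvc_names("testmongo", ["data-volume", "logs-volume"], 3)
for name in names:
    # Printed only; review the list before running anything destructive.
    print(f"kubectl -n mongo delete pvc {name}")
```

`kubectl delete pvc` also accepts several names in one call, so the generated names can be joined onto a single command line.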
backup and restore of mongodb data running on mdbc¶
- create a pod with pvc running python image
  - PV might not be needed if the data is small enough
    - specific numbers here...?
- get inside the container and run python
  - kubectl -n namespace_for_the_application exec -it pod/backup -- bash
  - pip install -U pip setuptools
  - pip install "pymongo[srv]"
  - OR, just have the pod spin up the container with pymongo installed
- run bson dump script
- kubectl cp to copy the backup data to somewhere else
import os

import bson
from pymongo import MongoClient

db_username = os.environ.get("ENV_FOR_DB_USERNAME")
db_password = os.environ.get("ENV_FOR_DB_PASSWORD")
db_server = os.environ.get("ENV_FOR_DB_SERVER")
db_name = "database_name_here"
path = "/opt/backup"  # or any directory where the pvc volume is mounted
connection_string = "mongodb+srv://%s:%s@%s" % (db_username, db_password, db_server)
conn = MongoClient(connection_string, serverSelectionTimeoutMS=4000)


def dump(conn, db_name, path):
    """Write every collection of db_name to <path>/<collection>.bson."""
    db = conn[db_name]
    # all available collections
    for coll in db.list_collection_names():
        with open(os.path.join(path, f"{coll}.bson"), "wb") as f:
            for doc in db[coll].find():
                f.write(bson.encode(doc))
    return 0


def restore(conn, db_name, path):
    """Load every <collection>.bson file under path into db_name."""
    db = conn[db_name]
    for filename in os.listdir(path):
        if filename.endswith(".bson"):
            with open(os.path.join(path, filename), "rb") as f:
                docs = bson.decode_all(f.read())
            # insert_many() rejects an empty list, so skip empty collections
            if docs:
                # strip only the ".bson" suffix so dotted collection names survive
                db[filename[: -len(".bson")]].insert_many(docs)
    return 0


# run whichever you need, dump() or restore()
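Writing encoded documents back to back works as a dump format because every BSON document starts with a 4-byte little-endian length that covers the whole document. The sketch below splits such a file into individual documents with the standard library only. It is meant as a sanity check on a dump file rather than a replacement for `bson.decode_all`, and `split_bson_stream` is a hypothetical helper name.

```python
import struct


def split_bson_stream(data: bytes) -> list[bytes]:
    """Split back-to-back BSON documents into individual byte strings."""
    docs = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated length prefix")
        # int32, little-endian: total document size, prefix included
        (size,) = struct.unpack_from("<i", data, offset)
        # the smallest valid document (empty) is 5 bytes long
        if size < 5 or offset + size > len(data):
            raise ValueError("corrupt or truncated document")
        docs.append(data[offset:offset + size])
        offset += size
    return docs


# Two hand-built documents: {} and {"a": 1} (int32 value)
empty_doc = b"\x05\x00\x00\x00\x00"
one_field = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"
print(len(split_bson_stream(empty_doc + one_field)))
```

Counting the documents in a `.bson` file this way and comparing against the collection's document count is a cheap check that a dump completed.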