Credentials and Secrets
The Dataverse Java EE application needs to access remote resources like a PostgreSQL database or a persistent identifier service like DataCite. For this you’ll need credentials, which are meant to be kept secret.
Besides credentials, you might need to think of certificates, too, depending on your actual setup. Your mileage may vary.
Credentials in Dataverse application container
Credentials used in Dataverse may be found in the upstream Installation Guide, mostly in the config section.
Concept
The basic idea behind credentials used in the Dataverse application container based on the dataverse-k8s image is to pass them in via environment variables. These mechanisms are described in Configuration, too.
Non-Secret Materials
You can provide the non-secret parts of your credentials, such as a user name or database name, directly as environment variables from your Deployment, PodPreset, ConfigMap, Secret et al. When not using Kubernetes, environment variables are still a widely used concept.
Silent Secrets
Please keep in mind that secret information like a password, key or similar should be passed in differently. For these, you can mount files at certain places (see image documentation), which will be read and piped into an environment variable, crafted into a JVM password alias, written into configuration files, etc.
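The file-to-environment step can be sketched in shell roughly like this. It is only an illustration of the idea, not the actual startup code of the image: the helper name load_secret_file and the variable name POSTGRES_PASSWORD are assumptions, and a temporary directory stands in for the real mount point.

```shell
#!/bin/sh
# Sketch only: read a mounted secret file into an environment variable.
# A temporary directory stands in for the real secrets mount (assumption).
SECRETS_DIR="$(mktemp -d)"
mkdir -p "${SECRETS_DIR}/db"
printf 'changeme\n' > "${SECRETS_DIR}/db/password"   # stands in for a mounted Secret key

# load_secret_file VAR FILE: export VAR with the content of FILE if FILE is readable.
# Hypothetical helper, not part of the dataverse-k8s image.
load_secret_file() {
  var_name="$1"
  file="$2"
  if [ -r "$file" ]; then
    # Command substitution strips the trailing newline of the file.
    value="$(cat "$file")"
    export "${var_name}=${value}"
  fi
}

# POSTGRES_PASSWORD is an assumed variable name, used for illustration.
load_secret_file POSTGRES_PASSWORD "${SECRETS_DIR}/db/password"
```

A missing or unreadable file leaves the variable untouched, so a value supplied another way is not overwritten.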
Example: PostgreSQL connection
This example is about the PostgreSQL credentials and can be adapted to different use cases. It uses the Kubernetes concept of Secrets (see below).
More examples can be found at /personas/demo/secrets.yaml.
kubectl create secret generic dataverse-postgresql \
--from-literal=username='dataverse' \
--from-literal=password='changeme' \
--from-literal=database='mydataverse'
Executing the above will create a Secret in your Kubernetes cluster.
It could be used in the Deployment like this (excerpt) to configure username, password and database name for the Dataverse PostgreSQL service:
kind: Deployment
# ...
spec:
  # ...
  template:
    spec:
      containers:
        - name: dataverse
          image: iqss/dataverse-k8s
          env:
            # Non-secret values are passed as environment variables.
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: dataverse-postgresql
                  key: username
                  optional: true
            - name: POSTGRES_DATABASE
              valueFrom:
                secretKeyRef:
                  name: dataverse-postgresql
                  key: database
                  optional: true
          volumeMounts:
            # Every key of the Secret, including the password,
            # appears as a file below /secrets/db.
            - name: db-secret
              mountPath: "/secrets/db"
              readOnly: true
      volumes:
        - name: db-secret
          secret:
            secretName: dataverse-postgresql
Example: Admin account password
The password for the superadmin account dataverseAdmin defaults to admin1 when you install (more precisely: bootstrap) Dataverse on Kubernetes running the Quickstart / Demo.
To use a different initial password, create a Secret (or use some other way to get the password into the file ${SECRETS_DIR}/admin/password). For a complete Secret example, have a look at /personas/demo/secrets.yaml.
kind: Secret
# ...
metadata:
  name: dataverse-admin
  # ...
stringData:
  password: admin1
If you did not use the default dataverse-admin name for the secret, you will have to adapt the bootstrap Job spec with your secret name.
During bootstrap, the mounted secret at ${SECRETS_DIR}/admin/password provisions your password while creating the account. A less secure way is to provide it as environment variable ADMIN_PASSWORD.
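How such a lookup might be ordered can be sketched as follows. The order used here (secret file first, then ADMIN_PASSWORD, then the default admin1) and the /secrets fallback for SECRETS_DIR are assumptions made for illustration; the function resolve_admin_password is hypothetical and not taken from the bootstrap job.

```shell
#!/bin/sh
# Sketch only: pick the initial admin password from the available sources.
# Assumed order: mounted secret file, then ADMIN_PASSWORD, then default "admin1".
resolve_admin_password() {
  secret_file="${SECRETS_DIR:-/secrets}/admin/password"
  if [ -r "$secret_file" ]; then
    cat "$secret_file"
  elif [ -n "${ADMIN_PASSWORD:-}" ]; then
    printf '%s\n' "$ADMIN_PASSWORD"
  else
    printf '%s\n' 'admin1'
  fi
}

# Demonstration with a temporary directory standing in for the real mount.
SECRETS_DIR="$(mktemp -d)"
unset ADMIN_PASSWORD
from_default="$(resolve_admin_password)"   # no file, no variable

ADMIN_PASSWORD='from-environment'
from_env="$(resolve_admin_password)"       # variable only

mkdir -p "${SECRETS_DIR}/admin"
printf 'from-secret-file\n' > "${SECRETS_DIR}/admin/password"
from_file="$(resolve_admin_password)"      # file wins over the variable
```

Preferring the file keeps the value out of the process environment, which is why the environment variable is only the fallback in this sketch.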
Hint
Using a password not matching the enabled password policies will force you to provide a new password on first login. See the Dataverse guides for more details.
Danger
You really should change it to something more secure when not used for ephemeral purposes.
Note
This default password is the same as IQSS/dataverse-ansible uses.
This is a bootstrap-time-only option. You cannot reset your password this way.
Example: Builtin Users API Key
By default, your installation is secured to not allow other builtin users besides dataverseAdmin. If you need or want to change this, you can provision a secret value for the BuiltinUsers.KEY setting when bootstrapping.
As this is an extension to the API, you need to extend your API secret as shown below.
kind: Secret
# ...
metadata:
  name: dataverse-api
  # ...
stringData:
  key: your-super-secret-unblock-key
  userskey: your-even-more-secure-BuiltinUsers.KEY-value
During bootstrap, the mounted secret at ${SECRETS_DIR}/api/userskey is read and provisioned.
Note
This is a bootstrap-time-only option. By design, it cannot be set by the configuration job. You can still use a manual curl call.
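A manual call could look like the sketch below, which uses the admin settings endpoint described in the Dataverse guides. The base URL, the /secrets fallback for SECRETS_DIR and the file ${SECRETS_DIR}/api/key holding the unblock key are assumptions; adapt them to your setup. The function set_builtin_users_key is a hypothetical helper and is only defined here, not executed.

```shell
#!/bin/sh
# Sketch only: set BuiltinUsers.KEY by hand via the admin settings API.
# DATAVERSE_URL and the secret file locations are assumptions for illustration.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"
SECRETS_DIR="${SECRETS_DIR:-/secrets}"
SETTINGS_URL="${DATAVERSE_URL}/api/admin/settings/BuiltinUsers.KEY"

set_builtin_users_key() {
  # Read both values from the mounted files so they do not end up in shell history.
  unblock_key="$(cat "${SECRETS_DIR}/api/key")"
  # "-d @file" sends the file content as the request body (newlines are stripped).
  curl -sS -X PUT \
    -d "@${SECRETS_DIR}/api/userskey" \
    "${SETTINGS_URL}?unblock-key=${unblock_key}"
}
```

Run set_builtin_users_key from a place that can reach the admin API of your installation, for example inside the application container.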
How to use secret information within K8s
Keeping things secret in a Kubernetes cluster needs attention in a few places:
Secure storage at rest
Secure distribution in/across cluster
Secure usage in containers
For production environments, you really should be looking closely at all of this. Every admin appreciates sleeping at night instead of putting out fires.
Secure usage
The most important thing to understand is how to deal with secret information when configuring Dataverse and using services. Obviously you will need to inject the secret data into running containers. There are multiple ways to do so, but to be safe there are “best practices”:
Use Kubernetes Secrets to store secret information. No excuses.
Prefer mounting secrets as (memory-backed) text files in containers rather than pushing them into environment variables (which are easier to snoop on than files).
Read more about securely injecting credentials in containers in the upstream documentation and below.
Note
For bigger clusters, applications, levels of security, etc. this might be insufficient. You should read articles on third-party tools, like this and others.
Secure storage and distribution
Right below the container level there are other attack vectors where a malicious actor could snoop on your secrets:
Cluster communication between your services, K8s services and K8s nodes
Stored secrets and used hard disks
There are checklists for being production ready with a K8s cluster. Use ‘em. Example.
Some basics (taken from here):
Secure communication by using TLS wherever possible.
Especially secure communication with etcd, which holds your secret data unencrypted.
Enable encryption at rest for the data stored in etcd.
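For the last point, Kubernetes configures encryption at rest on the API server, which encrypts Secrets before writing them to etcd. The following is a minimal sketch of preparing such a configuration; the temporary output file and the key name key1 are illustrative assumptions, and how the file reaches your API server depends on your cluster setup.

```shell
#!/bin/sh
# Sketch only: prepare an encryption-at-rest configuration for Secrets.
# Output location and the key name "key1" are illustrative assumptions.

# 32 random bytes, base64 encoded, as expected by the aescbc provider.
ENC_KEY="$(head -c 32 /dev/urandom | base64)"

CONFIG_FILE="$(mktemp)"
cat > "$CONFIG_FILE" <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${ENC_KEY}
      # identity allows reading Secrets that were stored before encryption was enabled
      - identity: {}
EOF

echo "Wrote ${CONFIG_FILE}; pass it to kube-apiserver via --encryption-provider-config."
```

The generated file contains the encryption key in plain text, so it needs the same care as any other secret.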
Secrets deployment tooling
You should also think about your deployment workflow for secrets:
It might be a good idea to use tools like Vault in big environments or teams.
If you like GitOps, take a look at the concept of sealed secrets.
Sealed Secrets: https://github.com/bitnami-labs/sealed-secrets
Even simpler, not requiring a K8s Controller:
Mozilla SOPS: https://github.com/mozilla/sops (Experimental Kustomize support & others)
Keepass database + decrypt.py Python script using PyKeePass