Vault 1.13.0 release notes
Software Release date: March 1, 2023
Summary: Vault Release 1.13.0 offers features and enhancements that improve the user experience while solving critical issues previously encountered by our customers. We are providing an overview of improvements in this set of release notes.
We encourage you to upgrade to the latest release of Vault to take advantage of the new benefits provided. With this latest release, we offer solutions to critical feature gaps that were identified previously. Please refer to the Changelog within the Vault release for further information on product improvements, including a comprehensive list of bug fixes.
Some of these enhancements and changes in this release include the following:
PKI improvements:
- Cross Cluster PKI Certificate Revocation: Introducing a new unified OCSP responder and CRL builder that provide a view of certificate revocations and CRLs across clusters for a given PKI mount.
- PKI UI Beta: New UI introducing cross-signing flow, overview page, roles and keys view.
- Health Checks: Provide a health overview of PKI mounts for proactive actions and troubleshooting.
- Command Line: Simplified CLI commands to discover and rotate issuers, and related commands, for PKI mounts.
Azure Auth Improvements:
- Rotate-root support: Add the ability to rotate the root account's client secret defined in the auth method's configuration via the new rotate-root endpoint.
- Managed Identities authentication: The auth method now allows any Azure resource that supports managed identities to authenticate with Vault.
- VMSS Flex authentication: Add support for Virtual Machine Scale Set (VMSS) Flex authentication.
GCP Secrets Impersonated Account Support: Add support for GCP service account impersonation, allowing callers to generate a GCP access token without requiring Vault to store or retrieve a GCP service account key for each role.
Managed Keys in Transit Engine: Support for offloading Transit Key operations to HSMs/external KMS.
KMIP Secret Engine Enhancements: Implemented Asymmetric Key Lifecycle Server and Advanced Cryptographic Server profiles. Added support for RSA keys and operations such as: MAC, MAC Verify, Sign, Sign Verify, RNG Seed and RNG Retrieve.
Vault as a SSM: Support is planned for an upcoming Vault PKCS#11 Provider version to include mechanisms for encryption, decryption, signing and signature verification for AES and RSA keys.
Replication (enterprise): We fixed a bug that could cause a cluster to wind up in a permanent merkle-diff/merkle-sync loop and never enter stream-wals, particularly in cases of high write loads on the primary cluster.
Share Secrets in Independent Namespaces (enterprise): You can now add users from namespaces outside a namespace hierarchy to a group in a given namespace hierarchy. For Vault Agent, you can now grant it access to secrets outside the namespace where it authenticated, and reduce the number of Agents you need to run.
User Lockout: Vault now supports configuration to lock out users when they have consecutive failed login attempts. This feature is enabled by default in 1.13 for the userpass, ldap, and approle auth methods.
Event System (Alpha): Vault has a new experimental event system. Events are currently only generated on writes to the KV secrets engine, but external plugins can also be updated to start generating events.
Kubernetes authentication plugin bug fix: Ensures a consistent TLS configuration for all k8s API requests. This fixes a bug where it was possible for the http.Client's Transport to be missing the necessary root CAs to ensure that all TLS connections between the auth engine and the Kubernetes API were validated against the configured set of CA certificates.
Kubernetes Secrets Engine on Vault UI: Introducing Kubernetes secrets engine support in the UI.
Client Count UI improvements: The current month and previous history are now combined into one dashboard.
OCSP Support in the TLS Certificate Auth Method: The auth method now can check for revoked certificates using the OCSP protocol.
UI Wizard removal: The UI Wizard has been removed from the UI since the information was occasionally out-of-date and did not align with the latest changes. A new and enhanced UI experience is planned in a future release.
Vault Agent improvements:
- Auto-auth introduced the token_file method, which reads an existing token from a file. The token file method is designed for development and testing. It is not suitable for production deployment.
- Listeners for the Vault Agent can define a role set to metrics_only so that a service can be configured to listen on a particular port to collect metrics.
- Vault Agent can read configurations from multiple files.
- Users can specify the log file path using the -log-file command flag or the VAULT_LOG_FILE environment variable. This is particularly useful when Vault Agent is running as a Windows service.
OpenAPI-based Go & .NET Client Libraries (Public Beta): Use the new Go & .NET client libraries to interact with the Vault API from your applications.
Known issues
When Vault is configured without a TLS certificate on the TCP listener, the Vault UI may throw an error that blocks you from performing operational tasks.
The error message: Q.randomUUID is not a function
Note: Refer to this Knowledge Base article for more details and a workaround.
The fix for this UI issue is coming in the Vault 1.13.1 release.
Token creation with a new entity alias could silently fail
A regression caused token creation requests under specific circumstances to be forwarded incorrectly from perf standbys (Enterprise only) to the active node. The requests would appear to succeed, but no lease was created. The token would then be revoked on first use, causing a 403 error.
This only happened when all of the following conditions were met:
- the token is being created against a role
- the request specifies an entity alias which has never been used before with the same role (for example for a brand new role or a unique alias)
- the request happens to be made to a perf standby rather than the active node
Retrying token creation after the affected token is rejected would work since the entity alias has already been created.
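The retry behaviour described above can be sketched as follows. This is a minimal illustration rather than Vault client code: `create_token`, `use_token`, and the `Forbidden` exception are hypothetical stand-ins that the caller supplies for "request a token against the role" and "make the first request with that token", and a 403 response is assumed to surface as an exception.

```python
class Forbidden(Exception):
    """Hypothetical stand-in for an HTTP 403 response from Vault."""


def create_token_with_retry(create_token, use_token, attempts=2):
    """Create a token and retry creation if its first use is rejected.

    create_token: hypothetical callable returning a new token string.
    use_token:    hypothetical callable that makes the first request with
                  the token and raises Forbidden on a 403.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        token = create_token()
        try:
            use_token(token)
            return token  # first use succeeded, so the token is usable
        except Forbidden as err:
            # The affected token was revoked on first use; the entity alias
            # now exists, so a fresh creation request is expected to work.
            last_error = err
    raise last_error
```

A single retry should normally be enough, because the conditions listed above no longer hold once the entity alias exists.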
Affected versions
Affects Vault 1.13.0 to 1.13.3. Fixed in 1.13.4.
update-primary can lead to data loss
It's possible to lose data from a Vault cluster given a particular configuration and sequence of steps. This page describes two paths to data loss, both associated with the use of update-primary.
Normally update-primary does not need to be used. However, there are a few cases where it's needed, e.g. when the known primary cluster addresses of a secondary don't contain any of the correct addresses. But update-primary does more than you might think: it does almost everything that enabling a secondary does, except that it doesn't wipe storage. One of the steps that it takes is to temporarily remove most of the mount table records: it removes all mount entries except for those that are managed automatically by Vault, e.g. identity mounts.
This update-primary behaviour is unintended and we'll be reworking it in an upcoming release. Once the fix lands, the changelog entry will be "Fix a race condition with update-primary that could result in data loss after a DR failover."
update-primary with local data in shared mounts
If update-primary is done on a PR secondary with shared mounts containing local data (e.g. PKI certs, AppRole secret IDs), the merkle tree on the PR secondary may get corrupted due to a timing race.
When this happens, the PR secondary still contains all the stored data, e.g. listing local certs from PKI mounts will return the correct results. However, because the merkle tree has been corrupted, a downstream DR secondary will not receive the local data, and will delete it if it already had it. If the PR secondary's DR secondary is promoted before the PR secondary is repaired, the newly promoted PR secondary will not contain the local data it ought to. If the former PR secondary is lost or destroyed, the missing data will not be recoverable other than via a snapshot restore.
Detection and remediation
If the TRACE level log line "cleaning key in merkle tree" appears immediately after an update-primary on a PR secondary, that's an indicator that the timing race was lost and that the merkle tree may be corrupt.
Repairing the corrupt merkle tree is done by issuing a replication reindex request to the PR secondary.
If logs are no longer present (the update-primary was done some time in the past), it's probably best to reindex the PR secondary pre-emptively as a precaution.
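As a rough sketch of the remediation step, the snippet below builds (but does not send) a reindex request for the PR secondary. It assumes the reindex operation is exposed as a POST to the `sys/replication/reindex` path of the HTTP API and that the token is passed in the `X-Vault-Token` header. Verify both against the API documentation for your Vault version. The address and token in the usage note are placeholders.

```python
import urllib.request


def build_reindex_request(vault_addr, token):
    """Build a POST request asking a cluster to reindex its storage.

    vault_addr: base address of the PR secondary (placeholder value below).
    token:      a Vault token permitted to call the endpoint.
    The endpoint path is an assumption to check against the API docs.
    """
    url = vault_addr.rstrip("/") + "/v1/sys/replication/reindex"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"X-Vault-Token": token},
    )
```

For example, `build_reindex_request("https://vault.example.com:8200", "example-token")` returns a request object that could be passed to `urllib.request.urlopen` to send it. Building the request separately makes it easy to inspect the target address before anything is issued against a production cluster.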
update-primary with "Allow" path filters
There is a further path to data loss associated with update-primary. This issue requires that the PR secondary receiving an update-primary request has an associated Allow path filter defined for it. Like the first issue, this one has a timing aspect: the problem may or may not manifest, depending on how quickly the mount tables truncated by update-primary get repaired by replication.
At startup/unseal (and after an update-primary), Vault runs a background job that looks at the mount data it has stored and tries to delete any that doesn't belong there, based on path filters. This behaviour was introduced in 1.0.3.1 to recover from a regression that allowed for inappropriate filtering of data: we needed to ensure that any previously unfiltered data got cleaned up on secondaries that ought not have it.
If a performance secondary has an associated Allow path filter, this cleanup code can misfire during the interval between when the truncated mount tables are written by update-primary and the time when they get rewritten by replication. The cleanup code will delete the data associated with the missing mount entries. The cleanup code doesn't modify the merkle tree, and as a result this deleted data won't be discovered as missing and repaired by replication.
Detection and remediation
When the cleanup code fires, it logs the INFO level message "deleted mistakenly stored mount entry from backend". This is a reliable indicator that the bug was hit.
If logs aren't available, the other way to tell whether this problem has manifested is to query the shared mount in question. The secondary won't have any of the data that the primary does, e.g. roles and configuration will be absent.
Reindexing the performance secondary will update the merkle tree to reflect the missing storage entries and allow missing shared data to be replaced by replication. However, any local data on shared mounts (such as PKI certs) will not be recoverable.
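Both log indicators quoted on this page can be searched for with a few lines of code. The sketch below is one possible approach, not an official tool: it does plain substring matching on the two messages, and it does not check whether the merkle-tree message appears immediately after an update-primary, so any hit still needs to be read in context. The log file path in the usage note is hypothetical.

```python
# Messages quoted in the two "Detection and remediation" sections.
INDICATORS = (
    "cleaning key in merkle tree",
    "deleted mistakenly stored mount entry from backend",
)


def find_indicators(log_lines):
    """Return (line_number, message) pairs for each indicator found.

    log_lines: any iterable of strings, e.g. an open log file.
    Line numbers are 1-based.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for message in INDICATORS:
            if message in line:
                hits.append((number, message))
    return hits
```

With a server log saved to a file, usage might look like `with open("vault.log") as f: hits = find_indicators(f)`. Note that the first message is only emitted at TRACE level, so it will be absent unless the server was logging at that level when update-primary ran.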
Impacted versions
Affects all current versions of Vault.
Feature deprecations and EOL
Please refer to the Deprecation Plans and Notice page for up-to-date information on feature deprecations and plans. A Feature Deprecation FAQ page addresses questions about decisions made about Vault feature deprecations.