
Conversation

@bigbrett (Contributor) commented Jan 22, 2026

TL;DR: Makes wolfHSM server safe to use in multithreaded scenarios.

Overview

This pull request implements a generic framework for thread-safe access to shared server resources in wolfHSM, specifically targeting the NVM (non-volatile memory) and global key cache subsystems as the first shared data to be protected. Crypto is left to a subsequent PR but is the likely next candidate.

Note that a server context itself still cannot be shared across threads without proper serialization by the caller. This PR just adds the mechanisms such that, when multiple server contexts share an NVM instance or global keystore, access to those shared resources is properly serialized, allowing requests from multiple clients to be processed concurrently in separate threads.

Changes:

  • Introduces lock abstraction layer (wh_lock.{c,h}) with callback-based design for platform independence
  • Example POSIX lock implementation using pthread_mutex
  • Refactor NVM and keystore layer internals to use lock abstraction such that global keystore and NVM can be shared by multiple server contexts in a thread-safe manner
  • Thread safe functionality enabled with the WOLFHSM_CFG_THREADSAFE build option. When this option is NOT defined, all lock abstraction operations compile to no-ops, with zero overhead
  • Adds a "thread safe stress test" to the test suite that attempts to flush out data races via a large number of contention cases; meant to be run under ThreadSanitizer
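To make the callback-based design concrete, here is a minimal sketch of what such a lock abstraction layer can look like. The callback typedef shapes match the `whLockAcquireCb`-style typedefs quoted later in this review, but the struct layout, error-code values, and function body are illustrative assumptions, not the actual wolfHSM definitions:

```c
#include <stddef.h>

/* Illustrative error-code values, not the real wolfHSM ones */
#define WH_ERROR_OK       0
#define WH_ERROR_BADARGS (-2)

/* Platform-independent callback types: each takes an opaque context */
typedef int (*whLockAcquireCb)(void* context);
typedef int (*whLockReleaseCb)(void* context);

typedef struct {
    whLockAcquireCb acquire;
    whLockReleaseCb release;
} whLockCb;

typedef struct {
    const whLockCb* cb;      /* NULL: locking disabled, ops are no-ops */
    void*           context; /* opaque pointer passed to every callback */
} whLock;

/* Acquire dispatches to the platform callback; with no callback table
 * configured it degenerates to a successful no-op (single-threaded mode) */
static int wh_Lock_Acquire(whLock* lock)
{
    if (lock == NULL) {
        return WH_ERROR_BADARGS;
    }
    if (lock->cb == NULL || lock->cb->acquire == NULL) {
        return WH_ERROR_OK; /* disabled; real code compiles this away */
    }
    return lock->cb->acquire(lock->context);
}
```

In the real implementation the disabled path is decided at compile time via `WOLFHSM_CFG_THREADSAFE` rather than a runtime NULL check, giving the zero-overhead property described above.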

Gaps/future work:

  • Serializing access to global crypto state, specifically hardware crypto for ports. A bit of a tricky problem, since offload is provided at the port level and there isn't a good way for wolfHSM to know which algos will be accelerated and which won't. A naive implementation might consider simply locking the server crypto context, but this contains a mixture of local (CMAC) and quasi-global (RNG) elements and no abstraction for hardware. Locks also need to be synchronized with the wolfCrypt port mutex. We should refactor the server crypto context and perhaps split it into local and global structures, with the global part supporting hardware state. Future work.

@billphipps (Contributor) left a comment

Truly excellent! You solved this just the way I had hoped for!
My requested changes are very limited and not really functional. More just fleshing out the exact requirements for a real implementation and a few minor typos and renaming opportunities.

The stress testing framework is outstanding!

@@ -0,0 +1,149 @@
/*
* Copyright (C) 2024 wolfSSL Inc.

Suggested change
* Copyright (C) 2024 wolfSSL Inc.
* Copyright (C) 2026 wolfSSL Inc.

Happy new year!

typedef int (*whLockCleanupCb)(void* context);

/** Acquire exclusive lock (blocking) */
typedef int (*whLockAcquireCb)(void* context);

Perhaps a comment on why a non-blocking version of Acquire is not useful or not required?

* All callbacks receive a user-provided context pointer (from whLockConfig).
* Each lock instance protects exactly one resource.
*
* Return: WH_ERROR_OK on success, negative error code on failure

Add that no state is modified on error, as if the call had never happened.

Recommend that Initialize also acquires the lock, to stop initialization sequencing problems
Recommend that Cleanup is non-blocking and will either acquire the mutex and destroy it, or mark it to be destroyed by the owner when it next releases it (consider an atomic set of a flag in the state). This should stop race conditions at cleanup.

Recommend returning:
WH_ERROR_BADARGS: Invalid context pointer (follows EINVAL). Note NULL context means disable and all functions return WH_ERROR_OK.
WH_ERROR_BADARGS: Attempting to Acquire or Release a context that was not Initialized (follows EINVAL)
WH_ERROR_LOCKED: (optional) If non-owner attempts to release an acquired mutex OR if owner attempts to acquire the same mutex it already owns
WH_ERROR_OK: Releasing a lock that has not been acquired but is initialized
WH_ERROR_OK: Cleaning up a lock that has not been initialized.
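The recommended error semantics above can be sketched with a simple state field. This is a single-threaded illustration of the state machine only (the non-owner-release case isn't modeled, and no real mutex is involved); the state names and error-code values are assumptions for illustration:

```c
#include <stddef.h>

#define WH_ERROR_OK       0
#define WH_ERROR_BADARGS (-2)   /* illustrative values */
#define WH_ERROR_LOCKED  (-3)

enum { WH_LOCK_UNINIT = 0, WH_LOCK_IDLE, WH_LOCK_HELD };

typedef struct { int state; } whLock;

static int wh_Lock_Init(whLock* l)
{
    if (l == NULL) return WH_ERROR_BADARGS;
    l->state = WH_LOCK_IDLE;
    return WH_ERROR_OK;
}

static int wh_Lock_Acquire(whLock* l)
{
    /* NULL or not-initialized: BADARGS (follows EINVAL) */
    if (l == NULL || l->state == WH_LOCK_UNINIT) return WH_ERROR_BADARGS;
    /* owner re-acquiring its own lock: LOCKED (optional behavior) */
    if (l->state == WH_LOCK_HELD) return WH_ERROR_LOCKED;
    l->state = WH_LOCK_HELD;
    return WH_ERROR_OK;
}

static int wh_Lock_Release(whLock* l)
{
    if (l == NULL || l->state == WH_LOCK_UNINIT) return WH_ERROR_BADARGS;
    /* releasing an initialized-but-unheld lock is still OK */
    l->state = WH_LOCK_IDLE;
    return WH_ERROR_OK;
}

static int wh_Lock_Cleanup(whLock* l)
{
    if (l == NULL) return WH_ERROR_BADARGS;
    /* cleanup of an uninitialized lock is OK; zeroing the state here
     * means post-cleanup Acquire/Release fail with BADARGS */
    l->state = WH_LOCK_UNINIT;
    return WH_ERROR_OK;
}
```

Note that zeroing the state in Cleanup also gives the property requested later in this review: a zeroed (uninitialized) and a cleaned-up lock behave identically.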

*
* This function initializes the lock by calling the init callback.
* If config is NULL or config->cb is NULL, locking is disabled
* (single-threaded mode) and all lock operations become no-ops.

All functions return WH_ERROR_OK when locking is disabled.

Init should return WH_ERROR_BADARGS when lock is NULL.

*
* @param[in] lock Pointer to the lock structure.
* @return int Returns WH_ERROR_OK on success, or a negative error code on
* failure (e.g., WH_ERROR_LOCKED if acquisition failed).

Doesn't make sense to return ERROR_LOCKED if the acquisition failed. If the callback returned a status indicating it couldn't acquire the lock due to an error, then ABORTED would be a better call. LOCKED makes me think someone else has the lock, BUT this function is supposed to block until that is no longer true.

#include "wolfhsm/wh_lock.h"
#include "wolfhsm/wh_error.h"

#ifdef WOLFHSM_CFG_THREADSAFE

Is this the best name? Consider the more mundane WOLFHSM_CFG_LOCKS. Threadsafe may imply more than just locks, like cancelability.

memset(&lock, 0, sizeof(lock));

WH_TEST_PRINT("Testing lock lifecycle...\n");


Consider checking that acquire/release/cleanup fail properly (BADARGS?) when on an uninitialized (but zeroed) wh_lock. Note that this should NOT attempt to call any callbacks.

/* Cleanup should succeed */
rc = wh_Lock_Cleanup(&lock);
WH_TEST_ASSERT_RETURN(rc == WH_ERROR_OK);


Verify that Acquire/Release/Cleanup fail properly on a wh_lock after cleanup. This may be the same test as the one recommended above IF the cleanup zeroes the structure. Hint hint.

/* Configure lock with error-checking mutex for better debugging */
memset(&ctx->nvmLockCtx, 0, sizeof(ctx->nvmLockCtx));
pthread_mutexattr_init(&ctx->mutexAttr);
pthread_mutexattr_settype(&ctx->mutexAttr, PTHREAD_MUTEX_ERRORCHECK);

Note that this attribute can set errno values that other types cannot and may not block/fail in the same way a "default" or "normal" pthread mutex will. These values should be trapped in the port because it indicates an undefined behavior WOULD have occurred.


Consider adding posix to the name of this file, since it heavily uses POSIX to provide any real functionality.

@rizlik (Contributor) left a comment

I didn't look into tests yet.
Great work.
Is this lock enough to properly synchronize client requests?
Example, _HandleNvmRead:

    rc = wh_Nvm_GetMetadata(server->nvm, id, &meta);
    if (rc != WH_ERROR_OK) {
        return rc;
    }

    if (offset >= meta.len)
        return WH_ERROR_BADARGS;

    /* Clamp length to object size */
    if ((offset + len) > meta.len) {
        len = meta.len - offset;
    }

    rc = wh_Nvm_ReadChecked(server->nvm, id, offset, len, out_data);
    if (rc != WH_ERROR_OK)

metadata can be changed between GetMetadata and ReadChecked.
Also, when handling key request:

            /* get a new id if one wasn't provided */
            if (WH_KEYID_ISERASED(meta->id)) {
                ret     = wh_Server_KeystoreGetUniqueId(server, &meta->id);
                resp.rc = ret;
            }
            /* write the key */
            if (ret == WH_ERROR_OK) {
                ret     = wh_Server_KeystoreCacheKeyChecked(server, meta, in);
                resp.rc = ret;
            }

the id might not be unique anymore by the time wh_Server_KeystoreCacheKeyChecked is called.

Would coarser-grained locking at the request level simplify the design?
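For reference, the TOCTOU race above disappears if the lock spans the whole read request. This toy sketch (a stand-in model, not the wolfHSM API; all names are illustrative) shows the metadata check and the data read performed under one critical section, so a concurrent writer cannot shrink the object in between:

```c
#include <pthread.h>
#include <string.h>
#include <stddef.h>

/* Toy "NVM object" protected by a single request-level lock */
typedef struct {
    pthread_mutex_t lock;
    size_t          len;      /* object metadata: current length */
    unsigned char   data[64]; /* object contents */
} toyNvm;

/* Reads up to len bytes starting at offset. The lock is held across
 * BOTH the metadata check and the copy, closing the window where the
 * length could change between the two steps. */
static int toy_nvm_read(toyNvm* nvm, size_t offset, size_t len,
                        unsigned char* out, size_t* out_len)
{
    int rc = 0;

    pthread_mutex_lock(&nvm->lock);
    if (offset >= nvm->len) {
        rc = -1; /* equivalent of WH_ERROR_BADARGS */
    }
    else {
        if ((offset + len) > nvm->len) {
            len = nvm->len - offset; /* clamp under the same lock */
        }
        memcpy(out, nvm->data + offset, len);
        *out_len = len;
    }
    pthread_mutex_unlock(&nvm->lock);
    return rc;
}
```

The trade-off is the one the question implies: request-level locking is simpler to reason about but serializes entire requests, while the per-call locking in the PR allows more concurrency at the cost of composite operations needing their own outer lock.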

return ret;
}

/* Use unlocked variants since we already hold the lock */

probably this is assuming too much on the calling context

return WH_ERROR_BADARGS;
}

ret = _LockKeystore(server);

should we skip locking on local cache?
