Conclave: Persistence and transactions

Enclaves should be a fully integrated component of an app

Mike Hearn
Oct 23, 2020

In this article we’ll explore some of the design thinking behind the Conclave API.

So you’ve got a tamperproof, encrypted memory space. You’ve got remote attestation and you’ve got private keys, so you can convince other programs you’re running inside that memory space. You’ve got encrypted messages. It’s finally easy to use. That’s pretty swish and it’s enough for some specific kinds of applications, like pure functions over relatively small datasets.

But there are still three other things that are core to nearly any application: messaging, persistence and transactions. Conclave takes a radical approach to them that’s quite unlike other enclave frameworks. Here’s how it works and, just as importantly, why it works.

Conclave Mail

Most enclave frameworks expect you to connect to the enclave using TLS. In this approach the developer is given a stream of bytes, along with pseudo-certificates that must be checked in unusual ways to verify the remote attestation.

Conclave works a bit differently. In Conclave you work with messages, not streams. Your serialised data can be easily encrypted using a remote attestation object, such that only the enclave can decrypt it. You may also set headers on the message that are not encrypted and thus visible to the host. They are, however, tamperproofed so that the enclave can detect if the host modified them. The resulting encrypted message is called a mail, and the sender may include their own public key so the enclave can reply. Behind the scenes this system uses the highly respected Noise protocol framework, and the sender’s key is authenticated by mixing it into the elliptic curve Diffie-Hellman handshake. This means enclaves can use the sender’s public key as a form of identity.
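To make the "visible but tamperproofed header" idea concrete, here is a minimal sketch using AES-GCM from the standard Java library, with the header passed as associated data. This is not the Conclave API and not the Noise handshake: the class name, the method names and the pre-shared symmetric key are all assumptions made purely for illustration.

```java
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Optional;

// Hypothetical sketch, NOT the Conclave API: the body is encrypted, the header
// travels in the clear but is covered by the authentication tag.
class AuthenticatedHeaderSketch {
    static final int TAG_BITS = 128;

    // Encrypts the body and binds the plaintext header to it as associated data.
    static byte[] seal(SecretKey key, byte[] nonce, byte[] header, byte[] body) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce));
            cipher.updateAAD(header);
            return cipher.doFinal(body);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the body, or empty if the header or ciphertext was modified.
    static Optional<byte[]> open(SecretKey key, byte[] nonce, byte[] header, byte[] sealed) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce));
            cipher.updateAAD(header);
            return Optional.of(cipher.doFinal(sealed));
        } catch (AEADBadTagException e) {
            return Optional.empty(); // authentication failed: something was tampered with
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] keyBytes = new byte[16];
        byte[] nonce = new byte[12]; // must never repeat for the same key
        random.nextBytes(keyBytes);
        random.nextBytes(nonce);
        SecretKey key = new SecretKeySpec(keyBytes, "AES");

        byte[] header = "topic=orders".getBytes(StandardCharsets.UTF_8);
        byte[] forged = "topic=admin".getBytes(StandardCharsets.UTF_8);
        byte[] sealed = seal(key, nonce, header, "buy 10".getBytes(StandardCharsets.UTF_8));

        System.out.println("original header accepted: " + open(key, nonce, header, sealed).isPresent());
        System.out.println("modified header accepted: " + open(key, nonce, forged, sealed).isPresent());
    }
}
```

The host in this sketch can read the header to route the message, but if it changes even one byte the tag no longer verifies and the receiver rejects the whole thing.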

There are numerous advantages to this approach.

It means the enclave is not a server. It therefore doesn’t have to include all the complexity servers require, such as connection handling, flow control, recovery from reset connections, a full TLS stack, buffering, asynchronous selection on file descriptors and so on. Every bit of code you can keep out of an enclave is a win because (a) someone will eventually want to audit that code so they can convince themselves it does what you claim it does, (b) all shipping SGX hardware has a limited amount of fast EPC RAM, and (c) it lets you keep the enclave/host interface small, simple and therefore more likely to be secure. All the enclave needs to handle mail is the ability to exchange byte arrays with the host and an implementation of the Noise encryption framework, which is very small, modern and simple. The implementation can be read and understood in its entirety in just a few hours.

Another advantage is that the framework can pad messages for you to block size-based side channel attacks (beta 3 doesn’t do this yet, but the feature is easy to add: we’ve just had other priorities so far).
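As a rough illustration of what such padding could look like, the sketch below rounds every plaintext up to a multiple of a fixed bucket size before it would be encrypted, so an observer only learns which bucket a message falls into. The class name, the 4-byte length prefix and the bucket size are assumptions for the example, not a description of what Conclave does or will do.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, NOT the Conclave implementation: length-hiding padding
// applied to the plaintext before encryption.
class PaddingSketch {
    // Layout: [4-byte big-endian length][plaintext][zero bytes up to a bucket boundary].
    static byte[] pad(byte[] plaintext, int bucketSize) {
        if (bucketSize <= 0) {
            throw new IllegalArgumentException("bucketSize must be positive");
        }
        int unpadded = Integer.BYTES + plaintext.length;
        int padded = ((unpadded + bucketSize - 1) / bucketSize) * bucketSize;
        // allocate() zero-fills the buffer, so the tail is already padding.
        return ByteBuffer.allocate(padded).putInt(plaintext.length).put(plaintext).array();
    }

    // Reads the length prefix and returns only the original plaintext.
    static byte[] unpad(byte[] padded) {
        ByteBuffer buffer = ByteBuffer.wrap(padded);
        int length = buffer.getInt();
        if (length < 0 || length > buffer.remaining()) {
            throw new IllegalArgumentException("corrupt length prefix");
        }
        byte[] plaintext = new byte[length];
        buffer.get(plaintext);
        return plaintext;
    }

    public static void main(String[] args) {
        // Both messages land in the same 64-byte bucket despite differing lengths.
        System.out.println(pad(new byte[3], 64).length);
        System.out.println(pad(new byte[40], 64).length);
    }
}
```

The choice of bucket size is a trade-off: bigger buckets leak less about the message length but waste more bandwidth and disk.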

Yet another advantage is making it easier to use client clocks to block rewind attacks the host might try to mount on the enclave. That’s a topic for another day, though.

The advantage we’ll explore in depth today is that Mail doesn’t just solve communication between client and server: it also solves persistence and transactions.

Persistence

Mail is encrypted asymmetrically: any program that has obtained a serialised EnclaveInstanceInfo object can create an encrypted mail to that target enclave, without immediately doing any network traffic or communicating with the enclave. There’s no need to establish a session. The enclave can likewise decrypt the mail without communicating with the client. The app developer is responsible for moving the resulting encrypted mails around.

This has some fairly trivial consequences. For example, enclaves can be loaded and unloaded on demand in the ‘function as a service’ style, which helps make the most efficient use of valuable EPC RAM. Because Conclave enclaves use the GraalVM native image tool, they start very fast and don’t have a warmup period, so they are well suited to fast loading and unloading.

But there’s also a more important consequence, which is that we now have a simple way for an enclave to persist its internal state: it can mail itself. The host recognises when this is happening and stores the mail to disk instead of sending it over the network. On restart the mail is re-delivered to the enclave, which can use the contents to reinitialise its heap.

The mail-to-self pattern grants us some interesting benefits. Many problems are shared between communication and persistence, such as the desire to pad data structures to sizes that don’t leak information about what’s in them. We can implement this feature once and both use cases get it.

Another important feature is traversing revocations. An enclave that’s found to have a security hole can be revoked, as can versions of the SGX software and microcode produced by Intel. When this happens keys are rotated in such a way that new enclaves can decrypt mails sent to old enclaves, but not vice-versa. The upgrade can be done in a backwards compatible way but users can be assured that once they detect the upgrade via remote attestation, the system has been re-secured: the host cannot take newly encrypted data, downgrade the enclave and then exploit it to access the new data. In the upcoming Beta 4 release we’ve integrated the infrastructure for this with Mail, so upgrades are seamless and client/enclave communication isn’t interrupted across the enclave/host restart (the client will start using the new key once it re-downloads the remote attestation, which it can be easily made to do every so often).

Avoiding state persistence

But can you completely avoid the need for saving your in-memory state to disk/a database?

Let’s consider the base case of a multi-party computation with a small dataset. It receives contributions from a variety of users throughout the day, and produces a combined calculation at the end of the day. We could implement it like this:

  • The enclave runs all the time.
  • Clients submit data when the user gets back from lunch and types in today’s values.
  • The host receives the encrypted data and immediately gives it to the enclave.
  • The enclave updates the state of its calculation and sends the current intermediate values to itself, thus persisting it to disk.
  • At the end of the day the host asks the enclave to run the calculation. It produces the result and mails each client with the answer. The host relays the answer to the users, potentially with retry if the user is currently offline.

We could do that, but it’s actually over-complex. Because mail is fully async you don’t need to be constantly writing state to disk as users perform actions, like you would in a normal web app. Instead consider a simpler design:

  • Clients submit data when the user gets back from lunch.
  • The host receives the encrypted data and writes the mail to disk.
  • The host waits until the right time of day (or until all the users submitted their data).
  • The host loads the enclave and sends it all the mail at once.
  • The enclave produces the encrypted final results and hands them back to the host for delivery.
  • The host unloads the enclave and starts delivering the results.

The enclave doesn’t need any explicit persistence in this design. If the host restarts during the day then it doesn’t matter because the enclave wasn’t loaded. If the host dies for some reason in the short window of time when it’s delivering the queued mail to the enclave, that doesn’t matter either: the enclave is a pure function so the host can just restart and retry. All the data comes from the users anyway so why rewrite it from one format to another?
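The simpler design can be sketched in a few lines. In the toy below the host only queues opaque byte arrays and the "enclave" is a plain function over the whole batch. All names are hypothetical, nothing is actually encrypted, and the in-memory list stands in for whatever durable storage a real host would use.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical sketch, NOT the Conclave API: the host stores mail as it arrives
// and runs the enclave once, as a pure function over everything it collected.
class BatchHostSketch {
    // Stand-in for durable storage of mails the host cannot read.
    private final List<byte[]> queuedMail = new ArrayList<>();

    // Called whenever a client submits data; the enclave is not loaded here.
    void onMailFromClient(byte[] mail) {
        queuedMail.add(mail.clone());
    }

    // "Loads the enclave", delivers all mail at once and returns its result.
    // The queue is left untouched, so a crash here can simply be retried.
    long runEndOfDay(ToLongFunction<List<byte[]>> enclave) {
        List<byte[]> snapshot = Collections.unmodifiableList(new ArrayList<>(queuedMail));
        return enclave.applyAsLong(snapshot);
    }

    // Toy enclave logic: each mail is a decimal number and the result is the sum.
    // A real enclave would decrypt each mail first.
    static long sumContributions(List<byte[]> mails) {
        long total = 0;
        for (byte[] mail : mails) {
            total += Long.parseLong(new String(mail, StandardCharsets.UTF_8));
        }
        return total;
    }

    public static void main(String[] args) {
        BatchHostSketch host = new BatchHostSketch();
        host.onMailFromClient("3".getBytes(StandardCharsets.UTF_8));
        host.onMailFromClient("4".getBytes(StandardCharsets.UTF_8));
        System.out.println(host.runEndOfDay(BatchHostSketch::sumContributions));
    }
}
```

Because the function has no state of its own, running it twice over the same queue gives the same answer, which is what makes "just restart and retry" a valid recovery strategy.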

Transactions

A significant amount of complexity inside a typical web app is related to providing ACID transactions. Put simply, a user action must occur or not occur as a unit. It can’t be allowed to only half-complete. This is usually implemented with an RDBMS, SQL and ORMs, but of course that’s not the only way.

Although that could be done for an enclave too, again, we’re trying to keep the amount of code minimal. The more code in the enclave the more you have to trust the enclave’s developer blindly because it becomes too hard to properly audit.

Mail is transactional. The enclave can acknowledge mails it has received. This indicates to the host that the enclave is done with the mail and it can be erased. Simultaneously the enclave can request delivery of one or more mails. The host is responsible for providing transactional semantics (if it doesn’t, the app will be unreliable, but that doesn’t help the host break into the enclave).

Consider an enclave that implements a simple order book. Orders come in from users at arbitrary times, and the enclave updates its in-memory data structures for the book. When two orders cross they are deleted from the order book and the users are informed that their orders matched, so they can proceed to settle the trade.

The enclave can implement this without needing the complexity of interacting with a database or running a web server. Each order is a mail. The enclave receives them but doesn’t acknowledge them to the host right away. When two orders cross, the enclave atomically requests that the host acknowledge the mails for the two orders (i.e. delete them) and deliver mails to the two clients indicating that their orders crossed. If the host restarts it can replay all the un-acknowledged mails up to that point to rebuild the in-memory order book. It’s up to the host to implement the transactional semantics. For example, it could use a database engine, a transactional K/V store or any MQ broker that supports transactions. The enclave doesn’t care about this implementation detail, and nor do the auditors.
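Here is a toy version of that flow. Every class and method name is made up for the example and none of it is the Conclave API: orders are plain objects rather than encrypted mails, the matching rule is deliberately naive, and the host's "transaction" is just two in-memory updates where a real host would use a database or a transactional broker.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, NOT the Conclave API: acknowledge-and-deliver as one unit.
class OrderBookSketch {
    static final class Order {
        final String user; final boolean buy; final int price;
        Order(String user, boolean buy, int price) { this.user = user; this.buy = buy; this.price = price; }
    }

    // What the enclave asks the host to apply atomically.
    static final class Transaction {
        final List<Long> acknowledge = new ArrayList<>();
        final List<String> deliver = new ArrayList<>();
    }

    // The "enclave": keeps the book in memory only and never touches storage.
    static final class Enclave {
        private final Map<Long, Order> book = new LinkedHashMap<>();

        Transaction receive(long mailId, Order incoming) {
            Transaction tx = new Transaction();
            Long matchId = null;
            for (Map.Entry<Long, Order> entry : book.entrySet()) {
                Order resting = entry.getValue();
                if (resting.buy != incoming.buy && crosses(resting, incoming)) { matchId = entry.getKey(); break; }
            }
            if (matchId == null) {          // no cross: keep the order, acknowledge nothing
                book.put(mailId, incoming);
                return tx;
            }
            Order resting = book.remove(matchId);
            tx.acknowledge.add(matchId);
            tx.acknowledge.add(mailId);
            tx.deliver.add(resting.user + ": your order matched");
            tx.deliver.add(incoming.user + ": your order matched");
            return tx;
        }

        private static boolean crosses(Order a, Order b) {
            Order buy = a.buy ? a : b;
            Order sell = a.buy ? b : a;
            return buy.price >= sell.price;
        }
    }

    // The "host": owns storage and is responsible for atomicity.
    static final class Host {
        private final Map<Long, Order> unacknowledged = new LinkedHashMap<>();
        final List<String> outbox = new ArrayList<>();
        private Enclave enclave = new Enclave();
        private long nextMailId = 0;

        void submit(Order order) {
            long id = nextMailId++;
            unacknowledged.put(id, order);
            commit(enclave.receive(id, order));
        }

        // A real host would perform both steps inside one storage transaction.
        private void commit(Transaction tx) {
            unacknowledged.keySet().removeAll(tx.acknowledge);
            outbox.addAll(tx.deliver);
        }

        // Simulates a restart: a fresh enclave rebuilds its book from replayed mail.
        void restart() {
            enclave = new Enclave();
            for (Map.Entry<Long, Order> entry : new ArrayList<>(unacknowledged.entrySet())) {
                commit(enclave.receive(entry.getKey(), entry.getValue()));
            }
        }

        int pendingCount() { return unacknowledged.size(); }
    }

    public static void main(String[] args) {
        Host host = new Host();
        host.submit(new Order("alice", true, 100));
        host.restart();                              // the book survives via replay
        host.submit(new Order("carol", false, 99));  // crosses with alice's bid
        System.out.println(host.outbox);
    }
}
```

Note that the enclave never writes anything itself: its only durable "state" is the set of mails it has not yet acknowledged.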

What if the host doesn’t delete a mail when it was really requested? Then old orders may cross, and this will be detected when the parties try to settle up. If the enclave is just providing a private encrypted “dark pool” then this kind of attack doesn’t matter, as there’s no incentive for the host to break its own service like that. If the enclave is trying to stop the host from doing this for some reason then more work is required to fully sequence all the submitted orders such that the host can’t reorder or replay them. Mail contains features that make this possible, again, without needing trusted storage (which SGX hardware can’t provide by itself: an enclave can use any kind of storage, but it’s always the host that provides it).
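The article doesn't spell out the mechanism here, so the following is only one common way to do such sequencing, not a description of Mail's internals. It assumes each mail carries a per-sender sequence number in a part of the message the host cannot forge, and that the enclave accepts only the next number in line. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT the Conclave API: rejecting replayed or reordered
// mail by tracking the next expected sequence number for each sender.
class SequenceCheckSketch {
    // Keyed by an identifier for the sender, e.g. an encoding of their public key.
    private final Map<String, Long> nextExpected = new HashMap<>();

    // Returns true and advances the counter only for the exact next number.
    boolean accept(String senderKey, long sequenceNumber) {
        long expected = nextExpected.getOrDefault(senderKey, 0L);
        if (sequenceNumber != expected) {
            return false; // replayed, skipped or reordered by the host
        }
        nextExpected.put(senderKey, expected + 1);
        return true;
    }

    public static void main(String[] args) {
        SequenceCheckSketch check = new SequenceCheckSketch();
        System.out.println(check.accept("alice", 0));
        System.out.println(check.accept("alice", 0)); // same mail delivered twice
    }
}
```

The counters in this sketch live only in enclave memory, so it covers replay and reordering within a single run and leaves the question of what happens across restarts open.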

Conclusions

To build a service which doesn’t require your users to trust you requires careful thinking. You want to move as much ‘plumbing’ out of the enclave as possible, leaving only the core of the business logic your users really care about. This gives you maximum freedom to alter the operational aspects of your enclave service without triggering any change in the remote attestation that users would have to double check.

Unlike most other enclave frameworks which try to put entire servers into the protected memory space, Conclave provides high level abstractions to let you achieve tiny yet robust enclaves. When you consider the total cost of ownership of an enclave-oriented system you’ll see that minimising the work of third party auditors and/or your users will be essential. Mail is a keystone in our strategy to let you achieve that.

Mike Hearn

My work blog: Lead Platform Engineer at R3 (my personal blog is at blog.plan99.net)