Auditing data access with no leaks

The ability to provide a complete data access audit is vital for critical systems that handle sensitive data. CEA-List contributed to the development of a new formal framework that guarantees that every read access operation—even if interrupted—is detected while protecting the access logs from prying eyes.

In distributed systems, several remote computers cooperate to perform tasks. One of the major security requirements for systems that handle sensitive data (healthcare, finance, legal…) is the ability to keep reliable data access logs. This presents two main challenges: The first is to guarantee that every access operation is detected, ensuring that no users go unaudited; and the second is to do this in a way that protects the access logs from prying eyes.

In the field of distributed systems, the first formal (i.e. mathematically rigorous) definitions of auditability are fairly recent. And they are based on the strong assumption that a read access operation is only audited if it is completed. This is a loophole that bad actors could use to access data and then simulate a malfunction that prevents completion, thus avoiding detection. Plus, if the access logs are available to anyone, bad actors could use this information to figure out who accessed which data—creating problematic data leaks.

In a departure from the existing definition of auditability for distributed systems, CEA-List introduced two new formal definitions used as a conceptual framework for this research. The first definition characterizes what constitutes an effective read access operation as the instant a reader might learn the value being accessed, regardless of whether the read access operation is completed or not. The second definition establishes the concept of an “uncompromised operation”, which guarantees that it cannot be known if an operation was completed on the data or not.

From an algorithmic standpoint, access to a piece of data and the recording of the access operation in the audit log are combined in a single, indivisible operation, ensuring that a read access operation is logged, even if it is not completed. To prevent readers from learning about each other’s accesses, each reader is associated with a bit (0 or 1) initialized secretly and at random, creating a simple but unbreakable encryption to protect the audit log. When a- read operation is performed, the reader’s bit is inversed (0 becomes 1, or vice versa). The values observed look like a random sequence of 0s and 1s, making it impossible to determine which readers have accessed the data. Only auditors who have the initial secret key can know who has- read which data. This is done by comparing the key with the final log.

Formal proof of the algorithm guarantees reliable operation of the system in all circumstances. Despite multiple access operations, the data remains consistent (when a piece of data is read, the last value assigned to it is clearly visible). In addition, bad actors cannot interfere with the system, all read operations are effectively logged, and the privacy of the access operations is protected, even if curious readers look at the audit logs.

Auditing data access with no leaks

Alexandre Rapetti

Learn more

Flagship publication

CEA-List, the smart digital systems specialists

▼ Naviguer dans le portail ▼