Secrets management platforms (SMPs) have become the industry standard for safeguarding sensitive data such as passwords, private keys, and other critical secrets. By centralizing and securing these assets, SMPs help prevent secret sprawl and reduce the exposure of credentials that can often be challenging to track and protect. However, since these platforms hold access keys to private services and systems, they are also attractive targets for malicious actors. For some organizations, this can put important infrastructure at stake, making it more essential than ever to secure these platforms.
To help strengthen our defenses, our security teams focused on anomaly detection in secrets consumption, based on access pattern modeling. We worked on a project aimed at detecting irregularities that may signal unauthorized access, which has improved our defensive measures and contributes to Adobe’s overall security posture.
In this blog, we will provide an overview of this project and share how our findings can help enhance detection capabilities that better protect critical secrets in SMPs.
Defining Solution Requirements
SMPs are primarily intended to keep secrets safe; however, they have few built-in mechanisms to prevent unauthorized access to these secrets if an attacker obtains a valid authentication token. At Adobe, we decided to build our own alerting mechanism that would primarily leverage deep knowledge of our unique infrastructure, including users, assets, and scope.
Given our dynamic and rapidly changing environment, we needed to build a solution that not only could help us achieve our goals but also could operate efficiently. Our work centered around a subset of security-related modeling challenges that fall into the following categories:
- High Volume of Data: The solution must handle a significant volume of logs, which is typical for applications within large intranet infrastructures.
- Resource-Efficient Computation: The solution must be efficient and minimally resource-intensive. Our method uses far fewer computational resources than state-of-the-art deep learning models that rely on billions of parameters.
- Mixed Numerical and Categorical Attributes: The solution must accommodate input data, such as SMP access logs, which contain both categorical and numerical values.
- Skewed Numerical Ranges: Our solution must address the challenges posed by data with non-uniform variance, which could significantly impact the training of neural networks, whether they have a small or large number of parameters.
To address these challenges, we implemented a methodology with three steps:
- Preprocess the Dataset: We enhanced the dataset by adding time-lag features, such as time of day and day of the week.
- Build a Statistical/Neural Model: We developed a predictive model of normal behavior that incorporated time-lag features and other multinomial data.
- Evaluate Actual Behavior Against Predictions: We utilized the prediction model to determine whether actual behavior deviates significantly from the expected outcomes.
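The first and last steps above can be illustrated with a minimal sketch. It assumes each log event is a dictionary with a hypothetical `event_time` field holding epoch seconds, and it uses a simple standard-deviation threshold as a stand-in for the deviation test; the field names, the function names, and the threshold of 3.0 are illustrative assumptions, not the values used in the project.

```python
from datetime import datetime, timezone


def add_time_features(event: dict) -> dict:
    """Return a copy of the event with time-of-day and day-of-week added.

    Assumes `event["event_time"]` is epoch seconds (hypothetical field name).
    """
    ts = datetime.fromtimestamp(event["event_time"], tz=timezone.utc)
    enriched = dict(event)
    enriched["hour_of_day"] = ts.hour        # 0..23, in UTC
    enriched["day_of_week"] = ts.weekday()   # 0 = Monday .. 6 = Sunday
    return enriched


def is_significant_deviation(actual: float, expected: float,
                             expected_std: float,
                             threshold: float = 3.0) -> bool:
    """Flag a value that lies more than `threshold` standard deviations
    away from the model's expectation (threshold is an assumed value)."""
    if expected_std <= 0:
        # No spread observed yet: any difference counts as a deviation.
        return actual != expected
    return abs(actual - expected) / expected_std > threshold
```

In this sketch the expected value and its spread would come from whatever model of normal behavior is in place; the check itself is independent of how that model is built.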
Data Transformation and Aggregation
As a part of our solution, we created models of normal behavior and assessed how much an observed behavior diverged from the normal behavior. In doing so, our approach targeted attacker behavior and modeled read and list operations based on both categorical and numerical attributes.
SMPs typically log each operation by capturing key details such as event time, status, requested secret, and operation type (read, create, update, delete, list). While these details are generally applicable, our setup assumes that a secret is retrieved using a combination of categorical values, typically attributes such as customer, namespace, and path. This information was instrumental in allowing us to perform our analysis and anomaly detection.
For our reference dataset, we used a collection of SMP access logs. Each entry identifies the client by its unique identifier and source IP address, and the object (secret) being accessed by its namespace and path. Each entry also contains the status of the operation and the type of request: read, create, update, delete, or list.
We then converted the resulting dataset into hourly aggregations for each operation type (read, list, create, update, and delete), and computed spikes as the maximum number of events of the same type within a single minute. This provided us with a final dataset where each individual item contains a list of:
- Categorical values: SMP instance, namespace, client, source IP address, and time-lag-specific identifiers
- Numerical values: Average and maximum number of operations performed in the timespan measured for read, list, create, update, or delete requests.
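The aggregation described above could look roughly like the following sketch. The event field names (`smp_instance`, `namespace`, `client`, `source_ip`, `operation`, `event_time`) are hypothetical, and treating the "average" as events per minute over the hour is an assumption made for illustration; the original pipeline may define it differently.

```python
from collections import Counter, defaultdict


def aggregate_hourly(events):
    """Group events per (instance, namespace, client, IP) and hour.

    For every operation type, emit the average events per minute over the
    hour and the spike, i.e. the busiest single minute within that hour.
    `event_time` is assumed to be epoch seconds.
    """
    # (group attributes..., hour, operation) -> Counter of minute -> count
    per_minute = defaultdict(Counter)
    for e in events:
        hour = e["event_time"] // 3600
        minute = e["event_time"] // 60
        key = (e["smp_instance"], e["namespace"], e["client"],
               e["source_ip"], hour, e["operation"])
        per_minute[key][minute] += 1

    rows = {}
    for (*group, hour, op), minutes in per_minute.items():
        row = rows.setdefault((tuple(group), hour), {})
        row[f"{op}_avg"] = sum(minutes.values()) / 60  # per-minute average
        row[f"{op}_max"] = max(minutes.values())       # spike in one minute
    return rows
```

Each resulting row carries the categorical attributes in its key and the per-operation numerical values in its body, which mirrors the shape of the final dataset described above.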
Leveraging a Statistical Tree Structure
Alongside the SMP access logs, we used a statistical tree structure to learn and update a baseline statistical model quickly. This model works in tandem with a neural network using a residual learning approach. Each level of the tree focuses on a different attribute, and the order in which these attributes are processed is determined by our data analysis. However, the tree can also be built automatically using standard machine learning metrics, like data entropy.
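To make the idea more concrete, here is a minimal sketch of one way such a tree could be organized, assuming each node keeps running statistics (updated with Welford's online algorithm) and each level branches on one categorical attribute in a fixed order. The class names, the fallback to the deepest known node, and the plain subtraction used for the residual are assumptions for illustration; the neural network that would learn from the residual is not shown.

```python
import math


class StatNode:
    """Running mean and variance for one node (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0          # sum of squared deviations from the mean
        self.children = {}     # attribute value -> StatNode

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


class StatTree:
    """Each tree level corresponds to one attribute in `attribute_order`."""

    def __init__(self, attribute_order):
        self.attribute_order = list(attribute_order)
        self.root = StatNode()

    def update(self, record: dict, value: float) -> None:
        # Update statistics on every node along the record's path.
        node = self.root
        node.update(value)
        for attr in self.attribute_order:
            node = node.children.setdefault(record[attr], StatNode())
            node.update(value)

    def baseline(self, record: dict):
        # Walk as deep as the tree allows; unseen values fall back to
        # the statistics of the deepest matching ancestor.
        node = self.root
        for attr in self.attribute_order:
            child = node.children.get(record[attr])
            if child is None:
                break
            node = child
        return node.mean, node.std


def residual(tree: StatTree, record: dict, actual: float) -> float:
    """Difference between the observed value and the tree's baseline;
    in a residual learning setup, a second model would predict this."""
    mean, _ = tree.baseline(record)
    return actual - mean
```

Because every update only touches the nodes on one path, the baseline can be refreshed cheaply as new hourly aggregates arrive, and a previously unseen client still gets a baseline from its namespace-level node.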