Leveraging anomaly detection for secure secrets management

Secrets management platforms (SMPs) have become the industry standard for safeguarding sensitive data such as passwords, private keys, and other critical secrets. By centralizing and securing these assets, SMPs help prevent secret sprawl and reduce the exposure of credentials that can often be challenging to track and protect. However, since these platforms hold access keys to private services and systems, they are also attractive targets for malicious actors. For some organizations, this can put important infrastructure at stake, making it more essential than ever to secure these platforms.

To help strengthen our defenses, our security teams focused on anomaly detection in secrets consumption based on access pattern modeling. We worked on a project aimed at detecting irregularities that signal unauthorized access, which has improved our defensive measures and contributes to Adobe’s overall security posture.

In this blog post, we provide an overview of this project and share how our findings can help enhance detection capabilities that better protect critical secrets in SMPs.

Defining Solution Requirements

SMPs are primarily intended to keep secrets safe; however, they have few built-in mechanisms that can help prevent unauthorized access to these secrets if an attacker gets their hands on a valid authentication token. At Adobe, we decided to build our own alerting mechanism, one that primarily leverages deep knowledge of our unique infrastructure, including users, assets, and scope.

Given our dynamic and rapidly changing environment, we needed to build a solution that not only could help us achieve our goals but also could operate efficiently. Our work centered around a subset of security-related modeling challenges that fall into the following categories:

High Volume of Data: The solution must handle a significant volume of logs, which is typical for applications within large intranet infrastructures.
Resource-Efficient Computation: The approach must be efficient and minimally resource-intensive. Our method uses far fewer computational resources than state-of-the-art deep learning models that rely on billions of parameters.
Mixed Numerical and Categorical Attributes: The solution must accommodate input data, such as SMP access logs, which contain both categorical and numerical values.
Skewed Numerical Ranges: The solution must address the challenges posed by data with non-uniform variance, which can significantly impact the training of neural networks regardless of their parameter count.

To address these challenges, we implemented a methodology that would:

Preprocess the Dataset: We enhanced the dataset by adding time-lag features, such as time of day and day of the week.
Build a Statistical/Neural Model: We developed a predictive model of normal behavior that incorporated time-lag features and other multinomial data.
Evaluate Actual Behavior Against Predictions: We utilized the prediction model to determine whether actual behavior deviates significantly from the expected outcomes.
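
As a rough illustration of the first and third steps above, the following Python sketch derives time-of-day and day-of-week features from an event timestamp and scores how far an observed value falls from a predicted one. The `time` field name, the function names, and the threshold of 3 standard deviations are assumptions made for this example, not details of our production system; the predicted mean and standard deviation stand in for the output of the model of normal behavior.

```python
from datetime import datetime


def add_time_features(event: dict) -> dict:
    """Return a copy of the event with hour-of-day and day-of-week added.

    Assumes the event carries an ISO 8601 timestamp under the key "time"
    (a hypothetical field name used only for this sketch).
    """
    ts = datetime.fromisoformat(event["time"])
    return {**event, "hour_of_day": ts.hour, "day_of_week": ts.weekday()}


def deviation_score(observed: float, predicted_mean: float,
                    predicted_std: float, eps: float = 1e-6) -> float:
    """Distance between observation and prediction, in standard deviations."""
    # eps guards against division by zero when the predicted spread is 0.
    return abs(observed - predicted_mean) / max(predicted_std, eps)


def is_anomalous(observed: float, predicted_mean: float,
                 predicted_std: float, threshold: float = 3.0) -> bool:
    """Flag an observation whose deviation exceeds the (assumed) threshold."""
    return deviation_score(observed, predicted_mean, predicted_std) > threshold


if __name__ == "__main__":
    event = add_time_features({"time": "2024-03-04T14:30:00", "client": "svc-a"})
    print(event["hour_of_day"], event["day_of_week"])
    print(is_anomalous(observed=50, predicted_mean=10, predicted_std=5))
```

In practice the threshold would be tuned against labeled or reviewed alerts rather than fixed in advance.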

Data Transformation and Aggregation

As a part of our solution, we created models of normal behavior and assessed how much an observed behavior diverged from the normal behavior. In doing so, our approach targeted attacker behavior and modeled read and list operations based on both categorical and numerical attributes.

SMPs typically log each operation by capturing key details such as event time, status, requested secret, and operation type (read, create, update, delete, list). While these details are generally applicable, our approach assumes that each secret is retrieved using a combination of categorical values, typically attributes such as customer, namespace, and path. This information was instrumental in allowing us to perform our analysis and anomaly detection.

For our reference dataset, we used a collection of SMP access logs. Each entry identifies the client by its unique identifier and source IP address, and identifies the accessed object (secret) by its namespace and path. Each entry also records the status of the operation and the type of request: read, create, update, delete, or list.

We then converted the resulting dataset into hourly aggregations per operation type (read, list, create, update, and delete), and computed spikes as the maximum number of events of the same type within a single minute. This gave us a final dataset in which each item contains:

Categorical values: SMP instance, namespace, client, source IP address, and time-lag-specific identifiers.
Numerical values: Average and maximum number of operations performed in the measured timespan for read, list, create, update, or delete requests.
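
The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not our pipeline: the event field names (`time`, `client`, `namespace`, `operation`) and the choice to express the average as events per minute over the hour are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime


def aggregate_hourly(events):
    """Aggregate raw access events into one row per (client, namespace,
    operation, hour), with the total count, the average per minute, and
    the spike (largest number of events seen within a single minute).
    """
    # key -> Counter mapping minute-of-hour to number of events
    per_minute = defaultdict(Counter)
    for e in events:
        ts = datetime.fromisoformat(e["time"])
        hour = ts.replace(minute=0, second=0, microsecond=0)
        key = (e["client"], e["namespace"], e["operation"], hour)
        per_minute[key][ts.minute] += 1

    rows = []
    for (client, namespace, operation, hour), minutes in sorted(per_minute.items()):
        total = sum(minutes.values())
        rows.append({
            "client": client,
            "namespace": namespace,
            "operation": operation,
            "hour": hour,
            "total": total,
            "avg_per_minute": total / 60,          # assumed definition of "average"
            "max_per_minute": max(minutes.values()),  # the spike
        })
    return rows


if __name__ == "__main__":
    sample = [
        {"time": "2024-03-04T10:05:01", "client": "svc-a", "namespace": "ns1", "operation": "read"},
        {"time": "2024-03-04T10:05:20", "client": "svc-a", "namespace": "ns1", "operation": "read"},
        {"time": "2024-03-04T10:20:00", "client": "svc-a", "namespace": "ns1", "operation": "list"},
    ]
    for row in aggregate_hourly(sample):
        print(row)
```

Keeping the per-minute maximum alongside the hourly average preserves short bursts that an hourly count alone would smooth away.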

Leveraging a Statistical Tree Structure

Alongside the SMP access logs, we used a statistical tree structure to learn and update a baseline statistical model quickly. This model works in tandem with a neural network using a residual learning approach. Each level of the tree focuses on a different attribute, and the order in which these attributes are processed is determined by our data analysis. However, the tree can also be built automatically using standard machine learning metrics, like data entropy.

Each node and leaf in the tree contains statistics that summarize the data points that pass through it. We used the information generated by this tree to apply a non-uniform data normalization technique, which allowed us to predict the mean and standard deviation for each entry in the dataset.

This approach also enabled us to handle previously unseen attribute values. For instance, if we encounter an IP address that hasn’t been seen before, we can backtrack to the previous level in the tree and model this new entry using the other available attributes. By doing this, we maintain data normalization and consistency based on previously observed data points.
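
To make the idea concrete, here is a minimal Python sketch of such a tree, written under several assumptions: the attribute order is fixed by hand, each node keeps a running mean and population standard deviation (updated with Welford's online algorithm), and the class and method names are hypothetical. It illustrates the backoff behavior described above and is not the model we deployed; the neural network that learns the residual is omitted.

```python
import math


class StatNode:
    """Running count, mean, and variance of the values passing through a node."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0          # sum of squared deviations from the mean
        self.children = {}

    def update(self, x: float) -> None:
        # Welford's online update: no need to store individual values.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        return math.sqrt(self.m2 / self.n) if self.n > 0 else 0.0


class StatTree:
    """One tree level per attribute, in the order given at construction."""

    def __init__(self, attribute_order):
        self.order = list(attribute_order)
        self.root = StatNode()

    def update(self, record: dict, value: float) -> None:
        node = self.root
        node.update(value)
        for attr in self.order:
            node = node.children.setdefault(record[attr], StatNode())
            node.update(value)

    def lookup(self, record: dict):
        """Return (mean, std, depth) from the deepest node matching the record.

        If an attribute value was never seen, stop at the previous level.
        """
        node, depth = self.root, 0
        for attr in self.order:
            child = node.children.get(record.get(attr))
            if child is None:
                break
            node, depth = child, depth + 1
        return node.mean, node.std, depth

    def normalize(self, record: dict, value: float, eps: float = 1e-6) -> float:
        """Scale a value by the statistics of its best-matching node."""
        mean, std, _ = self.lookup(record)
        return (value - mean) / max(std, eps)


if __name__ == "__main__":
    tree = StatTree(["namespace", "client", "source_ip"])
    known = {"namespace": "ns1", "client": "svc-a", "source_ip": "10.0.0.1"}
    for v in (10, 12, 14):
        tree.update(known, v)
    unseen_ip = {**known, "source_ip": "10.9.9.9"}
    print(tree.lookup(unseen_ip))  # falls back to the client-level node
```

Because every level is updated on each insert, a record with an unseen IP address is normalized with the statistics of its client, and a record with an unseen client with those of its namespace, so the normalized inputs stay on a comparable scale.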

Key Outcomes and Benefits of Enhanced Anomaly Detection

Our research indicates that using SMP logs to identify unauthorized access to credentials is effective, but it's essential to correlate these results with other event sources for accurate event classification.

The benefits of this analysis are substantial. With automated anomaly detection, we gain a deeper understanding of access patterns, which allows us to identify anomalies in SMP traffic quickly and receive timely alerts for unusual events. Our custom-designed learning scheme has also helped us achieve a 98 percent reduction in false positives while requiring significantly fewer parameters; this has resulted in faster training times and improved efficiency.

Additionally, by leveraging a hybrid modeling approach that integrates categorical and numerical inputs, we’ve further minimized false positives while optimizing computational resources. This comprehensive approach not only strengthens our defenses but also streamlines the process of identifying and responding to potential security threats.

Future Enhancements

As we look to the future, our research plans will concentrate on several key areas to enhance our understanding and effectiveness in anomaly detection within secrets management platforms. We will aim to compute the consumption order of attributes efficiently to enhance our modeling and analysis of access patterns, which will optimize our algorithms for faster and more accurate anomaly detection. Additionally, we will be investigating various use cases to broaden the applicability of our approach and identify new patterns.

Through our anomaly detection efforts, we continue to better protect sensitive information and prevent secrets from falling into the wrong hands. By refining our models and expanding our research, we aim to stay ahead of evolving security challenges and effectively safeguard our systems against malicious actors. We are excited about the potential impact of our work and look forward to sharing our progress in future updates.