Detecting and Predicting Malicious Access in Enterprise Networks Using the Louvain Community Detection Algorithm

By Gal Sadeh, Sr. Data Scientist, Silverfort
Many data breaches start with gaining access to an insignificant computer and propagating by jumping from one computer to another until reaching the valuable ‘crown jewels’, like admin credentials, information about an important DB holding customer data and more. Detecting and preventing these attacks is a very complicated task for security professionals since the number of possible attack paths is extremely high and networks change frequently (new entities are added or removed, permissions are changed etc.). Most of the techniques for detecting attacks are based on recognition of known malicious patterns, but for complex attacks this is no longer enough. When the attack doesn’t have a known recognizable pattern it’s extremely hard to detect it.

Our initial attempts with attack detection

Since Silverfort analyzes authentication and access data monitored over the entire enterprise network and cloud environments – it needs to analyze a lot of data. However, it took a few attempts to decide what would be the best way to analyze all this data to detect unknown threats. I gave this question a lot of thought. At first, I thought to map all the possible access paths based on the data and try to detect the vulnerable ones. However, this approach becomes almost impossible to implement when the network contains many entities, since the number of possible access paths grows exponentially with the number of entities. I tried to improve this method by eliminating some of the access paths. For example, if a user doesn’t have permissions in the directory to access a certain server – we can eliminate that path. This approach helped us process a large amount of access paths, but it wasn’t enough, because even without permissions an access path might be possible through exploitation of vulnerabilities.

So, we decided to try a different approach: we divided the network to communities, based on the fact that members of a community communicate with each other, and other shared resources, more than with other entities in the network.

Let me further explain:

What are communities and what are community detection algorithms?

Networks can be modelled as graphs of connections (edges) between nodes, where each node represents an entity (user, computer, etc.) and each edge connecting between the nodes represents a logical connection. For example, if a user accesses a server, the user and server would be the ‘nodes’ and the connection between them would be represented by an ‘edge’.
Each edge is also associated with a weight that represents the strength of the connection, which is based on the number of authentication and access requests we see on this edge.
The way community detection algorithms work is taking a raw graph as input, analyzing it and assigning each node to a community. Each community is characterized by its ‘density’ which is the relationship between the weights of all the edges and the number of nodes in the community. Most community detection algorithms have an ‘objective’ which is a number that represents the overall compatibility of the nodes to the communities they were assigned to. The goal is to maximize that number. A larger objective means ‘High Density’ communities.
Note that different algorithms may have slightly different ‘objectives’.

Assigning nodes to communities using the Louvain algorithm

We selected the Louvain algorithm because it is a well-known algorithm that runs very fast even in large complex networks – its run time is almost linear to the network’s size. The algorithm tries to maximize its objective by repeating these two phases:

  1.  It assigns each node in the graph to a single community. For each edge connecting between node u and node v, the algorithm checks if merging the communities of u and v increases the objective, and if so, it performs the merge.
  2. The algorithm builds a new graph, where each community is represented by a single node, and the weight assigned to the edge between the communities is the sum of the weights of all the edges between the merged nodes of each community.

The algorithm stops when there is no step that increases the objective. The output is an induced graph, where nodes represent communities. For our implementation we used a Python package called “community”. Further reading material can be found here: http://www.emilio.ferrara.name/wp-content/uploads/2011/07/isda2011-k-path.pdf

Community detection and cyber security

Since each community is dense, meaning there are many access paths between the nodes, if an attacker gains access to one of the entities (user account or device) in a community, it would be easier for him or her to gain access to other entities in that community. Obviously, it’s harder for an attacker to expand to other communities since there are less access paths between different communities. If used correctly, community detection can help identify suspicious activity within a network, predict lateral movement paths and enforce smarter policies in real time.

How Silverfort fits in

When Silverfort’s Risk-Based Authentication Platform is deployed in a network, it monitors all the authentication and access activity between the various entities and constructs a graph as previously described. The raw graph looks like this:

Each circle is a user and each triangle is a server. The weights on the edges are determined by the number of authentication and access requests monitored, and permissions associated with the different nodes (for simplicity, weights are not shown on the graph). Then we use the Louvain algorithm to identify communities in the network. After identifying the communities, the map will look like this:


Now, we spot nodes that communicate with entities in communities they aren’t a part of. Those nodes would be valuable targets for attackers, because they are the only points of access between communities. Therefore, an attacker will have to go through them in order to spread to other communities. Securing the edges (connections) of these nodes is important for preventing an attack from spreading in the network.

Using Silverfort’s authentication policies, companies can respond automatically to detected threats by enforcing step-up authentication to validate the user’s identity or blocking access altogether. This is true both for user-to-machine and machine-to-machine communication.
Identifying the communities and the valuable cross-community connections that would be targeted by attackers, helps Silverfort customers in setting smart secure authentication policies. The community graph can also help admins identify misconfigurations in their networks. In the graph below, some of the valuable nodes and connections that should be secured are marked in purple.

How is it leveraged today?

Some of our customers are already using Silverfort’s community analysis (which is part of our AI-based risk engine) to identify communities and cross community connections in their networks, identify misconfigurations and ‘weak spots’ that should be secured, and detect abnormal activity. Now, we are helping customers make the next step – applying real-time secure authentication policies in response to detected threats, and effectively preventing lateral movement in their networks and cloud environments.

Gal Sadeh, Sr. Data Scientist, Silverfort
Gal is a Senior Data Scientist in Silverfort’s research team. He is responsible for big data analytics and developing AI engines. He joined Silverfort after many years of research and leadership roles at the 8200 elite cyber unit of the Israel Defense Forces. Gal holds a B. Sc. in Mathematics and Computer Science from Tel Aviv University.

Stop Identity Threats Now