Introduction
Artificial Intelligence (AI) is a rapidly evolving field, with many subfields and techniques for building intelligent systems. One of the most promising subfields is Machine Reasoning, which uses symbolic representations and logical reasoning to draw conclusions from data. In the field of cybersecurity, machine reasoning can be used to build a virtual attacker that can simulate millions of cyber attacks to determine specific attack scenarios against an organization, and calculate the risk from these attacks. In this article, we will explore what machine reasoning is, how it can be used to build a virtual attacker, and what challenges need to be addressed to make it work.
Machine Learning vs. Machine Reasoning
Before diving into machine reasoning, it is important to distinguish it from machine learning, a term that is often used interchangeably with AI even though it covers only part of the field. Machine learning is a statistical approach: it analyzes large amounts of data to identify hidden patterns and builds models that can classify new data automatically. In contrast, machine reasoning uses symbolic representations and logical reasoning to draw conclusions from facts. It is based on the analysis of concepts and relationships, rather than on statistical patterns.
To build a virtual attacker, we need to use machine reasoning to analyze the semantics of cyber threats and IT systems. Semantic knowledge graphs are used to represent concepts and relationships in a way that is understandable to machines. These graphs enable reasoning systems to understand the meaning of the data and draw conclusions by analyzing the graph of concepts and projecting them onto new data.
Semantic Graphs for Cyber Threats
In the world of cybersecurity, a semantic graph for cyber threats can be produced from the information and concepts found in standard sources, such as the MITRE ATT&CK knowledge base and the CVE records in the National Vulnerability Database (NVD). Attack techniques can be analyzed to define the "requirements" of the attackers, that is, the conditions that must hold for a technique to work. If you combine a semantic graph of cyber threats with a graph describing the features of an organization's IT systems, the reasoning system can deduce which of those requirements the organization satisfies and build a virtual attacker that can explain how, in principle, to attack the organization.
The semantic graph is based on the Resource Description Framework (RDF), which describes a directed graph as a set of triples. Each triple in an RDF graph has three components: a node for the subject, an arc for the predicate linking the subject to the object, and a node for the object. For example, the fact that a user account has a username can be represented as the triple (User Account, has, Username).
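As a rough illustration of the subject, predicate, object structure described above, the following sketch stores a few statements as plain Python tuples and looks them up by subject and predicate. It deliberately avoids any RDF library; the names `Triple`, `graph`, and `objects`, and all of the facts other than the user account example, are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    """One statement: subject --predicate--> object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts; only the first comes from the article's example.
graph: Set[Triple] = {
    Triple("UserAccount", "has", "Username"),
    Triple("UserAccount", "has", "Password"),
    Triple("WebServer01", "runs", "MySQL"),
    Triple("MySQL", "isA", "SQLDatabase"),
}


def objects(graph: Set[Triple], subject: str, predicate: str) -> Set[str]:
    """Return every object linked from `subject` by `predicate`."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}


print(sorted(objects(graph, "UserAccount", "has")))  # ['Password', 'Username']
```

A production system would more likely use a dedicated triple store and a query language, but the underlying data model is the same set of three-part statements.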
Creating a Virtual Attacker
To create a virtual attacker, the reasoning system determines which attack methods are relevant to an organization by checking the prerequisites of each attack technique against what is known about the organization. The more accurate the information about the organization, the more relevant the answer will be, and the clearer the picture of which attack techniques the organization is exposed to.
For example, a basic prerequisite for an SQL injection attack is that the system must include an SQL database. Similarly, a brute-force attack against passwords only succeeds if the organization uses weak passwords that can be cracked. Many MITRE ATT&CK tactics, techniques, and procedures (TTPs) may be irrelevant to a given organization because their prerequisites are not met.
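The prerequisite check can be sketched as a simple subset test: a technique is relevant only if every one of its prerequisites appears among the facts known about the organization. The catalogue below contains just the two examples from the text; the fact names (`has_sql_database`, `has_weak_passwords`) and the function `relevant_techniques` are assumptions made for this sketch, not part of MITRE ATT&CK.

```python
from typing import Dict, List, Set

# Hypothetical catalogue: technique name -> facts that must hold for it to work.
TECHNIQUES: Dict[str, Set[str]] = {
    "SQL injection": {"has_sql_database"},
    "Password brute force": {"has_weak_passwords"},
}


def relevant_techniques(techniques: Dict[str, Set[str]],
                        org_facts: Set[str]) -> List[str]:
    """Keep only techniques whose prerequisites are all present in org_facts."""
    return sorted(name for name, prereqs in techniques.items()
                  if prereqs <= org_facts)


# An organization that runs an SQL database but enforces strong passwords.
org_facts = {"has_sql_database"}
print(relevant_techniques(TECHNIQUES, org_facts))  # ['SQL injection']
```

Adding a fact to `org_facts` can only make more techniques relevant, which is why more complete information about the organization yields a more accurate list.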
Challenges in Building a Virtual Attacker
There are three major challenges in building a virtual attacker. The first is the precise semantic analysis of attack techniques, such as those described in MITRE ATT&CK, which are written for human readers and are not directly usable by reasoning systems. The solution is to restate each technique in terms of a consistent set of basic concepts and to create an appropriate semantic model.
The second challenge is to create a language (an ontology) that connects concepts from different attack domains, such as permissions, vulnerabilities, and configurations, and to build the semantic graph from it. Detailed ontologies that describe the relationships between cyber concepts already exist, such as the Unified Cybersecurity Ontology (UCO) from the University of Maryland, Baltimore County, and MITRE D3FEND.
The third challenge is collecting relevant information from an organization's systems. This can be done by interfacing with existing systems and translating the information into the common language or by a dedicated scanner.
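Translating collected information into the common language might look like the following sketch, which turns records from an asset inventory into subject, predicate, object statements. The inventory format, the field names, and the predicates (`type`, `runs`, `product`, `version`) are all assumptions for illustration; a real integration would map onto the vocabulary of the chosen ontology.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, str, str]

# Hypothetical export from an existing asset-inventory system.
inventory: List[Dict[str, str]] = [
    {"host": "web01", "software": "MySQL", "version": "5.7"},
    {"host": "app02", "software": "nginx", "version": "1.18"},
]


def to_statements(record: Dict[str, str]) -> List[Statement]:
    """Map one inventory record onto graph statements."""
    host = record["host"]
    # Each installation gets its own node so versions are tracked per host.
    install = f"{host}/{record['software']}"
    return [
        (host, "type", "Host"),
        (host, "runs", install),
        (install, "product", record["software"]),
        (install, "version", record["version"]),
    ]


statements = [s for record in inventory for s in to_statements(record)]
for s in statements:
    print(s)
```

Giving each installation its own node is a design choice of this sketch: it keeps the version attached to a specific host rather than to the product in general.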
Once the system has gathered all the information, it can simulate millions of cyber attacks to determine specific attack scenarios against the organization, and calculate the risk from these attacks. The goal is to determine courses of action to mitigate attack scenarios, reduce risk and build cyber resilience.
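One way to picture how attack scenarios are derived is forward chaining: each technique requires certain facts and, once applied, grants the attacker new ones, which may in turn unlock further techniques. The sketch below is a minimal version under that assumption. The rules, the fact names, and the function `simulate` are hypothetical, and risk scoring is left out because the article does not specify how it is calculated.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set, Tuple


class Rule(NamedTuple):
    """A technique: applicable when `requires` holds, then adds `grants`."""
    name: str
    requires: FrozenSet[str]
    grants: FrozenSet[str]


# Hypothetical rules; the prerequisites are illustrative assumptions.
RULES = [
    Rule("Brute-force password",
         frozenset({"weak_passwords", "exposed_login"}),
         frozenset({"user_access"})),
    Rule("SQL injection",
         frozenset({"sql_database", "user_access"}),
         frozenset({"database_read"})),
    Rule("Exploit unpatched kernel",
         frozenset({"user_access", "unpatched_kernel"}),
         frozenset({"admin_access"})),
]


def simulate(initial_facts: Iterable[str],
             rules: List[Rule]) -> Tuple[Set[str], List[str]]:
    """Apply rules until no rule adds a new fact; return facts and the steps taken."""
    facts = set(initial_facts)
    steps: List[str] = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # Fire a rule only if it is applicable and grants something new.
            if rule.requires <= facts and not rule.grants <= facts:
                facts |= rule.grants
                steps.append(rule.name)
                changed = True
    return facts, steps


facts, steps = simulate({"weak_passwords", "exposed_login", "sql_database"}, RULES)
print(steps)  # ['Brute-force password', 'SQL injection']
```

The recorded `steps` list doubles as an explanation of how the attacker reached each capability, and removing a single initial fact (for example `weak_passwords`) shows which mitigation breaks the chain.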
Benefits of Machine Reasoning
Machine reasoning has several benefits over other methods of AI, especially in the world of risk analysis and management. One of the main benefits is explainability: reasoning systems excel at explaining the "thought" process that led to a conclusion, which most machine learning systems cannot do. Semantic graph technologies also make it possible to combine different types, formats, and sources of information into a common language and then to query and reason over the integrated information. This makes it possible to assess an organization's resilience to cyber attacks.
Conclusion
Machine reasoning is a promising subfield of AI that has many applications in the field of cybersecurity. By using semantic knowledge graphs to represent concepts and relationships, we can create a virtual attacker that can simulate millions of cyber attacks and determine specific attack scenarios against an organization. Although there are several challenges to building a virtual attacker, such as precise semantic analysis of attack techniques and creating a language that connects concepts from different attack domains, the benefits of machine reasoning make it a valuable tool in the world of risk analysis and management. As with any development in the cyber world, the concern is that these systems will also be in the hands of the attacker, but the benefits of machine reasoning outweigh the risks, and we must continue to develop and improve these systems to build a more resilient and secure cyber world. Machine reasoning is an example of white box AI where all the data is humanly understandable, and all the reasoning is explainable, which goes a long way to defeating adversarial attacks and building trust in the AI.