NDR and network probe testing using M&NTIS

July 15, 2025

Introduction

In an increasingly complex threat landscape, evaluating the effectiveness of network security tools is essential. M&NTIS Platform provides a controlled lab environment to simulate realistic attack scenarios and test how well a probe can detect and interpret various network protocols.

What sets M&NTIS apart is its integration with the MITRE ATT&CK framework, a widely adopted knowledge base of adversary techniques. This alignment allows for consistent tagging of attacks and easier correlation with probe alerts.

In this article, we explore how M&NTIS can be used to perform protocol decoding tests, replay traffic, and benchmark a probe’s detection capabilities through MITRE-classified attacks. We use the Suricata network probe for our experiments.

Evaluation methodology

In this methodology, every test is performed live. However, M&NTIS allows the user to save the network traffic as a capture file (pcap) for later use, which is useful for replay or for offline probe analysis.

Live stimuli

The M&NTIS topology used to test probes has several machines running different network services (web server, FTP server, email server, tunneling, and so on). We stimulate those services directly from a client machine, either manually from the GUI or via pre-crafted tests. These tests allow the client machine to send emails, perform HTTP API requests, or exfiltrate files via DNS tunneling, for example. They aim to trigger specific functionalities of each protocol in order to assess whether a probe is able to understand them. The stimuli are grouped by protocol, and each protocol stimulus contains several protocol-specific tests. A time delay is inserted between tests so that the probe logs can be analyzed more easily by checking the timestamps.
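The delay between tests can be pictured with a small scheduler that records a time window for each test. This is only an illustrative sketch: `run_stimuli`, its parameters and the test names are hypothetical and are not part of the M&NTIS API.

```python
import time
from datetime import datetime, timezone


def run_stimuli(tests, delay_s=30.0, sleep=time.sleep):
    """Run each (name, callable) pair in order and record its time window.

    Hypothetical helper, not an M&NTIS function. `delay_s` is the pause
    inserted between two consecutive tests so that their log entries do
    not overlap in time.
    """
    windows = []
    for index, (name, stimulus) in enumerate(tests):
        start = datetime.now(timezone.utc)
        stimulus()  # e.g. send an email or issue an HTTP request
        end = datetime.now(timezone.utc)
        windows.append({"test": name, "start": start, "end": end})
        if index < len(tests) - 1:
            sleep(delay_s)  # no pause is needed after the last test
    return windows
```

The recorded windows can later be compared with the timestamps of the probe alerts to decide which test an alert belongs to.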

Protocol decoding

We want to benchmark the ability of the tool to decode different application sessions. To achieve this, different protocols must be implemented with some options enabled. In our probe benchmark tool, the following application protocols are enabled: HTTP and HTTPS, SNMP, SMTP and IMAP, SSH, DNS, FTP and ICMP. For every test, there is a pre-captured pcap file which can be replayed from the point of view of the client machine (called Attacker in the figure). Each protocol has its own pcap file, captured from the corresponding protocol test. To date, 16 pcap files are available for replay.

Taking SMTP as an example, we have implemented tests for different options. The encoding used to send an email is varied among the allowed ones: quoted-printable, base64, binary, 8bit and 7bit. In addition, TLS mode is enabled in one test. Other tests add Cc and Bcc recipients to assess whether the probe is able to read the recipients. Another test attaches a blacklisted file, which should trigger an alert from the probe, and another performs an ETRN request. In the same way, every application protocol implemented has different options or modes enabled to assess the performance of a probe.
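To give an idea of what such SMTP test messages could look like, here is a minimal sketch based on Python's standard `email` package. It is not the M&NTIS implementation: `build_test_messages` and the addresses are hypothetical. The binary encoding is left out because `EmailMessage.set_content()` does not accept it for text bodies.

```python
from email.message import EmailMessage

# Transfer encodings that the standard library can apply to a text body.
TEXT_ENCODINGS = ["7bit", "8bit", "quoted-printable", "base64"]


def build_test_messages(sender, recipient, cc=None, bcc=None):
    """Return one message per transfer encoding (hypothetical helper)."""
    messages = []
    for cte in TEXT_ENCODINGS:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        if cc:
            msg["Cc"] = cc
        if bcc:
            msg["Bcc"] = bcc
        msg["Subject"] = f"encoding test: {cte}"
        # The body is plain ASCII so that every encoding is valid for it.
        msg.set_content("probe decoding test\n", cte=cte)
        messages.append(msg)
    return messages
```

Each message could then be sent with `smtplib.SMTP.send_message()`, which delivers to the Bcc recipients without transmitting the Bcc header. Calling `starttls()` on the connection first would cover the TLS case.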

Pcap replay

M&NTIS allows users to replay saved pcap files during a lab. Such pcap files can be saved from a previous lab. This feature ensures test reproducibility. It also allows datasets to be created, for example for non-regression testing.
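Outside of the platform, a saved capture can also be fed to a probe with common command-line tools. The sketch below only builds the command lines. It assumes that `tcpreplay` and `suricata` are installed, which this article does not state, and the helper names, paths and interface name are hypothetical.

```python
def tcpreplay_command(pcap_path, interface):
    """Command line replaying a capture onto a network interface."""
    return ["tcpreplay", "-i", interface, pcap_path]


def suricata_offline_command(pcap_path, log_dir):
    """Command line making Suricata read a capture instead of live traffic."""
    # -r reads packets from a pcap file, -l sets the log output directory.
    return ["suricata", "-r", pcap_path, "-l", log_dir]
```

Either list can be passed to `subprocess.run()`. The first form exercises the probe as if the traffic were live, while the second corresponds to offline analysis of the same capture.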

MITRE ATT&CK’s signature base

MITRE ATT&CK is a widely accessible knowledge base that catalogs adversary tactics and techniques observed in real-world scenarios. It serves as a foundational resource for creating specific threat models and methodologies across the private sector, government, and the cybersecurity product and service industry. As a result, more and more cybersecurity companies use the MITRE ATT&CK classification to tag their threat signatures. M&NTIS uses this same classification for its attack catalog, labs and datasets.

Network Intrusion Detection Systems (NIDS) focus on detecting attacks on the network. Therefore, we need to restrict MITRE ATT&CK usage to techniques that have an impact on the network, such as:

T1048: Exfiltration Over Alternative Protocol
T1021: Remote Services
T1210: Exploitation of Remote Services
T1077: Windows Admin Shares
T1018: Remote System Discovery
T1041: Exfiltration Over Command and Control Channel
T1071: Application Layer Protocol (tunneling)

We should also consider covering all tactics that produce network effects, such as:

C&C communications (e.g., PoshC2, Metasploit, etc.). The objective for the sensor is to ensure that signatures detect typical C&C traffic.
Lateral movement (e.g., WinRM, WMI, PsExec, etc.). The objective is to ensure that signatures detect typical lateral movement techniques.
Credential theft over the network (e.g., Responder, Mimikatz, etc.). Here we test that signatures detect typical network-based credential theft.
Data exfiltration (e.g., DNS tunneling). The objective is to evaluate the probe’s ability to measure traffic peaks during a specific time period, or to detect patterns characteristic of tunneling.
Network discovery. Here we test that signatures manage to detect the offensive techniques used to enumerate information in an IT system.

Several techniques are implemented in M&NTIS labs and attacks, and they are classified according to MITRE ATT&CK. To benchmark a probe and test the MITRE ATT&CK coverage of its signature base, one or more M&NTIS attacks can be run iteratively.
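Once a set of attacks has been run, the coverage of the signature base can be summarized per technique. The following sketch is an assumption about how such a summary could be computed. `technique_coverage` is a hypothetical helper, and the technique IDs are expected to be supplied by the user from the attacks that were run and from the alerts raised by the probe.

```python
def technique_coverage(played, detected):
    """Compare the techniques played in a lab with those a probe reported.

    Both arguments are iterables of MITRE ATT&CK technique IDs.
    Techniques reported by the probe but never played are ignored here.
    """
    played_set = set(played)
    covered = set(detected) & played_set
    missed = played_set - covered
    ratio = len(covered) / len(played_set) if played_set else 0.0
    return {"covered": sorted(covered), "missed": sorted(missed), "ratio": ratio}
```

For instance, playing T1048, T1021, T1210 and T1018 while the probe only tags T1048 and T1018 gives a ratio of 0.5, with T1021 and T1210 listed as missed.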

M&NTIS lets a user filter attack techniques by several criteria. One of these filters describes the side effects produced by running an attack technique. As a result, it is possible to perform probe testing directly according to the network traces left by attack techniques.

Network probe integration in M&NTIS Labs

Within M&NTIS Platform, there are two ways to test a network probe. The first is the virtualized mode: the network probe is installed in a virtual machine (also called a virtual appliance), which can be used directly in M&NTIS Platform as part of a topology. The second is the physical mode (only available on on-premise versions of M&NTIS Platform): the network probe runs on a physical appliance, which is connected to the running lab through a dedicated network interface on the host system running M&NTIS Platform.

In both modes, network traffic can be mirrored from the running lab to the network probe. It is also possible to select the topology nodes and network interfaces on which traffic mirroring takes place.

The following picture illustrates a topology that uses Suricata as a network probe in virtualized mode. The Suricata virtual appliance can easily be replaced by another solution, as the topology is generic enough. The Attacker machine is the one from which every test is performed. Most of those tests target WebServerUbuntu because it runs most of the software services to test. The network probe is connected to its own switch, SwitchMonitoring. The Client1 machine is on the same LAN as Attacker. Other machines allow further tests or provide a connection to the internet. The dnsserver machine is the DNS host that holds all the domains used in the topology; it also provides internet access for installing the software needed for the tests.

[Figure: M&NTIS topology using Suricata as a virtualized network probe]

Probe result analysis

Alert retrieval

It is possible to connect to a running machine in order to inspect its logs, which is useful for retrieving alerts from a probe under test. In the example topology, the target of evaluation is Suricata. An auditor can fetch the eve.log file and extract the log entries whose timestamps correlate with the time at which the attack was triggered. In addition, it is possible to wait before triggering the next test. This helps during live analysis if one wants to reset the alerts triggered by the probe.
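The extraction step can be sketched as follows. We assume here that the fetched file uses Suricata's EVE format, with one JSON object per line and `timestamp` and `event_type` fields. `alerts_in_window` is a hypothetical helper, and the time window is expected to come from the moment the test was triggered.

```python
import json
from datetime import datetime

# Timestamp layout used by EVE records, e.g. 2024-03-29T16:28:47.893439+0000
EVE_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def alerts_in_window(eve_lines, start, end):
    """Yield EVE alert events whose timestamp lies in [start, end].

    `start` and `end` must be timezone-aware datetime objects.
    """
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("event_type") != "alert":
            continue  # ignore flow, dns, http... records
        stamp = datetime.strptime(event["timestamp"], EVE_TIME_FORMAT)
        if start <= stamp <= end:
            yield event
```

Because the function accepts any iterable of lines, it can be applied directly to an open file object.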

Automated defense and attack correlation

As explained, alerts can be retrieved live in a lab. Thanks to the MITRE ATT&CK compatibility of M&NTIS, attacks and defense alerts can be correlated more easily. If the probe under test is compatible with the MITRE classification, it is possible to build its confusion matrix (false positives, false negatives, and so on). This feature leads to an automated benchmark of probe performance.
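A minimal sketch of such a correlation is given below. It assumes that each attack carries a technique ID and a time window, and that each alert carries a technique ID and a timestamp. `confusion_counts` and the dictionary keys are hypothetical and do not describe the M&NTIS internals.

```python
def confusion_counts(attacks, alerts):
    """Count true positives, false negatives and false positives.

    An alert matches an attack when both carry the same MITRE ATT&CK
    technique ID and the alert time falls inside the attack window.
    """
    matched_alerts = set()
    true_pos = false_neg = 0
    for attack in attacks:
        hits = [
            index for index, alert in enumerate(alerts)
            if alert["technique"] == attack["technique"]
            and attack["start"] <= alert["time"] <= attack["end"]
        ]
        if hits:
            true_pos += 1
            matched_alerts.update(hits)
        else:
            false_neg += 1
    # Alerts that match no attack are counted as false positives.
    false_pos = len(alerts) - len(matched_alerts)
    return {"TP": true_pos, "FN": false_neg, "FP": false_pos}
```

True negatives are not counted, since this sketch has no notion of benign traffic samples. Times can be any mutually comparable values, such as datetime objects.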

Suricata example

We test Suricata’s ability to decode the FTP protocol. A file containing a specific value is transferred from a client machine to the FTP server, and the topology’s packets are mirrored to Suricata, so M&NTIS is able to benchmark probes in real time. For this example, a directory named e9713ae04a02a810d6f33dd956f42794 is created using the MKDIR FTP command, and the following rule assesses whether Suricata is able to detect this pattern in the FTP communication.

alert ftp 192.168.101.2 any -> any any (msg:"POC Sonde: FTP : catched the flag"; content:"e9713ae04a02a810d6f33dd956f42794"; nocase; sid: 1000002; rev: 1;)

Suricata triggers the following alert:

03/29/2024-16:28:47.893439  [**] [1:1000002:1] POC Sonde: FTP : catched the flag [**] [Classification: (null)] [Priority: 3] {TCP} 192.168.101.2:45756 -> 192.168.103.3:21

As a result, Suricata is able to decode at least this kind of command in an FTP transfer.
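For automated checks, an alert line such as the one above can be split into fields. The regular expression below was derived from this single sample line, so it is an assumption that it fits every alert the probe may write. `parse_alert_line` is a hypothetical helper.

```python
import re
from datetime import datetime

ALERT_LINE = re.compile(
    r"^(?P<time>\S+)\s+\[\*\*\]\s+"
    r"\[(?P<gid>\d+):(?P<sid>\d+):(?P<rev>\d+)\]\s+"
    r"(?P<msg>.*?)\s+\[\*\*\]\s+"
    r"\[Classification:\s*(?P<classification>.*?)\]\s+"
    r"\[Priority:\s*(?P<priority>\d+)\]\s+"
    r"\{(?P<proto>[^}]+)\}\s+"
    r"(?P<src>\S+)\s+->\s+(?P<dst>\S+)\s*$"
)


def parse_alert_line(line):
    """Return the fields of one alert line, or None if it does not match."""
    match = ALERT_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the textual timestamp and the numeric fields.
    fields["time"] = datetime.strptime(fields["time"], "%m/%d/%Y-%H:%M:%S.%f")
    fields["sid"] = int(fields["sid"])
    fields["priority"] = int(fields["priority"])
    return fields
```

Checking that the parsed `sid` equals 1000002 and that the source address matches the client machine is then enough to confirm that the expected rule fired for this test.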

Limitations

The MITRE ATT&CK classification may be ambiguous from the point of view of a defense or attack product. As a result, deeper inspection of alerts might be required to assess whether an alert actually corresponds to the live attack.

Additionally, temporal correlation can be used to verify whether an alert was triggered within the time range during which a stimulus was executed. However, evaluating whether the classification is correct often remains manual, especially in cases where MITRE ATT&CK tagging cannot be fully trusted or is absent.

Conclusion

M&NTIS Platform is well suited to benchmarking security products. It offers a framework to, among other things, perform network attacks, challenge defense capabilities, and measure the coverage of a defense ruleset as well as the relevance of alert classification.

The proposed evaluation methodology can be automated within M&NTIS Platform, reducing the complexity and the time required to conduct a comprehensive probe benchmark.