Threat Hunting: Detection based on Prevalence
10 Feb 2023 — Borja Merino
One of the phases we emphasize most after finishing our Purple Team exercises is the review, creation, and improvement of queries that allow us to efficiently identify those techniques executed by the Red Team that did not generate any alert in the corresponding EDR. This cycle allows us to significantly enrich the detection capabilities of our customers. The result of this type of exercise is usually a set of automated queries (sometimes tools) that cover the TTPs not identified by the EDR’s own detection logic, extending protection to a greater number of threats.
Generally, queries based on LFO (least frequency of occurrence) and prevalence are the resources we use most with some clients to detect anomalies that “escape” the known baseline. The objective of this post is to explain the second criterion, prevalence, because it is of great value when identifying malicious code executed by Threat Actors and Red Teams.
Through the FileProfile() function, MDATP allows you to enrich the information related to a file returned by a query. This function retrieves, for example, the file size, whether the file is executable, the state of its signature, etc. One of the most attractive fields for identifying anomalies is GlobalPrevalence, which returns the number of instances observed by Microsoft globally (other fields, such as GlobalFirstSeen and GlobalLastSeen, are also really valuable for hunting). Microsoft’s own documentation includes several usage examples:
DeviceFileEvents
| where ActionType == "FileCreated" and Timestamp > ago(1d)
| project CreatedOn = Timestamp, FileName, FolderPath, SHA1
| invoke FileProfile("SHA1", 500)
| where GlobalPrevalence < 15
Note: The FileProfile() function has several limitations, as indicated in its documentation: “Enrichment functions will show supplemental information only when they are available. Availability of information is varied and depends on a lot of factors.”
Although the existence of a file with low prevalence does not necessarily mean that it is malicious (temporary files, software updates, etc.), if used surgically to locate suspicious binaries, it can be of great value in detecting advanced threats.
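As an illustration of what a more "surgical" use could look like, the following sketch narrows the documentation example to newly created executables and DLLs that are both rare and not validly signed. It is not a query from the exercise: the file extensions, the thresholds (15 instances, 30 days) and the 1000-row limit are assumptions chosen for the example and would need tuning per environment.

```kusto
// Sketch only: extensions, thresholds and time windows are illustrative assumptions.
DeviceFileEvents
| where Timestamp > ago(1d) and ActionType == "FileCreated"
| where FileName endswith ".exe" or FileName endswith ".dll"
| where isnotempty(SHA1)
// Collapse to one row per hash so the enrichment limit is not spent on duplicates
| summarize Devices = dcount(DeviceId),
            FirstCreated = min(Timestamp),
            FileName = take_any(FileName),
            FolderPath = take_any(FolderPath)
    by SHA1
// FileProfile() enriches at most 1000 rows per call
| invoke FileProfile("SHA1", 1000)
// Rare worldwide, recently first seen, and without a valid signature
| where GlobalPrevalence < 15
| where GlobalFirstSeen > ago(30d)
| where SignatureState != "SignedValid"
| project FirstCreated, FileName, FolderPath, SHA1, Devices,
          GlobalPrevalence, GlobalFirstSeen, SignatureState
| sort by GlobalPrevalence asc
```

Summarizing by SHA1 before calling FileProfile() is a design choice: the function only enriches a limited number of rows, so deduplicating first makes better use of that budget.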
Let’s see a practical example. To this day, side-loading techniques popularized by the operators of Korplug/Sogu (or the CIA itself), and even the development of plugins for trusted signed applications, remain quite effective at achieving code execution and circumventing some EDR solutions. In a recent Purple Team exercise that tried to replicate these types of actions, a plugin was developed for Notepad++ to load and execute a Cobalt Strike (CS) beacon. After running some junk code to delay its execution for a few seconds, the loader (NppRust.dll) reads the corresponding RC4-encrypted stager from a supposed configuration file (plugin.cfg) to retrieve and execute a CS beacon within the Notepad++ address space.
The GlobalPrevalence field provided by MDATP is of great value against this type of attack: it allows us to build detection rules that periodically search process creation events (DeviceProcessEvents) for suspicious DLLs matching certain criteria. Commonly, many of these queries include various exceptions and conditions to reduce false positives and to adjust the search to the client’s ecosystem.
A possible search criterion for side-loading or suspicious plugins is to identify unsigned DLLs with a low prevalence that are loaded into signed processes. The following image shows such a query, sorted in ascending order by GlobalPrevalence, run against the telemetry generated by the implant described previously.
Note that a DLL (NppRust.dll) with a prevalence of 1 has been loaded into the process “Notepad++.exe”, whose GlobalPrevalence is 62817. Compared with the prevalence of the rest of the loaded plugins, the malicious DLL clearly stands out as an anomaly.
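The query itself is not reproduced in the text, so the following is only a sketch of the criterion described (unsigned, low-prevalence DLLs loaded into validly signed processes), not the query used in the exercise. It assumes that DLL loads are taken from the DeviceImageLoadEvents table, and the lookback window and prevalence threshold are hypothetical values.

```kusto
// Sketch only: table choice, lookback and threshold are assumptions.
let lookback = 1d;
let maxDllPrevalence = 50;
// Unsigned DLLs that Microsoft has rarely seen worldwide
let rareDlls =
    DeviceImageLoadEvents
    | where Timestamp > ago(lookback)
    | where FileName endswith ".dll" and isnotempty(SHA1)
    | distinct SHA1
    | invoke FileProfile("SHA1", 1000)   // enriches at most 1000 hashes
    | where SignatureState == "Unsigned" and GlobalPrevalence <= maxDllPrevalence
    | project SHA1, DllPrevalence = GlobalPrevalence;
// Host processes whose binary carries a valid signature
let signedHosts =
    DeviceImageLoadEvents
    | where Timestamp > ago(lookback)
    | where isnotempty(InitiatingProcessSHA1)
    | distinct InitiatingProcessSHA1
    | invoke FileProfile("InitiatingProcessSHA1", 1000)
    | where SignatureState == "SignedValid"
    | project InitiatingProcessSHA1, ProcessPrevalence = GlobalPrevalence;
// Keep only loads of a rare unsigned DLL into a signed process
DeviceImageLoadEvents
| where Timestamp > ago(lookback)
| join kind=inner rareDlls on SHA1
| join kind=inner signedHosts on InitiatingProcessSHA1
| summarize Devices = dcount(DeviceId), LastSeen = max(Timestamp)
    by FileName, FolderPath, SHA1, DllPrevalence,
       InitiatingProcessFileName, ProcessPrevalence
| sort by DllPrevalence asc
```

Because FileProfile() enriches a limited number of rows, a production version would likely pre-filter the hashes (by folder path, by local rarity, or with client-specific exclusions) before the enrichment step.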
The possibilities this field offers when building queries are limitless; for example, we can search for processes with a low prevalence executed from certain paths, processes with a low prevalence that generate certain ActionType values (“ProcessPrimaryTokenModified”, “QueueUserApcRemoteApiCall”, “CreateRemoteThreadApiCall”, etc.), processes with a low prevalence that resolve domains of Slack, Telegram, etc. Following the previous example, if we add as a criterion signed processes that load unsigned, low-prevalence DLLs and that have also established a successful connection, we can further narrow the search to malicious DLLs that take advantage of legitimate processes.
In our scenario, this query returns the C2 to which the implant connects, data that allows us to identify new implants that take advantage of side-loading (in the example, a legitimate HP binary, discfcsn.exe).
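A minimal sketch of this combined criterion could look as follows. It is an approximation written for illustration, not the query used in the exercise: it assumes DeviceImageLoadEvents and DeviceNetworkEvents as telemetry sources, correlates both through the identity of the loading process, and uses hypothetical values for the lookback window and the prevalence threshold.

```kusto
// Sketch only: tables, correlation keys and thresholds are assumptions.
let lookback = 1d;
let rareUnsignedDlls =
    DeviceImageLoadEvents
    | where Timestamp > ago(lookback)
    | where FileName endswith ".dll" and isnotempty(SHA1)
    | distinct SHA1
    | invoke FileProfile("SHA1", 1000)
    | where SignatureState == "Unsigned" and GlobalPrevalence <= 50
    | project SHA1, DllPrevalence = GlobalPrevalence;
let signedHosts =
    DeviceImageLoadEvents
    | where Timestamp > ago(lookback)
    | where isnotempty(InitiatingProcessSHA1)
    | distinct InitiatingProcessSHA1
    | invoke FileProfile("InitiatingProcessSHA1", 1000)
    | where SignatureState == "SignedValid"
    | project InitiatingProcessSHA1;
// Process instances (signed binary) that loaded a rare unsigned DLL
let suspiciousLoads =
    DeviceImageLoadEvents
    | where Timestamp > ago(lookback)
    | join kind=inner rareUnsignedDlls on SHA1
    | join kind=inner signedHosts on InitiatingProcessSHA1
    | project DeviceId, InitiatingProcessId, InitiatingProcessCreationTime,
              DllName = FileName, DllPath = FolderPath, DllPrevalence
    | distinct *;
// Successful public connections made by those same process instances
DeviceNetworkEvents
| where Timestamp > ago(lookback)
| where ActionType == "ConnectionSuccess" and RemoteIPType == "Public"
| join kind=inner suspiciousLoads
    on DeviceId, InitiatingProcessId, InitiatingProcessCreationTime
| summarize Connections = count(), LastSeen = max(Timestamp)
    by DeviceName, InitiatingProcessFileName, DllName, DllPath,
       DllPrevalence, RemoteIP, RemotePort, RemoteUrl
| sort by DllPrevalence asc
```

Joining on DeviceId, InitiatingProcessId and InitiatingProcessCreationTime ties each connection to the exact process instance that loaded the DLL, rather than to any process sharing the same file name.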
Although the examples use MDATP, any EDR or SIEM solution that collects this type of information in its telemetry adds great value to hunting while narrowing the execution paths available to Threat Actors and Red Teams (even more so when techniques based on “Command and Scripting Interpreter” are also closely monitored). Hunters, for their part, should focus on refining prevalence-based queries precisely enough to reduce the number of FPs as much as possible.