Endpoint intrusion detection is hard (well, it’s also hard on the network, but we’re talking hosts in this post). To get really actionable results, we need to minimize the noise, but we also have to be careful not to be heavy-handed with the ignore lists, because overly broad exclusions might cost us some juicy findings. This is not a trivial task: most environments have heterogeneous systems running diverse workloads, which means there’s no one-size-fits-all ruleset that you can just load up and call it a day. The same challenges (although on a smaller scale) are present in my homelab as well. I might have mentioned before that the home network must be Fort Knox, so here I go developing endpoint detection rules which will run on all my servers.
During my quest for the ultimate solution, I found that there’s relatively little in-depth information on Linux detection use cases, especially when it comes to open source software. Of course, there’s trusty old auditd, but it sometimes suffers from performance problems, and I wouldn’t wish parsing its multiline log format upon my worst enemy. Auditbeat eases this pain a little with JSON logging, but it still relies on the quite dated audit subsystem.
So let’s ditch auditd. What do we have instead? Nowadays eBPF is all the rage: it’s an in-kernel virtual machine that makes developing kernel stuff safer and easier. Going into the inner workings of eBPF is outside the scope of this post, but let’s just say you can trace syscalls with it more efficiently than with auditd. Surprisingly, only a few tools make use of this awesome piece of tech. One of them is Sysmon for Linux, Microsoft’s port of the well-known Windows tool Sysmon. Let’s find out what it can do.
For the following examples, I’m going to assume a working Sysmon setup on a modern Linux kernel that supports eBPF (I use Debian 12 for testing). All configuration snippets that follow go inside the EventFiltering element, which sits under the top-level Sysmon element in config.xml. I’m using sysmonLogView to format the events, which are originally XML.
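For orientation, the overall shape of config.xml can be sketched as follows. This is a minimal skeleton, not a complete configuration: the schemaversion value is an assumption and should be replaced with whatever schema version your installed Sysmon build reports, and the comment marks where the RuleGroup snippets from this post would go.

```xml
<!-- Minimal skeleton; the schemaversion value is assumed, adjust it to your build -->
<Sysmon schemaversion="4.81">
  <EventFiltering>
    <!-- RuleGroup elements from the sections below go here -->
  </EventFiltering>
</Sysmon>
```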
Initial config
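By default, Sysmon logs all process terminations. While knowing whether a command was successful or not can be beneficial in some incident response scenarios, it also produces large volumes of logs. We can disable them for now with an include filter that contains no conditions; since nothing can match it, no termination events are logged: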
<RuleGroup name="disable_event_classes" groupRelation="or">
  <ProcessTerminate onmatch="include" />
</RuleGroup>
Suspicious processes
Let’s start with catching some script kiddies: most likely they’re going to run privilege escalation scripts once they’ve got a foothold on the machine. Some of the most common ones are LinEnum, linPEAS and enumy. To detect them we use the following Sysmon config:
<RuleGroup name="sus_exec_privesc" groupRelation="or">
  <ProcessCreate onmatch="include">
    <CommandLine condition="contains">linpeas</CommandLine>
    <CommandLine condition="contains">linenum</CommandLine>
    <CommandLine condition="contains">enumy</CommandLine>
  </ProcessCreate>
</RuleGroup>
With these rules in place, we get the following log entry when executing bash LinEnum.sh:
Event SYSMONEVENT_CREATE_PROCESS
RuleName: sus_exec_privesc
UtcTime: 2024-03-01 19:09:32.736
ProcessGuid: {1ba1a76e-27ec-65e2-9d9b-95ccab550000}
ProcessId: 7994
Image: /usr/bin/bash
CommandLine: bash LinEnum.sh
CurrentDirectory: /home/n00b
User: n00b
LogonGuid: {1ba1a76e-0000-0000-e803-000000000000}
LogonId: 1000
TerminalSessionId: 8
IntegrityLevel: no level
Hashes: SHA256=7ebfc53f17925af4340d4218aafd16ba39b5afa8b6ac1f7adc3dd92952a2a237
ParentProcessGuid: {1ba1a76e-27e2-65e2-9d9b-7cd559550000}
ParentProcessId: 7992
ParentImage: /usr/bin/bash
ParentCommandLine: bash
ParentUser: n00b
Cool, we got every detail needed for a quality alert: who did what and when. We also have valuable details about the parent process, which will help us later with more advanced detections. This rule will yield few (if any) false positives, and the condition list can easily be expanded with many more malicious tool names.
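If the list grows, one way to keep it compact is Sysmon’s “contains any” condition, which takes a semicolon-separated list of values. The following is only a sketch, assuming the Linux build handles this condition the same way the Windows one does. The two extra names (lse.sh and linux-exploit-suggester) are illustrative additions and are not part of the tested rule above.

```xml
<RuleGroup name="sus_exec_privesc" groupRelation="or">
  <ProcessCreate onmatch="include">
    <!-- One condition instead of several; values are separated by semicolons. -->
    <!-- lse.sh and linux-exploit-suggester are illustrative, untested additions. -->
    <CommandLine condition="contains any">linpeas;linenum;enumy;lse.sh;linux-exploit-suggester</CommandLine>
  </ProcessCreate>
</RuleGroup>
```

Every name added this way widens the match surface, so it is worth checking each one against the normal command lines seen on your hosts before rolling it out.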
Web server RCE
Next, let’s try to detect shells spawned by web application runtimes like PHP. That shouldn’t be happening, right? Well, sometimes it does, and usually it ain’t pretty. For this demo let’s ask our junior developer, Mr. ChatGPT, to write us a nice little PHP app which pings an endpoint and returns the output. Our guy delivered the project like this:
prompt: write a very short php app which displays a textbox asking for an ip address, then pings the ip and displays the output
<!DOCTYPE html>
<html>
<head>
    <title>IP Ping Tool</title>
</head>
<body>
    <h2>IP Ping Tool</h2>
    <form method="post">
        <label for="ip">Enter IP Address:</label>
        <input type="text" id="ip" name="ip" required>
        <button type="submit">Ping</button>
    </form>
    <?php
    if ($_SERVER["REQUEST_METHOD"] == "POST") {
        $ip = $_POST["ip"];
        $output = shell_exec("ping -c 4 $ip");
        echo "<pre>$output</pre>";
    }
    ?>
</body>
</html>
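Looks good enough and does the job.
But there’s a little bug that slipped through. Using shell_exec() is something I’m never really comfortable with, but here the consequences are catastrophic: the user-supplied value is pasted into the command line without any validation, so we can easily inject arbitrary commands. For example, an input like 127.0.0.1; id would make the shell run id right after ping.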
How can this exploit be detected? We know that php-fpm runs as the www-data user and normally has no business spawning shells, so we can craft the following Sysmon rule to alert us on such activity:
<RuleGroup name="www_rce" groupRelation="and">
  <ProcessCreate onmatch="include">
    <User condition="is">www-data</User>
    <Image condition="end with">sh</Image>
  </ProcessCreate>
</RuleGroup>
Unfortunately, with the ping tool example we’d get a false positive alert on every legitimate use as well: PHP runs shell_exec() commands through sh -c, and on Debian sh is /usr/bin/dash, which of course matches our event filter and generates the following event:
Event SYSMONEVENT_CREATE_PROCESS
RuleName: www_rce
UtcTime: 2024-03-03 11:53:48.283
ProcessGuid: {1ba1a76e-64cc-65e4-b96b-524826560000}
ProcessId: 14241
Image: /usr/bin/dash
CommandLine: sh -c ping -c 4 172.16.51.254
CurrentDirectory: /var/www/html
User: www-data
LogonGuid: {1ba1a76e-0000-0000-2100-000000000000}
LogonId: 33
TerminalSessionId: 4294967295
IntegrityLevel: no level
Hashes: SHA256=f5adb8bf0100ed0f8c7782ca5f92814e9229525a4b4e0d401cf3bea09ac960a6
ParentProcessGuid: {00000000-0000-0000-0000-000000000000}
ParentProcessId: 14122
Let’s improve our detection. Instead of relying on the executable name ending with “sh”, we can exclude the known false positives and log everything else that www-data runs:
<RuleGroup name="www_rce" groupRelation="and">
  <ProcessCreate onmatch="include">
    <User condition="is">www-data</User>
    <Image condition="is not">/usr/bin/ping</Image>
    <CommandLine condition="not begin with">sh -c ping -c 4</CommandLine>
  </ProcessCreate>
</RuleGroup>
Thanks to these tweaks, we no longer get junk logs for normal use, but running the exploit triggers the rule, resulting in this event being logged:
Event SYSMONEVENT_CREATE_PROCESS
RuleName: www_rce
UtcTime: 2024-03-03 12:36:47.746
ProcessGuid: {1ba1a76e-6edf-65e4-a91d-9e9499550000}
ProcessId: 14374
Image: /usr/bin/cat
CommandLine: cat /etc/passwd
CurrentDirectory: /var/www/html
User: www-data
LogonGuid: {1ba1a76e-0000-0000-2100-000000000000}
LogonId: 33
TerminalSessionId: 4294967295
IntegrityLevel: no level
Hashes: SHA256=008f819498fe591f3cc920d543709347d8d14a139bb3482bc2cd8635c1b3162e
ParentProcessGuid: {1ba1a76e-6edc-65e4-b99b-1fa5c4550000}
ParentProcessId: 14372
ParentImage: /usr/bin/dash
ParentCommandLine: sh
ParentUser: www-data
Fileless malware
While running malicious executables from memory is not as common on Linux as it is on Windows, it’s still worth having a detection rule to catch fileless processes. For example, PyLoose, a relatively new piece of malware, uses this technique to do bad stuff, mostly to start cryptominers.
First, we’re gonna test out fileless execution to see what is visible to Sysmon. For this, I used memrun, a tool to execute ELF binaries from memory. Let’s write a simple app in C for the test and load it with memrun:
# cat hello.c
#include <stdio.h>
int main() {
    printf("Hello memory!\n");
    return 0;
}
# gcc -o hello hello.c
# ./memrun nofilehere ./hello
Hello memory!
Sysmon logged these two events (shortened for readability):
Event SYSMONEVENT_CREATE_PROCESS
...
Image: /root/detecting-fileless-proc/memrun
CommandLine: ./memrun nofilehere ./hello
...
Event SYSMONEVENT_CREATE_PROCESS
...
Image: /memfd:
CommandLine: nofilehere
...
The first one is pretty generic: it contains no unique strings that a seasoned attacker couldn’t easily change. The other one, however, looks more interesting. What really happens here?
# strace -e memfd_create,write,execve -y ./memrun nofileshere ./hello
execve("./memrun", ["./memrun", "nofileshere", "./hello"], 0x7fff921606b8 /* 21 vars */) = 0
memfd_create("", MFD_CLOEXEC) = 3</memfd:>(deleted)
write(3</memfd:>(deleted), "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\20\0\0\0\0\0\0"..., 15952) = 15952
execve("/proc/self/fd/3", ["nofileshere"], 0xc000012040 /* 0 vars */) = 0
write(1</dev/pts/0>, "Hello memory!\n", 14Hello memory!
) = 14
+++ exited with 0 +++
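Looking at the strace output, we can see that the file descriptor returned by memfd_create() actually resolves to “/memfd:” (memrun passes an empty name, so nothing follows the colon), and that is the path which gets executed via /proc/self/fd/3. Creating a detection rule is now straightforward: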
<RuleGroup name="fileless_process" groupRelation="and">
  <ProcessCreate onmatch="include">
    <Image condition="is">/memfd:</Image>
  </ProcessCreate>
</RuleGroup>
Once again, we got results with no common FPs!
Event SYSMONEVENT_CREATE_PROCESS
RuleName: fileless_process
UtcTime: 2024-03-04 18:03:07.769
ProcessGuid: {1ba1a76e-0cdb-65e6-5d91-aa153b560000}
ProcessId: 20068
Image: /memfd:
CommandLine: nofilehere
CurrentDirectory: /root/detecting-fileless-proc
User: root
LogonGuid: {1ba1a76e-081c-65e6-0000-000000000000}
LogonId: 0
TerminalSessionId: 240
IntegrityLevel: no level
ParentProcessGuid: {1ba1a76e-081c-65e6-9d4b-da0675550000}
ParentProcessId: 19798
ParentImage: /usr/bin/bash
ParentCommandLine: -bash
ParentUser: root
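One caveat about the fileless_process rule: memrun creates its memfd with an empty name, which is why the image is exactly /memfd:. A tool that passes a non-empty name to memfd_create() would presumably show up with that name appended after the colon, and the “is” condition would then miss it. A slightly broader sketch follows, assuming Sysmon reports named memfds with the same /memfd: prefix; this variant is untested against a named memfd.

```xml
<RuleGroup name="fileless_process" groupRelation="and">
  <ProcessCreate onmatch="include">
    <!-- Prefix match instead of exact match, intended to also cover named memfds -->
    <Image condition="begin with">/memfd:</Image>
  </ProcessCreate>
</RuleGroup>
```

Since no regular executable path on disk starts with /memfd:, the prefix match should not add false positives compared to the exact match.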
Conclusion
We can see that Sysmon is really useful as a host IDS sensor, but it can be a bit tricky to get its configuration right. It’s best to start small, with tried and tested rules that emit no false positives, especially when dealing with environments at scale. You really don’t want to push a rule to thousands of servers only to find that it triggers every few seconds, drowning you in noise and hiding the info that really matters. There’s a lot more to explore with this nifty tool, so stay tuned for part 2, where we’ll explore the realms of EventID 3, a.k.a. “Network Connect”.