To improve their security posture and detect advanced threats, organizations need to embrace proactive hunting on their networks as part of their regular security activities.
This article is not intended to be an explanation of why proactive hunting is a good thing; there's already plenty of eloquent writing out there on that topic. Rather, this is a discussion of how the process can be run in an efficient way.
The process is based around the steps below, but ultimately the goal is for you to have a better understanding of your infrastructure and to build up a set of automated, repeatable detections, increasing your detection capacity over time. Proactive hunting is expensive (mainly in time), so the last thing you want to do is make it a constant cost, where detecting the same thing over and over always requires manual labor. Instead, you want to spend the time developing the detection capability once and then reuse it in an automated fashion. If you find yourself hunting for a day and having to spend another day a month later attempting to detect the same behavior, you're not doing it right. Your detection corpus should grow over time while your efforts stay constant.
Not covered in detail here is an ad-hoc form of proactive hunting: the idea of simply going through the data you have available, looking for general anomalies without knowing what they are ahead of time. This form of hunting is very legitimate and, given experienced analysts, can have great results. However, if not done properly, it is not sustainable. This is why this type of hunting should be seen as part of the first step of the process outlined below (the idea phase), to ensure the efforts are not wasted.
So let's look at a sample process:
- Posit a new hypothesis.
- Prototype a detection mechanism.
- Test the detection mechanism.
- Tweak the detection mechanism.
- Evaluate the correctness of your hypothesis.
Posit a New Hypothesis
This should begin very simply, with a statement like: "If an executable ran from the recycle bin on Windows, that would be suspicious."
You may be tempted to scoff at some hypotheses, thinking they are too obvious. Don't be fooled; the sad reality of every organization is that the real world (not your hypothesis) is *much* dirtier than what you believe it to be. Something that may be an obvious sign of suspicious activity to you may prove to be extremely common because your marketing department is using software developed by an obscure company and they thought it would be a great DRM feature to execute from the recycle bin. You may be right (it's a horrible idea), but it doesn't matter. You will have to deal with the real world.
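A hypothesis like the recycle bin one can be written down as a small predicate before any real tooling is involved. The sketch below is only an illustration: the function name `runs_from_recycle_bin` and the idea that an execution event exposes a plain file path string are assumptions, not part of any particular product.

```python
from pathlib import PureWindowsPath

# Folder names Windows uses for the recycle bin: "$Recycle.Bin" on Vista and
# later, "RECYCLER" and "RECYCLED" on older systems.
RECYCLE_BIN_DIRS = {"$recycle.bin", "recycler", "recycled"}


def runs_from_recycle_bin(file_path: str) -> bool:
    """Return True if any directory in the path is a recycle bin folder.

    Hypothetical helper: assumes the event gives us the executable's full path.
    """
    parts = PureWindowsPath(file_path).parts
    # Skip the last component: it is the executable name, not a directory.
    return any(part.lower() in RECYCLE_BIN_DIRS for part in parts[:-1])


if __name__ == "__main__":
    print(runs_from_recycle_bin(r"C:\$Recycle.Bin\S-1-5-21-1\$RABC123.exe"))
    print(runs_from_recycle_bin(r"C:\Windows\System32\notepad.exe"))
```

Keeping the hypothesis this small makes it cheap to throw away if the real world turns out to be dirtier than expected.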
Prototype a Detection Mechanism
Here you will want to figure out a way of testing your hypothesis. There are usually many ways to do this, but the focus should be on finding a cheap way to give you just enough data to determine whether you're on the right path or not. Sometimes it turns out that developing a prototype is going to be very expensive. There are no hard-and-fast rules; you will have to determine the return on investment of going with this hypothesis versus another one.
Thinking outside the box is often very useful in this step. Developing a complete detection package may be great long-term, but developing two small tests that cover the critical aspects of the hypothesis (or the aspects most likely to be wrong) can often pay off sooner.
This is also where having a very agile platform is useful. Having a platform where prototyping won't bring down "production" and where you don't have to wait on a vendor to implement this cool new feature is irreplaceable. Big shock: this is exactly what Lima Charlie is doing.
Test the Detection Mechanism
Next comes the actual deployment of the test. This is fairly straightforward. The only real concerns should be around the impact on your infrastructure and/or your detection system backends. If the tests developed need to be pushed to the various sensors on your infrastructure, you will obviously want to do a progressive rollout, perhaps even limiting the full deployment to a random sample of sensors. Deploying the test to a backend is always better, purely from a stability point of view.
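Limiting a test to a random sample of sensors takes very little code. The following is a minimal sketch under stated assumptions: sensors are identified by plain strings, and `pick_rollout_sample` is a hypothetical helper, not a function of any specific platform.

```python
import random


def pick_rollout_sample(sensor_ids, fraction, seed=None):
    """Pick a random subset of sensors to receive the test detection.

    Hypothetical helper: `fraction` is the share of sensors to include,
    and `seed` makes the selection reproducible when set.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Sort first so that a fixed seed always yields the same sample.
    sensors = sorted(sensor_ids)
    if not sensors:
        return []
    count = max(1, round(len(sensors) * fraction))
    return random.Random(seed).sample(sensors, count)


if __name__ == "__main__":
    all_sensors = [f"sensor-{i:03d}" for i in range(100)]
    first_wave = pick_rollout_sample(all_sensors, 0.1, seed=1)
    print(len(first_wave), "sensors in the first wave")
```

A progressive rollout can then be approximated by calling the helper with a growing `fraction` once each wave has run without incident.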
Tweak the Detection Mechanism
In nearly all cases, your detection mechanism will not give you the results you expected. Often, the behavior you thought would never occur does occur at least on some occasions. This is where it becomes necessary to at least attempt to tweak the detection, usually by whitelisting some of the "hits" you've received and found to be false positives. So tweak the detection and redeploy. Getting the detection just right on the first try is so rare that it's almost always worth attempting to tweak it and redeploy it at least once.
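When deciding what to whitelist, it helps to group the hits so that recurring false positives stand out from one-off events. The sketch below assumes each hit is a dictionary with a `parent` key holding the parent process name; the sample records and the `hits_by_parent` helper are hypothetical and only illustrate the idea.

```python
from collections import Counter


def hits_by_parent(hits):
    """Count detection hits per parent process name, most frequent first."""
    counts = Counter(hit["parent"].lower() for hit in hits)
    return counts.most_common()


# Made-up sample hits, standing in for what a test deployment might return.
hits = [
    {"parent": "casemanager.exe", "host": "fin-01"},
    {"parent": "CaseManager.exe", "host": "fin-02"},
    {"parent": "winword.exe", "host": "mkt-07"},
]

if __name__ == "__main__":
    for parent, count in hits_by_parent(hits):
        print(f"{count:>4}  {parent}")
```

A parent that accounts for most of the hits is a good first candidate to investigate, while a parent that appears only once may deserve a closer look before it is dismissed.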
Evaluate the Correctness of Your Hypothesis
This is where you wrap up. Were you able to squash all the false positives? The focus is on false positives, not on the lack of true positives. This statement is important to understand. What it means is that if you've created a detection, eliminated all false positives, and are left with no hits at all, it's not a bad detection; it just means that this odd behavior is not occurring right now. From experience, the detection that gives you no true positives today may very well be the one that gives you the initial tip-off to the next attacker on your network in a year or two. This is where collecting and building up your automated detection capabilities becomes truly effective. You've created a new detection and, if you've done a good job, you should be able to add it to your roster and only need to revisit it on the rare occasions where something needs to be added to the whitelist.
As an example, consider PowerShell. On Windows, the PowerShell scripting engine can be used in many different ways. Generally speaking, it is invoked with a single argument, which is the script file to execute. That being said, it is also possible to invoke the interpreter with the "-EncodedCommand" option followed by a base64-encoded command. Applying the five steps to this behavior:
- This seems like an odd feature to use legitimately, so I posit that if I see a PowerShell execute with a command line matching that pattern, it is likely suspicious behavior.
- Create a stateless detection in LimaCharlie that looks for any execution on Windows with a command line matching the regular expression ".*(\\\\|/)powershell.exe \-EncodedCommand .+".
- Deploy it to my LC backend.
- Looking at the results, I notice a recurring pattern of an executable called "casemanager.exe" spawning a PowerShell instance with an encoded command. Decoding the command, we can see it is simply converting some text files into an XML format. Investigating further reveals that "casemanager.exe" is a custom piece of software developed by the finance department to help them manage data transfer between two products they use. Since this is the only case we see occurring, we will whitelist it and redeploy.
- Once whitelisted, we get no more hits. Success. Remember, just because there are no hits today does not mean we will not get any tomorrow. So upon code review, we commit the new detection to our repository, deploy it and forget about it.
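The walkthrough above can be sketched in plain Python. This is not LimaCharlie's detection format; it only mirrors the logic under stated assumptions. The event shape (a dictionary with `command_line` and `parent` keys), the function names, and the case-insensitive matching are all assumptions, and the pattern is a simplified variant of the one in the article, written as a Python raw string.

```python
import base64
import re

# Simplified variant of the article's pattern, with a named group for the payload.
ENCODED_PS = re.compile(
    r".*(\\|/)powershell\.exe -EncodedCommand (?P<payload>.+)",
    re.IGNORECASE,  # assumption: match regardless of case
)

# Parents found to be benign during the tweak step.
WHITELISTED_PARENTS = {"casemanager.exe"}


def decode_payload(payload: str) -> str:
    """Decode an -EncodedCommand payload (base64 over UTF-16LE text)."""
    try:
        return base64.b64decode(payload).decode("utf-16-le")
    except ValueError:
        # Not valid base64 or not valid UTF-16LE: keep the raw value for review.
        return payload


def detect(event):
    """Return the decoded command for a non-whitelisted hit, else None."""
    match = ENCODED_PS.match(event["command_line"])
    if match is None:
        return None
    parent_name = re.split(r"[\\/]", event["parent"])[-1].lower()
    if parent_name in WHITELISTED_PARENTS:
        return None
    return decode_payload(match.group("payload").strip())


if __name__ == "__main__":
    payload = base64.b64encode("Get-Date".encode("utf-16-le")).decode("ascii")
    event = {
        "command_line": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe"
        " -EncodedCommand " + payload,
        "parent": "C:\\Windows\\explorer.exe",
    }
    print(detect(event))
```

Returning the decoded command with each hit saves the analyst a manual decoding step during triage. Note that this sketch only matches the full "-EncodedCommand" spelling; PowerShell also accepts abbreviated parameter names, so a production detection would likely need a broader pattern.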
This process, of course, makes a few assumptions. Chief among them is that you have the capability to codify, deploy and constantly execute those detections. This is by no means a given. Many products now allow you, at the bare minimum, to create customized logic around process parent/child relationships, command line patterns and the like (it goes without saying that this also includes classic indicators like hashes).
If you find yourself lacking this part of the puzzle, I strongly encourage you to have a look at Lima Charlie. It is an open-source, cross-platform endpoint security framework designed with the process of proactive hunting in mind. It will provide you with many boilerplate features available in most endpoint systems as well as advanced customizable frameworks, enabling you to easily design complex detections and automations.
About the Author: Maxime Lamothe-Brassard is a security engineer at Google. Opinions are his own and do not reflect those of his employer.