Runtime Detection for Kubernetes and Linux

Oct 19, 2023

Learn more about runtime protection from Spyderbat: their history, platform, use cases, pros, and cons. As always, nothing in this post is sponsored or AI-generated!

Overview

Cloud configuration and vulnerability scanning were the first frontier of cloud native security startups for a reason: frankly, it was easy. Open source projects like CloudQuery, ScoutSuite, and others were copied with only a few weeks' worth of work, and then used to scan the cloud APIs for “misconfigurations.”

Drowning in alerts, many companies have started to realize how difficult it is to operationalize this data without context, and ultimately that runtime protection is where the rubber meets the road for defense. No amount of vulnerability scanning can prevent zero days, catch missed endpoints, or burn down its own backlog. Whether it’s runtime context informing the configuration scanning, or runtime defense providing complete prevention, runtime has proven to be the next frontier of cloud protection.

This week, we dive deep on Spyderbat, one of the first companies to invest heavily into eBPF tooling that has since expanded rapidly into K8s and cloud runtime protection. We’ll take a look at Spyderbat’s history, platform, ideal use cases, pros, and cons.

History

Spyderbat was founded in 2019 in Austin, Texas, raised a $4.2 million seed round in 2020, and raised a $10 million Series A in 2022. Their early differentiator was visualizing process trees on Linux systems to enable easier investigations for security teams. At the time, the major EDR players were far behind on Linux protection, and this is where Spyderbat excelled.

Spyderbat has continued to focus on runtime insights, but as the major players have slowly added attack graphs and expanded container support, Spyderbat has differentiated itself through two factors: clearly mapping attacker sessions and expanding into Kubernetes protections.

What’s most remarkable about Spyderbat has been their continued innovation with a team of only 25 employees. From meeting with them over the last month, I found a highly technical team that isn’t afraid to push the boundaries of runtime detection using the immense event data at their disposal. When the product manager and sales team can talk in depth about Kubernetes configuration and telemetry data, you have a team that’s more committed to solving technical challenges than finding the perfect “GTM motion” or “market positioning.”

Platform

The Spyderbat platform can be separated into four primary features. The first, Investigate, is the historical core of the product. The other three are newer: Kubernetes (a view of alerts tied to containers), Spydertop (shows process hardware utilization), and Guardian (preventative process and network configurations). Each of the newer features offers a unique value proposition that shows Spyderbat’s expansion into Kubernetes workloads.

Investigate

The core workflow of the platform is creating investigations out of “Spydertraces” - rolled up alerts of potentially malicious activity. Spydertraces are triggered by suspicious actions, such as a root shell or sudo command, and roll up the rest of the historical session under them. This is the key noise reduction and insight provided over other EDR vendors - instead of getting a bunch of alerts and trying to piece together what happened with more granular searches, the platform is doing the rollup for you.

Example Spydertraces - the suspicious command rolls up all correlated alerts
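
To make the rollup idea concrete, here is a minimal sketch of how session-level grouping could work in principle. It is not Spyderbat's implementation or API: the Event and Trace types, the roll_up function, and the list of trigger commands are all hypothetical names invented for illustration, and the only triggers modeled are the root shell and sudo examples mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical trigger list; the article only names root shells and sudo as examples.
SUSPICIOUS_COMMANDS = {"sudo", "su"}
SHELLS = {"sh", "bash", "zsh"}


@dataclass
class Event:
    session_id: str
    timestamp: float
    uid: int
    command: str


@dataclass
class Trace:
    session_id: str
    trigger: Event
    events: list = field(default_factory=list)


def is_suspicious(event):
    """Flag sudo-style commands and shells started as root (uid 0)."""
    parts = event.command.split()
    name = parts[0] if parts else ""
    root_shell = event.uid == 0 and name in SHELLS
    return name in SUSPICIOUS_COMMANDS or root_shell


def roll_up(events):
    """Return one Trace per session that contains at least one suspicious event."""
    by_session = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        by_session[event.session_id].append(event)

    traces = []
    for session_id, session_events in by_session.items():
        # The first suspicious event becomes the trigger; the whole session rides along.
        trigger = next((e for e in session_events if is_suspicious(e)), None)
        if trigger is not None:
            traces.append(Trace(session_id, trigger, session_events))
    return traces
```

The design point is that the analyst receives one object per suspicious session, with the full ordered history attached, instead of one alert per event.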

The basic Spydertrace rollup offers a ton of nice features: showing if a session is still open, how long it lasted, and the process tree of what happened. However, things quickly got complicated when I tried to go beyond this: “data layers” can be added in to show additional context, and searches run across the data provided via eBPF.

Data layers are Spyderbat’s concept of showing multiple searches together visually

Overall, the Investigation part of the platform offers the core value proposition of next generation EDR - empowering teams with relevant data to quickly assess threats. Spydertraces accurately reflected attacks that happened, but layering in additional data got just as confusing as it is elsewhere.

Kubernetes

The Kubernetes feature extends the investigation view into a visualization of Kubernetes clusters. Most EDR agents are atrocious at accomplishing this piece of modern protection: Linux host alerts are absolutely useless without relevant cluster context. Spyderbat’s visualization of K8s is one of the most useful I’ve seen in that it:

Shows network traffic from the containers
Shows the services alongside the pods
Shows the containers underneath the pods

Visualization of a K8s cluster - you can imagine this gets noisy in large environments

This piece of the platform does a great job visualizing the alerts; however, it is not always clear when the platform will take you to this view versus the Investigate view. Furthermore, the Investigate view can occasionally show logs from the point of view of the root node rather than the container itself.

Example of investigation showing the node view rather than the cluster view, leading to confusion, especially for users not familiar with what might be happening

Spydertop

The Spydertop functionality offers something for DevOps and SRE teams to get value out of the platform: seeing the process utilization of what’s running, along with network and disk traffic. An additional value layer is seeing this historically, being able to troubleshoot utilization issues during load testing. A lot of the time, the SRE team is separated from the development team and has very little visibility into what’s running inside the container itself. From their perspective, if the container is running, then there are no issues.

Example of a python webapp process

Spydertop is a quick and basic way to see what might be happening inside the container. While DataDog, Grafana, etc. offer much more robust telemetry than this, there’s value in the simplicity of seeing historical data from a process point of view rather than pivoting through graphs of data. This can be especially useful for security teams, as understanding the runtime user of a process can often be challenging.

Ability to sort processes and utilization by namespace

Guardian

Guardian is the latest feature within the Spyderbat platform, and it offers a peek at what I believe is the future of Kubernetes security. Network security policies and process-level allow lists have long been a potential feature of Kubernetes platforms; however, almost no organizations are mature enough to have robust declarations of allowable network and process actions within their containers. The future of this space will be defined by companies that make implementing these things automatic and safe.

Example of a Guardian Policy in the GUI

Because of their access to the runtime data, Spyderbat has a unique ability to generate these policies based on the actual runtime actions of the processes rather than trying to hypothesize from the code what they will do. Theoretically, one could leave a policy running in audit mode for a day or so at a time, and then elevate that policy to preventative mode for each new deployment. Similar actions could be taken by generating the policy based on actions in staging, and then enforcing it in production.
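
The audit-then-enforce workflow can be sketched in a few lines. This is a toy model under stated assumptions, not Guardian's policy format or Spyderbat's CLI: the Policy class, learn_policy, evaluate, and the "audit"/"enforce" mode names are hypothetical, and real policies would carry far more context (namespaces, image digests, ports, and so on).

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    allowed_processes: set = field(default_factory=set)
    allowed_destinations: set = field(default_factory=set)
    mode: str = "audit"  # hypothetical modes: "audit" alerts, "enforce" blocks


def learn_policy(observed):
    """Build an allow list from (kind, value) pairs seen during a baseline window."""
    policy = Policy()
    for kind, value in observed:
        if kind == "process":
            policy.allowed_processes.add(value)
        elif kind == "connection":
            policy.allowed_destinations.add(value)
    return policy


def evaluate(policy, kind, value):
    """Return "allow", "alert" (audit-mode deviation), or "block" (enforce-mode deviation)."""
    if kind == "process":
        allowed = policy.allowed_processes
    else:
        allowed = policy.allowed_destinations
    if value in allowed:
        return "allow"
    return "block" if policy.mode == "enforce" else "alert"


# Baseline gathered in staging or during an audit window (illustrative values only).
baseline = [("process", "/usr/bin/python3"), ("connection", "db.internal:5432")]
policy = learn_policy(baseline)

print(evaluate(policy, "process", "/bin/sh"))  # deviation while still in audit mode
policy.mode = "enforce"
print(evaluate(policy, "process", "/bin/sh"))  # same deviation after promotion
```

The risk this makes visible is the obvious one: anything the workload legitimately does but did not happen to do during the baseline window gets blocked once the policy is promoted.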

Implementing these protections would still be a challenge: Spyderbat offers a useful CLI to script policy generation and changes, but there’s a lot that can potentially go wrong when implementing something like this. All in all this is still a nerdy problem to solve, but Spyderbat promises to make the impossible achievable by building policies from runtime data.

Ideal Use Cases

DevOps-minded security teams with the freedom to invest in container and Linux runtime

Spyderbat is one of very few options available to technical security teams hoping to level up their runtime security in Kubernetes environments. The challenge will always be that this requires a level of visibility and sophistication that many teams don’t have, as they’re struggling to manage the deluge of alerts from their other tooling. That being said, meaningful consolidation here can mean reducing the investment in other tools as they become unnecessary. Potential cost savings can be achieved by moving away from expensive providers who often overcharge for container agents.

Security teams frustrated with their EDR performance in cloud and container environments

As someone who has managed and deployed classic EDR solutions into containers, words cannot describe the frustration they cause and the lack of insight they provide. Often, these agents serve as mere fodder for renewal contracts without generating any meaningful alerts. They waste a ton of analyst time with useless data, and waste a ton of engineering time with weird behavior and the need for custom rules. Spyderbat also requires lower compute overhead than other agents.

SRE teams looking to shore up their security and gain visibility into processes

SRE teams who care about the security of their running workloads can find a friend in Spyderbat’s Spydertop feature. In my days working closely with SRE teams, one of our constant areas of confusion on incident response calls was deciphering what had happened to a pod to make it crash. Was it truly an attack? Did a process run out of control? Usually, this meant correlating SIEM logs against DataDog just to get a basic picture of what happened.

Large enterprises with high sensitivity data scoped to specific environments

If your enterprise has highly sensitive environments, it can be worth implementing the Guardian feature to gain the tightest visibility into the workloads and the strictest guardrails around them.

Security teams looking for Linux EDR

Beyond containers, Spyderbat has been doing Linux detection longer, and it’s arguably much more mature in this area. When it comes to Linux protection alone, I can’t think of an EDR provider I’d rather be using (when it comes to containers, my opinion here is more situational).

When to Avoid

You’re already overwhelmed by your existing alerts and tooling, and getting a handle on Kubernetes seems like a distant dream. In this case, adding more alerts will never be an answer unless you can use Spyderbat to remove noise from other existing detections.
Lack of containerized or Linux workloads
Lack of SRE security champions or skill gaps on your existing security team for Kubernetes
Lack of existing visibility into where your container workloads are and their deployment flows
Analysts are unprepared for this level of alerting and your more sophisticated teams are already overwhelmed. Getting your analysts used to the platform will require some level of Kubernetes familiarity or training if they don’t already have it.

Pros & Cons

Pros:

Rich runtime alert data with process and network visibility
Spydertraces provide real-time insights into activity in your cluster
Guardian offers a unique view into the future of preventative security
API-driven platform that’s able to integrate with a ton of telemetry providers
Kubernetes view provides great visibility into what your workloads are actually doing

Cons:

Dashboard UI is built entirely around searches, leading to some confusing use cases, workflows, and data visualization
Kubernetes and Linux host views can sometimes collide and cause confusion
No FIM or malware scanning
Requires other tooling for general cloud visibility outside of the workloads
Alerts require a moderate level of technical Kubernetes experience to decipher

Check out the video for more!