The new Opengrep repo can be found here.
Last month, Semgrep (who apparently just re-messaged as an AI AppSec engineer?) made the unfortunate decision to change their licensing model. Here was my original response to that on LinkedIn. Because we live in a world where nuance gets skipped, it’s important to say first that I’ve always liked Semgrep’s scanning capabilities and values. They have (and had) legendary community advocates whom you should go follow, like Tanya Janca, Kyle Kelly, and Clint Gibler. They also have great researchers who are heavily engaged like Kurt Boberg and Pieter De Cremer. They’re even running a promotion where you can get your team free training from Tanya herself!
Unfortunately these sorts of posts often get reduced to “James likes or doesn’t like vendor x,” but I honestly don’t feel that way about any vendor. Every vendor has strengths and weaknesses depending on the use case (or else they’d have no customers!). The question of “success” is just a measurement of how broad that use case is, and how much better a company does it than their competition.
In this post, I’ll share why I was disappointed with the license change, and why I’m so excited for Opengrep, a forever open source alternative sponsored by over 10 competitive security vendors.
A small change with big impact
I found Semgrep’s license changes to be disappointing from both a “market analyst” position and an end user position. From an analyst position, rug pulling an open source license is the lazy way of trying to stop competition, while harming everyone’s customers. By way of analogy, I remember what Netflix in the video industry and Steam in the gaming industry said about tackling piracy - rather than focusing on shutting down pirates, their goal was to create a product experience better than what the pirates offered. Customers are best served when companies focus on creating a paid offering better than the free one.
When it comes to open source licensing, to be clear, Semgrep has a legal right to own their work, even if it contradicts their earlier stated goals. Also, unlike some more opinionated OSS advocates, I don’t have any strong moral issues with them changing licenses, besides a general feeling that it’s lame. However, it’s better for everyone if instead of focusing on winning via legal battles (Drake), we focus on delivering the things people want (Kendrick). I’m now one step closer to the “three things Drake and Kendrick taught me about B2B SaaS” post.
From the end user perspective, I don’t love Semgrep’s SaaS product - I think it was slow to add scanning capabilities without the CLI, the rule structure creates a ton of weird duplication, and I find the general management features to be lacking. For clarity, it’s far from an uncompetitive product - it has features, detections, and customizations that make for a competitive PoC against anyone else in the industry. It’s just not the thing that makes me talk about Semgrep.
Conversely, a lot of Semgrep-using companies have things I find really valuable - from Arnica’s workflows (which I recently did a video using), to Amplify’s AI AutoFixes, to Aikido’s all-in-one capabilities, to Kodem’s runtime reachability, or Endor’s static reachability - each vendor has iterated on Semgrep to provide clearly differentiated and valuable products. I could say positive things about each vendor supporting this initiative.
In the security industry, there’s a weird amount of hate for “vendors that are ‘just’ a wrapper for open source x.” I think it’s weird because this is how most modern software works - a lot of SaaS is “just” applied open source projects at scale. This is why I wish Semgrep had taken a licensing approach rather than putting up a wall - let end users be free to meet their needs, and get paid your deserved fee for building the engine. This was not apparent from Semgrep’s post, but in discussions with vendors, it seems that Semgrep chose to try and shut down competition, rather than just trying to make them pay for it.
But if you’re an engineer, let’s talk about what matters - getting the most security scanning you can in the easiest way possible, preferably for free. Semgrep’s move towards the paid platform has, over time, locked some cool capabilities away from free users:
Multi-file analysis
Cross-function analysis
Reachability
Pro detection rules
Support for certain languages
Important metadata around ignore status
The loss of metadata is what now makes using Semgrep internally, for free, virtually impossible - as ignoring findings is the most critical part of running a SAST at scale. Effectively, the open source version of Semgrep is now a fun scanning toy for one-off scans, but nothing you could seriously implement at an organization.
Five Reasons to Care about Opengrep
This is why I’m excited about Opengrep launching today from a consortium of application security vendors. Application Security is a rare field because most vendors I know, especially startups, are genuine developer security nerds who just want to make developer lives easier. To be sure, each is passionately committed to their approach; however, most choose to try and make something helpful rather than sell snake oil, as is the common perception.
Opengrep is committed to providing a permanent solution for open source static analysis. Here are five reasons I think this is great for everyone.
Static Security Analysis is most often a black box comparison, with thorough testing always being somewhat cherry picked.
On the Devsecops subreddit, there’s almost a daily post asking “I’m comparing vendor x with vendor y, which SAST should I use?” One issue with this question is that you’d expect the answer to be a math problem - here’s the vendor with the most detections, and what percent of those are true positives.
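To make that “math problem” framing concrete, here is a minimal Python sketch of the calculation people expect to be able to run. The scanner names and counts are invented purely for illustration and are not real benchmark results.

```python
from dataclasses import dataclass


@dataclass
class ScanResult:
    scanner: str          # hypothetical scanner name
    detections: int       # total findings reported
    true_positives: int   # findings confirmed as real issues

    @property
    def precision(self) -> float:
        # Share of reported findings that were real; 0.0 when nothing was reported.
        return self.true_positives / self.detections if self.detections else 0.0


# Made-up numbers for illustration only.
results = [
    ScanResult("vendor_x", detections=200, true_positives=90),
    ScanResult("vendor_y", detections=80, true_positives=60),
]

# Rank by precision, highest first.
for r in sorted(results, key=lambda r: r.precision, reverse=True):
    print(f"{r.scanner}: {r.detections} detections, {r.precision:.0%} true positives")
```

Even in this toy version the ranking depends on the metric you pick: vendor_x confirms more real issues overall, while vendor_y is more precise.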
The problem with SAST is that it’s impossible for any individual to fully vet these tools. Arguably, I’ve done some of the most SAST testing of any person or group out there, but I’ve never dared to say “this scanner is objectively the best one,” because I know each interprets findings drastically differently.
For instance, when I made a simple post congratulating some vendors for detecting a Sequelize-based SQL injection, I was immediately made aware of the complexity of just this single example. Some vendors messaged me as they had detected it, but disabled the rule because it was so noisy. Semgrep commented that they discovered a gap in taint source support with Node standard libraries. Other vendors caught it as almost an accidental consequence of how they detect SQL injections.
The fact of the matter is that the industry benefits immensely from open source data and baselines to build and find these detections quickly. Whether it’s to quickly support emerging frameworks (looking at you JS framework of the week), or attempting the most complex hidden injection ever, the industry needs an open source baseline that can be easily shared and updated.
The industry is better off with a standardized and shareable AST
I view the creation of an abstract syntax tree (AST) as little different from having a standardized SBOM format - having a universal way to break down code into common patterns is not something that should need constant reinvention. From this common breakdown, every vendor I’ve worked with has their own way of maintaining in house rules, and deciding which community rules to support. Vendor time is better spent differentiating here, on things like false positive reduction or prioritization, than on new ways to break down code.
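As a small illustration of what “breaking code down into common patterns” means, the sketch below uses Python’s built-in `ast` module to parse a snippet and flag calls to `eval`. This is not how Opengrep or Semgrep is implemented; the sample source and the `find_calls` helper are hypothetical, and the point is only that once code is a tree, a rule becomes a short query over that tree.

```python
import ast

# Hypothetical code under analysis.
SOURCE = """
def handler(user_input):
    value = eval(user_input)
    return str(value)
"""


def find_calls(source: str, func_name: str) -> list[int]:
    """Return the line numbers where a bare call to func_name appears."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == func_name
    ]


print(find_calls(SOURCE, "eval"))
```

The parsing step is the part that does not need reinventing; the rule itself (here, “a call to a function named `eval`”) is where a vendor’s own detection logic and tuning would live.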
Community rules developed in Opengrep can more genuinely unify language dependent open source scanners
Despite the success of Semgrep, many companies have the “too many scanners” problem because they’ve had to spread out across open source projects, such as Bandit for Python or Rubocop for Ruby. There’s a huge engineering benefit to these tools being unified; however, Semgrep adoption would always be hampered by the possibility of an open source rug pull. I hope Opengrep can create a long-term unification of these scanners.
Paid features will eventually become free
On day one, Opengrep will offer seemingly minor features for free - most importantly the metadata that Semgrep has now locked behind login. However, I’d encourage attending the roadmap session to learn more about the future of the product. I’m hopeful here because I know many of the vendors supporting the project have built their own ways of doing the Semgrep pro features, and I’m hoping those will find their way back to the community as contributions.
Anyone can help with confidence
I hope that the knowledge of long term community ownership of this project will encourage greater community contribution and involvement, since you’re no longer just doing free work for a particular vendor, but genuinely helping the entire community.
I imagine the most common reaction to this news will be, “I use vendor x who doesn’t use Semgrep, so I don’t care about this.” Nonetheless, Opengrep offers a future where we can instead say “I can’t believe I used to have to pay just for that.” Application security is hard enough. The community deserves a great free scanning tool, one with robust options that doesn’t just exist to ultimately serve a single corporate interest.