The Security Data Fabric Shift Explained: Why Zscaler Paid $350M for Avalor And What It Means For The Security Industry
Security Data Fabrics Explained
Welcome to a special edition of TCP! I'm Darwin Salazar, Product Manager at Monad and a former Detection Engineer. Each week, I distill the latest and most exciting developments in cybersecurity innovation into digestible, bite-sized updates. If you’re serious about staying at the forefront of the latest in security products, attacker techniques, and industry news, make sure to hit the “Subscribe” button below to get my insights delivered straight to your inbox every week 📩
Zscaler's recent acquisition of Avalor has the industry buzzing, with many folks, including seasoned practitioners and VCs, scratching their heads. Why did Zscaler drop ~$350M on a data fabric geared towards vulnerability management? Isn't Zscaler a network security, CNAPP, and zero trust player? What even is a data fabric? Zscaler CEO Jay Chaudhry states that it's an AI play. What does this all really mean?
The acquisition also seems to have served as a catalyst for Avalor-like companies to make moves. Below is a timeline of what transpired after the acquisition was finalized:
March 14th: Zscaler finalizes Avalor acquisition for $350M.
March 19th: SentinelOne's venture arm, S Ventures, invests in Auguria.
March 21st: Tarsal raises $6M seed and appoints Barrett Lyon as CTO.
March 22nd: Leen.dev announces their $2.8M pre-seed.
March 26th: Abstract Security emerges from stealth with $8.5M seed funding.
All of this activity comes on the heels of the "best-of-breed vs. platform" debate sparked by Palo Alto Networks' most recent earnings report. In some way, this is all connected, and in this post I'll attempt to make sense of it. This is one of my lengthier posts, so feel free to skip through the sections.
Here's what we'll be covering:
Security Data Fabric v. Security Data ETL
What Zscaler and Avalor do
Why the acquisition makes sense
What it means for the industry
Security Data Fabric v. Security Data ETL
It’s easy to conflate data fabrics with data ETL processes so let’s clearly define these before moving forward.
Security Data Fabric
A security data fabric is the infrastructure and processes that create an integrated layer across many nodes (or data sources). Data fabrics often leverage advanced analytics and machine learning to identify relationships and patterns among disparate data sources, enabling real-time, data-driven decision-making.
This is what Avalor does, with vulnerability management as its core use case.
Security Data ETL
On the other hand, security data ETL (Extract, Transform, Load) is a specific process that focuses on extracting data from various sources, transforming it into a clean, standardized format, and loading it into a central repository for analysis. ETL ensures data consistency and accuracy, but it does not include the analytical or relational aspects of a data fabric.
While a security data fabric enables an integrated, flexible, and analytics-driven approach to managing security data, ETL processes lay the foundation for a data fabric.
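To make the distinction concrete, here's a minimal ETL sketch in Python. The export file, field names, and schema are all hypothetical, purely for illustration:

```python
import json
import sqlite3
from datetime import datetime, timezone

def extract(raw_export_path):
    """Extract: read one tool's raw vulnerability export (JSON lines)."""
    with open(raw_export_path) as f:
        return [json.loads(line) for line in f]

def transform(records, source_name):
    """Transform: map tool-specific fields onto a common schema."""
    normalized = []
    for r in records:
        normalized.append({
            "source": source_name,
            "asset": r.get("hostname") or r.get("asset_id", "unknown"),
            "cve": r.get("cve_id", "").upper(),
            "severity": r.get("severity", "info").lower(),
            "seen_at": r.get("detected_at",
                             datetime.now(timezone.utc).isoformat()),
        })
    return normalized

def load(findings, db_path="findings.db"):
    """Load: write normalized findings into a central repository."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS findings "
                 "(source, asset, cve, severity, seen_at)")
    conn.executemany(
        "INSERT INTO findings VALUES "
        "(:source, :asset, :cve, :severity, :seen_at)", findings)
    conn.commit()
    conn.close()

load(transform(extract("scanner_export.jsonl"), "scanner_a"))
```

The ETL step only gets the data into one consistent shape and place; a fabric sits on top of a store like this and layers correlation and analytics across everything that lands there.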
Now, let’s take a look at Zscaler and Avalor.
Zscaler
At a high level, Zscaler is a market leader in the Security Service Edge (SSE) space, a key cloud security contender with their cloud-native application protection platform (CNAPP), and they also market a 'Zero Trust Exchange' platform. Ultimately, Zscaler has bundled many traditionally stand-alone products, like CSPM, CASB, CWPP, CIEM, firewalls, and web gateways, into these various core platform offerings.
With nearly 3,500 customers and a wide array of offerings, I'd imagine Zscaler ingests, processes, and correlates at least a few petabytes of data per day. For reference, Chaudhry states that they process 400 billion cloud 'transactions' per day. You can only imagine how much they're doing on the SSE and Zero Trust Exchange side.
Lastly, Zscaler is no small fry. They're a publicly traded company with a market cap of ~$29B.
Avalor
Avalor describes itself as a Data Fabric for Security, with its use case being vulnerability management. What this means is that Avalor extracts data from a customer's security solutions, normalizes and cross-pollinates findings from those solutions in order to surface the highest-risk issues, and then presents them to the customer. Avalor also has remediation assistance workflows, so it's not solely surfacing vulns; it also helps with remediation.
Aside from having a robust correlation engine that factors in the context that influences potential vulnerability impact (e.g., environment, reachability, exploitability, resource tags), they also seem to have highly performant data infrastructure that enables all of the above at scale. Avalor also has 150+ 3rd party integrations, which include some of Zscaler's competitors.
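To illustrate what context-aware prioritization can look like, here's a toy sketch with invented weights and fields, not Avalor's actual engine:

```python
# Toy contextual risk score: an illustration of how environment,
# reachability, and exploitability context can reorder a raw
# CVSS-based ranking. Weights and fields are invented.
def contextual_risk(finding):
    score = finding["cvss"]   # base severity from the scanner
    if finding["internet_reachable"]:
        score *= 1.5          # exposed assets matter more
    if finding["exploit_available"]:
        score *= 1.4          # known exploits raise urgency
    if finding["environment"] == "prod":
        score *= 1.3          # production outweighs staging/dev
    if "crown-jewel" in finding.get("tags", []):
        score *= 1.2          # business-critical resource tags
    return round(score, 1)

findings = [
    {"cve": "CVE-A", "cvss": 9.8, "internet_reachable": False,
     "exploit_available": False, "environment": "dev", "tags": []},
    {"cve": "CVE-B", "cvss": 7.5, "internet_reachable": True,
     "exploit_available": True, "environment": "prod",
     "tags": ["crown-jewel"]},
]
# CVE-B outranks CVE-A once context is applied, despite a lower CVSS.
for f in sorted(findings, key=contextual_risk, reverse=True):
    print(f["cve"], contextual_risk(f))
```

The point is simply that raw severity alone misleads: a medium-severity finding on an exposed, exploitable production asset can warrant more urgency than a critical one in a dev sandbox.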
Note: While founded in 2022, Avalor came out of stealth in April 2023 with $30M in Series A funding, backed by Cyberstarts, TCV, and Salesforce Ventures.
Why the acquisition makes sense
Simply put, the acquisition makes sense because it enables Zscaler to accelerate the implementation of AI into their products and it provides the infrastructure needed to correlate massive volumes of data across their offerings. This cross-correlation of data allows Zscaler to surface higher fidelity and prioritized findings to their customers, providing a level of precision and context that they weren’t previously able to.
In the past couple of years, we’ve seen vendors double down on contextualizing security issues, because without context, everything is seemingly on fire all the time and security teams struggle with deciding what to prioritize. I’m a firm believer that Wiz has eaten much of PANW’s market share due to their attack path analysis and other contextual features. Without data infra to support cross-pollination of data sources, it’s nearly impossible to add context to security issues. This is why the Avalor acquisition gives Zscaler an upper hand in the near-term.
Building the data infra to power both AI and large-scale data correlation at Zscaler's scale is a tall and hairy engineering challenge. It requires processing petabytes of data from millions of endpoints, billions of daily transactions, and numerous different data sources. The data pipelines and storage systems must be highly scalable, secure, and optimized for real-time correlation and AI applications.
Avalor's data fabric offers a turnkey solution to this challenge. It can cleanse, normalize, and enrich data from Zscaler's various products to create a unified data asset that can be used for both training AI models and correlating security findings. For example, robust data infra can enable Retrieval-Augmented Generation (RAG) techniques to dynamically retrieve relevant snippets or context from the data fabric to inform AI-generated security recommendations.
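Here's a simplified, self-contained sketch of that RAG pattern. The retrieval uses naive keyword overlap where a real system would use vector embeddings, and the findings are made up:

```python
# Simplified sketch of RAG over a security data fabric. Retrieval here
# is naive token overlap; a real system would use vector embeddings.
def relevance(query, summary):
    """Jaccard overlap between query tokens and finding summary tokens."""
    q, s = set(query.lower().split()), set(summary.lower().split())
    return len(q & s) / len(q | s)

def retrieve_findings(findings, query, k=3):
    """Pull the k findings most relevant to the question."""
    return sorted(findings, key=lambda f: relevance(query, f["summary"]),
                  reverse=True)[:k]

def build_prompt(findings, question):
    """Ground the model in retrieved context instead of its training data."""
    context = "\n".join(f"[{f['id']}] {f['summary']}"
                        for f in retrieve_findings(findings, question))
    return ("You are a security assistant. Using only the findings below, "
            "answer the question and cite finding IDs.\n\n"
            f"{context}\n\nQuestion: {question}")

findings = [
    {"id": "F-1", "summary": "log4j RCE on internet-facing payment service"},
    {"id": "F-2", "summary": "expired TLS cert on internal wiki"},
]
# The resulting prompt is what gets sent to any LLM completion API.
print(build_prompt(findings, "Which internet-facing services have critical vulns?"))
```

The design choice that matters here is that the model only ever sees a small, relevant slice of the fabric, which keeps recommendations grounded in the customer's actual data.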
With the acquisition, Zscaler gains access to Avalor's 150+ integrations, which allow the ingestion of diverse data from various sources, including Zscaler competitors. This rich dataset is INVALUABLE for training AI models and gives Zscaler a unique advantage in the market.
However, this also means that Zscaler now has a backdoor to competitor data through Avalor's integrations, which makes for a tricky situation. Maintaining these integrations requires ongoing partnerships for API updates, troubleshooting, etc., so it’ll be interesting to see how this all plays out.
Lastly, the acquisition also allows Zscaler to enter the vulnerability management space with a differentiated, mature product that is eating market share from incumbents. Another big win.
By acquiring Avalor, Zscaler has:
Accelerated its AI roadmap by years. I expect a Zscaler copilot later this year.
Enhanced its ability to deliver high-fidelity, prioritized security findings.
Acquired a next-gen vulnerability management solution.
Inherited 150+ integrations with visibility into competitors’ data models.
In my opinion, this acquisition positions Zscaler extremely well moving forward. At the end of the day, integrating codebases, products, and backends is not easy, so the value left to be realized will come down to how well Zscaler can execute.
What it means for the industry
For Incumbent Vendors
The security industry has lagged other industries in adopting big data, ML and AI, but this acquisition highlights that security is finally becoming more data-driven.
To be competitive in the new security landscape, vendors need a robust data strategy. This includes data collection pipelines that can handle terabytes of data per day, scalable storage systems for petabyte-scale data lakes, stream and batch processing to derive real-time and historical insights, and schema management to impose structure on disparate data sources.
If you're a vendor with multiple products in your portfolio going for a platform play, you need a data fabric to deliver an integrated user experience. A data fabric enables normalizing data models across products, linking entities, and providing unified APIs and UIs. This is especially important for vendors who have grown via acquisition and have disparate backends.
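As a rough illustration of that entity-linking idea, here's a sketch that merges two products' views of the same host into one canonical asset (the product records and fields are invented):

```python
# Rough sketch of cross-product entity linking: two acquired products
# describe the same host differently, and the fabric resolves them to
# one canonical asset. Records and field names are invented.
endpoint_record = {"device_name": "web-01.corp", "agent_id": "ep-123",
                   "os": "linux"}
cloud_record = {"hostname": "web-01.corp", "instance_id": "i-0abc",
                "region": "us-east-1"}

def link_entities(records, key_fns):
    """Merge records that resolve to the same canonical asset key."""
    assets = {}
    for record, key_fn in zip(records, key_fns):
        key = key_fn(record)                  # canonical identity
        assets.setdefault(key, {}).update(record)
    return assets

unified = link_entities(
    [endpoint_record, cloud_record],
    [lambda r: r["device_name"].lower(), lambda r: r["hostname"].lower()],
)
# One merged view of 'web-01.corp' carrying both products' attributes.
print(unified)
```

Real fabrics do this with far messier identity resolution (IPs that get recycled, agents that re-register, cloud resources that churn), but the unified-asset output is the thing that makes a cross-product UI and API possible.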
Here are some security vendors investing in data fabrics:
CrowdStrike: Falcon Platform and their Threat Graph.
Microsoft: In 2019, Microsoft launched their Intelligent Security Graph, which connects signals across their security ecosystem. I’d imagine their Security Copilot leverages this graph.
F5: Recently launched an AI data fabric that powers their AI copilot.
SentinelOne: Singularity Platform and its security data lake.
For Startups
If you're a security startup looking to differentiate, a 3rd party data fabric can provide access to a diversity of data sources to power unique insights. By leveraging a data fabric, startups can focus on building innovative analytics and AI capabilities on top of a comprehensive dataset, rather than spending valuable resources on data integration and normalization.
This can help them quickly bring differentiated offerings to market that draw insights from a wider range of security signals.
For Security Teams
Security teams that are adopting a best-of-breed approach can leverage fabrics and ETL products to harmonize, normalize, enrich, cross-correlate, and move data on their own terms rather than on the terms of a vendor.
Every organization has a different risk profile and business environment. It’s impossible for platform vendors to account for this variability. Taking a best-of-breed approach with a fabric or ETL solution enables security teams to build tailor-made security strategies and solutions, though it requires more work.
Security leaders can also leverage this tooling to create richer, continuous, and custom KPIs spanning multiple data sources rather than going the spreadsheet route. By optimizing data before feeding it into a SIEM, data lake, long-term storage, etc., teams can save on compute, ingest, and storage costs.
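As a sketch of that pre-SIEM optimization, the snippet below drops known-noisy event types and deduplicates repeats before anything hits per-GB ingest pricing (the event shape and noise list are hypothetical):

```python
from collections import OrderedDict

# Minimal pre-SIEM reduction sketch: filter known-noisy events and
# deduplicate repeats before paying per-GB ingest. Event fields and
# the noisy-type list are hypothetical.
NOISY = {"heartbeat", "dns_internal_lookup"}

def reduce_stream(events, window=10_000):
    seen = OrderedDict()        # bounded dedup window
    for e in events:
        if e["type"] in NOISY:
            continue            # filtered: never reaches the SIEM
        key = (e["type"], e["src"], e["dst"])
        if key in seen:
            continue            # duplicate within the window
        seen[key] = True
        if len(seen) > window:
            seen.popitem(last=False)  # evict oldest key
        yield e                 # only unique, useful events get ingested

events = [
    {"type": "heartbeat", "src": "a", "dst": "b"},
    {"type": "login_failure", "src": "a", "dst": "b"},
    {"type": "login_failure", "src": "a", "dst": "b"},  # duplicate
]
print(list(reduce_stream(events)))  # only one login_failure survives
```

Even a crude filter-and-dedup pass like this can cut ingest volume meaningfully, and because it runs in the pipeline rather than the SIEM, the savings show up directly on the bill.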
In essence, security data fabrics and ETL products enable teams to own their data, maximize the value of their best-of-breed tools, streamline security operations, and create tailored KPIs while potentially reducing their massive SIEM costs. This approach puts the power back in the hands of security teams, rather than relying too much on vendors.
Conclusion
Zscaler's $350M acquisition of Avalor is evidence of the growing importance of data and AI in cybersecurity. By bringing Avalor’s proven data fabric and differentiated vuln management solution into its portfolio, Zscaler has positioned itself extremely well for the next 5 years.
As the best-of-breed vs. platform debate rages on, one thing is clear: both approaches can benefit from a data fabric.
As more vendors invest in data fabrics and AI, the writing on the wall becomes clearer. Data and AI will be the driving forces shaping the next decade of cybersecurity innovation.
Feedback?
Have any feedback or would like to keep the discussion going? Feel free to drop a comment, subscribe and share with friends!
Disclaimer: The views and opinions expressed are solely my own and do not reflect the views of my employer.