In this blog post, we share why we started CloudQuery as an open-source cloud asset inventory, as well as some of the product and technical decisions we made along the way.
Before I jump into the technical and product discussion, I want to give some personal context. I have been in the security industry for more than a decade, working for and running enterprise security companies, and seeing many more founded during the last couple of years. One thing always made my stomach hurt: the "Get a Demo" button (maybe because I'm a millennial, or just because the industry didn't move fast enough).
I was looking for alternative solutions, ideally dev-first or, even better, open-source, so I could engineer around them, but I couldn't find any. To validate that I wasn't missing something and that I wasn't the only one asking for this, I released the first CloudQuery version in early 2021, and it started to gain traction pretty quickly. This is when we decided to double down on the opportunity, raise money, build the team, and embark on this journey.
Now I'll share some of our thought process, how we designed and built CloudQuery, and why we think open-source is the right way to solve some of the fundamental challenges in the cloud.
First, let’s look at the following simplified cloud management market landscape.
This landscape can be quite confusing, as the number of acronyms, vendors, and solutions just keeps growing, so for simplicity I’ve scribbled just 3 circles: CSPM, Cost, and another for all other acronyms.
Before we embarked on our open-source journey, we looked at this landscape and tried to understand: Why are there so many vendors? Why are more popping up? Why are more acronyms popping up? What is the root cause of that?
If we throw all the confusing acronyms out the window for a second and define in layman’s terms what we want to achieve as SREs, Security Engineers, or DevOps, we can say the following: “I want to ask questions, get answers, and then enforce/monitor some of those answers based on what I have in my Cloud/SaaS infrastructure”.
Translated into technical and product terms, this means we need an up-to-date database with all the information/configuration (an asset inventory) to be able to ask questions.
This means we need a performant, up-to-date ETL (Extract, Transform, Load) engine with a wide variety of integrations - with good breadth (support for many different cloud/service providers) and depth (comprehensive coverage for every cloud provider’s features).
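To make the extract, transform, and load steps concrete, here is a minimal sketch of such an engine. It is not CloudQuery's implementation: the `fetch_*` functions are hypothetical stand-ins for real provider API calls, the `buckets` table layout is invented for illustration, and SQLite is used only so the example runs without any setup.

```python
import sqlite3
from typing import List


# Extract: hypothetical raw responses. A real extractor would call each
# provider's SDK or REST API here.
def fetch_aws_buckets() -> List[dict]:
    return [
        {"Name": "logs", "Region": "us-east-1", "PublicAccess": False},
        {"Name": "assets", "Region": "eu-west-1", "PublicAccess": True},
    ]


def fetch_gcp_buckets() -> List[dict]:
    return [{"name": "backups", "location": "US", "iamPublic": False}]


# Transform: map each provider's response shape onto one common row layout.
def transform_aws(item: dict) -> tuple:
    return ("aws", item["Name"], item["Region"], int(item["PublicAccess"]))


def transform_gcp(item: dict) -> tuple:
    return ("gcp", item["name"], item["location"], int(item["iamPublic"]))


# Breadth comes from adding (fetch, transform) pairs for more providers;
# depth comes from adding more resource types per provider.
SOURCES = [
    (fetch_aws_buckets, transform_aws),
    (fetch_gcp_buckets, transform_gcp),
]


def sync(conn: sqlite3.Connection) -> int:
    """Load: upsert every extracted row so that re-running refreshes the inventory."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS buckets ("
        "provider TEXT, name TEXT, region TEXT, is_public INTEGER, "
        "PRIMARY KEY (provider, name))"
    )
    rows = [transform(item) for fetch, transform in SOURCES for item in fetch()]
    conn.executemany("INSERT OR REPLACE INTO buckets VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(sync(connection), "rows loaded")
```

Because the load step upserts on `(provider, name)`, running `sync` on a schedule keeps the table current rather than appending duplicates, which is the "up-to-date" property the text asks for.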
And then we got our aha moment!
The first and foremost issue is that we have an effectively infinite number of APIs. If we look at other verticals such as IaC (Infrastructure as Code: Terraform, Pulumi, CloudFormation), we can see that most of them are open-source, and for a good reason:
The second issue we saw, and one that caused the market fragmentation, is co-locating the ETL engine with the database and processing layer.
Different solutions might need different queries or even different databases. Moreover, the number of use-cases and questions you want to ask and enforce is infinite, so the user must have raw access to the database.
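As an illustration of why raw access matters, the sketch below asks two unrelated questions, one about security and one about cost, of the same inventory table using plain SQL. The `buckets` table, its columns, and the sample rows are hypothetical, and SQLite stands in for whatever database the inventory is actually loaded into.

```python
import sqlite3


def build_inventory() -> sqlite3.Connection:
    """Create a small in-memory inventory with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE buckets ("
        "provider TEXT, name TEXT, is_public INTEGER, size_gb REAL)"
    )
    conn.executemany(
        "INSERT INTO buckets VALUES (?, ?, ?, ?)",
        [
            ("aws", "logs", 0, 120.0),
            ("aws", "assets", 1, 4.5),
            ("gcp", "backups", 0, 900.0),
        ],
    )
    return conn


def public_buckets(conn: sqlite3.Connection) -> list:
    """Security question: which buckets are publicly accessible?"""
    return conn.execute(
        "SELECT provider, name FROM buckets "
        "WHERE is_public = 1 ORDER BY provider, name"
    ).fetchall()


def storage_by_provider(conn: sqlite3.Connection) -> list:
    """Cost question: how much storage do we hold per provider?"""
    return conn.execute(
        "SELECT provider, SUM(size_gb) FROM buckets "
        "GROUP BY provider ORDER BY provider"
    ).fetchall()


if __name__ == "__main__":
    inventory = build_inventory()
    print(public_buckets(inventory))
    print(storage_by_provider(inventory))
```

Neither query required a change to how the data was collected. A new use-case is just a new query, which is the point of keeping the ETL engine separate from the database and processing layer.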
Given these insights, we scribbled the following:
You can observe the following components:
Replacing acronyms with use-cases: Instead of adding more acronyms we just want to focus on the end use-cases:
We are really excited about the future of cloud management, and we think it’s open-source, customizable, and community-first.
P.S. We are hiring. Join us to build an open-source future.