Microsoft open sources CodeQL queries used in Solorigate inquiry

Microsoft has open-sourced the CodeQL queries it used to investigate the recent SolarWinds supply-chain attack, so developers can scan their own code for patterns that match the implant unearthed in that attack.

According to the Microsoft security team, a key aspect of the so-called Solorigate attack was the supply-chain compromise that enabled hackers to modify binaries in SolarWinds’ Orion product. The modified binaries allowed the attackers to remotely perform malicious activities, such as credential theft, privilege escalation, and lateral movement, to steal sensitive information.

Microsoft disclosed the attack also compromised some of its systems. It recently concluded that while some code files for Azure, Intune, and Exchange were accessed, no customer data was compromised. At the time, Microsoft President Brad Smith called it "a moment of reckoning".

To check that hackers had not modified its own code, Microsoft crafted CodeQL queries to scan that code for malicious modifications. CodeQL is a semantic code-analysis engine that’s part of GitHub. As source code is compiled, CodeQL builds a database that captures a model of that code, and the database can then be queried repeatedly like a normal database. Queries that find security vulnerabilities can be shared so others can run the same checks to help protect their code. CodeQL can be used for static analysis and reactive code inspection across the enterprise.
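The idea of querying code "like a normal database" can be illustrated with a toy example. The sketch below is not CodeQL and does not reflect its actual schema or query language; it simply assumes a hypothetical table of caller/callee facts stored in SQLite, with invented function names, to show how a question about code becomes a database query.

```python
import sqlite3


def build_toy_code_db():
    """Create an in-memory table of (caller, callee) facts.

    This is a hypothetical stand-in for a code model, not CodeQL's real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
    conn.executemany(
        "INSERT INTO calls VALUES (?, ?)",
        [
            ("refresh_inventory", "read_config"),
            ("refresh_inventory", "http_get"),
            ("update_notifier", "compute_hash"),
            ("update_notifier", "http_get"),
            ("render_report", "read_config"),
        ],
    )
    return conn


def callers_of(conn, callee):
    """Return the sorted names of functions that call `callee`."""
    rows = conn.execute(
        "SELECT DISTINCT caller FROM calls WHERE callee = ? ORDER BY caller",
        (callee,),
    )
    return [row[0] for row in rows]


conn = build_toy_code_db()
# Which functions make outbound HTTP requests?
print(callers_of(conn, "http_get"))  # ['refresh_inventory', 'update_notifier']
```

Once the facts are in the database, the same data can answer many different questions without re-parsing the code, which is the property that makes this style of analysis useful for after-the-fact inspection.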

The firm announced it’ll release its SolarWinds CodeQL queries so developers can scan their code for potential compromises.

"We are open sourcing the CodeQL queries that we used in this investigation so that other organizations may perform a similar analysis," it said.

It added that the queries simply serve to “home in on source code that shares similarities with the source in the Solorigate implant, either in the syntactic elements (names, literals, etc.) or in functionality”.
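To make the "syntactic elements" idea concrete, here is a minimal sketch of matching names and literals in source code. It is not one of Microsoft's queries and is not written in CodeQL; it uses Python's standard `ast` module, and the indicator values (`decode_hidden_payload`, `example-c2.invalid`) are invented for illustration only.

```python
import ast

# Hypothetical indicators, invented for this example only.
SUSPECT_NAMES = {"decode_hidden_payload"}
SUSPECT_LITERALS = {"example-c2.invalid"}


def find_indicator_hits(source):
    """Return sorted (line, kind, value) tuples for matching names and string literals."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name in SUSPECT_NAMES:
            # A function defined with a suspicious name.
            hits.append((node.lineno, "name", node.name))
        elif isinstance(node, ast.Name) and node.id in SUSPECT_NAMES:
            # A reference to a suspicious name, such as a call site.
            hits.append((node.lineno, "name", node.id))
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            if node.value in SUSPECT_LITERALS:
                hits.append((node.lineno, "literal", node.value))
    return sorted(hits)


SAMPLE = '''
def decode_hidden_payload(blob):
    return blob[::-1]

HOST = "example-c2.invalid"
print(decode_hidden_payload("abc"))
'''

for line, kind, value in find_indicator_hits(SAMPLE):
    print(f"line {line}: {kind} {value!r}")
```

A match on a name or literal alone says nothing about intent, so a scan like this only produces candidates for a human to review.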

Microsoft has aggregated the CodeQL databases produced by the various build systems or pipelines company-wide to a centralized infrastructure where it can query across the breadth of CodeQL databases at once.

“Aggregating CodeQL databases allows us to search semantically across our multitude of codebases and look for code conditions that may span between multiple assemblies, libraries, or modules based on the specific code that was part of a build. We built this capability to analyze thousands of repositories for newly described variants of vulnerabilities within hours of the variant being described, but it also allowed us to do a first-pass investigation for Solorigate implant patterns similarly, quickly,” Microsoft said.
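Organizations without a centralized infrastructure could approximate a first-pass sweep by running the same query over each database in turn. The sketch below is a much simpler stand-in for what Microsoft describes, not its actual system. It assumes a hypothetical layout in which each repository's CodeQL database sits in its own subdirectory of a common root, and it only builds the `codeql database analyze` command lines rather than executing them; the function names and paths are illustrative.

```python
from pathlib import Path


def find_databases(root):
    """Return subdirectories of `root` that look like CodeQL databases.

    Assumes each database directory contains a `codeql-database.yml` file.
    """
    return sorted(p.parent for p in Path(root).glob("*/codeql-database.yml"))


def build_analyze_commands(root, query_path, out_dir):
    """Build one `codeql database analyze` command per database found under `root`."""
    commands = []
    for db in find_databases(root):
        # One SARIF results file per database, named after its directory.
        sarif = Path(out_dir) / f"{db.name}.sarif"
        commands.append(
            [
                "codeql", "database", "analyze",
                "--format=sarif-latest",
                f"--output={sarif}",
                str(db),
                str(query_path),
            ]
        )
    return commands
```

Each returned list could be passed to `subprocess.run` on a machine with the CodeQL CLI installed. Building the commands separately from running them makes it easy to review what will execute, or to hand the work to a job scheduler.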

Microsoft warned that some CodeQL queries might find similar behavior in benign code, so all “findings will need review to determine if they are actionable.”

You can find the CodeQL queries on GitHub.

Rene Millman

Rene Millman is a freelance writer and broadcaster who covers cybersecurity, AI, IoT, and the cloud. He also works as a contributing analyst at GigaOm and has previously worked as an analyst for Gartner covering the infrastructure market. He has made numerous television appearances to give his views and expertise on technology trends and companies that affect and shape our lives.