GitGuardian, the security startup hunting down online secrets to keep companies safe from hackers

More than 3,000 company credentials unwittingly end up online everyday. GitGuardian helps firms plug these leaks

When the login details of an Uber engineer were exposed in 2016 – signalling one of the most high-profile breaches of recent years – the names and addresses of 57 million riders and drivers were left at the mercy of hackers. 

None of Uber’s corporate systems had been directly breached, though. Its security infrastructure was working as it should. Instead, the credentials were found buried within the code of an Uber developer’s personal GitHub account. This account and its repositories were hacked, reportedly due to poor password hygiene and the stolen credentials used to access Uber’s vast datastore. This breach, which Uber sat on for a year, resulted in a then-record-breaking $148 million fine.

Yet despite this public lesson in how not to handle private credentials, so-called company secret leakage is an everyday occurrence

The rise of secret leakage

Research from North Carolina State University found that in just six months between October 2017 and April 2018, more than half a million secrets were uploaded to GitHub repositories, including sensitive login details, access keys, auth tokens and private files. A 2019 SANS Institute survey found that half of company data breaches in the past 12 months were a result of credential hacking – higher than any other attack method among firms using cloud-based services. 

This is where GitGuardian comes in.

Founded in 2017 by Jérémy Thomas and Eric Fourrier – a pair of applied mathematics graduates and software engineers specialising in data science, machine learning and AI – the Paris-based cybersecurity startup uses a combination of algorithms, including pattern matching and machine learning, to hunt for signs of company secrets in online code. According to the company’s figures, more than a staggering 3,000 secrets make their way online every day.

“The idea for GitGuardian came when Eric and I spotted a vulnerability buried in a GitHub repository,” CEO and co-founder Thomas tells IT Pro. “This vulnerability involved sensitive credentials relating to a major company being leaked online that had the potential to cost the firm tens of millions of dollars if they had got into the wrong hands. We alerted the company to the vulnerability and it was able to nullify it in less than a week.” 

“We then built an algorithm and real-time monitoring platform that automated and significantly built-upon the manual steps we took when we made that initial detection, and this platform attracted interest from GitHub’s own Scott Chacon as well as Solomon Hykes from Docker and Renaud Visage from EventBrite.” 

How the cloud is fuelling secret leakage

The problem of sensitive data leakage stems in part from the increasing reliance of software developers on third-party services. To integrate such services, developers often juggle hundreds of credentials with varying sensitivity, from API keys used to provide mapping features on websites to Amazon Web Services login details, and private cryptographic keys for servers. Not to mention the many secrets designed to protect data, surrounding payment systems, intellectual property and more. 

In the process of handling these integrations, more than 40 million developers and almost 3 million businesses and organisations globally use GitHub, the public platform that lets developers share code and collaboratively work on projects. Either by accident (in the majority of cases), or occasionally knowingly, these uploads have company secrets buried within them alongside the code that’s being developed. As was seen with the Uber breach, hackers can theoretically scour this code, steal credentials and hack company accounts all without the developer and their employer being any the wiser.

How GitGuardian plugs these leaks

GitGuardian’s technology works by first linking developers registered on GitHub to their respective companies. This already gives the company greater insight over who their developers are on GitHub and the levels of public activity they’re involved in. This is especially important for developers’ personal repositories because they’re completely out of their companies’ control, yet too often contain corporate credentials. 

Once linked, GitGuardian’s algorithms scrutinise any and all code changes, known as commits, made by these developers in real-time, looking for signs of company secrets. Such signs within these commits range from code patterns to file types that have previously been found to contain credentials.  

“Our algorithms scan the content of more than 2.5 million commits a day, covering over 300 types of secrets from keys to database connection strings, SSL certificates, usernames and passwords,” Thomas continues.

Once a leak occurs, it takes four seconds for GitGuardian to detect it and send an alert to the developer and their security team. On average, the information is removed within 25 minutes and the credential is revoked within the hour. For every alert, GitGuardian seeks feedback from its developers and security teams who rate the accuracy of the detection: were company secrets actually exposed or was it a false positive? Consequently, the algorithm is constantly evolving in response to new secrets and how they are leaked.

This seems like a simple premise, even if the technology behind it is far from simple. But what’s to stop a hacker building a similar algorithm to intercept the secrets before GitGuardian’s platform spots it? 

“GitGuardian is indeed competing with individual black hat hackers, as well as organised criminal groups,” Thomas explains. “We constantly improve our algorithms to be quicker and smarter than they are, and to be able to detect a wider scope of vulnerabilities, which requires a dedicated, highly skilled team.

“We're helped in this by our users and customers who give us feedback – at scale – that we reinject into our algorithms. Our white hat approach allows us to collect feedback and this gives us a tremendous edge over black hats. You can see this as the unfair advantage you get by doing good.”

GitGuardian has already supported global government organisations, more than 100 Fortune 500 companies and 400,000 individual developers. It’s now setting its sights on adding even more developers and companies to its platform to further improve its algorithm, and extend this technology for use on private sites. 

“We started GitGuardian by tackling secrets in source code and private sites,” concludes Thomas. “Our ambition really is to be developers’ and cybersecurity professionals’ best friend when it comes to securing the vulnerability area that is emerging due to modern software development techniques [and] we’re on the road to doing this.”

Featured Resources

How to scale your organisation in the cloud

How to overcome common scaling challenges and choose the right scalable cloud service

Download now

The people factor: A critical ingredient for intelligent communications

How to improve communication within your business

Download now

Future of video conferencing

Optimising video conferencing features to achieve business goals

Download now

Improving cyber security for remote working

13 recommendations for security from any location

Download now

Recommended

GitHub walks back Jewish employee’s termination for using “Nazi” on Slack
Business operations

GitHub walks back Jewish employee’s termination for using “Nazi” on Slack

19 Jan 2021
So you want to work for a tech startup? It’s not all ping-pong tables and yoga
startups

So you want to work for a tech startup? It’s not all ping-pong tables and yoga

22 Dec 2020
The IT Pro Podcast: Is the sun setting on Silicon Valley?
Business strategy

The IT Pro Podcast: Is the sun setting on Silicon Valley?

18 Dec 2020
UK tech startups to watch in 2021
startups

UK tech startups to watch in 2021

15 Dec 2020

Most Popular

Star Alliance passenger data stolen in SITA data breach
data breaches

Star Alliance passenger data stolen in SITA data breach

5 Mar 2021
I went shopping at Amazon’s till-less supermarket so that you don’t have to
automation

I went shopping at Amazon’s till-less supermarket so that you don’t have to

5 Mar 2021
How to find RAM speed, size and type
Laptops

How to find RAM speed, size and type

26 Feb 2021