Microsoft's AI-powering database of celebrity faces has been taken down

The collection of millions of images has been linked with unethical projects in China

Facial recognition

Microsoft has quietly pulled its facial recognition database of 10 million images of 100,000 people's faces offline.

MS Celeb was published in 2016 and contained images of faces gleaned from the internet, which were used to train recognition algorithms. The images were harvested from search engines and were included on the basis that they had been uploaded under a Creative Commons license.

"The rich information provided by the knowledge base helps to conduct disambiguation and improve the recognition accuracy, and contributes to various real-world applications, such as image captioning and news video analysis," said Microsoft at the time.

The accompanying research paper said the database was originally supposed to include only images of celebrities, but according to researcher Adam Harvey's Megapixels project, the term 'celebrity' was used quite broadly.

"While the majority of people in this dataset are American and British actors, the exploitative use of the term 'celebrity' extends far beyond Hollywood," said Harvey. "Many of the names in the MS Celeb face recognition dataset are merely people who must maintain an online presence for their professional lives: journalists, artists, musicians, activists, policy makers, writers, and academics.


"Many people in the target list are even vocal critics of the very technology Microsoft is using their name and biometric information to build."

Aside from developing facial recognition algorithms, the database had other applications. Military researchers harnessed the large dataset, as did Chinese AI and facial recognition companies SenseTime and Megvii, according to the Financial Times.

The database was also reportedly linked with startups in China that build AI algorithms to profile and track ethnic minorities, mainly Muslims.

China's pervasive surveillance camera network has come under scrutiny since its inception, as has its social credit system, but the discovery of the profiling and tracking of the Uighurs was a first for the country.

Although it has been taken offline, traces of the database still exist on the web and are freely available to download on GitHub, along with many other databases filled with millions of images.


The facial recognition industry is mired in controversy: the technology is frequently shown to be inaccurate, in some cases showing racial and gender bias. Other notable cases include the NYPD's clumsy use of the technology, in which it used celebrity lookalikes to search its database for real criminals.

Using publicly available images to fill databases has also caused a stir in recent months. Notably, a database used by IBM contained one million faces gleaned from image hosting site Flickr, which prompted privacy concerns.
