ARM unveils processor design with dedicated machine learning capabilities

An attempt at making its chips the standard platform for machine learning in mobile and internet of things devices


Chip designer ARM has announced it is now offering its partners processors with dedicated machine learning capabilities.

Dubbed Project Trillium, the processor is ARM's attempt at making its chips the standard platform for machine learning in mobile and internet of things (IoT) devices. The company describes it as "the most efficient solution" for running neural networks.

"[Our] Machine Learning processor is an optimised, ground-up design for machine learning acceleration, targeting mobile and adjacent markets," ARM said. "The solution consists of state-of-the-art optimised fixed-function engines to provide best-in-class performance within a constrained power envelope."

The launch of the machine learning chip, aimed at general AI workloads, coincides with that of a new object detection chip that specialises in detecting faces, people and their gestures in moving images, even in full HD footage running at up to 60 frames per second.

This is actually the second generation of ARM's object-detection chip; its predecessor ran in Hive's smart security camera. ARM hopes OEMs will pair this updated version with its machine learning chip: the object-detection chip would find faces or objects in an image or video, then pass that information to the machine learning chip, which would perform the actual face or image recognition.
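The two-stage division of labour described above can be sketched in code. This is a purely illustrative mock-up, assuming a simple detect-then-recognise flow; none of the names correspond to a real ARM API.

```python
# Hypothetical sketch of the detect-then-recognise pipeline: a detection
# stage finds regions of interest in a frame, then a recognition stage
# classifies each region. All names here are illustrative assumptions.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    """A detected region: bounding box plus a detector confidence score."""
    box: Tuple[int, int, int, int]  # (x, y, width, height)
    confidence: float


def detect_objects(frame) -> List[Region]:
    """Stand-in for the object-detection chip (runs on every frame)."""
    # A real detector would scan the frame; here we return a fixed region.
    return [Region(box=(10, 20, 64, 64), confidence=0.92)]


def recognise(frame, region: Region) -> str:
    """Stand-in for the machine learning chip that identifies the region."""
    return "face" if region.confidence > 0.5 else "unknown"


def process_frame(frame) -> List[str]:
    # Detection output feeds the recognition stage, as the article describes.
    return [recognise(frame, region) for region in detect_objects(frame)]


print(process_frame(frame=None))  # → ['face']
```

The point of the split is that the cheap detection stage filters the frame, so the heavier recognition workload only runs on the regions that matter.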


ARM also said that the Project Trillium chips feature onboard memory that provides central storage for weights and feature maps, reducing traffic to external memory and, therefore, power consumption.

"[An] additional programmable layer engine supports the execution of non-convolution layers, and the implementation of selected primitives and operators, along with future innovation and algorithm generation," the firm explained, adding that there's also a network control unit which manages the overall execution and "traversal of the network" while the DMA moves data in and out of the main memory.

The firm stressed that the new machine learning chips are not meant for training machine learning models, but for running them at the edge. The aim is to offer mobile performance of 4.6 teraops at an efficiency of 3 teraops per watt, figures ARM said it expects could improve with additional optimisations.
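A quick back-of-the-envelope check shows what those two figures imply for power draw: dividing peak throughput by efficiency gives the implied power budget at full load.

```python
# Implied power draw from ARM's quoted figures:
# peak throughput (teraops) / efficiency (teraops per watt) = watts.

peak_teraops = 4.6   # claimed mobile performance, in teraops
teraops_per_watt = 3.0  # claimed efficiency

implied_power_watts = peak_teraops / teraops_per_watt
print(round(implied_power_watts, 2))  # → 1.53
```

In other words, the quoted numbers work out to roughly a 1.5 W power envelope at peak, which is consistent with the mobile and IoT devices the chips target.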

Expect to see ARM's new AI-focused chips offered to its partners by the summer, and in the first consumer devices around this time next year.
