Powering The Future Of Artificial Intelligence

AI’s rapid evolution is producing an explosion in new types of hardware accelerators for machine learning and deep learning. Some people refer to this as a “Cambrian explosion,” which is an apt metaphor for the current period of fervent innovation.

The term refers to the period more than 500 million years ago when essentially every biological “body plan” among multicellular animals appeared for the first time. From that point onward, these creatures, ourselves included, fanned out to occupy, exploit, and thoroughly transform every ecological niche on the planet.

The range of innovative AI hardware-accelerator architectures continues to expand. Although you may think that graphics processing units (GPUs) are the dominant AI hardware architecture, that is far from the truth.

Over the past several years, both startups and established chip vendors have introduced an impressive new generation of hardware architectures optimized for machine learning, deep learning, natural language processing, and other AI workloads.

Chief among these new AI-optimised chipset architectures, in addition to new generations of GPUs, are neural network processing units (NNPUs), field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and various related approaches that go by the collective name of neurosynaptic architectures.

Today’s AI market has no hardware monoculture equivalent to Intel’s x86 CPU, which once dominated the desktop computing space. That’s because these new AI-accelerator chip architectures are being adapted for highly specific roles in the burgeoning cloud-to-edge ecosystem, such as computer vision.

The evolution of AI-accelerator chips

To understand the rapid evolution taking place in AI-accelerator chips, it helps to focus on three sets of marketplace opportunities and challenges: tiers, tasks, and tolerances.

AI tiers

To see how AI accelerators are evolving, look to the edge, where new hardware platforms are being optimised to enable greater autonomy for mobile, embedded, and internet of things (IoT) devices.

Beyond the proliferation of smartphone-embedded AI processors, a major area of innovation in this regard is AI robotics, which is permeating everything from self-driving vehicles to drones, smart appliances, and industrial IoT.

One of the most noteworthy developments in this regard is Nvidia’s latest set of enhancements to its Jetson Xavier line of AI systems on a chip (SoCs). Nvidia has released the Isaac software development kit to assist with building robotics algorithms that will run on its dedicated robotics hardware.

Reflecting the complexity of intelligent robotics, the Jetson Xavier chip consists of six processing units, including a 512-core Nvidia Volta Tensor Core GPU, an eight-core Carmel Arm64 CPU, a dual Nvidia deep-learning accelerator, and image, vision, and video processors. These let it handle dozens of algorithms to help robots autonomously sense environments, respond effectively, and operate safely alongside human engineers.

AI tasks

AI accelerators are beginning to permeate every tier in distributed cloud-to-edge, high-performance computing, hyper-converged server, and cloud-storage architectures. A steady stream of fresh hardware innovations is coming to all these segments to support more rapid, efficient, and accurate AI processing.

AI hardware innovations are coming to market to accelerate the specific data-driven tasks of these distinct application environments. The myriad AI chipset architectures on the market reflect the diverse range of machine learning, deep learning, natural language processing, and other AI workloads that range from storage-intensive training to compute-intensive inferencing and involve varying degrees of device autonomy and person-in-the-loop interactivity.

To address the range of workloads that AI chipsets are being used to support, vendors are mixing a wide range of technologies in their product portfolios and even in specific embedded-AI deployments, such as the SoCs that drive intelligent robotics and mobile apps.

As an example, Intel’s Xeon Phi CPU architecture has been used to accelerate AI tasks. But Intel recognizes that it will not be able to keep up without specialized AI accelerator chips that let it compete head-on with Nvidia Volta (in GPUs) and with the legions of vendors building NNPUs and other specialised AI chips. Thus, Intel now has a product team working on a new GPU, to be released in the next two years.

At the same time, it continues to hedge its bets with AI-optimized chipsets in several architectural categories: neural network processors (Nervana), FPGAs (Altera), computer-vision ASICs (Movidius), and autonomous-vehicle ASICs (Mobileye). It also has projects to build self-learning neuromorphic and quantum computing chips for next-generation AI challenges.

AI tolerances

Every AI-acceleration hardware innovation must be survivable, in the sense that it can meet the metrics defined by the relevant operational and economic tolerances.

In operational metrics, every AI chipset must conform to the relevant constraints in terms of form factors, energy efficiency, heat and electromagnetic emission, and ruggedness.

In economic metrics, it must be competitive in performance and cost of ownership for the tiers and tasks into which it’s designed to be deployed. Comparative industry benchmarks will become a key factor in determining whether an AI-accelerator technology has the price-performance profile to survive in a hotly competitive marketplace.

In an industry that’s moving toward workload-optimised AI architectures, users will adopt the fastest, most scalable, most power-efficient and lowest-cost hardware, software, and cloud platforms to run their AI tasks, including development, training, operationalisation, and inferencing, in every tier.
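The two kinds of tolerance described above can be sketched in a few lines of Python. This is only an illustration, not any vendor's or benchmark group's method: the `Accelerator` fields, the chip names, and every number below are hypothetical placeholders. The sketch first drops chips that exceed an operational limit (here, a power budget) and then ranks the survivors on one economic metric, throughput per dollar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    name: str
    throughput: float   # inferences per second on the target workload (made up)
    power_watts: float  # sustained power draw (made up)
    cost_usd: float     # cost of ownership over the evaluation period (made up)


def rank_by_price_performance(chips, max_power_watts):
    """Apply the operational tolerance first, then rank on the economic one."""
    eligible = [c for c in chips if c.power_watts <= max_power_watts]
    return sorted(eligible, key=lambda c: c.throughput / c.cost_usd, reverse=True)


# Hypothetical candidates for a power-constrained edge deployment.
candidates = [
    Accelerator("chip_a", throughput=1000.0, power_watts=30.0, cost_usd=500.0),
    Accelerator("chip_b", throughput=4000.0, power_watts=250.0, cost_usd=4000.0),
    Accelerator("chip_c", throughput=600.0, power_watts=10.0, cost_usd=200.0),
]

ranked = rank_by_price_performance(candidates, max_power_watts=50.0)
for chip in ranked:
    print(chip.name, chip.throughput / chip.cost_usd)
```

With a 50-watt budget, the fastest chip in absolute terms is excluded before price-performance is even considered, which is the sense in which a chip has to fit the tier it is deployed into. A real evaluation would weigh more constraints, such as form factor, heat, and ruggedness.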

The diversity of AI-accelerator ASICs

AI-accelerator hardware architectures are the opposite of a monoculture. They are so diverse and evolving so rapidly that it’s hard to keep up with the relentless pace of innovation in this market.

Beyond the core AI chipset manufacturers, such as Nvidia and Intel, ASICs for platform-specific AI workloads abound.

You can see this trend in several recent news items:

  • Microsoft is preparing an AI chip for its HoloLens augmented reality headset.
  • Google has a special NNPU, the Tensor Processing Unit, which is available for AI apps on the Google Cloud Platform.
  • Amazon is reportedly working on an AI chip for its Alexa home assistant.
  • Apple is working on an AI processor that will power Siri and FaceID.
  • Tesla is building an AI processor for its self-driving electric cars.

AI-accelerator benchmark frameworks are beginning to emerge

Cross-vendor partnerships in the AI-accelerator market are growing more complicated and overlapping. For example, consider how China-based tech powerhouse Baidu is partnering separately with Intel and Nvidia.

In addition to launching its own NNPU chip for natural language processing, image recognition, and autonomous driving, Baidu is partnering with Intel for FPGA-backed AI-workload acceleration in its public cloud, an AI framework for Xeon CPUs, an AI-equipped autonomous car platform, a computer-vision powered retail camera, and adoption of Intel’s nGraph hardware-agnostic deep neural network compiler.

This is all on the heels of equivalent announcements with Nvidia, including plans to bring Volta GPUs to the Baidu cloud, a tweak to Baidu’s PaddlePaddle AI development framework for Volta, and rollout of Nvidia-powered AI to the Chinese consumer market.

Sorting through this bewildering range of AI-accelerator hardware options, and combinations thereof, both on the cloud and in specialised SoCs, is growing more difficult every day. Isolating the AI-accelerator hardware’s contribution to overall performance on any given task can be tricky without a flexible benchmarking framework.

Fortunately, the AI industry is developing open, transparent, and vendor-agnostic benchmarking frameworks for evaluating the comparative performance of different hardware/software stacks in the running of diverse workloads.

MLPerf

For example, the MLPerf open source benchmark group is developing a standard suite for benchmarking the performance of machine learning software frameworks, hardware accelerators, and cloud platforms.

Available on GitHub and currently in a beta version, MLPerf provides reference implementations for some AI tasks that predominate in today’s AI deployments. It scopes the benchmarks to specific AI tasks (such as image classification) performed by specific algorithms (such as ResNet-50 v1) against specific data sets (such as ImageNet).

The core benchmark focuses on specific hardware/software deployments, such as image-classification training jobs running in Ubuntu 16.04, Nvidia Docker, and CPython 2 on platforms built from 16 CPU chips, one Nvidia P100 GPU, and 600 gigabytes of local disk.

The MLPerf framework is flexible enough that the GPU-based image-classification training could conceivably be benchmarked against the same task running on a different hardware accelerator, such as the recently announced Baidu Kunlun chip, but within a substantially equivalent software/hardware stack.
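A minimal Python sketch of that scoping idea follows. It is not MLPerf's code or API: `BenchmarkSpec`, `BenchmarkResult`, and `speedup` are hypothetical names, and the timings are placeholder numbers rather than measurements. The point is only that two results are compared when the task, model, data set, and software stack match, so that the accelerator is the one thing that varies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkSpec:
    task: str
    model: str
    dataset: str
    software_stack: tuple  # e.g. OS, container runtime, language runtime


@dataclass(frozen=True)
class BenchmarkResult:
    spec: BenchmarkSpec
    accelerator: str
    seconds_to_target: float  # wall-clock time to reach the target quality


def speedup(baseline, candidate):
    """How many times faster the candidate is, under an identical spec."""
    if baseline.spec != candidate.spec:
        raise ValueError("results are only comparable under the same spec")
    return baseline.seconds_to_target / candidate.seconds_to_target


spec = BenchmarkSpec(
    task="image classification",
    model="ResNet-50 v1",
    dataset="ImageNet",
    software_stack=("Ubuntu 16.04", "CPython 2"),
)

# Placeholder timings for illustration only; these are not real results.
gpu_run = BenchmarkResult(spec, accelerator="gpu", seconds_to_target=7200.0)
other_run = BenchmarkResult(spec, accelerator="other_chip", seconds_to_target=3600.0)

print(speedup(gpu_run, other_run))
```

Refusing to compare results with different specs is a deliberately strict choice. In practice, stacks on different accelerators are only "substantially equivalent", so a real framework would need rules for which differences are allowed.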

Other AI industry benchmarking initiatives also enable comparative performance evaluations of alternative AI-accelerator chips, as well as of other hardware and software components, in deployments addressing the same tasks using the same models against the same training or operational data.

These other benchmarking initiatives include DAWNBench, ReQuEST, the Transaction Processing Performance Council’s AI Working Group, and CEA N2D2. They are all flexible enough to be applied to any AI workload task running in any deployment tier and measured against any economic tolerance.

EEMBC Machine Learning Benchmark Suite

Reflecting the move of AI workloads to the edge, some AI benchmarking initiatives are purely focused on measuring the performance of hardware/software stacks deployed to this tier. For example, the industry alliance EEMBC recently started a new effort to define a benchmark suite for machine learning executing in optimised chipsets running in power-constrained edge devices.

Chaired by Intel, EEMBC’s Machine Learning Benchmark Suite group will use real-world machine learning workloads from virtual assistants, smartphones, IoT devices, smart speakers, IoT gateways, and other embedded/edge systems to identify the performance potential and power efficiency of processor cores used for accelerating machine learning inferencing jobs.

The EEMBC Machine Learning benchmark will measure inferencing performance, neural-net spin-up time, and power efficiency of low-, moderate-, and high-complexity inferencing tasks. It will be agnostic to machine learning front-end frameworks, back-end runtime environments, and hardware-accelerator targets. The group is working on a proof-of-concept and plans to release its initial benchmark suite by June 2019, addressing a range of neural-net architectures and use cases for edge-based inferencing.
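The three quantities named above (inferencing performance, spin-up time, and power efficiency) can be sketched in Python as follows. This is not EEMBC's harness. The function name `benchmark_inference`, its parameters, and the toy model are assumptions made for illustration, and the average power draw is taken as an input because measuring it requires external instrumentation.

```python
import time


def benchmark_inference(load_model, inputs, avg_power_watts, clock=time.perf_counter):
    """Time model spin-up and a batch of inferences, then derive simple metrics.

    load_model: hypothetical zero-argument callable returning a callable model.
    avg_power_watts: average power during the run, measured externally (assumed).
    """
    if not inputs:
        raise ValueError("need at least one input")

    start = clock()
    model = load_model()
    spin_up_s = clock() - start  # time to get the network ready to run

    start = clock()
    for x in inputs:
        model(x)
    run_s = clock() - start  # total time spent on inference

    n = len(inputs)
    return {
        "spin_up_s": spin_up_s,
        "latency_s": run_s / n,
        "throughput_per_s": n / run_s,
        # joules = watts * seconds, so this is work done per unit of energy
        "inferences_per_joule": n / (avg_power_watts * run_s),
    }


def load_toy_model():
    # Stand-in for loading a real network; it just does some arithmetic.
    return lambda x: sum(i * x for i in range(1000))


print(benchmark_inference(load_toy_model, list(range(100)), avg_power_watts=5.0))
```

The `clock` parameter exists so the timing source can be swapped out, for example for a hardware timer on an embedded target or a fake clock in a test. A real suite would also repeat runs and separate low-, moderate-, and high-complexity workloads.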

EEMBC Adasmark benchmarking framework

Addressing a narrower scope of the edge tier and tasks, EEMBC’s Adasmark benchmarking framework focuses on AI-equipped smart vehicles. Separate from its Machine Learning Benchmark effort, EEMBC is developing this performance measurement framework for AI chips embedded in advanced driver assistance systems.

The suite helps measure the performance of AI inferencing tasks executing in multi-device, multichip, multi-application smart-vehicle platforms. It benchmarks real-world inferencing workloads associated with highly parallel smart-vehicle applications, such as computer vision, autonomous driving, automotive surround view, image recognition, and mobile augmented reality. It measures inferencing performance across complex smart-car edge architectures, which usually include multiple specialized CPUs, GPUs, and other hardware-accelerator chipsets performing distinct tasks within a common chassis.

Emerging AI scenarios will require even more specialty chips

Almost certainly, other specialized AI-edge scenarios will emerge that require their own specialized chips, SoCs, hardware platforms, and benchmarks. The next great growth segment in AI chipsets may be in accelerating edge nodes for cryptocurrency mining, a use case that, alongside AI and gaming, has soaked up a lot of demand for Nvidia GPUs.

One vendor specialising in this niche is DeepBrain Chain, which recently announced a computing platform that can be deployed in distributed configurations to power high-performance processing of AI workloads and mining of cryptocurrency tokens. The mining stations come in two-, four-, and eight-GPU configurations, as well as standalone workstations and 128-GPU customized AI HPC clusters.

Before long, we are almost certain to see a new generation of AI ASICs focused on distributed cryptocurrency mining.

Specialised hardware platforms are the future of AI at every tier and for every task in the cloud-to-edge world in which we live.

InfoWorld
