Powering The Future Of Artificial Intelligence

AI’s rapid evolution is producing an explosion in new types of hardware accelerators for machine learning and deep learning. Some people refer to this as a “Cambrian explosion,” which is an apt metaphor for the current period of fervent innovation.

The term refers to the period about 500 million years ago when essentially every biological “body plan” among multicellular animals appeared for the first time. From that point onward, these creatures, ourselves included, fanned out to occupy, exploit, and thoroughly transform every ecological niche on the planet.

The range of innovative AI hardware-accelerator architectures continues to expand. Although you may think that graphics processing units (GPUs) are the dominant AI hardware architecture, that is far from the truth.

Over the past several years, both startups and established chip vendors have introduced an impressive generation of new hardware architectures optimized for machine learning, deep learning, natural language processing, and other AI workloads.

Chief among these new AI-optimised chipset architectures, in addition to new generations of GPUs, are neural network processing units (NNPUs), field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and various related approaches that go by the collective name of neurosynaptic architectures.

Today’s AI market has no hardware monoculture equivalent to Intel’s x86 CPU, which once dominated the desktop computing space. That’s because these new AI-accelerator chip architectures are being adapted for highly specific roles in the burgeoning cloud-to-edge ecosystem, such as computer vision.

The evolution of AI-accelerator chips

To understand the rapid evolution taking place in AI-accelerator chips, it’s best to focus on the marketplace opportunities and challenges as follows.

AI tiers

To see how AI accelerators are evolving, look to the edge, where new hardware platforms are being optimised to enable greater autonomy for mobile, embedded, and internet of things (IoT) devices.

Beyond the proliferation of smartphone-embedded AI processors, the most noteworthy innovation at the edge is in AI robotics, which is permeating everything from self-driving vehicles to drones, smart appliances, and industrial IoT.

One of the most noteworthy developments in this regard is Nvidia’s latest enhancements to its Jetson Xavier line of AI systems on a chip (SoCs). Nvidia has released the Isaac software development kit to assist with building robotics algorithms that will run on its dedicated robotics hardware.

Reflecting the complexity of intelligent robotics, the Jetson Xavier chip consists of six processing units, including a 512-core Nvidia Volta Tensor Core GPU, an eight-core Carmel Arm64 CPU, a dual Nvidia deep-learning accelerator, and image, vision, and video processors. These let it handle dozens of algorithms to help robots autonomously sense environments, respond effectively, and operate safely alongside human engineers.

AI tasks

AI accelerators are beginning to permeate every tier in distributed cloud-to-edge, high-performance computing, hyper-converged server, and cloud-storage architectures. A steady stream of fresh hardware innovations is coming to all these segments to support more rapid, efficient, and accurate AI processing.

AI hardware innovations are coming to market to accelerate the specific data-driven tasks of these distinct application environments. The myriad AI chipset architectures on the market reflect the diverse range of machine learning, deep learning, natural language processing, and other AI workloads that range from storage-intensive training to compute-intensive inferencing and involve varying degrees of device autonomy and person-in-the-loop interactivity.

To address the range of workloads that AI chipsets are being used to support, vendors are mixing a wide range of technologies in their product portfolios and even in specific embedded-AI deployments, such as the SoCs that drive intelligent robotics and mobile apps.

As an example, Intel’s Xeon Phi CPU architecture has been used to accelerate AI tasks. But Intel recognizes that it will not be able to keep up without specialized AI accelerator chips that let it compete head-on with Nvidia Volta (in GPUs) and with the legions of vendors building NNPUs and other specialised AI chips. Thus, Intel now has a product team working on a new GPU, to be released in the next two years.

At the same time, it continues to hedge its bets with AI-optimized chipsets in several architectural categories: neural network processors (Nervana), FPGAs (Altera), computer-vision ASICs (Movidius), and autonomous-vehicle ASICs (Mobileye). It also has projects to build self-learning neuromorphic and quantum computing chips for next-generation AI challenges.

AI tolerances

Every AI-acceleration hardware innovation must be survivable in terms of its ability to achieve metrics defined within the relevant operational and economic tolerances.

In operational metrics, every AI chipset must conform to the relevant constraints in terms of form factors, energy efficiency, heat and electromagnetic emission, and ruggedness.

In economic metrics, it must be competitive in performance and cost of ownership for the tiers and tasks into which it’s designed to be deployed. Comparative industry benchmarks will become a key factor in determining whether an AI-accelerator technology has the price-performance profile to survive in a hotly competitive marketplace.
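A rough way to picture how operational and economic tolerances combine is to filter candidate chips by hard limits first and then rank the survivors by price-performance. The following Python sketch is purely illustrative: the `Accelerator` fields, the chip names, and every number are hypothetical and are not drawn from any real benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical record of one chip's benchmark and cost figures."""
    name: str
    throughput: float   # benchmark score, e.g. inferences per second
    power_watts: float  # operational tolerance: power draw
    cost_usd: float     # economic tolerance: cost of ownership


def within_tolerances(chip, max_power_watts, max_cost_usd):
    """Return True if the chip fits the deployment's hard limits."""
    return chip.power_watts <= max_power_watts and chip.cost_usd <= max_cost_usd


def rank_by_price_performance(candidates, max_power_watts, max_cost_usd):
    """Drop chips outside the tolerances, then sort by throughput per dollar."""
    eligible = [
        chip for chip in candidates
        if within_tolerances(chip, max_power_watts, max_cost_usd)
    ]
    return sorted(eligible, key=lambda c: c.throughput / c.cost_usd, reverse=True)


# Made-up figures for three imaginary chips targeting an edge device.
candidates = [
    Accelerator("chip_a", throughput=1000.0, power_watts=30.0, cost_usd=500.0),
    Accelerator("chip_b", throughput=4000.0, power_watts=250.0, cost_usd=4000.0),
    Accelerator("chip_c", throughput=1500.0, power_watts=15.0, cost_usd=600.0),
]

ranked = rank_by_price_performance(candidates, max_power_watts=50.0, max_cost_usd=1000.0)
for chip in ranked:
    print(chip.name, round(chip.throughput / chip.cost_usd, 2))
```

In this toy example the fastest chip is excluded because it exceeds the edge device's power and cost limits, which is the sense in which a chip must be "survivable" within a tier's tolerances rather than simply fastest overall.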

In an industry that’s moving toward workload-optimised AI architectures, users will adopt the fastest, most scalable, most power-efficient and lowest-cost hardware, software, and cloud platforms to run their AI tasks, including development, training, operationalisation, and inferencing, in every tier.

The diversity of AI-accelerator ASICs

AI-accelerator hardware architectures are the opposite of a monoculture. They are so diverse and evolving so rapidly that it’s hard to keep up with the relentless pace of innovation in this market.

Beyond the core AI chipset manufacturers, such as Nvidia and Intel, ASICs for platform-specific AI workloads abound.

You can see this trend in several recent news items:

  • Microsoft is preparing an AI chip for its HoloLens augmented reality headset.
  • Google has a special NNPU, the Tensor Processing Unit, which is available for AI apps on the Google Cloud Platform.
  • Amazon is reportedly working on an AI chip for its Alexa home assistant.
  • Apple is working on an AI processor that will power Siri and FaceID.
  • Tesla is building an AI processor for its self-driving electric cars.

AI-accelerator benchmark frameworks are beginning to emerge

Cross-vendor partnerships in the AI-accelerator market are growing more complicated and overlapping. For example, consider how China-based tech powerhouse Baidu is partnering separately with Intel and Nvidia.

In addition to launching its own NNPU chip for natural language processing, image recognition, and autonomous driving, Baidu is partnering with Intel for FPGA-backed AI-workload acceleration in its public cloud, an AI framework for Xeon CPUs, an AI-equipped autonomous car platform, a computer-vision powered retail camera, and adoption of Intel’s nGraph hardware-agnostic deep neural network compiler.

This is all on the heels of equivalent announcements with Nvidia, including plans to bring Volta GPUs to the Baidu cloud, a tweak to Baidu’s PaddlePaddle AI development framework for Volta, and rollout of Nvidia-powered AI to the Chinese consumer market.

Sorting through this bewildering range of AI-accelerator hardware options, and combinations thereof, both on the cloud and in specialised SoCs, is growing more difficult every day. Isolating the AI-accelerator hardware’s contribution to overall performance on any given task can be tricky without a flexible benchmarking framework.

Fortunately, the AI industry is developing open, transparent, and vendor-agnostic frameworks for benchmarking the comparative performance of different hardware/software stacks running diverse workloads.

MLPerf

For example, the MLPerf open source benchmark group is developing a standard suite for benchmarking the performance of machine learning software frameworks, hardware accelerators, and cloud platforms.

Available on GitHub and currently in a beta version, MLPerf provides reference implementations for some AI tasks that predominate in today’s AI deployments. It scopes the benchmarks to specific AI tasks (such as image classification) performed by specific algorithms (such as ResNet-50 v1) against specific data sets (such as ImageNet).

The core benchmark focuses on specific hardware/software deployments, such as image-classification training jobs running in Ubuntu 16.04, Nvidia Docker, and CPython 2 on platforms built from 16 CPU chips, one Nvidia P100 GPU, and 600 gigabytes of local disk.

The MLPerf framework is flexible enough that the GPU-based image-classification training could conceivably be benchmarked against the same task running on a different hardware accelerator, such as the recently announced Baidu Kunlun chip, but within a substantially equivalent software/hardware stack.
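The idea of scoping a benchmark to a task, an algorithm, and a data set, then swapping only the accelerator, can be sketched in a few lines of Python. This is not MLPerf's code or data format: the `BenchmarkSpec` and `Result` classes, the accelerator labels, and the timings below are all hypothetical, chosen only to show why the rest of the stack has to be held constant before two results are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkSpec:
    """Hypothetical benchmark scope: what is being measured."""
    task: str
    model: str
    dataset: str


@dataclass(frozen=True)
class Result:
    """Hypothetical single benchmark run on one accelerator."""
    spec: BenchmarkSpec
    accelerator: str
    software_stack: str
    minutes_to_target: float  # training time to reach the target quality


def speedup(baseline, candidate):
    """Compare two runs, refusing to compare unlike-for-unlike results."""
    if baseline.spec != candidate.spec:
        raise ValueError("results were measured against different benchmark specs")
    if baseline.software_stack != candidate.software_stack:
        raise ValueError("results were measured on different software stacks")
    return baseline.minutes_to_target / candidate.minutes_to_target


spec = BenchmarkSpec("image classification", "ResNet-50 v1", "ImageNet")
stack = "Ubuntu 16.04 + Docker"

# Made-up timings; only the accelerator differs between the two runs.
gpu_run = Result(spec, "accelerator_x", stack, minutes_to_target=120.0)
alt_run = Result(spec, "accelerator_y", stack, minutes_to_target=80.0)

print(speedup(gpu_run, alt_run))
```

Raising an error when the spec or stack differs is a deliberate design choice here: it makes it harder to attribute a difference to the accelerator when something else in the deployment changed.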

Other AI industry benchmarking initiatives also enable comparative performance evaluations on alternate AI-accelerator chips, as well as of other hardware and software components in deployments addressing the same tasks using the same models against the same training or operational data.

These other benchmarking initiatives include DAWNBench, ReQuEST, the Transaction Processing Performance Council’s AI Working Group, and CEA N2D2. They are all flexible enough to be applied to any AI workload task running in any deployment tier and measured against any economic tolerance.

EEMBC Machine Learning Benchmark Suite

Reflecting the move of AI workloads to the edge, some AI benchmarking initiatives are purely focused on measuring the performance of hardware/software stacks deployed to this tier. For example, the industry alliance EEMBC recently started a new effort to define a benchmark suite for machine learning executing in optimised chipsets running in power-constrained edge devices.

Chaired by Intel, EEMBC’s Machine Learning Benchmark Suite group will use real-world machine learning workloads from virtual assistants, smartphones, IoT devices, smart speakers, IoT gateways, and other embedded/edge systems to identify the performance potential and power efficiency of processor cores used for accelerating machine learning inferencing jobs.

The EEMBC Machine Learning benchmark will measure inferencing performance, neural-net spin-up time, and power efficiency of low-, moderate-, and high-complexity inferencing tasks. It will be agnostic to machine learning front-end frameworks, back-end runtime environments, and hardware-accelerator targets. The group is working on a proof-of-concept and plans to release its initial benchmark suite by June 2019, addressing a range of neural-net architectures and use cases for edge-based inferencing.
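To make the three measured quantities concrete, here is a minimal Python sketch of timing spin-up and per-inference latency for a stand-in model, plus a helper that converts latency and average power into inferences per joule. It is not EEMBC's harness. The `load_model` stub, the `measure` function, and the power figure are assumptions for illustration only, and a real edge benchmark would read power from external instrumentation.

```python
import statistics
import time


def load_model():
    """Stand-in for neural-net initialisation (hypothetical, no real model)."""
    weights = [0.5] * 1000
    return lambda x: sum(w * x for w in weights)


def measure(load, sample, runs=100):
    """Time model spin-up once, then time repeated single inferences."""
    start = time.perf_counter()
    model = load()
    spin_up_s = time.perf_counter() - start

    latencies = []
    for _ in range(runs):
        t0 = time.perf_counter()
        model(sample)
        latencies.append(time.perf_counter() - t0)

    return {
        "spin_up_s": spin_up_s,
        "median_latency_s": statistics.median(latencies),
        "runs": runs,
    }


def inferences_per_joule(median_latency_s, avg_power_watts):
    """Power efficiency: energy per inference is latency times average power."""
    return 1.0 / (median_latency_s * avg_power_watts)


report = measure(load_model, sample=2.0, runs=50)
print(report)

# Assumed figures: 0.5 s per inference at an average draw of 4 W.
print(inferences_per_joule(0.5, 4.0))
```

Reporting the median rather than the mean latency is one common way to keep a few slow outlier runs from dominating the result.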

EEMBC ADASMark benchmarking framework

Addressing a narrower scope of the edge tier and tasks, EEMBC’s ADASMark benchmarking framework focuses on AI-equipped smart vehicles. Separately from its Machine Learning Benchmark effort, EEMBC is developing this performance measurement framework for AI chips embedded in advanced driver assistance systems.

The suite helps measure the performance of AI inferencing tasks executing in multi-device, multichip, multi-application smart-vehicle platforms. It benchmarks real-world inferencing workloads associated with highly parallel smart-vehicle applications, such as computer vision, autonomous driving, automotive surround view, image recognition, and mobile augmented reality. It measures inferencing performance across complex smart-car edge architectures, which usually include multiple specialized CPUs, GPUs, and other hardware-accelerator chipsets performing distinct tasks within a common chassis.

Emerging AI scenarios will require even more specialty chips

Almost certainly, other specialized AI-edge scenarios will emerge that require their own specialized chips, SoCs, hardware platforms, and benchmarks. The next great growth segment in AI chipsets may be in accelerating edge nodes for cryptocurrency mining, a use case that, alongside AI and gaming, has soaked up a lot of demand for Nvidia GPUs.

One vendor specialising in this niche is DeepBrain Chain, which recently announced a computing platform that can be deployed in distributed configurations to power high-performance processing of AI workloads and mining of cryptocurrency tokens. The mining stations come in two-, four-, and eight-GPU configurations, as well as standalone workstations and 128-GPU customized AI HPC clusters.

Before long, we are almost certain to see a new generation of AI ASICs focused on distributed cryptocurrency mining.

Specialised hardware platforms are the future of AI at every tier and for every task in the cloud-to-edge world in which we live.

InfoWorld
