Brainstorming the next generation of computing
In the spring of 2020, as the exponential growth of the COVID-19 pandemic was capturing the attention of scientists worldwide, artificial intelligence engineers saw a rare opportunity. The spreading disease was a constant source of varied and ever-expanding datasets on which to train machines, from chest x-rays of asymptomatic patients to positivity and survival rates across populations. The demand for high-quality, predictive data analysis without direct human contact had never been higher.
Deep learning, a form of artificial intelligence in which layers of software neurons team up to extract knowledge from huge datasets and apply it to new scenarios, was already the star of many computer-assisted diagnosis (CAD) systems. Applying deep learning to chest x-ray scans to help diagnose COVID-19 infections was a natural progression of this trend. Deep learning proposals for COVID CAD emerged as early as March, and other AI applications for pandemic management soon followed, including remote infrared fever detection, self-driving disinfecting robots, and even mask-proof facial recognition.
Now, over a year into the pandemic, the less favorable impacts of this tidal wave of deep learning projects are becoming apparent, especially their huge energy footprints. Real-time computer vision, especially for moving images, requires dedicated, expensive, and energy-intensive electronic hardware to mimic the efficient visual processing of living brains. While the human brain runs on roughly a fifth of the body's resting energy budget, a 2019 paper reports that training a single large deep learning model can emit nearly five times the lifetime carbon footprint of the average American car, including its manufacture.
To address this outsized energy requirement, many researchers have turned to revolutionary approaches to computation. One of these is neuromorphic design, an approach that treats the living brain as the ideal computing machine and attempts to mimic its speed, flexibility, and efficiency by reproducing its structural elements in hardware. These principles sharply contrast with the architectures that have dominated computing since the 1950s. Where classical computers are limited by the "von Neumann bottleneck," the maximum rate of signal transfer between the processor and the memory unit, the brain stores much of its learned data in the synapses themselves, each of which is altered slightly by every spike to make the next operation more efficient. This integrated memory is distributed throughout the brain, allowing processing to happen in many locations at once, a kind of parallelism that a single processor-memory channel cannot physically match.
To support more brainlike architectures, new neuromorphic hardware devices have been proposed and tested to play the roles of neurons, synapses, and sensors. Some of the most promising solutions harness the natural parallelism, power efficiency, and time-dependency of photonics. Optical elements, from lasers to photo-tunable sensors, not only vary widely, but are customizable to the exact needs of the system. They are especially well-suited to mimic visual processing functions, in which the raw data is already in the optical domain. In all cases, however, the advantages of going optical are consistent: faster or comparable computation speeds with low loss and a fraction of the carbon footprint.
Putting the optics back in visual computing
Copying the brain's strategies for processing streams of visual information has proven an effective way for deep-learning algorithms to approximate visual intelligence. However, most forms of deep learning are extremely taxing: The training data for a visual processing program typically consists of hundreds to thousands of frames, depending on the task, each of which contains thousands to millions of pixels in a fixed arrangement.
One of the most intuitive ways to introduce photonics into a computer vision system is to give the processor an optical upgrade. An international research team led by Australia's Swinburne University of Technology recently developed a convolutional accelerator designed to supercharge the processing power of an optical neural network (ONN). ONNs produce convolutions with remarkable speed and efficiency: they are fully connected structures that take no shortcuts along the convolutional pipeline, and the volume of input they can handle is typically limited only by their hardware.
The accelerator delivered a record-breaking increase in computing speed. Paired with another neural network based on the same architecture, the convolutional accelerator was able to sort images of handwritten digits, 0-9, at a rate of nearly 11 trillion operations per second—almost seven orders of magnitude faster than a human brain could accomplish the task, and over 1,000 times faster than leading all-electronic models. A press release described the microcomb-supported ONN as "an enormous leap forward for neural networks and neuromorphic processing in general."
Convolutions are operations common in deep learning, inspired by the way the visual cortex processes images. Convolutional neural networks (CNNs) use filters to scan for distinctive features in a given patch of an image. Each neuron in the first layer checks the input image, patch by patch, evaluates how well it matches a filter, and then passes that evaluation onto the next neuronal layer to be compared to results from neighboring patches. CNNs build up knowledge of an image from the most granular, or pixelated, level, much as the frontline neurons in the retina detect low-level features like edges and dark spots before the higher neuronal layers can piece together the whole shape or meaning of an object in the visual field.
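The patch-by-patch matching described above can be sketched in a few lines. This is a minimal, generic 2D convolution, not code from any of the systems in this article; the 3x3 filter and the tiny 5x5 test image are illustrative values.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over every patch of the image (valid mode)
    and record how strongly each patch matches the filter."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # match score for this patch
    return out

# A vertical-edge filter: responds where brightness changes left to right.
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]])

image = np.zeros((5, 5))
image[:, 2:] = 1.0  # right half bright, left half dark

feature_map = convolve2d(image, vertical_edge)
```

The feature map lights up exactly where the patch straddles the dark-to-bright boundary, which is the low-level knowledge a first CNN layer passes upward.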
However, convolutions are not "decisions," explains David Moss, director of the Optical Sciences Centre at Swinburne University of Technology. They are more like data cleaners, prepping datasets for another network to judge. "The convolutional accelerator takes data from any source and acts as a universal frontend to any neural network," says Moss. "It takes the data, distills it, simplifies it, and then that can be used by a neural network to make decisions."
Most CNNs require some form of external processing power, such as additional GPUs or other dedicated hardware. The Swinburne project supplements a deep learning neural network with a soliton crystal microcomb, a type of Kerr comb, which translates radio frequency signals to pulses in the optical domain (and vice versa). First demonstrated in 2007 by Tobias Kippenberg's group, Kerr combs marked the first time frequency combs were generated on an integrated photonic chip. "That's when the field really exploded," recalls Moss. "Since then, it's been a challenge to keep up."
Using light waves to identify human movement
One of the most valuable aspects of photonic hardware is its inherent timekeeping ability. Because optical signals propagate at a fixed speed, delays and frequencies in an optical system encode timing directly, letting optical elements treat time-bound data much as they would data laid out in a spatial dimension.
Piotr Antonik, an associate professor with the Laboratoire Matériaux Optiques, Photonique et Systèmes (LMOPS) department at CentraleSupélec in Metz, France, set out to prove that neural networks can be built out of repeating units of photonic hardware. "We had just learned about reservoir computing and were trying to make progress towards a fully optical system using that paradigm," recalls Antonik.
Reservoir computers, also called shallow recurrent neural networks or echo-state networks, reduce the mathematical complexity of training recurrent networks by training only the output layer; the recurrent core, or "reservoir," is left fixed and retains a fading memory of recent inputs. Antonik's team first tested their optoelectronic reservoir computing system on static image recognition, but the task was too easy to truly display the network's capabilities.
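The echo-state idea can be sketched in software. This is a generic toy, not Antonik's optoelectronic system: the reservoir weights, sizes, and the one-step-delay task are all illustrative, and the "training" is just a least-squares fit of the readout.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_reservoir = 1, 50
W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs))  # fixed input weights
W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))  # fixed recurrent weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))  # scale spectral radius below 1

def run_reservoir(inputs):
    """Drive the fixed reservoir with an input sequence, collecting states."""
    x = np.zeros(n_reservoir)
    states = []
    for u in inputs:
        x = np.tanh(W_in @ np.atleast_1d(u) + W @ x)
        states.append(x.copy())
    return np.array(states)

# Toy temporal task: reproduce the input delayed by one step.
u = rng.uniform(-1, 1, 300)
target = np.roll(u, 1)
states = run_reservoir(u)

# Train ONLY the readout: least squares from reservoir states to target.
W_out, *_ = np.linalg.lstsq(states[10:], target[10:], rcond=None)
pred = states[10:] @ W_out
```

Because only `W_out` is fitted, training reduces to a single linear regression, which is what makes the paradigm attractive for hardware reservoirs whose internal dynamics cannot be tuned.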
"The idea of reservoir computing, the basis of our neuromorphic computation, is that it's a recurrent network. A recurrent network is basically a dynamic system, a system that has temporal dynamics, that evolves in time," explains Antonik. To highlight the potential of the optical hardware would require a task that incorporated the time dimension somehow. "It makes more sense to do video recognition from our system."
For time-bound data, Antonik turned to a public video library of 600 short videos from KTH Royal Institute of Technology in Sweden. Each clip features one of 25 human subjects performing one of six basic actions (running, jogging, walking, boxing, hand waving, or hand clapping) in a variety of indoor and outdoor settings. "The goal of the task is to show the video to the system, make some preprocessing feature extraction, and then make the system recognize which of the six actions the person is doing in the video."
Antonik paired the optical arm of the experiment with a field-programmable gate array (FPGA), a flexible semiconductor integrated circuit configured for their particular optoelectronic setup. "We were using a relatively slow optical experiment with a very fast electronic chip, and we were able to run them in parallel," says Antonik. "It was like a symbiosis, a cooperation between the two, that allowed us to show a new way of doing neuromorphic computing at the time."
Reducing the cost of high-speed edge detection
For Antonio Hurtado, a senior lecturer for the Institute of Photonics at the University of Strathclyde in the United Kingdom, the most appealing feature of photonic hardware is not its speed or efficiency, but its ability to act almost exactly like an essential part of an organic brain. "One of the behaviors you can induce in lasers is excitability: the ability of a system to trigger a spike response," explains Hurtado. "That's the same effect you see in neurons, in the brain. We observed we could see these spiking responses, which were like neurons, but much faster."
The spiking effect led to an exciting new proposal. "The analogy occurred to us to use lasers as artificial photonic neurons, which are much faster than both biological systems and electronic versions."
To demonstrate their potential, Hurtado's team constructed an artificial neural network where each layer of information processing is performed by a common laser. Vertical cavity surface-emitting lasers, or VCSELs, are cheap, energy-efficient semiconductor lasers found everywhere from supermarket barcode scanners to smartphone facial recognition systems. With negligible energy consumption and ultrafast speed, VCSELs produce a spiking effect similar to that of a living neuron: a rapid change in voltage in response to a stimulus that passes information to the next layer of the network. Like biological neurons, these tiny lasers can integrate inputs from multiple other sensors before passing on a weighted signal to the next layer.
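The spike-and-integrate behavior described above is often modeled as a leaky integrate-and-fire neuron. The sketch below is that generic textbook model, not a simulation of a VCSEL; the leak rate, threshold, and weights are illustrative values.

```python
import numpy as np

def lif_spikes(inputs, weights, leak=0.8, threshold=1.0):
    """Leaky integrate-and-fire: sum weighted inputs over time,
    emit a spike and reset when the internal variable crosses threshold."""
    v = 0.0
    spikes = []
    for frame in inputs:                 # one input vector per time step
        v = leak * v + np.dot(weights, frame)
        if v >= threshold:
            spikes.append(1)
            v = 0.0                      # reset after firing
        else:
            spikes.append(0)
    return spikes

weights = np.array([0.3, 0.3])           # two upstream sensors
quiet = [np.zeros(2)] * 5                # weak input: neuron stays silent
strong = [np.ones(2)] * 5                # strong input: repeated spikes
```

Sub-threshold input decays away without a response, while sustained strong input produces a train of spikes, the all-or-nothing signaling that Hurtado's lasers reproduce at far higher speeds.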
Hurtado paired the VCSEL-based neurons with a CNN and trained the whole system on digital images of symbols with varying complexities. Once trained, the CNN scans new images for matching edges and marks the presence of each horizontal, vertical, or diagonal line with a spike in the appropriate laser, at the appropriate moment. Later convolutions can confirm the presence of higher-order patterns, like curving lines, intersections, and eventually whole symbols. Hurtado explains that the entire process is inspired by the edge detection function of the retina, but at the ultrafast speeds common to photonics. "We can run a whole image in around one microsecond, which is still very fast for edge detection."
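The orientation-tagging step can be illustrated with a toy classifier: each 3x3 patch is compared against horizontal, vertical, and diagonal filters, and the best-matching orientation "fires," standing in for a spike in the corresponding laser. The filters and patches are illustrative, not taken from the Strathclyde system.

```python
import numpy as np

FILTERS = {
    "horizontal": np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]]),
    "vertical":   np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]]),
    "diagonal":   np.array([[0, 1, 1], [-1, 0, 1], [-1, -1, 0]]),
}

def which_fires(patch):
    """Return the orientation whose filter responds most strongly,
    i.e., which of the three 'lasers' would spike for this patch."""
    responses = {name: abs(np.sum(f * patch)) for name, f in FILTERS.items()}
    return max(responses, key=responses.get)

# A horizontal edge: bright top row over a dark bottom.
horizontal_patch = np.array([[1, 1, 1], [0, 0, 0], [0, 0, 0]])
# A vertical edge: bright left column against a dark right side.
vertical_patch = np.array([[1, 0, 0], [1, 0, 0], [1, 0, 0]])
```

One filter response per patch is exactly the kind of simple, parallelizable operation that maps naturally onto an array of spiking lasers.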
Edge detection is only the first layer of visual processing, but the Strathclyde team is now developing a complete retinal network, with photoreceptor, bipolar cell, and retinal ganglion cell layers all built on a VCSEL hardware backbone. Experiments with the new setup are underway.
The slow pivot to higher speeds
In a 2020 article in Nature, a group of photonics researchers suggests that inference may be exactly the right nail for the hammer of artificial intelligence. "The capacity of computing systems is in an arms race with the massively growing amount of visual data they seek to understand," write the authors, who note the rising importance of applications like surveillance, remote sensing, autonomous driving, microscopy, and IoT.
These tasks rely on complex neural networks to build visual knowledge that will later be employed to make decisions in some time-sensitive setting, requiring the difficult combination of portability and substantial processing power. Even stationary systems like advanced microscopes, where custom GPUs provide a short-term solution to a processing squeeze, face the problem of the fundamental physical limits of transmitting information through electronic media. In all these cases, building some part of the neural architecture with photonic materials has the potential to sharply reduce the footprint of the system.
Neuromorphic computing systems are still a long way from being fully optical, but researchers aren't fretting, especially while the ever-expanding dataverse continues to test the limits of more conventional all-electronic algorithms. "At the end of the day, there will be some sort of synergy," predicts Moss. "Electronics and optics each have their strengths. Cross-disciplinary changes take a long time to happen."
Lynne Peskoe-Yang is a science and technology writer.