Gigabytes of Data? Real-Time Analysis Is Easy with This New Approach

New algorithms allow real-time interactive data processing at 10X previous rates for electron microscopy data.

Image courtesy of Lawrence Berkeley National Laboratory
In this schematic, a coherent electron microscopy beam is scanned over a carbon nanotube; the 2D diffracted image is recorded by a superfast detector for all scanned points.

The Science

Modern detectors are revolutionizing electron microscopy. They give scientists access to new frontiers of high-resolution imaging, including new views of proteins and of the atomic structures of advanced materials. Because these detectors collect data at ultrafast time scales, they generate massive data sets that take extensive computer time and power to process. As a result, data are typically analyzed after they have been collected rather than while the researcher is still working at the microscope, which would be ideal. In this work, a team of scientists developed algorithms that allow real-time, interactive processing of very large electron microscopy data sets. The algorithms take advantage of sparse data representation, a data sampling technique that reduces the size of the data set to be analyzed and therefore the computer time and memory required.
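
To make the idea of a sparse data representation concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a detector frame is mostly empty and that each detected electron can be recorded as a (row, column) position. The frame size, the function names, and the storage layout are hypothetical and are not taken from the research team's software.

```python
def to_sparse(frame):
    """Return the (row, col) positions of nonzero pixels in a 2D frame."""
    return [(r, c)
            for r, row in enumerate(frame)
            for c, value in enumerate(row)
            if value]


def to_dense(events, n_rows, n_cols):
    """Rebuild the full frame from event positions (assumes one count per event)."""
    frame = [[0] * n_cols for _ in range(n_rows)]
    for r, c in events:
        frame[r][c] = 1
    return frame


# Hypothetical 8 x 8 frame in which only three pixels detected an electron.
frame = [[0] * 8 for _ in range(8)]
for r, c in [(1, 2), (4, 4), (6, 3)]:
    frame[r][c] = 1

events = to_sparse(frame)
print(len(events), "events stored instead of", 8 * 8, "pixels")
```

The fewer pixels that record an electron, the larger the saving: the sparse form grows with the number of events, not with the size of the detector.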

The Impact

The team developed multiple techniques that use sparse data representation, so researchers can select the analysis technique best suited to a given experiment. The techniques provide real-time feedback on experimental parameters while the experiment is running. This lets researchers capture specific data of interest in real time, for example by focusing data collection on the structure of a material. Sparse data representation may also have applications in many other fields of research where the increasing size and complexity of data sets make timely analysis a challenge.


Summary

A new detector for 4D scanning transmission electron microscopy, called the 4D Camera, is installed at the National Center for Electron Microscopy at the Molecular Foundry at Lawrence Berkeley National Laboratory. It operates at 87,000 frames per second and produces raw data sets totaling 700 gigabytes. This is ten times the rate of any previous camera. Such a data set, generated in 15 seconds, would fill approximately three typical laptop hard drives. With this much data and the computational resources available, researchers could collect the data but could not analyze them in real time.
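
The scale of the problem follows from the figures quoted above. The short Python calculation below uses only those three numbers. The one added assumption is that a gigabyte means 10^9 bytes, so the per-frame size is an estimate, not a detector specification.

```python
FRAMES_PER_SECOND = 87_000   # from the text
ACQUISITION_SECONDS = 15     # from the text
DATASET_GIGABYTES = 700      # from the text

total_frames = FRAMES_PER_SECOND * ACQUISITION_SECONDS
gigabytes_per_second = DATASET_GIGABYTES / ACQUISITION_SECONDS
# Assumption: decimal units, 1 gigabyte = 1e9 bytes and 1 kilobyte = 1e3 bytes.
kilobytes_per_frame = DATASET_GIGABYTES * 1e9 / total_frames / 1e3

print(f"{total_frames:,} frames")                  # 1,305,000 frames
print(f"{gigabytes_per_second:.1f} GB/s")          # 46.7 GB/s
print(f"{kilobytes_per_frame:.0f} kB per frame")   # 536 kB per frame
```

In other words, an analysis that keeps pace with the detector has to handle roughly 47 gigabytes of raw data every second.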

Sparse data representation and analysis remove this bottleneck. The raw data are compressed by keeping only the necessary information, resulting in a sparse data set. Typical analysis algorithms must first uncompress the data, which takes a lot of computer time and effort. The research team instead developed algorithms that work directly on the compressed data, reducing computation time to a few minutes. This allows scientists to understand their results while they still have time to modify or optimize their experiment.
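
As a rough illustration of analyzing compressed data without uncompressing it, the Python sketch below sums the electrons that land inside a circular region of the detector for each scan position. This is a simple stand-in chosen for clarity, not the research team's algorithm. The data layout (one list of (row, column) events per scan position), the function name, and the sample values are all assumptions.

```python
def virtual_image(sparse_scan, center, radius):
    """Count, per scan position, the events inside a circular virtual detector.

    sparse_scan: list of scan positions; each is a list of (row, col) events.
    The event lists are used directly, so no dense frame is ever rebuilt.
    """
    cr, cc = center
    r2 = radius * radius
    return [sum(1 for r, c in events if (r - cr) ** 2 + (c - cc) ** 2 <= r2)
            for events in sparse_scan]


# Hypothetical data: three scan positions on a 16 x 16 detector.
sparse_scan = [
    [(8, 8), (7, 9), (2, 14)],          # two events near the center, one far away
    [(0, 0), (15, 15)],                 # no events near the center
    [(8, 7), (9, 8), (8, 9), (3, 3)],   # three events near the center
]

signal = virtual_image(sparse_scan, center=(8, 8), radius=2)
print(signal)  # [2, 0, 3]
```

The work done is proportional to the number of recorded events, not to the number of detector pixels, which is why skipping the uncompression step can save so much time on sparse data.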


Contact

Mary Scott
Molecular Foundry


Funding

The 4D Camera was developed at Lawrence Berkeley National Laboratory with funding from the Department of Energy (DOE) Office of Science, Basic Energy Sciences program. The experiments were performed at the Molecular Foundry, a DOE Office of Science user facility.


Publications

Pelz, P.M., et al., Observation of formation and local structures of metal-organic layers via complementary electron microscopy techniques. IEEE Signal Processing Magazine 39, 1 (2022). [DOI:10.1109/MSP.2021.3120981]

Highlight Categories

Program: BES, SUF

Performer: DOE Laboratory, SC User Facilities, BES User Facilities, Foundry