
Main

Cytology lies at the core of early detection of cervical, lung and bladder cancers because it is minimally invasive, low cost and widely deployable1,2,3,4,5,6,7. In routine practice, cytologists examine glass slides containing approximately 10,000–1,000,000 cells per slide, assessing individual cells and clusters under an optical microscope. Diagnostic judgements rely on abnormalities in three-dimensional (3D) nuclear and cytoplasmic morphology and on spatial relationships between neighbouring cells. Cervical screening, in particular, is dominated by liquid-based Pap testing8,9, which has reduced cervical cancer incidence and mortality by improving sample quality and throughput4,5.

Despite these strengths, cytology exhibits variable sensitivity and specificity because it fundamentally depends on subjective visual interpretation10,11. Inter-observer and intra-observer variability arises from differences in training and experience, as well as from cognitive biases such as confirmation and anchoring12,13,14,15,16. Incomplete adherence to guidelines and quality-control protocols17,18, together with fatigue, time pressure and high case volumes16, further degrades accuracy and contributes to missed or delayed diagnoses. The CervicalCheck cancer scandal19,20,21 illustrates the potential clinical consequences of subjective cytology. Manual review is also challenged by the low prevalence of abnormal cells, preparation artefacts and overlapping cellular structures, which obscure findings and make early lesions difficult to detect22,23.

Artificial intelligence (AI) has therefore been widely explored to support cytological interpretation15,22,24,25,26. However, most approaches operate on two-dimensional (2D) images and target small subsets of ‘representative’ cells. More fundamentally, existing frameworks rarely scale to whole-slide analysis of the hundreds of thousands of spatially dispersed cells seen in routine practice. Although 3D imaging can capture richer structural information that better reflects how cytologists interpret slides, it substantially increases demands on image acquisition, processing, storage and data transfer. Compared with histopathology and radiology27,28,29,30,31,32, cytology generates larger data volumes because morphologically diverse, non-cohesive cells must be imaged in depth. These constraints have limited the scalability of AI models, leaving current tools assistive and reliant on human interpretation and decision-making rather than functioning autonomously15,22,24,25,26.

To overcome these limitations, we present a real-time, clinically validated autonomous cytology platform that integrates high-speed, high-resolution optical whole-slide tomography with edge computing, a distributed computing architecture that processes data close to its source33,34,35,36. The system acquires and locally compresses gigavoxel 3D whole-slide images before storage, accelerating model development and deployment without compromising image quality or AI performance. This enables routine digitization of thick cytology samples containing abnormal cell clusters, a long-standing challenge37,38,39, and delivers strong AI performance without requiring very large datasets; approximately 1,000 original 3D images per class are sufficient for effective AI training.

A key innovation is the cluster of morphological differentiation (CMD), an image-derived analogue of the cluster of differentiation used in immunophenotyping40,41. The CMD supports a flow cytometry-like framework on the basis of morphology rather than fluorescence markers, enabling population-wide visualization and interrogation. Unlike earlier AI-based cytology tools15,19,20,21,22,24,25,26, CMD analysis lets cytologists explore cell populations through scatter plots, hierarchical gating and dimensionality reduction, improving interpretability, error detection and discovery of new phenotypes. Together, these advances establish a scalable, real-time cytology pipeline with clinical-grade autonomy and lay the foundation for an objective, reproducible and discovery-driven diagnostic paradigm.
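To make the flow-cytometry analogy concrete, the following is a minimal sketch of how per-cell morphological descriptors could be gated and embedded for population-wide inspection. It is illustrative only, not the authors' implementation: the feature names, the placeholder data and the rectangular gate are all assumptions, and umap-learn stands in for whichever dimensionality-reduction method the platform employs.

```python
# Illustrative flow-cytometry-style analysis over per-cell morphological
# features (a stand-in for CMD analysis; names, data and gate are assumptions).
import numpy as np
import matplotlib.pyplot as plt
import umap  # pip install umap-learn

rng = np.random.default_rng(0)

# Placeholder feature table: one row per detected cell, columns such as
# nuclear volume, nucleus-to-cytoplasm (N/C) ratio, chromatin texture, ...
features = rng.normal(size=(20_000, 16))
nuclear_volume, nc_ratio = features[:, 0], features[:, 1]

# Hierarchical gating, step 1: a rectangular gate on two raw features
# (thresholds here are arbitrary; real gates would be set by cytologists).
gated = (nuclear_volume > 1.5) & (nc_ratio > 1.0)

# Dimensionality reduction for population-wide visualization.
embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(features)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(nuclear_volume, nc_ratio, s=1, alpha=0.2)
axes[0].set(xlabel="nuclear volume (a.u.)", ylabel="N/C ratio (a.u.)",
            title=f"gate selects {int(gated.sum())} cells")
axes[1].scatter(embedding[:, 0], embedding[:, 1], s=1, c=gated, cmap="coolwarm")
axes[1].set(title="2D embedding of morphological features")
plt.tight_layout()
plt.show()
```

Unlike fluorescence-based cytometry, every gated event in such a framework can be traced back to its 3D image crop for visual review, which is what enables the error detection and phenotype discovery described above.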
Whole-slide edge tomography

As illustrated in Fig. 1, our edge computer-integrated optical whole-slide tomograph, referred to as the whole-slide edge tomograph, comprises a light-emitting diode as a light source, an XY translation stage, a Z translation stage integrated with imaging optics and a complementary metal–oxide–semiconductor (CMOS) image sensor, and an edge computer equipped with an image sensor field-programmable gate array (FPGA) and a system on module (SOM) for real-time digital image processing (see Extended Data Fig. 1 for details). This configuration enables the acquisition of 2D images across multiple depth layers and facilitates the construction, compression and archiving of 3D images during slide scanning in the lateral (XY) and longitudinal (Z) directions. The CMOS image sensor captures high-resolution 2D bright-field images (4,480 × 4,504 pixels per imaging section) at a rate of up to 50 frames per second, with 173 or 485 imaging sections per layer for SurePath (Becton, Dickinson and Company) or ThinPrep (Hologic) slides, respectively, and 40 layers per slide, yielding approximately 140 or 391 gigavoxels per slide, respectively. These 2D images are transmitted to the FPGA for initial image processing. The processed images are then sent to the graphics processing unit (GPU) through a dual four-lane Mobile Industry Processor Interface (MIPI), where further tasks, such as background correction, focus adjustment, 3D image construction and 3D image compression, are performed; compression uses the hardware encoder, a dedicated module in the SOM for real-time image compression.

Similar to video compression, in which redundancies within each frame (intra-frame) and between consecutive frames (inter-frame) reduce the data size, our system applies intra-layer compression by reducing spatial redundancy within individual 2D images and inter-layer compression by exploiting similarities between adjacent optical sections along the Z axis. This strategy enables efficient compression of sectional 3D image stacks while preserving diagnostic content. The compressed images are stored in the high-efficiency video coding (HEVC) format, which supports both intra-prediction and inter-prediction schemes and is well suited for volumetric image sequences.

These sectional 3D images are then transmitted to a back-end server, where they are decoded and stitched into a comprehensive 3D image of the entire slide, covering approximately 10,000 to 1,000,000 cells. This whole-slide 3D image can be viewed in real time by cytologists on a high-definition monitor, allowing interactive functions, such as movement, magnification and focus adjustments, using a deep zoom image (DZI) viewer. Simultaneously, the 3D image undergoes AI-based population analysis. Nuclei within individual cells are detected using the high-performance object detection model YOLOX42, followed by cropping of the best-focused objects and classification using the vision transformer model MaxViT43. Additionally, population analysis is conducted using the CMD, with the results provided to the cytologists. Both the 3D image and AI-generated analysis results are accessible in real time, supporting efficient diagnosis and evaluation.
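Because the HEVC encoder treats the Z-stack like a video, adjacent optical sections play the role of consecutive frames. The sketch below approximates this scheme with a software encoder (ffmpeg with libx265) applied to per-layer image files; the file pattern, CRF values and keyframe interval are illustrative assumptions, and the actual system uses the SOM's dedicated hardware encoder rather than ffmpeg.

```python
# Sketch of the intra-/inter-layer compression idea using a software
# HEVC encoder (ffmpeg + libx265): successive optical sections along Z
# are fed in as video frames, so intra-frame prediction removes spatial
# redundancy within a layer and inter-frame prediction exploits the
# similarity between adjacent layers. File pattern, CRF values and
# keyframe interval are illustrative assumptions.
import subprocess

def compress_z_stack(layer_pattern: str, out_path: str, crf: int = 23) -> None:
    """Encode one imaging section's Z-stack of 2D layers as an HEVC stream.

    Lower CRF means higher quality and larger files, loosely mirroring the
    high/medium/low settings evaluated in Extended Data Fig. 2.
    """
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-framerate", "1",            # frame index stands in for Z index
            "-i", layer_pattern,          # e.g. "section0001_layer_%02d.tif"
            "-c:v", "libx265",            # HEVC encoder
            "-crf", str(crf),
            "-x265-params", "keyint=40",  # one intra-coded layer per 40-layer stack
            out_path,
        ],
        check=True,
    )

# Example: compress_z_stack("section0001_layer_%02d.tif", "section0001.mp4", crf=18)
```

The acquisition numbers above show why this has to happen at the edge: 4,480 × 4,504 pixels × 173 sections × 40 layers is roughly 1.4 × 10^11 voxels for a single SurePath slide, so compressing before storage and transfer is what keeps the pipeline real time.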
Fig. 1: Whole-slide edge tomography.

The edge computer facilitates the acquisition of 2D images across multiple depth layers and enables the construction, compression and archiving of 3D images during slide scanning in both the lateral (XY) and longitudinal (Z) directions. The CMOS image sensor captures high-resolution 2D bright-field images (4,480 × 4,504 pixels per imaging section) at up to 50 frames per second, with 173 or 485 imaging sections per layer and 40 depth layers per slide, yielding approximately 140 or 391 gigavoxels per SurePath or ThinPrep slide, respectively. These 2D images are first transmitted to the FPGA within the edge computer for initial processing. Subsequently, they are sent to the GPU, where further tasks such as background correction, focus adjustment, 3D image construction and compression are carried out. The processed sectional 3D images are then transmitted to a server, where the GPU stitches them together into a comprehensive 3D representation of the entire slide, encompassing approximately 10,000 to 1,000,000 cells. This whole-slide 3D image can be viewed in real time by cytologists on a high-definition monitor, enabling interactive functions such as movement, magnification and focus adjustments. Simultaneously, AI-based population analysis is performed on the 3D image. Population analysis is further refined using the CMD, and the results are provided to the cytologists. Both the 3D image and AI-generated analysis results are accessible in real time, enabling efficient diagnosis and evaluation.

System performance

As shown in Extended Data Fig. 2, the whole-slide edge tomograph achieves high-quality imaging and efficient data compression. Extended Data Fig. 2a presents the image quality under different HEVC compression settings (high, medium and low) across three liquid-based cytology slides. For each condition, 300 tomographic images (3 slides × 10 imaging sections × 10 layers) were acquired, and peak signal-to-noise ratio (PSNR) distributions are shown as histograms. Extended Data Fig. 2b summarizes the corresponding file sizes of the 3D whole-slide images, demonstrating an inverse relationship between compression level and data size: approximately 1 GB, 500–800 MB and 170 MB for high, medium and low compression of a ten-layer SurePath slide, respectively. File size scaled linearly with the number of Z layers. Extended Data Fig. 2c shows representative tomograms (2D cross-sections) of glandular, low-grade squamous intraepithelial lesion (LSIL), high-grade squamous intraepithelial lesion (HSIL) and adenocarcinoma cells across varying PSNR levels. The system resolves subcellular s
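The PSNR metric used to grade these reconstructions is straightforward to reproduce. Below is a minimal NumPy sketch; the 8-bit peak value is an assumption about the stored bit depth, not a detail taken from the paper.

```python
# Minimal sketch of the PSNR metric used to grade compression quality.
# Assumes 8-bit images; original and decoded arrays must share a shape.
import numpy as np

def psnr(original: np.ndarray, decoded: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(peak**2 / MSE)."""
    mse = np.mean((original.astype(np.float64) - decoded.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # lossless round trip
    return 10.0 * np.log10(peak ** 2 / mse)

# Pooling one value per tomogram (e.g. 3 slides x 10 sections x 10 layers
# = 300 images per compression setting) yields histograms in the style of
# Extended Data Fig. 2a:
# values = [psnr(raw, dec) for raw, dec in tomogram_pairs]
```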