Abstract:
BACKGROUND: Microscopic examination of Giemsa-stained blood films remains a major form of diagnosis in malaria case management, and is a reference standard for research. However, as with other visualization-based diagnoses, accuracy depends on individual technician performance, making standardization difficult and reliability poor. Automated image recognition based on machine learning, utilizing convolutional neural networks, offers potential to overcome these drawbacks. A prototype digital microscope device employing a machine-learning algorithm, the Autoscope, was assessed for its potential in malaria microscopy. Autoscope was tested in the Iquitos region of Peru in 2016 at two peripheral health facilities, with routine microscopy and PCR as reference standards. METHODS: A cross-sectional, observational trial was conducted at two peripheral primary health facilities in Peru. A total of 700 participants were enrolled according to the following criteria: (1) age between 5 and 75 years, (2) history of fever in the last 3 days or elevated temperature on admission, (3) informed consent. The main outcome measures included sensitivity and specificity of diagnosis of malaria from Giemsa-stained blood films, using PCR as reference. RESULTS: At the San Juan clinic, sensitivity of Autoscope for diagnosing malaria was 72% (95% CI 64-80%), and specificity was 85% (95% CI 79-90%). Microscopy performance was similar to that of Autoscope, with sensitivity 68% (95% CI 59-76%) and specificity 100% (95% CI 98-100%). At San Juan, 85% of prepared slides had a minimum of 600 WBCs imaged, thus meeting Autoscope's design assumptions. At the second clinic, Santa Clara, the sensitivity of Autoscope was 52% (95% CI 44-60%) and specificity was 70% (95% CI 64-76%). Microscopy sensitivity at Santa Clara was 42% (95% CI 34-51%) and specificity was 97% (95% CI 94-99%).
Only 39% of slides from Santa Clara met Autoscope's design assumptions regarding WBCs imaged. CONCLUSIONS: Autoscope's diagnostic performance was on par with routine microscopy when slides had adequate blood volume to meet its design assumptions, as represented by results from the San Juan clinic. Autoscope's diagnostic performance was poorer than routine microscopy on slides from the Santa Clara clinic, which generated slides with lower blood volumes. Results of the study reflect both the potential for artificial intelligence to perform tasks currently conducted by highly trained experts, and the challenges of replicating the adaptiveness of human thought processes.
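
For readers who want to see how outcome measures of this kind are typically derived, the following is a minimal sketch of computing sensitivity and specificity, each with an approximate 95% confidence interval, from a 2x2 table of test results against a reference standard such as PCR. The abstract does not state which interval method the study used; the Wilson score interval here is an assumption chosen for illustration, and the function names and the counts in the usage lines are hypothetical, not study data.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity and specificity against a reference standard, with intervals.

    tp/fn: reference-positive cases the index test called positive/negative.
    tn/fp: reference-negative cases the index test called negative/positive.
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts for illustration only (NOT taken from the study).
result = diagnostic_accuracy(tp=80, fp=10, fn=20, tn=90)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.0%} (95% CI {low:.0%}-{high:.0%})")
```

Sensitivity uses only the reference-positive participants and specificity only the reference-negative ones, so the width of each interval depends on how many participants fall into each group at a given clinic.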