Microsc. Microanal. 27 (Suppl 1), 408, 2021 doi:10.1017/S1431927621001987 © Microscopy Society of America 2021

A Machine Learning Approach to Cluster Characterization for Atom Probe Tomography

Roland Bennett 1, Andrew Proudian 2 and Jeramy Zimmerman 3

1 Colorado School of Mines, Golden, Colorado, United States, 2 Colorado School of Mines, Physics, Golden, Colorado, United States, 3 Colorado School of Mines, Physics, Golden, Colorado, United States

The clustering properties of solute species drive performance in materials ranging from metal alloys to organic light emitting diodes [1][2][3]. In atom probe tomography (APT), cluster detection algorithms, such as the maximum separation algorithm (MSA), have typically been used to detect and then characterize individual clusters. MSA, in particular, has the drawbacks of requiring user-input parameters and of being insensitive to low-density clustering [4][5]. While other cluster detection techniques have been developed, they often share similar limitations, requiring high contrast between clusters and the background [6][7][8]. In this work, we advance a machine learning model implemented using rapt [9], which we have developed in the statistical computing language R. Our model is based on spatial statistics summary functions that characterize global clustering properties and behavior in APT data sets, providing an alternative to cluster detection analysis. In previous work, we used a Bayesian regularized neural network (BRNN) machine learning model, trained on features derived from Ripley's K function, to measure four metrics that characterize clusters: the cluster dopant density (ρ1), the background dopant density (ρ2), the mean cluster radius (r), and the radius blur (δr), i.e. the standard deviation of the cluster radius divided by the mean cluster radius [5].
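The authors' implementation is in R (the rapt package); as a hypothetical illustration of the kind of feature the model is trained on, the following Python sketch estimates Ripley's K function for a 3D point pattern. Function names, the box geometry, and the naive (uncorrected) estimator are assumptions for illustration; a real APT analysis would include edge corrections.

```python
import numpy as np
from scipy.spatial import cKDTree

def ripley_k_3d(points, radii, volume):
    """Naive estimate of Ripley's K for a 3D point pattern.

    K(r) ~ V / (n * (n - 1)) * sum over ordered pairs (i != j) of 1[d_ij <= r].
    Edge corrections, which a quantitative analysis needs, are omitted here.
    """
    n = len(points)
    tree = cKDTree(points)
    # count_neighbors on the same tree counts ordered pairs with d <= r,
    # including the n self-pairs at distance 0, so subtract n
    pair_counts = tree.count_neighbors(tree, radii) - n
    return volume * pair_counts / (n * (n - 1))

# Under complete spatial randomness, K(r) approaches (4/3) * pi * r^3
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 10.0, size=(5000, 3))  # uniform points in a 10^3 box
radii = np.array([0.25, 0.5, 1.0])
k_hat = ripley_k_3d(pts, radii, volume=10.0**3)
```

Deviations of the estimated K(r) from the theoretical random value at each radius form a natural feature vector for a regression model such as a BRNN.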
Here, we improve upon our previous work by incorporating features derived from the first-order summary functions G, G-cross, and F into our models, resulting in more accurate cluster analysis. These first-order summary functions enable ρ1 and ρ2 to be predicted with very low error: in simulated training and testing data sets, 90% of predictions for both were within 3.5% of the actual value, as shown for ρ1 in Figure 1 (an improvement from 18% for ρ1 in our previous work). While the percent error of ρ2 was not measured in our previous work, the absolute error at the 90th percentile was reduced by 94%. These first-order summary functions also enable decoupling of r from δr, allowing prediction of the mean cluster radius itself, with 90% of predictions falling within 15% of the actual value (compared to 18% for predictions based solely on the K function). The predicted value vs true value for ρ1 is shown in Figure 2. The simulated data sets used in this work (clustered point patterns with randomized clustering metrics; 10,000 for training and 2,500 for testing) were created on a single 28-core high-performance computing node, requiring only 90 minutes to generate and 15 minutes to train and predict, meaning the approach is also feasible on common desktop computers. A larger model using ten times as much data was also examined, but it showed only minor improvements that did not justify the greater computational cost. In this talk, we discuss the development and results of this algorithm, its application to experimental data, and the implications of global clustering behavior. We have made example analyses available on our website, enabling others in the APT community to easily adopt this method of cluster analysis.
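To make the first-order summary functions concrete, the sketch below gives minimal empirical estimators for G (the nearest-neighbour distance distribution) and F (the empty-space function) on a 3D point pattern. This is an illustrative Python sketch, not the rapt implementation; the function names, box size, and omission of edge corrections are assumptions. For a clustered pattern, G rises much faster than F, which is what makes these functions informative about cluster versus background density.

```python
import numpy as np
from scipy.spatial import cKDTree

def g_function(points, radii):
    """Empirical nearest-neighbour distance CDF G(r) (no edge correction)."""
    tree = cKDTree(points)
    # k=2: the first neighbour of each point is the point itself at distance 0
    d, _ = tree.query(points, k=2)
    nn = d[:, 1]
    return np.array([(nn <= r).mean() for r in radii])

def f_function(points, radii, box, n_test=2000, seed=1):
    """Empirical empty-space function F(r): CDF of the distance from random
    test locations in the box to the nearest data point (no edge correction)."""
    rng = np.random.default_rng(seed)
    test = rng.uniform(0.0, box, size=(n_test, 3))
    tree = cKDTree(points)
    d, _ = tree.query(test, k=1)
    return np.array([(d <= r).mean() for r in radii])

# Simulated clustered pattern: 20 cluster centres, 100 points per cluster
rng = np.random.default_rng(2)
centers = rng.uniform(0.0, 10.0, size=(20, 3))
pts = (centers[:, None, :] + rng.normal(0.0, 0.2, size=(20, 100, 3))).reshape(-1, 3)
g = g_function(pts, [0.5])
f = f_function(pts, [0.5], box=10.0)
```

In this clustered example, G(0.5) is near 1 (every point has a close within-cluster neighbour) while F(0.5) is small (most of the box is empty space); under complete spatial randomness the two curves coincide.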