Predicting Cannabis Abuse Screening Test (CAST) Scores: A Recursive Partitioning Analysis Using Survey Data from Czech Republic, Italy, the Netherlands and Sweden

Matthijs Blankers 1,2,3*, Tom Frijns 1, Vendula Belackova 4, Carla Rossi 5, Bengt Svensson 6, Franz Trautmann 1, Margriet van Laar 1

1 Department of Drug Monitoring, Trimbos Institute, Utrecht, the Netherlands
2 Department of Psychiatry, Academic Medical Centre, University of Amsterdam, Amsterdam, the Netherlands
3 Department of Research, Arkin, Amsterdam, the Netherlands
4 Department of Addictology, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
5 Centre for Biostatistics and Bioinformatics, University Rome Tor Vergata, Rome, Italy
6 Department of Social Work, Malmö University, Malmö, Sweden

Abstract

Introduction: Cannabis is Europe’s most commonly used illicit drug. Some users do not develop dependence or other problems, whereas others do. Many factors are associated with the occurrence of cannabis-related disorders. This makes it difficult to identify key risk factors and markers to profile at-risk cannabis users using traditional hypothesis-driven approaches. Therefore, the use of a data-mining technique called binary recursive partitioning is demonstrated in this study by creating a classification tree to profile at-risk users.

Methods: 59 variables on cannabis use and drug market experiences were extracted from an internet-based survey dataset collected in four European countries (Czech Republic, Italy, Netherlands and Sweden), n = 2617. These 59 potential predictors of problematic cannabis use were used to partition individual respondents into subgroups with low and high risk of having a cannabis use disorder, based on their responses on the Cannabis Abuse Screening Test. Both a generic model for the four countries combined and four country-specific models were constructed.

Results: Of the 59 variables included in the first analysis step, only three were required to construct a generic partitioning model that classifies high-risk cannabis users with 65–73% accuracy. Based on the generic model for the four countries combined, the highest risk for cannabis use disorder is seen in participants reporting cannabis use on more than 200 days in the last 12 months. In comparison to the generic model, the country-specific models led to modest, non-significant improvements in classification accuracy, with the exception of Italy (p = 0.01).

Conclusion: Using recursive partitioning, it is feasible to construct classification trees based on only a few variables with acceptable performance to classify cannabis users into groups with low or high risk of meeting criteria for cannabis use disorder. The number of cannabis use days in the last 12 months is the most relevant variable. The identified variables may be considered for use in future screeners for cannabis use disorders.

Citation: Blankers M, Frijns T, Belackova V, Rossi C, Svensson B, et al. (2014) Predicting Cannabis Abuse Screening Test (CAST) Scores: A Recursive Partitioning Analysis Using Survey Data from Czech Republic, Italy, the Netherlands and Sweden. PLoS ONE 9(9): e108298. doi:10.1371/journal.pone.0108298

Editor: Tiziana Rubino, University of Insubria, Italy

Received December 3, 2013; Accepted May 6, 2014; Published September 29, 2014

Copyright: © 2014 Blankers et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Survey development and execution, and preparation of the full report were funded by the European Commission, project “Further insights into aspects of the illicit EU drugs market”.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* Email: mblankers@trimbos.nl

Introduction

Cannabis is Europe’s most commonly used illicit drug, with approximately 20 million adults having used the drug in the last year, which is about 6% of the population aged 15–64 years [1]. An indication of the public health impact of cannabis use is reflected in data on patients entering specialized treatment in Europe for substance use disorders: cannabis is the second most frequently reported substance, after heroin [1]. At the same time, many users of cannabis do not develop substance use disorders or other problems associated with their cannabis use [2,3]. Patterns of substance use such as frequency of use or social contexts of use are important predictors of substance use disorders [4,5]. In a recent study, living alone, coping motives for cannabis use, recent negative life events, and cannabis use disorder symptoms were found to predict first incidence of cannabis dependence [2].

Against this backdrop, two important public health challenges arise: (1) to develop, validate and implement screening tools to identify those at risk of developing cannabis use disorders, and (2) to identify patterns of risk factors or markers associated with at-risk use, in order to target drug policy efforts to individuals manifesting these patterns [53].

Recently, a number of screening instruments for risky substance use have been developed and validated for cannabis, including the Severity of Dependence Scale (SDS) [6,7], the Cannabis Use Disorder Identification Test

PLOS ONE | www.plosone.org 1 September 2014 | Volume 9 | Issue 9 | e108298