Compliance of LC50 and NOEC data with Benford's Law: An indication of reliability?

Pepijn de Vries a,*, Albertinka J. Murk b

a Institute for Marine Resources and Ecosystem Studies Wageningen UR, P.O. Box 57, 1780 AB Den Helder, The Netherlands
b Wageningen UR Toxicology Section, P.O. Box 8000, 6700 EA Wageningen, The Netherlands

Article info
Article history: Received 2 June 2013; received in revised form 21 August 2013; accepted 2 September 2013
Keywords: Benford's Law; Reliability; Quality assessment; Data evaluation; NOEC; LC50

Abstract
Reliability of research data is essential, especially when potentially far-reaching conclusions will be based on them. This holds, among other fields, for ecotoxicological data used in risk assessment. Currently, several approaches are available to classify the reliability of ecotoxicological data. The process of classification, such as using the Klimisch score, is time-consuming and focuses on the application of standardised protocols and the documentation of the study. The presence of irregularities and the integrity of the performed work, however, are not addressed. The present study shows that Benford's Law, based on the occurrence of first digits following a logarithmic scale, can be applied to ecotoxicity test data for identifying irregularities. This approach is already successfully applied in accounting. Benford's Law can be used as a reliability indicator, in addition to existing reliability classifications. The law can be used to efficiently trace irregularities (possibly the result of data manipulation) in large data sets of interpolated (no) effect concentrations such as LC50s, without having to evaluate the source of each individual record. Application of the law to systems in which large amounts of toxicity data are registered (e.g., European Commission Regulation concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals) can therefore be valuable.

© 2013 Elsevier Inc. All rights reserved.

1.
Introduction

In ecotoxicological risk assessment of substances, the quality and hence the reliability of the underpinning data are vital. Reliability of these data is usually assessed by applying a scoring system. Klimisch et al. (1997) proposed an approach that is used by risk assessors from regulatory agencies to classify the reliability of studies performed. Other approaches are also available, some of which have been evaluated by Ågerstrand et al. (2011). Such scoring methods usually assess whether laboratory experiments are well documented and conducted under standardised conditions. A problem with such classification methods is that they rely on the information provided and are time-consuming to perform. Such classifications cannot account for irregularities in the data, e.g., as a result of errors made during the performance of the test, errors made while interpreting the test results, or even deliberate data manipulation.

The trustworthiness of data is also an issue in other fields using large data sets, such as accounting. In that field, an approach has been developed based on the occurrence of first digits following a logarithmic scale, also called Benford's Law, Newcomb's Law or First Digit Law (Benford, 1938; Newcomb, 1881). It is successfully applied to identify suspicious bookkeeping (Rauch et al., 2011) or even fraud (Geyer and Williamson, 2004; Durtschi et al., 2004). In environmental science, Benford's Law has been applied to identify irregularities in emission monitoring data (Dumas and Devine, 2000; Marchi and Hamilton, 2006), but to date not to (eco)toxicological data.

This study applies Benford's Law to ecotoxicological data (median lethal concentrations, LC50, and No Observed Effect Concentrations, NOEC) as a tool to quickly screen large amounts of data for anomalies, thereby addressing an aspect of data quality that existing classification methods leave untouched.

1.1.
Benford's Law

Benford's Law revolves around the first non-zero digit of the numbers in a data set (e.g., digit 8 for the number 8.01, or 2 for the number 0.023). One might expect that each leading digit occurs with equal frequency (that is, the chance of finding the leading digit 1 is equal to that of finding digit 2, namely $1/9 \approx 0.111$). However, Newcomb (1881) and later Benford (1938) (independently) observed that in many (but not all) data sets the leading non-zero digit 1 is more common than 2, which in turn is more common than 3, and so on. Newcomb (1881) formulated this observation as follows:

$$\mathrm{Prob}(D_1 = d_1) = \log_{10}\left(1 + \frac{1}{d_1}\right), \quad \text{for } d_1 = 1, \ldots, 9, \tag{1}$$

where the left-hand term indicates the probability that a first non-zero digit ($D_1$) equals a specific digit ($d_1$).

Ecotoxicology and Environmental Safety, journal homepage: www.elsevier.com/locate/ecoenv
0147-6513/$ - see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ecoenv.2013.09.002
* Corresponding author. E-mail address: pepijn.devries@wur.nl (P. de Vries).

So, according to Benford's