280 Internal Security Review 20/19

Daniel Mider
Jan Garlicki
Wojciech Mincewicz

The Internet Data Collection with the Google Hacking Tool – White, Grey or Black Open-Source Intelligence?

The Google search engine is as ubiquitous (with almost no alternative) as it is poorly understood. Its potential for collecting so-called sensitive data on individuals and institutions is underestimated. A well-constructed query submitted to Google makes it possible to find deleted and archived sites, reconstruct the structure of a web page or an intranet, access server configuration parameters, obtain information intentionally secured against unauthorized access (paywalls, passwords), obtain user names, passwords and identification numbers (for example payment card numbers, ID numbers, etc.), and access the configuration parameters of devices (servers, web cameras, routers and others) in order to take control of them. Such activities are described in the literature as Google Hacking (GH), Google Dorks (GD) or, rarely, Google Scanning (GS) or Engine Hacking (EH). These terms denote queries to the Google search engine constructed so that it discloses data to users who are not authorized to access them in an ethical sense, a legal sense, or both.¹ The notion of Google Hacking was introduced by the authority in the field, Johnny Long.² The notion of Google Dork originally meant a person who is inept at securing online content, mainly web sites; that ineptitude can be revealed by Google very easily. As the author points out, the meaning of this lexeme has changed over time, and it currently signifies a person who obtains confidential information from Google.³ The history of GH/GD started with the definition of the phenomenon in December 2002 by Johnny Long, the method’s pioneer, although it existed already in 2000

¹ Cf. J. Long, Google Hacking for Penetration Testers, Rockland 2007, p. 534.
Collins Dictionary also contains a similar definition: https://www.collinsdictionary.com/submission/9695/google+dorks [access: 26 I 2018]. Apart from such definitions, there are others that use IT sociolect but mean more or less the same, for example the understanding of GH as “consistent search exploits” or “obtaining a sui generis virtual notebook”. Cf. Google Hacking – w służbie hakerom, “Haker.edu.pl”, 10 VII 2015, in: https://haker.edu.pl/2015/07/10/google-hacking-google-dorks/ [access: 26 I 2018]. The two most popular notions, Google Hacking and Google Dorks, are used interchangeably in the text. Cf. Roll Call Release. Intelligence for Police, Fire, EMS, and Security Personnel, 7 VII 2014, https://info.publicintelligence.net/DHS-FBI-NCTC-GoogleDorking.pdf [access: 26 I 2018].

² He is the well-known author of the now-defunct web page http://johnny.ihackstuff.com; its content has since been moved to http://www.hackersforcharity.org/ghdb/, i.e. the Google Hacking Database. He is also known as “j0hnny” and “j0hnnyhax”.

³ J. Long, The Google Hacker’s Guide. Understanding and Defending Against the Google Hacker, http://pdf.textfiles.com/security/googlehackers.pdf [access: 26 I 2018].