A PageRank based predictive model for the estimation of the
archaeological potential of an urban area
Nevio Dubbini, Gabriele Gattiglia
Abstract—We present the analysis of multi-faceted, GIS man-
aged data for determining the archaeological potential, i.e. a
measure of the possibility that a more or less significant archae-
ological stratification is preserved. We used a sizable number
of datasets, in order to consider the problem of estimation
of archaeological potential in all of its aspects: archaeological
data, building archaeological data, historical data, toponymic
data, geomorphological data. Since identifying relations
among finds is a key issue for data mining in the archaeological
interpretation process, we applied a modified version of the
PageRank model: the criteria by which search engines assign
importance to web pages are likewise based on relations.
The procedure included the categorization of archaeological data,
the assignment of initial values of potential to the available data
through an automatic procedure, the creation of geomorpho-
logical facies maps, the definition of functional areas (i.e. the
levels of spatial and functional organization: urban, suburban
and rural areas), and the application of the PageRank based
algorithm. The model was applied to the urban area of
Pisa and tested against the data from 14 new cores. The map
of archaeological potential consists of the composition of the
7 layers, one for each archaeological period under considera-
tion: Protohistory, Etruscan period, Roman period, Late Roman
period, Early Medieval period, Late Medieval period, Modern
Age, Contemporary Age. The results, including the archaeological
potential map, should be regarded as first steps towards an
automatic, formally definable, and repeatable approach to the
computation of archaeological potential.
Keywords—predictive modelling, archaeological potential,
PageRank, archaeological GIS, geomorphology.
I. INTRODUCTION
This paper studies the problem of computing archaeological
potential, the assumptions made to solve it, the
mathematical model used, the software implementation, and
the test of the algorithm in the case study of the urban area of
Pisa. We based the mathematical model on PageRank, because
there is an analogy between the criteria used for attributing
archaeological potential and the criteria used for assigning
importance to web pages in search engine algorithms. The
key issue of the computation of archaeological potential, from
an abstract viewpoint, is the identification of relations among
finds: the presence of a particular find near another could
strengthen or weaken the probability that they will form a
more complex structure, and so strengthen or weaken the
archaeological potential of the area. This is exactly the criterion
upon which page ranking algorithms are based, whereby each
web page attributes importance to the web pages it points to
(via a link) and receives importance from the web pages that
link to it. The reader can refer to [4] for further
explanations of the choice of the mathematical model, and
to [8] for a general mathematical introduction to PageRank
models. In the following we treat all the archaeological
data as categorised: each find has been assigned to a category
in order to characterize its salient features, to implement the
algorithm effectively, and to make the results general
enough to be applied in different contexts as well ([2], pp. 89-99).

N. Dubbini is with University of Pisa, Mathematics Department, Pisa, Italy,
nevio.dubbini@gmail.com.
G. Gattiglia is with University of Pisa, Archaeology Department, Pisa, Italy.
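
The analogy drawn in the introduction, in which each find passes importance to the finds it is related to and receives importance from those related to it, can be illustrated with the standard PageRank iteration. The following is only a minimal sketch of the textbook algorithm, not the modified version used in this paper: the find names, the relations between them, and the damping value 0.85 are hypothetical and chosen purely for illustration.

```python
# Textbook PageRank by power iteration on a toy graph of finds.
# Assumption: find names, relations and the damping value are illustrative
# only; this is NOT the modified PageRank model used in the paper.

def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Score the nodes of a directed graph {node: [nodes it points to]}.

    Returns a dict node -> score; the scores sum to 1.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Score held by nodes with no outgoing relation is spread evenly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Each node shares its current score equally among the nodes it points to.
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                new[t] += damping * rank[v] / len(targets)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical relations: a wall related to a floor and a well, which both
# point back to the wall, plus an isolated sherd with no relations.
finds = {
    "wall": ["floor", "well"],
    "floor": ["wall"],
    "well": ["wall"],
    "isolated_sherd": [],
}
scores = pagerank(finds)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.3f}")
```

In this toy graph the wall, which is related to two other finds, ends up with the highest score, while the isolated sherd ends up with the lowest: this is the sense in which relations among finds strengthen or weaken the score of a location.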
II. DEFINITION OF ARCHAEOLOGICAL POTENTIAL
The archaeological potential of an area represents the proba-
bility that a more or less significant archaeological stratification
is preserved. It is computed by analysing a series of historical,
archaeological and paleo-environmental data, with a degree of
approximation that may vary according to the quantity and
quality of the data provided. The archaeological potential of an
area is independent of any subsequent intervention carried out
on it; such interventions must be regarded as a contingent risk factor.
The process of defining overall urban archaeological potential
consists in drawing up a series of predictive maps relative
to historical periods. The general criterion was to reconstruct
stratigraphic intervals, and integrate this information with both
archaeological and geomorphological data: geological maps
define stratigraphic units and sedimentary bodies, geomorpho-
logical and paleogeographical maps show relief forms and
define the geomorphic processes responsible for their genesis,
in addition to recent modifications. Generally speaking, each
morphological unit (or morphotype) can be more or less
suitable for settlements. The diachronic evolution of these
forms was then characterised. In archaeological terms,
the following parameters were taken into consideration for
the predictive definition of the city throughout its historical
periods: typology of finds, inferred on the basis of the interpre-
tation of the archaeological records [7]; quality and quantity
of the archaeographic data; spatial and typological relations
among the finds, which allow identification in probabilistic
terms of the presence of further finds in areas that have not
been archaeologically investigated; expert judgment; land use,
including traces that are not strictly connected to constructions
or settlements, such as agricultural and/or farming practices;
historical data from written sources and maps. Finally, we
identified the following overall parameters that best determine
urban archaeological potential: type of settlement, i.e. the
presence of settlement structures and their different typology;
density of settlement; multi-layering of deposits; removable
or non-removable nature of the archaeological deposit; degree
of preservation of the deposit, calculated according to the
presence of anthropic and natural removals [1].
978-1-4799-3169-9/13/$31.00 ©2013 IEEE 571