Landscape classification of Huelva (Spain): An objective method of identification and characterization

This study sought to classify the landscape of the province of Huelva (Andalusia, Spain) and validate the results, using a new application of classical multivariate methods in conjunction with GIS tools. The province was divided into 1 km x 1 km grid squares, each of which was associated with information on four visually-perceivable variables: land use, land cover, lithology and relief. Grid cells were then classified using two-way indicator species analysis (TWINSPAN) and ordered by detrended correspondence analysis (DCA). Analysis of the results yielded 8 major landscape types, each characterized by its indicator variables. This classification was checked by Discriminant Analysis, which yielded an 80% match with the TWINSPAN classification.


INTRODUCTION
Article 6 of the European Landscape Convention (Council of Europe, 2000) sets out the commitment of the Parties to the "Identification and Assessment" of their own landscapes. This in turn requires detailed systematic knowledge of existing landscape types, their distribution, composition and value (Fairbanks & Benn, 2000; García-Quintana et al., 2005).
Landscape classification is therefore an essential prerequisite to landscape evaluation (Blankson & Green, 1991; Cooper & Murray, 1992). Similarly, any study of landscape change requires the prior application of classification methods (Jobin et al., 2003; Acosta et al., 2005). Landscape classification, moreover, is critical because it can significantly affect where and what conservation investments are made (Lindenmayer et al., 2008). Classification procedures are generally carried out by landscape ecologists interested in studying the interaction between human activity and the landscape (Farina, 2006). Landscape classification, in short, provides a valuable tool for regional planning (Bastian, 2000), whose main advantage is that it enables policies to be aimed specifically at land classes with clearly-defined features and spatial locations (Bunce & Heal, 1984; Cherrill, 1994).
Bunce et al. (1996a) suggest that land classification procedures can be assigned, as a function of their objectivity, to one of two major categories:
a) Intuitive approaches. These form the basis of traditional cartography and, although rarely supported by validation tests, usually work well. Formalized subjective methods may be seen as a development of the intuitive approach: rules are defined on the basis of experience and intuition, and are then rigorously applied in the classification procedure.
b) Objective, mathematical techniques. These approaches have evolved from multivariate techniques originally used to describe the associations and groupings of plant species. The subjectivity lies in the initial selection of variables.
A number of landscapes have been classified into ecologically-homogeneous units using both intuitive (Bailey, 1996; Gallart et al., 1989) and objective mathematical approaches (Laut & Paine, 1982; Adamson, 1984; Belbin, 1993; Cooper, 1995). In both types of procedure, expert judgement is always part of our conceptual constructions; transparency is thus the only way to provide credibility and allow repeatability (Pedroli et al., 2006).
Among the polythetic divisive hierarchical clustering techniques, TWINSPAN (Two Way Indicator Species Analysis) is the most popular (McGarigal et al., 2000). It has been used in a number of studies of this kind (Bunce et al., 1996a; Bunce et al., 1996b; Carter et al., 1999; Cooper & Loftus, 1998; Haines-Young, 1992; Ke-Ming et al., 2000; Lyon & Sagers, 2002; McNab et al., 1999; Chuman & Romportl, 2010). One advantage of this method over other classification techniques is that it allows elements to be grouped and at the same time provides an ecological interpretation of how groups differ (McGarigal et al., 2000).
Ordination analysis has often been used prior to the analysis of landscape, territorial, ecological or bioclimatic classifications in an attempt to identify, from a group of preselected variables, those variables contributing most to the definition of groups, with a view to using them for subsequent classification (Poudevigne & Alard, 1997; McNab et al., 1999; Lyon & Sagers, 2002; Mora & Iverson, 2002; Jobin et al., 2003). Nevertheless, the complementary use of classification analysis and ordination charts is also recommended as a routine procedure in ecology (Legendre & Legendre, 1998). DCA (Detrended Correspondence Analysis) is one of the most frequently used ordination techniques.
Testing the validity of the result is one of the most problematic aspects of any landscape classification (Haines-Young, 1992). A statistical method may be used for this purpose. Discriminant analysis is normally used either to determine the relative contribution of various explanatory descriptors to the distinction between states (discrimination functions), as in Triantafilis et al. (2003), or to obtain a linear equation enabling new objects to be assigned to states generated by a previous classification (classification functions), as in the classification of ecosystem units by McNab et al. (1999). In Soto and Pinto (2010), the classification functions of discriminant analysis were used to validate the classification generated.
In relation to landscape studies in Spain, Serrano Giné (2012) distinguishes two main ways of systematizing the classification of landscapes: (1) according to criteria of form (regular or irregular geometries) and (2) according to criteria of content (summative, ecological or systemic approaches). Among them, and of particular relevance to this work because they cover the area of this study, are the Atlas de los Paisajes de España (Sanz Herráiz, 2003) and the Mapa de los Paisajes de Andalucía (Moreira, 2005).
Ortega Cantero (2010) highlights the use of visually-perceivable variables, since they clearly express the order, organization or structure of geographical reality.
The aims of this study were:
1) To identify and characterize the landscapes of the province of Huelva (Andalusia, Spain) using objective methods (multivariate classification analysis) and GIS tools, on the basis of visually-perceivable variables.
2) To evaluate the classification thus obtained and establish degrees of difference between groups using multivariate ordination analysis.
3) To check the validity of the information obtained using multivariate statistical methods.

METHODS
Study area
The study was carried out in the province of Huelva (Andalusia, southern Spain) (figure 1). The province covers a surface area of 10,128 km², and has 513,403 inhabitants. Huelva is close to the Atlantic, and the resulting oceanic influence accounts for a narrower range of mean temperatures between the warmest and the coolest months. The terrain slopes gently nearer the southern coast, becoming rockier and steeper inland; the increase in altitude from south to north also prompts a progressive drop in mean temperatures. Winters are mild, with monthly mean temperatures of over 10 °C; average summer temperatures range around 25 °C, and maximum temperatures rarely exceed 40 °C. Average annual rainfall ranges around 500-600 mm (though in the more mountainous inland area it can reach 1000 mm). Maximum rainfall is recorded in late autumn-winter, while summers are very dry. In view of these characteristics, the climate can be classified as Mediterranean Oceanic (Pita López, 2003). Huelva comprises two major geostructural units: the Hesperic Massif (Sierra Morena) to the north, and the Guadalquivir Depression to the south. The Sierra Morena contains hard materials: slates, volcanic/sedimentary rocks, and limestone outcrops; there are also abundant plutonic intrusions. This unit contains the highest altitudes in the province, which do not exceed 1000 m. The Guadalquivir Depression acts as a catchment for sediment generated by erosion; basin-fill deposits during the most recent (Quaternary) Period are largely lacustrine, fluvial (terraces and alluvial), colluvial, eolian (coastal dunes, sand-sheets) and marshy, following the closure of several river estuaries (Moreira, 2003).
Relative surface areas by land use and major land cover are as follows: Buildings and Infrastructure, 1.73%; Wetlands and Water Bodies, 4.31%; Agricultural Land, 16.91%; and Forest and Natural, 77.06%.
The two most important protected natural spaces in the province of Huelva are the Sierra de Aracena y Picos de Aroche Natural Park and the Doñana National Park, the latter a UNESCO World Heritage site and one of Europe's largest biological reserves.

Data collection
The province of Huelva was divided into 1 km x 1 km georeferenced grid squares or cells. Use of a standard sampling unit has two principal advantages: it is readily applicable to large areas, and it removes the subjectivity inherent in the use of sampling units defined by mapping natural variables (Zonneveld, 1989). For each of the 10,464 grid cells thus obtained, the following information was associated: land use and land cover (including continental waters), lithology, and relief. This procedure gave rise to a set of variables whose value was calculated for each of the grid cells, enabling their comparison and classification. The following specific variables were used:
- Land use and land cover. Data were obtained from the Digital Map of Land Use and Land Cover in Andalusia, scale 1:25,000 (province of Huelva), available from the Andalusia Environmental Information Network (REDIAM).
The 112 land-use and land-cover types in the original legend were merged to form 49 classes (table 1), on the basis of formal homogeneity in terms of visual perception.For this and all other types of variable, the surface area represented by each class in each square was calculated in absolute terms.
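The per-cell area calculation described above can be illustrated with a minimal sketch. It assumes the land-cover map has already been intersected with the grid in a GIS, so that each fragment is known by its cell, its class and its area; the function name, cell identifiers, class names and areas are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def cell_class_areas(fragments):
    """Sum the area of each class inside each grid cell.

    `fragments` is an iterable of (cell_id, class_name, area_km2) tuples,
    e.g. the pieces left after intersecting a land-cover map with the grid.
    """
    table = defaultdict(lambda: defaultdict(float))
    for cell_id, class_name, area_km2 in fragments:
        table[cell_id][class_name] += area_km2
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {cell: dict(classes) for cell, classes in table.items()}


# Hypothetical fragments for two 1 km x 1 km cells (areas in km2).
fragments = [
    ("cell_001", "dehesa", 0.50),
    ("cell_001", "scrub", 0.25),
    ("cell_001", "dehesa", 0.25),
    ("cell_002", "marsh", 1.00),
]
areas = cell_class_areas(fragments)
print(areas)
```

Each cell then becomes one row of the cell-by-variable table that is passed on to classification.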

Multivariate Classification Analysis: TWINSPAN
Grid cells were classified (following 0-1 standardization) by the TWINSPAN (Two Way Indicator Species Analysis) multivariate classification method (Kent & Coker, 1992), using the CAP 3.0 software package. TWINSPAN classifies a sample set by repeated dichotomous divisions, establishing groups on the basis of the values obtained for variables. For this purpose, each quantitative variable is divided into qualitative variables known as pseudovariables. The differential presence of pseudovariables distinguishes between the groups formed in each division. This difference is quantified by the indicator value (I).
It is assumed, as a basic rule, that the pseudovariable with the greatest indicator value counts as the global indicator value for that variable. For continuous variables (Mean height, Difference between maximum and minimum, and Mean slope), each pseudovariable refers to a range of values of that variable. For categorical variables (all others), each pseudovariable refers to a range of presence of that variable.
Groups established by TWINSPAN are described by their indicator variables and their preferential variables. Indicator variables are those whose indicator value satisfies +0.5 < I < +1 or -1 < I < -0.5. A variable is considered preferential for one or other group in a dichotomy when the probability of its being present in one group is over twice the probability of its being present in the other, its indicator value lying in the range -0.5 < I < +0.5.
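The conversion of a quantitative variable into pseudovariables can be sketched as follows. This is an illustration only: the cut levels are hypothetical (the text does not report those used in CAP 3.0), and the sketch assumes cumulative coding, in which a pseudovariable counts as present whenever the standardized value strictly exceeds its cut level.

```python
def standardize_01(values):
    """Rescale a list of values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pseudovariables(value, cut_levels):
    """Return one 0/1 flag per cut level (1 = value exceeds that level)."""
    return [1 if value > level else 0 for level in cut_levels]


# Hypothetical mean heights (m) for four grid cells.
mean_height = [20, 100, 300, 620]
# Hypothetical cut levels on the standardized 0-1 scale.
CUT_LEVELS = (0.0, 0.25, 0.5, 0.75)

standardized = standardize_01(mean_height)
flags = [pseudovariables(v, CUT_LEVELS) for v in standardized]
for height, row in zip(mean_height, flags):
    print(height, row)
```

Under this coding a cell with a high value carries all the lower pseudovariables as well, so each division can be driven by presence or absence alone.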
For the analysis of the groups generated at the three levels of division considered, all indicator variables and the greatest preferential variables were used.
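The distinction between indicator and preferential variables can be sketched as below. The formula for I is not given in the text; the sketch assumes the usual TWINSPAN definition, namely the proportion of cells containing the pseudovariable on one side of the dichotomy minus the proportion on the other side. Function names and counts are hypothetical.

```python
def indicator_value(present_pos, n_pos, present_neg, n_neg):
    """Signed indicator value I in [-1, +1] (assumed definition, see lead-in)."""
    return present_pos / n_pos - present_neg / n_neg


def describe(present_pos, n_pos, present_neg, n_neg):
    """Label a pseudovariable using the thresholds stated in the text."""
    p_pos = present_pos / n_pos
    p_neg = present_neg / n_neg
    i = p_pos - p_neg
    if abs(i) > 0.5:
        return "indicator"
    # Preferential: presence in one group over twice as likely as in the other.
    if p_pos > 2 * p_neg or p_neg > 2 * p_pos:
        return "preferential"
    return "non-preferential"


# Hypothetical counts: cells containing the pseudovariable in each group,
# out of 100 cells per group.
strong = describe(90, 100, 10, 100)
moderate = describe(30, 100, 10, 100)
weak = describe(50, 100, 40, 100)
print(strong, moderate, weak)
```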

Multivariate ordination analysis: DCA
The results of the TWINSPAN classification were evaluated by Detrended Correspondence Analysis (DCA) (Kent & Coker, 1992), using the CAP 3.0 software package. The first two axes were interpreted.
DCA enables analysis of the position occupied by grid cells in the ordination space defined by the variables, allowing groupings in that space to be identified (McGarigal et al., 2000). As a result, the clusters formed in classification analysis can be evaluated and the degree of difference (i.e. the distance) between clusters can be established (Kent & Coker, 1992).
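The idea of placing grid cells in an ordination space can be illustrated with reciprocal averaging, the iteration underlying correspondence analysis. The sketch below is not DCA: the detrending and rescaling steps are deliberately omitted and only the first axis is extracted. The data table is hypothetical, and the code assumes every row and column has a positive total.

```python
def reciprocal_averaging(table, iterations=200):
    """First correspondence-analysis axis by reciprocal averaging.

    `table` is a list of rows (grid cells) of non-negative values (variables).
    Returns (cell_scores, variable_scores). No detrending or rescaling.
    """
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    grand = sum(row_tot)
    col_scores = [float(j) for j in range(n_cols)]  # arbitrary starting scores
    for _ in range(iterations):
        # Cell score = weighted mean of the scores of the variables it contains.
        row_scores = [
            sum(table[i][j] * col_scores[j] for j in range(n_cols)) / row_tot[i]
            for i in range(n_rows)
        ]
        # Centre and scale (weights = row totals) to discard the trivial axis.
        mean = sum(w * s for w, s in zip(row_tot, row_scores)) / grand
        row_scores = [s - mean for s in row_scores]
        sd = (sum(w * s * s for w, s in zip(row_tot, row_scores)) / grand) ** 0.5
        row_scores = [s / sd for s in row_scores]
        # Variable score = weighted mean of the scores of the cells it occurs in.
        col_scores = [
            sum(table[i][j] * row_scores[i] for i in range(n_rows)) / col_tot[j]
            for j in range(n_cols)
        ]
    return row_scores, col_scores


# Hypothetical table: 4 cells x 4 variables arranged along one gradient.
table = [
    [5, 2, 0, 0],
    [3, 4, 1, 0],
    [0, 2, 4, 1],
    [0, 0, 3, 5],
]
cell_scores, variable_scores = reciprocal_averaging(table)
print(cell_scores)
print(variable_scores)
```

In this toy table the cells come out ordered along the gradient built into the data, which is the kind of arrangement the ordination diagram is used to inspect.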

Validation using Discriminant Analysis
Discriminant Analysis was used to check the validity of the TWINSPAN classification. In this type of canonical analysis, the aim is to account for the structure of a qualitative descriptor (or a classification) in terms of quantitative descriptors (Legendre & Legendre, 1998). Objects can be assigned to groups by calculating classification functions and establishing the classification value of each object for each of the classification groups; each object is thus assigned to the group for which it receives the highest classification value. These data can be used to construct a contingency table to compare the original assignation of objects to groups with the assignation carried out using classification functions (Legendre & Legendre, 1998). This table can then be used to determine the number and percentage of cases correctly classified by discriminant functions. Here, since data distribution was non-normal, a non-parametric method was used for discriminant analysis: k-nearest neighbour (k = 10). The SAS System software package was used for this purpose.
For validation to be meaningful, the set of elements classified by TWINSPAN had to be different from that subjected to discriminant analysis (Legendre & Legendre, 1998). Accordingly, a subset (1) comprising half the grid cells for the whole of the province of Huelva was used for TWINSPAN classification. The degree of match between the classification obtained and the original classification of all grid cells was checked. Next, taking as reference the classification of subset 1 and using the interpolation method, a second subset (2) of grid cells was obtained, classified into the eight groups generated at TWINSPAN level 3. This classification was used as the observed classification. Information on the 95 variables used in TWINSPAN was associated with each square, these variables thus serving as quantitative descriptors.
The goodness of the results obtained by discriminant analysis was in turn evaluated by cross-validation using the jackknife estimator (Manly, 1997; Muñoz Serrano, 2003).
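The nearest-neighbour assignment and the contingency table can be sketched in a few lines. This is a toy illustration in plain Python rather than the SAS procedure used in the study: the points, group labels and the choice of k = 3 (instead of the k = 10 reported above) are hypothetical, and Euclidean distance is an assumption.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Assign `query` to the most common group among its k nearest objects."""
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def contingency_table(observed, predicted):
    """Count objects for every (observed group, predicted group) pair."""
    return Counter(zip(observed, predicted))


def percent_match(observed, predicted):
    """Percentage of objects whose predicted group equals the observed one."""
    hits = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * hits / len(observed)


# Hypothetical reference cells (two descriptors each) with their groups.
train_x = [(0, 0), (1, 0), (0, 1), (10, 10), (9, 10), (10, 9)]
train_y = ["a", "a", "a", "h", "h", "h"]
# Hypothetical cells to validate, with their observed groups.
test_x = [(0.5, 0.5), (9.5, 9.5), (1, 1)]
observed = ["a", "h", "h"]

predicted = [knn_predict(train_x, train_y, q, k=3) for q in test_x]
table = contingency_table(observed, predicted)
match = percent_match(observed, predicted)
print(predicted, dict(table), round(match, 2))
```

The diagonal entries of the table are the correctly classified cells; off-diagonal entries show which groups are confused with each other.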
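Cross-validation of a nearest-neighbour classifier can be sketched as follows, assuming that the jackknife estimator here means leave-one-out: each object is withheld in turn, classified from the remaining objects, and the share of misclassified objects is reported. Data, labels and k = 3 are hypothetical.

```python
from collections import Counter


def nearest_group(points, labels, query, k, skip):
    """Majority group among the k points nearest to `query`, ignoring `skip`."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, query)), i)
        for i, p in enumerate(points)
        if i != skip
    )
    votes = Counter(labels[i] for _, i in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_error(points, labels, k):
    """Percentage of objects misclassified when each is held out in turn."""
    errors = sum(
        nearest_group(points, labels, points[i], k, skip=i) != labels[i]
        for i in range(len(points))
    )
    return 100.0 * errors / len(points)


# Hypothetical cells: two compact groups plus one cell labelled "h"
# that lies next to group "a".
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (9, 10), (10, 9), (2, 2)]
labels = ["a", "a", "a", "a", "h", "h", "h", "h"]

error_rate = leave_one_out_error(points, labels, k=3)
print(error_rate)
```

Because no object takes part in its own classification, this estimate is usually somewhat higher than the error rate computed on the full data set.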

RESULTS
Multivariate Classification Analysis: TWINSPAN
TWINSPAN results from the first to the third division were analysed; the final total of eight groups generated in the first three divisions was considered representative of the landscape typology of Huelva province. The three successive divisions generated two, four and eight groups, respectively (figures 2 and 3; tables 4-6). The main features of the eight groups considered as landscape types for the study area are as follows:
- High mountain range (Group a): Mostly found in the Sierra de Aracena y Picos de Aroche Natural Park. The mean altitude of 80% of the territory ranges between 444 and 622 m. The highest peaks in the province are to be found here; slopes not greater than 15% predominate, and ruggedness is intermediate to high. Quercus (cork-oak in this case) dehesas are the main feature of this typical sierra landscape, and there are also abundant oak and conifer stands, sometimes with undergrowth.
- Low mountain range (Group b): Lower-altitude sierra landscape, with heights ranging between 300 and 400 m. Abundant steep slopes, often exceeding 45%. This landscape covers the Sierra Pelada y Ribera del Aserrador Natural Park, the Peña de Aroche Natural Park and areas of considerable visual impact such as the Río Tinto mines (Río Tinto Protected Landscape). Scattered conifer stands, with and without undergrowth, are found throughout the land corresponding to this landscape type.
- Peneplains and piedmonts (Group c): Height range 100-300 m, with slopes of less than 7%, mild to moderate relief, and mostly granite-type lithology. Characteristic features of this landscape include extensive pastureland and scattered areas of undergrowth interspersed with oak stands.
- Slopes and Hills (Group d): Characterised by steep rocky slopes (30-45%), due to the presence of narrow river valleys, and by altitudes not exceeding 300 m. Extensive scrub, with abundant eucalyptus stands and dry river beds.
- Croplands (Group e): Characteristic features include loamy soils and dryland crops: cereals, woody crops (olive) and a mixture of the two.
- Coastal and pre-coastal dunes (Group f): Landscape type characterised by highly undulating relief, sandy lithology and land cover dominated by conifer (mainly pine) stands and accompanying scrub. Land uses include urban areas aimed at seaside tourism (most numerous on the western coast).
- Sands (Group g): A flat sandy landscape, with mean altitudes of between 18 and 44 m, containing numerous eucalyptus and conifer stands together with characteristic undergrowth.
- Marshes (Group h): A clearly-defined landscape type including well-known areas such as the Doñana National Park and the Marismas del Tinto y del Odiel Nature Park, as well as less well-known beauty spots such as the Marismas de Isla Cristina Nature Park and the Río Piedras y Flecha Natural Park at El Rompido. This is a flat landscape covered by marshland both with and without vegetation; soils are predominantly loamy.

Multivariate Ordination Analysis
The DCA (figure 4) yielded eigenvalues of 0.502 for Axis 1 and 0.196 for Axis 2.

Validation using Multivariate Methods
A 96.98% match was recorded between the TWINSPAN classification for the whole sample set (10,464 grid cells) and the TWINSPAN classification for subset 1 (5237 grid cells).
A global match of 80.01% was recorded between the TWINSPAN classification and the classification performed using discriminant functions (table 7). The highest percentage of correctly-classified grid cells was found for Group a (89.28%), and the lowest percentage for Group f (70.80%). The reliability of the results was confirmed by the total estimation error as determined by cross-validation: 24.11%, compared to an error rate of 19.99% for discriminant analysis.

DISCUSSION
Overall analysis of TWINSPAN results highlighted the differing roles played by different types of variable at successive levels of classification: relief-related variables were particularly relevant in the first two groups generated, followed [...]. Geology is considered a major factor that needs to be taken into account when developing land classifications in southern Europe because landform and soil parent material play a greater role in defining soil features, and consequently life conditions, under a Mediterranean climate than under oceanic and continental temperate climates (Bunce et al., 2002). Land-use and land-cover variables made their first appearance at level 3. The resulting landscape units displayed their own distinctive features, and differed sufficiently from each other to be considered basic landscape types. Natural and seminatural vegetation played a key role in defining all landscape types except for Croplands (Group e) (where crops made a major contribution) and Low mountain range (Group b) (not characterised by any land-use or land-cover variable). A number of authors have drawn attention to the role of vegetation in the development of specific landscape identities, due to its visual impact (Misgav & Amir, 2001).
The graphic output of DCA (figure 4) confirmed the results of the TWINSPAN classification.The position of the grid cells with respect to DCA Axis 1 matched the groups distinguished by TWINSPAN in the first two levels of classification, while third-level groups were distinguished with respect to Axis 2. Arrangement of landscape types along the Axis 1 gradient (figure 4) underlined their location in the geographical space from north to south.Distances between landscape units within the same cluster were generally shorter than distances between landscape units belonging to different clusters.
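The comparison of distances between landscape types in the ordination space can be sketched by computing the centroid of each group on the two axes and the Euclidean distance between centroids. The scores and group labels below are hypothetical and do not reproduce figure 4.

```python
from collections import defaultdict
from math import dist


def centroids(scores, groups):
    """Mean (axis 1, axis 2) position of the cells belonging to each group."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x, y), group in zip(scores, groups):
        acc = sums[group]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {g: (sx / n, sy / n) for g, (sx, sy, n) in sums.items()}


# Hypothetical ordination scores (axis 1, axis 2) for six grid cells.
scores = [(0, 0), (2, 0), (1, 3), (1, 5), (7, 0), (9, 0)]
groups = ["a", "a", "b", "b", "h", "h"]

centres = centroids(scores, groups)
d_ab = dist(centres["a"], centres["b"])
d_ah = dist(centres["a"], centres["h"])
print(centres, d_ab, d_ah)
```

In this toy example group a lies closer to group b than to group h, the pattern expected when a and b belong to the same higher-level cluster.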
The results of discriminant analysis supported the TWINSPAN classification: of the 5227 grid cells analysed, 4108 were assigned by discriminant analysis to the same groups to which they had been assigned by TWINSPAN, giving an error rate of 19.99%.
The present study builds upon a research effort that has prompted similar classifications elsewhere in the world. The pioneering work in the use of this multivariate analysis for land classification was undoubtedly the Land Classification of Great Britain carried out by the Institute of Terrestrial Ecology (ITE) (Bunce et al., 1996b).
Any discussion of the method used and the results obtained here must take as its framework this seminal achievement by the ITE. Comparison of the two studies highlights a number of differences, both of purpose and of method.
Purpose: the aim here was to classify landscapes rather than land; accordingly, visually-perceivable variables were used, instead of the climatic variables used in the ITE classification, which play no key role in landscape classification.
Method: the technological constraints operative when the first land classification of Great Britain was produced (1977) prevented the simultaneous classification of the 240,000 1 km squares. A workable representative population of 1212 squares, produced by sampling, was therefore subjected to multivariate analysis using Indicator Species Analysis (ISA). A total classification was then obtained using logistic discrimination (LGD). The 32 land classes used in the analysis of the 1212 grid cells were modified so that they would be representative of the whole of Great Britain. Here, by contrast, TWINSPAN yielded satisfactory results, and DCA was used to confirm group classification and to establish the degree of difference (i.e. the distance) between groups in the ordination space; finally, discriminant analysis was used to validate rather than generate the classification.
The ITE classification provided the foundation for a number of similar studies, including the Bioclimatic Land Classification of Spain (Elena-Rosello et al., 1997). This was a four-stage method using, successively, 25, 4, 1 and 0.5 km² squares. Classes were established using climatic, physiographic and geological data, and the ecological significance of the classes established was checked using soil data, as well as land-cover and land-use data. This is the major difference between the method applied by Elena-Rosello et al. (1997) and that used in the present study; here, rather than using one set of variables for identification purposes and another set for characterisation purposes, all variables were used to identify groups, which were characterised by the major resulting indicator and preferential variables. In both studies, however, the classification method used was TWINSPAN.
Another leading study (Bunce et al., 1996a) classified Europe into climatic regions using only climate, location and altitude data. Principal Component Analysis (PCA) was used to represent dominant trends in climate variation. A total of 68 variables and 5209 grid cells (size: 0.5° longitude x 0.5° latitude) were classified by TWINSPAN, generating 64 classes. The final geographical distribution of classes was smoothed by discriminant function in order to remove outliers.
The multivariate methods used in all these studies (Bunce et al., 1996a; Bunce et al., 1996b; Elena-Rosello et al., 1997), and also here, were very similar (TWINSPAN, DCA, PCA, Discriminant Analysis), though they were used differently; this merely serves to confirm the flexibility and the potential of these methods for land classification.
A later paper entitled "Identification and Characterisation of Environments and Landscapes in Europe" (Mücher et al., 2003) reported on two major projects: an Environmental Classification of Europe and a European Landscape Classification. Environmental classification was carried out at pixel level, the pixel being the minimum GIS mapping unit. Pixels with associated information on the environmental variables used (altitude, slope, latitude, temperature, rainfall and sunshine duration) were subjected to PCA. The first three principal components explained 88% of the variation in the input variables; pixels containing information from the first three components were used for classification analysis. The Iterative Self-Organising Data Analysis Technique (ISODATA) was used to cluster the principal components into environmental classes. The present study, as well as using different classification units (1 km x 1 km square vs. pixel), a more restricted area (regional vs. continental) and different variables, opted to use TWINSPAN, which affords two major advantages over ISODATA for typological classification: (1) it is a hierarchical divisive method, whereas ISODATA directly classifies objects into a pre-established number of user-defined groups, and it enables bottom-up visualisation of the classification process, thus helping to understand the way groups are formed and enabling identification of the major defining variables at each level; all this leads to a more thorough knowledge of the landscape, and ensures a more reliable typology; (2) the landscape types established are objectively characterised by indicator variables.
The European Landscape Classification (Mücher et al., 2003), addressed at greater depth in Mücher et al. (2010), used topographic, parent-material and land-use variables. Classification was carried out using eCognition object-oriented image-classification software, which is widely used for multiscale analysis of geographical data of all kinds. The image classification is based on the attributes of image objects rather than the attributes of individual pixels. Although the method used differed markedly from that employed in the present study, the underlying philosophy was very similar, in that both studies were concerned with classifying landscapes rather than with environmental or bioclimatic classifications.

CONCLUSIONS
Several conclusions regarding methods and landscapes can be drawn from the present study. Use of TWINSPAN and visually-perceivable variables proved to be a valid technique for landscape classification, establishing 8 reliable landscape units for the province of Huelva. The value of ordination analysis (DCA) for establishing the degree of difference between landscape types was confirmed. Discriminant Analysis proved to be a useful method for validating landscape classifications. The present study enabled the objective and rigorous establishment of the landscape types present in the province of Huelva, and provided information on their distinctive features, with associated quantitative data (indicator value = I) indicating their relative importance. The results highlighted the great diversity of landscapes in the province.
The main advantages of the method used are that it can be applied to any land classification, and that classification is operator-independent, thus enabling objective comparison of results obtained in various areas. In short, the method is an objective tool whose results may serve as the basis for landscape planning and management.
Received: 11 July 2014. Accepted: 16 January 2015.

FIGURE 1 MAP OF THE STUDY AREA, HUELVA, IN SPAIN

FIGURE 2 DISTRIBUTION OF SQUARES ASSIGNED TO GROUPS (I AND II) BY TWINSPAN LEVEL 1 DIVISION, AND OF SQUARES ASSIGNED TO GROUPS (1, 2, 3, 4) BY TWINSPAN LEVEL 2 DIVISION

FIGURE 3 DISTRIBUTION OF SQUARES ASSIGNED TO GROUPS 1 TO 8 (LANDSCAPE TYPES) BY TWINSPAN LEVEL 3 DIVISION

FIGURE 4 CORRESPONDENCE DIAGRAM OF THE FIRST TWO AXES OF THE DCA FOR THE GRID CELLS OF HUELVA. MATCH WITH TWINSPAN GROUPS IS INDICATED BY INDICES (I AND II; 1, 2, 3 AND 4) FOR LEVELS 1 AND 2, AND BY GREYSCALE TONES AND SYMBOLS FOR LANDSCAPE TYPES (LEVEL 3). THE CENTROID FOR EACH LANDSCAPE TYPE HAS BEEN DRAWN

TABLE 1 LAND-USE AND LAND-COVER CLASSES USED
Source: Authors.
- Lithology. Data were obtained from the Thematic Map of Andalusia, Physical Environment, scale 1:100,000 (table 2).
- Relief. Relief data were taken from the Digital Map of Elevations (MDE), Andalusia, scale 1:50,000, one of a set of thematic maps drawn up by REDIAM. The variables (table 3) were calculated using the Spatial Analyst extension of the ArcView 3.2 software package, and applicable scripts such as Texture and DEMAT.