# Geographically Weighted Regression to Model Housing Prices

Published: *Wed, 18 Oct 2017*

## Introduction

In Chapter 2, HPM was used to model the relationships between house prices and the characteristics of properties and neighbourhoods. However, HPM treats the whole housing market as a single homogeneous market and assumes a stationary process, i.e. the parameter estimates are assumed to apply equally over space. This presumes that the influences of various factors on house prices in one location are the same as those in another location, so that space, place and location do not matter (Foster refer).

However, as shown in Chapter 2, the residuals derived using HPM are correlated. Additionally, Chapter 3 shows that when the MLM approach is employed to account for spatial heterogeneity, the effects of those various factors in fact vary across neighbourhoods at different scales, and there are great price differentials between neighbourhoods. A global approach such as HPM masks these local deviations from the average relationship.

### Disadvantages of MLM

Although the MLM approach takes spatial heterogeneity into account by specifying the spatial units as levels in the model, it has some weaknesses. Firstly, there is no agreement on the definition of neighbourhoods (Kearns and Parkinson 2001: 2103), so the specification of the macro-level units (i.e. neighbourhoods) is fairly arbitrary. In the past, census boundaries (…), administrative boundaries (….) and school catchment areas (Goodman) have all been used to delimit the whole housing market into smaller submarkets, or local neighbourhood areas. Some researchers have combined a series of datasets, such as travel-to-work, immigration and house price information, to construct so-called housing market areas (HMAs) (………..). HMAs match neither the census boundaries nor the administrative boundaries; instead, they represent………….. . The existence of spatial dependency in geographical data means that the observations that are most spatially dependent, in locations that are close to each other, should constitute a neighbourhood. A predefined hierarchy of spatial units based on administrative or census boundaries may not necessarily be appropriate.

Secondly, MLM[1] treats space as discrete and assumes that the same spatial process applies within each neighbourhood and discontinues at the neighbourhood boundaries (……). Additionally, the highest-level spatial units (for example, MSOAs in our analysis) are assumed to be spatially independent. This assumption is unrealistic because the “effect” of a neighbourhood is more likely to change gradually from one neighbourhood to its adjacent ones than to stop completely at the boundary, the so-called “spill-over” effect. Therefore, there might be spatial dependency between MSOAs that MLM is unable to capture.

In contrast, GWR (Brunsdon et al, 1996…..) relaxes the assumption that the effects of various variables are constant over space (Dark, 2004; Mitchell, 2005; Shi et al., 2006) and treats space as continuous. It locally calibrates a spatially varying coefficient regression model for each location of the study area by weighting the attributes of its neighbouring locations based on distance-decay functions (…….). The attributes of all the neighbours of a fitted location are considered, so spatial dependency and heterogeneity can both be taken into account in this approach (Paez 2005). This chapter therefore introduces this type of modelling technique to explore the spatial variations that may exist in the relationships between house price and its predictors.

### Purpose and Structure of the Chapter

The aim of this chapter is to identify whether the relationships between house prices and a range of house characteristics and neighbourhood attributes are relatively stable or vary substantially over space. If there are spatial variations, how do the relationships vary within and between neighbourhoods, and how does this variation differ from the results derived from the MLM approach? In addition, how good is the GWR approach in terms of its predictive capability, compared with MLM?

In the next section, a brief description of this technique is given. Section 3 follows with a review of previous applications of GWR. The proposed study in relation to the empirical implementation of the technique then follows in Section 4. The final section summarises the comparison between the results of GWR and MLM and discusses the appropriateness of both techniques.

## 4.2 Brief Description of GWR Models

### What is GWR?

The GWR technique is fully described by Fotheringham et al. (2002)[2], so only a brief description of the approach is presented here. GWR is a spatial analysis technique that takes into account spatial autocorrelation among the observations in surrounding locations by allowing for spatial nonstationarity in the linear regression coefficients for each location. In the GWR literature, the “location” can be a point or an aggregated area.

GWR describes local geographical variations in the relationships between a response variable and its explanatory variables through a set of local estimates of all the predictors for each geographical location (Fotheringham et al. 2002). A set of estimates and standard errors for each local coefficient is produced by focusing on each location in the study region and a weighted matrix of its nearby observations.

The basic GWR equation can be written as:

$$
y_i = \beta_0(u_i, v_i) + \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \tag{4.1}
$$

where $(u_i, v_i)$ denotes the coordinates of the $i$th point in a two-dimensional study area, $y_i$ is the dependent variable at point $i$, $\beta_0(u_i, v_i)$ is the estimated intercept at point $i$, $\beta_k(u_i, v_i)$ represents the estimated coefficient for variable $k$ at point $i$, $x_{ik}$ is the independent variable of the $k$th parameter at location $i$, and $\varepsilon_i$ is the error term for the local model at point $i$.

The estimation of $\beta(u_i, v_i)$ is derived using weighted least squares (WLS) regressions (Moore and Myers, 2010; Fotheringham et al., 2002) by weighting the observations near location $i$ in accordance with their distance to that fit point. It is given by:

$$
\hat{\beta}(u_i, v_i) = \left(X^{T} W(u_i, v_i)\, X\right)^{-1} X^{T} W(u_i, v_i)\, y
$$

where $W(u_i, v_i)$ is a diagonal matrix denoting the geographical weighting of the observations around the fit point $i$.
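The weighted least squares step can be illustrated with a minimal sketch, assuming a model with an intercept and a single predictor so that the normal equations can be solved in closed form. The function names (`gaussian_weight`, `local_wls`, `gwr_fit_at`), the Gaussian weighting and the toy data are assumptions made for illustration and are not taken from the planned research, which would use the GWmodel package in R.

```python
import math


def gaussian_weight(d, b):
    """Gaussian distance-decay weight for distance d and bandwidth b (assumed kernel)."""
    return math.exp(-0.5 * (d / b) ** 2)


def local_wls(xs, ys, weights):
    """Weighted least squares for y = b0 + b1 * x.

    Solves the 2x2 normal equations (X'WX) beta = X'Wy in closed form.
    """
    sw = sum(weights)
    swx = sum(w * x for w, x in zip(weights, xs))
    swy = sum(w * y for w, y in zip(weights, ys))
    swxx = sum(w * x * x for w, x in zip(weights, xs))
    swxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    det = sw * swxx - swx * swx
    if abs(det) <= 1e-12 * sw * swxx:
        raise ValueError("local design matrix is (near) singular; widen the bandwidth")
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1


def gwr_fit_at(fit_point, coords, xs, ys, bandwidth):
    """Calibrate the local model at fit_point, weighting observations by distance."""
    weights = [gaussian_weight(math.dist(fit_point, c), bandwidth) for c in coords]
    return local_wls(xs, ys, weights)


# Hypothetical toy data: four houses on a line, price = 2 + 3 * size exactly.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
sizes = [0.0, 1.0, 2.0, 3.0]
prices = [2.0, 5.0, 8.0, 11.0]
b0, b1 = gwr_fit_at((0.0, 0.0), coords, sizes, prices, bandwidth=1.5)
```

Because the toy relationship is the same everywhere, the local fit recovers the global intercept and slope up to rounding. With spatially varying data, repeating `gwr_fit_at` for every fit point would give one pair of coefficients per location.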

#### Weighting

The weighting is based on the distance between the regression location and its neighbours, scaled by the bandwidth. Points in closer proximity to location $i$ are given more weight and therefore have more influence on the estimation of $\beta(u_i, v_i)$ than observations that are further away from location $i$. A number of weighting schemes are available, but they tend to be Gaussian or “Gaussian-like” functions, which reflect the type of dependency generally found in spatial processes (Fotheringham). Two commonly used distance-decay functions in GWR are the Gaussian and bi-square functions (Fotheringham et al. 2002), which are expressed as follows:

Gaussian

$$
w_{ij} = \exp\left(-\frac{1}{2}\left(\frac{d_{ij}}{b}\right)^{2}\right)
$$

Bi-square

$$
w_{ij} =
\begin{cases}
\left(1 - \left(d_{ij}/b\right)^{2}\right)^{2} & \text{if } d_{ij} < b \\
0 & \text{otherwise}
\end{cases}
$$

where $w_{ij}$ is the $j$th element of the diagonal of the matrix of geographical weights $W(u_i, v_i)$, $b$ is the bandwidth, which for the bi-square function acts as a threshold distance beyond which observations are not used for calibrating the local model, and $d_{ij}$ represents the distance between observation $j$ and focus point $i$. When $i$ and $j$ coincide, the weighting equals 1.

**Source:** Gollini et al (2014) GWmodel: an R Package for Exploring Spatial Heterogeneity using Geographically Weighted Models

Both functions are continuous up to the bandwidth, but the weights of the bi-square function decrease faster than those of the Gaussian function and eventually become zero at the boundary of the bandwidth, whereas the weights of the Gaussian function never become zero. Both weighting functions will be tried in the planned research.
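The difference between the two kernels can be seen in a short sketch. It assumes the usual forms of the Gaussian and bi-square functions, and the function names are hypothetical.

```python
import math


def gaussian_weight(d, b):
    """Gaussian kernel: decays smoothly and never reaches zero."""
    return math.exp(-0.5 * (d / b) ** 2)


def bisquare_weight(d, b):
    """Bi-square kernel: decays to exactly zero at the bandwidth b."""
    if d >= b:
        return 0.0
    return (1.0 - (d / b) ** 2) ** 2


# Compare the two kernels at a few distances for an assumed bandwidth of 1.0.
bandwidth = 1.0
for d in (0.0, 0.5, 1.0, 2.0):
    print(d, gaussian_weight(d, bandwidth), bisquare_weight(d, bandwidth))
```

At distance zero both weights equal 1. At and beyond the bandwidth the bi-square weight is zero, while the Gaussian weight is small but still positive.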

#### Bandwidth

Bandwidths can be specified as either fixed or adaptive (in terms of physical distance). The physical distance of an adaptive bandwidth changes according to the spatial density of the data so as to capture a fixed number of nearest neighbours for each local model: a shorter distance where observations are dense and a longer distance where data are sparse. The benefit of using an adaptive bandwidth is that it ensures sufficient local information is used in areas where observations are spatially sparse, reducing the variance of the local coefficient estimates, while still revealing subtle local variations where observations are dense (Fotheringham et al. 2002). Therefore, an adaptive bandwidth will be used in the planned research, as the density of house price data varies geographically.
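One common way to realise an adaptive bandwidth is to set it to the distance from the fit point to its k-th nearest observation. The sketch below illustrates that idea under this assumption; the function name and the coordinates are hypothetical.

```python
import math


def adaptive_bandwidth(fit_point, coords, k):
    """Return the distance from fit_point to its k-th nearest observation.

    If fit_point is itself one of the observations, it counts as the
    nearest one (distance zero).
    """
    if not 1 <= k <= len(coords):
        raise ValueError("k must be between 1 and the number of observations")
    distances = sorted(math.dist(fit_point, c) for c in coords)
    return distances[k - 1]


# Hypothetical layout: a dense cluster near the origin and a few scattered points.
coords = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (9.0, 9.0), (5.0, 9.0)]
dense = adaptive_bandwidth((0.0, 0.0), coords, k=3)
sparse = adaptive_bandwidth((5.0, 5.0), coords, k=3)
```

With the same `k`, the bandwidth is short inside the cluster and much longer at the isolated point, which is the behaviour described above.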

The size of the bandwidth affects the gradient of the kernel and thus the rate of decay of the weighting function. A small bandwidth includes fewer observations in the local model and decays rapidly, whereas a large bandwidth includes more observations and gives a smoother weighting scheme. The size of the bandwidth is important: if it is too small, the model fits the local observations better, but local noise may also be fitted, so the local estimates will have large variances. Conversely, if the bandwidth is too large, the variances become smaller, but the local coefficient estimates are based on a much larger area and are therefore biased, masking the true local relationships, especially if the relationships vary dramatically over small areas. This is the so-called bias-variance trade-off (Fotheringham et al., 2002)[3]. The effective number, a measure of the number of observations that have been used effectively for calibrating the local model, can be used to reflect the bias-variance trade-off in GWR.

#### Bias-Variance Trade-Off

To find the best bias-variance trade-off, an appropriate weighting function and an optimal bandwidth need to be selected. It has been argued that the selection of the bandwidth is far more important than that of the weighting scheme, as the weights decrease with distance under all weighting functions, but the size of the bandwidth decides the degree of decay (Fotheringham…). The optimization process is generally exploratory and can be very compute-intensive, as it requires all the local regressions to be fitted at each step[4]. It can be carried out using either cross-validation or the corrected Akaike information criterion (AICc) (Fotheringham et al. 2002).

Leave-one-out cross-validation (LOOCV) is a commonly used cross-validation method in GWR: each local model is calibrated using all the cases except for one observation and is then tested on that single observation. The bandwidth that produces the smallest root mean square prediction error of the dependent variable across all the local models is deemed the optimal bandwidth. AICc is an indicator of goodness-of-fit that can be used to compare competing models while taking into account the complexity of a model. A lower AIC score indicates a better fit. As a rule of thumb, a decrease of 3 in the AIC score between two competing models indicates an improvement in fit for the model with the lower AIC (Fotheringham et al 2002; Zhang et al., 2011).
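The LOOCV procedure can be sketched as follows, assuming a fixed Gaussian bandwidth, an intercept and a single predictor, and a small grid of candidate bandwidths. All names and the toy data are hypothetical; in practice a package routine would search the bandwidth more efficiently.

```python
import math


def gaussian_weight(d, b):
    """Gaussian distance-decay weight (assumed kernel)."""
    return math.exp(-0.5 * (d / b) ** 2)


def local_predict(i, coords, xs, ys, b):
    """Predict y at observation i from a local WLS fit that leaves i out."""
    sw = swx = swy = swxx = swxy = 0.0
    for j, (c, x, y) in enumerate(zip(coords, xs, ys)):
        if j == i:
            continue  # leave-one-out: observation i does not inform its own fit
        w = gaussian_weight(math.dist(coords[i], c), b)
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    det = sw * swxx - swx * swx
    if abs(det) <= 1e-12 * sw * swxx:
        raise ValueError("local fit is (near) singular for this bandwidth")
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0 + b1 * xs[i]


def cv_rmse(coords, xs, ys, b):
    """Root mean square leave-one-out prediction error for bandwidth b."""
    n = len(ys)
    sq_errors = [(ys[i] - local_predict(i, coords, xs, ys, b)) ** 2 for i in range(n)]
    return math.sqrt(sum(sq_errors) / n)


def select_bandwidth(coords, xs, ys, candidates):
    """Return the candidate bandwidth with the smallest CV score."""
    return min(candidates, key=lambda b: cv_rmse(coords, xs, ys, b))


# Hypothetical data: ten points on a line whose slope drifts from west to east.
coords = [(float(u), 0.0) for u in range(10)]
xs = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 10.0]
ys = [(1.0 + 0.5 * c[0]) * x for c, x in zip(coords, xs)]
candidates = [1.0, 2.0, 4.0, 8.0]
best = select_bandwidth(coords, xs, ys, candidates)
```

Each candidate requires one local regression per observation, which is why the search becomes expensive for large datasets.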

It is common, though, to obtain different optimal bandwidths from the two methods, as the criterion for “optimal” differs between AICc and CV[5] and the AIC value is not based on prediction of the dependent variable (…[6]..). In addition, the AIC score can be corrected for small sample sizes, while the classical CV method tends to produce under-smoothed results for small sample sizes[7]. One thing to note is that AIC should be avoided when the sample size is large, as it requires the creation of an n by n matrix[8], so the optimization can be very slow[9]. Both methods will be tried out in the planned research.
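For reference, the form of AICc usually quoted for GWR is sketched below. It is stated here as an assumption about the intended definition rather than taken from the original text, and the symbols $\hat{\sigma}$ (estimated standard deviation of the error term) and $S$ (the $n \times n$ hat matrix that maps observed to fitted values) are introduced only for this illustration:

$$
\mathrm{AICc} = 2n \ln(\hat{\sigma}) + n \ln(2\pi) + n\,\frac{n + \operatorname{tr}(S)}{n - 2 - \operatorname{tr}(S)}
$$

Here $n$ is the sample size and $\operatorname{tr}(S)$ plays the role of the effective number of parameters. The need to build $S$ is what makes the criterion costly for large samples.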

### Why Use GWR and when?

As mentioned earlier, when there is spatial dependency between variables and spatial non-stationarity, GWR can be used to disaggregate global relationships to the local level to obtain a more detailed understanding of spatial data. As every local model is fitted to local observations, it fits the data better than a global model, and the residuals are generally lower and less spatially dependent. The outputs, the estimates of the local coefficients, are specific to each location.

In Chapter 2, Moran’s I was used and indicated that there is statistically significant spatial autocorrelation in both house prices and the residuals of the HPM results. This means that the globally fitted coefficient values of HPM do not adequately represent detailed local variations, and GWR should be used in this instance to take into account the spatial dependency and examine the heterogeneity in the housing market.
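As a reminder of what the statistic measures, the following is a minimal sketch of global Moran's I, assuming a binary contiguity weights matrix. The function names and toy values are hypothetical, and no significance test is included.

```python
def chain_weights(n):
    """Binary contiguity weights for n locations arranged in a line (assumed layout)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


def morans_i(values, weights):
    """Global Moran's I for values observed at locations linked by weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    total = sum(d * d for d in dev)
    return (n / s0) * cross / total


# A smooth gradient (similar neighbours) versus an alternating pattern.
trend = morans_i([1.0, 2.0, 3.0, 4.0, 5.0], chain_weights(5))
alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain_weights(4))
```

The gradient gives a positive value, indicating that neighbouring values are alike, while the alternating pattern gives a negative one. Applying the same function to model residuals would show whether spatial structure remains unexplained.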

## A Review of the GWR Approach in House Price Estimation

This section reviews the application of the GWR technique with a focus on residential real estate, as well as comparisons of GWR with a range of other methodologies. The section concludes by identifying the research gap and thus the contribution of the current chapter.

### Application in Real Estate Valuation

GWR has been applied in a number of fields, including land use (Geniaux et al. 2011….), environment (Harris et al. 2010a), health (Comber et al. 2011, Helbich et al. 2012b, Yang and Matthews 2012; [10]), crime studies (Leitner and Helbich 2011), economics ([11]), regional studies ([12]) and residential real estate studies (Kestens et al. 2006; Bitter et al. 2007…………). In terms of the application to real estate, GWR has been used to investigate the effects of location and surrounding neighbourhood characteristics, such as …………………, the effects of accessibility, such as the new bus transitway in….. (Mulley, 2013), infrastructure availability in ………. (Cellmer, 2012), and the effects of open space amenities (Nilsson, 2014).

GWR has also been used to identify housing sub-markets (Borst & McCluskey, 2007; Crespo & Grêt-Regamey, 2013; Helbich, Brunauer, Hagenauer, & Leitner, 2013).

### GWR compared with other modelling techniques

GWR has also been compared with a few valuation tools in real estate, such as multiple regression analysis (MRA), the simultaneous autoregressive model (SAR), artificial neural networks (ANN), the spatial expansion method (SEM) and the spatial lag model (e.g., Brunsdon et al., 1999[13]; LeSage 1999[14]; Bitter, Mulligan, & Dall’erba, 2006; Helbich, Brunauer, Vaz, & Nijkamp, 2013; McCluskey, McCord, Davis, Haran, & McIlhatton, 2013; Yu, Wei, & Wu, 2007).

More specifically, Bitter, Mulligan, & Dall’erba (2006) demonstrated that GWR was superior to the spatial expansion method ( define briefly ….) in terms of predictive accuracy and explanatory power when applied to examine the marginal price of key housing attributes in the Tucson, Arizona housing market. McCluskey, McCord, Davis, Haran, & McIlhatton (2013) also showed that GWR outperformed MRA, ANN and SAR in terms of predictive accuracy, transparency and cost-effectiveness when applied to 2,694 residential properties for real estate price estimation. In a case study of spatial heterogeneity in Austria, Helbich, Brunauer, Vaz, et al. (2013) extended GWR to a mixed GWR (MGWR), which allows some coefficients to be stationary and others to be non-stationary. This approach is more flexible and parsimonious than standard GWR (Wei and Qi, 2012). Both MGWR and GWR have smaller prediction errors than global approaches such as OLS, SAR and the spatial two-stage least squares procedure (S2SLS)[15].

There are other extensions of GWR. To deal with cross-sectional time series data, GTWR (Huang, Wu, & Barry, 2010) was developed to integrate both temporal and spatial information in the weighting matrices to capture spatial and temporal dependency and heterogeneity[16]. GTWR is able to model spatial and temporal nonstationarity simultaneously and therefore offers a better goodness-of-fit. LeSage (2003) incorporated a Bayesian treatment into GWR in order to improve the estimates of GWR parameters. Contextualized Geographically Weighted Regression (CGWR) was developed by adding contextual variables into standard GWR. The research applied this approach to model spatial heterogeneity in the land parcel prices of Beijing, China, and demonstrated that the incorporation of contextual information improved the model fit.

However, multicollinearity between explanatory variables may produce unstable results in GWR models and causes more problems for GWR than for a global regression model (Lloyd 2007). Therefore, extreme caution should be exercised when analysing the spatial patterns of local coefficients derived from GWR (Wheeler & Tiefelsdorf, 2005). A range of diagnostic tools has been proposed, and using PCA to identify the most influential predictors or integrating ridge regression into the GWR framework (D. C. Wheeler, 2007) can help stabilize GWR regression coefficients.

There is only limited comparison of GWR with MLM, or the random coefficient model (RCM). The two approaches are very different in terms of their underlying assumptions about the spatial process and yielded completely different results in a study of long-term illness in the UK (Brunsdon, Aitkin, Fotheringham, & Charlton, 1999).

There has been no published research that compares GWR with MLM in terms of their capability to model the spatial heterogeneity of house price data and their predictive accuracy. In addition, although GWR can be applied at any geographic scale of measurement, in practice many applications and previous studies have applied it at a coarsely aggregated scale, owing to the availability of data or the need to keep information anonymised. Unlike previous studies, we have geocoded the “location” of each house based on its unit postcode, which typically contains only around 15 residential addresses[17]. We hope to offer further insight into the geographical variation of the relationships at this detailed level, which might have been disguised in previous research where the analysis was carried out at a much coarser scale.

## Planned Research

Standard GWR will be applied to the same dataset as in Chapters 2 and 3, the house price data of the Greater Bristol area. Two extended versions of GWR, GTWR and CGWR, will be explored, the former to capture temporal dependency and heterogeneity and the latter to incorporate contextual information into the model. In GWR and CGWR, the whole dataset will be split into yearly data to avoid potential temporal autocorrelation within the data. There is no need to do so in GTWR, as the time of sale is taken into account in the model.

Individual house characteristics, all of which are categorical variables as described in Chapter 2, will be modelled first, and neighbourhood variables will then be added in the subsequent models.

The planned procedures and a few methodological issues are addressed as follows. Firstly, before carrying out the actual GWR modelling, the data will be tested for significant spatial autocorrelation, which can be between the response variable and its lagged values or between the explanatory variables and their lagged values. The two most commonly used weighting functions, the Gaussian and bi-square functions, will be used, although it has been shown that the selection of the weighting function does not have as much effect on the results as the selection of the bandwidth (Fotheringham, Brunsdon, and Charlton 1998). If this is the case, just one weighting function will be used in the subsequent yearly models and the focus will be on the optimization of the bandwidth. An adaptive bandwidth is proposed, as the housing stock in Greater Bristol is a good mixture of rural and urban and the density of house sales varies dramatically over space. Both CV and AIC will be used to obtain the optimal bandwidth and measure model fit, as it has been shown in the past that the two methods result in different optimal bandwidths and regression coefficients ([18]).

Once a weighting function and bandwidth have been selected, the weighting matrix can be defined and used to estimate the coefficients for every location based on equation (4.1), thereby calibrating the local GWR models. The standardised residuals, the parameters and their estimated standard errors will be mapped to investigate whether they vary spatially[19]. These maps will also be compared with the map of the shrinkage estimates of the neighbourhoods (OAs, LSOAs and MSOAs) derived by MLM in previous chapters. It is expected that the mapped patterns of the MLM coefficients will exhibit more “noise” than those of GWR, since GWR is essentially a spatially smoothing calibration. All of the model calibration will be conducted in R, using the GWmodel package, as this software is free and the process can be easily replicated.

Lastly, the predictive accuracy of GWR will be measured and compared with that of MLM. R squared is used for the goodness of fit of the model and measures the proportion of variation in the data that is explained by the model. Adjusted R squared takes into account the complexity of the model in terms of the number of variables that are specified in the model. It is expected that the extended versions of GWR, GTWR and CGWR, may provide better model fit and more accurate predictions, based on their previous applications.
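The two goodness-of-fit measures can be written down in a few lines. This is a sketch using the textbook definitions, with hypothetical function names and figures. For GWR the count of predictors `p` is often replaced by an effective number of parameters, which is noted here as an assumption rather than something stated in the text.

```python
def r_squared(observed, predicted):
    """Proportion of the variation in observed that is explained by predicted."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Penalise R squared for model complexity: n observations, p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical example: sale prices (in thousands) and model predictions.
prices = [200.0, 250.0, 300.0, 350.0]
predictions = [210.0, 240.0, 310.0, 340.0]
r2 = r_squared(prices, predictions)
adj = adjusted_r_squared(r2, n=len(prices), p=1)
```

Because the adjustment only penalises, the adjusted value never exceeds the unadjusted one for a model with at least one predictor, so it is the fairer basis for comparing models of different complexity.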

In the past, GWR has been criticised because it cannot produce confidence intervals (………..) and the significance of the parameter estimates cannot be tested. However, Monte Carlo significance tests have been used to test whether there is significant variability (………..), so this test is also planned in order to establish whether the spatial variation of the coefficients is statistically significant. The “wild bootstrap” approach suggested by Härdle (1990) and McMillen (2004) can also be used to produce a weighted average of the variance of the separate parameter estimates.

## Conclusion

GWR generally gives much better fits to the data, and the residuals are less autocorrelated. Its advantages over MLM are that it no longer treats space as discrete, which more closely resembles the spatial process in reality, and that it models both spatial dependency and heterogeneity. In addition, it is essentially a non-parametric approach that does not require any assumptions with respect to the predictors, which can be categorical or have highly skewed underlying distributions. There is no need to specify a functional form to produce the estimates of spatially varying parameters (Brunsdon et al 1998). The underlying concept of “letting the data speak for themselves” makes it a good exploratory tool[20] for spatial analysis. This concept is very similar to that of another modelling technique, ANN, except that in ANN there is no implication that nearer locations have more influence on the estimates of local coefficients than locations that are further away, as there is in GWR. Although this is unlikely in reality, it might happen. How GWR compares with ANN will be discussed in the next chapter.

Link GWR and ANN: a set of estimates of spatially varying parameters WITHOUT specifying a functional form – “let the data speak for themselves” (Chris et al 1998)

[1] the parameter estimates are assumed to be randomly distributed with either a finite (Wedel and Kamakura 2000) or a continuous mixture distribution (Aitkin 1996).

[2] And Legendre, 1993

[3] Check: Bias-variance trade-off: MLM (Goldstein 1987) and Ridge Regression (Hoerl and Kennard 1970a, 1970b)

[4] Check reference: Schabenberger and Gotway (2005, pp. 316-317), Statistical Methods for Spatial Data Analysis; Waller and Gotway (2004, p. 434), Applied Spatial Statistics; and Lloyd (2007, pp. 79-86), Local Models for Spatial Analysis

[5] http://webhelp.esri.com/arcgisdesktop/9.3/body.cfm?tocVisable=1&ID=-1&TopicName=Interpreting GWR results

[6] Housing Sub-markets and Hedonic Price Analysis: A Bayesian Approach, by David C. Wheeler, Antonio Páez, Lance A. Waller and Jamie Spinney

[7] Encyclopedia of Geographic Information Science, edited by Karen Kemp (p183)

[8] (gwr.sel {spgwr})

[9] NOTE AIC be applied in non-Gaussian GWR( Local Models for Spatial Analysis, Second Edition By Christopher D. Lloyd) ???

[10] Modelling spatially varying impacts of socioeconomic predictors on mortality outcomes, J Geograph Syst (2003) 5:161–184, DOI: 10.1007/s10109-003-0099-7, proposed for modelling spatially varying, predictor effects on a disease or mortality count outcome – The methodology is illustrated by suicide mortality in 32 London Boroughs over the period 1979–1993, in terms of area deprivation and a measure of social fragmentation disease mapping methods

[11] Spatial Heterogeneity and the Wage Curve Revisited, Simonetta Longhi (ISER), Peter Nijkamp

[12] The Geographic Diversity of U.S. Nonmetropolitan Growth Dynamics: A Geographically Weighted Regression Approach, Mark D. Partridge, Dan S. Rickman, Kamar Ali, and M. Rose Olfert: test for geographic heterogeneity in the growth parameters and compare them to global regression estimates. The results indicate significant heterogeneity in the regression coefficients across the country, most notably for amenities and college graduate shares. Using GWR also exposes significant local variations that are masked by global estimates

[13] A Comparison of Random Coefficient Modelling and Geographically Weighted Regression for Spatially Non-Stationary Regression Problems, Geographical and Environmental Modelling, 3 (1), 47–62
