Statistics and Its Interface
Volume 5 (2012)
Bayesian areal wombling using false discovery rates
Pages: 149 – 158
Spatial data arising in public health services are often reported as case counts or rates aggregated over areal regions (e.g., counties, census tracts or ZIP codes), rather than being referenced with respect to the geographical coordinates of individual residences. For such areal data, subsequent inferential interest often resides in the formal identification of “barriers”, or “difference boundaries”, on the map, where “boundary” refers to a border with sharp changes in outcome on either side. This boundary detection problem is often referred to as “wombling” or, more specifically, “areal wombling” for aggregated areal data, after a foundational article by Womble (1951). Existing statistical frameworks for areal wombling usually follow a two-stage procedure: (i) estimate the spatial effects from an appropriate spatial model, and (ii) detect boundaries by applying appropriate discrepancy metrics to those estimates. Lu and Carlin (2005), and several subsequent articles, have explored areal wombling within this framework.
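The two-stage procedure described above can be illustrated with a minimal sketch. It assumes that stage (i) has already produced posterior draws of the spatial effects, and it uses the posterior mean absolute difference between adjacent regions as the discrepancy metric in stage (ii). The function name `edge_discrepancies`, the toy draws and the adjacency list are hypothetical and are not taken from the article; other discrepancy metrics are equally possible.

```python
def edge_discrepancies(samples, edges):
    """Posterior mean of |effect_i - effect_j| for each adjacent pair (i, j).

    samples: list of posterior draws, each a list of spatial effects by region
             (assumed to come from a previously fitted spatial model).
    edges:   list of (i, j) index pairs for regions that share a border.
    """
    n_draws = len(samples)
    return {
        (i, j): sum(abs(draw[i] - draw[j]) for draw in samples) / n_draws
        for (i, j) in edges
    }


# Hypothetical posterior draws for three regions (illustrative numbers only).
samples = [
    [0.1, 0.2, 1.5],
    [0.0, 0.3, 1.4],
    [0.2, 0.1, 1.6],
]
edges = [(0, 1), (1, 2)]  # region 0 borders region 1, region 1 borders region 2

discrepancy = edge_discrepancies(samples, edges)

# Rank borders from most to least likely to be a difference boundary.
ranked = sorted(discrepancy, key=discrepancy.get, reverse=True)
```

In this toy example the border between regions 1 and 2 ranks first, because the effect of region 2 is far from that of its neighbour in every draw. A ranking of this kind does not by itself say how many borders should be declared boundaries.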
This article treats wombling as a hypothesis-testing problem, where we are testing a substantial number of hypotheses, one for each geographical boundary, and seek to provide policy-makers and analysts with a final set of difference boundaries. Here we must reckon with a lurking multiplicity problem arising from the large number of individual hypotheses we are testing. We proffer a computationally feasible framework to estimate hierarchical spatial models that account for dependence between adjacent regions and test for equality of spatial effects, while adjusting for multiplicities using false discovery rates (FDR); see, e.g., Benjamini and Hochberg (1995). A simulation study first illustrates and assesses the new approach, which is then applied to detect boundaries on a county map of Minnesota that records pneumonia and influenza hospitalization rates from the SEER-Medicare program.
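As a rough illustration of how an FDR adjustment can turn edge-level evidence into a final set of difference boundaries, the sketch below applies a common posterior-expected-FDR thresholding rule. This rule is an assumption made for illustration: the abstract does not specify the article's exact test or FDR procedure. The function name `select_boundaries`, the edge labels and the posterior probabilities are likewise hypothetical.

```python
def select_boundaries(post_prob, target_fdr=0.05):
    """Flag the largest set of edges whose estimated Bayesian FDR <= target_fdr.

    post_prob: dict mapping each edge to the posterior probability that the
               spatial effects on its two sides differ (assumed to be given).
    The estimated FDR of a flagged set is the average of (1 - probability)
    over the edges in that set.
    """
    ranked = sorted(post_prob.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    cumulative_error = 0.0
    for edge, prob in ranked:
        new_error = cumulative_error + (1.0 - prob)
        # Edges are visited in decreasing probability, so the running average
        # error never decreases; the first violation marks the cutoff.
        if new_error / (len(selected) + 1) > target_fdr:
            break
        selected.append(edge)
        cumulative_error = new_error
    return selected


# Hypothetical posterior probabilities for four county borders.
post_prob = {
    ("A", "B"): 0.99,
    ("B", "C"): 0.97,
    ("C", "D"): 0.90,
    ("A", "D"): 0.40,
}

boundaries = select_boundaries(post_prob, target_fdr=0.05)
stricter = select_boundaries(post_prob, target_fdr=0.03)
```

With a target of 0.05, the first three borders are flagged (average error 0.14 / 3, about 0.047), while adding the fourth would push the estimate to 0.185. Tightening the target to 0.03 keeps only the first two.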
areal data, Bayesian inference, hierarchical models, false discovery rates, spatial moving averages
2010 Mathematics Subject Classification
Primary 62F15, 62H11. Secondary 62F03.