Hierarchical clustering with spatial constraints in tuberculosis data
International Journal of Development Research
Received 08th January, 2020; Received in revised form 14th February, 2020; Accepted 20th March, 2020; Published online 30th April, 2020
Copyright © 2020, Dalila Camêlo Aguiar et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This study examines socio-epidemiological variables of tuberculosis (TB) in the State of Paraíba, Brazil, using clustering with spatial (geographical) constraints. To apply Ward's hierarchical clustering method, two dissimilarity matrices were calculated: the first, D_0, gives the dissimilarities in the feature space, computed from the socio-epidemiological variables; the second, D_1, gives the dissimilarities in the constraint space, computed from the geographical distances. The two matrices are combined through a mixing parameter α, and the weight w used in calculating the dissimilarity matrix was taken to be the collective inequality index. Statistical analyses were undertaken in R. With D_0 alone, the clusters are dispersed and not strictly contiguous; the five clusters are marked mainly by the high proportion of new cases. Introducing D_1 with α=0.1 yields geographically more compact clusters, slightly favoring socioeconomic homogeneity (24%) versus geographical homogeneity (64%), mainly influenced by clusters 1 and 3. With α=0.2, both socio-epidemiological and geographic homogeneity are favored; although the clusters are more compact, this partition is slightly worse than the previous one because it gives more importance to the neighborhoods. The method is shown to be feasible for epidemiological studies that seek a joint understanding of factors of different dimensions, aggregated from a spatial perspective.
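The role of the mixing parameter α can be illustrated with a minimal sketch. It is not the code used in the study (the analyses were run in R); it assumes that both matrices are scaled by their largest entry, that the squared dissimilarities are combined as (1 - α)·D_0² + α·D_1², and that all units carry the same weight w = 1/n. The function names `scale` and `mixed_ward` and the four-unit toy matrices are hypothetical.

```python
from itertools import combinations


def scale(d):
    """Divide a square dissimilarity matrix by its largest entry (assumed scaling)."""
    m = max(max(row) for row in d)
    return [[v / m for v in row] for row in d]


def mixed_ward(d0, d1, alpha, k):
    """Ward-like agglomeration on (1 - alpha) * D0^2 + alpha * D1^2.

    Assumes uniform unit weights w = 1/n. Returns k clusters, each a sorted
    list of unit indices.
    """
    n = len(d0)
    d0, d1 = scale(d0), scale(d1)

    # Initial merge cost of two singletons: (w_i * w_j / (w_i + w_j)) * d_ij^2,
    # which equals d_ij^2 / (2n) when every weight is 1/n.
    delta = {}
    for i, j in combinations(range(n), 2):
        mix = (1 - alpha) * d0[i][j] ** 2 + alpha * d1[i][j] ** 2
        delta[(i, j)] = mix / (2 * n)

    clusters = {i: [i] for i in range(n)}
    next_id = n
    while len(clusters) > k:
        # Merge the pair of clusters with the smallest mixed merge cost.
        (a, b), dab = min(delta.items(), key=lambda kv: kv[1])
        na, nb = len(clusters[a]), len(clusters[b])
        merged = clusters.pop(a) + clusters.pop(b)

        # Lance-Williams update for Ward's criterion.
        new = {}
        for c, members in clusters.items():
            nc = len(members)
            dac = delta[(min(a, c), max(a, c))]
            dbc = delta[(min(b, c), max(b, c))]
            new[(c, next_id)] = (
                (na + nc) * dac + (nb + nc) * dbc - nc * dab
            ) / (na + nb + nc)

        # Drop costs involving the two merged clusters, then add the new ones.
        delta = {p: v for p, v in delta.items() if a not in p and b not in p}
        delta.update(new)
        clusters[next_id] = merged
        next_id += 1

    return sorted(sorted(c) for c in clusters.values())


# Toy data: units 0 and 2 (and 1 and 3) are alike in the feature space,
# while units 0 and 1 (and 2 and 3) are geographic neighbors.
D0 = [[0, 10, 1, 10], [10, 0, 10, 1], [1, 10, 0, 10], [10, 1, 10, 0]]
D1 = [[0, 1, 10, 10], [1, 0, 10, 10], [10, 10, 0, 1], [10, 10, 1, 0]]

feature_only = mixed_ward(D0, D1, alpha=0.0, k=2)
geography_only = mixed_ward(D0, D1, alpha=1.0, k=2)
```

In this toy example, α=0 groups the units by feature similarity and α=1 groups them by geographic proximity; intermediate values such as the α=0.1 and α=0.2 discussed above trade one kind of homogeneity against the other.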