Tuesday, November 11, 2014

On Culture 2

I want to follow up on my previous post on culture and elaborate on the topic through some further analysis.
Two questions come naturally after plotting the six cultural dimensions of Hofstede's framework:

  1. Do some dimensions correlate with each other? In other words, do countries that score high on one dimension also score high on another?
  2. Given these dimensions, which countries are most similar to one another?



Correlation Analysis


The first question can easily be answered with a simple Pearson Correlation analysis. I plot the results in the correlation matrix below.



Obviously, the strongest correlation is between Power Distance and Individualism. In this case the relation is negative, meaning that the higher a society scores on the Power Distance index, the lower it tends to score on the Individualism index (i.e. it will be a more collective society), and vice versa. Although this isn't exactly news to most people, it does leave several questions open: "Why do more collective societies not question the legitimacy of power as much as individualistic societies do?", "Is it the case that power left unchallenged results in collectivism, or is it that collective societal structures require a strong, unquestioned power?". Let me know what you think in the comments below.

The second strongest correlation is between Long-term Orientation and Indulgence. Again, the relation is negative, meaning that the more long-term oriented a society is, the less accepting it will be of gratification from satisfying natural human drives (i.e. it will be more restrained). This makes a lot of sense: by definition, long-term orientation manifests itself in behaviours such as saving and educating oneself for the future. These are behaviours that naturally require restraining yourself from immediate gratification to reach a goal in the future, for example when you choose to stay home and study for an exam instead of going out to party. The fact that this link is weaker than the first one suggests that there is more to it than that, and that indulgence does not necessarily result in short-termism.
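For readers who prefer plain R to the Rattle interface, here is a minimal sketch of how such a correlation matrix could be computed. The data below is synthetic stand-in data, not real Hofstede scores, and the column names (pdi, idv, mas, uai, lto, ivr) are assumed abbreviations for the six dimensions. A negative link between pdi and idv is built in on purpose, just to have something to look at.

```r
# Synthetic stand-in data, NOT real Hofstede scores.
# Column names are assumed abbreviations for the six dimensions.
set.seed(1)
n <- 40
toy <- data.frame(
  pdi = runif(n, 0, 100),
  mas = runif(n, 0, 100),
  uai = runif(n, 0, 100),
  lto = runif(n, 0, 100),
  ivr = runif(n, 0, 100)
)
# Build an artificial negative relation between pdi and idv
toy$idv <- 100 - toy$pdi + rnorm(n, sd = 15)

# Pearson correlation matrix of all dimensions
cor_mat <- cor(toy, method = "pearson")
print(round(cor_mat, 2))

# Find the strongest off-diagonal correlation
off <- cor_mat
diag(off) <- 0
idx <- which(abs(off) == max(abs(off)), arr.ind = TRUE)[1, ]
strongest <- c(rownames(off)[idx["row"]], colnames(off)[idx["col"]])
print(strongest)

# Significance test for a single pair of dimensions
print(cor.test(toy$pdi, toy$idv, method = "pearson"))
```

With real data, toy would simply be replaced by the six dimension columns of the data frame.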


Cluster Analysis


The second question can be answered with cluster analysis. I have always been a bit sceptical of the K-means method, so I go for hierarchical clustering using Ward's agglomeration method. I choose to cut the dendrogram right where the red line is, so I end up with 5 clusters.
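Cutting a dendrogram at a line and asking for a given number of clusters are two views of the same operation. The sketch below illustrates this on synthetic stand-in data (not real Hofstede scores); the names toy, tree and h_cut are made up for the example. It picks a cut height halfway between the merge that leaves 5 clusters and the merge that leaves 4, and compares the result with asking cutree() for k=5 directly.

```r
# Synthetic stand-in data: 30 "countries" x 6 dimensions,
# NOT real Hofstede scores
set.seed(42)
toy <- matrix(runif(30 * 6, min = 0, max = 100), nrow = 30)

# Hierarchical clustering on Euclidean distances, as in the post
tree <- hclust(dist(toy, method = "euclidean"), method = "ward.D")

# Option 1: ask directly for 5 clusters
by_k <- cutree(tree, k = 5)

# Option 2: cut at a height. After merge number (n - 5) there are
# 5 clusters left; the next merge reduces them to 4. Any height
# between those two merge heights yields 5 clusters.
n <- nrow(toy)
h_cut <- mean(tree$height[c(n - 5, n - 4)])
by_h <- cutree(tree, h = h_cut)

# Compare the two assignments and show the cluster sizes
print(identical(by_k, by_h))
print(table(by_k))
```

In practice the height is usually read off the plot, which is what placing the red line by eye amounts to.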






I've also run summary statistics, calculating the average of each dimension of Hofstede's model per cluster. This makes it easy to see that:
  1. Cluster one consists of countries with the lowest average Power Distance, Masculinity and Uncertainty Avoidance, and the highest average Individualism.
  2. Cluster two consists of the most collective societies, with the shortest-term orientation but the highest level of Indulgence.
  3. Cluster three is a more moderate segment of societies with relatively high Uncertainty Avoidance.
  4. Cluster four consists of the most long-term oriented and the most masculine societies. At the same time, it feels the strongest anxiety about the uncertain and is the second most restrained segment (i.e. it has the second lowest Indulgence).
  5. Cluster five has the highest Power Distance and the highest Restraint. While it is a collective segment, it is the second most long-term oriented.
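Statements like "highest" and "lowest" in the list above can be read straight off the table of per-cluster averages. Here is a minimal sketch of that step, again on synthetic stand-in data rather than real Hofstede scores, with assumed column names for the six dimensions.

```r
# Synthetic stand-in data, NOT real Hofstede scores.
# Column names are assumed abbreviations for the six dimensions.
set.seed(7)
toy <- data.frame(matrix(runif(30 * 6, 0, 100), nrow = 30))
names(toy) <- c("pdi", "idv", "mas", "uai", "lto", "ivr")

# Cluster on the six dimensions, then attach the cluster label
toy$clust <- cutree(hclust(dist(toy), method = "ward.D"), k = 5)

# Average of each dimension per cluster
means <- aggregate(. ~ clust, data = toy, FUN = mean)
print(round(means, 0))

# For each dimension, the cluster with the highest and lowest average
highest <- sapply(means[-1], function(col) means$clust[which.max(col)])
lowest  <- sapply(means[-1], function(col) means$clust[which.min(col)])
print(rbind(highest, lowest))
```

Because the data here is random, the resulting clusters carry no meaning; only the mechanics carry over.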












As before, here's my code:




# For the Correlation Matrix I use the Rattle User Interface
library(rattle)
rattle()


# I subset the data, omitting observations where data is not available
hofs.db_noNA <- na.omit(hofs.db[2:8])


library(stats)

#Create the distance matrix
db.to_clust<-dist(hofs.db_noNA[2:7], method = "euclidean")

# Create the clusters
# (stored as hofs.hclust so the object does not share a name with the hclust() function)
hofs.hclust <- hclust(db.to_clust, method="ward.D")

# Plot the dendrogram with the country labels
plot(hofs.hclust, labels=hofs.db_noNA$country)

# Plot the dendrogram with 5 clusters.
# As this graph is not particularly pretty,
# I ended up using the one above, adding the additional elements manually
rect.hclust(hofs.hclust, k=5, border="red")


# Cut the tree into 5 clusters
hofs.groups2 <- cutree(hofs.hclust, k=5)

# Assign the clusters to the original dataframe
hofs.db_noNA$clust <- hofs.groups2

# Calculate the average of each feature by cluster, rounded to whole numbers
aggregate(. ~ clust, data=hofs.db_noNA[2:8], FUN = function(x) round(mean(x), 0))

1 comment :

  1. It's interesting that the cluster analysis shows such a strong relationship between cultural traits and geography. I wonder if this is changing as ideas can now move more freely over geographical distances or if factors such as climate and topography act as constraints.
