Word Cloud: Home States of 2017 MLS Players

Undeterred (OK, briefly deterred) by my setback yesterday, I tried again to make a word cloud of the home states of MLS players. Unable to make the two-letter state abbreviations into a word cloud, I instead used R to assign the full state names to the now 312 names in my player dataset. So far, so good. This time, though, I had to deal with states that have two words in their names, such as New York, New Jersey, and New Hampshire. The code I used for the first word cloud project broke these names into their constituent words, which is not what I wanted at all and gave the word "New" undue prominence in the word cloud.
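For anyone wanting to try the abbreviation-to-name step, here is a minimal sketch of one way to do it. It is not necessarily the code I ran: the `players` data frame and its `state_abb` column are made-up stand-ins for my dataset, and it leans on R's built-in `state.abb` and `state.name` vectors.

```r
# Hypothetical stand-in for the real player data (which has 312 rows).
players <- data.frame(
  state_abb = c("CA", "NY", "NJ", "CA", "TX", "NH"),
  stringsAsFactors = FALSE
)

# state.abb and state.name are parallel built-in vectors for the 50 states.
# match() finds each abbreviation's position; anything outside the 50 states
# (DC, for example) would come back as NA and need separate handling.
players$state <- state.name[match(players$state_abb, state.abb)]

print(players)
```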

I solved this problem by adapting the code presented in the first answer to the question posed here. My dataset was already in the form of a data frame, so all I had to do was turn the column of state names into a table. Then, by calling names and as.numeric on the table of states and their frequencies, I was able to create this word cloud with the two-word state names intact.
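The table approach looks roughly like the sketch below. The sample data and variable names are assumptions for illustration, and the plotting call only runs if the wordcloud package is installed. The key point is that supplying both the words and their frequencies to `wordcloud()` means the function uses each state name as given instead of splitting the text into separate words.

```r
# Hypothetical sample of full state names (the real column has 312 entries).
players <- data.frame(
  state = c("California", "New York", "New Jersey", "California",
            "Texas", "New York", "California"),
  stringsAsFactors = FALSE
)

# Count how often each state appears.
state_freq <- table(players$state)

# Pull the labels and counts out of the table as plain vectors.
words <- names(state_freq)
freqs <- as.numeric(state_freq)

# Draw the cloud only if the wordcloud package is available.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  set.seed(2017)  # makes the layout repeatable
  wordcloud::wordcloud(words = words, freq = freqs,
                       min.freq = 1, random.order = FALSE)
}
```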

Well, almost. In contrast to the example that I adapted, I wanted my word cloud to have some color to it. Unlike the first word cloud, however, this version had more frequency levels (17) than there are colors in any RColorBrewer palette (the largest palettes have 12 colors). Fortunately, I found this blog entry from 2011, which explained how to increase the number of colors in a palette by using the function colorRampPalette. The idea is that colorRampPalette takes the colors from a selected RColorBrewer palette and interpolates between them to create additional colors. The example in the link describes making a palette of as many as 20 different colors this way. I tried it, but I'm not convinced that I can tell some of the colors apart. Nevertheless, it's a new R coding discovery that I'm sure will come in handy in the future.
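Here is a small sketch of the colorRampPalette trick. The choice of the "Dark2" palette is my assumption for illustration, not necessarily the one in the linked post, and the fallback colors are only there so the snippet runs without RColorBrewer installed.

```r
# Starting colors: an 8-color RColorBrewer palette if the package is
# installed, otherwise a few named colors as an arbitrary fallback.
base_cols <- if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  RColorBrewer::brewer.pal(8, "Dark2")
} else {
  c("darkgreen", "orange", "purple", "deeppink")
}

# colorRampPalette() returns a function; calling that function with n
# gives n colors interpolated across the starting colors.
make_palette <- colorRampPalette(base_cols)
pal17 <- make_palette(17)  # one color per frequency level

print(pal17)
```

The resulting vector can then be handed to the `colors` argument of `wordcloud()`, which colors words from least to most frequent.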
