As I work to re-enter a doctoral program with UNISA, I have realized that I haven’t yet given a quick, cohesive explanation of why the research I am doing is so important to me. So here is my attempt to explain why I believe what I’m working on will make a significant contribution to the field of education, why it could be something that truly changes the world, and how others can get involved. Who knows, maybe some day this will become a real TED Talk 🙂
What if, hidden in plain sight, there is an answer to a world problem, like how to reduce world hunger, grow the economies of developing nations, or reduce disease and war? Wouldn’t this be worth searching for? That is what I’m working to do, through the use of data science.
As humanity, we have collected tons of information about each country in this world. From how much a Big Mac costs in each country (The Economist’s Big Mac Index) to how many people die of heart disease (collected by the World Health Organization), we know a lot. And most of this information is available on the Internet. But do we know if these things affect each other, or cause each other?
Social science research, such as comparative education (which looks at educational differences between countries), has often examined individual aspects of countries and tested whether there is a correlation between variables. But this has almost always been done based upon human hunches, which we like to call hypotheses. Relying only on hunches to find significant knowledge is limiting: first, humans can only hold a certain amount of knowledge in their heads, and second, there could be variables that correlate in ways a person would never think to guess.
But what if we could trawl through nearly all the world’s knowledge to find hidden correlations? Well, I believe we can, and in fact, that is what my doctoral research is all about: building a huge database that is a compendium of countries, and then having software do the data mining. The software looks at all the information collected, compares the variables against one another, and spits out which ones have a strong correlation.
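To make the idea a bit more concrete, here is a minimal Python sketch of that kind of search. It is only an illustration under my own assumptions, not the project's actual software: the function names (`pearson`, `strong_correlations`), the 0.7 threshold, and the tiny `toy` table with countries A to D are all hypothetical, and the numbers are made up. It compares every pair of indicators across the countries they share and keeps the pairs whose Pearson correlation coefficient is strong.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists.

    Returns None when either list has no variation (r is undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def strong_correlations(table, threshold=0.7, min_countries=3):
    """Find indicator pairs with |r| >= threshold.

    `table` maps an indicator name to a {country: value} dict.
    Only countries that have a value for both indicators are compared.
    """
    results = []
    for a, b in combinations(sorted(table), 2):
        shared = sorted(set(table[a]) & set(table[b]))
        if len(shared) < min_countries:
            continue  # too few countries in common to say anything
        r = pearson([table[a][c] for c in shared],
                    [table[b][c] for c in shared])
        if r is not None and abs(r) >= threshold:
            results.append((a, b, r))
    # Strongest relationships first, whether positive or negative
    results.sort(key=lambda item: abs(item[2]), reverse=True)
    return results


# Made-up placeholder values for four hypothetical countries (not real data)
toy = {
    "education_spend": {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0},
    "health_spend": {"A": 2.0, "B": 4.1, "C": 5.9, "D": 8.0},
    "rainfall": {"A": 5.0, "B": 1.0, "C": 4.0, "D": 2.0},
}

for a, b, r in strong_correlations(toy):
    print(f"{a} vs {b}: r = {r:.2f}")
```

With thousands of indicators the number of pairs grows very quickly, and some strong-looking pairs will show up by chance alone, so a real version would likely need more careful statistics than a single threshold.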
It should be clear, though, that this process would only be the first step in finding answers to the world’s problems. A correlation does not mean that one thing directly causes the other. For instance, my early research mining data from the CIA World Factbook (yes, the CIA does some good things) showed a strong connection between expenditure on education per capita and expenditure on healthcare per capita, with a correlation coefficient of nearly 0.8 (where 1 would be a perfect positive relationship). In other words, across the world’s countries, as spending on health care per person goes up, spending on education tends to rise with it in a fairly consistent way. But does this mean one causes the other? Probably not. Maybe it means that humans value both of these in a consistent manner. And isn’t that alone knowledge worth discovering? We will not know the full extent of what might be valuable until we do the searching. Once the correlations are found, further research can start to ask why they exist, and whether they can be used to help the world.
But I cannot do this alone. Trying to gather all of this data on my own would limit the results of the work, because the more types of data that can be mined, the better chance we have of finding something of value. So I’m crowdsourcing my research: I’m developing curriculum to help high school and college students from around the world become citizen scientists and contribute to this discovery process. This gets more out of the effort, because students will not only contribute to something worthwhile, they will also learn the fundamentals of data science, which is of growing importance in all fields of science.
So I invite you to join me in searching for solutions to the world’s problems, by collecting what the world already knows and then seeing what can be found in this knowledge. You can go to www.CompendiumOfCountries.org to see the current state of the project.