Recently I started reading Data Science for Business, and it struck me that its example of a company wanting to predict customer churn is a lot like a school wanting to predict whether a student is likely to drop out.
For some time, I’ve been fascinated with the idea of using data to solve human problems. We know that the current economy runs on data science; Facebook and Google are quite upfront that they want to predict what you want so they can show you ads. Brick-and-mortar stores are doing it too: Target can predict when its customers are pregnant and send appropriate coupons. I believe it must be possible to use data science in a manner that is less profit-driven and more humanity-driven. And so this brings me to the question: can data science save dropouts?
As an example, I did some rough calculations for the Twin Rivers school district, and it looks like the district loses approximately 1,000 students between kindergarten and high school graduation. That is 1,000 kids who will generally find the rest of their lives much more difficult than if they had been able to graduate. Further, these 1,000 dropouts cost Twin Rivers over $10,000,000 in lost revenue. If only half of these kids could be recovered, the district would be able to fund at least 50 faculty and staff positions.
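To make the back-of-the-envelope math explicit, here is a small sketch of it. The student count, the lost revenue, and the one-half recovery rate come from my rough estimates above. The $100,000 cost per position is an assumption: it is the figure implied by "50 positions", not an actual district salary number. The function name is made up for illustration.

```python
def positions_fundable(students_lost, lost_revenue, recovery_rate, cost_per_position):
    """Estimate how many positions the recovered revenue could fund."""
    # Revenue per student implied by the two rough totals.
    revenue_per_student = lost_revenue / students_lost
    # Revenue regained if a fraction of the lost students stay enrolled.
    recovered_revenue = students_lost * recovery_rate * revenue_per_student
    # Whole positions only, so round down.
    return int(recovered_revenue // cost_per_position)


# Rough figures from the post; cost_per_position is an ASSUMED average.
estimate = positions_fundable(
    students_lost=1_000,
    lost_revenue=10_000_000,
    recovery_rate=0.5,
    cost_per_position=100_000,
)
print(estimate)
```

Since the lost revenue is "over" $10,000,000, this is a floor rather than a precise number, and a different assumed cost per position would change it proportionally.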
So how could the district solve this problem? There are two answers: dropout prevention and dropout recovery. I have personally worked a lot with dropout recovery for adults, and I am currently working on creating an adult-serving charter school that can help solve this problem, which I will talk about more soon in other posts.
Dropout prevention is usually done in very broad strokes. When programs are more targeted, with support groups or extra help, they usually aren’t applied consistently across the student population, because they often depend on self-selection, parent selection, or teacher selection.
But what if there were patterns in the existing data about students that showed which ones were highly likely to drop out? Those particular students could then be targeted more cost-effectively and consistently with intervention strategies to help them stay in school.
The first question would be whether there are predictors in the current data that could be used. Most traditional demographics should not be used. Gender, race, and ethnicity may at times correlate with students dropping out, but they are proxies for other societal issues, and any such use of the data would surely have people up in arms about “racial profiling”, etc. Some demographics might play a role in this type of analysis, specifically socioeconomic status and English language learner status. But since these two variables are so broad and change slowly, they would be useful as tangential or supporting variables at best. Looking at specific teachers should also be avoided, as it would be perceived as a form of teacher evaluation and would probably not be compatible with the contract. It would also likely draw the ire of the teachers’ union, and getting their support is important to a successful outcome.
So I think the answer is to see whether there are patterns in student behavior that might be predictors of a student dropping out. A school’s various student information systems have attendance data, grades, the courses students have taken, standardized test scores, etc. Automated statistical techniques could determine whether these variables tend to show particular patterns before students drop out. If so, the school could look for these patterns in almost real time (although more likely on a weekly basis), and have counselors or teachers work with the students determined to be most at risk to see if these students can be saved from dropping out.
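To give a flavor of what "automated statistical techniques" could look like, here is a minimal sketch using logistic regression, which is just one of many possible techniques and not necessarily the one a real project would pick. Everything in it is an assumption for illustration: the two features (attendance rate and GPA), the tiny invented history table, the student IDs, and the function names. None of it is Twin Rivers data.

```python
import math

# HYPOTHETICAL past students: (attendance_rate, gpa, dropped_out).
# Invented values for illustration only.
HISTORY = [
    (0.98, 3.6, 0), (0.95, 3.1, 0), (0.92, 2.8, 0),
    (0.97, 2.4, 0), (0.90, 3.3, 0), (0.88, 2.9, 0),
    (0.70, 1.4, 1), (0.75, 1.9, 1), (0.65, 2.1, 1),
    (0.80, 1.2, 1), (0.72, 1.7, 1), (0.85, 1.5, 1),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def features(attendance_rate, gpa):
    """Bias term plus both inputs scaled to roughly the 0..1 range."""
    return [1.0, attendance_rate, gpa / 4.0]


def train(history, epochs=2000, learning_rate=0.5):
    """Fit logistic regression weights with batch gradient descent."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        gradient = [0.0, 0.0, 0.0]
        for attendance_rate, gpa, dropped_out in history:
            x = features(attendance_rate, gpa)
            predicted = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xi in enumerate(x):
                gradient[j] += (predicted - dropped_out) * xi
        weights = [
            w - learning_rate * g / len(history)
            for w, g in zip(weights, gradient)
        ]
    return weights


def dropout_risk(weights, attendance_rate, gpa):
    """Estimated probability (0..1) that a student drops out."""
    x = features(attendance_rate, gpa)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def weekly_at_risk_list(weights, students, top_n=2):
    """Return the IDs of the top_n highest-risk current students."""
    scored = sorted(
        ((dropout_risk(weights, att, gpa), sid) for sid, att, gpa in students),
        reverse=True,
    )
    return [sid for _, sid in scored[:top_n]]


# Example weekly run on made-up current students: (id, attendance_rate, gpa).
weights = train(HISTORY)
current_students = [
    ("A", 0.97, 3.4), ("B", 0.68, 1.3), ("C", 0.90, 2.6), ("D", 0.78, 1.8),
]
flagged = weekly_at_risk_list(weights, current_students)
```

The output of a run like this would be a short list for counselors to follow up on, not a verdict on any student. A real project would need far more data, more variables (courses taken, test scores), and a check of how well the model predicts on students it was not trained on.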
I am going to share this post with some of the folks at Twin Rivers, and I hope to post more soon about how a project like this might be done and which specific steps would be involved. Maybe I’ll even get to work with Twin Rivers on it… We will see!