Response to Blumenstock Reading

DATA 150

Suditi Shyamsunder

In his article, Blumenstock argues that while big data has great potential to change the world and improve many people’s lives, this new technology can also create problems. Big data essentially gives us the ability to analyze the world around us by summarizing and interpreting trends within the seemingly infinite quantity of data that exists in our modern world. Through machine learning, which is the process of using data from the past to make predictions about the future, we are able to reach many people and use our analysis to make their lives easier. However, human development is not merely about technological advancement, and sometimes the two do not go hand in hand. Human development is more about improving quality of life for people around the world. Past student Anna Raymond expertly noted that “Good intent is not enough in data science when dealing with the problems which determine people’s experiences.” This is because, in order to truly better people’s lives, we must understand the reality of how they live and take care to avoid mistakes when creating our algorithms.

The most interesting part of the article for me was the section on biased algorithms. Since machine learning models are built on data, it is imperative that the data collected is representative of the population it is meant to model. However, as the article mentioned, this is not always easy to accomplish, because the people most likely to respond to certain surveys may come from a particular socioeconomic status, racial background, or gender. Ensuring that data is representative is therefore crucial but difficult. If the data is biased, the algorithm is bound to be biased, and so are the predictions made from it.

Transparency is also important when machine learning and big data are used in society.
For example, if an algorithm is being used to determine who should get a loan, the process used to train that model should not be secret. It should be known and understandable so that its equity can be assessed.

This whole process of automating the world and using data science to change society is truly a balancing act. We must keep advancing and innovating. We can’t stop doing as much as we can to help people around the world and improve quality of life for as many of them as possible, but we must also be cautious. We must ensure that actions taken with good intent don’t end up hurting more than they help.