Response to Joshua Blumenstock’s Article: “Don’t Forget People In The Use Of Big Data For Human Development”
In the opinion piece “Don’t Forget People In The Use Of Big Data For Human Development,” Joshua Blumenstock argues that while data science can aid human development, the current methodologies for actively helping countries and individuals in poverty are lacking in efficiency and execution, because those methodologies focus on big-picture promises rather than on the fine-grained details of execution.
As an introduction to his main idea that the potential of data for development is large but the execution is lacking, Blumenstock notes that data scientists and other researchers who aim to aid human development in developing countries begin with good intentions and grand promises. For example, machine-learning algorithms can generate credit scores for people who have mobile phones but lack conventional ways of establishing credit, such as collateral or bank access. Another hopeful promise is the use of digital footprints, assembled from sources such as phone-call records or satellite imagery, to more efficiently identify areas needing humanitarian aid, public-health interventions, or natural-disaster crisis response, and to deliver those services more effectively to the people in need. Yet, Blumenstock asserts, these promises have plenty of pitfalls in execution.
To support his thesis that efforts to aid development have fallen short of their intent, Blumenstock breaks the pitfalls of execution into four main categories: unanticipated effects, lack of validation, biased algorithms, and lack of regulation. With regard to unanticipated effects, data collection and implementation might fail to solve the problem, or might even make it worse. For example, digital loans, created with the pure intention of extending credit to those without access to banks, can instead lead to debt traps and cycles of poverty. A second pitfall is the lack of validation of data-collection systems: put more simply, new systems are imperfect, and conclusions drawn from data often cannot be generalized across different times of the year or across different locations in the same general area. In addition, people may try to game the system in order to obtain benefits, such as aid. A third pitfall emerges when considering the bias of algorithms. Because data collection relies on digital sources, it favors those with access to things like phones or Facebook, while leaving out underrepresented, and often marginalized, groups. The central dilemma of biased algorithms is that it is difficult to help people who do not even appear in the data sets collected with the intention of helping them; this is a prime example of how execution can stray from good intention. Lastly, Blumenstock presents the obstacle of a lack of regulation on data collection. Although the United States and many other developed countries have passed legislation to curb the abuse of power that comes with collecting personal data, developing countries often lack such regulations, and private companies operating there are motivated to profit from individuals’ data rather than to contribute to causes that aid development.
Though Blumenstock presents significant pitfalls regarding the effectiveness of big data in aiding development, he also offers a way forward as a basic solution to the root problem: the lack of representation, in the data sets, of the very people those data sets are meant to help. Blumenstock asserts that we must validate, by creating data sources that complement old ones and by promoting collaboration between organizations and researchers who can benefit from the same data; customize, by contextualizing programs and data sets so that they benefit the people we are trying to help; and deepen collaboration, by linking people in areas of need with those creating the programs. One flaw I find in Blumenstock’s solutions is that they all derive from the same central idea: fostering collaboration with, and inclusion of, the people whom data scientists, researchers, governments, and companies are trying to help. The solution might therefore have come across as a more powerful argument, and a better segue to the ending, had Blumenstock not separated it into sections.
The three concepts of good intent, transparency, and the balancing act are connected such that each builds on the last: good intent fosters transparency, and transparency leads to a clearer view of how to help one group without accidentally harming another. I agree with Anna that good intent alone is not enough; after all, that is the basis of Blumenstock’s piece. Transforming how we view data science in relation to human development begins with establishing good intentions: that data will be used for its true purpose and not purely for profit, that data scientists will do their best to cover marginalized groups, and that research will be put into practice through software and methods customized to the groups needing help. After good intent is established, and keeping in mind that pure promises are not sufficient to enact change, transparency must be established next. I agree with Nira’s assertion that a major issue is transparency on the part of both the researchers and those needing assistance. However, this is fallible because human nature is not altruistic: transparency is difficult to achieve when corporations are seeking profits, or when people seeking aid put their own needs before the good of the whole, as in the article’s example of people pretending to live in thatched rather than iron-roofed houses. Transparency, not good intention, is thus the main obstacle at the intersection of data science and human development; without transparency, it is impossible to balance aiding development against misrepresenting or forgetting marginalized groups.