
October 28th, Class Essay

10/28 Essay: To give you a better idea of what data science is and how it compares to other areas in the world of work, read this 10-15 minute article. Taxonomically, in academia, data science is a subfield of computer science. Personally, I see data science as a tool or a set of tools and methods that a researcher or practitioner of data science applies to data in a particular domain. This combination is their “craft”, in essence. I, for example, apply data science to geography in the context of international development. This means I use tools such as machine learning on data such as satellite imagery to help solve problems in the international development community. Data science is a domain that can be surprisingly hard to define given its broad applications and newness. I don’t necessarily think there’s one correct definition and, whatever it is, the field will continue to evolve with new technologies, new industries, and/or new issues. There’s also a question of skill ranges. Should the title of “data scientist” be reserved for those with doctoral degrees in the same way it is in physics, for example, with anyone below that being something else (like an analyst)? Think about an area of study or field you care about (outside of your formal assignment topic), maybe the major you intend to choose. Based on what you read and know about data science and its related areas (data analysis, stats, and ML), briefly (1-2 paragraphs) write about how you think these disciplines could possibly be used in your field (it’s okay to think ambitiously). If methods are already being used, what are they and to what extent (if you know)? If you are a prospective data science major, what ideas do you have for how data science could be used in ways you believe it currently is not? You have until 10:10.

I intend to major in Computer Science. However, since data science is a part of computer science and thus already used in it, I will instead discuss my ambitions for using data science in psychology. Though human minds and consciousnesses are highly variable and unique, in many ways we are like machines. Data science could possibly be used to predict, on a large scale, people’s likelihood of developing a particular mental illness or disorder. This would require extensive data collection, and it is probably already being done on a smaller scale, but with enough data, I think it would be possible to identify factors that commonly contribute to people developing mental illnesses or disorders. This could be used to get people treatment and proper resources when problems first appear, rather than leaving the person to suffer, sometimes unable to seek out treatment.

This possibility can easily be broken up into the three categories of data science as described in the article. Data analysis can be used to identify common factors that may have contributed to people developing a mental illness or disorder, such as a family history of mental health problems or specific types of traumatic events. Then, statistics can be used to measure how prevalent certain factors are among people with specific mental illnesses and “match” common factors to those illnesses. Finally, machine learning can be used to take someone’s medical or experience history and check whether it fits any of the criteria associated with developing a certain mental illness or disorder. As Ng mentioned in the video, it is important that the model gather more data and keep improving even after it is initially developed, so there would also need to be a continuous stream of data to improve the model’s predictive ability. If this could be accomplished on a large scale, it could make it easier for people to seek treatment once symptoms arise, or even beforehand as a preventative measure. This software could be rolled out as a website or app where people describe their lives in detail and it “matches” them with mental illnesses or disorders that could arise.
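To make the three steps a little more concrete, here is a minimal Python sketch of how they might fit together. Everything in it is an assumption for illustration: the records are a tiny hand-made toy set (not real clinical data), the factor names `family_history` and `trauma` are placeholders taken from the examples above, and the helper functions `factor_rates` and `flag_factors` are hypothetical. The "matching" step is a simple rule standing in for a trained machine learning model, which a real project would need.

```python
# Toy, hand-made records (NOT real clinical data); 1 = yes, 0 = no.
# Factor names are hypothetical placeholders based on the essay's examples.
RECORDS = [
    {"family_history": 1, "trauma": 1, "diagnosed": 1},
    {"family_history": 1, "trauma": 0, "diagnosed": 1},
    {"family_history": 0, "trauma": 1, "diagnosed": 1},
    {"family_history": 1, "trauma": 1, "diagnosed": 1},
    {"family_history": 0, "trauma": 0, "diagnosed": 0},
    {"family_history": 0, "trauma": 1, "diagnosed": 0},
    {"family_history": 0, "trauma": 0, "diagnosed": 0},
    {"family_history": 1, "trauma": 0, "diagnosed": 0},
]
FACTORS = ["family_history", "trauma"]


def factor_rates(records, factor):
    """Statistics step: share of diagnosed and of undiagnosed people reporting a factor."""
    diagnosed = [r for r in records if r["diagnosed"] == 1]
    undiagnosed = [r for r in records if r["diagnosed"] == 0]
    rate_diagnosed = sum(r[factor] for r in diagnosed) / len(diagnosed)
    rate_undiagnosed = sum(r[factor] for r in undiagnosed) / len(undiagnosed)
    return rate_diagnosed, rate_undiagnosed


def flag_factors(person, records, factors):
    """Matching step: factors a person reports that are more common among diagnosed people.

    This rule is only a stand-in for a trained model, not machine learning itself.
    """
    flagged = []
    for factor in factors:
        rate_diagnosed, rate_undiagnosed = factor_rates(records, factor)
        if person.get(factor, 0) == 1 and rate_diagnosed > rate_undiagnosed:
            flagged.append(factor)
    return flagged


if __name__ == "__main__":
    # Data analysis / statistics: compare how common each factor is in the two groups.
    for factor in FACTORS:
        print(factor, factor_rates(RECORDS, factor))
    # Matching: check a new (made-up) person's history against those factors.
    new_person = {"family_history": 1, "trauma": 0}
    print(flag_factors(new_person, RECORDS, FACTORS))
```

A real version would replace the rule in `flag_factors` with a model trained on far more data, and it would be retrained as new records arrive, in line with the point about continuous improvement.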

One downside is that the project would likely raise ethical issues about how far it delves into people’s lives, and the database containing all of the information would have to be extremely secure. There is also the possibility that we cannot predict mental illnesses or disorders in individuals, and that the project would be useless. However, I believe that this could be possible, even if it is hundreds of years away from being successful on a large scale, and this is a clear way that data science can be used in conjunction with psychology to greatly improve people’s lives.