Research Article

A Machine Learning Method for Detecting Depression Among College Students

by Peter J. Yu
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Number 24
Year of Publication: 2023
Authors: Peter J. Yu
DOI: 10.5120/ijca2023923003

Peter J. Yu. A Machine Learning Method for Detecting Depression Among College Students. International Journal of Computer Applications. 185, 24 (Jul 2023), 44-51. DOI=10.5120/ijca2023923003

@article{ 10.5120/ijca2023923003,
author = { Peter J. Yu },
title = { A Machine Learning Method for Detecting Depression Among College Students },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2023 },
volume = { 185 },
number = { 24 },
month = { Jul },
year = { 2023 },
issn = { 0975-8887 },
pages = { 44-51 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume185/number24/32844-2023923003/ },
doi = { 10.5120/ijca2023923003 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:26:59.969095+05:30
%A Peter J. Yu
%T A Machine Learning Method for Detecting Depression Among College Students
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 24
%P 44-51
%D 2023
%I Foundation of Computer Science (FCS), NY, USA
Abstract

As depression becomes more prevalent on college campuses, it is an increasingly critical topic to investigate. Recent studies have begun to use machine learning techniques to predict depression and other mental illnesses, but there is little understanding of why these mental health problems occur. This study examines the causes of depression among college students who post on the popular social media platform Reddit and compares several machine learning classifiers for depression detection. Of the 7,680 semi-anonymous Reddit posts examined, 552 contained depression-related keywords. After applying a series of natural language processing (NLP) techniques, three primary areas associated with depression were found among college students: institutions and programs; academic projects and assignments; and the college environment. The results also show the relative performance of the classifiers: Adaptive Boosting (AdaBoost) achieved the highest accuracy, detecting depression with 99% accuracy, while the Random Forest classifier had the highest F1 score, 1.0.
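
The abstract reports both accuracy and F1 score, and the two can rank classifiers differently. The following is a minimal sketch, not the author's pipeline, of how posts might be flagged by keyword and how the two metrics are computed for binary labels. The keyword list, the toy posts, and the `y_pred` values are all hypothetical and chosen only for illustration; the paper's actual keywords and data are not given here.

```python
import re

# Hypothetical keyword list (assumption; the study's real keywords are not listed here).
DEPRESSION_KEYWORDS = {"depressed", "depression", "hopeless"}


def contains_keyword(post, keywords=DEPRESSION_KEYWORDS):
    """Return True if any keyword appears as a whole word in the post."""
    tokens = set(re.findall(r"[a-z']+", post.lower()))
    return not tokens.isdisjoint(keywords)


def accuracy_and_f1(y_true, y_pred):
    """Compute accuracy and F1 for binary labels (1 = depression-related)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1


# Toy posts invented for illustration; not drawn from the study's data.
posts = [
    "I feel depressed about my thesis deadline",
    "Anyone know good dorm options near campus?",
    "Finals week leaves me hopeless every semester",
    "Looking for a study group for calculus",
]
y_true = [1 if contains_keyword(p) else 0 for p in posts]
y_pred = [1, 0, 0, 0]  # hypothetical classifier output that misses one positive

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")
```

In this toy case one missed positive lowers F1 more than it lowers accuracy. The gap matters most when positives are rare: the 552 keyword-matched posts are roughly 7% of the 7,680 examined, although the abstract does not state the class balance used for training.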

Index Terms

Computer Science
Information Sciences

Keywords

College
College Students
Depression
Mental Health
Machine Learning
Natural Language Processing (NLP)
Latent Dirichlet Allocation (LDA)
Social Media
Reddit