Text Mining for Employee Candidates Automatic Profiling Based on Application Documents
Job vacancies posted on the Internet quickly attract many applications. Filtering resumes manually takes a lot of time and incurs high costs. Manual screening also tends to become inaccurate as reviewers tire, and may fail to identify the right candidate for the job. This paper proposes a solution that automatically identifies the most suitable candidates from their application documents. In this study, 126 application documents from a private company were used for the experiment: 41 documents for the Human Resource and Development (HRD) staff position, 42 documents for the IT (Data Developer) position, and 43 documents for the Marketing position. Text processing is applied to extract relevant information such as skills, education, and experience from the unstructured resumes and to summarize each application. A specific dictionary for each vacancy is generated from the terms used in each profession. Two methods, Document Vector and N-gram analysis, are implemented and compared for matching and scoring the application documents. The higher the score a document obtains, the higher the likelihood that the application is accepted. The results of the two methods are then validated against the company's actual selection process. The highest accuracy was achieved by the N-gram method for the IT vacancy at 87.5%, while the Document Vector method reached 75%. For the Marketing staff vacancy, both methods achieved the same accuracy of 78%. For the HRD staff vacancy, the N-gram method reached 68%, while the Document Vector method reached 74%. In conclusion, the N-gram method showed slightly better overall accuracy than the Document Vector method.
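The two scoring approaches named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the tokenization, the dictionary contents, or the exact scoring formulas, so everything below is an assumption made for illustration: the function names, the sample resume text, the sample dictionary, the use of cosine similarity for the document-vector score, and the use of a matched-term fraction for the N-gram score are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (keeps '+' and '#')."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_vector_score(resume, dictionary_terms):
    """Assumed scoring: cosine similarity between the term-count vector of
    the resume and the term-count vector of the vacancy dictionary."""
    resume_counts = Counter(tokenize(resume))
    dict_counts = Counter(tokenize(" ".join(dictionary_terms)))
    dot = sum(resume_counts[term] * count for term, count in dict_counts.items())
    norm_resume = math.sqrt(sum(c * c for c in resume_counts.values()))
    norm_dict = math.sqrt(sum(c * c for c in dict_counts.values()))
    if norm_resume == 0 or norm_dict == 0:
        return 0.0
    return dot / (norm_resume * norm_dict)


def ngram_score(resume, dictionary_terms, max_n=2):
    """Assumed scoring: fraction of dictionary terms (1 to max_n words long)
    that occur as a contiguous n-gram in the resume."""
    tokens = tokenize(resume)
    found = set()
    for n in range(1, max_n + 1):
        found.update(ngrams(tokens, n))
    targets = [tuple(tokenize(term)) for term in dictionary_terms]
    targets = [t for t in targets if 0 < len(t) <= max_n]
    if not targets:
        return 0.0
    return sum(1 for t in targets if t in found) / len(targets)


# Hypothetical sample data, not taken from the study.
resume = ("Experienced data developer with SQL and Python skills. "
          "Built data warehouse pipelines.")
it_dictionary = ["sql", "python", "data warehouse", "machine learning"]

vector_score = document_vector_score(resume, it_dictionary)
phrase_score = ngram_score(resume, it_dictionary)
print(f"document vector: {vector_score:.3f}, n-gram: {phrase_score:.3f}")
```

In this sketch the N-gram score treats "data warehouse" as a single phrase, whereas the document-vector score counts "data" and "warehouse" independently; that difference in handling multi-word terms is one plausible reason the two methods could rank the same applications differently.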
Copyright (c) 2022 EMITTER International Journal of Engineering Technology
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.