doi: 10.4304/jsw.9.5.1202-1209
Lemmatization Technique in Bahasa: Indonesian Language
Abstract—Much research has been carried out at the intersection of linguistics and technology. Even so, techniques that integrate the two do not work reliably for every language, because each language has its own linguistic nature and rules. This paper presents a lemmatization technique for Bahasa (the Indonesian language). It achieves good precision by using the Indonesian Dictionary and a set of rules to remove affixes. The technique builds on an earlier algorithm, an Indonesian stemmer. Indonesian stemming and lemmatization share the same characteristics but differ slightly in implementation. The core difference lies in the goal each method is meant to reach, which makes it possible to modify the stemmer into a lemmatizer. The results show that the algorithm achieved roughly 98% precision on a collection of 57,261 valid words, 7,839 of them unique, gathered from Kompas.com, an Indonesian online news site.
Index Terms—stemmer, algorithm, lemmatization, language, Bahasa, Indonesian
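The abstract describes the technique only at a high level: look words up in a dictionary and apply rules that remove affixes. The Python sketch below illustrates that general idea and is not the authors' algorithm. The root-word set, the affix lists, the minimum-length guard, and the function name `lemmatize` are all hypothetical simplifications chosen for illustration. Real Indonesian morphology also involves rule ordering and spelling changes at the prefix boundary, which this sketch ignores.

```python
# Minimal sketch of dictionary-backed affix removal (NOT the paper's algorithm).
# ROOT_WORDS is a toy stand-in for a full Indonesian dictionary (assumption).
ROOT_WORDS = {"makan", "baca", "main", "buku", "ajar"}

# Simplified affix lists (assumption): particles, possessives, and a few
# common derivational suffixes and prefixes.
SUFFIXES = ("lah", "kah", "pun", "ku", "mu", "nya", "kan", "an", "i")
PREFIXES = ("meng", "mem", "men", "me", "ber", "di", "ke", "se", "ter", "pe")

MIN_STEM_LENGTH = 3  # hypothetical guard against stripping very short words


def lemmatize(word, dictionary=ROOT_WORDS):
    """Return a dictionary root for `word`, or the word itself if none is found."""
    word = word.lower()
    if word in dictionary:
        return word

    # Collect candidates: the word itself, then each single-suffix removal.
    candidates = [word]
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= MIN_STEM_LENGTH:
            candidates.append(stem)

    # For every candidate so far, also try each single-prefix removal.
    for candidate in list(candidates):
        for prefix in PREFIXES:
            stem = candidate[len(prefix):]
            if candidate.startswith(prefix) and len(stem) >= MIN_STEM_LENGTH:
                candidates.append(stem)

    # Accept the first candidate confirmed by the dictionary.
    for candidate in candidates:
        if candidate in dictionary:
            return candidate
    return word  # unknown words are returned unchanged


if __name__ == "__main__":
    for w in ["makanan", "dibaca", "membaca", "bukunya", "komputer"]:
        print(w, "->", lemmatize(w))
```

Checking every candidate against the dictionary is what keeps the output a valid word: a form is only reduced when the result is a known root, and anything else is left unchanged.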
Cite: Derwin Suhartono, David Christiandy, Rolando, "Lemmatization Technique in Bahasa: Indonesian Language," Journal of Software, vol. 9, no. 5, pp. 1202-1209, 2014.
General Information
ISSN: 1796-217X (Online)
Abbreviated Title: J. Softw.
Frequency: Quarterly
APC: 500USD
DOI: 10.17706/JSW
Editor-in-Chief: Prof. Antanas Verikas
Executive Editor: Ms. Cecilia Xie
Abstracting/Indexing: DBLP, EBSCO, CNKI, Google Scholar, ProQuest, INSPEC (IET), ULRICH's Periodicals Directory, WorldCat, etc.
E-mail: jsweditorialoffice@gmail.com