Jurnal Teknik Industri, Vol. 23, No. 2, December 2021
DOI: 10.9744/jti.23.2.111-120
ISSN 1411-2485 print / ISSN 2087-7439 online

Utilizing Elbow Method for Text Clustering Optimization in Analyzing Social Media Marketing Content of Indonesian e-Commerce

Aisyah Larasati 1*, Raretha Maren 1, Retno Wulandari 2

Abstract: The simultaneous growth of textual data from Twitter and of text analytics has driven organizations to seek hidden insights that support proper marketing strategies for their businesses. The vast amount of information generated on Twitter leads most e-commerce businesses to use the platform for social media marketing. One of those e-commerce businesses is Blibli Indonesia. Intense business competition has led it to pursue marketing strategies that rest on an understanding of consumer tendencies. Focusing the marketing strategies on consumer preferences can increase consumer interest in Blibli and, with it, the opportunity to reach new consumers. This research aims to identify preferred Twitter content by clustering the tweets of @bliblidotcom with the k-means algorithm. The best number of clusters is determined with the elbow method by selecting the point of deepest curvature, which yields three clusters. The results suggest that Twitter users like content related to Park Seo Jun. Hence, Blibli can focus on that content in its marketing strategy on the Twitter platform.

Keywords: E-commerce, Twitter, marketing, text mining, K-means, elbow.

Introduction

The simultaneous evolution of textual data and text analytics has encouraged most organizations to make use of them in their business [1]. The significant increase in textual data is driven by the emergence of social media platforms such as Twitter and Facebook [2]. The available textual data can be used as a data source to obtain new insight into precise social media marketing for businesses [3]. Twitter provides most of its information in textual form [4].
_________________________________________________
1 Faculty of Industrial Technology, Industrial Engineering Department, Universitas Negeri Malang, Jl. Semarang 5 Malang 65145, Indonesia. Email: aisyah.larasati.ft@um.ac.id; raretha.maren.1705166@students.um.ac.id
2 Faculty of Industrial Technology, Mechanical Engineering Department, Universitas Negeri Malang, Jl. Semarang 5 Malang 65145, Indonesia. Email: retno.wulandari.ft@um.ac.id
* Corresponding author

In the Indonesian e-commerce sector, one business that uses the Twitter platform for social media marketing is Blibli Indonesia. Intense business competition requires Blibli Indonesia to implement proper marketing strategies. Communication through tweets needs to be carried out properly, in line with consumer preferences, in order to obtain positive responses [5]. In practice, Blibli tends to upload random tweets that do not match consumer trends and interests. Moreover, Blibli's tweets rarely receive many retweets, which indicates that its existing Twitter content is not yet sufficiently effective. As a consequence, Blibli is less well known to the public, which in turn lowers the likelihood that consumers become interested in making transactions through the Blibli application. The retweet and like features on Twitter serve as metrics for evaluating content effectiveness and can also be taken as measures of content popularity and consumer preference [6]. Thus, tweet content that has not received many retweets and likes is less desirable content.

The integration of text mining with a clustering method is an appropriate way to discover the Twitter content that fits public preference. This choice rests on the ability of text mining to turn textual data into useful information that meets analytical needs aligned with a company's business goals through several approaches [7].
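As a rough illustration of how text mining can be combined with clustering, the following minimal Python sketch turns a handful of short texts into word-count vectors, groups them with a basic k-means loop, and reports the within-cluster sum of squares (WCSS) for several cluster counts, which is the quantity an elbow plot is read from. The tweets are invented for illustration, and the bag-of-words representation and the function names (`vectorize`, `sq_dist`, `kmeans`) are assumptions of this sketch, not the preprocessing or tooling used in this study.

```python
import random
from collections import Counter

# Hypothetical toy tweets, invented for illustration only (not the study's data).
tweets = [
    "flash sale discount voucher today",
    "big discount voucher flash sale",
    "free shipping promo this weekend",
    "weekend promo free shipping",
    "new actor brand ambassador photo",
    "brand ambassador actor photo event",
]


def vectorize(docs):
    """Turn each document into a bag-of-words count vector."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        vectors.append([float(counts[word]) for word in vocab])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Basic k-means; returns (labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    # Randomly pick k distinct data points as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, wcss


vectors = vectorize(tweets)
for k in range(1, 6):
    labels, wcss = kmeans(vectors, k)
    print(k, round(wcss, 2), labels)
```

Plotting WCSS against k and looking for the point where the decrease levels off is the usual way to read an elbow. Because the initial centroids are random, a single run per k (as here, with a fixed seed) is only indicative; repeating each k with several seeds and keeping the lowest WCSS is a common refinement.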
One such approach is clustering, which groups the tweet contents using the k-means algorithm, chosen for its ease of implementation and relatively low time complexity when handling large amounts of data [8]. In addition, the k-means algorithm performs adequately and is widely used in text mining. The k-means algorithm partitions a data set into a predetermined number of clusters by assigning each data point to the nearest of the randomly initialized centroids [9]. Each cluster has distinct characteristics, so the data are partitioned according to those characteristics: data with similar characteristics are grouped into the same cluster, and data with dissimilar characteristics are placed in different clusters [10]. However, because k-means is a partitional clustering algorithm that takes the number of clusters as an input, it cannot work correctly if this parameter is selected subjectively. Hence, it is