Abstract: Considering user opinion in game software development is important from a marketing viewpoint because there are no effective ways to analyze the game software market. In this research, we attempted to develop a process for analyzing consumers’ review comments by using a topic model and structural equation modeling. With this approach, we aimed to extract, visually and quantitatively, the relationships among the elements to which users seem to direct their attention, and we expected to obtain meaningful knowledge for game software development. Experimental results suggest that our proposed process can analyze the market as effectively as the text-based model generation method for confirmatory factor analysis.

Index Terms: Causal analysis, factor expression, game software, structural equation modeling, topic model, hierarchical latent Dirichlet allocation.

I. INTRODUCTION

Recent global technological advances have driven the spread of smartphones and tablets and rapidly expanded platform penetration. As a result, the game software market, which includes the consumer, mobile, and amusement facility segments, had become a large-scale market worth $61.4 billion as of 2012. A report by the investigation group of CAPCOM Co., Ltd. indicates that the game software market is expected to reach $86.6 billion by 2017 [1], [2]. However, even though the rapid growth of the market is widely acknowledged, the difficulty of market investigation remains one of the most important problems for any game software developer. Identifying consumers’ purchase factors is particularly difficult: many developers say that they cannot know whether their products will be popular until they release them on the market [3], [4].

In related work, Kunimoto et al. attempted to extract the important factors by using the KJ method [5] and model integration methods with structural equation modeling (SEM) [6].
They proposed a path model generation process that uses the idea of collective intelligence. Saga et al. attempted to improve the analysis process with SEM by using text information [7]. They showed the effectiveness of combining models and text information for factor analysis with SEM. However, issues remain because the factor model produced by the text-based confirmatory analysis process has insufficient explanatory ability.

In this research, we propose an analysis process that mainly uses two techniques, SEM and a topic model, as one approach to this problem. The process is based on the idea of visualizing invisible phenomena as latent factors in text data. Users’ comments and reviews embody a high degree of collective intelligence; thus, factor analysis that draws on such an information source is expected to have higher explanatory ability. By computationally analyzing existing text data (a corpus), the topic model enables us to understand structurally which topics users attend to when they evaluate game software. We suggest combining the factor analysis method, namely, SEM, with the obtained structured topic model to analyze users’ interests visually and quantitatively. We aim to show the possibility of an effective investigation method for the game software market and thereby to help developers.

Manuscript received October 17, 2014; revised December 19, 2014. This work was supported by a Grant-in-Aid for Foundation of the Fusion of Science and Technology (FOST) and MEXT/JSPS KAKENHI 2524049, 25420448.
R. Kunimoto and R. Saga are with Osaka Prefecture University, 1-1 Gakuen-cho, Naka-ku, Sakai-City, Osaka, Japan (e-mail: kunimoto@mis.cs.osakafu-u.ac.jp, saga@cs.osakafu-u.ac.jp).

II. TOPIC MODEL

A topic model is a machine learning technique that clarifies the structure of a document collection by estimating the words that constitute each topic, on the premise that each document in the corpus is associated with specific topics.
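As a rough illustration of this premise, the following minimal sketch fits a flat topic model to a handful of invented review snippets by collapsed Gibbs sampling for latent Dirichlet allocation. It is not the implementation used in this research: the function names (fit_lda, top_words), the hyperparameter values, and the toy reviews are all assumptions made for illustration only.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for flat LDA on tokenized documents (sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]           # tokens of doc d in topic k
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                         # total tokens in topic k
    assign = []                                            # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample: weight is proportional to
                # (doc-topic count + alpha) * (topic-word count + beta) / (topic size + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [sorted(counts, key=counts.get, reverse=True)[:n] for counts in topic_word]


# Hypothetical, already tokenized review snippets (not data from this study).
reviews = [
    ["graphics", "beautiful", "graphics", "sound"],
    ["story", "characters", "story", "ending"],
    ["sound", "music", "graphics", "beautiful"],
    ["characters", "ending", "story", "plot"],
]

doc_topic, topic_word = fit_lda(reviews, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

The two count tables returned by the sketch correspond to the two estimates a topic model provides: which topics each document draws on, and which words constitute each topic. The number of topics must be fixed in advance here, which is a limitation of the flat model.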
Several topic model methods are available, such as latent semantic indexing (LSI) [8], latent Dirichlet allocation (LDA) [9], and hierarchical LDA (hLDA) [10], which is an extension of LDA. We use LDA as the foundation of the topic model because a reviewer’s comment can be safely assumed to have several background topics. Furthermore, we adopted hLDA, which is highly compatible with SEM, as a concrete step in our analysis process.

A. hLDA

hLDA is the representative hierarchical topic model. In hLDA, the latent topics form a subtree of a tree of infinite height whose hierarchy branches without limit, unlike LDA, which assumes flat latent topics. Adopting hLDA has two advantages. First, the relationships among topics do not need to be specified in advance, and second, the number of topics is estimated automatically by the hLDA algorithm. The hierarchy is generated by using the nested Chinese restaurant process [11], in which a visitor represents a document and a table (or a restaurant) represents a topic.

Purchase Factor Expression for Game Software Using Structural Equation Modeling with Topic Model in User’s Review Texts
Rikuto Kunimoto and Ryosuke Saga
International Journal of Innovation, Management and Technology, Vol. 5, No. 6, December 2014, p. 417
DOI: 10.7763/IJIMT.2014.V5.551

The generation process of hLDA is as follows: First, the parameter of the multinomial distribution over words for each topic is drawn from a Dirichlet distribution, as shown in Fig. 1. Then, the root node