876—JOURNAL OF FOOD SCIENCE—Volume 61, No. 5, 1996

An Hypothesis Paper

Linear Regression, Neural Network and Induction Analysis to Determine Harvesting and Processing Effects on Surimi Quality

G. PETERS, M.T. MORRISSEY, G. SYLVIA, and J. BOLTE

ABSTRACT

Harvesting and processing input combinations and product quality attributes for the Pacific whiting surimi industry were collected and analyzed. Multiple linear regression, neural networks, and M5-induction were used to determine significant variables in the industry. Significant factors included variables intrinsic to the fish (moisture content, salinity, pH, length, weight) and processing variables (processing time, storage temperature, harvest date, wash time, wash ratios). Most variables were highly interactive and nonlinear. Information derived from these models has implications for production and management decisions.

Key Words: surimi, neural networks, induction, quality attributes

INTRODUCTION

THE PACIFIC WHITING FISHERY is the largest volume fishery in the Pacific Northwest (excluding Alaska). Domestic harvest and processing increased from 12,000 metric tons (mt) in 1990 to over 200,000 mt in 1994 (Radtke, 1995). Since 1992, four surimi plants have been constructed in Oregon for production of surimi from Pacific whiting. Excessive softening of the muscle tissue due to protease enzymes gave Pacific whiting a poor reputation among processors and consumers (Nelson et al., 1985; Peters and Sylvia, 1991). This was partially circumvented in surimi production through use of food-grade protease inhibitors, which improved surimi gel strength (Morrissey et al., 1993). Because of the rapid development of the whiting industry, there is little information concerning the importance of the different harvesting and processing factors that affect the final quality of Pacific whiting products.
Changes in quality are the result of complex interactions in the harvesting, storage and processing sectors of the seafood industry. A model that relates harvesting, processing, and natural characteristics of a fishery to expected product quality would present advantages for the industry. Such a model would enable more efficient resource utilization and a better understanding of the critical factors that affect product attributes. A relational model could integrate, coordinate, and evaluate the effects of individual input and output variables in a comprehensive overview of the fishery. It could be used to maximize the economic benefits of the fishery by improving decision-making capabilities.

In developing cohesive models, the number of applicable factors is very large, and the factors are often interactive. Intelligent computer programs such as fuzzy logic and neural networks have shown promise in analyzing complex interactive biological systems. A review by Eerikainen et al. (1993) has shown their potential use in analyzing food processing systems. Horimoto et al. (1995) showed that a neural network (NN) could be used to determine loaf volumes of breads made from different wheat cultivars. Vallejo-Cordoba et al. (1995) showed the usefulness of NN in predicting the shelf-life of milk using headspace gas chromatographic data. In both cases, a NN was more accurate and faster than other analysis systems such as principal components regression.

Authors Peters and Bolte are with the Dept. of Bioresource Engineering at Oregon State Univ., Corvallis, OR 97331. Author Morrissey is Director of the OSU Seafood Laboratory, 250 36th St, Astoria, OR 97103. Author Sylvia is with the Dept. of Resource Economics and a scientist at the OSU Hatfield Marine Science Center, Newport, OR 97365. Direct inquiries to Dr. Michael T. Morrissey, Director, OSU Seafood Laboratory, 250 36th St, Astoria, OR 97103.
Our objective was to determine the relationships that exist among different harvesting, handling and processing factors that affect Pacific whiting surimi quality. Three different modeling strategies were used to determine influences on product quality. These were (1) multiple linear regression (MLR), (2) a model that would describe the quality issues in a nonlinear NN framework, and (3) an M5 induction type machine learning system (M5-I). MLR analysis is a traditional analytical tool that defines a mathematical function which best fits the relationship among input (independent) and output (dependent) variables. NN is a relatively innovative technology for making predictions and recognizing patterns of large, complex, and highly interactive systems (Stanley, 1988). The M5-I model is a method of artificial intelligence that uses examples, i.e., data sets, to generate rules and equations that categorize the influencing variables and their effects (Schank and Kass, 1990). The three models were compared to help evaluate their relative advantages and validity.

MATERIALS & METHODS

Collection of data

Information was collected over a 3-yr period (1992-1994 whiting seasons). Data were obtained from several sources. Three shore-side processors in the Pacific Northwest participated in the collection of processing data and information. Harvesting data were collected from 16 trawlers in the fisheries and the Oregon Department of Fish and Wildlife (ODFW). ODFW had observers on 20% of all whiting fishing expeditions in Oregon and shared data on harvesting variables (e.g., tow size, tow length, geographic location, and weather conditions). Also, log book data from fishing vessels were collected. This information was combined with on-shore observations and a quality evaluation to get a complete representation of variables affecting fish quality.
ODFW observer information and log books were supplemented by data from fishermen including output from computerized time/temperature recorders mounted in fish holds.

Although several products are made from Pacific whiting, only processing and final product quality parameters for surimi were collected. Surimi has objective values for gel strength, color, numbers of impurities, and microbial count. These values are computed to determine a final grade for surimi (Park and Morrissey, 1994). Several types of variables were collected (Table 1) to help define this relational model. Different input variables (86) which may influence the quality of Pacific whiting were collected and used to develop each of the models. These variables consisted of large data sets and 8000 data points were collected for each variable. Processing vessels kept careful records of the harvesting parameters while processing plants tracked