VOL. 10, NO. 4, MARCH 2015 ISSN 1819-6608
ARPN Journal of Engineering and Applied Sciences
© 2006-2015 Asian Research Publishing Network (ARPN). All rights reserved.
www.arpnjournals.com

A NOVEL DECISION TREE APPROACH FOR OPTION PRICING USING A CLUSTERING BASED LEARNING ALGORITHM

J. K. R. Sastry, K. V. N. M. Ramesh and J. V. R. Murthy
KL University, JNTU Kakinada, India
E-Mail: drsastry@kluniversity.in

ABSTRACT
Decision tree analysis involves forecasting future outcomes and assigning probabilities to those events. One of the most basic applications of decision tree analysis is option pricing. A binomial tree factors in the multiple paths that the underlying asset's price can take as time progresses. The price of the option is calculated from the discrete probabilities and their associated pay-offs at the maturity date of the option. In this work we propose an approach to build a binomial decision tree that can be used to price European, American and Bermudan options, together with a methodology to train the decision tree using a clustering based learning algorithm that minimizes the mean square error (MSE) between the observed and predicted option prices. The training methodology involves clustering the options based on moneyness and fitting a linear equation for each cluster to calculate the confidence that needs to be used in building the binomial decision tree for a particular strike price within the cluster. We observe that, with the proposed learning algorithm, the MSE of the option price under the proposed model is lower than under the Black-Scholes model.

Keywords: option pricing, clustering, decision tree, binomial option pricing.

1. INTRODUCTION
A derivative [13] is an agreement between two parties whose value is derived from an underlying asset. There are many kinds of derivatives, the most notable being swaps, futures and options.
An option [13] is a financial derivative that represents a contract sold by one party (the option writer) to another party (the option holder). The contract offers the buyer the right, but not the obligation, to buy (call) or sell (put) a security or other financial asset at an agreed-upon price (the strike price) during a certain period of time or on a specific date (the exercise date).

The Black-Scholes formula [2] was the first pioneering tool for the rational valuation of options. Several assumptions were used to derive the original Black-Scholes model, and relaxations of some of them have been reported in the literature: no dividends (relaxed in [15]); no taxes or transaction costs; constant interest rates (relaxed in [15]); no penalties for short sales; continuous market operation (relaxed in [16]); continuous share price (relaxed in [7]); and lognormal terminal stock price return (relaxed in [14]). In addition, the Black-Scholes model assumes continuous diffusion of the underlying (relaxing this assumption resulted in the jump-diffusion model [15]), constant standard deviation (volatility), and no effect of supply and demand on option prices. The models that relax these assumptions improve pricing performance and generalise the Black-Scholes formula to a class of models referred to as the modern parametric option pricing models.

Modern parametric option pricing models, being generalizations of the Black-Scholes model, are more complex, have poor out-of-sample performance, and use implausible and/or inconsistent implied parameters. They often produce parameters inconsistent with the underlying time series, give inferior hedging, and retain the systematic price bias they were intended to eliminate [3], [4]. Prompted by these shortcomings of modern parametric option pricing, a new class of methods was created that do not rely on pre-assumed models but instead try to uncover or induce the model, or a process for computing prices, from vast quantities of historical data. Many of them utilize learning methods from Artificial Intelligence.
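For reference, the Black-Scholes benchmark discussed above can be sketched as follows for a European call and put under the original assumptions (no dividends, constant interest rate, constant volatility). This is a minimal illustration of the standard closed-form formula, not code from this work; the function names and the parameters `spot`, `strike`, `maturity`, `rate` and `vol` are illustrative assumptions.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def _d1_d2(spot: float, strike: float, maturity: float, rate: float, vol: float):
    """Return the d1 and d2 terms of the Black-Scholes formula."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (
        vol * math.sqrt(maturity)
    )
    d2 = d1 - vol * math.sqrt(maturity)
    return d1, d2


def black_scholes_call(spot, strike, maturity, rate, vol):
    """European call price on a non-dividend-paying asset (illustrative name)."""
    d1, d2 = _d1_d2(spot, strike, maturity, rate, vol)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def black_scholes_put(spot, strike, maturity, rate, vol):
    """European put price on a non-dividend-paying asset (illustrative name)."""
    d1, d2 = _d1_d2(spot, strike, maturity, rate, vol)
    return strike * math.exp(-rate * maturity) * norm_cdf(-d2) - spot * norm_cdf(-d1)


if __name__ == "__main__":
    # At-the-money call, one year to maturity, 5% rate, 20% volatility:
    # the price is approximately 10.45.
    print(round(black_scholes_call(100.0, 100.0, 1.0, 0.05, 0.2), 2))
```

The single volatility input `vol` is held constant across strikes and maturities here, which is one of the assumptions that the parametric generalizations and the non-parametric methods try to avoid.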
Non-parametric approaches are particularly useful when parametric solutions either lead to bias, are too complex to use, or do not exist at all. The purest versions of non-parametric option-pricing methods are the model-free methods. They involve no finance theory but estimate option prices inductively using historical or implied variables and transaction data. Although some form of parametric formula is usually involved, at least indirectly, it is not the starting point but a result of an inductive process. There are several methods in this group:
• Model-free option pricing with Genetic Programming (GP)
• Model-free option pricing with kernel regression
• Model-free option pricing with Artificial Neural Networks (ANN)

The independence of model-free approaches from any finance theory means that the prices they produce may not conform to rational pricing and/or may not capture the restrictions implied by arbitrage [10]. To improve model-free approaches in this respect, constraints have to be introduced [5]. There are several ways to enforce rational pricing in model-free pricing. The Equivalent Martingale Measure (EMM) adjusts prices to reflect a preference-free, risk-neutral market. In a risk-neutral economy all assets must earn the same return [6]. Under the risk-adjusted probability distribution, the stock price follows a martingale (a stochastic process in which the best forecast of tomorrow’s price is today’s price) and is arbitrage-free. Non-parametric adjustments to Black-Scholes estimate a portion of the option price non-parametrically while retaining the conventional option-pricing framework to