Show me the Money! Deriving the Pricing Power of Product Features by Mining Consumer Reviews

The increasing pervasiveness of the Internet has dramatically changed the way consumers shop for goods.  Consumer-generated product reviews have become a valuable source of information for customers, who read the reviews and decide whether to buy the product based on the information provided.  In this paper, we use techniques that decompose the reviews into segments that evaluate the individual characteristics of a product (e.g., image quality and battery life for a digital camera).  Then, as a major contribution of this paper, we adapt methods from the econometrics literature, specifically the hedonic regression concept, to estimate: (a) the weight that customers place on each individual product feature, (b) the implicit evaluation score that customers assign to each feature, and (c) how these evaluations affect the revenue for a given product.  Toward this goal, we develop a novel hybrid technique combining text mining and econometrics that models consumer product reviews as elements in a tensor product of feature and evaluation spaces.  We then estimate the quantitative impact of consumer reviews on product demand as a linear functional on this tensor product space.  We demonstrate how to use a low-dimensional approximation of this functional to significantly reduce the number of model parameters, while still providing good experimental results.  We evaluate our technique using a data set from Amazon.com consisting of sales data and the related consumer reviews posted over a 15-month period for 242 products.  Our experimental evaluation shows that we can extract actionable business intelligence from the data and better understand customer preferences and actions.  We also show that the textual portion of the reviews can improve product sales prediction compared to a baseline technique that relies only on numeric data.
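
To make the tensor-product idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes a review is encoded as counts over (feature, evaluation) pairs and that the low-dimensional approximation is a rank-1 factorization of the weight matrix into per-feature weights and per-evaluation scores. The vocabulary, function names, and numeric values are all hypothetical.

```python
# Hypothetical vocabulary: these feature and evaluation words are illustrative only.
FEATURES = ["image quality", "battery life"]
EVALUATIONS = ["great", "good", "poor"]


def review_matrix(opinions):
    """Encode a review as counts over (feature, evaluation) pairs.

    `opinions` is a list of (feature, evaluation) phrases extracted from the text.
    The result holds the review's coordinates in the feature x evaluation space.
    """
    x = [[0.0] * len(EVALUATIONS) for _ in FEATURES]
    for feature, evaluation in opinions:
        x[FEATURES.index(feature)][EVALUATIONS.index(evaluation)] += 1.0
    return x


def full_functional(weights, x):
    """General linear functional: one free weight per (feature, evaluation) pair."""
    return sum(weights[i][j] * x[i][j]
               for i in range(len(FEATURES))
               for j in range(len(EVALUATIONS)))


def rank_one_functional(feature_weights, evaluation_scores, x):
    """Assumed rank-1 restriction: weight[i][j] = feature_weights[i] * evaluation_scores[j]."""
    return sum(feature_weights[i] * evaluation_scores[j] * x[i][j]
               for i in range(len(FEATURES))
               for j in range(len(EVALUATIONS)))


def parameter_counts(n_features, n_evaluations):
    """Return (full, rank-1) parameter counts for a given vocabulary size."""
    return n_features * n_evaluations, n_features + n_evaluations


# Example review: "great image quality, poor battery life" (made-up values below).
review = review_matrix([("image quality", "great"), ("battery life", "poor")])
gamma = [0.5, 0.25]        # hypothetical feature weights
delta = [2.0, 1.0, -1.0]   # hypothetical evaluation scores
score = rank_one_functional(gamma, delta, review)
```

Under this assumption, a vocabulary of 20 features and 30 evaluation words needs 20 + 30 = 50 parameters instead of 20 × 30 = 600, which illustrates the kind of parameter reduction a low-dimensional approximation can provide.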