By Crystal Tatis


Traditionally, marketing research has relied on analysis tools like SPSS and SAS. However, R also offers many of the data analysis techniques used in marketing research, such as ANOVA, regression, and matrix factorization. We all know that the benefit of R is that it is free and that you can rerun an analysis and get updated results as many times as you please. Although I am familiar with the R language, I did not know I could carry out statistical analyses in a marketing context in such detail, so I wanted to share with you what I have learned. Let’s get started!

Most companies are constantly searching for incremental improvements to their current products and/or services. But of course, in this postmodern market, life is not as easy as just asking consumers what they want. As soon as we start asking about the details of feature preferences or product usage, we end up with many more variables in our datasets than we expected. In essence, we want to explain this large number of variables in the simplest way possible.

For example, instead of ranking the importance of color in your future purchase of a car, suppose that you are shown a multitude of color alternatives. You are more likely to answer “no” to most of them than to actually choose that unique shade of dark navy blue, #05226E, which makes sense because some colors are associated with options you are not buying. To respond to the ranking question, by contrast, the purchaser of the car will think back to any “color problems” he might have experienced in the past. The manufacturer, on the other hand, is concerned about “color problems” in the future, when only a small number of specific colors will be available.



The resulting data will be difficult to analyze with traditional techniques, since it will be high dimensional (anywhere from a few dozen to many thousands of dimensions) and sparse. This is where R is a big benefit to us: it offers tools from statistical learning designed for the kind of data that arises whenever we provide this much context. We can use R packages for nonnegative matrix factorization (NMF) to summarize such fragmented, high-dimensional data with a much smaller set of dimensions. Matrix factorization basically exposes a hidden structure within the observed color choices by identifying the latent benefits behind them.
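To make the idea concrete, here is a minimal sketch of NMF using the classic multiplicative update rules for squared error. It is written in plain Python only so that it runs without installing anything; an R package for NMF would do this work for you. The respondents, the color list, the choice of two latent dimensions, and the helper names (matmul, transpose, nmf, squared_error) are all made up for illustration and are not taken from any real survey or library.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, rank, iterations=500, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix V (n x m) as W (n x rank) times H (rank x m)
    using multiplicative updates that keep every entry nonnegative."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Start from strictly positive random factors.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element by element
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element by element
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H

def squared_error(V, W, H):
    """Sum of squared differences between V and its reconstruction W H."""
    R = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, R) for v, r in zip(rv, rr))

# Hypothetical survey: rows are respondents, columns are car colors,
# 1 means "I would consider this color" and 0 means "no".
colors = ["black", "grey", "navy", "red", "orange", "yellow"]
V = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]

W, H = nmf(V, rank=2)
error = squared_error(V, W, H)

# Each row of H is one latent dimension; list its three heaviest colors.
for k, row in enumerate(H):
    top = sorted(zip(row, colors), reverse=True)[:3]
    print("latent dimension", k, [color for _, color in top])
print("reconstruction error:", round(error, 3))
```

Each row of H describes one latent dimension in terms of the colors, and each row of W says how strongly a respondent loads on those dimensions. In a real study you would try several values for the number of latent dimensions and compare the results, rather than fixing it at two as this toy example does.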

What we hope to get out of this is a new perspective on how consumers go through the consumption process. Ideally, we gain some insight into the criteria and thought process consumers apply while shopping, which will help us tell whether a consumer will or will not buy the product or service.


If you want to find out what your car color says about you, click on the link below.

What does your car color say about you?


For detailed examples on Matrix Factorization, see the following: