Key Driver Analysis

Note: This feature is only available with the Professional Analytics licence.

What is Key Driver Analysis?

Key Driver Analysis is a powerful method to understand what motivates the decision-making of your respondents. It is typically used to find out which factors matter most for a selected outcome metric, such as NPS, customer satisfaction score, or purchase intent.

Example: Company X continually measures its NPS across various touchpoints with its clients. It can use Key Driver Analysis to understand which underlying aspects of the business have the most significant impact on changes in its NPS.

The Drivers are the factors that drive the specified outcome. In the NPS example above, you can additionally ask your respondents to rate different aspects of your business, such as:

  1. Pricing competitiveness.
  2. Product quality.
  3. Product range.
  4. Customer support.
  5. Personalized approach.

By using Key Driver Analysis you can identify which of those factors have the biggest impact when it comes to the company’s NPS.

The identification of key drivers allows organizations to focus their efforts and resources on the factors that have the most substantial impact on achieving their goals.

How to read the results?

To interpret Key Driver Analysis results, first identify:

  1. Outcome metric (dependent variable): the outcome that you want to optimize (e.g. NPS).
  2. Predictors (independent variables): the factors that are analyzed against that outcome metric.

All selected predictors are assessed on the following 2 dimensions:

  1. Importance (Y-axis): how big is the relative impact on the outcome variable (e.g. NPS).
  2. Performance (X-axis): how high is the respondents’ score for this variable (normalized mean).

The further to the right the variable is on the chart, the more satisfied the respondents are with that aspect, according to survey results.

The higher up a variable appears on the chart, the stronger the impact of that variable is on the outcome metric (NPS).

Always consider the Model Accuracy Score before using the analysis results. As a general rule, the higher the model accuracy, the better. Basing business decisions on the results is not advised if the model accuracy is below 50%. For more detail, see the “Model evaluation metrics” section below.

The chart is divided into 4 quadrants by reference lines drawn at the mean values of the Importance and Performance scores:

  1. Key weaknesses: usually the most important quadrant to focus on. The top left quadrant contains drivers that are important but poorly rated. In the example chart, the personalized approach falls in this quadrant: it has a big impact on the NPS, but it is currently poorly rated by respondents. Based on this analysis, it might be a good idea to prioritize improving the personalized approach of your services.
  2. Key strengths: the top right quadrant contains aspects that are important and highly rated. These key drivers are already performing well and have a strong positive effect on the outcome metric (NPS). In the example chart, pricing contributes positively to the company's NPS.
  3. Unimportant weaknesses: the bottom left quadrant contains aspects that are not important and poorly rated. Although respondents rate these drivers poorly, they are not an important factor in determining the company's NPS. In the example chart, investing more in the product range or customer support would not be recommended.
  4. Unimportant strengths: the bottom right quadrant contains aspects that are not important but highly rated. Although these drivers received a good rating, they have little influence on the outcome metric (NPS). In the example chart, the company is not advised to invest more in product quality.
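
The quadrant assignment described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_drivers`, the score values, and the rule that a score exactly on a reference line counts as "high" are all assumptions, not part of Survalyzer.

```python
from statistics import mean


def classify_drivers(drivers):
    """Assign each driver to a quadrant, using the mean Importance and the
    mean Performance of all drivers as the two reference lines."""
    importance_ref = mean(d["importance"] for d in drivers.values())
    performance_ref = mean(d["performance"] for d in drivers.values())
    quadrants = {}
    for name, d in drivers.items():
        # Assumption: a score exactly on the reference line counts as "high".
        important = d["importance"] >= importance_ref
        well_rated = d["performance"] >= performance_ref
        if important and not well_rated:
            quadrants[name] = "Key weakness"
        elif important and well_rated:
            quadrants[name] = "Key strength"
        elif not important and not well_rated:
            quadrants[name] = "Unimportant weakness"
        else:
            quadrants[name] = "Unimportant strength"
    return quadrants


# Hypothetical scores, invented for illustration only.
example = {
    "Pricing competitiveness": {"importance": 0.30, "performance": 78},
    "Product quality": {"importance": 0.08, "performance": 70},
    "Product range": {"importance": 0.05, "performance": 40},
    "Customer support": {"importance": 0.07, "performance": 45},
    "Personalized approach": {"importance": 0.50, "performance": 35},
}
result = classify_drivers(example)
for driver, quadrant in result.items():
    print(f"{driver}: {quadrant}")
```

With these invented numbers the reference lines fall at an Importance of 0.20 and a Performance of 53.6, so the personalized approach lands in the key weaknesses quadrant and pricing in the key strengths quadrant.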

How to use the Key Driver Analysis method in Survalyzer?

The minimum required number of interviews for the calculation to work is 50. KDA calculation will not be started if the number of interviews is lower.

1. Start Key Driver Analysis

2. Outcome Variable

3. Predictor Variables

4. Method (calculation algorithm)

5. Filters

6. Results

7. Add chart & table to the report

Key Driver Analysis: details explained

Performance score calculation

Performance score (X-axis) is the normalized mean value of a variable. In simple terms, it is the mean value converted to a percentage, ranging from 0 to 100%. Normalizing allows you to compare variables with different scales.

Performance score calculation = (Mean value – Min value) / (Max value – Min value) × 100, where Min value and Max value are the lowest and highest points of the rating scale.

Examples:

The respondents rate your company’s pricing on a scale of 0 – 10 and the average is 7.8. Performance score is (7.8 – 0) / (10 – 0) × 100 = 78 (%).

The product quality is rated on a 1 – 5 scale and the average rating is 3.8. Performance score is (3.8 – 1) / (5 – 1) × 100 = 70 (%).
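
The same calculation as a minimal Python sketch. The function name `performance_score` is hypothetical; the two calls reproduce the examples above.

```python
def performance_score(mean_value, scale_min, scale_max):
    """Normalized mean expressed as a percentage between 0 and 100."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    return (mean_value - scale_min) / (scale_max - scale_min) * 100


# Pricing: mean of 7.8 on a 0-10 scale.
pricing = performance_score(7.8, 0, 10)
# Product quality: mean of 3.8 on a 1-5 scale.
quality = performance_score(3.8, 1, 5)

print(round(pricing, 1), round(quality, 1))  # prints 78.0 70.0
```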

Importance score calculation

The Importance score is calculated with an ML.NET machine learning regression model. It is the result of a two-step approach:

  1. A machine learning model is built with a regression algorithm to predict the value of the outcome metric from a set of related features (predictor variables). Regression algorithms model the dependency of the outcome metric on its related features to determine how the outcome will change as the values of the features are varied.
  2. The Permutation Feature Importance (PFI) technique is used to assess the importance of individual features (variables) in the predictive model. At a high level, it works by randomly shuffling the data of one feature at a time across the entire dataset and calculating how much the performance metric of interest (e.g. R-squared) decreases. The larger the decrease, the more important that feature is.

This two-step approach is used to avoid multicollinearity bias that might occur in pure regression analysis.

Please note that the Importance score is relative. It depends on the specific model created and may differ with any change to the model (e.g. adding or removing predictors, or changing the selected algorithm).
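
To make the PFI idea concrete, here is a minimal pure-Python sketch. It is not the ML.NET implementation: `toy_model` is a hypothetical stand-in for a trained regression model, the data is randomly generated, and the function names are invented for this illustration.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def permutation_importance(predict, rows, y_true, n_features, seed=0):
    """Return, per feature, the drop in R-squared after shuffling that
    feature's column while leaving all other columns untouched."""
    rng = random.Random(seed)
    baseline = r_squared(y_true, [predict(r) for r in rows])
    drops = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled_rows = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        score = r_squared(y_true, [predict(r) for r in shuffled_rows])
        drops.append(baseline - score)
    return drops


def toy_model(row):
    # Hypothetical "trained" model: the outcome depends strongly on
    # feature 0, weakly on feature 1, and not at all on feature 2.
    return 2.0 * row[0] + 0.5 * row[1]


# Synthetic data: 200 respondents, 3 predictors, outcome with some noise.
data_rng = random.Random(42)
rows = [[data_rng.uniform(0, 10) for _ in range(3)] for _ in range(200)]
y = [toy_model(r) + data_rng.gauss(0, 0.5) for r in rows]

drops = permutation_importance(toy_model, rows, y, n_features=3)
for j, drop in enumerate(drops):
    print(f"feature {j}: drop in R-squared = {drop:.3f}")
```

Shuffling feature 0 degrades the predictions the most, feature 1 less so, and feature 2 not at all, which is the ordering that PFI reports as importance. Because the drops are measured against one particular model, they change whenever the model changes, which is why the Importance score is relative.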


Methods (calculation algorithms) available

The FastTree method should provide good results for common Key Driver Analysis scenarios. However, depending on the nature of your data, you may achieve better results by testing the available calculation algorithms and selecting the one that works best for your case.

Method     Description
FastTree (default)     A decision tree-based algorithm that is used for both regression and classification problems. It is known for its speed and accuracy, and is particularly useful when working with large datasets. It is a good choice when you need to train a model quickly and don’t have a lot of time to spend on feature engineering.
FastForest     A random forest-based algorithm that is used for regression problems. It is similar to FastTree, but it builds an ensemble of independently trained trees and averages their predictions. This helps to reduce overfitting and improve the accuracy of the model. It is a good choice when you have a large dataset and want to avoid overfitting.
OLS (Ordinary Least Squares)     A linear regression algorithm that is used to model the relationship between a dependent variable and one or more independent variables. It is a simple and widely used algorithm that is easy to interpret. It is a good choice when you have a small dataset and want to model a linear relationship between variables.
SDCA (Stochastic Dual Coordinate Ascent)     A linear regression algorithm that is used to solve large-scale optimization problems. It is particularly useful when working with sparse datasets, as it can handle large numbers of features. It is a good choice when you have a large dataset with many features and want to optimize for sparsity.
L-BFGS Poisson     A linear regression algorithm that is used to model count data. It is based on the Poisson distribution, which is commonly used to model count data. It is a good choice when you have count data and want to model the relationship between the count data and one or more independent variables.
Online Gradient Descent     A linear regression algorithm that is used to model the relationship between a dependent variable and one or more independent variables. It is particularly useful when working with large datasets that cannot fit into memory, as it can update the model parameters incrementally. It is a good choice when you have a large dataset and want to train a model in an online setting.

Further reading: How to choose an ML.NET algorithm – ML.NET

Model evaluation metrics

Metric          Description
Model accuracy          In our case, it is R-squared converted to a percentage value.
R2 (R-squared)          Represents the predictive power of the model as a value between negative infinity and 1. As a general rule, the closer to 1, the better the quality. You should consider models with R2 above 0.5. A negative R2 means that the model performs worse than always predicting the mean of the outcome; such a model should be disregarded entirely, and the result may also indicate a problem with the input data. However, sometimes low R-squared values (such as 0.5) can also be entirely normal for certain research cases, and very high R-squared values (such as 0.99) are not always good. See the further reading link below.
MAE (Mean Absolute Error)          Measures how close the predictions are to the actual outcomes. It is the average of all the model errors, where a model error is the absolute distance between the predicted outcome value and the correct outcome value. The closer to 0.00, the better the quality.
MSE (Mean Squared Error)          Tells you how close the model's predictions are to a set of test data values by averaging the squared distances between the predicted and the actual values. The squaring gives more weight to larger differences. It is always non-negative, and values closer to 0.00 are better.
RMSE (Root Mean Squared Error)          The square root of MSE. It has the same units as the predicted outcome, similar to MAE, but gives more weight to larger differences. It is always non-negative, and values closer to 0.00 are better.
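
As a rough illustration of how these metrics relate to each other, the sketch below computes them from a handful of actual and predicted outcome values. The function name `regression_metrics` and the sample numbers are invented for this example; it is not how Survalyzer or ML.NET computes them internally.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Compute R-squared, MAE, MSE and RMSE for paired actual/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return {
        "R2": r2,
        "MAE": mae,
        "MSE": mse,
        "RMSE": sqrt(mse),
        # Model accuracy as described above: R-squared as a percentage.
        "Model accuracy (%)": r2 * 100,
    }


# Hypothetical actual and predicted outcome values (e.g. 0-10 ratings).
actual = [9, 7, 10, 3, 6]
predicted = [8, 7, 9, 5, 6]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

For this sample, the errors are 1, 0, 1, -2 and 0, giving an MAE of 0.8, an MSE of 1.2, and an R-squared of 0.8 (a model accuracy of 80%). The single error of 2 contributes more to MSE and RMSE than to MAE, which shows the extra weight that squaring gives to larger differences.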

Further reading: Evaluation metrics for Regression and Recommendation

Updated on January 25, 2024