432 Views

# Which Aggregation Method Should I Choose?

Quick Thought Experiment
Suppose you knew the population and average income for five ZIP Codes in a town and wanted to compute the population and average income for the town as a whole. To get the town population, you’d need to add together the population of the five ZIP Codes. To get the average household income for the town, however, you wouldn’t add together the average income of the ZIP Codes. Instead, you com­pute an average of the income for the five ZIP Codes. This average has to take into account not only the average income of each ZIP, but also the number of households in each one. This is some­times called a weighted average.
When deciding how to choose your Aggregation Method, you need to ask yourself two questions:
If I knew data about two areas and wanted to get the same data for both the areas combined:
1. Would I want to add together my data or find an average in the middle?
2. If I want to find an average, which area should have a bigger pull on that average?
The answer to the first questions will tell you whether you need to choose Add or Average and the answer to the second question will tell you which field should be the weighting field.
Examples
Population:
1. The Population of two ZIP Codes would be the population of each added together. I would choose Add
Average Household Income:
1. The Average Household Income of two counties is not the sum of their individual averages, but rather somewhere in the middle of the two, so I would choose Average.
2. The county with more households should have a bigger pull on the Average Household Income, so the weighting field should be Households.
% Population Change:
1. The % Population Change of two postcodes is not the sum of their individual changes, but rather somewhere in the middle of the two, so I would choose Average.
2. The postcode with higher population should have a bigger pull on the overall change, so the weighting field is Population.
Sales:
1. The Total Sales of two territories would be the sum of each territory's Sales, so I would choose Add.
Sales per Rep:
1. To find sales per rep for the combination of two states, you wouldn’t add together the sales per rep for each state, but rather it would be somewhere in the middle of the two, so I would choose Average.
2. A state with many more reps should have a bigger impact on the average so the weighting field is Number of Reps.
General Tips
• For the most part, if a field has the words Average, Percent or Per in it, these would use the Average method.
• For the most part, if a field has the words Total, Number of, or Sum in it, these would be the Add method.
• Any field that involves division in the calculation should be Average and the field you divided by should be the weighting field.
• E.g. Average Household Income is Total Income divided by Number of Households, so Number of Households is the Weighing Field
• E.g. % Population Change is Total Population Change divided by Starting Population, so Starting Population is the Weighting Field
• If you can reword the field to use the word Per then the thing after Per is generally the weighting field.
• Average Income -> Income per Household -> Number of Households is the weighting field
• Sales per Rep -> Number of Reps is the weighting field
• Population Density -> Population per Area -> Area is the weighting field
• Average Sale value -> Sales Value per Sale -> Number of Sales is the weighting field
None of these tips will work perfectly for every field, so if you are unsure, the best thing to do would be to think about the original question of two areas being combined together and whether you would add your data together or whether you would find an average in the middle.