thermo.md

So the scoring module is in review (nstack-apps#68).

What remains to be done is figuring out how to plug Thermo Fisher's data into it, and what further information we need from them in order to do so.

To wit:

The data has five columns, given on a monthly cadence:

  1. % license used
  2. length of time a customer
  3. delta license since initiation
  4. % entity trending
  5. tickets created

Following a chat with Chris, we agreed to use monthly deltas for features 1, 4, and 5.
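A minimal sketch of what "monthly delta" means here, assuming each feature arrives as a chronologically ordered list of monthly values per customer. The function name and the sample numbers are made up for illustration; for the percentage features the delta would be in percentage points.

```python
def monthly_deltas(values):
    """Return month-over-month differences for a chronologically ordered series."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


# Hypothetical monthly "tickets created" counts for one customer.
tickets = [10, 12, 5, 5]
print(monthly_deltas(tickets))  # [2, -7, 0]
```

Note that the first month has no delta, so a series of n months yields n - 1 values.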

For each feature, we need to know if a higher value corresponds to a better or worse score. To me, this is clear for the first three (all higher is better), but not for 4 and 5. We should probably ask them, but for now I'll assume that higher is better for 4, and lower is better for 5.

In addition, to avoid a combinatorial explosion of required reference scores, I've split the features into three independent groups:

  • {1, 5}
  • {2, 3}
  • {4}

This grouping is based purely on my intuition of which features could be correlated with one another. The model will compute a score for each group, and then the final score will be based on these group scores.
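To make the "combinatorial explosion" concrete, here is a rough count of how many reference scores would be needed with and without the grouping. It assumes three reference values per feature (min, median, max) and one reference score per combination of values within a group; the names `GROUPS`, `LEVELS` and `reference_scores_needed` are just for illustration.

```python
# Feature numbers as listed above, split into the three groups.
GROUPS = [(1, 5), (2, 3), (4,)]
LEVELS = 3  # assumption: min, median and max for every feature


def reference_scores_needed(groups, levels):
    """Count one reference score per combination of levels within each group."""
    return sum(levels ** len(group) for group in groups)


print(reference_scores_needed([(1, 2, 3, 4, 5)], LEVELS))  # 243 (all features jointly)
print(reference_scores_needed(GROUPS, LEVELS))             # 21  (9 + 9 + 3)
```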

Next we need to determine suitable reference values for each feature. I propose to just use the min, median and max from the reference set. If the client wants more granularity, we can add further reference values per feature, but they'll then need to provide more reference scores.
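As a sketch of how the reference values and the combinations to be scored could be derived, assuming the reference set is available as plain lists of numbers per feature. The sample lists below are invented, chosen only so that their min, median and max are easy to read off; the helper names are hypothetical too.

```python
from itertools import product
from statistics import median


def reference_values(samples):
    """Return (min, median, max) of the observed values for one feature."""
    return (min(samples), median(samples), max(samples))


def reference_grid(*feature_samples):
    """Return every combination of reference values across a group's features."""
    return list(product(*(reference_values(s) for s in feature_samples)))


# Hypothetical observed deltas for the two features in one group.
license_delta = [-44, -10, 0, 25, 60]
tickets_delta = [-12, -3, 0, 4, 7]

print(reference_values(license_delta))                    # (-44, 0, 60)
print(len(reference_grid(license_delta, tickets_delta)))  # 9
```

Each tuple in the grid is one combination the client would need to score, so adding a fourth reference value per feature would raise a two-feature group from 9 to 16 required scores.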

Finally, we need a set of reference scores for combining the group scores into a final value. Here, I'm just going to assume that the groups have equal weightings and are independent. (This can also be revisited later if necessary.)
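One simple reading of "equal weightings" is a plain average of the group scores. The sketch below assumes exactly that; it is not necessarily how the scoring module combines groups, and `combine_group_scores` is a hypothetical helper, not part of that module.

```python
def combine_group_scores(group_scores, weights=None):
    """Weighted mean of group scores; equal weights unless others are given."""
    if weights is None:
        weights = [1.0] * len(group_scores)
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, group_scores)) / total


# Hypothetical scores (0-100) for the groups {1, 5}, {2, 3} and {4}.
print(combine_group_scores([0, 50, 100]))  # 50.0
```

Keeping the weights as an optional argument leaves room for revisiting the equal-weighting assumption later.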

Therefore, all we need from Thermo Fisher are reference scores for the following combinations (on a scale of 0-100):

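| delta % license used | delta tickets created | reference score |
|---|---|---|
| -44% | -12 | |
| -44% | 0 | |
| -44% | 7 | 0 |
| 0% | -12 | |
| 0% | 0 | |
| 0% | 7 | |
| +60% | -12 | 100 |
| +60% | 0 | |
| +60% | 7 | |

| length of time a customer | delta license since initiation | reference score |
|---|---|---|
| 0.75 | -13 | 0 |
| 0.75 | 0 | |
| 0.75 | 30 | |
| 2.0 | -13 | |
| 2.0 | 0 | |
| 2.0 | 30 | |
| 9.0 | -13 | |
| 9.0 | 0 | |
| 9.0 | 30 | 100 |

| delta entity trending | reference score |
|---|---|
| -100.11% | 0 |
| -1.14% | |
| +26.35% | 100 |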