So the scoring module is in review (nstack-apps#68).
What remains to be done is figuring out how to plug Thermo Fisher's data into it, and what further information we need from them in order to do so.
To wit:
The data has five columns, given on a monthly cadence:
1. % license used
2. length of time a customer
3. delta license since initiation
4. % entity trending
5. tickets created
Following a chat with Chris, we agreed to use monthly deltas for features 1, 4, and 5.
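For concreteness, here is a minimal sketch of what I mean by a monthly delta. It assumes a delta is the plain month-over-month difference (so percentage points for the % columns, not a relative change); that reading is my assumption and should be confirmed. The function name and the sample readings are hypothetical.

```python
def monthly_deltas(values):
    """Return month-over-month differences for a chronologically ordered series."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


# Hypothetical monthly "% license used" readings for one customer.
license_used = [40.0, 55.0, 50.0, 50.0]

deltas = monthly_deltas(license_used)
print(deltas)  # [15.0, -5.0, 0.0]
```

Note that a series of n months gives n - 1 deltas, so the first month of each customer has no delta.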
For each feature, we need to know whether a higher value corresponds to a better or worse score. To me, this is clear for the first three (higher is better for all of them), but not for 4 and 5. We should probably ask them, but for now I'll assume that higher is better for 4, and lower is better for 5.
In addition, to avoid a combinatorial explosion of required reference scores, I've split the features into three independent groups:
- {1, 5}
- {2, 3}
- {4}
This grouping is based purely on my intuition of which features could be correlated with one another. The model will compute a score for each group, and then the final score will be based on these group scores.
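To put a number on the combinatorial explosion, here is a quick count of how many reference scores we would have to ask for with and without the grouping. It assumes three reference values per feature (the min, median and max proposed for the reference values); the variable names are just for illustration.

```python
LEVELS = 3  # assumed reference values per feature: min, median, max

# Feature groups as listed above, using the feature numbers.
groups = [(1, 5), (2, 3), (4,)]

n_features = sum(len(group) for group in groups)

# Without grouping: one reference score per combination of all five features.
ungrouped = LEVELS ** n_features

# With grouping: one reference score per combination within each group.
grouped = sum(LEVELS ** len(group) for group in groups)

print(ungrouped, grouped)  # 243 21
```

So the grouping cuts the request from 243 reference scores to 21.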
Next we need to determine suitable reference values for each feature. I propose to just use the min, max and median from the reference set. If the client wants more granularity, we can do that, but they'll need to provide more reference scores.
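A rough sketch of how the reference values and the combinations to be scored could be derived for one group. The sample lists are made up (I only picked them so the extremes match the first table); the real values come from the reference set.

```python
from itertools import product
from statistics import median


def reference_values(samples):
    """Return (min, median, max) of the observed values for one feature."""
    return (min(samples), median(samples), max(samples))


# Hypothetical observations, NOT the real reference set.
delta_license_used = [-44, -10, 0, 25, 60]
delta_tickets_created = [-12, -3, 0, 2, 7]

# Every combination of reference values within the group {1, 5}.
grid = list(product(reference_values(delta_license_used),
                    reference_values(delta_tickets_created)))

for row in grid:
    print(row)
```

Each printed pair is one row for which we need a reference score from the client.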
Finally, we need a set of reference scores for combining the group scores into a final value. Here, I'm just going to assume that the groups have equal weightings and are independent. (This can also be revisited later if necessary.)
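One simple way to read "equal weightings and independent" is a plain average of the group scores. The sketch below is only that reading, not necessarily what the scoring module does internally; the function name and the optional `weights` parameter are hypothetical.

```python
def combine_group_scores(group_scores, weights=None):
    """Weighted mean of group scores (each on a 0-100 scale).

    Assumption: equal weights unless told otherwise.
    """
    if weights is None:
        weights = [1.0] * len(group_scores)
    if len(weights) != len(group_scores):
        raise ValueError("need exactly one weight per group score")
    total = sum(weights)
    return sum(s * w for s, w in zip(group_scores, weights)) / total


# Hypothetical group scores for groups {1, 5}, {2, 3} and {4}.
print(combine_group_scores([100, 50, 0]))  # 50.0
```

If the client later wants unequal weightings, passing `weights` would cover that without changing anything else.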
Therefore, all we need from Thermo Fisher are reference scores for the following combinations (on a scale of 0-100):
| delta % license used | delta tickets created | reference score |
|---|---|---|
| -44% | -12 | |
| -44% | 0 | |
| -44% | 7 | 0 |
| 0% | -12 | |
| 0% | 0 | |
| 0% | 7 | |
| +60% | -12 | 100 |
| +60% | 0 | |
| +60% | 7 | |

| length of time a customer | delta license since initiation | reference score |
|---|---|---|
| 0.75 | -13 | 0 |
| 0.75 | 0 | |
| 0.75 | 30 | |
| 2.0 | -13 | |
| 2.0 | 0 | |
| 2.0 | 30 | |
| 9.0 | -13 | |
| 9.0 | 0 | |
| 9.0 | 30 | 100 |

| delta entity trending | reference score |
|---|---|
| -100.11% | 0 |
| -1.14% | |
| +26.35% | 100 |