[abacus] Data structures for temporal usage windows


Jean-Sebastien Delfino
 

Hi Ben,

Following up on our discussion of the data structures we can use to
represent our various time windows (some background is in GitHub #33 [1],
and I've copied the latest GitHub comment below as well):

What you're proposing looks pretty good to me. I like your idea of renaming
this array 'windows', and grouping the usage quantity and the related cost,
charge, summary etc together.

This makes it clear that the array is about windowing (chopping our stream
of usage into finite temporal windows / buckets). The 'windows' array is
also already contained under a 'usage' object (or 'accumulated_usage' or
'aggregated_usage', depending on which usage processing step we're at), so
IMO there's no need to repeat 'usage' again here.
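
To make the windowing idea concrete, here's a rough sketch of how a stream
of usage events could be chopped into the window containing the current
time, for a few window sizes. The sizes, the event shape ({ time, quantity })
and the function names are my assumptions for illustration only, not what
Abacus actually does.

```javascript
// Hypothetical fixed window sizes in milliseconds: second, minute, hour
// (assumed for illustration, not taken from Abacus)
const SIZES = [1000, 60000, 3600000];

// Start of the window of the given size that contains time t
const windowStart = (t, size) => Math.floor(t / size) * size;

// Sum the quantities of the events that fall in the same window as `now`,
// once per window size, and store the results under usage.windows
const accumulate = (events, now) => ({
  usage: {
    windows: SIZES.map((size) => ({
      quantity: events
        .filter((e) => windowStart(e.time, size) === windowStart(now, size))
        .reduce((sum, e) => sum + e.quantity, 0)
    }))
  }
});

const now = Date.UTC(2015, 9, 1, 12, 30, 30, 500);
const events = [
  { time: now - 100, quantity: 1 },        // same second as now
  { time: now - 10000, quantity: 2 },      // same minute, earlier second
  { time: now - 20 * 60000, quantity: 4 }  // same hour, earlier minute
];

console.log(accumulate(events, now).usage.windows.map((w) => w.quantity));
// prints [ 1, 3, 7 ]
```

Each entry in 'windows' is one bucket, so the enclosing 'usage' object
already says what is being bucketed.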

BTW looking at this again triggered another idea about that array, but I
need to think a bit more about it before proposing another minor change on
top of what you have here. Will post again later on that topic.

[1] https://github.com/cloudfoundry-incubator/cf-abacus/issues/33

-- Jean-Sebastien

Benjamin Scheng wrote:

Following this design in the accumulator and aggregator meant changing the
quantity to a 7-length array.

When we get to rate, instead of making an equivalent 7-length array for
costs, it'd be better to keep all the values associated with that quantity
in one object (similar to the current and previous quantities in the
accumulator). It wouldn't make sense to just call it quantity when it also
holds the cost. Here's what the structure would most likely look like in
terms of aggregated_usage:

{
  aggregated_usage: [
    {
      metric: 'memory',
      windows: [
        { cost: 0, quantity: 0 },
        { cost: 0, quantity: 0 },
        { cost: 1, quantity: 1 },
        { cost: 24, quantity: 24 },
        { cost: 720, quantity: 720 },
        { cost: 8640, quantity: 8640 },
        { cost: 86400, quantity: 86400 }
      ]
    }
  ]
}

'windows' would hold the new quantity and cost in a 7-length array.
Since charge & summary will also be added to this object during the report,
the name ought to accommodate them. At the risk of sounding redundant, would
usage or usage_amount work better here, or are there other terms that would
make more sense?
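
As a rough sketch of why one object per window is convenient: the per-window
objects can be built from parallel arrays, and a later step can add fields
without renaming anything. The helper names toWindows and withCharge are
hypothetical, and setting charge equal to cost is only a placeholder, not
the real charge calculation.

```javascript
// Hypothetical helper (not from Abacus): zip parallel per-window arrays into
// one array of window objects
const toWindows = (quantities, costs) => {
  if (quantities.length !== costs.length)
    throw new Error('quantities and costs must have the same length');
  return quantities.map((quantity, i) => ({ quantity, cost: costs[i] }));
};

// Hypothetical report step: add a charge to each window and keep the fields
// that are already there. charge = cost is a placeholder assumption.
const withCharge = (windows) => windows.map((w) => ({ ...w, charge: w.cost }));

// Same sample values as in the structure above
const values = [0, 0, 1, 24, 720, 8640, 86400];
const aggregated = {
  aggregated_usage: [{ metric: 'memory', windows: toWindows(values, values) }]
};

const reported = withCharge(aggregated.aggregated_usage[0].windows);
console.log(reported[3]);
// prints { quantity: 24, cost: 24, charge: 24 }
```

The array name stays the same whether a window holds only a quantity or
also a cost, charge and summary.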

Thoughts? Suggestions?