Re: [abacus] Accepting delayed usage within a slack window


Jean-Sebastien Delfino
 

Hi Ben,

That makes sense to me. What you've described will enable refinements of
accumulated usage for a month as we continue to receive delayed usage
during the first few days of the next month.

To illustrate this with an example: with a 48h time window, on Sept 30 you
can retrieve the Sept 30 usage doc and find 'provisional' usage for Sept in
the 'month time window', not including usage that has not yet been submitted
to Abacus. Later on Oct 2nd you can retrieve the Oct 2nd usage doc and find
the 'final usage' for Sept in the 'month - 1 time window'. I think this is
better than waiting for Oct 2nd to 'close the Sept window', as our users
typically want to see both their *real time* usage for Sept before Oct 2nd
and their final usage later once it has settled for sure.

I also like that with that approach you don't need to go back to your Sept
30 usage doc to patch it up with delayed usage, as that way you're also
keeping a record of the Sept usage that was really known to us on Sept 30.

Another interesting aspect of this is that the history you're going to
maintain will allow us to write 'marker' usage docs when we transition from
one time window to another. Since a usage doc contains both the usage for
the day and the previous day, you can write the first document you process
each day, as a marker, in a reporting db and that'll give you an easy and
efficient way to retrieve the accumulated usage for the previous day. For
example, to retrieve the usage accumulated at the end of Oct 11, just
retrieve the 'marker' usage doc for Oct 12 and get the usage in its 'day -
1 time window'. That could help us implement the kind of query that Georgi
mentioned on the chat last week when he was looking for an efficient way to
retrieve daily usage for all the days of the month.
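
To make the marker idea a bit more concrete, here is a minimal Python sketch.
Everything in it is an assumption for illustration only: the in-memory
`markers` dict stands in for a reporting db, and the doc layout
(`doc['windows']['D']` holding [D, D-1]) is hypothetical, not the actual
Abacus usage doc schema.

```python
from datetime import date, timedelta

# Hypothetical in-memory stand-in for a reporting db, keyed by ISO date.
markers = {}

def record(doc_date, doc):
    """Keep only the first usage doc processed on each day as its marker."""
    markers.setdefault(doc_date.isoformat(), doc)

def usage_at_end_of(day):
    """Accumulated usage at the end of `day`, read from the next day's marker."""
    marker = markers[(day + timedelta(days=1)).isoformat()]
    # Assumed layout: doc['windows']['D'] is [D, D-1]; index 1 is 'day - 1'.
    return marker['windows']['D'][1]

# Two usage docs processed on Oct 12; only the first one becomes the marker.
record(date(2015, 10, 12), {'windows': {'D': [3, 120]}})
record(date(2015, 10, 12), {'windows': {'D': [8, 120]}})

print(usage_at_end_of(date(2015, 10, 11)))  # 120
```

With one marker per day, a query for the daily usage of a whole month would
only need to read one doc per day.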

Finally, looking at the array of numbers/objects currently used to maintain
our time windows, I'm wondering if keeping the 'yearly' and 'forever' usage
time windows is not a bit overkill (and could actually become a problem).

That data is going to be duplicated in all individual usage docs for little
value IMO as the yearly usage at least is easy to reconstruct at reporting
time with a query over 12 monthly usage docs. Also, maintaining that
'forever' usage will require us to keep usage docs around for resource
instances that may have been deleted a long time ago, and will complicate our
database partitioning scheme as these old resource instances will cause the
databases to grow forever. So, I'd prefer to let old usage data sit in old
monthly database partitions instead of having to carry that old data over
each month forever just to maintain these 'forever' time windows.

In other words, I'm suggesting we change our current array of 7 time
windows [Forever, Y, M, D, h, m, s] to 5 windows [M, D, h, m, s]. Combined
with your slack window proposal, with a 2D slack time we'll be looking at
an array like this: [[M, M-1], [D, D-1, D-2], [h], [m], [s]]. With a 48h
slack time the array will have 49 hourly entries [h, h-1, h-2, h-3, etc]
instead of one.
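
As a rough illustration of the resulting shape, the Python sketch below
builds the labels for the 5-window array. The `layout` function and its
`previous` argument (how many earlier windows are kept per unit) are
hypothetical names; the counts are taken as given here rather than derived
from SLACK.

```python
def layout(previous):
    """Build labels for the [M, D, h, m, s] time windows.

    `previous` maps a window unit to the number of earlier windows kept
    for the slack (an assumed input; units not listed keep none).
    """
    units = ['M', 'D', 'h', 'm', 's']
    return [
        [u] + [f'{u}-{i}' for i in range(1, previous.get(u, 0) + 1)]
        for u in units
    ]

# 2D slack: one previous month, two previous days, no hourly history.
two_days = layout({'M': 1, 'D': 2})
print(two_days)
# [['M', 'M-1'], ['D', 'D-1', 'D-2'], ['h'], ['m'], ['s']]

# 48h slack: same as above plus 48 previous hours.
forty_eight_hours = layout({'M': 1, 'D': 2, 'h': 48})
print(len(forty_eight_hours[2]))  # 49
```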

Thoughts?


- Jean-Sebastien

On Sun, Oct 11, 2015 at 6:04 AM, Benjamin Cheng <bscheng(a)us.ibm.com> wrote:

One of the things that need to be supported in abacus is the handling of
delayed usage submissions within a particular slack window after the usage
has happened. For example, given a slack window of 48 hours, a service
provider will be able to submit usage back to September 30th on October 2nd.

An idea that we were discussing for this was augmenting the quantity
from an array of numbers/objects to an array of arrays of numbers/objects
and using an environment variable, currently going to be called SLACK, to
hold the configuration of the slack window. SLACK would follow the format
[0-9]+[YMDhms], giving both the width of the slack window and the
precision to which it should be maintained. 2D and 48h both cover the same
span of time, but 48h will keep track of the history to the hour level
while 2D will only keep it to the day level. If this environment variable
isn't configured, the current idea is to have no slack window as the
default.

The general formula for the length of each array in a time window would be
as follows: 1 (for usage covered in the current window) + (number of
windows needed to cover the configured slack window for the particular time
window).
For example, given a slack of 48h, the year time window would be 1 + 1,
month would be 1 + 1, day would be 1 + 2, and hours would be 1 + 48.
Minutes and seconds would stay at 1.
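
The formula above could be sketched in Python roughly as follows. This is
only one reading of the proposal: `window_lengths`, `SECONDS` and `ORDER`
are hypothetical names, and the fixed 30-day month and 365-day year are
simplifying assumptions used to count how many windows a slack spans.

```python
import re

# Nominal window lengths in seconds. The 30-day month and 365-day year
# are simplifying assumptions for this sketch, not part of the proposal.
SECONDS = {'Y': 365 * 86400, 'M': 30 * 86400, 'D': 86400,
           'h': 3600, 'm': 60, 's': 1}
ORDER = ['Y', 'M', 'D', 'h', 'm', 's']  # coarsest to finest

def window_lengths(slack=None):
    """Return the array length per time window for a SLACK value like '48h'."""
    if not slack:
        # SLACK not configured: no slack window, one entry per time window.
        return {u: 1 for u in ORDER}
    match = re.fullmatch(r'([0-9]+)([YMDhms])', slack)
    if match is None:
        raise ValueError(f'invalid SLACK value: {slack!r}')
    width, unit = int(match.group(1)), match.group(2)
    span = width * SECONDS[unit]
    lengths = {}
    for u in ORDER:
        if ORDER.index(u) > ORDER.index(unit):
            # Finer than the slack precision: no history is kept.
            lengths[u] = 1
        else:
            # 1 for the current window + windows needed to cover the slack
            # (integer ceiling division).
            lengths[u] = 1 + (span + SECONDS[u] - 1) // SECONDS[u]
    return lengths

print(window_lengths('48h'))
# {'Y': 2, 'M': 2, 'D': 3, 'h': 49, 'm': 1, 's': 1}
print(window_lengths('2D'))
# {'Y': 2, 'M': 2, 'D': 3, 'h': 1, 'm': 1, 's': 1}
```

Both values cover the same two days, but only 48h keeps hourly history.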

Thoughts on this idea?
