Re: [abacus] Accepting delayed usage within a slack window
Jean-Sebastien Delfino
Hi Ben,
That makes sense to me. What you've described will let us refine the accumulated usage for a month as we continue to receive delayed usage during the first few days of the next month.

To illustrate with an example: with a 48h time window, on Sept 30 you can retrieve the Sept 30 usage doc and find 'provisional' usage for Sept in the 'month time window', not including any usage that has not yet been submitted to Abacus. Later, on Oct 2nd, you can retrieve the Oct 2nd usage doc and find the 'final usage' for Sept in the 'month - 1 time window'.

I think this is better than waiting for Oct 2nd to 'close the Sept window', as our users typically want to see both their *real time* usage for Sept before Oct 2nd and their final usage later, once it has settled for sure. I also like that with this approach you don't need to go back to your Sept 30 usage doc to patch it up with delayed usage, so you're also keeping a record of the Sept usage that was really known to us on Sept 30.

Another interesting aspect is that the history you're going to maintain will allow us to write 'marker' usage docs when we transition from one time window to another. Since a usage doc contains both the usage for the day and the usage for the previous day, you can write the first document you process each day, as a marker, in a reporting db, and that'll give you an easy and efficient way to retrieve the accumulated usage for the previous day. For example, to retrieve the usage accumulated at the end of Oct 11, just retrieve the 'marker' usage doc for Oct 12 and get the usage in its 'day - 1 time window'. That could help us implement the kind of query that Georgi mentioned on the chat last week when he was looking for an efficient way to retrieve daily usage for all the days of the month.
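As a rough illustration of the 'marker' idea above, here is a minimal Python sketch. It is not Abacus code: the doc shape (`processed`, `windows`), the function names, and the numbers are all made up for the example, and it assumes each window is a list where index 0 is the current period and index 1 is the previous one.

```python
from datetime import date, timedelta

# Hypothetical doc shape, not the real Abacus schema:
# {"processed": date, "windows": {"month": [M, M-1], "day": [D, D-1, D-2]}}

def record_marker(markers, doc):
    """Keep only the first usage doc processed on each day as that day's marker."""
    markers.setdefault(doc["processed"], doc)

def usage_at_end_of(markers, day):
    """Read the usage accumulated at the end of `day` from the 'day - 1' slot
    of the marker doc written on the following day."""
    marker = markers[day + timedelta(days=1)]
    return marker["windows"]["day"][1]

markers = {}
# Two docs processed on Oct 12 (made-up numbers); only the first becomes the marker.
record_marker(markers, {"processed": date(2015, 10, 12),
                        "windows": {"month": [460, 1300], "day": [5, 42, 37]}})
record_marker(markers, {"processed": date(2015, 10, 12),
                        "windows": {"month": [470, 1300], "day": [15, 42, 37]}})

# Usage accumulated at the end of Oct 11, read from the Oct 12 marker.
oct_11_usage = usage_at_end_of(markers, date(2015, 10, 11))
```

With one marker per day, a query for the daily usage of a whole month would be a lookup of about 30 marker docs rather than a scan over every usage doc.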
Finally, looking at the array of numbers/objects currently used to maintain our time windows, I'm wondering if keeping the 'yearly' and 'forever' usage time windows is a bit overkill (and could actually become a problem). That data is going to be duplicated in all individual usage docs for little value IMO, as the yearly usage at least is easy to reconstruct at reporting time with a query over 12 monthly usage docs.

Also, maintaining that 'forever' usage will require us to keep usage docs around for resource instances that may have been deleted a long time ago, and will complicate our database partitioning scheme, as these old resource instances will cause the databases to grow forever. So I'd prefer to let old usage data sit in old monthly database partitions instead of having to carry that old data over each month forever just to maintain these 'forever' time windows.

In other words, I'm suggesting that we change our current array of 7 time windows [Forever, Y, M, D, h, m, s] to 5 windows [M, D, h, m, s]. Combined with your slack window proposal, with a 2-day (2D) slack time we'll be looking at an array as follows: [[M, M-1], [D, D-1, D-2], [h], [m], [s]]. With a 48h slack time the array will have 49 hourly entries [h, h-1, h-2, h-3, etc] instead of one.

Thoughts?

- Jean-Sebastien

On Sun, Oct 11, 2015 at 6:04 AM, Benjamin Cheng <bscheng(a)us.ibm.com> wrote:
One of the things that need to be supported in abacus is the handling of
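
To make the reduced window array proposed in the reply above more concrete, here is a small Python sketch that builds the slot labels for a given slack. It is only an illustration under stated assumptions: the function name, the 31-day month approximation, and the rule that every window at least as coarse as the slack unit gets enough previous slots to cover the slack are all guesses, not something the thread or the Abacus code confirms.

```python
# Approximate window sizes in seconds, coarsest first, as in [M, D, h, m, s].
# The 31-day month is an assumption made only for this sketch.
WINDOW_SECONDS = {"M": 31 * 86400, "D": 86400, "h": 3600, "m": 60, "s": 1}

def window_labels(slack_amount, slack_unit):
    """Build the nested array of window labels for a given slack.

    Windows finer than the slack unit keep a single slot. The slack unit's
    window and every coarser window get enough previous slots to cover the
    whole slack period (assumed rule)."""
    unit_seconds = WINDOW_SECONDS[slack_unit]
    slack_seconds = slack_amount * unit_seconds
    result = []
    for name, size in WINDOW_SECONDS.items():
        if size < unit_seconds:
            previous = 0
        else:
            # Integer ceiling of slack_seconds / size.
            previous = -(-slack_seconds // size)
        result.append([name] + [f"{name}-{k}" for k in range(1, previous + 1)])
    return result

two_day = window_labels(2, "D")
forty_eight_hours = window_labels(48, "h")
```

For a 2D slack, `two_day` has the shape given in the message, [[M, M-1], [D, D-1, D-2], [h], [m], [s]]. For a 48h slack, the hourly entry of `forty_eight_hours` grows to 49 slots, which shows how quickly the per-doc array grows when the slack is expressed in a fine-grained unit.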