[abacus] Reporting Query
Benjamin Cheng
In abacus, a user can get a report for an organization by passing a timestamp. This is done via a descending query that starts at the given timestamp and ends at the beginning of the previous year. This ensures that the report will have usage data going back as far as the previous year for the given organization. Until time windows support was recently added, this query only looked at the current day.
With time windows, each record holds usage accumulated/aggregated/rated over the last second/minute/hour/day/month/year/forever, in sync with the document's end time. Querying back to the beginning of the previous year was chosen so that the user can see usage up to the previous year for any given timestamp.

Here are a couple of questions I have regarding this:

- In addition to partitioning by bucketing, abacus partitions its databases by period, in terms of months. Therefore, each month with usage has its own set of databases, with a minimum of 1. In the worst case (assuming only 1 database per month), that is possibly 24 databases (i.e. 2014-01 to 2015-12) that the report would have to search for the last usage of an organization. Does it make sense to look through all of those databases if the organization hasn't had usage in the past 3 years, for instance?
- In terms of the report, does it make sense to return the latest record within the past year? For example, say a user queried the monthly or yearly usage for an organization in 2015-10, but the last time the organization had any usage was 2014-04. Does it make sense to return 2014-04 to the user, or would it be better to inform the user that there is no usage within the specific time range?

I guess in a way, this is asking what a report should entail that would make the information useful to the user.

Thanks.
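To make the worst case above concrete, here is a minimal sketch of which monthly partitions a query from a given timestamp back to the beginning of the previous year would touch. It assumes exactly one database per month, identified by a `YYYY-MM` key; the function name `monthly_partitions` is hypothetical and this is not code from abacus itself.

```python
from datetime import datetime


def monthly_partitions(report_time: datetime) -> list[str]:
    """Return month keys (YYYY-MM), newest first, from the month of
    report_time back to January of the previous year.

    Assumption: one database per month, named by its YYYY-MM key.
    """
    year, month = report_time.year, report_time.month
    first_year = year - 1  # the query ends at the start of the previous year
    partitions = []
    while (year, month) >= (first_year, 1):
        partitions.append(f"{year:04d}-{month:02d}")
        month -= 1
        if month == 0:  # roll over from January to the previous December
            year, month = year - 1, 12
    return partitions


# A report requested in December 2015 spans 2015-12 back to 2014-01.
december = monthly_partitions(datetime(2015, 12, 31))
print(len(december), december[0], december[-1])
```

Under that assumption a December timestamp touches 24 partitions, while a timestamp in 2015-10 touches 22.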
Jean-Sebastien Delfino
Hi Ben,
> Choosing to query up to the beginning of the previous year was a choice based upon allowing the user to know usage up to the previous year for any given timestamp.

If I wanted my usage for last year, I'd just get a report with a 12/31/2014 time (or 1/2015 to see any delayed usage left over from 2014, but that part probably deserves a different discussion thread...). So I don't think the query needs to automatically go that far back.

> abacus partitions by period in terms of month for its databases

Correct, and that's really useful to manage their growth and archival and to accommodate schema changes over time.

> Does it make sense to have to look through all of those databases if the organization hasn't had usage in the past 3 years for instance?

I don't think so, if we clarify what you get out of each report; more on that below.

> In terms of report, does it make sense to return the latest record within the past year? For example, say a user queried the monthly or yearly usage for an organization in 2015-10, but the last time the organization had any usage was 2014-04. Does it make sense to return 2014-04 to the user or would it better to inform the user that there is no usage within the specific time range?

The typical use case is to get your usage for the month. So I'd suggest keeping this really simple for now:

- you have usage in 10/2015, you get a report;
- you don't have usage in 10/2015, you get a 'sorry no usage, nothing to report';
- you don't have usage in 10/2015 and you still wanted to see your yearly usage, ask for your 09/2015 report; still nothing for 09? Ask for 08...

With that approach, we avoid confusing the caller with some magic (magically returning the 09/2015 report when the request was for 10/2015, just in case you'd want to see your yearly usage, could be pretty confusing IMO).

> I guess in a way, this is asking what a report should entail that would make the information useful to the user.
With the addition of more reporting time windows (you can get your usage for the month, day, hour, etc.) we can probably imagine many different types of reporting queries (I want my usage for this hour, accumulated today until this hour, yearly until now...). I'd suggest getting more concrete input from our users on the types of queries they're most interested in before attempting to automate all these combinations in the reporting service.

Until then, would the proposal I've described above work?

- Jean-Sebastien

On Wed, Oct 7, 2015 at 4:20 PM, Benjamin Cheng <bscheng(a)us.ibm.com> wrote:
> In abacus, a user can get a report for an organization while passing a
Benjamin Cheng
Yes, I think your proposal makes sense. I prefer that approach to what I detailed above, which retrieves everything within a potential 2-year timeframe to serve purposes the user most likely did not have in mind when making the query.
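The proposal agreed on here could be sketched roughly as follows: look only at the partition for the requested month, and let the caller step back a month explicitly if they want older usage. The in-memory `UsageStore` mapping and the names `get_report`, `month_key` and `previous_month` are hypothetical stand-ins for illustration, not the abacus reporting API.

```python
from datetime import datetime
from typing import Optional

# Hypothetical stand-in for the monthly partitioned databases:
# {month key "YYYY-MM": {organization id: aggregated usage document}}
UsageStore = dict[str, dict[str, dict]]


def month_key(t: datetime) -> str:
    """Key of the monthly partition that contains timestamp t."""
    return f"{t.year:04d}-{t.month:02d}"


def get_report(store: UsageStore, org_id: str, t: datetime) -> Optional[dict]:
    """Return usage for org_id in the month of t, or None if there is none.

    No fallback to earlier months: the caller decides whether to ask again.
    """
    return store.get(month_key(t), {}).get(org_id)


def previous_month(t: datetime) -> datetime:
    """First day of the month before t, for a caller stepping back by hand."""
    if t.month == 1:
        return datetime(t.year - 1, 12, 1)
    return datetime(t.year, t.month - 1, 1)


store: UsageStore = {"2014-04": {"org-1": {"usage": 5}}}
requested = datetime(2015, 10, 7)
report = get_report(store, "org-1", requested)
print(report if report is not None else "sorry no usage, nothing to report")
```

A caller who still wants older usage would call `get_report` again with `previous_month(requested)`, which keeps each request to a single monthly partition.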
Jean-Sebastien Delfino
OK great. I've been looking into our database partitioning to fix issue #69 [1] (related to this discussion as well) earlier this week, so I'll go ahead and make that simple change then.

[1] https://github.com/cloudfoundry-incubator/cf-abacus/issues/69

HTH

- Jean-Sebastien

On Fri, Oct 9, 2015 at 4:59 PM, Benjamin Cheng <bscheng(a)us.ibm.com> wrote:
> Yes, I think your proposal makes sense. I would prefer that approach