Billing & Metering of app usage with Abacus


Hristo Iliev
 

We want to integrate the metrics provided by Cloud Foundry with Abacus
<https://github.com/cloudfoundry-incubator/cf-abacus>.

We plan to create a billing/metering integration layer that:

- fetches the app usage events
<http://apidocs.cloudfoundry.org/214/app_usage_events/list_all_app_usage_events.html>
from CC
- inserts runtime usage
<https://github.com/cloudfoundry-incubator/cf-abacus/blob/master/doc/api.md#runtime-usage>
data to Abacus

The events have to be processed to build usage data from them. This is
much like the idea outlined in Dr. Nic's blog
<https://blog.starkandwayne.com/2015/01/22/billing-your-cloud-foundry-users/>.
The integration layer can do this periodically.
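
One polling cycle of such an integration layer could be sketched as below. This is only an illustration under stated assumptions: `run_cycle`, the three injected callables, and the flat event dictionaries with a `guid` key are hypothetical names, not part of the CC or Abacus APIs.

```python
from typing import Callable, Dict, List, Optional

Event = Dict[str, object]   # hypothetical flat event shape with a "guid" key
Usage = Dict[str, object]   # hypothetical usage document shape


def run_cycle(
    fetch_events: Callable[[Optional[str]], List[Event]],
    build_usage: Callable[[List[Event]], List[Usage]],
    submit_usage: Callable[[Usage], None],
    last_guid: Optional[str],
) -> Optional[str]:
    """Run one poll: fetch new events, turn them into usage, submit it.

    Returns the guid of the last processed event so that the next cycle
    can ask only for events after it.
    """
    events = fetch_events(last_guid)
    if not events:
        return last_guid  # nothing new, keep the old position
    for usage in build_usage(events):
        submit_usage(usage)
    return str(events[-1]["guid"])
```

Injecting the fetch and submit steps keeps the cycle testable without a running CC or Abacus.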

AFAIK Abacus provides usage reports
<https://github.com/cloudfoundry-incubator/cf-abacus/blob/master/doc/api.md#usage-report>
for the current month only and not for an arbitrary period of time. This
restricts what we can report when the application is:

1. started and stopped several times in the month
2. started in the current month but not stopped at all
3. started in a previous month and not stopped

The first problem can be solved by iterating through the events and finding
the respective start and end timestamps that have to be reported to Abacus.
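
A minimal sketch of that pairing, assuming the events arrive in order and are reduced to hypothetical `(timestamp, state)` tuples with the states `STARTED` and `STOPPED`:

```python
from typing import Iterable, List, Optional, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, state); simplified, hypothetical shape


def usage_intervals(events: Iterable[Event]) -> Tuple[List[Tuple[int, int]], Optional[int]]:
    """Pair each STARTED event with the next STOPPED event, in event order.

    Returns the closed (start, end) intervals and the start time of an
    interval that is still open (app still running), or None.
    """
    intervals: List[Tuple[int, int]] = []
    open_start: Optional[int] = None
    for timestamp, state in events:
        if state == "STARTED" and open_start is None:
            open_start = timestamp
        elif state == "STOPPED" and open_start is not None:
            intervals.append((open_start, timestamp))
            open_start = None
    return intervals, open_start
```

The open start time returned at the end is what cases 2 and 3 have to deal with.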

The second issue might be solved by reporting small amounts of usage,
starting from the last start event and continuing to report on every poll of
the integration layer. For example, we can report several usages:

- start: 1438945112; end: 1438946000 (current time for the billing
integration)
- start: 1438946000 (previous reporting cycle); end: 1438947000
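
The slicing in this example can be sketched as follows. `next_slice` and its parameters are hypothetical names; the timestamps are the ones from the example above.

```python
from typing import Optional, Tuple


def next_slice(last_start: int, last_reported_end: Optional[int], now: int) -> Optional[Tuple[int, int]]:
    """Return the next (start, end) slice to report for a running app.

    The first slice begins at the last start event; every later slice
    begins where the previous report ended.
    """
    begin = last_start if last_reported_end is None else last_reported_end
    if now <= begin:
        return None  # nothing new to report yet
    return (begin, now)


# First poll after the start event, then the following poll.
first = next_slice(1438945112, None, 1438946000)
second = next_slice(1438945112, first[1], 1438947000)
```

Here `first` is `(1438945112, 1438946000)` and `second` is `(1438946000, 1438947000)`, matching the two usages listed above.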

The third issue might be solved by finding the last start event and
reporting in the same manner as with #2.

Reporting usage in small steps might require persistence so we can store
the end time of the previous report. We might use an in-memory cache and
reach out to Abacus as the primary storage. If Abacus can accumulate usage
reports we can even get rid of the persistence and cache.
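
The cache-in-front-of-primary-storage idea might look like the sketch below. `ReportCheckpoints` and the `load_from_primary` callable are hypothetical; the callable stands in for whatever lookup the primary storage would offer and does not correspond to a known Abacus call.

```python
from typing import Callable, Dict, Optional


class ReportCheckpoints:
    """Remember the end time of the last report per app.

    Reads hit the in-memory cache first and fall back to the primary
    storage only on a miss.
    """

    def __init__(self, load_from_primary: Callable[[str], Optional[int]]) -> None:
        self._cache: Dict[str, int] = {}
        self._load = load_from_primary

    def last_end(self, app_guid: str) -> Optional[int]:
        if app_guid in self._cache:
            return self._cache[app_guid]
        value = self._load(app_guid)  # cache miss: ask the primary storage
        if value is not None:
            self._cache[app_guid] = value
        return value

    def record(self, app_guid: str, end: int) -> None:
        """Store the end time of a report that was just submitted."""
        self._cache[app_guid] = end
```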

Is such an integration in the scope of the Abacus project?

Regards,
Hristo Iliev


Piotr Przybylski <piotrp@...>
 

Hi, I am also looking at runtime submissions for Abacus and worked on this
for Bluemix. A couple of points for discussion:

In addition to plain usage events (START followed by STOP), scaling and
duplicate events need to be handled. The former is a START followed by a
START with a changed memory or instance count; the latter can be a STOP
followed by a STOP.
We also encountered a situation where the ordering is correct (START
followed by STOP) but the timestamp of the START is later than that of the
STOP.
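
One way to handle these three cases while walking the events in order is sketched below. The `AppEvent` and `Usage` shapes are hypothetical, and clamping a misordered STOP to the START time (a zero-length usage) is an assumption, not something the thread settles.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class AppEvent:
    timestamp: int
    state: str          # "STARTED" or "STOPPED"
    memory_mb: int = 0
    instances: int = 0


@dataclass(frozen=True)
class Usage:
    start: int
    end: int
    memory_mb: int
    instances: int


def _close(started: AppEvent, end_ts: int) -> Usage:
    # Trust the event order over the timestamps: never end before the start.
    end = max(started.timestamp, end_ts)
    return Usage(started.timestamp, end, started.memory_mb, started.instances)


def build_usage(events: Iterable[AppEvent]) -> List[Usage]:
    usages: List[Usage] = []
    current: Optional[AppEvent] = None
    for event in events:
        if event.state == "STARTED":
            if current is not None:
                same = (event.memory_mb, event.instances) == (current.memory_mb, current.instances)
                if same:
                    continue  # duplicate START, nothing changed
                usages.append(_close(current, event.timestamp))  # scaling
            current = event
        elif event.state == "STOPPED":
            if current is None:
                continue  # duplicate STOP
            usages.append(_close(current, event.timestamp))
            current = None
    return usages
```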

For running applications, your points #2 and #3 are workable, though they
may generate a fair amount of traffic, depending on the polling frequency
and the number of running applications. Eventually we may want to look at
alternatives, for example enhancing metering to allow time-based
submissions. Instead of continually submitting usage, we would submit the
state of the application (STARTED, memory, instances); metering could then
calculate usage for that application based on the time passed until the
application is stopped.
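
The calculation such a time-based submission implies can be sketched as below. The GB-hour unit and the function name are assumptions made for illustration; the thread does not fix a unit.

```python
def gb_hours(memory_mb: int, instances: int, started_at: int, stopped_at: int) -> float:
    """Usage of one app state, in GB-hours (assumed unit).

    memory per instance (GB) * number of instances * elapsed time (hours)
    """
    elapsed_hours = max(0, stopped_at - started_at) / 3600
    return (memory_mb / 1024) * instances * elapsed_hours
```

For example, two instances of 512 MB running for 7200 seconds come to 0.5 GB * 2 * 2 h = 2.0 GB-hours.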

I think handling some of the above scenarios requires persistence, even if
only to log the CF events that were used for submission (or skipped). You
may want to persist state to recover from an application failure or
restart, as well as to keep track of running/active applications.
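
A very small form of such persistence is a state file that is rewritten after every cycle. The sketch below assumes a local JSON file and hypothetical keys (`last_event_guid`, `running_apps`); a real deployment would more likely use a database.

```python
import json
import os
import tempfile
from typing import Any, Dict


def save_state(path: str, state: Dict[str, Any]) -> None:
    """Write the state to a temp file, then rename it over the old one.

    The rename keeps the previous state intact if the process dies
    in the middle of writing.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_state(path: str) -> Dict[str, Any]:
    """Load the saved state, or start empty on the very first run."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {"last_event_guid": None, "running_apps": {}}
```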

Thanks,

Piotr


In reply to Hristo Iliev <hsiliev(a)gmail.com>, 08/07/2015 08:07:53 AM.


CF Runtime
 

Piotr,

The timestamps not being correct is a known limitation of how events are
being generated, but as you said, order should be guaranteed (and
timestamps should hopefully be close).

Duplicate events are something I'm not aware of though. In theory only one
API instance should be able to get a database lock on an app, and should
not release it until it is done updating and has recorded the app usage
event. Do you have any details on what caused duplicate events?

Joseph
OSS Release Integration Team

In reply to Piotr Przybylski <piotrp(a)us.ibm.com>, Wed, Aug 12, 2015 at 8:07 AM.


Piotr Przybylski <piotrp@...>
 

Joseph,
thank you. I pointed out the timestamps only as something that an
implementation must deal with, which is quite easy since the order is correct.

For the duplicate events, I cannot say what causes them. It used to be
simple to reproduce by (almost) simultaneously sending 'cf stop <app>' from
two terminal windows. This does not 'work' any more (CF 210); however, I
still see duplicate STARTED and STOPPED events logged: not too frequent, but
still there. Somewhat surprisingly, for the two I looked at, the time
difference is quite large: 2 minutes for STARTED and 15 minutes for STOPPED.
Is there a way to determine how that happened without access to the
application?

Piotr




In reply to CF Runtime <cfruntime(a)gmail.com>, 08/12/2015 03:45 PM, on
cf-dev(a)lists.cloudfoundry.org ("[cf-dev] Re: Re: Billing & Metering of app
usage with Abacus").


CF Runtime
 

You might get some answers by querying the events API.
http://apidocs.cloudfoundry.org/214/events/list_all_events.html

You should be able to query it where the actee equals the guid of the app.
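
Such a query could be built as sketched below. The `/v2/events` path and the `q=actee:<guid>` filter are taken from the linked API documentation as an assumption; `app_events_url`, the API base URL, and the guid are hypothetical. The request itself would still need an OAuth token from the target installation.

```python
from urllib.parse import urlencode


def app_events_url(api_base: str, app_guid: str) -> str:
    """Build the events query that filters on the app guid as actee."""
    query = urlencode({"q": f"actee:{app_guid}"})  # ":" is percent-encoded
    return f"{api_base.rstrip('/')}/v2/events?{query}"


url = app_events_url("https://api.example.com/", "abc-123")
```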

Joseph
OSS Release Integration Team

In reply to Piotr Przybylski <piotrp(a)us.ibm.com>, Wed, Aug 12, 2015 at 5:26 PM.