Re: [abacus] Configuring Abacus applications


Jean-Sebastien Delfino
 

Hi Piotr,

The answers to your questions really depend on the performance of the
environment and database you're integrating Abacus with, but let me try to
give you pointers to some of the factors you'll want to watch in your
performance tuning. Sorry for such a long email, but you had several
questions bundled in there and none of them has a simple yes/no answer.

- is there a recommended minimal number of instances of Abacus
applications

I recommend 2 of each as a minimum for availability if instances crash or
if you need to restart them individually.

- how would above depend on expected number of submissions or documents
to be processed

This really depends on the performance of your deployment environment and
database cluster. More instances will allow you to process more docs
faster, scaling linearly up to the load that your database can take.

- is there a dependency between number of instances of applications i.e.
do they have to match

You should be able to tune each application with a different number of
instances (see note *** below for additional info).

Here are some of the key factors to consider for tuning:

Collector service
- stateless, receives batches of submitted usage over HTTP, does 1 db write
per batch, 1 db write per usage doc;
- increase to provide better response time to resource providers as they
submit usage.

Metering service
- stateless, receives individual submitted usage docs from collector, does
2 db writes per usage doc;
- you can probably size it the same as, or a bit larger than, the collector
app, as it's processing more (individual) docs than the submitted batches.

Accumulator service
- stateful as it accumulates usage per resource instance, does 2 db writes
per usage doc, 1 read per approx 100 usage docs;
- serializes updates to the accumulated usage per resource instance, so
increase if your individual resource instances are getting a lot of usage;
- resource instances are distributed to db partitions, one partition per
instance, and that instance is the only reader/writer from/to that
partition;
- I've seen the performance of the accumulator scale linearly from 1 to 16
instances, but I recommend testing its performance in your environment.

Aggregator service
- stateful as it aggregates usage per organization, does 2 db writes per
usage doc, 1 read per approx 100 usage docs;
- same performance characteristics and observations as for the accumulator,
except that the write serialization is on an organization basis.

Rating service
- stateless, just adds rated usage to input aggregated usage, no
serialization here, 2 db writes per usage doc;
- since there's no serialization you may be OK with fewer instances than the
accumulator and aggregator;
- on the other hand you don't want 16 aggregators to overload 2 instances
of the rating service, so look for a middle ground.

Reporting
- stateless, one db read per report per org;
- scales like a regular Web app, gated by the query performance on your db;
- recommend 2 instances minimum for availability then increase as your
reporting load increases;
- delegates org lookups to your account info service so include the
performance of that service in your analysis as well.
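
To make the per-service numbers above easier to compare, here is a rough
back-of-the-envelope sketch in Python that adds up the db writes and reads
for a given volume of usage docs. The write/read counts are the ones listed
above; the function name, the dict layout and the example volumes are just
assumptions for illustration, not anything Abacus provides.

```python
# Per-doc db writes for each pipeline stage, as listed above.
WRITES_PER_DOC = {
    "collector": 1,    # plus 1 write per submitted batch, added below
    "metering": 2,
    "accumulator": 2,
    "aggregator": 2,
    "rating": 2,
}

# Accumulator and aggregator each do roughly 1 read per 100 usage docs.
DOCS_PER_READ = 100


def estimate_db_ops(usage_docs, docs_per_batch):
    """Return (writes, reads) for pushing usage_docs through the pipeline."""
    batches = -(-usage_docs // docs_per_batch)  # ceiling division
    writes = batches + usage_docs * sum(WRITES_PER_DOC.values())
    reads = 2 * (usage_docs // DOCS_PER_READ)
    return writes, reads


# Example volumes (made up): 10,000 usage docs submitted in batches of 10.
writes, reads = estimate_db_ops(usage_docs=10_000, docs_per_batch=10)
print(writes, reads)
```

Reporting is left out on purpose, as its db reads depend on how many
reports you request rather than on how many usage docs you submit.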

- what is the default and recommended number of DB partitions and how can
they be configured (time based as well as key based)

Time-based
- one per month, as most db writes and reads target the current month, and
sometimes the previous month;
- with that, monthly dbs can be archived once they're not needed anymore.
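
As an illustration of the one-partition-per-month idea, here is a minimal
Python sketch that maps a usage timestamp to a monthly bucket. The function
names and the "<prefix>-<YYYYMM>" naming scheme are assumptions made up for
this example; the actual naming used by Abacus may well differ.

```python
from datetime import datetime, timezone


def monthly_partition(timestamp_ms):
    """Map a usage timestamp (ms since epoch, UTC) to a YYYYMM bucket."""
    t = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return f"{t.year:04d}{t.month:02d}"


def monthly_db_name(prefix, timestamp_ms):
    """Build a db name for the month that the timestamp falls in.

    Hypothetical naming scheme: <prefix>-<YYYYMM>.
    """
    return f"{prefix}-{monthly_partition(timestamp_ms)}"


# Usage: all docs from the same month land in the same db, so an old
# month can be archived as a whole once it's no longer read or written.
ts = int(datetime(2015, 10, 21, tzinfo=timezone.utc).timestamp() * 1000)
print(monthly_db_name("example-usage", ts))
```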

Key based
- depends on how many resource instances and organizations you have and the
performance of your database as its volume increases;
- for the accumulator and aggregator services, you need one db partition
per app instance, reserved to that instance.
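
To show what "one db partition per app instance" means in practice, here
is a small Python sketch that maps a key (a resource instance id for the
accumulator, an organization id for the aggregator) to a partition index
with a stable hash. The hash-and-modulo scheme and the function name are
assumptions for illustration only; this is not a description of the
partitioning function Abacus actually uses.

```python
import hashlib


def key_partition(key, partitions):
    """Map a key to a partition index in the range [0, partitions).

    Uses a stable hash so that the same key always maps to the same
    partition, regardless of which process computes it.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % partitions


# With one partition per app instance, the partition index also tells
# you which app instance owns the key. The org ids here are made up.
for org in ["org-a", "org-b", "org-c"]:
    print(org, key_partition(org, partitions=4))
```

Because each key always lands on the same partition, all updates for a
given resource instance or organization go through a single app instance,
which is what allows that instance to serialize them.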

- how would above depend on expected number of documents
Same as your 2nd question, if I understood it correctly.

[***] While researching this I found that although you can configure each
app with a different number of instances, it's not very convenient to do
right now as we're currently using a single environment variable to
configure the number of db partitions a service uses and the number of
instances configured for the next service in the Abacus processing
pipeline. I'll open a GitHub issue to change that and use different env
variables to configure these two different aspects, as that'll make it
easier for you to use different numbers of db partitions and instances in
the accumulator and the aggregator services for example.

HTH


- Jean-Sebastien

On Wed, Oct 21, 2015 at 9:04 AM, Piotr Przybylski <piotrp(a)us.ibm.com> wrote:

Hi,
couple of questions about configuring Abacus, specifically the recommended
settings and how to configure them

- is there a recommended minimal number of instances of Abacus applications
- how would above depend on expected number of submissions or documents to
be processed
- is there a dependency between number of instances of applications i.e.
do they have to match
- what is the default and recommended number of DB partitions and how can
they be configured (time based as well as key based)
- how would above depend on expected number of documents

Thank you,

Piotr

