The answers to your questions really depend on the performance of the
environment and database you're integrating Abacus with, but let me try to
give you pointers to some of the factors you'll want to watch in your
performance tuning. Sorry for such a long email, but you had several
questions bundled in there and the answers are not just yes/no answers.

- is there a recommended minimal number of instances of Abacus applications

I recommend 2 of each as a minimum for availability if instances crash or
if you need to restart them individually.

- how would above depend on expected number of submissions or documents to be processed

This really depends on the performance of your deployment environment and
database cluster. More instances will allow you to process more docs
faster, scaling linearly up to the load that your database can take.
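
To make that scaling statement a bit more concrete, here is a rough
back-of-the-envelope model in Python. The per-instance rate and the database
ceiling below are made-up numbers for illustration, not measurements, so
replace them with what you observe in your own environment.

```python
def estimated_throughput(instances, docs_per_sec_per_instance, db_max_docs_per_sec):
    """Docs/sec for one app: linear in instances until the database saturates."""
    return min(instances * docs_per_sec_per_instance, db_max_docs_per_sec)


# Hypothetical numbers: 50 docs/sec per instance, db tops out at 500 docs/sec.
for n in (2, 4, 8, 16):
    print(n, estimated_throughput(n, 50, 500))
```

Past the point where the database saturates, adding instances buys you
nothing, which is why the database is the first thing to measure.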

- is there a dependency between number of instances of applications i.e. do they have to match

You should be able to tune each application with a different number of
instances (see note *** below for additional info).
Here are some of the key factors to consider for tuning:

Collector service:
- stateless, receives batches of submitted usage over HTTP, does 1 db write
per batch, 1 db write per usage doc;
- increase to provide better response time to resource providers as they
submit usage.

Meter service:
- stateless, receives individual submitted usage docs from collector, does
2 db writes per usage doc;
- you can probably give it the same number of instances as the collector
app, or a few more, since it processes individual docs rather than
submitted batches.

Accumulator service:
- stateful as it accumulates usage per resource instance, does 2 db writes
per usage doc, 1 read per approx 100 usage docs;
- serializes updates to the accumulated usage per resource instance, so
increase if your individual resource instances are getting a lot of usage;
- resource instances are distributed to db partitions, one partition per
app instance, and that app instance is the only reader/writer from/to that
partition;
- I've seen the performance of the accumulator scale linearly from 1 to 16
instances, but I recommend testing its performance in your environment.

Aggregator service:
- stateful as it aggregates usage per organization, does 2 db writes per
usage doc, 1 read per approx 100 usage docs;
- same performance characteristics and observations as for the accumulator,
except that the write serialization is on an organization basis.

Rating service:
- stateless, just adds rated usage to input aggregated usage, no
serialization here, 2 db writes per usage doc;
- since there's no serialization you may be OK with fewer instances than
the accumulator and aggregator;
- on the other hand you don't want 16 aggregators to overload 2 instances
of the rating service, so look for a middle ground.

Reporting service:
- stateless, one db read per report per org;
- scales like a regular Web app, gated by the query performance on your db;
- recommend 2 instances minimum for availability then increase as your
reporting load increases;
- delegates org lookups to your account info service so include the
performance of that service in your analysis as well.
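
As a rough way to turn the per-app numbers above into a database load
estimate, here is a small Python sketch. It only counts the writes and reads
listed in the notes above for the processing pipeline (reporting reads are
left out), the key "meter" stands for the app that receives individual docs
from the collector, and the batch size is a hypothetical value you would
replace with your own.

```python
import math

# Per-doc db writes for each pipeline stage, taken from the notes above.
WRITES_PER_DOC = {
    "collector": 1,    # plus 1 write per submitted batch, added below
    "meter": 2,        # app fed by the collector
    "accumulator": 2,
    "aggregator": 2,
    "rating": 2,
}
# The accumulator and aggregator each do about 1 read per 100 usage docs.
DOCS_PER_READ = 100


def estimate_db_ops(docs, batch_size):
    """Return (writes, reads) to push `docs` usage docs through the pipeline."""
    batches = math.ceil(docs / batch_size)
    writes = batches + docs * sum(WRITES_PER_DOC.values())
    reads = 2 * math.ceil(docs / DOCS_PER_READ)
    return writes, reads


# Hypothetical workload: 100,000 usage docs submitted in batches of 10.
writes, reads = estimate_db_ops(docs=100_000, batch_size=10)
print(writes, reads)
```

The load is write-dominated, so the write throughput of your database
cluster is the number to compare this estimate against.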

- what is the default and recommended number of DB partitions and how can they be configured (time based as well as key based)

Time-based partitioning:
- one per month, as most db writes and reads target the current month, and
sometimes the previous month;
- with that, monthly dbs can be archived once they're not needed anymore.

Key-based partitioning:
- depends on how many resource instances and organizations you have and the
performance of your database as its volume increases;
- for the accumulator and aggregator services, you need one db partition
per app instance, reserved to that instance.
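
To illustrate how the two kinds of partitioning combine, here is a minimal
Python sketch. The database naming scheme, the CRC32 hash and the example
ids are my own assumptions for illustration and not necessarily what Abacus
does internally. The point is only that a usage doc lands in a monthly
database, and within that month in the key partition owned by exactly one
app instance.

```python
import zlib
from datetime import date


def monthly_db_name(prefix, day):
    """Time-based partitioning: one database per calendar month."""
    return f"{prefix}-{day.year:04d}{day.month:02d}"


def key_partition(key, partitions):
    """Key-based partitioning: map a key to a stable partition index.

    CRC32 is used only because it is deterministic across runs; it is an
    assumption for this sketch, not the actual Abacus hashing scheme.
    """
    return zlib.crc32(key.encode("utf-8")) % partitions


# Hypothetical setup: 4 accumulator instances, so 4 key partitions, with
# app instance i being the only reader/writer of partition i.
partitions = 4
p = key_partition("resource-instance-123", partitions)
print(monthly_db_name("accumulated-usage", date(2015, 10, 21)), "partition", p)
```

For the aggregator the key would be the organization id instead of the
resource instance id.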

- how would above depend on expected number of documents

Same as your 2nd question, if I understood it correctly.

[***] While researching this I found that, although you can configure each
app with a different number of instances, it's not very convenient to do
right now. We currently use a single environment variable to configure both
the number of db partitions a service uses and the number of instances
configured for the next service in the Abacus processing pipeline. I'll
open a GitHub issue to change that and use separate env variables for these
two aspects. That will make it easier for you to use different numbers of
db partitions and instances in the accumulator and the aggregator services,
for example.

On Wed, Oct 21, 2015 at 9:04 AM, Piotr Przybylski <piotrp(a)us.ibm.com> wrote: