Performance and Scalability


In my A Word On Scalability posting I tried to write down a more precise definition of scalability than is commonly used. There were good comments about the definition at the posting as well as in a discussion at TheServerSide.

To recap in a less precise manner, I said:

  • A service is said to be scalable if, when we increase the resources in a system, it results in increased performance in a manner proportional to the resources added (a rough numeric check of this proportionality is sketched after this list).
  • An always-on service is said to be scalable if adding resources to facilitate redundancy does not result in a loss of performance.
  • A scalable service needs to be able to handle heterogeneity of resources.
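
A rough way to make the first definition concrete is to compare the measured speedup against the growth in resources. This is a minimal sketch under assumed numbers; the function name and the measurements are illustrative, not from the post.

```python
# A minimal sketch of checking the proportionality in the first definition
# above. All names and numbers here are hypothetical illustrations.

def scaling_efficiency(perf_before, perf_after, nodes_before, nodes_after):
    """Ratio of observed speedup to the increase in resources.

    1.0 means perfectly linear scaling; values well below 1.0 suggest
    the service is not scalable in the sense defined above.
    """
    speedup = perf_after / perf_before
    resource_growth = nodes_after / nodes_before
    return speedup / resource_growth

# Hypothetical measurements: going from 4 to 8 nodes raised throughput
# from 1000 to 1800 requests/s.
print(f"efficiency: {scaling_efficiency(1000, 1800, 4, 8):.2f}")  # 0.90
```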

There were quite a few comments about the use of performance in the definition. This is how I reason about performance in this context: I am assuming that each service has an SLA contract that defines what the expectations of your clients/customers are (SLA = Service Level Agreement). What exactly is in that SLA depends on the kind of service business you are in; quite a few of the services that contribute to an Amazon.com website have an SLA that is latency driven. This latency will have a certain distribution and you pick a number of points on the distribution as representatives for measuring your SLA. For example, at Amazon we also track the latency at the 99.9% mark to make sure all of our customers are getting an experience at SLA or better.
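
As an illustration of picking points on the latency distribution, the sketch below computes a few percentiles from a sample of request latencies and checks them against a threshold. The 200 ms threshold, the sample values, and the nearest-rank method are hypothetical choices for this sketch, not numbers from Amazon's actual SLAs.

```python
# A minimal sketch of measuring SLA points on a latency distribution.
# Sample data and the 200 ms threshold are hypothetical.

def percentile(samples, p):
    """p-th percentile (0-100) of samples, using the nearest-rank method."""
    ordered = sorted(samples)
    rank = int(round(p / 100.0 * len(ordered))) - 1
    return ordered[max(0, min(len(ordered) - 1, rank))]

latencies_ms = [12, 15, 11, 250, 14, 13, 16, 12, 11, 300]

for p in (50.0, 99.0, 99.9):
    print(f"p{p:g}: {percentile(latencies_ms, p)} ms")

# Hypothetical SLA: the 99.9th percentile must stay under 200 ms.
print("SLA met:", percentile(latencies_ms, 99.9) <= 200)
```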

This SLA needs to be maintained as you grow your business. Growing can mean increasing the number of requests, increasing the number of items you serve, increasing the amount of work you do for each request, etc. But no matter along which axis you grow, you will need to make sure you can always meet your SLA. Growth along some axes can be served by scaling up to faster CPUs and larger memories, but if you keep growing there is an end to what you can buy and you will need to scale out. Given that scaling up is often not cost effective, you might as well start by working on scaling out, as you will have to go down that path eventually.

I have not seen many SLAs that are purely throughput driven. It is often a combination of the amount of work that needs to be done, the distribution in which it will arrive, and when that work needs to be finished that leads to a throughput-driven SLA. Latency does play a role here, as it is often a driver for what throughput is necessary to achieve the output distribution. If you have a request arrival distribution that is non-uniform, you can play various games with buffering and capping the throughput at lower than your peak load, as long as you are willing to accept longer latencies. Often it is the latency distribution that you try to achieve that drives your throughput requirements.
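
To make the buffering idea concrete, here is a toy discrete-time simulation: arrivals are bursty, the server is capped below the peak arrival rate, and a queue absorbs the burst at the cost of longer waits. All rates and tick counts are hypothetical.

```python
# Toy simulation of capping throughput below peak load and buffering the
# excess. Arrival rates and capacity are hypothetical illustrations.

from collections import deque

arrivals_per_tick = [50, 50, 50, 5, 5, 5, 5, 5, 5, 5]  # bursty: peak 50/tick
capacity_per_tick = 20                                  # capped below peak

queue = deque()
waits = []
for t, arriving in enumerate(arrivals_per_tick):
    queue.extend([t] * arriving)                # record each arrival time
    for _ in range(min(capacity_per_tick, len(queue))):
        waits.append(t - queue.popleft())       # ticks spent in the buffer

print(f"served {len(waits)} requests, max wait {max(waits)} ticks, "
      f"mean wait {sum(waits) / len(waits):.1f} ticks")
```

With a capacity of 20 the system never keeps up with the 50/tick burst in real time, yet every request is eventually served; the price is paid in waiting time, which is exactly the latency-for-throughput trade described above.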

There were some other points made with respect to what should be part of a scalability definition, among others by Gideon Low in the TheServerSide thread (I tried to link to his individual response but seem to fail), who makes some good points:

  • Operationally efficient – It takes fewer human resources to manage the system as the number of hardware resources scales up.
  • Resilient – Increasing the number of resources will also increase the probability that one of those resources fails, but the impact of such a failure should be reduced as the number of resources grows (see the sketch after this list).
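
The resilience point lends itself to a quick back-of-the-envelope calculation: with more nodes, the chance that something fails goes up, while the share of capacity lost per failure goes down. The sketch below assumes independent failures with a fixed per-node probability; both the probability and the node counts are hypothetical.

```python
# Rough illustration of the resilience trade-off, assuming independent
# node failures. The probability p and the node counts are hypothetical.

p = 0.05  # assumed yearly failure probability of a single node

for n in (1, 10, 100):
    p_any = 1 - (1 - p) ** n   # chance that at least one node fails
    blast_radius = 1 / n       # fraction of capacity lost per failure
    print(f"n={n:3d}: P(some failure)={p_any:.2f}, "
          f"lost per failure={blast_radius:.0%}")
```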

These two points combined with a discussion about cost/capacity/efficiency should be part of a definition of a scalable service. I’ll be thinking a bit about what the right wording should be and will post a proposal later.
