Service availability - Temporal Cloud

The operating envelope of Temporal Cloud includes availability, regions, throughput, latency, and limits. If you need more details, contact us.

Available regions

Where is Temporal Cloud available?

Developers and applications can access Temporal Cloud from any location with internet connectivity, irrespective of where the Temporal Cloud resources (Namespaces) are located.

Temporal Cloud is compatible with applications deployed in various cloud environments or data centers.

To minimize latency, we advise creating your Namespace in a region geographically close to your Workers' hosting location.

AWS

Temporal Cloud operates in several regions on Amazon Web Services (AWS):

Area             Code              Region
Asia Pacific     ap-northeast-1    Tokyo
Asia Pacific     ap-northeast-2    Seoul
Asia Pacific     ap-south-1        Mumbai
Asia Pacific     ap-south-2        Hyderabad
Asia Pacific     ap-southeast-1    Singapore
Asia Pacific     ap-southeast-2    Sydney
Europe           eu-central-1      Frankfurt
Europe           eu-west-1         Ireland
Europe           eu-west-2         London
North America    ca-central-1      Central Canada
North America    us-east-1         Northern Virginia
North America    us-east-2         Ohio
North America    us-west-2         Oregon
South America    sa-east-1         São Paulo

GCP

Temporal Cloud operates in two regions on Google Cloud (GCP):

Area             Code                    Region
North America    us-west1                Oregon
Asia Pacific     australia-southeast1    Sydney

Throughput expectations

What kind of throughput can I get with Temporal Cloud?

Each Namespace has a rate limit, which is measured in Actions per second (APS). A Namespace's default limit is set at 400 APS and automatically adjusts based on recent usage (over the prior 7 days). Your throughput limit will never fall below this default value.

When your Action rate exceeds your quota, Temporal Cloud throttles Actions until the rate matches your quota. Throttling means limiting the rate at which Actions are performed to prevent the Namespace from exceeding its APS limit.

Critical calls to external events, such as starting or Signaling a Workflow, are always prioritized and never throttled. There are four priority levels for Temporal Cloud API calls:

  1. External events
  2. Workflow progress updates
  3. Visibility API calls
  4. Cloud operations such as Namespace creation

When you exceed your APS limits, you might receive warnings about throttling. However, requests are never dropped, and high-priority calls are never delayed. Workers might take longer to complete Workflows.
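The priority behavior described above can be illustrated with a minimal sketch. This is a simplified model, not Temporal Cloud's actual scheduler. The names `PRIORITY` and `drain` are hypothetical, and the model assumes that every call arrives at the same moment, that each call counts as one Action, and that external events are admitted immediately while still consuming the first second's budget.

```python
from collections import deque

# Priority levels from the list above (1 is highest). Key names are illustrative.
PRIORITY = {
    "external_event": 1,
    "workflow_progress": 2,
    "visibility": 3,
    "cloud_operation": 4,
}

def drain(calls, aps_limit):
    """Group calls into per-second batches under an APS limit.

    External events are always admitted in the first second, even if
    they exceed the budget. Lower-priority calls wait in priority order
    and are delayed, never dropped.
    """
    if aps_limit < 1:
        raise ValueError("aps_limit must be at least 1")
    external = [c for c in calls if PRIORITY[c] == 1]
    # sorted() is stable, so arrival order is kept within a priority level.
    queued = deque(sorted((c for c in calls if PRIORITY[c] > 1),
                          key=lambda c: PRIORITY[c]))
    seconds = []
    batch = external
    budget = max(aps_limit - len(batch), 0)
    while True:
        while queued and budget > 0:
            batch.append(queued.popleft())
            budget -= 1
        seconds.append(batch)
        if not queued:
            return seconds
        # Start the next second with a fresh budget.
        batch, budget = [], aps_limit
```

With a limit of 2 APS and five simultaneous calls, two of them external events, the external events go through in the first second and the remaining three calls are spread over the following seconds in priority order.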

If your usage grows slowly, your throughput limit grows with your usage. At times, you may hit a maximum throughput threshold and need to switch to a higher consumption tier. Learn more about our tiers by visiting our information page or reach out to our team to help size your number of Actions. Temporal Cloud can provide more than 150,000 Actions per second at its highest tier.

MEASURING THROUGHPUT WITH APS AND RPS

APS and RPS are both measures of throughput, but apply to different aspects of Temporal.

APS, or Actions Per Second, is specific to Temporal Cloud. It measures the rate at which Actions, like starting or signaling a Workflow, can be performed in a specific Namespace. Temporal Cloud uses APS to manage and throttle Actions, preventing a Namespace from exceeding its limit. APS measures how many high-level operations (Actions) a user can perform in Temporal Cloud each second.

RPS, or Requests Per Second, is used in the Temporal Service, both in self-hosted Temporal and Temporal Cloud. It measures and controls the rate of gRPC requests to the Service. This is a lower-level measure that manages rates at the service level, such as the Frontend, History, or Matching Services.

In summary, APS is a higher-level measure to limit and mitigate Action spikes in Temporal Cloud. RPS is a lower-level measure to control and balance request rates at the service level.

Latency Service Level Objective (SLO)

What kind of latency can I expect from Temporal Cloud?

Temporal Cloud has a p99 latency SLO of 200ms per region.

In March 2024, latency over a week-long period for starting and signaling Workflow Executions was as follows:

Operation                           p90     p99
StartWorkflowExecution              24ms    54ms
SignalWorkflowExecution             14ms    40ms
SignalWithStartWorkflowExecution    24ms    61ms

As Temporal continues working on improving latencies, these numbers will progressively decrease.

The same SLO for normal Worker requests (commands and polling) applies to Nexus in both the caller and handler Namespaces.

Latency observed from the Temporal Client is influenced by other system components like the Codec Server, egress proxy, and the network itself. Also, concurrent operations on the same Workflow Execution may result in higher latency.
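To compare your own measurements with the figures above, you could compute p90 and p99 from latency samples collected at the Client. The following is a minimal sketch using the nearest-rank percentile method. The function names `percentile` and `summarize` are hypothetical, and the samples are assumed to be in milliseconds. Because Client-side samples include the Codec Server, proxy, and network time, exceeding 200ms here does not by itself indicate that the service missed its SLO.

```python
SLO_P99_MS = 200  # p99 latency SLO per region, as stated above

def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100.

    Returns the smallest sample such that at least pct percent of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]

def summarize(samples_ms):
    """Report p90, p99, and whether the observed p99 is within the SLO value."""
    p90 = percentile(samples_ms, 90)
    p99 = percentile(samples_ms, 99)
    return {"p90": p90, "p99": p99, "within_slo": p99 <= SLO_P99_MS}
```

For example, `summarize` applied to 98 samples of 10ms plus two outliers of 250ms and 300ms reports a p90 of 10 and a p99 of 250, so `within_slo` is `False` for that sample set.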