Making Scalable API Calls to a Salesforce server using a Static IP from a serverless environment in GCP.


Prerequisites:

  1. A basic understanding of cloud deployments.
  2. A Google Cloud Console project.
  3. Knowledge of Cloud Load Balancers.
  4. Knowledge of deploying infrastructure on GCP using Terraform.
  5. Knowledge of the types of Instance Groups in GCP.

This article illustrates how we set up GCP infrastructure that enabled our application, deployed on App Engine, to make API calls at scale to a server that required a Static IP for whitelisting.

In my previous article, I described how we set up a standard infrastructure to make API calls to a Salesforce server from a Static IP. In this article, I will explain how we overcame the limitations of that non-scalable deployment.


On August 8, 2020, President Trump signed an executive order creating a Lost Wages Assistance (LWA) program. [1] To receive the LWA benefit, users needed to complete a short certification process through our application, so we anticipated an increase in the number of users logging in. We estimated that traffic would increase from 1500 users/hour to about 6000 users/hour during peak business hours.

The drawback of setting up a single Compute Instance as a reverse proxy for the API calls is that it does not scale automatically when traffic on the application increases; someone has to add an instance manually to handle the extra load.

Our application was deployed on App Engine, which scales automatically. With the anticipated increase in users, however, the Compute Instance set up as an Nginx Reverse Proxy also had to scale automatically, and we needed a solution that made this possible.


Since we needed the Compute VM to scale automatically, we decided to use a Managed Instance Group (MIG), which dynamically adds or removes instances in response to increases or decreases in traffic. We also needed a Load Balancer to distribute the traffic evenly among the instances created by the MIG.

The only change from the previous standard architecture was replacing the single Compute VM with an auto-scalable solution. Everything else, from creating the VPC Network to Cloud NAT and Serverless VPC Access, remained the same. Hence, I will focus on how we set up the auto-scalable solution.


We needed three significant components to make scalable API requests successfully.

  1. A MIG where each server that was added would be set up as a reverse proxy that routed requests to the correct destination.
  2. An Autoscale rule that would instruct the MIG to add or remove instances.
  3. A Cloud Load Balancer that would distribute user traffic across multiple instances created by the MIG.

The final architecture would look like this:

The MIG, Load balancer, Cloud NAT and VPC connector resided in the VPC network we created in the us-central1 region.

We again made use of Terraform and Google Cloud Foundation Toolkit scripts to deploy our infrastructure.

Instantiating the Managed Instance Group:

An instance group is a collection of virtual machine (VM) instances that you can manage as a single entity.[2]

Google Cloud Platform offers two types of Instance groups:

  • Unmanaged Instance Group
  • Managed Instance Group

We decided to use a Managed Instance Group (MIG) over an Unmanaged Instance Group because of the following benefits:

  1. We didn’t need to manage our VMs ourselves because of the high availability feature offered by the MIG.
  2. It automatically scaled our Compute VMs based on the increase or decrease in traffic.
  3. MIGs work well with Load Balancers to distribute traffic across all the instances in the group.

The other benefits of using a MIG can be found here.

Below is the HCL in the file that creates the Managed Instance Group, utilising the Instance Template to configure the VMs.

We kept the machine types of our VMs the same as before: an e2-small (2 vCPUs, 2 GB memory) for our Dev environment and an e2-standard-2 (2 vCPUs, 8 GB memory) for our Production environment. We kept the same machine types purely for economic reasons: the certification process was available for only two weeks, after which traffic would return to normal. A larger instance would have cost more over the year, and the auto-scaling feature of the MIG would handle the temporary increase in traffic.

The parameter startup_script, in the module “instance_template”, contains the bash script that installs Nginx as a Reverse Proxy Server on the Compute VMs. A startup script is essential because every new VM added by the MIG must be configured as a Reverse Proxy Server before it is ready to receive traffic.
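As a rough illustration of the idea, here is a minimal sketch written with plain google provider resources rather than the Cloud Foundation Toolkit modules. It is not the exact configuration from our scripts: the resource names, the Debian image, the regional (rather than zonal) MIG, and the variables subnetwork and upstream_url are all assumptions made for the example.

```hcl
# Illustrative sketch only; every name and value here is a placeholder.
variable "subnetwork" {
  type        = string
  description = "Name of the subnet in the VPC network (hypothetical input)"
}

variable "upstream_url" {
  type        = string
  description = "URL the proxy forwards to, e.g. the Salesforce endpoint (hypothetical input)"
}

resource "google_compute_instance_template" "proxy" {
  name_prefix  = "nginx-proxy-"
  machine_type = "e2-small"
  region       = "us-central1"

  disk {
    source_image = "debian-cloud/debian-11" # assumed image
    boot         = true
    auto_delete  = true
  }

  network_interface {
    # No access_config block, so the VMs get no external IP. This assumes
    # outbound traffic is meant to leave through Cloud NAT.
    subnetwork = var.subnetwork
  }

  # Runs on every VM the MIG creates, so each new instance boots
  # already configured as a reverse proxy.
  metadata_startup_script = <<-EOT
    #!/bin/bash
    apt-get update
    apt-get install -y nginx
    cat > /etc/nginx/sites-available/default <<'CONF'
    server {
      listen 80;
      location / {
        proxy_pass ${var.upstream_url};
      }
    }
    CONF
    systemctl restart nginx
  EOT

  lifecycle {
    create_before_destroy = true
  }
}

resource "google_compute_region_instance_group_manager" "proxy" {
  name               = "nginx-proxy-mig"
  region             = "us-central1"
  base_instance_name = "nginx-proxy"

  # target_size is left unset so that an autoscaler can control the size.
  version {
    instance_template = google_compute_instance_template.proxy.id
  }
}
```

Using name_prefix together with create_before_destroy lets Terraform replace the template when the startup script changes without first deleting the template the MIG is still using.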

Creating an Auto-Scaling Rule:

We needed to create an auto-scaling rule for the MIG created above. This rule tells the MIG when to add a new Compute VM to the group or delete one from it, based on the traffic.

Below is the HCL in the file that creates this Auto-Scaling Rule:

The parameter target identifies the MIG that the autoscaler will scale.

The parameter autoscaling_policy defines the set of rules that will apply to the MIG when it scales. The autoscaler adds a new VM instance to the MIG when the average CPU utilisation across the group goes above 60%. We chose 60% because it takes about 2–3 minutes for a new VM instance to be provisioned and ready to receive traffic, and this threshold leaves enough headroom to cover that time.
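A minimal sketch of such a rule, assuming a regional MIG and the plain google_compute_region_autoscaler resource, could look like the following. Only the 60% CPU target comes from the text above; the names, the replica limits, the cooldown value, and the mig_id variable are assumptions for illustration.

```hcl
# Illustrative sketch only; names and limits are placeholders.
variable "mig_id" {
  type        = string
  description = "ID or self link of the regional MIG to scale (hypothetical input)"
}

resource "google_compute_region_autoscaler" "proxy" {
  name   = "nginx-proxy-autoscaler"
  region = "us-central1"
  target = var.mig_id # the MIG this autoscaler controls

  autoscaling_policy {
    min_replicas = 1 # assumed floor
    max_replicas = 5 # assumed ceiling

    # Seconds to wait before trusting metrics from a new VM; set to
    # roughly match the 2-3 minute provisioning time (assumed value).
    cooldown_period = 180

    cpu_utilization {
      target = 0.6 # scale out when average CPU utilisation exceeds 60%
    }
  }
}
```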

Configuring the Internal Load Balancer:

We decided to use an Internal TCP/UDP Load Balancer because it is a regional load balancer that enables one to run and scale one’s services behind an internal load balancing IP address that is accessible only to the internal virtual machine (VM) instances.

Internal TCP/UDP Load Balancing distributes traffic among VM instances in the same region in a Virtual Private Cloud (VPC) network by using an internal IP address.[3]

The four components required in setting up a TCP Internal Load Balancer are:[3]

  1. Internal IP Address: This is the address for the Internal Load Balancer that will be used by our application to make the API calls. Below is the HCL that creates an Internal IP Address to be attached to the Load Balancer.

This internal IP address should belong to the same subnet created in the VPC network where we instantiated the MIG.
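A minimal sketch of reserving such an address with the plain google_compute_address resource is shown here. The resource name and the subnetwork variable are assumptions for illustration, not values from our scripts.

```hcl
# Illustrative sketch only; names are placeholders.
variable "subnetwork" {
  type        = string
  description = "Self link or name of the subnet that hosts the MIG (hypothetical input)"
}

resource "google_compute_address" "ilb" {
  name         = "proxy-ilb-ip"
  region       = "us-central1"
  address_type = "INTERNAL"

  # Reserving the address in the MIG's subnet keeps the load balancer
  # frontend and its backends in the same network and region.
  subnetwork = var.subnetwork
}
```

Because no explicit address is given, GCP picks a free IP from the subnet's range; the reserved value is then available as the resource's address attribute.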

2. Internal Forwarding Rule: An internal forwarding rule, in combination with the internal IP address, forms the frontend of the load balancer. It defines the protocol and port(s) that the load balancer accepts, and it directs traffic to a regional internal backend service. The HCL to create a Forwarding rule is as follows:

The parameter ports specifies the ports on which the load balancer accepts traffic.

The parameter ip_address is where we attach the Internal IP address created above which will act as the frontend of the Load Balancer.
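For illustration, a forwarding rule along these lines might be written as follows with the plain google_compute_forwarding_rule resource. The names, port 80, and all four input variables are assumptions made for the example.

```hcl
# Illustrative sketch only; names and the port are placeholders.
variable "ilb_ip_address" {
  type        = string
  description = "Reserved internal IP address for the load balancer (hypothetical input)"
}

variable "backend_service_id" {
  type        = string
  description = "ID of the regional internal backend service (hypothetical input)"
}

variable "network" {
  type        = string
  description = "Self link or name of the VPC network (hypothetical input)"
}

variable "subnetwork" {
  type        = string
  description = "Self link or name of the subnet (hypothetical input)"
}

resource "google_compute_forwarding_rule" "ilb" {
  name                  = "proxy-ilb-forwarding-rule"
  region                = "us-central1"
  load_balancing_scheme = "INTERNAL"
  ip_protocol           = "TCP"

  ports      = ["80"]             # ports on which the load balancer accepts traffic
  ip_address = var.ilb_ip_address # frontend IP of the load balancer

  backend_service = var.backend_service_id
  network         = var.network
  subnetwork      = var.subnetwork
}
```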

3. Regional Internal Backend Service: The regional internal backend service defines the protocol used to communicate with the backends. It defines three backend parameters:

  • Protocol: either TCP or UDP.
  • Traffic distribution: the backend service distributes traffic across its backends according to a configurable session affinity.
  • Health Check: the health check associated with the backend service.

The above is how we define the Regional Backend Service for the Internal Load Balancer using HCL.
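A minimal sketch of such a backend service, using the plain google_compute_region_backend_service resource, might look like this. The names, the session affinity setting, and the two input variables are assumptions for illustration.

```hcl
# Illustrative sketch only; names are placeholders.
variable "instance_group" {
  type        = string
  description = "Instance group URL exposed by the MIG (hypothetical input)"
}

variable "health_check_id" {
  type        = string
  description = "ID of the health check to attach (hypothetical input)"
}

resource "google_compute_region_backend_service" "ilb" {
  name                  = "proxy-ilb-backend"
  region                = "us-central1"
  load_balancing_scheme = "INTERNAL"
  protocol              = "TCP"
  session_affinity      = "NONE" # assumed; controls how traffic is distributed
  health_checks         = [var.health_check_id]

  backend {
    group          = var.instance_group
    balancing_mode = "CONNECTION" # the mode used by internal TCP/UDP load balancing
  }
}
```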

4. Health Check: The health check defines the parameters under which GCP considers the backends that it manages to be eligible to receive traffic. Only healthy VMs in the backend instance groups receive traffic sent from the client to the IP address of the load balancer. The HCL to set up a health check is:
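As an illustration, a simple TCP health check could be sketched as follows with the plain google_compute_health_check resource. The name, port 80, and the interval and threshold values are assumptions rather than the values from our scripts.

```hcl
# Illustrative sketch only; all values are placeholders.
resource "google_compute_health_check" "proxy" {
  name                = "proxy-health-check"
  check_interval_sec  = 5 # probe every 5 seconds
  timeout_sec         = 5 # must not exceed check_interval_sec
  healthy_threshold   = 2 # consecutive successes before a VM is healthy
  unhealthy_threshold = 2 # consecutive failures before a VM is unhealthy

  tcp_health_check {
    port = 80 # assumes Nginx listens on port 80
  }
}
```

Google's health-check probes originate from the ranges 130.211.0.0/22 and 35.191.0.0/16, so the VPC firewall has to allow ingress from those ranges on the checked port; otherwise every backend is reported as unhealthy.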

After setting up all these components, our Internal Load balancer will be configured and ready to receive traffic from our Application deployed on App Engine.

These three components in conjunction enable us to make API calls to the Salesforce server at scale.


We received overwhelming traffic within the initial hours of the new certification process going live. We certified 19000 users within the first 2 hours, an average of 9500 users/hour, which was roughly 58% above our estimate of 6000 users/hour. The MIG, combined with the Load Balancer, did a great job and autoscaled to handle this surge.

At peak traffic, the MIG was running four Compute VMs as Reverse Proxy Servers, enabling users to complete the certification process.

About 180,000 users completed the certification process to receive the LWA benefits during the two weeks this process was active.

In this way, we could successfully auto-scale our infrastructure to make the SOAP API requests to the Salesforce server from our serverless application and overcome the drawback of the standard architecture.

The use of Terraform scripts allowed us to swiftly get this infrastructure up and running and also avoided a lot of rework in deploying the same infrastructure to different environments individually.

The link to all the HCL scripts and a step-by-step guide to setting up the infrastructure can be accessed here.






I am a Data and Cloud Engineer with a keen interest in deploying scalable solutions on the Cloud.
