This guide covers the obstacles you may run into when you try to scale an Amazon EC2 instance running the Power BI on-premises data gateway. They all come down to current limitations in the installer and the PowerShell cmdlets. Right sizing your Power BI Gateway on AWS is still possible. Here’s how DesignMind uses enhanced AWS configurations to anticipate our customers’ data reporting needs.
Why would you even consider right sizing your Power BI Gateway on AWS? Microsoft provides a Redshift data source, but it requires pulling data across the wire from AWS to Azure or to your Power BI Desktop. Treating AWS’ data resources like an “on-premises” data lake allows us to install Power BI Gateways on EC2 instances in the same VPC as the database. We can offload and localize refreshes and DirectQuery requests. Now, we just need to right-size our gateway clusters so we have the capacity we need without breaking the bank.
Installing the Power BI Gateway using PowerShell cmdlets
Let’s talk about Power BI Gateway clusters first. A gateway cluster already provides some basic load balancing, but it requires big, expensive machines that must be pre-configured and connected.
When you install the gateway, you get a choice after you authenticate. You can either register the gateway as a new cluster or connect it to an existing cluster. Here’s the rub: that part cannot be automated yet.
The Power BI Gateway installer doesn’t accept any options. We can’t select the gateway. We can’t provide the recovery key. We can’t connect to an existing cluster.
If we could automate these settings, we could connect a new instance to an existing gateway cluster.
The Power BI team provides PowerShell cmdlets that let us do almost everything. With these modules, we can install the gateway, log in, and create a new gateway. What we cannot do is join an existing cluster. Microsoft’s Power BI team points out this limitation in the public preview of their new automation cmdlets.
Right sizing your Power BI Gateway on AWS
That’s it in a nutshell. We can automate installation. We can create NEW gateways. We cannot join a new installation to an existing gateway cluster in an automated manner, with either the installer or PowerShell.
So much for a pre-cooked AMI with everything installed and a user-data script that configures a gateway and adds it to the cluster. Since each node needs to be configured manually, we need to resize pre-existing machines in AWS instead.
Enter Ops Automator
AWS offers no out-of-the-box way to automatically resize an existing EC2 instance. ELB and Auto Scaling groups are designed to work with ephemeral instances that configure themselves. Fortunately, AWS provides an enhanced framework that makes such things possible, along with a CloudFormation template that installs this impressive stack.
The core of it is a set of Lambda functions that do powerful things not available to us in the Console. Once we install that framework, our AWS configuration is relatively straightforward. We can use the Vertical Resize task provided by the framework and attach it to our EC2 instances to RESIZE them in place.
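To make the idea of resizing in place concrete, here is a minimal sketch of the stop, change type, start sequence that any vertical resize of an EBS-backed instance has to go through. It is not the Ops Automator implementation. The function name `resize_instance` is hypothetical, and the `ec2` argument is assumed to be a boto3 EC2 client (or any object exposing the same methods).

```python
from typing import Any, List


def resize_instance(ec2: Any, instance_id: str, new_type: str) -> List[str]:
    """Stop an instance, change its instance type, and start it again.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the ordered list of steps performed, which is handy for logging.
    """
    steps: List[str] = []

    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    steps.append("stopped")

    # Swap the instance type; disks, tags and gateway install stay intact.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )
    steps.append(f"resized to {new_type}")

    # Bring the node back so the gateway service can rejoin the cluster.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    steps.append("running")
    return steps
```

Because the instance is only stopped and never terminated, the hand-configured gateway survives the resize, which is the whole point of this approach.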
Bringing it all together
To make this work and get maximum uptime at reduced cost, we must configure the Power BI gateway cluster to use its own internal load balancing, and then use the Vertical Resize task to step up the pre-configured EC2 instances under load and step them down when the load cools off.
Microsoft provides some out-of-the-box load balancing, which can be tuned with a few tweaks to the configuration file on the primary gateway. By default, load balancing is random. By adjusting the metrics in the config file, we get greater control over when traffic starts routing to child nodes. If we didn’t care about cost, we’d just have a bunch of machines running, they’d balance the queries, and we would be done.
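The effect of those metrics can be pictured with a small sketch. This is an illustration of threshold-based routing in general, not the gateway’s actual algorithm. The names `Member`, `pick_member`, `cpu_threshold` and `memory_threshold` are hypothetical stand-ins, and treating a threshold of 0 as “disabled” is an assumption; check the real setting names and semantics against Microsoft’s documentation for your gateway version.

```python
import random
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Member:
    """One gateway node and its current load (hypothetical model)."""
    name: str
    cpu_percent: float
    memory_percent: float


def pick_member(
    members: List[Member],
    cpu_threshold: float = 0,
    memory_threshold: float = 0,
    rng: Any = random,
) -> Optional[Member]:
    """Pick a random member among those under both thresholds.

    A threshold of 0 disables that check (assumption), so with both at 0
    this degrades to plain random selection across all members.
    """
    def under_limits(m: Member) -> bool:
        cpu_ok = cpu_threshold == 0 or m.cpu_percent < cpu_threshold
        mem_ok = memory_threshold == 0 or m.memory_percent < memory_threshold
        return cpu_ok and mem_ok

    eligible = [m for m in members if under_limits(m)]
    if not eligible:
        return None  # every member is over its limit
    return rng.choice(eligible)
```

With both thresholds left at 0, every member is eligible and selection is purely random. Setting a threshold takes busy members out of the pool, so new queries go to the nodes that still have headroom.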
But this is AWS we’re talking about. The recommended configurations for an on-premises gateway get expensive real fast. So, why not have one main gateway running at the recommended specifications and then have child nodes running on free-tier micro instances that scale up under load?
That’s where Ops Automator comes in. By using the Ops Automator framework, we can tag our child nodes and have them scale up as they come under load. The Power BI Gateway routes the traffic while the child nodes go offline, scale up, and then come back online. Getting a responsive, always-on cluster is a balancing act of scaling up in time, but this is just the basic configuration.
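The step-up and step-down decision itself is simple enough to sketch. The following is a minimal illustration, not part of Ops Automator: the size ladder, the thresholds and the function name `next_instance_type` are all hypothetical values chosen for the example.

```python
from typing import Sequence

# Hypothetical ladder of instance types, smallest to largest.
SIZE_LADDER = ("t3.micro", "t3.medium", "t3.xlarge")


def next_instance_type(
    current: str,
    cpu_percent: float,
    ladder: Sequence[str] = SIZE_LADDER,
    scale_up_at: float = 75.0,
    scale_down_at: float = 20.0,
) -> str:
    """Return the instance type a node should move to, one step at a time.

    Moves one rung up when CPU is at or above `scale_up_at`, one rung down
    when it is at or below `scale_down_at`, and otherwise stays put.
    """
    idx = list(ladder).index(current)
    if cpu_percent >= scale_up_at and idx < len(ladder) - 1:
        return ladder[idx + 1]
    if cpu_percent <= scale_down_at and idx > 0:
        return ladder[idx - 1]
    return current
```

Keeping a gap between the two thresholds avoids flapping: a node that has just scaled up will not immediately scale back down because its CPU dropped slightly. Since every resize takes the node offline for a moment, that gap matters more here than it would with ephemeral instances.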
With CloudWatch and better monitoring, we can get those child nodes scaling up as soon as the Power BI Gateway comes under load. Once the Power BI team delivers the PowerShell cmdlets planned on their roadmap, we should be able to use auto scaling and throw away ephemeral gateways when we are done. Until then, this solution for right sizing your Power BI gateway on AWS is a handy trick for anyone who needs to resize EC2 instances that must be configured by hand.
Brian Hatchet is a Principal Software Engineer at DesignMind, and specializes in .NET, open source, and team software development processes.