Alinta Energy replaces Azure services in its data architecture

Alinta Energy has gradually replaced most of the Azure-based services that once powered its data architecture with Databricks, an effort it says has substantially cut its running costs.


Lead data engineer Jake Roussis told a Databricks data intelligence day in Melbourne that the changes had driven a “40-to-50 percent reduction in production platform expenditure over the last 12 months, [equivalent to] over $1 million in cost savings”.

The “gentailer” – both an electricity generator and retailer – had been using Azure Synapse, Microsoft’s all-in-one data platform and analytics services package, to run data transformation, querying and processing, as well as to serve data to users.

It had Databricks, but in a limited capacity, used only for some data processing and analytics work; it also had Tableau and Power BI for analysis and reporting.

Most of these elements have now been replaced by Databricks under a concerted re-platforming effort that Roussis said aligned to internal engineering’s strategy to chase “cost, observability, reliability and performance” improvements.

“There’s only a small portion [of our data architecture] that’s not [Databricks], and we’re doing everything that we can to move it into [Databricks],” he said.

The gradual replacement of Azure services has led to cost reductions, according to Roussis.

“I don’t want to talk about Synapse too much – we’ve moved away from it – but I can’t help but mention a 40 percent cost reduction that we’ve had by switching that off and migrating to Databricks,” he said.

“Another big win for us recently [was] serverless SQL warehouses. Databricks SQL is incredibly powerful on its own, but switching from a traditional warehouse to serverless SQL warehouses has netted us another 38 percent annually. That’s about $300,000 a year, so it’s no small sum.”
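For a sense of scale, the two serverless figures can be combined to back out the spend they imply. The sketch below is illustrative arithmetic only: it assumes the 38 percent and the $300,000 describe the same annual SQL warehouse spend, which the presentation did not spell out, and the function name is hypothetical.

```python
def implied_baseline(annual_saving: float, reduction_pct: float) -> float:
    """Prior annual spend implied by a saving equal to reduction_pct of it."""
    return annual_saving / (reduction_pct / 100)


# Figures quoted by Roussis; treating them as one spend line is an assumption.
baseline = implied_baseline(300_000, 38)
remaining = baseline - 300_000

print(f"implied prior annual spend: ${baseline:,.0f}")
print(f"implied annual spend after: ${remaining:,.0f}")
```

On those assumptions, the warehouse spend would have been a little under $800,000 a year before the switch.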

Roussis noted the serverless SQL move was accompanied by some “rightsizing” of data workloads in order to realise the cost reduction.

“We had to make sure that when users were querying … that we had the best cost optimisation on the serverless SQL setup, but it wasn’t a hard activity to undertake,” he said.

Roussis said that alerting was better set up in Databricks compared to natively in Azure, and could be routed to Alinta’s PagerDuty IT operations platform.
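Roussis did not describe how the routing is wired up. As a rough illustration of what a PagerDuty alert involves, the sketch below builds a trigger event in the shape used by PagerDuty's Events API v2. The routing key, pipeline name and source label are placeholders, and the helper names are hypothetical rather than anything Alinta is known to use.

```python
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source, severity="critical"):
    """Build an Events API v2 'trigger' event as a plain dict."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity}")
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {"summary": summary, "source": source, "severity": severity},
    }


def send_event(event):
    """POST the event to PagerDuty; not called here, as it needs a real key."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Placeholder values for illustration only.
event = build_trigger_event(
    routing_key="YOUR_INTEGRATION_KEY",
    summary="Pipeline nightly_load failed",
    source="databricks-prod",
)
print(json.dumps(event, indent=2))
```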

He also cited “real-time query monitoring” in Databricks as being advantageous.

Prior to re-platforming, Roussis said that resource exhaustion could be a “bi-weekly occurrence”. 

“Imagine it’s 3am on a Tuesday morning. Your phone starts vibrating, you reach for it, a critical [data] pipeline has failed,” he said.

“When you finished work yesterday everything was fine, your pipelines were running, the platform was perfect. 

“After 20 minutes of searching you finally locate the problem: a user kicked off a query at 10pm and it was just poorly written, it’s used up all your resources and nothing’s run since then.

“That was a reality for me when I started at Alinta.”

Roussis said that he could previously observe a query only as it ran. “I couldn’t see it after the fact. I couldn’t see the damage it had done.”

“Now I can see exactly how long that query takes, I can see the cost and I can also understand the query plan,” he said.

“Databricks is also great with that too because it tries to optimise the queries for you. It does what it can, but sometimes people just write bad queries. 

“So with that, we can analyse the queries that are being executed and then we can break them down and help the users improve what they’re doing. We can enable our business to do better.”
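The kind of after-the-fact review Roussis describes can be sketched as a simple ranking over query history. The records, field names and ten-minute threshold below are all hypothetical stand-ins; the article does not say which history source or cut-offs Alinta uses.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    user: str
    statement: str
    duration_s: float
    started_at: str


def slowest_queries(records, threshold_s, limit=5):
    """Return the longest-running queries at or above threshold_s, slowest first."""
    over = [r for r in records if r.duration_s >= threshold_s]
    return sorted(over, key=lambda r: r.duration_s, reverse=True)[:limit]


# Made-up history for illustration only.
history = [
    QueryRecord("analyst_a", "SELECT * FROM big_table a, big_table b", 17400.0, "22:00"),
    QueryRecord("etl_job", "INSERT INTO daily_summary SELECT ...", 240.0, "01:00"),
    QueryRecord("analyst_b", "SELECT region, COUNT(*) FROM sales GROUP BY region", 900.0, "09:15"),
]

flagged = slowest_queries(history, threshold_s=600)
for record in flagged:
    print(f"{record.user} started {record.started_at}: {record.duration_s / 60:.0f} min")
```

A list like this gives the platform team a starting point for the conversations with users that Roussis mentions.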

Case-in-point: Roussis said that a calculator used for electricity “pricing variation events” now required “less than 15 minutes” runtime, compared to over an hour previously.

“All we did was take it from being some poorly written Python code and make it purely Databricks SQL with dbt,” he said. Data build tool, or dbt, is used to transform raw data so that it is ready for analysis.
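The general shape of such a rewrite, moving from a row-by-row loop to a single set-based SQL statement, can be shown in miniature. This is not Alinta's pricing calculator: the table, columns and figures are invented, and Python's built-in sqlite3 module stands in for Databricks SQL and dbt. It makes no performance claim and only shows that the two approaches return the same totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meter_usage (customer_id INTEGER, kwh REAL, rate REAL)")
conn.executemany(
    "INSERT INTO meter_usage VALUES (?, ?, ?)",
    [(1, 100.0, 0.25), (1, 50.0, 0.5), (2, 200.0, 0.25)],
)


def totals_row_by_row(connection):
    """Pull every row into Python and accumulate each customer's charge in a loop."""
    totals = {}
    rows = connection.execute("SELECT customer_id, kwh, rate FROM meter_usage")
    for customer_id, kwh, rate in rows:
        totals[customer_id] = totals.get(customer_id, 0.0) + kwh * rate
    return totals


def totals_set_based(connection):
    """Let the SQL engine aggregate, returning one row per customer."""
    rows = connection.execute(
        "SELECT customer_id, SUM(kwh * rate) FROM meter_usage GROUP BY customer_id"
    )
    return dict(rows)


loop_totals = totals_row_by_row(conn)
sql_totals = totals_set_based(conn)
print(loop_totals == sql_totals)
```

In a dbt project, the set-based query would typically live in a model file as a SELECT statement, leaving the engine free to plan and optimise the aggregation.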

Natural language querying

Alinta Energy has also kicked off its first use case for AI/BI Genie, Databricks’ generative AI feature intended to enable business teams to interact with data using natural language.

“We did everything we could to get on top of that and understand it as quickly as possible – to understand its limitations and where it can excel,” Roussis said.

“In late 2024, we were able to put together a proof-of-concept that was presented to the Alinta Board that was able to present information about retail customers, with the idea being that the call centre agents can use it.

“Should someone ring up, [the call centre agent] can ask a question to the AI about you [to understand] what you might be calling about.

“That was a really good learning experience for us. From that, we are now at a stage where we are trying to put in processes so that we can make repeatable AI, so we can develop new AI and continue to put more models into production. 

“But we need to make sure that we have the appropriate guardrails in place. We need to make sure that things work exactly as we expect them to because we don’t want any AI to be giving the wrong information to people.”