
Different ways to replace blanks with zeros in DAX

My post from a few months ago about the dangers of DAX measures that never return blank got quite a bit of attention: it’s a hot topic on the forums, and adding zeros to measures is a common cause of memory errors in Power BI. In that post, however, I didn’t talk about the best way to replace blanks with zeros if you have absolutely no choice but to do so. One of the comments on that post mentioned that visual calculations are an option, and I hadn’t thought about that before; now, after talking to the DAX gods (no, not the Italians, I mean Akshay, Marius and Geoffrey!) and doing some tests, I can say that visual calculations can be a good choice sometimes, whereas in other cases more traditional DAX approaches are appropriate.

Let’s look at some examples. I created the following model using AdventureWorksDW 2017 sample data:

There is a Product dimension, a Customer dimension, and a Date dimension, as well as a fact table containing sales data. The most important thing to note is that individual customers only buy a few products on a few dates. I also created two measures with the following definitions:

Sales Amount = SUM ( 'FactInternetSales'[SalesAmount] )

Mountain-100 Black 38 Sales =
CALCULATE (
    [Sales Amount],
    'DimProduct'[EnglishProductName] = "Mountain-100 Black, 38"
)

The [Sales Amount] measure simply sums the values in the SalesAmount column of the fact table; [Mountain-100 Black 38 Sales] returns the sales amount for just one product.

Now consider a table visual showing LastName from the DimCustomer table, FullDateAlternateKey from DimDate, and the measures [Sales Amount] and [Mountain-100 Black 38 Sales]:

There are a lot of rows here because every combination of LastName and FullDateAlternateKey that has a value for [Sales Amount] is displayed. Connecting Profiler to Power BI Desktop and capturing the Execution Metrics trace event (DAX Studio also shows this now) shows that the peak memory consumption for this query is 2063 KB.

{
	"timeStart": "2024-11-03T19:07:26.831Z",
	"timeEnd": "2024-11-03T19:07:26.844Z",

	"durationMs": 13,
	"vertipaqJobCpuTimeMs": 0,
	"queryProcessingCpuTimeMs": 0,
	"totalCpuTimeMs": 0,
	"executionDelayMs": 0,

	"approximatePeakMemConsumptionKB": 2063,

	"commandType": "Statement",
	"queryDialect": 3,
	"queryResultRows": 502
}

As you can see, [Mountain-100 Black 38 Sales] is mostly blank; let’s say you need to replace the blanks in this column with zeros.

Modifying the measure definition to add zero to the result of the CALCULATE() as follows:

Mountain-100 Black 38 Sales =
CALCULATE (
    [Sales Amount],
    'DimProduct'[EnglishProductName] = "Mountain-100 Black, 38"
) + 0

…doesn’t do what you want, because you now get a row in the table for every combination of LastName and FullDateAlternateKey, which makes the rows with non-zero values hard to find.
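One way to see why this happens is to run a simplified version of the query behind the table visual, for example in DAX Studio. The query below is only a sketch: it is not the exact query Power BI generates (the real one is more complex and, among other things, limits the number of rows returned), and it assumes the two measures defined above exist in the model. SUMMARIZECOLUMNS removes any row where all of the measure expressions return blank, so once one of the measures can never return blank, every combination of LastName and FullDateAlternateKey is kept.

```dax
// Simplified stand-in for the query behind the table visual (an assumption,
// not the query Power BI actually generates).
EVALUATE
SUMMARIZECOLUMNS (
    'DimCustomer'[LastName],
    'DimDate'[FullDateAlternateKey],
    // Rows are returned only when at least one of these is not blank
    "Sales Amount", [Sales Amount],
    "Mountain-100 Black 38 Sales", [Mountain-100 Black 38 Sales]
)
```

With the original measure definition this returns only the combinations that have sales; with the "+ 0" version it returns the full cross join of last names and dates.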

Instead, adding zero only when there is a value for [Sales Amount], something like this:

Mountain-100 Black 38 Sales =
IF (
    NOT ( ISBLANK ( [Sales Amount] ) ),
    CALCULATE (
        [Sales Amount],
        'DimProduct'[EnglishProductName] = "Mountain-100 Black, 38"
    ) + 0
)

…does the job. What does memory usage look like? Here are the execution metrics:

{
	"timeStart": "2024-11-03T19:17:22.470Z",
	"timeEnd": "2024-11-03T19:17:22.500Z",

	"durationMs": 30,
	"vertipaqJobCpuTimeMs": 0,
	"queryProcessingCpuTimeMs": 31,
	"totalCpuTimeMs": 31,
	"executionDelayMs": 0,

	"approximatePeakMemConsumptionKB": 3585,

	"commandType": "Statement",
	"queryDialect": 3,
	"queryResultRows": 502
}

Memory usage increased only slightly to 3585 KB.
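The same conditional pattern can also be written with a variable and COALESCE, which some people find easier to read. This is a sketch only: the measure name is hypothetical, and its memory usage was not measured in these tests, so treat it as one more variation to test rather than a recommendation.

```dax
// Hypothetical alternative name; same logic as the IF ... + 0 version above
Mountain-100 Black 38 Sales (Coalesce) =
VAR ProductSales =
    CALCULATE (
        [Sales Amount],
        'DimProduct'[EnglishProductName] = "Mountain-100 Black, 38"
    )
RETURN
    IF (
        // Only return a value where the customer bought something on that date
        NOT ( ISBLANK ( [Sales Amount] ) ),
        // COALESCE returns the first non-blank argument, so blank becomes 0
        COALESCE ( ProductSales, 0 )
    )
```

Because IF has no third argument, the measure still returns blank where [Sales Amount] is blank, so those rows stay out of the table.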

What about using a visual calculation instead? Going back to the original definition of the [Mountain-100 Black 38 Sales] measure and then creating this visual calculation:

No Blanks = [Mountain-100 Black 38 Sales] + 0

…shows that this doesn’t solve the problem, because you get the unwanted rows with no sales again. Using:

No Blanks =
IF (
    NOT ( ISBLANK ( [Sales Amount] ) ),
    [Mountain-100 Black 38 Sales] + 0
)

…does solve the problem, and you can of course hide the original [Mountain-100 Black 38 Sales] measure column so it doesn’t appear in your table:

But the execution metrics show that memory usage is actually much higher, at 11295 KB, because the result set now has one extra column and because visual calculations make a copy of the original result set in memory when they are calculated:

{
	"timeStart": "2024-11-03T19:31:22.858Z",
	"timeEnd": "2024-11-03T19:31:22.980Z",

	"durationMs": 122,
	"vertipaqJobCpuTimeMs": 0,
	"queryProcessingCpuTimeMs": 109,
	"totalCpuTimeMs": 109,
	"executionDelayMs": 0,

	"approximatePeakMemConsumptionKB": 11295,

	"commandType": "Statement",
	"queryDialect": 3,
	"queryResultRows": 502
}

Does this mean visual calculations should never be used? No, not at all. Consider the following matrix visual, which contains only the measure [Mountain-100 Black 38 Sales] and has LastName on rows and FullDateAlternateKey on columns:

Memory usage for this visual is 1091 KB:

{
	"timeStart": "2024-11-03T19:51:02.966Z",
	"timeEnd": "2024-11-03T19:51:02.974Z",

	"durationMs": 8,
	"vertipaqJobCpuTimeMs": 0,
	"queryProcessingCpuTimeMs": 16,
	"totalCpuTimeMs": 16,
	"executionDelayMs": 0,

	"approximatePeakMemConsumptionKB": 1091,

	"commandType": "Statement",
	"queryDialect": 3,
	"queryResultRows": 139
}

The result set returned by the DAX query used to populate this visual contains only one row for each combination of last name and date that has a value for [Mountain-100 Black 38 Sales], 139 rows in all, but because a matrix visual is used to display the results, you get all the blanks that you can see in the screenshot. You could try replacing those blanks with some very complex DAX, but I won’t even try. Instead, visual calculations solve this problem very easily:

No Blanks Matrix = [Mountain-100 Black 38 Sales] + 0

Here is the matrix with [Mountain-100 Black 38 Sales] hidden and the visual calculation applied:

The execution metrics show that peak memory consumption is only 2054 KB; the number of rows returned is higher, but still only 2070:

{
	"timeStart": "2024-11-03T19:45:49.298Z",
	"timeEnd": "2024-11-03T19:45:49.337Z",

	"durationMs": 39,
	"vertipaqJobCpuTimeMs": 0,
	"queryProcessingCpuTimeMs": 31,
	"totalCpuTimeMs": 31,
	"executionDelayMs": 0,

	"approximatePeakMemConsumptionKB": 2054,

	"commandType": "Statement",
	"queryDialect": 3,
	"queryResultRows": 2070
}

Overall, both traditional DAX solutions and visual calculations work well in different scenarios, so I suggest you test the query performance and memory usage of the different solutions yourself.
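For the measure-based approaches, one convenient way to run such a test is a query-scoped measure in DAX Studio, which lets you try a variation without changing the model. The following is a minimal sketch under some assumptions: the measure name [Test Measure] is hypothetical, hosting it on the FactInternetSales table is an arbitrary choice, and the SUMMARIZECOLUMNS query is only a simplified stand-in for what a table visual would send.

```dax
DEFINE
    // Query-scoped measure: exists only for the duration of this query
    MEASURE 'FactInternetSales'[Test Measure] =
        IF (
            NOT ( ISBLANK ( [Sales Amount] ) ),
            CALCULATE (
                [Sales Amount],
                'DimProduct'[EnglishProductName] = "Mountain-100 Black, 38"
            ) + 0
        )
EVALUATE
SUMMARIZECOLUMNS (
    'DimCustomer'[LastName],
    'DimDate'[FullDateAlternateKey],
    "Sales Amount", [Sales Amount],
    "Test Measure", [Test Measure]
)
```

Swapping the body of [Test Measure] for each candidate definition and comparing the row counts and execution metrics gives a rough like-for-like comparison. This sketch only covers measures; the visual calculation results above were captured from the report visuals themselves.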