Tuesday, March 6, 2012

"Current day" MDX

Hi,

is there any "update" on how to do the "current day" as a set in MDX with SQL 2005 based on the actual system date?

If you don't want to base that on the system date is this still the "best practice" to introduce some "flags" in the time dimension to identifiy the actual day, last week, ... Just the way it was introduced with the BI Accelerator tool...

Thanks,

It's an interesting question...

I can't actually remember how SSABI did its current time periods, but from what you say I guess it created sets which filtered on a member property value such as 'Is Current Day'. That should still work, but my feeling is that in AS2005 it might be better to create relative time attributes. So, for example, if you had Year, Quarter and Month attributes you would add Relative Year, Relative Quarter and Relative Month attributes too. These would have members on them such as 'Current Month', 'Current Month - 1', 'Current Month - 2' etc. This would allow users to create much more sophisticated relative time period reports, and at the same time still see the actual dates involved if they crossjoined the actual time period attributes with the relative time period attributes because autoexists would automatically do filter out everything but the correct combinations.

Incidentally, using the NOW() function to find the current date is a bad idea in AS2005 because as far as I can see, server-defined sets are now evaluated at processing time. So for example, add the following set to Adventure Works:

create set currentcube.test as strtoset("{[Date].[Calendar Year].&[" + cstr(cint(mid(cstr(now()),7,4))-2) + "]}");

If you run a query which shows the contents of this set, you should see the CY2004 member returned. However if you change the system date of your machine to be 2005 instead of 2006, then rerun the same query, you'll see that CY2004 is still returned. You need to reprocess the cube ('Process Script Cache' seems to be sufficient) to get CY2003 to appear in your query.

Chris

|||

Chris, I think your suggestion is good. However, it does have some rather nasty processing implications, since you have to reprocess your time dimension each day (if the granularity of your time dimension is date). A "Process Update" is sufficient of course, but this method invalidates (and therefore removes) all aggregations that include attributes from the time dimension. You are therefore forced into doing a "Process Index" on all your measure groups (except those that might not contain the time dimension) to get the aggregations back online.

It puzzles me that named sets should be evaluated at processing time?! As far as I know they are evaluated the first time they are requested by a query in a given session. The set is then cached until the session expires or the set is removed by the DROP SET statement. I have not verified that this is true for server-defined sets, though...

Anyway, using named sets for dynamic time, should in my opinion be one of the best approaches. Using Chris' suggestion allows you to define a single named set "Current Day", which you can base a number of other sets on. For instance, having created the set "Current Day", you can easily create the set "Current Month" by using the Exists function:

CREATE SET CURRENTCUBE.[Current Month] AS Exists([Date].[Calendar].[Month],[Current Day])

... and so on...

If you are using a front-end tool, make sure that it supports the use of named sets. If not, you have to create a calculated member that aggregates across the named set, but this approach has quite a few disadvantages (one of which is that a calculated member does not establish current cube context).

|||

True, the processing overhead could be pretty nasty depending on your cube.

Re sets, you can test out the new behaviour on Adventure Works as follows:

Create the following server-side set

create set currentcube.test as topcount([Date].[Date].[Date].members,10, measures.[internet sales amount]);

|||

Yes, I can see that the two queries return the same set. I don't think the example proves your point, though, since the named set under no circumstances is evaluated in the context of the query, which is why the WHERE clause has no effect on the set of dates returned.

I have also tested the fact that the named set is evaluated at processing time. This is actually not the case. It is evaluated the first time it is requested in a query. This can be verified (following the example in a previous post) by doing a process, changing the system date and then running the query, in which case the returned set will accurately reflect the new system date. Anyway, you are absolutely right that the content of the named set does not change until the cube is reprocessed. I wonder if this behavior can be changed? For the sake of dynamic time, however, it doesn't really matter too much. If only you process a part of your cube database once a day (after midnight), you should be good to go.

|||

You're right about the sets - sorry, my mistake. I think I was getting confused with a slightly different issue which is that the same set can return different results depending on where you put it in the MDX Script. Here's an example in Adventure Works: if you add the following onto the end of your MDX Script -

create set currentcube.test1 as order([Customer].[Education].[Education].members, measures.[internet sales amount], bdesc);

(measures.[internet sales amount], [Customer].[Education].&[Bachelors])=0;

create set currentcube.test2 as order([Customer].[Education].[Education].members, measures.[internet sales amount], bdesc);

Then run queries showing the contents of test1 and test2, you can see that they return different results - test2 accurately reflects the change made by the assignment. This of course isn't inconsistent with the set being evaluated the first time it's queried (presumably the results are then stored in the 'script cache' I was processing), just that it's correctly evaluated in the context of the script.

Anyway, I've had an idea on how to have a relative time dimension of the type I've described without incurring any of the processing penalties. What might work (and I need to test this) would be to create a separate Relative Time dimension then add it to your cube with *no* relationship to any measure group; you could then use MDX Script assignments to map the members on it to the equivalent members on the real time dimension. Definitely worth investigating...

Chris

|||Good thinking! I would be very interested in knowing how you accomplish this, as I have already tried to implement your suggestion (without success). |||

Here's a 'proof of concept' version:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=348468&SiteID=1

I'd be interested to hear if anyone actually tries this, and if they have any suggestions for improvements.

Chris

|||That link appears to point back to this thread, but I would be extremely interested in seeing your proof of concept code. I didn't know it was possible to dynamically link dimensions at run time, but that would solve for me a number of very difficult issues I am struggling with.|||

I have seen two other ways of implementing the current day.

Never have members in the time dimension after the current date. Today it is Aug 22 so never let at time member enter the dimension after this date. With this solution you can look for the last time member in MDX by lastchild.lastchild and so on. In this case it is the ETL process and SSIS that manage the time dimension.

Another solution is to look for a measure in the cube that you know will reflect the current day or the day before the current day. Actual Sales is a good candidate if budget sales is entered for the full year in advance. In this case you can use Tail with Filter, in MDX, to look for the last non empty member. If this set works you can add Lag(MDX) och Lead(MDX) to your first named set, together with Lead(MDX).

If you buy "Fast track to MDX", Second edition, you can read a discussion about these methods and a third one, recursion.

Regards

Thomas Ivarsson

|||

I use this technique as it supports dates beyond today.

In ETL, I select the min and max trx date keys (a bit more involved if multiple fact tables which I have) from the fact table and store them in a one row table called DIM_DAY_RANGE. Then use the following where clause for your DAY_DIM view

create view v_active_day_dim as
select * from
FROM dbo.DAY_DIM
WHERE DAY_KEY BETWEEN (SELECT minDayKey FROM dbo.DAY_DIM_RANGE) AND
(SELECT maxDayKey FROM dbo.DAY_DIM_RANGE)

The same technique applies to active products and customers and other dimensions. Makes the cube smaller and only used dimension keys are diplayed.

|||

Perhaps it's good to know some background...

I need this information ("today") because I need to build up reports which show "todays" Orders, Revenue or whatever. The users should do nothing but open the report, so I need something like a set to use instead of a fixed day or something what a user has to select.

Using the approach to "just offer what you have" is OK in many cases, however what do you do if you not only have actual sales but also your planned sales? Then you have to be able to show all days (or months) of the current year or also the next year... So this will not help you much...

|||

Quote:"what do you do if you not only have actual sales but also your planned sales". Do you have a version dimension like Actual, Budget, Planned, Forecast ? Or is this in the measures like, ActualSales, BudgetSales, Planned Sales, ForecastSales?

/Thomas

|||

Have a look here: http://support.dspanel.com/help43/Web_Part/Examples/MDX_Examples.htm

Regards

Thomas Ivarsson

|||

Thomas,

that's basically what Chris posted before... I used that for my solution as well because I'll have daily reprocesses of the cube... But it's a nice page with quite some useful stuff...

No comments:

Post a Comment