As you may know, both dimension attributes and distinct count measures can be of a string data type. During processing, Analysis Services reads the data and puts it into individual stores, organized by attribute for dimensions or by measure group/partition for cubes.
The file format of these stores differs depending on the type of data they contain. Until now, string stores could not exceed a 4 GB file size limit.
Starting with CTP3, a new feature named “Scalable String Storage” allows you to go beyond this limit. A new property named “StringStoresCompatibilityLevel” is available in several places to activate it.
In this post, I’ll review how to set up this new feature in your SSAS solution and what we can expect from it…
How to set up scalable string storage?
First, you will find this property in the dimension editor. Just select the top node of the dimension in the metadata tree view, and take a look at the properties.
Then, you will also find this property in the cube editor, although not on measure groups as stated on MSDN (see refs below), but on each partition instead.
This property can take two distinct values: 1050 or 1100; other values will be ignored. Set this property to 1100 to use the new storage mode.
As a best practice, you should not use this feature unless you expect your store files to approach the 4 GB limit.
- This feature requires the compatibility level of the OLAP database to be set properly. Newly created solutions are correctly set by default; upgraded databases will need additional configuration.
- Note that this change is irreversible (no downgrade!), so be sure to back up your database first.
- Note also that partition merging, dimension linking across databases, and database synchronization are only supported when all the databases involved have the same compatibility level.
- To set the compatibility level of an OLAP database: in Management Studio, generate an Alter script of the OLAP database, search for the XML tag named “CompatibilityLevel”, replace the value, and execute the script.
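As a sketch, the relevant part of such an Alter script might look like the fragment below. The database name is a placeholder, and the exact namespaces and surrounding elements may differ from the script Management Studio generates for your database; only the “CompatibilityLevel” tag is the point of interest:

```xml
<Alter ObjectExpansion="ObjectProperties"
       xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>MyOlapDatabase</DatabaseID>
  </Object>
  <ObjectDefinition>
    <Database>
      <ID>MyOlapDatabase</ID>
      <Name>MyOlapDatabase</Name>
      <!-- 1050 is the legacy level; change the value to 1100 -->
      <CompatibilityLevel>1100</CompatibilityLevel>
    </Database>
  </ObjectDefinition>
</Alter>
```

Execute the modified script from an XMLA query window in Management Studio, then the 1100 value of “StringStoresCompatibilityLevel” becomes usable on dimensions and partitions.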
- This is a feature that should be planned at design time whenever possible, since changing the storage mode implies a full reprocessing of the object and all its dependencies. Considering the size of the objects involved (we’re talking about removing a 4 GB file size limit!), the reprocessing time needed will almost certainly be high.
- To know whether there is a risk of reaching the limit, one must monitor at least the growth rate of the .asstore, .bstore, .ksstore and .string files. The sooner one knows the answer, the better…
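As an illustration, a small script along these lines could be scheduled to scan the instance’s data directory and flag string store files that are getting close to the limit. The directory path and the 80% alert threshold are my own assumptions, not anything prescribed by SSAS:

```python
import os

# Extensions of the string store files to watch (per the list above).
WATCH_EXTENSIONS = (".asstore", ".bstore", ".ksstore", ".string")

# The 4 GB file size limit of the legacy string store format.
LIMIT_BYTES = 4 * 1024**3

def find_large_string_stores(data_dir, threshold=0.8):
    """Return (path, size) pairs for string store files whose size
    exceeds `threshold` * 4 GB, so they can be flagged before they
    actually hit the limit."""
    flagged = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.lower().endswith(WATCH_EXTENSIONS):
                path = os.path.join(root, name)
                size = os.path.getsize(path)
                if size >= threshold * LIMIT_BYTES:
                    flagged.append((path, size))
    return flagged
```

Run it periodically against your instance’s Data directory and log the results over time; the growth rate between two runs tells you how urgent the migration is.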
- The new file structure is divided into multiple 64 KB records (a string can span several records depending on its size) and can hold up to 4 billion records, i.e. up to 4 billion strings. All in all, the new maximum file size is 64 KB × 4 billion = 256 billion KB, roughly 256 TB for a single string attribute, which is huge! As a side effect, beware that each string has at least one record allocated, so even a string smaller than 64 KB will take up a full 64 KB record. This is the same principle as file system clusters. The average size of your strings should therefore definitely be taken into account, or else disk space needs may grow far more, and far faster, than expected!
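To make that arithmetic concrete, here is a quick sketch of the capacity figures and of the padding effect. The one-record-per-string minimum is taken from the behavior described above; the sample workload (a million 100-byte strings) is purely illustrative:

```python
RECORD_SIZE = 64 * 1024   # 64 KB per record
MAX_RECORDS = 4 * 10**9   # up to 4 billion records per store

# Theoretical maximum size of one string store file: 256 billion KB.
MAX_STORE_BYTES = RECORD_SIZE * MAX_RECORDS

def store_size_bytes(string_sizes):
    """Estimated on-disk size of a string store: each string occupies
    at least one full 64 KB record, and longer strings span several."""
    total_records = 0
    for size in string_sizes:
        # ceil(size / RECORD_SIZE), with a minimum of one record
        total_records += max(1, -(-size // RECORD_SIZE))
    return total_records * RECORD_SIZE

# A million 100-byte strings hold ~95 MB of raw character data...
raw_bytes = 1_000_000 * 100
# ...but occupy ~61 GiB on disk because of the per-string record minimum.
padded_bytes = store_size_bytes([100] * 1_000_000)
```

The gap between `raw_bytes` and `padded_bytes` is exactly the cluster-like overhead mentioned above, and it is why the average string size matters so much when provisioning disk space.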
Is there a performance impact?
Take a look at the following points:
- The file size on disk will be larger than the total size of the strings it contains, so we should try to evaluate and provision enough disk space before proceeding with the modification.
- My guess is that we can also expect an increased I/O rate to write/process or read/query the data, therefore impacting processing time, SSAS metadata loading after each service restart, and query time.
- Further investigation and measurement will have to be performed to evaluate the exact impact on queries.
More information can be found here: