As of TM1 version 10.2, a new feature known as “multi-threaded queries”, or MTQ, is available. MTQ allows TM1 to take advantage of multiple processor cores by splitting a query into multiple processing threads, effectively using a parallel-processing regime to resolve queries significantly faster.
The performance improvement of multi-threaded queries is approximately linear in the # of CPU cores that the TM1 query engine is allowed to utilize: approximate TM1 10.2 query time = pre-10.2 query time / # of CPU cores utilized for multi-threaded queries. For example, a query that takes 60 seconds on a single core should complete in roughly 7.5 seconds with MTQ=8. Here is an IBM note on configuring MTQ: https://www-01.ibm.com/support/docview.wss?uid=swg21960495&myns=swgimgmt&mynp=OCSS9RXT&mync=E&cm_sp=swgimgmt-_-OCSS9RXT-_-E
MTQ Behaviour & Configuration Options
MTQ is configured and enabled via corresponding entries in the tm1s.cfg file (MTQ=N, with N=#of CPU Cores)
The MTQ parameters in the tm1s.cfg are dynamic, i.e. the TM1 Server process does not have to be restarted after a change in the MTQ settings.
The MTQ value does not specify the total # of CPUs that TM1 can leverage across all queries. MTQ defines the # of CPU cores that TM1 may leverage for an individual query (i.e. per end user). It follows that if MTQ is, for example, set to 4 on a system with 8 CPU cores, two users could run a multi-threaded query simultaneously, with each query leveraging 4 CPU cores.
If there is no MTQ entry in the tm1s.cfg file or if MTQ=1 or MTQ=0, MTQ is disabled.
To set the value to the maximum number of cores available on a server, the setting MTQ=All (case insensitive) can be used.
Setting MTQ to a negative number (MTQ=-N) will result in the # of MTQ CPU cores being determined as follows: T = M - N + 1 (where T = # of CPU cores to be used by MTQ and M = # of CPU cores available to TM1). For example, if your computer has 64 cores and you set MTQ=-10, the # of MTQ cores will be T = 64 - 10 + 1 = 55.
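The configuration options above can be sketched in a tm1s.cfg fragment; the values below are illustrative only, and only one MTQ entry would be active at a time:

```
# tm1s.cfg (excerpt) - illustrative values only
# Use all available cores for each query:
MTQ=All
# ...or cap each query at 4 cores, leaving headroom for concurrent users:
# MTQ=4
# ...or use all cores minus 9 (T = M - 10 + 1) on a large machine:
# MTQ=-10
# MTQ is disabled when the entry is absent, or set to 0 or 1:
# MTQ=1
```

Because the MTQ parameters are dynamic, such a change takes effect without restarting the TM1 server process.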
MTQ does not effectively leverage hyper-threaded cores. If the CPU and OS support hyper-threading, we strongly recommend disabling it.
MTQ Configuration Scenarios, Consideration & Practices
Generally, the best practice is to set the MTQ value such that the maximum number of available processor cores is used, i.e. MTQ=All, MTQ=-1, or MTQ=M (with M = # of CPU cores, incl. hyper-threading cores).
In a scenario where individual end-user queries consume many cores yet still take considerable time, and where concurrency is very high, configure MTQ to use a # of processor cores that leaves CPU capacity available, so that multiple users can leverage MTQ concurrently.
In a scenario where ViewConstruct is used heavily to leverage MTQ in the context of TI processing (where views can get very large and view construction can take minutes rather than seconds), it is recommended to set MTQ to a value below the # of available processors; otherwise a TI process may temporarily consume all processing cores on the TM1 server and thereby cause a significant performance degradation of end-user queries during that time.
Advanced MTQ configuration
Customers applying MTQ to models with reasonably complex rules (i.e. rules with cross-cube references of recursion depth greater than 2) who are on an early TM1 10.2 release (prior to 10.2 FP1) could intermittently suffer from incorrect calculations unless they disable MTQ for single-cell consolidation via the parameter MTQ.SingleCellConsolidation=false. This parameter should only be used in such pre-10.2 FP1 environments; once an upgrade to that fix pack or a higher version has occurred, the parameter should be removed.
The dynamic tm1s.cfg parameter MTQ.MultithreadStargateCreationUsesMerge=TRUE can speed up the creation of large Stargate views (>100 MB). The default value of this parameter in TM1 10.2.2 is FALSE.
The dynamic TM1s.cfg parameter MTQ.CTreeWorkUnitMerge=TRUE speeds up concurrent population of calculation cache kept in the CTree, thereby improving performance of queries that cause a very large cache (re-)population. The default value of this parameter in TM1 10.2.2 is FALSE. Note that MTQ.CTreeWorkUnitMerge=TRUE could lead to redundant work in MTQ threads when the same calculation for exactly the same cell is re-computed per MTQ thread that needs it, whereas with MTQ.CTreeWorkUnitMerge=FALSE a computed cell would be published faster into a global CTree cache and then re-used by all MTQ threads. In cases where MTQ.CTreeWorkUnitMerge is enabled (= set to TRUE), the additional parameter MTQ.CTreeRedundancyReducer=TRUE (also a dynamic parameter) may be used to reduce query redundancies if applicable.
In TM1 10.2 versions prior to TM1 10.2.2 FP2 HF9, the (default) MTQ.CTreeWorkUnitMerge=FALSE setting could lead to rules not being calculated in certain scenarios where TI processes were used to (re-)populate/refresh data. We therefore recommend upgrading to TM1 10.2.2 FP2 HF9 or higher (10.2.2 FP3 and 10.3) when using MTQ. If an upgrade is not feasible at the given time and issues with rule calculations are encountered after TI processing, enabling MTQ.CTreeWorkUnitMerge (MTQ.CTreeWorkUnitMerge=TRUE) will also solve the issue.
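The advanced parameters above could be combined in a tm1s.cfg fragment like the following; all values are illustrative, and only the parameters relevant to your TM1 version and workload should be applied:

```
# tm1s.cfg (excerpt) - advanced MTQ parameters, illustrative only
# Pre-10.2 FP1 only: work around incorrect single-cell consolidations
# MTQ.SingleCellConsolidation=false
# Speed up creation of very large (>100 MB) Stargate views (default FALSE in 10.2.2):
MTQ.MultithreadStargateCreationUsesMerge=TRUE
# Speed up concurrent population of the CTree calculation cache (default FALSE in 10.2.2):
MTQ.CTreeWorkUnitMerge=TRUE
# Optional companion to CTreeWorkUnitMerge=TRUE to reduce redundant work:
MTQ.CTreeRedundancyReducer=TRUE
```

All of these parameters are dynamic, so they can be adjusted without a server restart.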
For additional documentation, refer to this link – http://www-01.ibm.com/support/knowledgecenter/SSMR4U_10.2.1/com.ibm.swg.ba.cognos.tm1_op.10.2.0.doc/c_tm1_op_multithreadedqueries_description.html
MTQ will be leveraged in TI if a sub-process is called in the Prolog of the TI process, which then runs the ViewConstruct command against the source cube and view. Using MTQ for faster generation of views via ViewConstruct() may require a significant increase in the VMM value (see section on VMM) and – consequently – RAM.
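As a sketch, the pattern could look as follows in TurboIntegrator; the process, cube, and view names are hypothetical:

```
# Prolog of the main TI process 'Main.LoadData':
# call a sub-process that pre-builds the source view, allowing MTQ to be used
ExecuteProcess('Sub.ConstructView');

# Prolog of the sub-process 'Sub.ConstructView':
# construct (and cache) the large source view in parallel
ViewConstruct('SalesCube', 'LoadView');
```

The main process can then use the pre-constructed view as its data source, benefiting from the parallel view construction.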
MTQ logging and monitoring
MTQ activity is best monitored via use of the TM1 Operations Console. The parent thread and each MTQ worker thread will all be visible as separate threads in TM1 Operations Console.
To generate logging information on multi-threaded queries, the following entries can be made in the tm1s-log.properties file (located in the same location as your tm1s.cfg file):
To capture Stargate creation times: log4j.logger.TM1.Cube.Stargate=DEBUG
To capture work unit splitting: log4j.logger.TM1.Parallel=DEBUG
To capture the event of operation threads picking work units: log4j.logger.TM1.OperationThread=DEBUG
Caching of MTQ Results
Query caching behaviour is configured per cube via the VMM value in the }CubeProperties cube, where VMM defines the maximum amount of memory to be used for caching per cube. In many cases it is therefore good practice to increase the memory reserved for caching Stargate views by setting the VMM value in the }CubeProperties cube to a significantly higher value than the default (65 KB in older releases, 128 KB in more recent ones).
Use of MTQ will typically require an increase in VMM size. If the VMM cache is set too low, even queries that were cached without MTQ may no longer be cached once MTQ is enabled. In such cases – to avoid unnecessary re-execution of multi-threaded queries – increase the VMM value until repeated query execution no longer triggers MTQ activity (indicating the cache is being used).
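VMM (and the related VMT threshold) can be set per cube by writing to the }CubeProperties control cube, either manually in the Cube Viewer or via a TI process. A TurboIntegrator sketch follows; the cube name and values are illustrative (VMM is expressed in KB, VMT in seconds):

```
# Raise the Stargate cache ceiling for 'SalesCube' to ~200 MB (value in KB):
CellPutS('200000', '}CubeProperties', 'SalesCube', 'VMM');
# Cache any view of 'SalesCube' that takes longer than 2 seconds to retrieve:
CellPutS('2', '}CubeProperties', 'SalesCube', 'VMT');
```

When tuning, increase VMM incrementally while monitoring server RAM, since the reserved cache memory is in addition to the memory needed to load the cubes themselves.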
Overview of Caching / Importance of Caching
TM1 allows setting thresholds and maximum cache memory for Stargate views. A Stargate view is a calculated and stored subsection of a TM1 cube that TM1 creates when you browse a cube with the Cube Viewer, Web-Sheet or In-Spreadsheet Browser. The purpose of a Stargate view is to allow quicker access to the cube data.
A Stargate view is different from a TM1 view object. The Stargate view contains only the data for a defined section of a cube, and does not contain the formatting information and browser settings that are in a view object. A Stargate view that TM1 creates when you access a cube contains only the data defined by the current title elements and row and column subsets.
TM1 stores a Stargate view when you access a view that takes longer to retrieve than the threshold defined by the VMT property in the control cube }CubeProperties. A Stargate view persists in memory only as long as the browser view from which it originates remains unchanged. When you recalculate the browser view, TM1 creates a new Stargate view based on the recalculated view and replaces the existing Stargate view in memory. When you close the browser view, TM1 removes the Stargate view from memory.
Stargate view caching behaviour/thresholds can be configured per cube by changing the VMT and VMM values in the }CubeProperties cube: for each cube, the VMM property determines the amount of RAM reserved on the server for the storage of Stargate views. The more memory made available for Stargate views, the better the performance. You must, however, make sure sufficient memory is available for the TM1 server to load all cubes.
The value of VMM is expressed in kilobytes. If no VMM value is specified, the default value is 128 kilobytes.
The valid range for VMM is 0 – 2,147,483,647 KB. The actual upper limit of VMM is determined by the amount of RAM available on your system.