SSAS INTERVIEW QUESTIONS AND ANSWERS : ALL BASIC QUESTIONS ON SQL SERVER ANAYSIS SERVICES
- What is datawarehouse in short DWH?
The datawarehouse is an informational
environment that
- Provides an integrated and total view of the enterprise
- Makes the enterprise’s current and historical information easily
available for decision making
- Makes decision-support transactions possible without hindering
operational systems
- Renders the organization’s information consistent
- Presents a flexible and interactive source of strategic information
OR a warehouse is a
- Subject oriented
- Integrated
- Time variant
- Non volatile for doing decision support
OR
Collection of data in support of
management’s decision making process”. He defined the terms in the sentence as
follows.
OR
Subject oriented:
It define the specific business domain
ex: banking, retail, insurance, etc…..
Integrated:
It should be in a position to
integrated data from various source systems
Ex: sql,oracle,db2 etc……
Time variant:
It should be in a position to maintain
the data the various time periods.
Non volatile:
Once data is inserted it can’t be
changed
- What is the difference between SSAS 2005 and SSAS2008?
- In 2005 its not possible to create an empty cube but in 2008 we can create an empty cube.
- A new feature in Analysis Services 2008 is the Attribute Relationships tab in the Dimension Designer . to implement attribute relationship is complex in ssas 2005
- we can create ONLY 2000 partitions per Measure Group in ssas 2005 and the same limit of partitions is removed in ssas 2008.
You can answer more but if you end this with these then the interviewer feel that you are REAL EXPERIENCED.
- What
is data mart?
A data mart is a subset of an
organizational data store, usually oriented to a specific purpose or major data
subject that may be distributed to support business needs. Data marts are
analytical data stores designed to focus on specific business functions for a
specific community within an organization.
Data marts are often
derived from subsets of data in a data warehouse, though in the
bottom-up data warehouse design methodology the data warehouse is created from
the union of organizational data marts.
They are 3 types of data mart they are
- Dependent
- Independent
- Logical data mart
- What are the difference between data mart
and data warehouse?
Datawarehouse is complete data where as
Data mart is Subset of the same.
Ex:
All the organisation data may related
to finance department, HR, banking dept are stored in data warehouse where as
in data mart only finance data or HR department data will be stored. So data
warehouse is a collection of different data marts.
- Have you ever worked on performance
tuning, if yes what are the steps involved in it?
We need to identify
the bottlenecks to tune the performance, to overcome the bottleneck
we need to following the following.
- Avoid named queries
- Unnecessary relationships between tables
- Proper attribute relationships to be given
- Proper aggregation design
- Proper partitioning of data
- Proper dimension usage design
- Avoid unnecessary many to many relationships
- Avoid unnecessary measures
- Set AttributeHierarchyEnabled = FALSE to Attributes that is not
required
- Won’t take even single measure which is not necessary.
- What
are the difficulties faced in cube development?
This question is either to test whether
you are really experienced or when he doesnot have any questions to ask
..
You can tell any area where you feel
difficult to work. But always the best answers will be the following.
- Giving attribute relationships
- Calculations
- Giving dimension usage (many to many relationship)
- Analyzing the requirements
- Explain the flow of creating a cube?
Steps to create a cube in ssas
- Create a data source.
- Create a datasource view.
- Create Dimensions
- Create a cube.
- Deploy and Process the cube.
- What
is a datasource or DS?
The data source is the Physical
Connection information that analysis service uses to connect to the database
that host the data. The data source contains the connection string which
specifies the server and the database hosting the data as well as any necessary
authentication credentials.
- What
is datasourceview or DSV?
A data source view is a persistent set
of tables from a data source that supply the data for a particular cube. BIDS
also includes a wizard for creating data source views, which you can invoke by
right-clicking on the Data Source Views folder in Solution Explorer.
- Datasource view is the logical view of the data in the data source.
- Data source view is the only thing a cube can see.
- What is named calculation?
A named calculation is a SQL expression
represented as a calculated column. This expression appears and behaves as a
column in the table. A named calculation lets you extend the relational schema
of existing tables or views in a data source view without modifying the tables
or views in the underlying data source.
Named calculation is used to create a
new column in the DSV using hard coded values or by using existing columns or
even with both.
- What
is named query?
Named query in DSV is similar to View
in Database. This is used to create Virtual table in DSV which will not impact
the underlying database. Named query is mainly used to merge the two or more
table in the datasource view or to filter columns of a table.
- Why
we need named queries?
A named query is used to join multiple
tables, to remove unnecessary columns from a table of a database. You
can achieve the same in database using Views but this Named Queries
will be the best bet whe you don’t have access to create Views in database.
- How will you add a new column to an
existing table in data source view?
By using named calculations we can add
a new column to an existing table in the data source view. Named Calculation is
explained above.
- What is dimension table?
A dimension table contains hierarchical
data by which you’d like to summarize. A dimension table contains specific
business information, a dimension table that contains the specific name of each
member of the dimension. The name of the dimension member is called an
“attribute”
The key attribute in the dimension must
contain a unique value for each member of the dimension. This key attribute is
called “primary key column”
The primary key column of each
dimension table corresponding to the one of the key column in any related
fact table.
- What
is fact table?
A fact table contains the basic
information that you wish to summarize. The table that stores the detailed
value for measure is called fact table. In simple and best we can define as
“The table which contains METRICS” that are used to analyse the business.
It consists of 2 sections
1) Foregine key to the dimesion
2) measures/facts(a numerical value
that used to monitor business activity)
- What
is Factless fact table?
This is very important interview
question. The “Factless Fact Table” is a table which is similar to Fact Table
except for having any measure; I mean that this table just has the links to the
dimensions. These tables enable you to track events; indeed they are for
recording events.
Factless fact tables are used for
tracking a process or collecting stats. They are called so because, the fact
table does not have aggregatable numeric values or information. They are mere
key values with reference to the dimensions from which the stats can be
collected
- What is attribute relationships, why we
need it?
Attribute relationships are the way of
telling the analysis service engine that how the attributes are related with
each other. It will help to relate two or more attributes to each
other.Processing time will be decreased if proper relationships are given. This
increases the Cube Processing performance and MDX query performance too.
In Microsoft SQL
Server Analysis Services, attributes within a dimension are always related
either directly or indirectly to the key attribute. When you define a dimension
based on a star schema, which is where all dimension attributes are derived
from the same relational table, an attribute relationship is automatically
defined between the key attribute and each non-key attribute of the dimension.
When you define a dimension based on a snowflake schema, which is where
dimension attributes are derived from multiple related tables, an attribute
relationship is automatically defined as follows:
- Between the key attribute and each non-key attribute bound to
columns in the main dimension table.
- Between the key attribute and the attribute bound to the foreign
key in the secondary table that links the underlying dimension tables.
- Between the attribute bound to foreign key in the secondary table
and each non-key attribute bound to columns from the secondary table.
- How
many types of attribute relationships are there?
They are 2 types of attribute
relationships they are
- Rigid
- Flexible
Rigid: In Rigid relationships
where the relationship between the attributes is fixed, attributes will not
change levels or their respective attribute relationships.
Example: The time dimension. We
know that month “January 2009″ will ONLY belong to Year “2009″ and it wont be
moved to any other year.
Flexible : In Flexible
relationship between the attributes is changed.
Example: An employee and
department. An employee can be in accounts department today but it is possible
that the employee will be in Marketing department tomorrow.
- How many types of dimensions are there
and what are they?
They are 3 types of dimensions:
- confirm dimension
- junk dimension
- degenerate attribute
- What
are confirmed dimensions, junk dimension and degenerated dimensions?
Confirm dimension: It is the dimension
which is sharable across the multiple facts or data model. This is
also called as Role Playing Dimensions.
junk dimension: A number of very small
dimensions might be lumped (a small irregularly shaped) together to form a
single dimension, a junk dimension – the attributes are not closely related.
Grouping of Random flags and text Attributes in a dimension and moving them to
a separate sub dimension is known as junk dimension.
Degenerated dimension: In this
degenerate dimension contains their values in fact table and
the dimension id not available
in dimension table. Degenerated Dimension is a dimension key
without corresponding dimension.
Example: In the PointOfSale Transaction
Fact table, we have:
Date Key (FK), Product Key (FK), Store
Key (FK), Promotion Key (FP), and POS Transaction Number
Date Dimension corresponds to Date Key,
Production Dimension corresponds to Production Key. In a traditional
parent-child database, POS Transactional Number would be the key to the
transaction header record that contains all the info valid for the transaction
as a whole, such as the transaction date and store identifier. But in
this dimensional model, we have already extracted this info into other
dimension. Therefore, POS Transaction Number looks like a dimension key in
the fact table but does not have the corresponding dimension table.
- What
are the types of database schema?
They are 3 types of database schema
they are
- Star
- Snowflake
- Starflake
- What is star, snowflake and star flake
schema?
Star schema: In star schema
fact table will be directly linked with all dimension tables. The star schema’s
dimensions are denormalized with each dimension being represented by a single
table. In a star schema a central fact table connects a number of
individual dimension tables.
Snowflake: The snowflake schema
is an extension of the star schema,
where each point of the star explodes into more points. In a star schema, each
dimension is represented by a single dimensional table, whereas in a snowflake
schema, that dimensional table is normalized into multiple lookup tables, each
representing a level in the dimensional hierarchy. In snow flake schema
fact table will be linked directly as well as there will be some intermediate
dimension tables between fact and dimension tables.
Star flake: A hybrid
structure that contains a mixture of star(denormalized) and
snowflake(normalized) schema’s.
- How
will you hide an attribute?
We can hide the attribute by selecting
“AttributeHierarchyVisible = False” in properties of the attribute.
- How will you make an attribute not
process?
By selecting “
AttributeHierarchyEnabled = False”, we can make an attribute not in
process.
- What is use of IsAggregatable property?
In Analysis Service we generally see
all dimension has All member. This is because of IsAggregatable property of the
attribute. You can set its value to false, so that it will not show All
member. Its default member for that attribute. If you hide this member
than you will have to set other attribute value to default member else it will
pick some value as default and this will create confusion in browsing data if
someone is not known to change in default member.
- What are key, name and value columns of
an attribute?
Key column of any attribute: Contains the column or columns that represent the key for the attribute,
which is the column in the underlying relational table in the data source view
to which the attribute is bound. The value of this column for each member is
displayed to users unless a value is specified for the NameColumn property.
Name column of an attribute: Identifies the column that provides the name of the attribute that
is displayed to users, instead of the value in the key column for the
attribute. This column is used when the key column value for an attribute
member is cryptic or not otherwise useful to the user, or when the key column
is based on a composite key. The NameColumn property is not used in
parent-child hierarchies; instead, the NameColumn property for child members is
used as the member names in a parent-child hierarchy.
Value columns of an attribute: Identifies the column that provides the value of the attribute. If the
NameColumn element of the attribute is specified, the same DataItem values are
used as default values for the ValueColumn element. If the NameColumn element
of the attribute is not specified and the KeyColumns collection of the
attribute contains a single KeyColumn element representing a key column with a
string data type, the same DataItem values are used as default values for the
ValueColumn element.
- What
is hierarchy, what are its types and difference between them?
A hierarchy is a very important part of
any OLAP engine and allows users to drill down from summary levels
hierarchies represent the way user expect to explore data at more detailed level
hierarchies is made up of
multipule levels creating the structure based on end user requirements.
->years->quarter->month->week
,are all the levels of calender hierarchy
They are 2 types of hierarchies they
are
- Natural hierarchy
- Unnatural hierarchy
Natural hierarchy: This means that the attributes are intuitively related to one another.
There is a clear relationship from the top of the hierarchy to the bottom.
Example: An example of this would
be date: year, quarter and month follow from each other, and in part, define
each other.
Unnatural hierarchy: This means that the attributes are not clearly related.
Example: An example of this might be
geography; we may have country -> state -> city, but it is not clear
where Province might sit.
- What is Attribute hierarchy?
An attribute hierarchy is created for
every attribute in a dimension, and each hierarchy is available for
dimensioning fact data. This hierarchy consists of an “All” level and a detail
level containing all members of the hierarchy.
you can organize attributes into
user-defined hierarchies to provide navigation paths in a cube. Under certain
circumstances, you may want to disable or hide some attributes and their
hierarchies.
- What
is use of AttributeHierarchyDisplayFolder property ?
AttributeHierarchyDisplayFolder: Identifies the folder in which to display the associated attribute
hierarchy to end users. For example if I set the property value as “Test” to
all the Attributes of a dimension then a folder with the name “Test” will be created
and all the Attributes will be placed into the same.
- What is use
of AttributeHierarchyEnabled?
AttributeHierarchyEnabled: Determines whether an attribute hierarchy is generated by Analysis
Services for the attribute. If the attribute hierarchy is not enabled, the
attribute cannot be used in a user-defined hierarchy and the attribute
hierarchy cannot be referenced in Multidimensional Expressions (MDX)
statements.
- What is use
of AttributeHierarchyOptimizedState?
AttributeHierarchyOptimizedState: Determines the level of optimization applied to the attribute hierarchy.
By default, an attribute hierarchy is FullyOptimized, which means that Analysis
Services builds indexes for the attribute hierarchy to improve query
performance. The other option, NotOptimized, means that no indexes are built
for the attribute hierarchy. Using NotOptimized is useful if the attribute
hierarchy is used for purposes other than querying, because no additional
indexes are built for the attribute. Other uses for an attribute hierarchy
can be helping to order another attribute.
- What is use
of AttributeHierarchyOrdered ?
AttributeHierarchyOrdered: Determines whether the associated attribute hierarchy is ordered. The
default value is True. However, if an attribute hierarchy will not be used for
querying, you can save processing time by changing the value of this property
to False.
- What
is the use of AttributeHierarchyVisible ?
AttributeHierarchyVisible : Determines whether the attribute hierarchy is visible to client
applications. The default value is True. However, if an attribute hierarchy
will not be used for querying, you can save processing time by changing the
value of this property to False.
- What are types of storage modes?
There are three standard storage modes
in OLAP applications
- MOLAP
- ROLAP
- HOLAP
- Compare
the Three Storage Modes ?
Summary and comparison
Basic Storage Mode
|
Storage Location for Detail Data
|
Storage Location for Summary/
Aggregations
|
Storage space requirement
|
Query Response Time
|
Processing Time
|
Latency
|
MOLAP
|
Multidimensional Format
|
Multidimensional Format
|
MediumBecause detail data is stored
in compressed format.
|
Fast
|
Fast
|
High
|
HOLAP
|
Relational Database
|
Multidimensional Format
|
Small
|
Medium
|
Fast
|
Medium
|
ROLAP
|
Relational Database
|
Relational Database
|
Large
|
Slow
|
Slow
|
Low
|
- What
is MOLAP and its advantage?
MOLAP (Multi dimensional Online
Analytical Processing) : MOLAP is the most used storage type. Its designed to
offer maximum query performance to the users. the data and aggregations are
stored in a multidimensional format, compressed and optimized for performance.
This is both good and bad. When a cube with MOLAP storage is processed,
the data is pulled from the relational database, the aggregations are performed,
and the data is stored in the AS database. The data inside the cube will
refresh only when the cube is processed, so latency is high.
Advantages:
- Since the data is stored on the OLAP server in optimized
format, queries (even complex calculations) are faster than ROLAP.
- The data is compressed so it takes up less space.
- And because the data is stored on the OLAP server, you don’t need
to keep the connection to the relational database.
- Cube browsing is fastest using MOLAP.
- What is ROLAP and its advantage?
ROLAP (Relational Online Analytical
Processing) : ROLAP does not have the high latency disadvantage of MOLAP. With
ROLAP, the data and aggregations are stored in relational format. This means
that there will be zero latency between the relational source database and the
cube.
Disadvantage of this mode is the
performance, this type gives the poorest query performance because no objects
benefit from multi dimensional storage.
Advantages:
- Since the data is kept in the relational database instead of on the
OLAP server, you can view the data in almost real time.
- Also, since the data is kept in the relational database, it allows
for much larger amounts of data, which can mean better scalability.
- Low latency.
- What is HOLAP and its
advantage?
Hybrid Online Analytical Processing
(HOLAP): HOLAP is a combination of MOLAP and ROLAP. HOLAP stores the detail
data in the relational database but stores the aggregations in multidimensional
format. Because of this, the aggregations will need to be processed when
changes are occur. With HOLAP you kind of have medium query performance:
not as slow as ROLAP, but not as fast as MOLAP. If, however, you were only
querying aggregated data or using a cached query, query performance would be
similar to MOLAP. But when you need to get that detail data, performance is
closer to ROLAP.
Advantages:
- HOLAP is best used when large amounts of aggregations are queried
often with little detail data, offering high performance and lower storage
requirements.
- Cubes are smaller than MOLAP since the detail data is kept in the
relational database.
- Processing time is less than MOLAP since only aggregations are
stored in multidimensional format.
- Low latency since processing takes place when changes occur and
detail data is kept in the relational database.
- What
are Translations and its use?
Translation: The translation
feature in analysis service allows you to display caption and attributes names
that correspond to a specific language. It helps in providing GLOBALIZATION to
the Cube.
- What
is Database dimension?
All the dimensions that are created
using NEW DIMENSION Wizard are database dimensions. In other words, the
dimensions which are at Database level are called Database Dimensions.
- What is Cube dimension?
A cube dimension is an instance of a
database dimension within a cube is called as cube dimension. A database
dimension can be used in multiple cubes, and multiple cube dimensions can be
based on a single database dimension
- Difference
between Database dimension and Cube dimension?
- The Database dimension has only Name and ID properties, whereas a
Cube dimension has several more properties.
- Database dimension is created one where as Cube dimension is
referenced from database dimension.
- Database dimension exists only once.where as Cube dimensions can be
created more than one using ROLE PLAYING Dimensions concept.
- How will you add a dimension to cube?
To add a dimension to a cube follow
these steps.
- In Solution Explorer, right-click the cube, and then
click View Designer.
- In the Design tab for the cube, click the Dimension Usage
tab.
- Either click the Add Cube Dimension button, or right-click
anywhere on the work surface and then click Add Cube Dimension.
- In the Add Cube Dimension dialog box, use one of the
following steps:
- To add an existing dimension, select the dimension, and then click
OK.
- To create a new dimension to add to the cube, click New dimension,
and then follow the steps in the Dimension Wizard.
- What
is SCD (slowly changing dimension)?
Slowly changing dimensions (SCD)
determine how the historical changes in the dimension tables are handled.
Implementing the SCD mechanism enables users to know to which category an item
belonged to in any given date.
- What
are types of SCD?
It is a concept of STORING Historical
Changes and when ever an IT guy finds a new way to store then a new Type will
come into picture. Basically there are 3 types of SCD they are given below
- SCD type1
- SCD type2
- SCD type3
- What
is Type1, Type2, Type3 of SCD?
Type 1: In Type 1 Slowly
Changing Dimension, the new information simply overwrites the original
information. In other words, no history is kept.
In our example, recall we originally
have the following table:
Customer Key
|
Name
|
State
|
1001
|
Christina
|
Illinois
|
After Christina moved from Illinois to
California, the new information replaces the new record, and we have the
following table:
Customer Key
|
Name
|
State
|
1001
|
Christina
|
California
|
Advantages: This is the easiest
way to handle the Slowly Changing Dimension problem, since there is no need to
keep track of the old information.
Disadvantages: All history is lost. By
applying this methodology, it is not possible to trace back in history.
Usage: About 50% of the time.
When to use Type 1: Type 1 slowly
changing dimension should be used when it is not necessary \
for the data warehouse to keep track of
historical changes.
Type 2: In Type 2 Slowly
Changing Dimension, a new record is added to the table to represent the new
information. Therefore, both the original and the new record will be present.
The new record gets its own primary key.
In our example, recall we originally
have the following table:
Customer Key
|
Name
|
State
|
1001
|
Christina
|
Illinois
|
After Christina moved from Illinois to
California, we add the new information as a new row into the table:
Customer Key
|
Name
|
State
|
1001
|
Christina
|
Illinois
|
1005
|
Christina
|
California
|
Advantages: This allows us to
accurately keep all historical information.
Disadvantages:
- This will cause the size of the table to grow fast. In cases where
the number of rows for the table is very high to start with, storage and
performance can become a concern.
- This necessarily complicates the ETL process.
Usage: About 50% of the time.
Type3 : In Type 3 Slowly
Changing Dimension, there will be two columns to indicate the particular attribute
of interest, one indicating the original value, and one indicating the current
value. There will also be a column that indicates when the current value
becomes active.
In our example, recall we originally
have the following table:
Customer Key
|
Name
|
State
|
1001
|
Christina
|
Illinois
|
To accommodate Type 3 Slowly Changing
Dimension, we will now have the following columns:
Customer
Key,Name,OriginalState,CurrentState,Effective Date
After Christina moved from Illinois to
California, the original information gets updated, and we have the following
table (assuming the effective date of change is January 15, 2003):
Customer Key
|
Name
|
OriginalState
|
CurrentState
|
Effective Date
|
1001
|
Christina
|
Illinois
|
California
|
15-JAN-2003
|
Advantages:
- This does not increase the size of the table, since new information
is updated.
- This allows us to keep some part of history.
Disadvantages: Type 3 will not be
able to keep all history where an attribute is changed more than once. For
example, if Christina later moves to Texas on December 15, 2003, the California
information will be lost.
Usage: Type 3 is rarely used in actual
practice.
- What is role playing dimension with two
examples?
Role play dimensions: We already discussed about this. This is nothing but CONFIRMED Dimensions.
A dimension can play different role in a fact table you
can recognize a roleplay dimension when there are multiple columns in
a fact table that each have foreign keys to the same dimension table.
Ex1: There are three dimension keys in
the factinternalsales,factresellersales tables which all refer to the dimtime
table,the same time dimension is used to track sales by that contain
either of these fact table,the corresponding role-playing dimension are
automatically added to the cube.
Ex2 : In retail banking, for
checking account cube we could have transaction date dimension and effective
date dimension. Both dimensions have date, month, quarter and year attributes.
The formats of attributes are the same on both dimensions, for example the date
attribute is in ‘dd-mm-yyyy’ format. Both dimensions have members from 1993 to
2010.
- What is measure group, measure?
Measure groups : These measure groups
can contain different dimensions and be at different granularity but
so long as you model your cube correctly, your users will be able to use
measures from each of these measure groups in their queries easily and without
worrying about the underlying complexity.
Creating multiple measure groups : To
create a new measure group in the Cube Editor, go to the Cube Structure tab and
right-click on the cube name in the Measures pane and select ‘New Measure
Group’. You’ll then need to select the fact table to create the measure group
from and then the new measure group will be created; any columns that aren’t
used as foreign key columns in the DSV will automatically be created as
measures, and you’ll also get an extra measure of aggregation type Count. It’s
a good idea to delete any measures you are not going to use at this stage.
Measures : Measures are the
numeric values that our users want to aggregate, slice, dice and otherwise
analyze, and as a result, it’s important to make sure they behave the way we
want them to. One of the fundamental reasons for using Analysis Services is
that, unlike a relational database it allows us to build into our cube design
business rules about measures: how they should be formatted, how they should
aggregate up, how they interact with specific dimensions and so on.
- What is attribute?
An attribute is a specification that defines
a property of an object, element, or file. It may also refer to or set the
specific value for a given instance of such.
- What is surrogate key?
A surrogate key is the SQL generated
key which acts like an alternate primary key for the table in database, Data
warehouses commonly use a surrogate key to uniquely identify an entity. A
surrogate is not generated by the user but by the system. A primary difference
between a primary key and surrogate key in few databases is that primarykey
uniquely identifies a record while a Surrogatekey uniquely identifies an
entity.
Ex: An employee may be recruited
before the year 2000 while another employee with the same name may be recruited
after the year 2000. Here, the primary key will uniquely identify the record
while the surrogate key will be generated by the system (say a serial number)
since the SK is NOT derived from the data.
- How
many types of relations are there between dimension and measure group?
They are six relation between the
dimension and measure group, they are
- No Relationship
- Regular
- Refernce
- Many to Many
- Data Mining
- Fact
- What is regular type, no relation type,
fact type, referenced type, many-to-many type with example?
No relationship: The dimension
and measure group are not related.
Regular: The dimension
table is joined directly to the fact table.
Referenced: The dimension
table is joined to an intermediate table, which in turn,is joined to the fact
table.
Many to many:The dimension table
is to an intermediate fact table,the intermediate fact table is joined , in
turn, to an intermediate dimension table to which the fact table is joined.
Data mining:The target dimension
is based on a mining model built from the source dimension. The source
dimension must also be included in the cube.
Fact table: The dimension table
is the fact table.
- What are calculated members and what is
its use?
Calculations are item in the cube that
are eveluated at runtime
Calculated members: You can create
customized measures or dimension members, called calculated members, by
combining cube data, arithmetic operators, numbers, and/or functions.
Example: You can create a calculated
member called Marks that converts dollars to marks by multiplying an existing
dollar measure by a conversion rate. Marks can then be displayed to end users
in a separate row or column. Calculated member definitions are stored, but
their values exist only in memory. In the preceding example, values in marks
are displayed to end users but are not stored as cube data.
- What are KPIs and what is its use?
In Analysis Services, a KPI is a
collection of calculations that are associated with a measure group in a cube
that are used to evaluate business success. We use KPI to see the business
at the particular point, this is represents with some graphical items such as
traffic signals,ganze etc
- What
are actions, how many types of actions are there, explain with example?
Actions are powerful way of extending
the value of SSAS cubes for the end user. They can click on a cube or
portion of a cube to start an application with the selected item as a
parameter, or to retrieve information about the selected item.
One of the objects supported by a SQL
Server Analysis Services cube is the action. An action is an event that a user
can initiate when accessing cube data. The event can take a number of forms.
For example, a user might be able to view a Reporting Services report, open a
Web page, or drill through to detailed information related to the cube data
Analysis Services supports three types
of actions..
Report action: Report action Returns
a Reporting Services report that is associated with the cube data on which the
action is based.
Drill through: Drillthrough Returns
a result set that provides detailed information related to the cube data on
which the action is based.
Standard: Standard has five
action subtypes that are based on the specified cube data.
Dataset: Returns a
mutlidimensional dataset.
Proprietary: Returns a
string that can be interpreted by a client application.
Rowset: Returns a
tabular rowset.
Statement: Returns a
command string that can be run by a client application.
URL: Returns a URL that can be
opened by a client application, usually a browser.
- What
is partition, how will you implement it?
You can use the Partition Wizard to
define partitions for a measure group in a cube. By default, a single partition
is defined for each measure group in a cube. Access and processing performance,
however, can degrade for large partitions. By creating multiple partitions,
each containing a portion of the data for a measure group, you can improve the
access and processing performance for that measure group.
- What
is the minimum and maximum number of partitions required for a measure
group?
In 2005 a MAX of 2000 partitions can be
created per measure group and that limit is lifted in later versions.
In any version the MINIMUM is ONE
Partition per measure group.
- What
are Aggregations and its use?
Aggregations provide performance
improvements by allowing Microsoft SQL Server Analysis Services (SSAS) to retrieve
pre-calculated totals directly from cube storage instead of having to
recalculate data from an underlying data source for each query. To design
these aggregations, you can use the Aggregation Design Wizard. This wizard
guides you through the following steps:
- Selecting standard or custom settings for the storage and caching
options of a partition, measure group, or cube.
- Providing estimated or actual counts for objects referenced by the
partition, measure group, or cube.
- Specifying aggregation options and limits to optimize the storage
and query performance delivered by designed aggregations.
- Saving and optionally processing the partition, measure group, or
cube to generate the defined aggregations.
- After you use the Aggregation Design Wizard, you can use the
Usage-Based Optimization Wizard to design aggregations based on the usage
patterns of the business users and client applications that query the
cube.
- What is perspective, have you ever
created perspective?
Perspectives are a way to reduce the
complexity of cubes by hidden elements like measure groups, measures,
dimensions, hierarchies etc. It’s nothing but slicing of a cube, for ex we
are having retail and hospital data and end user is subscribed to see only
hospital data, then we can create perspective according to it.
- What is deploy, process and build?
Bulid: Verifies the project files and
create several local files.
Deploy: Deploy the structure
of the cube(Skeleton) to the server.
Process: Read the data
from the source and build the dimesions and cube structures
Elaborating the same is given
below.
Build: Its is a used to
process the data of the cube database. Build is a version of a program. As a rule, a build is a pre-release version and as such is identified
by a build number, rather than by a release number. Reiterative (repeated)
builds are an important part of the development process. Throughout
development, application components are collected and repeatedly compiled for testing purposes, to ensure a reliable final product. Build
tools, such as make or Ant, enable developers to automate some programming tasks. As a verb, to
build can mean either to write code or to put individual coded components of a program together.
Deployment: During development of
an Analysis Services project in Business Intelligence Development Studio, you
frequently deploy the project to a development server in order to create the
Analysis Services database defined by the project. This is required to test the
project.
for example, to browse cells in the
cube, browse dimension members, or verify key performance indicators (KPIs) formulas.
- What is the maximum size of a dimension?
The maximum size of
the dimension is 4 gb.
- What are the types of processing and
explain each?
They are 6 types of processing in ssas
,they are
- Process Full
- Process Data
- Process Index
- Process Incremental
- Process Structure
- UnProcess
Process Full: Processes an Analysis Services object and all the objects that it
contains. When Process Full is executed against an object that has already been
processed, Analysis Services drops all data in the object, and then processes
the object. This kind of processing is required when a structural change has
been made to an object, for example, when an attribute hierarchy is added,
deleted, or renamed. This processing option is supported for cubes, databases,
dimensions, measure groups, mining models, mining structures, and partitions.
Process Data: Processes data only
without building aggregations or indexes. If there is data is in the
partitions, it will be dropped before re-populating the partition with source
data. This processing option is supported for dimensions, cubes, measure
groups, and partitions.
Process Index: Creates or rebuilds
indexes and aggregations for all processed partitions. This option causes an
error on unprocessed objects. This processing option is supported for cubes,
dimensions, measure groups, and partitions.
Process Increment: Adds newly available
fact data and process only to the relevant partitions. This processing option
is supported for measure groups, and partitions.
Process Structure: If the cube is unprocessed, Analysis Services will process, if it is
necessary, all the cube’s dimensions. After that, Analysis Services will create
only cube definitions. If this option is applied to a mining structure, it
populates the mining structure with source data. The difference between this
option and the Process Full option is that this option does not iterate the
processing down to the mining models themselves. This processing option is
supported for cubes and mining structures.
Unprocess : Drops the data in the
object specified and any lower-level constituent objects. After the data is
dropped, it is not reloaded. This processing option is supported for cubes,
databases, dimensions, measure groups, mining models, mining structures, and
partitions.
Process Default: Detects the process
state of an object, and performs processing necessary to deliver unprocessed or
partially processed objects to a fully processed state. This processing option
is supported for cubes, databases, dimensions, measure groups, mining models,
mining structures, and partitions.
- What
is a cube?
The basic unit of
storage and analysis in Analysis Services is the cube. A cube is a collection of data that’s
been aggregated to allow queries to return data quickly.
For example, a cube of order data might
be aggregated by time period and by title, making the cube fast when you ask
questions concerning orders by week or orders by title.
- What
is AMO?
The full form of AMO is Analysis
Managament Objects. This is used to create or alter cubes from .NET code.
- After
creating the cube, if we added a new column to the OLTP table then
how you add this new attribute to the cube?
Just open the datasourceview and on
right click we find the option REFRESH. Click the REFRESH then it will add new
attributes to the table which can be added to Cube.
EXPERIENCED INTERVIEW QUESTIONS -
- What
is the size of the Cube in your last Project?
Answer to this question varies from
project to project and mainly depends on how BIG is your database and how
COMPLEX the database design is. Generally for the database with a TRANSACTION
TABLE of 50 crore records, the cube size will be around 100GB. So, better go
with 100GB as answer to this question.
- What
is size of the database in your last Project?
You can expect this
question immediately after you answer 100GB to the last question. The database
size will be 600 to 800GB for which the cube will come to 100 GB. So go with
800GB for this question.
- What
is size of the fact(Transaction) table in your last Project?
This will be the next question if you answer 800GB
as your dataabase size. Here he is not expecting SIZE in GBs but the
interviewer will be expecting NUMBER OF ROWS in the Transaction table. Go with
57Crore records for this question.
- How
frequently you process the cube?
You have to be very careful here. Frequency of
processing cube depends on HOW FREQUENTLY YOU ARE GETTING NEW DATA. Once the
new data comes then SSIS team loads it and send a mail to SSAS team after load
is completed successfully. Once SSAS team receives the mail then these guys
will look for best time to PROCESS.
Typically we get data either Weekly or Monthly. So
you can say that the processing of the cube will be done either Weekly or
monthly.
- How
frequently you get DATA from clients?
This answer should be based on your last answer. IF
you answered WEEKLY to last question then the Answer to this question also
should be WEEKLY. IF MONTHLY for last question then this answer also should be
MONTHLY.
- What
type of Processing Options you used to process the cube in your Project?
This is the toughest question to answer. This
depends on DATA you have and CLIENTS requirements. Let me explain here.
- If the database is SMALL, let’s say it has only 1 crore records then
people do FULL PROCESS as it wont take much time.
- If the database is MEDIUM, let’s say it has only 15 crore records
then people prefer to do INCREMENTAL PROCESS unless CLIENTS ask us to do
FULL PROCESS as it takes little bit of time.
- If the database is HUGE, let’s say it has more than 35 to 40 crore
records then people prefer to do INCREMENTAL PROCESS unless CLIENTS ask us
to do FULL PROCESS as it takes lot of time. In this case we TRY to
convince clients for INCREMENTAL and if they don’t agree then we don’t
have any other option.
- Incremental process will come into picture ONLY when there is no
updates to the OLD data i.e no changes to already existing data else NO
OTHER OPTION than FULL PROCESS.
- How
you provide security to cube?
By defining roles we provide security to cubes.
Using roles we can restrict users from accessing restricted data. Procedure as
follows -
- Define Role
- Set Permission
- Add appropriate Users to the role
- How
you move the cube from one server to another?
There are many ways to do the same. Let me explain
four here and cleverly you can say “I worked on 4 SSAS projects till date and
implemented different types in all the four.”
- Backup and restore – This is the simplest way. Take the Backup from
development server and copy the backup to FTP folder of clients. After
doing this drop a mail to Client’s Admin and he will take care of RESTORE
part.
- Directly PROCESS the cube in PRODUCTION environment. For this you
need access to Production which will not be given by clients unless the
clients are *********. One of the client I worked for given FULL access to
me ..
- Under Srart –> All Programs –> Sql Server –> Analysis
Services you can see deployment wizard. This is one way of moving the
cube. This method has some steps to follow. First deploy your cube and
FOUR files will be created in BIN folder of PROJECT folder. Copy those
FOUR files and paste in Production server in any directory. Then OPEN this
DEPLOYMENT Wizard in production and when it ask for Database file then
point to the location where you copied the files. After that
NEXT,NEXT,NEXT … OK .. Cube will be deployed and processed.
- This way is most beautiful one. Synchronization, In this we will
first deploy and process the cube in STAGING ENVIRONMENT and then we will
go to production server. Connect to Analysis services in SSMS and select
Synchronize by right clicking on Databases folder in SSMS of analysis
services\
- Then select source as STAGING SERVER and then click on OK. The
changes in the cube present in the Staging server will be copied to the
production server.
- What
is the toughest challenge you face in your Project?
There are couple of this where we face difficulty.
- While working on RELATIONSHIPS between Measure Groups and
Dimensions.
- Working on Complex calculations
- Performance tuning
- How
you created Partitions of the cube in your Last Project?
Partitions can be created on different data. Few people do it on PRODUCT NAME wise and many prefer to do it on DATE data wise. you go with DATE wise.
In dates, we can
create MONTH wise,WEEK wise,QUARTER wise and some times YEAR wise. This all
depends on how much data you are coming per WEEK or MONTH or QUARTER … If you
are getting 50 lakhs records per month then tell you do MONTH wise.
- How
many dimensions in your last cube?
47 to 50.
- How
many measure groups in your last cube?
Total 10 and in that 4 are Fact tables and
remaining 6 are Fact less fact tables.
- What
is the Schema of your last cube?
Snowflake
- Why
not STAR Schema ?
My
data base design doesn’t support STAR Schema.
- What
are the different relationships that you are used in your cube?
- Regular
- Referenced
- Many to Many
- Fact
- No Relationship
- Have
you created the KPI’s , If then Explain?
Don’t add much to this as the questions in this
will be tricky. Just tell that you worked on couple of KPI and you have basic
knowledge on this. (Don’t worry, this is not MANDATORY)
- How
you define Aggregations in your Project?
We defined the aggregations for MOST FREQUENTLY
USED data in SSRS reports.
- Size
of SSAS team in your last Project?
Just
2 guys as we guys are really in demand and lot of scarcity:)
- How
many Resources worked on same Cube in your Project?
Only 2 and one in morning shift and another in
Evening shift.
- How
much time it take to Process the Cube?
This is Very very important question. This again
depends on the SIZE of database,Complexity of the database and your server
settings. For database with 50 cr transaction records, it generally takes 3.5
hrs.
- How
many Calculation you done in Your Project?
I answer more than 5000 and if you tell the same
then you are caught unless you are super good in MDX. Best answer for you is
“Worked on 50 calculations”.
What is Analysis
Services? List out the features?
Microsoft SQL Server 2005 Analysis Services (SSAS) delivers online analytical processing (OLAP) and data mining functionality for business intelligence applications. Analysis Services supports OLAP by letting we design, create, and manage multidimensional structures that contain data aggregated from other data sources, such as relational databases. For data mining applications, Analysis Services lets we design, create, and visualize data mining models that are constructed from other data sources by using a wide variety of industry-standard data mining algorithms.
Analysis Services is a middle tier server for analytical processing, OLAP, and Data mining. It manages multidimensional cubes of data and provides access to heaps of information including aggregation of data. One can create data mining models from data sources and use it for Business Intelligence also including reporting features.
Analysis service provides a combined view of the data used in OLAP or Data mining. Services here refer to OLAP, Data mining. Analysis services assists in creating, designing and managing multidimensional structures containing data from varied sources. It provides a wide array of data mining algorithms for specific trends and needs.
Some of the key features are:
Ease of use with a lot of wizards and designers.
Flexible data model creation and management
Scalable architecture to handle OLAP
Provides integration of administration tools, data sources, security, caching, and reporting etc.
Provides extensive support for custom applications
What is UDM? Its significance in SSAS?
The role of a Unified Dimensional Model (UDM) is to provide a bridge between the user and the data sources. A UDM is constructed over one or more physical data sources, and then the end user issues queries against the UDM using one of a variety of client tools, such as Microsoft Excel. At a minimum, when the UDM is constructed merely as a thin layer over the data source, the advantages to the end user are a simpler, more readily understood model of the data, isolation from heterogeneous backend data sources, and improved performance for summary type queries. In some scenarios a simple UDM like this is constructed totally automatically. With greater investment in the construction of the UDM, additional benefits accrue from the richness of metadata that the model can provide.
The UDM provides the following benefits:
Allows the user model to be greatly enriched.
Provides high performance queries supporting interactive analysis, even over huge data volumes.
Allows business rules to be captured in the model to support richer analysis.
What is the need for SSAS component?
Analysis Services is the only component in SQL Server using which we can perform Analysis and Forecast operations.
SSAS is very easy to use and interactive.
Faster Analysis and Troubleshooting.
Ability to create and manage Data warehouses.
Apply efficient Security Principles.
Explain the TWO-Tier Architecture of SSAS?
SSAS uses both server and client components to supply OLAP and data mining functionality BI Applications.
The server component is implemented as a Microsoft Windows service. Each instance of Analysis Services implemented as a separate instance of the Windows service.
Clients communicate with Analysis Services using the standard the XMLA (XML For Analysis) , protocol for issuing commands and receiving responses, exposed as a web service.
What are the components of SSAS?
An OLAP Engine is used for enabling fast ad hoc queries by end users. A user can interactively explore data by drilling, slicing or pivoting.
Drilling refers to the process of exploring details of the data.
Slicing refers to the process of placing data in rows and columns.
Pivoting refers to switching categories of data between rows and columns.
In OLAP, we will be using what are called as Dimensional Databases.
What is FASMI ?
A database is called a OLAP Database if the database satisfies the FASMI rules :
Fast Analysis– is defined in the OLAP scenario in five seconds or less.
Shared – Must support access to data by many users in the factors of Sensitivity and Write Backs.
Multidimensional – The data inside the OLAP Database must be multidimensional in structure.
Information – The OLAP database Must support large volumes of data..
What languages are used in SSAS ?
Structured Query Language (SQL)
Multidimensional Expressions (MDX) - an industry standard query language orientated towards analysis
Data Mining Extensions (DMX) - an industry standard query language oriented toward data mining.
Analysis Services Scripting Language (ASSL) - used to manage Analysis Services database objects.
How Cubes are implemented in SSAS ?
Cubes are multidimensional models that store data from one or more sources.
Cubes can also store aggregations
SSAS Cubes are created using the Cube Wizard.
We also build Dimensions when creating Cubes.
Cubes can see only the DSV( logical View).
What is the difference between a derived measure and a calculated measure?
The difference between a derived measure and a calculated measure is when the calculation is performed. A derived measure is calculated before aggregations are created, and the values of the derived measure are stored in the cube. A calculated measure is calculated after aggregations are created, and the values of a calculated measure aren’t stored in the cube. The primary criterion for choosing between a derived measure and a calculated measure is not efficiency, but accuracy.
What is a partition?
A partition in Analysis Services is the physical location of stored cube data. Every cube has at least one partition by default. Each time we create a measure group, another partition is created. Queries run faster against a partitioned cube because Analysis Services only needs to read data from the partitions that contain the answers to the queries. Queries run even faster when partition also stores aggregations, the pre calculated totals for additive measures. Partitions are a powerful and flexible means of managing cubes, especially large cubes.
While creating a new calculated member in a cube what is the use of property called non-empty behavior?
Nonempty behavior is important property for ratio calculations. If the denominator Is empty, an MDX expression will return an error just as it would if the denominator Were equal to zero. By selecting one or more measures for the Non-Empty Behavior property, we are establishing a requirement that each selected measure first be evaluated before the calculation expression is evaluated. If each selected measure is empty, then The expression is also treated as empty and no error is returned.
What is a RAGGED hierarchy?
Under normal circumstances, each level in a hierarchy in Microsoft SQL Server 2005 Analysis Services (SSAS) has the same number of members above it as any other member at the same level. In a ragged hierarchy, the logical parent member of at least one member is not in the level immediately above the member. When this occurs, the hierarchy descends to different levels for different drilldown paths. Expanding through every level for every drilldown path is then unnecessarily complicated.
What are the roles of an Analysis Services Information Worker?
The role of an Analysis Services information worker is the traditional "domain expert" role in business intelligence (BI) someone who understands the data employed by a solution and is able to translate the data into business information. The role of an Analysis Services information worker often has one of the following job titles: Business Analyst (Report Consumer), Manager (Report Consumer), Technical Trainer, Help Desk/Operation, or Network Administrator.
What are the different ways of creating Aggregations?
We can create aggregations for faster MDX statements using Aggregation Wizard or thru UBO – Usage Based Optimizations. Always, prefer UBO method in realtime performance troubleshooting.
What is WriteBack? What are the pre-conditions?
The Enable/Disable Writeback dialog box enables or disables writeback for a measure group in a cube. Enabling writeback on a measure group defines a writeback partition and creates a writeback table for that measure group. Disabling writeback on a measure group removes the writeback partition but does not delete the writeback table, to avoid unanticipated data loss.
What is processing?
Processing is a critical and resource intensive operation in the data warehouse lifecycle and needs to be carefully optimized and executed. Analysis Services 2005 offers a high performance and scalable processing architecture with a comprehensive set of controls for database administrators.
We can process an OLAP database, individual cube, Dimension or a specific Partition in a cube.
Name few Business Analysis Enhancements for SSAS?
The following table lists the business intelligence enhancements that are available in Microsoft SQL Server 2005 Analysis Services (SSAS). The table also shows the cube or dimension to which each business intelligence enhancement applies, and indicates whether an enhancement can be applied to an object that was created without using a data source and for which no schema has been generated.
Enhancement
Type
Applied to
No data source
Time Intelligence
Cube
Cube
No
Account Intelligence
Dimension
Dimension or cube
No
Dimension Intelligence
Dimension
Dimension or cube
Yes
Custom Aggregation
Dimension
Dimension (unary operator) or cube
No
Semiadditive Behavior
Cube
Cube
Yes>
Custom Member Formula
Dimension
Dimension or cube
No
Custom Sorting and Uniqueness Settings
Dimension
Dimension or cube
Yes
Dimension Writeback
Dimension
Dimension or cube
Yes
What MDX functions do you most commonly use?
This is a great question because you only know this answer by experience. If you ask me this question, the answer practically rushes out of me. “CrossJoin, Descendants, and NonEmpty, in addition to Sum, Count, and Aggregate. My personal favorite is CrossJoin because it allows me identify non-contiguous slices of the cube and aggregate even though those cube cells don’t roll up to a natural ancestor.” Indeed, CrossJoin has easily been my bread and butter.
Where do you put calculated members?
The reflexive answer is “in the Measures dimension” but this is the obvious answer. So I always follow up with another question. “If you want to create a calculated member that intersects all measures, where do you put it?” A high percentage of candidates can’t answer this question, and the answer is “In a dimension other than Measures.” If they can answer it, I immediately ask them why. The answer is “Because a member in a dimension cannot intersect its own relatives in that dimension.”
How do I find the bottom 10 customers with the lowest sales in 2003 that were not null?
Simply using bottomcount will return customers with null sales. You will have to combine it with NONEMPTY or FILTER.
SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS ,
BOTTOMCOUNT(
NONEMPTY(DESCENDANTS( [Customer].[Customer Geography].[All Customers]
, [Customer].[Customer Geography].[Customer] )
, ( [Measures].[Internet Sales Amount] ) )
, 10
, ( [Measures].[Internet Sales Amount] )
) ON ROWS
FROM [Adventure Works]
WHERE ( [Date].[Calendar].[Calendar Year].&[2003] ) ;
How in MDX query can I get top 3 sales years based on order quantity?
By default Analysis Services returns members in an order specified during attribute design. Attribute properties that define ordering are "OrderBy" and "OrderByAttribute". Lets say we want to see order counts for each year. In Adventure Works MDX query would be:
SELECT {[Measures].[Reseller Order Quantity]} ON 0
, [Date].[Calendar].[Calendar Year].Members ON 1
FROM [Adventure Works];
Same query using TopCount:
SELECT
{[Measures].[Reseller Order Quantity]} ON 0,
TopCount([Date].[Calendar].[Calendar Year].Members,3, [Measures].[Reseller Order Quantity]) ON 1
FROM [Adventure Works];
How do you extract first tuple from the set?
Use could usefunction Set.Item(0)
Example:
SELECT {{[Date].[Calendar].[Calendar Year].Members
}.Item(0)}
ON 0
FROM [Adventure Works]
How can I setup default dimension member in Calculation script?
You can use ALTER CUBE statement. Syntax:
ALTER CUBE CurrentCube | YourCubeName UPDATE DIMENSION , DEFAULT_MEMBER='';
SSAS - SQL Server
Analysis Services
Q:
What is Analysis Services? List out the features?
Microsoft SQL Server 2005 Analysis Services (SSAS) delivers online analytical processing (OLAP) and data mining functionality for business intelligence applications. Analysis Services supports OLAP by letting we design, create, and manage multidimensional structures that contain data aggregated from other data sources, such as relational databases. For data mining applications, Analysis Services lets we design, create, and visualize data mining models that are constructed from other data sources by using a wide variety of industry-standard
data mining algorithms.
Analysis Services is a middle tier server for analytical processing, OLAP, and Data mining. It manages multidimensional cubes of data and provides access to heaps of information including aggregation of data. One can create data mining models from data sources and use it for Business Intelligence also including reporting features.
Analysis service provides a combined view of the data used in OLAP or Data mining. Services here refer to OLAP, Data mining. Analysis services assists in creating, designing and managing multidimensional structures containing data from varied sources. It provides a wide array of data mining algorithms for specific trends and needs.
Microsoft SQL Server 2005 Analysis Services (SSAS) delivers online analytical processing (OLAP) and data mining functionality for business intelligence applications. Analysis Services supports OLAP by letting we design, create, and manage multidimensional structures that contain data aggregated from other data sources, such as relational databases. For data mining applications, Analysis Services lets we design, create, and visualize data mining models that are constructed from other data sources by using a wide variety of industry-standard
data mining algorithms.
Analysis Services is a middle tier server for analytical processing, OLAP, and Data mining. It manages multidimensional cubes of data and provides access to heaps of information including aggregation of data. One can create data mining models from data sources and use it for Business Intelligence also including reporting features.
Analysis service provides a combined view of the data used in OLAP or Data mining. Services here refer to OLAP, Data mining. Analysis services assists in creating, designing and managing multidimensional structures containing data from varied sources. It provides a wide array of data mining algorithms for specific trends and needs.
Some
of the key features are:
§ Ease of use with a lot of wizards and designers.
§ Flexible data model creation and management
§ Scalable architecture to handle OLAP
§ Provides integration of administration tools, data
sources, security, caching, and reporting etc.
§ Provides extensive support for custom applications
Q:
What is UDM? Its significance in SSAS?
The role of a Unified Dimensional Model (UDM) is to provide a bridge between the user and the data sources. A UDM is constructed over one or more physical data sources, and then the end user issues queries against the UDM using one of a variety of client tools, such as Microsoft Excel. At a minimum, when the UDM is constructed merely as a thin layer over the data source, the advantages to the end user are a simpler, more readily understood model of the data, isolation from heterogeneous backend data sources, and improved performance for summary type queries. In some scenarios a simple UDM like this is constructed totally automatically. With greater investment in the construction of the UDM, additional benefits accrue from the richness of metadata that the model can provide.
The role of a Unified Dimensional Model (UDM) is to provide a bridge between the user and the data sources. A UDM is constructed over one or more physical data sources, and then the end user issues queries against the UDM using one of a variety of client tools, such as Microsoft Excel. At a minimum, when the UDM is constructed merely as a thin layer over the data source, the advantages to the end user are a simpler, more readily understood model of the data, isolation from heterogeneous backend data sources, and improved performance for summary type queries. In some scenarios a simple UDM like this is constructed totally automatically. With greater investment in the construction of the UDM, additional benefits accrue from the richness of metadata that the model can provide.
The
UDM provides the following benefits:
• Allows the user model to be greatly enriched.
• Provides high performance queries supporting interactive analysis, even over huge data volumes.
• Allows business rules to be captured in the model to support richer analysis.
• Allows the user model to be greatly enriched.
• Provides high performance queries supporting interactive analysis, even over huge data volumes.
• Allows business rules to be captured in the model to support richer analysis.
Q: What is the need for SSAS component?
§ Analysis Services is the only component in SQL
Server using which we can perform Analysis and Forecast operations.
§ SSAS is very easy to use and interactive.
§ Faster Analysis and Troubleshooting.
§ Ability to create and manage Data warehouses.
§ Apply efficient Security Principles.
Q:
Explain the TWO-Tier Architecture of SSAS?
§ SSAS uses both server and client components to
supply OLAP and data mining functionality BI Applications.
§ The server component is implemented as a Microsoft
Windows service. Each instance of Analysis Services implemented as a separate
instance of the Windows service.
§ Clients communicate with Analysis Services using
the standard the XMLA (XML For Analysis) , protocol for issuing commands and
receiving responses, exposed as a web service.
Q:
What are the components of SSAS?
§ An OLAP Engine is used for enabling fast ad
hoc queries by end users. A user can interactively explore data by
drilling, slicing or pivoting.
§ Drilling refers to the process of exploring details
of the data.
§ Slicing refers to the process of placing data in
rows and columns.
§ Pivoting refers to switching categories of data
between rows and columns.
§ In OLAP, we will be using what are called as
Dimensional Databases.
Q:What
is FASMI ?
A database is called a OLAP Database if the database satisfies the FASMI rules :
A database is called a OLAP Database if the database satisfies the FASMI rules :
§ Fast Analysis– is defined in the OLAP scenario in five seconds or
less.
§ Shared – Must support access to data by many users
in the factors of Sensitivity and Write Backs.
§ Multidimensional – The data inside the OLAP Database must be
multidimensional in structure.
§ Information – The OLAP database Must support large volumes of
data..
Q:
What languages are used in SSAS ?
§ Structured Query Language (SQL)
§ Multidimensional Expressions (MDX) - an industry
standard query language orientated towards analysis
§ Data Mining Extensions (DMX) - an industry standard
query language oriented toward data mining.
§ Analysis Services Scripting Language (ASSL) - used
to manage Analysis Services database objects.
Q:
How Cubes are implemented in SSAS ?
§ Cubes are multidimensional models that store data
from one or more sources.
§ Cubes can also store aggregations
§ SSAS Cubes are created using the Cube Wizard.
§ We also build Dimensions when creating Cubes.
§ Cubes can see only the DSV( logical View).
Q:What
is the difference between a derived measure and a calculated measure?
The difference between a derived measure and a calculated measure is when the calculation is performed. A derived measure is calculated before aggregations are created, and the values of the derived measure are stored in the cube. A calculated measure is calculated after aggregations are created, and the values of a calculated measure aren’t stored in the cube. The primary criterion for choosing between a derived measure and a calculated measure is not efficiency, but accuracy.
The difference between a derived measure and a calculated measure is when the calculation is performed. A derived measure is calculated before aggregations are created, and the values of the derived measure are stored in the cube. A calculated measure is calculated after aggregations are created, and the values of a calculated measure aren’t stored in the cube. The primary criterion for choosing between a derived measure and a calculated measure is not efficiency, but accuracy.
Q:
What is a partition?
A partition in Analysis Services is the physical location of stored cube data. Every cube has at least one partition by default. Each time we create a measure group, another partition is created. Queries run faster against a partitioned cube because Analysis Services only needs to read data from the partitions that contain the answers to the queries. Queries run even faster when partition also stores aggregations, the pre calculated totals for additive measures. Partitions are a powerful and flexible means of managing cubes, especially large cubes.
A partition in Analysis Services is the physical location of stored cube data. Every cube has at least one partition by default. Each time we create a measure group, another partition is created. Queries run faster against a partitioned cube because Analysis Services only needs to read data from the partitions that contain the answers to the queries. Queries run even faster when partition also stores aggregations, the pre calculated totals for additive measures. Partitions are a powerful and flexible means of managing cubes, especially large cubes.
Q:
While creating a new calculated member in a cube what is the use of
property
called non-empty behavior?
Nonempty behavior is important property for ratio calculations. If the denominator Is empty, an MDX expression will return an error just as it would if the denominator Were equal to zero. By selecting one or more measures for the Non-Empty Behavior property, we are establishing a requirement that each selected measure first be evaluated before the calculation expression is evaluated. If each selected measure is empty, then The expression is also treated as empty and no error is returned.
called non-empty behavior?
Nonempty behavior is important property for ratio calculations. If the denominator Is empty, an MDX expression will return an error just as it would if the denominator Were equal to zero. By selecting one or more measures for the Non-Empty Behavior property, we are establishing a requirement that each selected measure first be evaluated before the calculation expression is evaluated. If each selected measure is empty, then The expression is also treated as empty and no error is returned.
Q:
What is a RAGGED hierarchy?
Under normal circumstances, each level in a hierarchy in Microsoft SQL Server 2005 Analysis Services (SSAS) has the same number of members above it as any other member at the same level. In a ragged hierarchy, the logical parent member of at least one member is not in the level immediately above the member. When this occurs, the hierarchy descends to different levels for different drilldown paths. Expanding through every level for every drilldown path is then unnecessarily complicated.
Under normal circumstances, each level in a hierarchy in Microsoft SQL Server 2005 Analysis Services (SSAS) has the same number of members above it as any other member at the same level. In a ragged hierarchy, the logical parent member of at least one member is not in the level immediately above the member. When this occurs, the hierarchy descends to different levels for different drilldown paths. Expanding through every level for every drilldown path is then unnecessarily complicated.
Q: What are the roles of an Analysis Services Information Worker?
The role of an
Analysis Services information worker is the traditional "domain
expert" role in business intelligence (BI) someone who understands the
data employed by a solution and is able to translate the data into business
information. The role of an Analysis Services information worker often has one
of the following job titles: Business Analyst (Report Consumer), Manager
(Report Consumer), Technical Trainer, Help Desk/Operation, or Network
Administrator.
Q:
What are the different ways of creating Aggregations?
We can create aggregations for faster MDX statements using Aggregation Wizard or thru UBO – Usage Based Optimizations. Always, prefer UBO method in realtime performance troubleshooting.
We can create aggregations for faster MDX statements using Aggregation Wizard or thru UBO – Usage Based Optimizations. Always, prefer UBO method in realtime performance troubleshooting.
Q:
What is WriteBack? What are the pre-conditions?
The Enable/Disable Writeback dialog box enables or disables writeback for a measure group in a cube. Enabling writeback on a measure group defines a writeback partition and creates a writeback table for that measure group. Disabling writeback on a measure group removes the writeback partition but does not delete the writeback table, to avoid unanticipated data loss.
The Enable/Disable Writeback dialog box enables or disables writeback for a measure group in a cube. Enabling writeback on a measure group defines a writeback partition and creates a writeback table for that measure group. Disabling writeback on a measure group removes the writeback partition but does not delete the writeback table, to avoid unanticipated data loss.
Q:
What is processing?
Processing is a critical and resource intensive operation in the data warehouse lifecycle and needs to be carefully optimized and executed. Analysis Services 2005 offers a high performance and scalable processing architecture with a comprehensive set of controls for database administrators.
We can process an OLAP database, individual cube, Dimension or a specific Partition in a cube.
Processing is a critical and resource intensive operation in the data warehouse lifecycle and needs to be carefully optimized and executed. Analysis Services 2005 offers a high performance and scalable processing architecture with a comprehensive set of controls for database administrators.
We can process an OLAP database, individual cube, Dimension or a specific Partition in a cube.
Q:
Name few Business Analysis Enhancements for SSAS?
The following table lists the business intelligence enhancements that are available in Microsoft SQL Server 2005 Analysis Services (SSAS). The table also shows the cube or dimension to which each business intelligence enhancement applies, and indicates whether an enhancement can be applied to an object that was created without using a data source and for which no schema has been generated.
The following table lists the business intelligence enhancements that are available in Microsoft SQL Server 2005 Analysis Services (SSAS). The table also shows the cube or dimension to which each business intelligence enhancement applies, and indicates whether an enhancement can be applied to an object that was created without using a data source and for which no schema has been generated.
Enhancement
|
Type
|
Applied to
|
No
data source
|
Time
Intelligence
|
Cube
|
Cube
|
No
|
Account
Intelligence
|
Dimension
|
Dimension
or cube
|
No
|
Dimension
Intelligence
|
Dimension
|
Dimension
or cube
|
Yes
|
Custom
Aggregation
|
Dimension
|
Dimension
(unary operator) or cube
|
No
|
Semiadditive
Behavior
|
Cube
|
Cube
|
Yes>
|
Custom
Member Formula
|
Dimension
|
Dimension
or cube
|
No
|
Custom
Sorting and Uniqueness Settings
|
Dimension
|
Dimension
or cube
|
Yes
|
Dimension
Writeback
|
Dimension
|
Dimension
or cube
|
Yes
|
Q:
What MDX functions do you most commonly use?
This is a great question because you only know this answer by experience. If you ask me this question, the answer practically rushes out of me. “CrossJoin, Descendants, and NonEmpty, in addition to Sum, Count, and Aggregate. My personal favorite is CrossJoin because it allows me identify non-contiguous slices of the cube and aggregate even though those cube cells don’t roll up to a natural ancestor.” Indeed, CrossJoin has easily been my bread and butter.
This is a great question because you only know this answer by experience. If you ask me this question, the answer practically rushes out of me. “CrossJoin, Descendants, and NonEmpty, in addition to Sum, Count, and Aggregate. My personal favorite is CrossJoin because it allows me identify non-contiguous slices of the cube and aggregate even though those cube cells don’t roll up to a natural ancestor.” Indeed, CrossJoin has easily been my bread and butter.
Q:
Where do you put calculated members?
The reflexive answer is “in the Measures dimension” but this is the obvious answer. So I always follow up with another question. “If you want to create a calculated member that intersects all measures, where do you put it?” A high percentage of candidates can’t answer this question, and the answer is “In a dimension other than Measures.” If they can answer it, I immediately ask them why. The answer is “Because a member in a dimension cannot intersect its own relatives in that dimension.”
The reflexive answer is “in the Measures dimension” but this is the obvious answer. So I always follow up with another question. “If you want to create a calculated member that intersects all measures, where do you put it?” A high percentage of candidates can’t answer this question, and the answer is “In a dimension other than Measures.” If they can answer it, I immediately ask them why. The answer is “Because a member in a dimension cannot intersect its own relatives in that dimension.”
Q: How do I find the bottom 10 customers with the lowest sales in 2003 that were not null?
A: Simply using bottomcount will return customers with null sales. You will have to combine it with NONEMPTY or FILTER.
SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS ,
BOTTOMCOUNT(
NONEMPTY(DESCENDANTS( [Customer].[Customer Geography].[All Customers]
, [Customer].[Customer Geography].[Customer] )
, ( [Measures].[Internet Sales Amount] ) )
, 10
, ( [Measures].[Internet Sales Amount] )
) ON ROWS
FROM [Adventure Works]
WHERE ( [Date].[Calendar].[Calendar Year].&[2003] ) ;
Q:
How in MDX query can I get top 3 sales years based on order quantity?
By default Analysis Services returns members in an order specified during attribute design. Attribute properties that define ordering are "OrderBy" and "OrderByAttribute". Lets say we want to see order counts for each year. In Adventure Works MDX query would be:
SELECT {[Measures].[Reseller Order Quantity]} ON 0
, [Date].[Calendar].[Calendar Year].Members ON 1
FROM [Adventure Works];
Same query using TopCount:
SELECT
{[Measures].[Reseller Order Quantity]} ON 0,
TopCount([Date].[Calendar].[Calendar Year].Members,3, [Measures].[Reseller Order Quantity]) ON 1
FROM [Adventure Works];
By default Analysis Services returns members in an order specified during attribute design. Attribute properties that define ordering are "OrderBy" and "OrderByAttribute". Lets say we want to see order counts for each year. In Adventure Works MDX query would be:
SELECT {[Measures].[Reseller Order Quantity]} ON 0
, [Date].[Calendar].[Calendar Year].Members ON 1
FROM [Adventure Works];
Same query using TopCount:
SELECT
{[Measures].[Reseller Order Quantity]} ON 0,
TopCount([Date].[Calendar].[Calendar Year].Members,3, [Measures].[Reseller Order Quantity]) ON 1
FROM [Adventure Works];
Q:
How do you extract first tuple from the set?
Use could usefunction Set.Item(0)
Example:
SELECT {{[Date].[Calendar].[Calendar Year].Members
}.Item(0)}
ON 0
FROM [Adventure Works]
Use could usefunction Set.Item(0)
Example:
SELECT {{[Date].[Calendar].[Calendar Year].Members
}.Item(0)}
ON 0
FROM [Adventure Works]
Q:
How can I setup default dimension member in Calculation script?
You can use ALTER CUBE statement. Syntax:
ALTER CUBE CurrentCube | YourCubeName UPDATE DIMENSION , DEFAULT_MEMBER='';
You can use ALTER CUBE statement. Syntax:
ALTER CUBE CurrentCube | YourCubeName UPDATE DIMENSION , DEFAULT_MEMBER='';
No comments:
Post a Comment