Business Analytics - Day 1
Ritesh Mehta - 14158 - Group D (Operations)
Today we had the first two lectures of Business
Analytics and as expected, the subject is a total hands-on experience. The tool
used for the class was statistical analysis software called SPSS (Statistical
Package for the Social Sciences).
The two classes today were an overview of how
the tool works, and I got the hang of some of its most fundamental aspects. The
basic interface of the tool – Data View & Variable View – and how the records
(cases plus variables) are entered into the tool were all easy to understand.
The Variable View in particular showed how data and its properties
can be modified at the click of a button. The types of measure were an
important concept to note here -
- Nominal – Numbers are assigned to categories purely as labels, without any order
- Ordinal – The numbers assigned to categories follow a defined order
- Scale – The values are true numbers, so the size of the difference between two values is meaningful
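As a rough illustration outside SPSS, the three measure types can be mimicked in a few lines of Python. The variables, codes and labels below are invented for the example. It assumes the usual rule of thumb that nominal codes can only be counted, ordinal codes can also be ranked, and scale values can also be averaged.

```python
from statistics import mean, median, mode

# Hypothetical survey answers; all codes and labels are invented.
# Nominal: codes are just labels, so only counting (the mode) makes sense.
city_labels = {1: "Mumbai", 2: "Delhi", 3: "Pune"}
city = [1, 3, 3, 2, 3]

# Ordinal: codes carry an order, so a middle value (the median) is meaningful.
satisfaction_labels = {1: "Low", 2: "Medium", 3: "High"}
satisfaction = [1, 3, 2, 3, 2]

# Scale: differences between values are meaningful, so the mean can be used.
monthly_spend = [1200.0, 800.0, 1500.0, 1000.0, 500.0]

most_common_city = city_labels[mode(city)]
middle_satisfaction = satisfaction_labels[median(satisfaction)]
average_spend = mean(monthly_spend)

print(most_common_city, middle_satisfaction, average_spend)
```

Averaging the city codes would run without error, but the result would mean nothing, which is why the measure type matters.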
Though I had read earlier about how a variable
can be classified, the basics only became clear today.
The first classification was into category and
continuous variables -
- Category variables can be counted and can be recognized in the Variable View tab when numeric codes are assigned to their values (apart from the codes for NA (not available), NPA (not applicable) and DK (don’t know))
- Continuous variables are the ones that are not categorical, i.e. no numeric codes are assigned to their values. These are further classified as continuous (fractions included) and discrete (whole numbers only).
The second classification was in terms of techniques -
- 1st Level – Frequency, X Tabs, OLAP Cubes
- 2nd Level – Multivariate techniques -
- Univariate – only one variable
- Bivariate – two variables
- Multivariate – more than two variables
In the second class, frequency and cross tab,
under the “Analyze” menu, were used to test hypotheses. Frequency gives the
number of times each value of a variable occurs, and cross tab is a tabulation
of two or more variables that displays the relationship between them.
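The same two ideas can be sketched outside SPSS with plain Python. The cases below are invented, and this only mirrors the counting, not the actual SPSS output.

```python
from collections import Counter

# Hypothetical cases: (gender, owns_car), invented for illustration.
cases = [
    ("Male", "Yes"), ("Male", "No"), ("Female", "Yes"),
    ("Female", "Yes"), ("Male", "Yes"), ("Female", "No"),
]

# Frequency: how often each value of one variable occurs.
gender_freq = Counter(gender for gender, _ in cases)

# Cross tab: how often each combination of two variables occurs.
crosstab = Counter(cases)

print(dict(gender_freq))
for gender in ("Male", "Female"):
    row = {answer: crosstab[(gender, answer)] for answer in ("Yes", "No")}
    print(gender, row)
```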
The point to note here was the use of category
variables for cross tab. Continuous variables can’t be used with cross tab, and
if someone intends to use them, they first need to be converted to category
variables. The variable in which the result is to be analyzed is usually put
in the row and the other variable is put in the column. The results can be seen
as counts or as percentages, whichever is more relevant.
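A minimal Python sketch of that conversion, with made-up data: a continuous age is recoded into groups, cross-tabulated against a second variable, and shown as row percentages. The cut-offs, the variable names and the choice of which variable goes in the rows are assumptions for the example only.

```python
from collections import Counter

def age_group(age):
    """Recode a continuous age into a category (cut-offs are arbitrary)."""
    if age < 30:
        return "Under 30"
    if age < 50:
        return "30-49"
    return "50+"

# Hypothetical cases, invented for illustration.
ages = [22, 35, 41, 58, 29, 63]
bought = ["Yes", "No", "Yes", "No", "Yes", "No"]

groups = [age_group(a) for a in ages]
counts = Counter(zip(groups, bought))  # cross tab of age group by purchase

def row_percent(group, answer):
    """Share of one row (age group) that gave the specified answer, in percent."""
    row_total = sum(n for (g, _), n in counts.items() if g == group)
    return 100.0 * counts[(group, answer)] / row_total

for group in ("Under 30", "30-49", "50+"):
    print(group, row_percent(group, "Yes"))
```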
The null hypothesis is formed before starting
the test. The null hypothesis in the case of a cross tab is always – “There is no
relation between the two variables in question”.
The lecture also covered a statistical test
known as chi-square, which is used to examine the significance of the relationship
between two or more variables. In order to perform a chi-square test, all data
should be in the form of numbers and the variables should be coded as “Nominal”
in the Variable View tab.
A defined null hypothesis can be rejected or
retained based upon the significance value of the chi-square test. If the
significance value is less than 0.05, the relationship between the variables is
statistically significant and the null hypothesis can be rejected. If the value
is greater than 0.05, there is not enough evidence of a relationship and the
null hypothesis cannot be rejected (loosely, it is accepted). In neither case
does the significance value say whether the relationship is positive or
negative. The values are compared with 0.05 as it
corresponds to a confidence level of 95%, i.e. a 5% risk of declaring a
relationship when there is none is accepted. Similarly,
for a 99% confidence level a value of 0.01 is used for comparison. The selection
of confidence level depends upon the criticality of the situation and the
decision to be made on analysis of the data.
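To see where the significance value comes from, here is a minimal Python sketch of a Pearson chi-square test on a 2x2 cross tab, without any continuity correction. The counts are invented. The p-value shortcut used here holds only for 1 degree of freedom (a 2x2 table), so this is a hand calculation under that assumption and not a reproduction of the SPSS output.

```python
import math

# Hypothetical 2x2 cross tab of observed counts (invented for illustration).
# Rows: gender, columns: owns a car (Yes, No).
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells, where the
# expected count assumes the null hypothesis of no relationship.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

# With 1 degree of freedom the chi-square p-value has the closed form
# erfc(sqrt(x / 2)); larger tables need a general chi-square distribution.
p_value = math.erfc(math.sqrt(chi_square / 2))

alpha = 0.05  # 95% confidence level
reject_null = p_value < alpha

print(round(chi_square, 3), p_value, reject_null)
```

With these counts the significance value falls well below 0.05, so the null hypothesis of no relationship would be rejected.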
For both lectures, the focus was on
analysis of data rather than processing. I learnt how necessary it is to
analyze the output displayed by SPSS rather than just looking at numbers and getting
nothing out of them.
In all, it appears to be a very informative series
of lectures and I look forward to gaining some expertise in this wonderful tool,
not only in terms of getting the output but also in analyzing the output and making
decisions from it.
-------------------------------------------------XXXX-------------------------------------------------