Tuesday, 4 September 2012

Business Analytics with SPSS_Group C

Business Analytics with SPSS

Business analytics is the exploration and analysis of business data with the help of statistical models, to gain insight and support decision making.

SPSS is a Java-based business analytics tool provided by IBM. SPSS (originally, Statistical Package for the Social Sciences) was launched in 1968 and was developed by Norman H. Nie and C. Hadlai Hull. It is one of the most widely used analytics packages. One of its major advantages is that, unlike SAS, you do not have to know programming to use it effectively.

Other business analytics tools:

MS Excel: It is part of the Microsoft Office suite and provides an easy-to-use interface.
SAS: It is one of the most powerful business analytics tools available, but it is more complex to learn and use.
MATLAB: It is a tool built around the MATLAB language. It is well suited to mathematical modeling.

Others, such as R and S, are also available.

SPSS

Name     Age    Gender
Katie    28     F
Jones    23     M
John     26     M

In SPSS, data is organized in a two-dimensional structure, where each row represents a case (in the data above, a person) and each column represents a variable (here Name, Age and Gender).

This structure can be represented by a simple equation:
Record = Cases + Variables
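
To make the row/column idea concrete, here is a minimal sketch in Python (not SPSS itself) that holds the same three people as a list of cases. The names `cases`, `variables` and `ages` are only illustrative.

```python
# Each dictionary is one case (a row); each key is one variable (a column).
cases = [
    {"Name": "Katie", "Age": 28, "Gender": "F"},
    {"Name": "Jones", "Age": 23, "Gender": "M"},
    {"Name": "John", "Age": 26, "Gender": "M"},
]

# The variables are the keys shared by every case.
variables = list(cases[0].keys())

# Reading one variable across all cases gives one column of the data.
ages = [case["Age"] for case in cases]

print(len(cases), "cases x", len(variables), "variables")
print("Age column:", ages)
```

Reading across a dictionary gives one case; reading the same key down the list gives one variable, which is the same layout the Data view shows.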


SPSS provides two views:
Data View: In this view you can see and edit the data.
Variable View: In this view you can define the variables and their attributes.
The attributes of a variable are:
Name: It specifies the name of the variable.
Type: It specifies what type of data the variable stores. The types are:
          Numeric: To store numeric data.
          Comma: It stores numeric data with commas as thousands separators and a dot as the decimal separator, e.g. 1,000,000.00
          Dot: It stores numeric data with dots as thousands separators and a comma as the decimal separator, e.g. 1.000.000,00. This format is mainly used in European countries.
          Scientific notation: To store numbers in exponent form, e.g. 1.0E+06.
          Date: To store dates. We can specify different date formats, such as dd-mm-yyyy.
          Dollar: To store dollar values.
          Custom currency: To store values in other currencies.
          String: To store alphanumeric values.
Tips:
1. Make variables numeric wherever possible, as this makes further analysis easier.
2. While entering string data, the default width becomes the width of the first value entered.
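
The difference between the Comma and Dot types is only in how the same number is displayed. A small Python sketch (outside SPSS; the helper names `comma_format` and `dot_format` are made up for this illustration) shows the two renderings of one value:

```python
def comma_format(value: float, decimals: int = 2) -> str:
    """Commas as thousands separators, dot as the decimal separator."""
    return f"{value:,.{decimals}f}"


def dot_format(value: float, decimals: int = 2) -> str:
    """Dots as thousands separators, comma as the decimal separator."""
    # Swap the two separator characters of the comma format.
    swap = str.maketrans(",.", ".,")
    return comma_format(value, decimals).translate(swap)


print(comma_format(1000000))  # 1,000,000.00
print(dot_format(1000000))    # 1.000.000,00
```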


Width: Specifies the width of the data stored in the variable.
Decimals: Used for numeric data to specify the number of digits after the decimal point.
Label: Sometimes, because of naming constraints, the name of a variable does not make clear what data it stores. Labels are helpful in this case.
Values: To label non-intuitive numeric codes. For example, if gender is stored as a number, we can specify here that "1" denotes "Male" and "2" denotes "Female".
Missing: Here we specify which data values signify a missing value. Typical missing values are NA (Not Answered), NAP (Not Applicable) and DK (Don't Know).
Columns: Specifies the width of the column in Data View.
Align: Specifies the alignment of the data.
Measure: It defines the level of measurement. It can be one of the following:
                 Nominal: Values represent categories with no intrinsic ranking. Example: Gender.
                 Ordinal: Values represent categories with some intrinsic ranking. Example: Education level.
                 Scale: Values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Example: Income.
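
As a rough illustration of how the Values and Missing attributes work together, the Python sketch below models a variable definition and decodes raw codes. This is not how SPSS stores variables internally. The `Variable` class, its field names, and the use of code 9 for "Not Answered" are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """A hypothetical stand-in for one row of the Variable View."""
    name: str
    label: str = ""
    measure: str = "scale"  # "nominal", "ordinal" or "scale"
    value_labels: dict = field(default_factory=dict)
    missing_values: set = field(default_factory=set)

    def describe(self, raw):
        """Return None for a missing code, otherwise the value label if one exists."""
        if raw in self.missing_values:
            return None
        return self.value_labels.get(raw, raw)


gender = Variable(
    name="gender",
    label="Gender of respondent",
    measure="nominal",
    value_labels={1: "Male", 2: "Female"},
    missing_values={9},  # assumed code for "Not Answered"
)

raw_data = [1, 2, 9, 2]
decoded = [gender.describe(value) for value in raw_data]
print(decoded)  # ['Male', 'Female', None, 'Female']
```

Keeping the data numeric and attaching labels separately is what lets the variable stay easy to analyse while still being readable in output.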


Classification of Analysis Techniques:
Analysis techniques can be classified in two ways:
First:
1. Univariate Analysis: It is the simplest form of analysis. It describes a single variable and its attributes. Example: frequency distribution analysis.
2. Bivariate Analysis: It involves the analysis of two variables to find the relationship between them.
3. Multivariate Analysis: It involves the observation and analysis of more than one statistical outcome variable at a time. Example: Factor Analysis.
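
A frequency distribution, the univariate example above, simply counts how often each value of one variable occurs. A minimal Python sketch, using the Gender column of the small table shown earlier (F, M, M):

```python
from collections import Counter

# Gender values of the three cases in the sample table.
genders = ["F", "M", "M"]

# Count how many times each value occurs.
freq = Counter(genders)
total = sum(freq.values())

# Print each value with its count and percentage, most frequent first.
for value, count in freq.most_common():
    print(f"{value}: {count} ({100 * count / total:.1f}%)")
```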

Second:
1st Level Analysis: It includes frequency distributions, crosstabs and OLAP.
2nd Level Analysis: It includes multivariate analysis.

Analysis vs. Processing:
Processing: To transform input data into some output format.
Analysis: To examine that output to extract relevant information.

Crosstab in SPSS
Crosstabs can be accessed from the Analyze menu by selecting Descriptive Statistics and then Crosstabs.
Here we need to define one variable for the rows and one variable for the columns.
From the same dialog we can also request a chi-square test.
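
What a crosstab produces is a table of counts for every combination of the row variable and the column variable. The Python sketch below builds one by hand. The `crosstab` helper and the sample records (gender against a made-up yes/no answer) are hypothetical and only illustrate the idea.

```python
from collections import Counter


def crosstab(pairs):
    """Count each (row value, column value) combination in a list of pairs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


# Hypothetical data: (gender, answer) for six respondents.
records = [
    ("F", "No"), ("F", "Yes"), ("M", "Yes"),
    ("M", "Yes"), ("M", "No"), ("F", "Yes"),
]

rows, cols, table = crosstab(records)

# Print the table with the column values as a header.
print("    " + "  ".join(f"{c:>4}" for c in cols))
for r, counts_in_row in zip(rows, table):
    print(f"{r:<4}" + "  ".join(f"{n:>4}" for n in counts_in_row))
```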

Chi Square
It is used to test the independence of two variables.
Null Hypothesis: The occurrences of the two outcomes are statistically independent.

Example: There is no relationship between gender and "age when first married"

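If the significance value (p-value) reported for the Pearson Chi-Square is greater than 0.05, we fail to reject the null hypothesis, that is, we treat the two variables as independent. If it is 0.05 or less, we reject the null hypothesis.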
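
To show where the Pearson Chi-Square value and its significance come from, here is a minimal Python sketch of the calculation. The observed counts are invented for illustration, and the function names `chi_square` and `p_value_df1` are my own. The p-value shortcut used here is valid only for 1 degree of freedom (a 2 x 2 table); SPSS handles the general case for you.

```python
import math


def chi_square(table):
    """Return the Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom only."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical 2 x 2 crosstab of counts (rows: gender, columns: two age groups).
observed = [[20, 30], [30, 20]]

stat, df = chi_square(observed)
p = p_value_df1(stat)
decision = "reject" if p <= 0.05 else "fail to reject"
print(f"chi-square = {stat:.2f}, df = {df}, p = {p:.4f}: {decision} the null hypothesis")
```

With these made-up counts the statistic is 4.0 and the p-value is just under 0.05, so the null hypothesis of independence would be rejected.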

Posted By:
S M Murshid Azam
Group C
Roll No: 14104
Operations Batch.
