Tuesday, 11 September 2012

Session 9 & 10 - Group A

Submitted by:
Arpit Setya
Roll No. 14011

MDS & Permap
Permap is a program for multidimensional scaling (MDS). Its fundamental purpose is to uncover hidden structure that may reside in a complex data set. MDS is growing in popularity relative to other data mining and data analysis techniques because its mathematical basis is easier to understand and its results are easier to interpret.
Permap is an interactive computer program that offers both metric and nonmetric MDS techniques. It solves problems in up to eight-dimensional space and allows boundary conditions to be imposed on the solution. Permap can treat up to 1000 objects at a time, and each object can have up to 100 attributes. It is easy to use, visually oriented, and allows real-time interaction with the analysis.

MDS Maps
Perceptual maps are sometimes called product maps, sociograms, sociometric maps, psychometric maps, stimulus-response diagrams, relationship maps, concept maps, etc. A perceptual map is a piece of paper, or any plane, with symbols on it that convey information about perceived relationships between the objects the symbols represent. It deals with object-to-object relationships that are not amenable to simple, physical measurement. Anything that you want to study can be an object. If you are interested in how certain objects relate to each other, and you would like to present these relationships in the form of a map, then MDS is the technique you need.

The MDS algorithm uses object-to-object proximity information to construct the map. A proximity is some measure of likeness or nearness, or difference or distance, between objects. It can be either a similarity (called a resemblance in some disciplines) or a dissimilarity. If the proximity value gets larger when objects become more alike or closer in some sense, then the proximity is a similarity. If the opposite is the case, the proximity is a dissimilarity.
Proximity values can be calculated, measured, or just assigned based on someone's best judgment. If calculated, they typically are based on some mathematical measure of association (correlation, distance, interaction, relatedness, dependence, joint or conditional probability) operating on a set of attributes.
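
To make the idea of building a map from proximities concrete, here is a minimal sketch of metric MDS using the SMACOF (Guttman transform) iteration in plain Python. It is not Permap's own algorithm, which this post does not describe; the function names, the random starting layout, and the iteration count are all assumptions made for illustration. The dissimilarities are the four city distances from the Data Input section of this post.

```python
import math
import random

def pairwise_distances(points):
    """Euclidean distance between every pair of map points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def stress(delta, points):
    """Raw stress: squared mismatch between target and map distances."""
    d = pairwise_distances(points)
    n = len(points)
    return sum((delta[i][j] - d[i][j]) ** 2 for i in range(n) for j in range(i))

def smacof_step(delta, points):
    """One Guttman-transform update; raw stress never increases."""
    n = len(points)
    dims = len(points[0])
    d = pairwise_distances(points)
    new_points = []
    for i in range(n):
        row = [0.0] * dims
        for j in range(n):
            if i == j or d[i][j] == 0.0:
                continue  # skip self-pairs and coincident points
            ratio = delta[i][j] / d[i][j]
            for k in range(dims):
                row[k] += ratio * (points[i][k] - points[j][k])
        new_points.append([v / n for v in row])
    return new_points

def mds_map(delta, dims=2, iterations=200, seed=0):
    """Start from a random layout (an assumption) and refine it."""
    rng = random.Random(seed)
    points = [[rng.uniform(-1.0, 1.0) for _ in range(dims)] for _ in delta]
    for _ in range(iterations):
        points = smacof_step(delta, points)
    return points

names = ["Delhi", "Mumbai", "Kolkata", "Chennai"]
delta = [
    [0, 1163, 1307, 1754],
    [1163, 0, 1666, 1030],
    [1307, 1666, 0, 1355],
    [1754, 1030, 1355, 0],
]
coords = mds_map(delta)
for name, (x, y) in zip(names, coords):
    print(f"{name:8s} {x:9.1f} {y:9.1f}")
```

The printed coordinates are only determined up to rotation, reflection, and translation, so the map may look turned or mirrored compared with a geographic map while still preserving the distances.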

An attribute is some aspect of an object. It may be called a factor, characteristic, trait, property, component, quantity, variable, dimension, parameter, and so forth. The attributes should be presented in a form where each is normalized (standardized) to some kind of range or standard deviation, but Permap can do the normalizing internally if so desired. An attribute in one study may be an object in another study. It is all a matter of perspective and interest.
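
As an illustration of normalizing attributes and then calculating proximities from them, the sketch below standardizes each attribute to zero mean and unit standard deviation and uses Euclidean distance as the dissimilarity. Both choices are assumptions; the post does not say which normalization or distance measure Permap applies internally. The three objects and their two attributes are hypothetical.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each attribute (column) to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant attribute has zero spread; divide by 1 instead of 0.
    stdevs = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stdevs)] for row in rows]

def dissimilarity_matrix(rows):
    """Euclidean distance between every pair of objects."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]

# Hypothetical objects: three products described by price and rating.
attributes = [
    [200.0, 3.0],
    [400.0, 4.0],
    [600.0, 5.0],
]
z = standardize(attributes)
d = dissimilarity_matrix(z)
for row in d:
    print("  ".join(f"{v:6.3f}" for v in row))
```

Without the standardizing step, the price attribute would dominate the distances simply because its numbers are larger.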

Data Input
Data are entered from a text file (i.e., a file stored in ASCII or ANSI format). It is often fastest and easiest to create this file in Notepad, the simple text editor that comes with all Windows operating systems. Notepad is designed for quick entry of short segments of unformatted text.

Here is a very simple data set:
Title=Distance
nObjects=4
DissimilarityList
Delhi      0
Mumbai     1163    0
Kolkata    1307    1666    0
Chennai    1754    1030    1355    0

Dis/similarity data can be in either a lower-left half-matrix, as shown above, or in a whole matrix format. If a whole square matrix is entered, the upper-right triangle is ignored. Entering a square matrix is allowed simply to facilitate data interchange with other programs such as Excel.
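
The half-matrix rule can be sketched as a small reader that turns text in the layout above into a full symmetric matrix. This is a hypothetical helper, not part of Permap: it recognises only the three keywords shown in this post and assumes object labels contain no spaces.

```python
def parse_dissimilarity_text(text):
    """Return (title, labels, full symmetric matrix) from Permap-style text."""
    title, n_objects, labels, rows = None, None, [], []
    in_list = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        lowered = line.lower()  # keywords are not case sensitive
        if lowered.startswith("title="):
            title = line.split("=", 1)[1].strip()
        elif lowered.startswith("nobjects="):
            n_objects = int(line.split("=", 1)[1])
        elif lowered == "dissimilaritylist":
            in_list = True
        elif in_list:
            label, *values = line.split()
            labels.append(label)
            rows.append([float(v) for v in values])
    if n_objects is not None and len(rows) != n_objects:
        raise ValueError(f"expected {n_objects} rows, found {len(rows)}")
    n = len(rows)
    matrix = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(rows):
        if len(row) < i + 1:
            raise ValueError(f"row {i} needs at least {i + 1} values")
        for j in range(i + 1):
            # Only the lower-left triangle is read; any upper-right
            # values in a whole-matrix file are ignored.
            matrix[i][j] = matrix[j][i] = row[j]
    return title, labels, matrix

sample = """Title=Distance
nObjects=4
DissimilarityList
Delhi 0
Mumbai 1163 0
Kolkata 1307 1666 0
Chennai 1754 1030 1355 0
"""
title, labels, matrix = parse_dissimilarity_text(sample)
print(title, labels)
for row in matrix:
    print(row)
```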

If your proximity information is in the form of similarities instead of dissimilarities, then replace the keyword DISSIMILARITYLIST with SIMILARITYLIST and be sure that the diagonal values are all equal and are not exceeded by any other similarity value. There is no space before the "LIST" part of the keyword and capitalization is not important except for readability considerations.
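
The two conditions on similarity data can be checked before the file is handed to Permap. The function below is a hypothetical pre-check written for this post, not a Permap feature, and the similarity values are made up for illustration.

```python
def check_similarity_matrix(matrix):
    """Return a list of problems; an empty list means both conditions hold."""
    problems = []
    diagonal = [matrix[i][i] for i in range(len(matrix))]
    if len(set(diagonal)) > 1:
        problems.append("diagonal values are not all equal")
    peak = max(diagonal)
    for i, row in enumerate(matrix):
        for j in range(i):  # lower-left triangle only
            if row[j] > peak:
                problems.append(
                    f"entry ({i}, {j}) = {row[j]} exceeds the diagonal value {peak}"
                )
    return problems

# Hypothetical similarities on a 0 to 10 scale, lower-left half-matrix.
good = [
    [10],
    [7, 10],
    [3, 5, 10],
]
bad = [
    [10],
    [12, 10],
    [3, 5, 9],
]
print(check_similarity_matrix(good))
print(check_similarity_matrix(bad))
```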

Permap presents the proximities between objects graphically, in a form that is easy to understand, so the relative distance between each pair of objects can be read directly from the map. It also allows objects to be parked so that the effect of doing so can be seen.
