Friday, December 26, 2008

Correlation & Small n Methods

Introduction

This post considers correlational methods for answering "Are they related?" questions, and also looks at one type of quasi-experimental method for answering "Are they different?/Does this cause that?" questions.

Correlation Methods

Correlation research is interested in describing how one variable relates to another. Often there are just two variables, but there may be more.

Like descriptive research, correlation research describes how things are, but it does so specifically in terms of how variables vary together. Unlike experimental research, variables are rarely manipulated.

The correlation coefficient is the mathematical measure of the magnitude and direction of the association between two variables. A correlation coefficient near zero indicates no relationship; a coefficient near one (positive or negative) indicates a very close relationship. A positive correlation coefficient indicates that both variables increase and decrease together. A negative correlation coefficient indicates that one variable decreases as the other increases.
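As a rough illustration of magnitude and direction, here is a minimal Python sketch that computes the Pearson coefficient by hand. The function name pearson_r and the data are made up for this example and are not from the original notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: one outcome rises with x, the other falls.
hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 5, 4, 5]
score_down = [9, 7, 6, 4, 2]

r_up = pearson_r(hours, score_up)      # positive: variables rise together
r_down = pearson_r(hours, score_down)  # negative: one falls as the other rises
print(round(r_up, 2), round(r_down, 2))
```

The sign carries the direction and the absolute size carries the closeness of the relationship.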

Example graphs.

Common correlation coefficients are:

r = Pearson product-moment correlation, r, for metric data
ρ = Spearman rank-order correlation, rho, for ordinal data
φ = phi, for nominal data, ie. categorisation (the categories have no order relationship)
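The two less familiar coefficients in the list can be sketched in a few lines of Python. This is only an illustration under stated assumptions: rho is computed as the Pearson correlation of the ranks (ties get their average rank), and phi is computed from a 2x2 table of counts. All function names and data here are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman rank-order correlation: Pearson r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table of counts [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Made-up data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
phi_value = phi(20, 5, 5, 20)   # made-up 2x2 table of counts
print(round(rho, 2), round(r, 2), round(phi_value, 2))
```

With this data rho is at its maximum because the ordering is perfectly consistent, while r is lower because the points do not lie on a straight line.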

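These correlation coefficients assume a linear relationship (for Spearman's rho, a linear relationship between the ranks, ie. a consistently increasing or decreasing one).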

Correlation is used to establish the strength and direction of a relationship between variables.

Correlations of less than 0.5 are usually considered poor, correlations of 0.5 to 0.7 are considered moderate in social sciences, and correlations above 0.7 are considered high. In physical sciences the criteria are more stringent. This recognises the differences in accuracy, control and number of potentially important variables between the domains.
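The social-science rule of thumb above can be written as a small Python helper. The function name is hypothetical, and taking the absolute value (so that the label describes strength regardless of direction) is an assumption of this sketch.

```python
def describe_strength(r):
    """Label a correlation using the social-science rule of thumb above."""
    size = abs(r)  # assumption: strength is judged without regard to sign
    if size < 0.5:
        return "poor"
    if size <= 0.7:
        return "moderate"
    return "high"

labels = [describe_strength(r) for r in (0.3, -0.6, 0.85)]
print(labels)
```

The cut-offs would need to be raised for physical-science data, where more stringent criteria apply.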

Correlation cannot be used to prove a cause-effect relationship, but can be very useful in the search for such a relationship. The course of scientific investigation often starts with a description of the situation, which is followed by a search for related variables. Those variables which show reasonable relationships that fit in with current understanding (theories) can then be used in experiments. The experiments can control potential explanations for effects to provide the best evidence that "this causes that". Sometimes it is not possible to test a hypothesis in an experiment, and the "proof" of a cause and effect can only be a rational theory and close correlations (eg. in astronomy, where the variables are many). But remember, a high correlation between x and y could mean x causes y, y causes x, z causes both x and y, or the relationship between x and y is spurious.

The square of the correlation coefficient, the coefficient of determination, r², is useful as it represents the fraction of the variation in one variable which can be predicted from the other variable. For example, an r of 0.7 gives an r² of only 0.49, that is, only 49% of the variation in one variable is shared by the other variable. Regression equations are an extension of the correlation concept and are used to predict one variable from another (or from a number of variables).
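The link between r² and prediction can be checked numerically. The Python sketch below fits a least-squares line to made-up data and compares the fraction of variation the line explains with the squared correlation; the function names and data are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def explained_fraction(xs, ys):
    """Fraction of the variation in y accounted for by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total

def r_squared(xs, ys):
    """Square of the Pearson correlation, computed from sums of squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

hours = [1, 2, 3, 4, 5]      # made-up data
score = [2, 4, 5, 4, 5]

from_fit = explained_fraction(hours, score)
from_r = r_squared(hours, score)
print(round(from_fit, 2), round(from_r, 2))
print(round(0.7 ** 2, 2))    # the worked example from the text
```

For a single predictor the two quantities agree, which is why r² is read as the predictable fraction of the variation.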

Correlation is also commonly used to determine test-retest reliability. A high correlation indicates that subjects score similarly, relative to one another, on different applications of a test. However, it does not say how different the actual scores are.
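A small Python sketch shows why a high test-retest correlation says nothing about the size of the difference between scores. The data are invented: every retest score is exactly 5 points higher than the first score.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

test = [10, 12, 15, 18, 20]        # made-up first scores
retest = [t + 5 for t in test]     # every subject scores 5 higher on retest

reliability = pearson_r(test, retest)
mean_difference = sum(b - a for a, b in zip(test, retest)) / len(test)
print(round(reliability, 2), mean_difference)
```

The correlation is perfect even though no subject got the same score twice, so the mean difference is worth reporting alongside the correlation.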

The strength of the correlation can be tested to determine how likely it would be to get a correlation as high as you have if there were no relationship in the real world. Also remember that you should use random selection of subjects if you are going to generalise the sample relationships back to the population.
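One way to make this test concrete is a permutation test, sketched below in Python. This is an assumed approach chosen for illustration, not necessarily the test the original notes had in mind: the y values are shuffled many times to mimic "no relationship", and the p-value is the share of shuffles giving a correlation at least as strong as the observed one. Function names, data, the number of shuffles and the seed are all made up.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the correlation of xs and ys."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_strong = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)          # break any real pairing of x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_strong += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_strong + 1) / (n_permutations + 1)

# Made-up data with a very close relationship.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

p_value = permutation_p_value(xs, ys)
print(p_value < 0.05)
```

A small p-value says only that a correlation this strong would be unusual if there were no relationship; it does not make the sample representative of the population.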

Small n (Single Subject) Quasi-Experimental Methods

The remainder of this session and the next session will be devoted to quasi-experimental and experimental methods. Remember that the difference between these two is the amount of control the investigator has, particularly over random selection of subjects. Quasi-experimental and experimental methods will be described together, but keep in mind the greater limitations on generalisability that loss of random selection creates.

Quasi-experimental and experimental methods can be divided into group designs and "single-subject" designs. Single-subject designs are variously referred to as time-series methods, within-subject comparisons and small n designs. They can involve one subject, but are more powerful when three to five subjects are used. They are different to case reports and case studies in that there is a specific design used to increase the ability of the results to support answers to "Does this cause that?" questions. That is, the design is contrived to help eliminate or control possible explanations for changes in the DV (dependent variable) other than the alternate hypothesis, ie. the effect of the IV (independent variable). The design includes repeated measures of a consistent DV to establish a "baseline", changing only one IV at a time, arranging data collection in phases and conducting replications.

The basic components of the single-subject design are repeated measurements of the DV during manipulation of the IV, by having control periods (called A) where the IV is absent and intervention periods (called B) where the IV is present. In fact, the control period does not have to be a period of no intervention.

Also, the measurements during the initial control period do not need to be stable, but they do need to be consistent. Thus, there could be a steady decline or increase in the DV during the A period. The simplest design is called an AB design, where measures are continued through a control period (A) followed by an intervention period (B). The AB design should be modified wherever possible to provide better control over the potential causes.
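To make the idea of a consistent but not stable baseline concrete, here is a minimal Python sketch that summarises each phase of an AB design by its level (mean) and slope (least-squares trend per session). The data and function names are invented for illustration.

```python
def level(values):
    """Mean of the DV across the sessions in one phase."""
    return sum(values) / len(values)

def slope(values):
    """Least-squares slope of the DV against session number 0, 1, 2, ..."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    denominator = sum((t - mean_t) ** 2 for t in range(n))
    return numerator / denominator

# Made-up DV scores: the baseline declines steadily, then the trend reverses.
phase_a = [20, 19, 18, 17, 16]   # A: control period
phase_b = [17, 19, 21, 23, 25]   # B: intervention period

summary = {
    "A": (level(phase_a), slope(phase_a)),
    "B": (level(phase_b), slope(phase_b)),
}
print(summary)
```

Here the A period is not stable, but its steady decline is consistent, so the reversal of the slope in the B period stands out clearly.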

There are two ways of providing better evidence: replication and multiple baseline design. Replication can be within-subject replication or replication with other subjects. Within-subject replication is achieved by modifying the design to add additional control (A) periods (eg. ABA).

ABA Design

By having a number of phases of A and B, the chance that some variable other than the IV caused the change in the DV is diminished. One potential problem with the design is that it may be unethical to withdraw treatment.

Replication with other subjects

By repeating the study on a number of subjects, the chance that some variable other than the IV caused the change in the DV is reduced. Replication with other subjects also provides evidence on how generalisable the effect is.

Multiple Baseline Design

Multiple baseline design can be used on one subject or a number of subjects. The critical characteristic is that the starts of the interventions are staggered.

On one subject, different interventions (IVs) are targeted at different expected outcomes (DVs). When more than one subject is being used, all can start their A (non-intervention) period together, but the starts of their B (intervention) periods would be staggered.

The multiple baseline design reduces the risk that a variable other than the IV caused the change in the DV, because the DV should change only when each staggered intervention starts.
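The staggering across subjects can be laid out as a simple session schedule. The Python sketch below is only an illustration: the function name, the subject labels, the number of sessions and the size of the stagger are all assumptions.

```python
def multiple_baseline_schedule(subjects, n_sessions, first_start, stagger):
    """Return each subject's phase ("A" or "B") for every session.

    All subjects begin their A period at session 0; each later subject
    starts the B period `stagger` sessions after the previous one.
    """
    schedule = {}
    for index, subject in enumerate(subjects):
        start_of_b = first_start + index * stagger
        schedule[subject] = [
            "A" if session < start_of_b else "B" for session in range(n_sessions)
        ]
    return schedule

schedule = multiple_baseline_schedule(
    ["S1", "S2", "S3"], n_sessions=12, first_start=3, stagger=3
)
for subject, phases in schedule.items():
    print(subject, "".join(phases))
```

If the DV changes for S1 at session 3 while S2 and S3 are still unchanged in their A periods, an outside event at session 3 becomes a less plausible explanation.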

Single-subject designs are not yet very commonly used, as they tend to be seen as less "scientific" than group designs. However, this is a misconception, partly due to the fact that until recently the analysis of the results has been visual, which has lacked credibility. Now there are a number of statistical tests which can be performed; however, as it is a relatively new area, there is little consensus about which statistical procedure is best for which situation. One example is a computer program that tests for an overall difference between the two periods (A and B); if there is a statistically significant difference, it then tests for differences between the levels of the two periods and between the slopes of the two periods. Thus, single-subject designs can provide excellent evidence for a cause-effect relationship. They also take more account of the variation between subjects, allowing for different people to respond differently to the same intervention.
