General Issues in Scaling

S.S. Stevens came up with what I think is the simplest and most straightforward definition of scaling. He said:

Scaling is the assignment of objects to numbers according to a rule.

But what does that mean? In most scaling, the objects are text statements, usually statements of attitude or belief. The figure shows an example.

There are three statements describing attitudes towards immigration. To scale these statements, we have to assign numbers to them. Usually, we would like the result to be on at least an interval scale (see Levels of Measurement) as indicated by the ruler in the figure. And what does “according to a rule” mean? If you look at the statements, you can see that as you read down, the attitude towards immigration becomes more restrictive – if a person agrees with a statement on the list, it’s likely that they will also agree with all of the statements higher on the list. In this case, the “rule” is a cumulative one. So what is scaling? It’s how we get numbers that can be meaningfully assigned to objects – it’s a set of procedures. We’ll present several different approaches below.
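The cumulative rule described above can be sketched in a few lines of Python. This is only an illustration: the function names and the True/False encoding of agreement are assumptions, not part of any standard scaling procedure. Statements are assumed to be ordered from least restrictive (first) to most restrictive (last), as in the figure.

```python
from typing import List


def is_cumulative(responses: List[bool]) -> bool:
    """Return True if an agree/disagree pattern fits a cumulative rule.

    responses[i] is True when the person agrees with statement i.
    Statements are ordered from least restrictive (index 0) to most
    restrictive. Under a cumulative rule, agreeing with a statement
    implies agreeing with every statement above it, so an "agree"
    may never follow a "disagree".
    """
    seen_disagree = False
    for agrees in responses:
        if not agrees:
            seen_disagree = True
        elif seen_disagree:
            # Agreement with a lower statement after disagreeing with a higher one
            return False
    return True


def agreement_count(responses: List[bool]) -> int:
    """Count how many statements the person agrees with."""
    return sum(responses)
```

For a pattern that fits the rule, the count alone tells you which statements the person agreed with: a count of 2 on three statements can only mean agreement with the first two.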

But first, I have to clear up one of my pet peeves. People often confuse the idea of a scale with that of a response scale. A response scale is the way you collect responses from people on an instrument. You might use a dichotomous response scale like Agree/Disagree, True/False, or Yes/No. Or, you might use an interval response scale like a 1-to-5 or 1-to-7 rating. But, if all you are doing is attaching a response scale to an object or statement, you can’t call that scaling. As you will see, scaling involves procedures that you carry out independently of the respondent so that you can come up with a numerical value for the object. In true scaling research, you use a scaling procedure to develop your instrument (scale) and you also use a response scale to collect the responses from participants. But just assigning a 1-to-5 response scale to an item is not scaling! The differences are illustrated in the table below.

Scale                                  | Response Scale
results from a process                 | is used to collect the response for an item
each item on scale has a scale value   | item not associated with a scale value
refers to a set of items               | used for a single item

Purposes of Scaling

Why do we do scaling? Why not just create text statements or questions and use response formats to collect the answers? First, sometimes we do scaling to test a hypothesis. We might want to know whether the construct or concept is unidimensional or multidimensional (more about dimensionality later). Sometimes, we do scaling as part of exploratory research. We want to know what dimensions underlie a set of ratings. For instance, if you create a set of questions, you can use scaling to determine how well they “hang together” and whether they measure one concept or multiple concepts. But probably the most common reason for doing scaling is for scoring purposes. When a participant gives their responses to a set of items, we often would like to assign a single number that represents that person’s overall attitude or belief. For the figure above, we would like to be able to give a single number that describes a person’s attitudes towards immigration, for example.
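To make the scoring purpose concrete, here is a minimal sketch of one possible scoring rule. Everything specific in it is an assumption made for illustration: the item names, the scale values, and the choice to average the scale values of the items a person agrees with. Different scaling methods score in different ways, so treat this as one example rather than the rule.

```python
from statistics import mean
from typing import Dict, Optional

# Hypothetical scale values, invented for illustration only.
# In real scaling research they would come out of a scaling procedure.
SCALE_VALUES: Dict[str, float] = {"item_a": 2.0, "item_b": 5.5, "item_c": 9.0}


def overall_score(
    agreements: Dict[str, bool], scale_values: Dict[str, float]
) -> Optional[float]:
    """Average the scale values of the items the person agreed with.

    Returns None when the person agreed with no items, because there
    is nothing to average in that case.
    """
    endorsed = [scale_values[item] for item, agreed in agreements.items() if agreed]
    return mean(endorsed) if endorsed else None


person = {"item_a": True, "item_b": True, "item_c": False}
score = overall_score(person, SCALE_VALUES)  # mean of 2.0 and 5.5
```

Note that the single number is only possible because each item carries a scale value. A bare response format, with no scale values behind the items, gives you nothing comparable to average.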

Dimensionality

A scale can have any number of dimensions in it. Most scales that we develop have only a few dimensions. What’s a dimension? Think of a dimension as a number line. If we want to measure a construct, we have to decide whether the construct can be measured well with one number line or whether it may need more.

For instance, height is a concept that is unidimensional or one-dimensional. We can measure the concept of height very well with only a single number line (e.g., a ruler). Weight is also unidimensional – we can measure it with a scale. Thirst might also be considered a unidimensional concept – you are either more or less thirsty at any given time. It’s easy to see that height and weight are unidimensional. But what about a concept like self-esteem? If you think you can measure a person’s self-esteem well with a single ruler that goes from low to high, then you probably have a unidimensional construct.

What would a two-dimensional concept be? Many models of intelligence or achievement postulate two major dimensions – mathematical and verbal ability. In this type of two-dimensional model, a person can be said to possess two types of achievement. Some people will be high in verbal skills and lower in math. For others, it will be the reverse. But, if a concept is truly two-dimensional, it is not possible to depict a person’s level on it using only a single number line. In other words, in order to describe achievement you would need to locate a person as a point in two-dimensional (x,y) space.

OK, let’s push this one step further: how about a three-dimensional concept? Psychologists who study the idea of meaning theorized that the meaning of a term could be well described in three dimensions. Put in other terms, any object can be distinguished or differentiated from any other along three dimensions. They labeled these three dimensions activity, evaluation, and potency. They called this general theory of meaning the semantic differential. Their theory essentially states that you can rate any object along those three dimensions. For instance, think of the idea of “ballet.” If you like the ballet, you would probably rate it high on activity, favorable on evaluation, and powerful on potency. On the other hand, think about the concept of a “book” like a novel. You might rate it low on activity (it’s passive), favorable on evaluation (assuming you like it), and about average on potency. Now, think of the idea of “going to the dentist.” Most people would rate it low on activity (it’s a passive activity), unfavorable on evaluation, and powerless on potency (there are few routine activities that make you feel as powerless!). The theorists who came up with the idea of the semantic differential thought that the meaning of any concept could be described well by rating the concept on these three dimensions. In other words, in order to describe the meaning of an object you have to locate it as a dot somewhere within the cube (three-dimensional space).
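The idea of locating a concept as a dot in three-dimensional space can be sketched as follows. The numeric ratings are invented to match the verbal descriptions above on an assumed -3 to +3 range, and comparing concepts by straight-line (Euclidean) distance is an illustrative choice, not something the theory itself prescribes.

```python
import math
from typing import NamedTuple


class Meaning(NamedTuple):
    """A concept located on the three semantic differential dimensions."""
    activity: float
    evaluation: float
    potency: float


# Hypothetical ratings on an assumed -3 (low) to +3 (high) range
BALLET = Meaning(activity=2, evaluation=3, potency=2)
BOOK = Meaning(activity=-2, evaluation=2, potency=0)
DENTIST = Meaning(activity=-2, evaluation=-3, potency=-3)


def distance(a: Meaning, b: Meaning) -> float:
    """Straight-line distance between two concepts in the 3-D space."""
    return math.dist(a, b)  # math.dist requires Python 3.8 or later
```

With these made-up ratings, "ballet" sits closer to "book" than to "going to the dentist", which is only expressible because each concept is a point with three coordinates rather than a single number.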

Unidimensional or Multidimensional?

What are the advantages of using a unidimensional model? Unidimensional concepts are generally easier to understand. You have either more or less of it, and that’s all. You’re either taller or shorter, heavier or lighter. It’s also important to understand what a unidimensional scale is as a foundation for comprehending the more complex multidimensional concepts. But the best reason to use unidimensional scaling is that you believe the concept you are measuring really is unidimensional. As you’ve seen with height and weight, many familiar concepts (temperature is another example) are actually unidimensional. But, if the concept you are studying is in fact multidimensional in nature, a unidimensional scale or number line won’t describe it well. If you try to measure academic achievement on a single dimension, you would place every person on a single line ranging from low to high achievers. But how do you score someone who is a high math achiever and terrible verbally, or vice versa? A unidimensional scale can’t capture that type of achievement.
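The achievement example can be made concrete with a small sketch. The class, the field names, the 0-to-100 score range, and the averaging rule are all hypothetical; the point is only that two very different two-dimensional profiles can collapse to the same position on a single number line.

```python
from typing import NamedTuple


class Achievement(NamedTuple):
    """A person located in two-dimensional (math, verbal) space."""
    math_score: float
    verbal_score: float


def collapse(person: Achievement) -> float:
    """Force the two dimensions onto one number line by averaging."""
    return (person.math_score + person.verbal_score) / 2


# Hypothetical people with opposite profiles on an assumed 0-100 range
strong_in_math = Achievement(math_score=90, verbal_score=30)
strong_in_verbal = Achievement(math_score=30, verbal_score=90)

same_on_one_line = collapse(strong_in_math) == collapse(strong_in_verbal)
different_in_two_d = strong_in_math != strong_in_verbal
```

Both people land at the same point on the single line even though their profiles are opposites, which is the information a unidimensional scale loses when the construct is really two-dimensional.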

The Major Unidimensional Scale Types

There are three major types of unidimensional scaling methods. They are similar in that they each measure the concept of interest on a number line. But they differ considerably in how they arrive at scale values for different items. The three methods are Thurstone or Equal-Appearing Interval Scaling, Likert or “Summative” Scaling, and Guttman or “Cumulative” Scaling.