Next: 8.5 Self-Protection Statements Up: 8 GEIS Scripts Previous: 8.3 Answer Quoting   Contents


8.4 Item Type Overview

Questions in an interview may be of several different kinds. One example is an open-ended question in which any response text can be entered, such as

``What is your profession?''
Another example is a question that prompts for numeric data, such as
``How old are you?''

A numeric item may require an answer to be a single number, multiple unrelated numbers, or multiple numbers whose values are related. In the last case the values may, for example, have to add up to 100.

Other questions may be single or multiple choice closed-form questions. The term `single-choice' refers to the selection of a single response from a series of possible responses, while `multiple-choice' refers to the selection of one or more responses from a series of possible responses. An example of a single choice question is

``Are you employed or unemployed?''
Multiple-choice questions can result in between zero and as many answers as there are options. An example of a multiple choice question is
``Do you own any of the following items? ...''
There are also non-question items, such as instructions to interviewers.

GEIS supports various types of questions by the use of different script item types. A complete list of all GEIS item types is in the file \GEIS\BIN\GEIS.TXT. The most commonly used item types are described below along with details on how the data they collect are stored.

For most item types, numeric or character variables are created within the ANSWERS data set. If only one variable is created it has the same name as the script item. Some item types store data in more than one variable, and these have the same name as the item with a number appended: QA1, QA2, QA3, ....

The most common types are discussed below.

TITL: The TITL item is always the first item in the script. It is used to set the project title and to specify various global options. One of the options is the interviewing mode. Valid modes that may be used are given in Table 7.

Since the TITL item does not form part of an interview it is not referenced within the logic structure of the script.

Table 7: Interviewing modes set with the SURVTYPE option in the TITL item.
Mode      Meaning
CATI      CATI - Computer Assisted Telephone Interview
IVR       IVR - Interactive Voice Response: Outbound calls
IVR_IN    IVR - Interactive Voice Response: Inbound calls
SIVR_P    IVR - Inbound Hybrid method calls
CASI      CASI - Computer-Assisted Self-Interview




CALC: Often a questionnaire requires that a computation be performed or some data be manipulated during the interview. For example, after asking a series of questions, it may be necessary to calculate a score from the respondent's answers.

GEIS can perform any required calculation during the interview when a CALC item is inserted within the script. The CALC calculation statements are equivalent to a small program inserted within the script. Standard SAS language statements can be used as long as only a single observation is output. By default, the calculation can consist of up to 50 lines, each of 200 characters. It is important that OUTPUT statements not be used within CALC items.

Figure 22 shows an example of calculation statements that might be included in a CALC item. When GEIS encounters a CALC item it automatically executes the following lines:

data DATACATI.COMBRESP; 
   set DATACATI.COMBRESP;

It then executes the CALC item calculation statements. Finally, it terminates the data step:

RUN;

Figure 22: Example of calculation statements inserted by a CALC item. These lines come from the script.

* One of the statements...
... CALC item;
QA=1;
QB=QA+10;
QC=QB+100;
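The way the calculation statements are wrapped into a data step can be pictured with a short Python sketch. This is not GEIS code: the helper name wrap_calc and its checks are assumptions made for illustration, and only the data, set and RUN lines and the default limits are taken from the text.

```python
import re

# Lines the text says GEIS adds before and after the CALC statements.
PROLOGUE = ["data DATACATI.COMBRESP;", "   set DATACATI.COMBRESP;"]
EPILOGUE = ["RUN;"]

# Default limits quoted in the text: 50 lines of 200 characters each.
MAX_LINES = 50
MAX_WIDTH = 200


def wrap_calc(statements):
    """Return the data step text for a list of CALC statement lines.

    Hypothetical helper, not part of GEIS.
    """
    if len(statements) > MAX_LINES:
        raise ValueError("too many calculation lines")
    for line in statements:
        if len(line) > MAX_WIDTH:
            raise ValueError("calculation line too long")
        # Simple check for OUTPUT used as a statement keyword; it only
        # looks at the start of the line and after each semicolon.
        if re.search(r"(^|;)\s*output\b", line, flags=re.IGNORECASE):
            raise ValueError("OUTPUT statements must not be used in CALC items")
    return "\n".join(PROLOGUE + list(statements) + EPILOGUE)


step = wrap_calc(["QA=1;", "QB=QA+10;", "QC=QB+100;"])
print(step)
```

Printing step shows the three calculation statements placed between the set statement and the closing RUN statement.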

Normally the calculated value is stored as a numeric value in a single variable in the ANSWERS data set, called the item's primary variable, but additional variables can be created if needed within the calculation statements.

The primary variable must be assigned a value by the CALC item. The user is responsible for ensuring that this happens. This is required because GEIS will not allow the interviewer to move on to the next item unless the current item has a value. Suppose a CALC item called, say, QA was defined in a script but was not assigned a value by the CALC statements. GEIS will then stop when it reaches the item during an interview and display the usual message obtained when it cannot jump to another item:

``QA[CALC] evaluates to missing! Report this error!''
The interviewer will be able to move to the previous item but will not be able to move forward past this point. To avoid this the calculation statements should include a line similar to: QA=1.


CHCE and LIST: These item types implement single-choice questions. For example, the following question only allows two possible answers.
``Are you employed or unemployed?''

The selected option is stored as a numeric code. Any numeric value may be used for the codes, but typical values are: 1, 2, 3, ....

Options may be set to handle Refusals and Don't knows. Refusals are coded using the code .R. Don't knows may be coded by local convention, but .K is suggested (see footnote 7).

By default, the CHCE allows up to 50 options. The CHCE option codes and their textual labels are defined in the script.

The LIST type allows an unlimited number of options, but the option codes and texts must be loaded from an external SAS data set at compile time. The LIST is intended for use for displaying long lists on the screen through which the interviewer can browse. For example, it may be used to display a long list of countries.

For both types, formats (see footnote 8) are used to provide a simple means of mapping the numeric codes to the textual responses. However, if no format is defined the operation of GEIS will be unaffected. The format may be defined by the user or generated automatically by specifying the _MAKE_ option.
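The role of a format as a lookup from stored codes to labels can be sketched in Python. The dictionary contents, the helper name label_for and the use of the strings ".R" and ".K" to stand in for SAS special missing values are all assumptions for illustration; this is not how GEIS or SAS implement formats.

```python
# Hypothetical option codes and labels for the employment question.
EMPLOYMENT_FORMAT = {
    1: "Employed",
    2: "Unemployed",
    ".R": "Refused",      # stands in for the SAS special missing value .R
    ".K": "Don't know",   # suggested local convention
}


def label_for(code, fmt=None):
    """Return the label for a stored code.

    When no format is defined the raw code is shown instead, mirroring
    the statement that GEIS still operates without a format.
    """
    if fmt is None:
        return str(code)
    return fmt.get(code, str(code))


print(label_for(1, EMPLOYMENT_FORMAT))
print(label_for(2))
```

The stored answer is always the code; the format only affects how that code is displayed.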


DO and ENDD: A sequence of items embedded between a DO and an ENDD item in a script is repeated during the interview as many times as desired. These sections of repeated items are called DO loops (see footnote 9). This allows a sequence of similar questions to be compressed into a short series in the script.


INFO: The INFO item is used to give instructions to interviewers, or text to be read out loud.


LINK: It is often necessary to be able to access data in external data sets. A typical use is to use data recorded in a previous survey to select or construct questions in a new interview. For example:
``Last time, you said you were unemployed. Is this still the case?''

``In the previous interview, you reported your income as ^Prv_Incm^. Has your income increased, decreased, or stayed about the same?''

The LINK item type causes GEIS to import data from external data sets either when the script is compiled, called a static link, or just before the interview is to start, called a dynamic link.

The LINK item creates a variable in the ANSWERS data set with the same properties as the external data set variable that is imported. A single LINK item may import multiple variables from several data sets. For the first listed data set, if an external variable is not specified then it is assumed to have the same name as the LINK item.

The respondent's ID code is usually used to select which record in the external data sets to import. The external data set is searched for an ID variable. A search is then made for the respondent's ID code in the ID variable. The record holding the respondent's ID code is then locked and the specified variables for this record are then imported. Alternatively, a logical expression may be used to select the records to import.
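The ID-based selection can be illustrated with a small Python sketch. The record layout, the ID variable name "ID" and the helper import_linked are assumptions for illustration; Prv_Incm is the variable name quoted in the example question above. Record locking is not modelled.

```python
def import_linked(external_rows, respondent_id, variables, id_var="ID"):
    """Return the requested variables from the record matching the ID.

    Hypothetical helper: returns None when no record holds the ID.
    """
    for row in external_rows:
        if row.get(id_var) == respondent_id:
            return {name: row[name] for name in variables}
    return None


# Made-up records standing in for a data set from a previous survey.
previous_survey = [
    {"ID": 101, "Prv_Incm": 42000, "Prv_Emp": 2},
    {"ID": 102, "Prv_Incm": 31500, "Prv_Emp": 1},
]

print(import_linked(previous_survey, 102, ["Prv_Incm"]))
```

A selection by logical expression would replace the equality test on the ID variable with an arbitrary condition on the record.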

There may also be a need for the data transfer to work in the opposite direction; i.e. the current interview may send data to an external data set. This is called external-updating (see footnote 10).


LVLC: The LVLC type is used to summarise the answers to a series of CHCE or MULT items. It returns the number of CHCE or MULT items that have a particular response set.

For example, it may count how many in a series of twenty CHCE items had a Don't Know response set. In the same way, the LVLC type can also be used to count how many of several MULT items have a particular level set.

Note that the level code specified in the LVLC item refers to the code value specified in the MULT or CHCE item definition. Although the code value in CHCE can be an integer, date, or time, the level code in the LVLC item must always be expressed as an integer.
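The counting performed by an LVLC item can be sketched in Python as follows. The item names, the use of 9 as a Don't Know code, and the assumption that a MULT level corresponds to the position of an option in the stored 0/1 string are illustrative only.

```python
def count_chce_level(answers, item_names, level):
    """Count the CHCE items whose stored code equals `level`."""
    return sum(1 for name in item_names if answers.get(name) == level)


def count_mult_level(answers, item_names, level):
    """Count the MULT items whose option number `level` is set to '1'."""
    count = 0
    for name in item_names:
        flags = answers.get(name, "")
        if 1 <= level <= len(flags) and flags[level - 1] == "1":
            count += 1
    return count


# Made-up answers: three CHCE items and two MULT items.
answers = {"Q1": 1, "Q2": 9, "Q3": 9, "M1": "1010", "M2": "0110"}

print(count_chce_level(answers, ["Q1", "Q2", "Q3"], 9))
print(count_mult_level(answers, ["M1", "M2"], 3))
```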


MAIL: This item generates an e-mail. Typically the contents of the e-mail are based on the responses within an interview. It can also include quoted values so as to produce customised e-mails. The item requires that a set of variables be defined to specify the addressee, CC address, subject, and attachments.


MULT: The MULT item type is used to implement multiple-choice questions. Multiple-choice questions can have between zero and as many answers as there are options. By default, the maximum is fifty options. The response options are stored as a single character string consisting of zeroes (`0') and ones (`1'). Each character (`1' or `0') in the string represents the selection state of an option while its ordinal position in the string represents the index of the option. As an example, `1010' indicates that options 1 and 3 were selected, and options 2 and 4 were not selected. A question with fifty options requires a string 50 characters long to store the answers.

When the FINAL data set is created (Section 9.3) the MULT variable is decomposed to a set of secondary variables, with one variable for each option. These variables contain `1' if the response option was selected, and `0' if not. Labels may be defined in the script for each of the secondary variables.

The length of the name of a MULT item cannot exceed 6 characters, but 5 is better. This becomes the name of the primary variable in the ANSWERS data set. The secondary variables in the FINAL data set are then named by appending 1, 2, 3, ... to the name.
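The decomposition of a MULT string into per-option variables can be sketched in Python. The helper names are hypothetical, and storing the secondary values as the integers 1 and 0 is an assumption; the text only states that they contain `1' or `0'.

```python
def decompose_mult(name, value):
    """Split a MULT string such as '1010' into one variable per option.

    Variable names follow the rule of appending 1, 2, 3, ... to the
    item name.
    """
    if set(value) - {"0", "1"}:
        raise ValueError("a MULT value may contain only '0' and '1'")
    return {f"{name}{i}": int(ch) for i, ch in enumerate(value, start=1)}


def selected_options(value):
    """Return the 1-based indices of the options that were selected."""
    return [i for i, ch in enumerate(value, start=1) if ch == "1"]


print(decompose_mult("QA", "1010"))
print(selected_options("1010"))
```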


NULL: The NULL item does not ask a question or store data. Instead, it is used to provide a connection between two or more other items. Sometimes the SPSs (Self-Protection Statements) in a complex branching script may be very long. These long statements may be split up between one or more NULL items, thereby simplifying the script and making it easier to read.


NUM and NUMM: The NUM and NUMM item types are used to enter numeric data. Numeric data includes plain numbers such as 322 as well as dates and times such as 4OCT1957 and 12:45pm.

Formats are used to control how values appear on the screen. Informats are used to control entry of numeric data.

There are two types of range check limits that may be applied: absolute limits, used to catch impossible values, and reasonableness limits, used to catch unlikely but possible values.
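The difference between the two kinds of limit can be sketched in Python. The function name, the three outcome labels and the example limits for an age question are assumptions; the text does not say how GEIS reacts when a reasonableness limit is exceeded.

```python
def check_range(value, absolute, reasonable):
    """Classify a numeric answer against two pairs of limits.

    absolute   -- (low, high) bounds for impossible values
    reasonable -- (low, high) bounds for unlikely but possible values
    """
    abs_lo, abs_hi = absolute
    ok_lo, ok_hi = reasonable
    if not (abs_lo <= value <= abs_hi):
        return "reject"   # impossible value
    if not (ok_lo <= value <= ok_hi):
        return "query"    # possible but unlikely, worth confirming
    return "accept"


# Hypothetical limits for "How old are you?"
for age in (40, 95, 150):
    print(age, check_range(age, absolute=(0, 120), reasonable=(16, 90)))
```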

To assist in entering complex values, such as dates, a slider control appears on the interviewing screen underneath the data entry field. This only happens when both absolute limits are set.

Refusals are entered using the SAS special missing value: .R. This is a decimal point followed by a capital R. Don't knows may be recorded using any code, but .K is suggested (see footnote 11).

The NUM item only allows a single number to be entered. The NUMM item allows up to five numbers to be entered. For example, a NUMM item may be used to handle a question like:

``How long have you been employed?''
where the answer may be given in any combination of days, weeks, months or years. In addition, logical constraints may be specified between the five numbers. For example, a possible constraint might be that the sum of the five fields' values must be 100.

If more than five numbers are required, use the TABL item, described later in this section.


OPEN: This item is used to store a respondent's verbatim answer. By default, storage is limited to 200 characters of text, stored in a single column. If more characters are required, additional variables each of 200 characters can be added.
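How a long verbatim answer might be spread over several 200-character variables can be sketched in Python. The helper name and the simple cut at fixed positions are assumptions; the text does not describe how GEIS divides the text between the variables.

```python
def split_verbatim(text, width=200):
    """Cut a verbatim answer into consecutive pieces of at most `width` characters."""
    pieces = [text[i:i + width] for i in range(0, len(text), width)]
    return pieces or [""]


answer = "x" * 450  # made-up answer of 450 characters
print([len(piece) for piece in split_verbatim(answer)])
```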

Refusals and Don't knows may be recorded using strings according to local convention, but ``REFUSED'' and ``DK'' are suggested.

The item's name cannot exceed 6 characters, but 5 is better.


RST: The RST item restarts the interview without exiting. It may be used to re-initialise if the wrong respondent begins the interview.


SCAL: This item is used within scripts to create values that may be quoted or referenced. It is typically used at the start of the script to define values used throughout.


SMSG: The SMSG item is used to display a screen at the beginning of interviews when the status and auxiliary status codes match those specified in the item. It is used to display a particular message to the interviewer when special conditions arise.


STAT: A STAT item must be set for each possible exit condition from the interview. STAT items are used to set interview exit status codes (see footnote 12). All scripts must have at least a STAT item that sets a status code of CQ, for completed interviews. Usually, there would also be a STAT item to store refusals.


SCRP: The SCRP item is used to insert another script file into the current script. This allows a long script to be broken up into smaller manageable chunks. For example, individual modules dealing with a particular issue can be put into separate files and then assembled into a single script during the import process.

The included script is inserted at the location of the SCRP item, which it replaces. SCRP items may be nested. Since the SCRP item is replaced by the inserted script, the item has no SPS (Self-Protection Statement) of its own.


TABL: The TABL type allows entry of numeric or character data in tabular format. A typical TABL question asks the respondent how they would split up a numeric quantity, such as an amount of money. For example,
``If you were given $100, how would you spend it on the following topics? ... [Multiple topics follow]''

A summary value is stored in a single variable and the values entered in the rows of the table are stored in separate variables. The summary value is determined by the VARSTAT option. For example, it could be the average or the sum of the rows. If only character data are to be entered then the primary column is set equal to the number of cells with character data. In the above example, the VARSTAT would be set equal to 100.

If a TABL item is called QA and has 10 rows, the primary column will be named QA and the secondary columns will be named QA1, QA2, ....
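The relationship between the primary and secondary columns can be sketched in Python. The helper name and the option values "sum" and "mean" are assumptions for illustration; the actual values accepted by VARSTAT are not given in this section.

```python
from statistics import mean


def tabl_variables(name, rows, varstat="sum"):
    """Return the primary and secondary variables for a numeric table.

    The primary variable holds a summary of the rows; the secondary
    variables are named by appending 1, 2, 3, ... to the item name.
    """
    if varstat == "sum":
        summary = sum(rows)
    elif varstat == "mean":
        summary = mean(rows)
    else:
        raise ValueError("unsupported summary statistic")
    result = {name: summary}
    result.update({f"{name}{i}": v for i, v in enumerate(rows, start=1)})
    return result


# Made-up split of $100 over three topics.
print(tabl_variables("QA", [50, 30, 20]))
```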

Overall absolute limits may be set for all rows or individual rows.

Some or all rows can be redefined as buttons. If a button is pressed the table will be blanked and the primary column will hold the value associated with the button.

Refusals are entered using the SAS special missing value: .R. This is a decimal point followed by a capital R. Don't knows may be recorded using any code, but .K is suggested (see footnote 13).


TIME: GEIS automatically records the duration of the whole interview but not of parts of the interview. To time a part of the interview, TIME items may be inserted before and after the part to be timed. The TIME item records the interview duration up to the point when the item first becomes active. By subtracting the values of consecutive TIME items, the duration of parts of the script may be calculated.
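The subtraction of consecutive TIME values can be sketched in Python. The helper name and the assumption that the values are elapsed seconds are illustrative; the unit used by GEIS is not stated here.

```python
def segment_durations(time_values):
    """Return the differences between consecutive TIME item values."""
    return [later - earlier for earlier, later in zip(time_values, time_values[1:])]


# Made-up elapsed times recorded by three TIME items.
print(segment_durations([30, 95, 240]))
```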


Ross Corkrey 2006-02-14