6 GEIS Data Sets

The DATACATI library contains a number of data sets and a catalogue, as shown in Table 2. CONFID is the only data set created by the user; all the others are created by GEIS.


Table 2: GEIS data sets and catalogues.
Name      Type       Content
CONFID    Data set   Respondent contact details
SCRIPT    Data set   Script questions and logic
ANSWERS   Data set   Answers given by respondents
CONTROL   Data set   Interview control information
INTRVS    Data set   Interviewer details
ILOG      Data set   Interviewer log
PROTECT   Data set   Interview protection statuses
COMBRESP  Data set   Data set used during interviews
CALLS     Data set   Chronological call data
DEFINIT   Catalogue  General project information

The CONFID Data Set

The CONFID data set must exist before GEIS is run. It is usually created during survey sample allocation and contains the respondent contact information, including the respondents' names, telephone numbers, and addresses. Deleting all copies of the CONFID data set after a survey has finished removes the respondents' identifying information from the project.

Each row in the data set corresponds to one respondent. Some variables are mandatory, but the data set may also contain any number of other variables. The mandatory variables are shown in Figure 4 and must have appropriate values set before compilation. All the mandatory variables are copied from CONFID to the CONTROL data set when the script is compiled, which allows their values to be preset when the survey starts.

Figure 4: CONFID structure.
[Figure truncated in the source: an alphabetic list of the CONFID variables. Only one row survives: 3 STDPHONE Char 20 80]

The variables in the CONFID data set are described below.

An example of a CONFID data set is shown in Table 3.


Table 3: An example of a CONFID data set.
ID      FULLNAME          STDPHONE      ELIGIBLE  SELECTED  START
123456  George Eliot (1)  02 43216789   0         1         1NOV2003
123457  Emily Brontë (2)  03 1234 5678  0         0         .
123458  Jane Austen (3)   04 9876 5432  33        1         .
123459  Mary Shelley (4)  06 1234 4321  0         1         .
(1) This case will not be interviewed until after November 1, 2003.
(2) This case will not be interviewed because SELECTED is set to zero.
(3) This case will not be interviewed because ELIGIBLE is non-zero.
(4) An interview will be attempted with this case.
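
The selection rules implied by the notes to Table 3 can be sketched in a few lines. The following Python fragment is an illustration only and is not GEIS code: the function name is_available, the dictionary layout, and the reading of START as "available only on days after this date" are assumptions. A missing START (shown as . in the table) is represented by None.

```python
from datetime import date

def is_available(case, today):
    """Return True if an interview may be attempted with this case today.

    Hypothetical helper that mirrors the notes to Table 3, not GEIS itself.
    """
    if case["SELECTED"] == 0:       # case was not selected for interviewing
        return False
    if case["ELIGIBLE"] != 0:       # any non-zero value marks the case ineligible
        return False
    start = case["START"]           # None stands for the SAS missing value '.'
    if start is not None and today <= start:
        return False                # assumption: available only after START
    return True

# The four rows of Table 3, reduced to the variables that drive selection.
cases = [
    {"ID": 123456, "ELIGIBLE": 0, "SELECTED": 1, "START": date(2003, 11, 1)},
    {"ID": 123457, "ELIGIBLE": 0, "SELECTED": 0, "START": None},
    {"ID": 123458, "ELIGIBLE": 33, "SELECTED": 1, "START": None},
    {"ID": 123459, "ELIGIBLE": 0, "SELECTED": 1, "START": None},
]

today = date(2003, 10, 15)
available = [c["ID"] for c in cases if is_available(c, today)]
print(available)  # [123459]
```

Evaluated on 15 October 2003, only case 123459 qualifies. Case 123456 would join it once the date has passed 1 November 2003.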




The CONTROL Data Set

The CONTROL data set is created by GEIS and is used to control the conduct of interviews. Important variables are shown in Figure 5. Some variables are copied from the CONFID data set during compilation. Some further explanatory notes on the CONTROL variables are given below.

Figure 5: Major variables in CONTROL.
[Figure truncated in the source: a listing with the columns Variable, Type, Len, Format, and Label. Surviving fragments: a row beginning "DISTA...", a label ending "...spondent telephone number", and the row SURVTYPE Char 6 Survey type]

The ID, STDPHONE, ELIGIBLE, SELECTED, and START variables are copied from the CONFID data set, as described in Section 6.

The STATUS variable indicates the interviewing status of a case. It holds a two-character code defined by the $statfmt. format in the file \GEIS\BIN\FORMATS.SAS. Important status codes are shown in Table 4; the full list can be found in the same file. See Section 9.1.


Table 4: Selected status codes.
Code Meaning
AM Answering machine
CB Callback arranged
CQ Completed questionnaire
DO Dropped part-way
DR Dropped before starting
DT Disconnected tone
ER Error condition
ET Engaged tone
FM Fax machine
NA No attempt made to contact
OS Out of scope -- Ineligible
PQ Partly-completed
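
When STATUS values are processed outside SAS, the codes in Table 4 can be held in a simple lookup table. The sketch below is illustrative Python and not part of GEIS: the names STATUS_CODES and describe_status are hypothetical, and only the codes listed in Table 4 are included, not the full list defined by the $statfmt. format.

```python
# Selected status codes, copied from Table 4 (not the full $statfmt. list).
STATUS_CODES = {
    "AM": "Answering machine",
    "CB": "Callback arranged",
    "CQ": "Completed questionnaire",
    "DO": "Dropped part-way",
    "DR": "Dropped before starting",
    "DT": "Disconnected tone",
    "ER": "Error condition",
    "ET": "Engaged tone",
    "FM": "Fax machine",
    "NA": "No attempt made to contact",
    "OS": "Out of scope -- Ineligible",
    "PQ": "Partly-completed",
}

def describe_status(code):
    """Return the meaning of a two-character status code.

    Hypothetical helper: codes not in the table are returned unchanged.
    """
    return STATUS_CODES.get(code, code)

print(describe_status("CB"))  # Callback arranged
```

Returning unlisted codes unchanged is a design choice for this sketch, so that a code missing from the partial table stays visible instead of raising an error.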




The ANSWERS Data Set

The ANSWERS data set is created by GEIS and stores the responses to interview questions. Its exact structure depends on the script, but the first variable is always ID.


The PROTECT Data Set

The PROTECT data set is created by GEIS and stores the protection statuses of items in the script. Its exact structure depends on the script, but the first variable is always ID.


The SCRIPT Data Set

The SCRIPT data set is created by GEIS when a script file is imported and stores the script contents. Its structure matters only to users who write programs that read the script contents directly.


The CALLS Data Set

The CALLS data set is created by GEIS and stores a chronological listing of all calls made during the project.


The ILOG Data Set

The ILOG data set is created by GEIS and stores a chronological listing of most events that occur while GEIS is running. These include the results of importing and compiling scripts, as well as the outcomes of all interviews conducted by all interviewers.


The INTRVS Data Set

The INTRVS data set is created by GEIS. It stores monitoring information relating to interviewers, such as the total time logged in, the total time spent interviewing, and the interviewer's password. The compiler creates it by copying the master copy held in the COMMON library.


The COMBRESP Data Set

The COMBRESP data set is created by GEIS. It is created by merging the CONTROL and ANSWERS data sets. It is used as a template for a local data set when an interview starts.
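
Conceptually, the merge combines each case's control variables with its answer variables. The Python sketch below illustrates the idea under the assumption that the two data sets are matched on ID, which the text does not state explicitly. The function merge_by_id and the answer variables Q1 and Q2 are hypothetical, and the real merge is performed by GEIS in SAS.

```python
def merge_by_id(control_rows, answer_rows):
    """Combine rows that share an ID, giving one output row per control row.

    Hypothetical helper: assumes ID is the matching key in both inputs.
    """
    answers_by_id = {row["ID"]: row for row in answer_rows}
    merged = []
    for ctl in control_rows:
        combined = dict(ctl)                                # control variables first
        combined.update(answers_by_id.get(ctl["ID"], {}))   # then answer variables
        merged.append(combined)
    return merged

# Made-up rows for illustration; None stands for a not-yet-answered question.
control = [
    {"ID": 123456, "STATUS": "CB"},
    {"ID": 123459, "STATUS": "NA"},
]
answers = [
    {"ID": 123456, "Q1": 2, "Q2": None},
    {"ID": 123459, "Q1": None, "Q2": None},
]

combresp = merge_by_id(control, answers)
for row in combresp:
    print(row)
```

Each resulting row carries both the control and the answer variables for one case, which is the shape a per-interview template would need.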


The DEFINIT Catalogue

The DEFINIT catalogue is created by GEIS. It is used to store project settings and the compiled version of the script.



Ross Corkrey 2006-02-14