CESSI: Institute for Comparative Social Research
Accuracy and professionalism, proven over time


SAMPLE DESIGN FOR THE PANEL

The universe covered in the Russian sample consists of the entire population aged 18 and over living permanently in Russian territory. Military bases, penal communities, and other institutionalised populations were excluded from the sample.

Although the sample was intended to represent the adult population of Russia aged 18 and over, the total population size was used for designing the sample, because demographic information on the age-restricted population was not available at the level of the primary sampling units (rayons). The population figures used as the basis for constructing the sample were estimated by the State Statistical Agency (GosKomStat) for January 1, 1992, based on the Census of 1989.

Primary sampling units and stratification

The primary sampling units (PSUs) in the RUSSET Panel were rayons (U.S. equivalent: counties). A rayon is a territorial and administrative unit in Russia. Although most rayons have at least one city or town and some suburban or rural areas, some are entirely rural and some are completely urban. The latter are called self-administered cities with the status of rayon. The population of the rayons varies from 1.4 thousand to 1.47 million inhabitants.

The rayon was selected as the primary sampling unit in the first stage of the RUSSET area probability sampling for the following reasons. A rayon is a unit with definite, well-known boundaries that rarely change. The number of rayons is manageable (about 2500). The internal composition of a rayon is rather heterogeneous (which is attractive for the purposes of survey research). And finally, information about the population is available at the rayon level from the state's statistical agency.

The division of the population into strata was done separately within 4 geographical zones (figure 1). To the extent possible, the sampling units assigned to a stratum were intended to be similar (homogeneous) in two major characteristics: population size and the size of the rayon center (usually the largest city or village in the rayon).

The major stratification objective was to assign primary units to groups (strata) of equal sizes (population). The number of primary sampling units per stratum varies from 2 to 180. Strata were independent and explicitly defined. In a few cases the strata populations deviated by more than 10 percent from the average of 2.8 million.

Each rayon falls into one of the 12 economic-geographical zones into which Russia is divided by its statistical agencies. In the first sampling stage these 12 zones were combined into 4 larger zones (hereafter simply called zones) in order to facilitate the construction of primary areas of equal size within each stratum of the PSUs. These 4 zones, presented in figure 1 and described in table 1, are:

Zone 1: Northern and Central parts of European Russia (North, North-West, Central, Volgo-Vyatski regions and Kaliningrad oblast).

Zone 2: Southern part of European Russia (Centralno-Chernozemniy, Povolzhie and Severo Kavkazski regions).

Zone 3: Ural and Western Siberia (Ural region and West Siberia region).

Zone 4: Eastern (East Siberia and Far East regions).

Table 1. Distribution of PSUs by Zone, with Population Size

| Area | Population in millions* | Nr. of strata** | Average size of PSU*** | Assigned number of interviews |
|---|---|---|---|---|
| Self-representing**** | | | | |
| Moscow | 8.9 | 1 | 8.9 | 231 |
| St. Petersburg | 5.0 | 1 | 5.0 | 154 |
| Non-self-representing | | | | |
| Zone 1 North and Central Europe | 40.2 | 14 | 2.9 | 1078 |
| Zone 2 South Europe | 41.9 | 14 | 3.0 | 1078 |
| Zone 3 Urals, Western Siberia | 35.6 | 13 | 2.8 | 1001 |
| Zone 4 East Siberia and Far East | 17.1 | 6 | 2.9 | 462 |
| Total | 148.7 | 49 | 2.9 | 4004 |

* Estimate for 01-01-1992 (GosKomStat).
** Equal to the number of selected PSUs.
*** Average size of PSU within each zone, in millions of people.
**** The self-representing zones (Moscow and St. Petersburg) are selected with a probability equal to 1, so the first stage of the sampling process can be skipped and the primary sampling unit is the electoral district within each city instead of a rayon. For these cities the table shows the total size of the zone instead of the size of a PSU.

Table 1 shows that the sample is spread across the four geographical zones proportionate to the population size in that zone. Thus the number of PSU’s in each region is directly determined by its population size.
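The arithmetic behind the last column of Table 1 can be checked with a short sketch. The populations and interview counts are copied from the table; the split of the 5 city PSUs into 3 for Moscow and 2 for St. Petersburg is an assumption inferred from 231 / 77 and 154 / 77, and the names `areas` and `assigned_interviews` are illustrative only.

```python
# Figures from Table 1. The PSU counts for the two cities are inferred
# (231 / 77 = 3, 154 / 77 = 2) and are an assumption of this sketch.
INTERVIEWS_PER_PSU = 77

areas = {
    "Moscow": {"population_mln": 8.9, "psus": 3},
    "St. Petersburg": {"population_mln": 5.0, "psus": 2},
    "Zone 1": {"population_mln": 40.2, "psus": 14},
    "Zone 2": {"population_mln": 41.9, "psus": 14},
    "Zone 3": {"population_mln": 35.6, "psus": 13},
    "Zone 4": {"population_mln": 17.1, "psus": 6},
}


def assigned_interviews(psus, per_psu=INTERVIEWS_PER_PSU):
    """Interviews assigned to an area: a fixed number per selected PSU."""
    return psus * per_psu


allocation = {name: assigned_interviews(a["psus"]) for name, a in areas.items()}
total_interviews = sum(allocation.values())

# Interviews per million inhabitants; roughly constant if the allocation
# is proportional to population size.
rates = {
    name: allocation[name] / a["population_mln"] for name, a in areas.items()
}

for name in areas:
    print(f"{name:15s} {allocation[name]:5d} interviews, "
          f"{rates[name]:.1f} per million")
print("Total:", total_interviews)
```

Under these assumptions the per-million rates stay within a narrow band across areas, which is what a proportionate allocation implies.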

First stage of the selection - PSU within strata

The goal of selection in the first stage was to choose one sampling unit from each stratum with probability proportional to size (PPS), using the 1992 population as the size measure.

Moscow, with a population of 8.9 million people, and St. Petersburg, with a population of about 5.0 million people, were automatically included in the sample and constituted 5 PSUs in total.

Each of the other 47 strata contained at least two primary sampling units, only one of which could become the stratum representative in the sample. Because each primary unit's selection probability was less than one, its choice was not assured, so it was a non-certainty selection (non-self-representing).

The other 47 PSUs were chosen from the total list of PSUs (rayons) within each of 4 geographical zones where the PSUs were ordered from large to small by the population size of the rayon centers. From each set of rayons containing 2.8 million people (strata) one rayon was selected to represent the set.
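The grouping of ordered rayons into strata of about 2.8 million people could be sketched as follows. The text does not state the exact rule used to close a stratum, so the greedy cut-off, the handling of leftover rayons, the function name `form_strata`, and the toy figures are all assumptions made for illustration.

```python
def form_strata(rayons, target=2_800_000):
    """Group rayons into strata of roughly `target` inhabitants.

    rayons: list of (name, population, center_population) tuples.
    Rayons are ordered from large to small by the size of the rayon
    center, then cut into consecutive groups once the cumulative
    population reaches the target (an assumed, greedy rule).
    """
    ordered = sorted(rayons, key=lambda r: r[2], reverse=True)
    strata, current, current_pop = [], [], 0
    for rayon in ordered:
        current.append(rayon)
        current_pop += rayon[1]
        if current_pop >= target:
            strata.append(current)
            current, current_pop = [], 0
    if current:
        # Leftover rayons that do not fill a stratum are merged into
        # the last one (assumption; the source gives no rule for this).
        if strata:
            strata[-1].extend(current)
        else:
            strata.append(current)
    return strata


# Toy data with a small target so the grouping is easy to follow.
toy_rayons = [
    ("A", 60, 50),
    ("B", 50, 40),
    ("C", 70, 30),
    ("D", 40, 20),
    ("E", 10, 5),
]
toy_strata = form_strata(toy_rayons, target=100)
for number, stratum in enumerate(toy_strata, start=1):
    names = [name for name, _, _ in stratum]
    print(number, names, sum(pop for _, pop, _ in stratum))
```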

As an example, in Zone 4 (East Siberia and Far East), four urban rayons, Krasnoyarsk (925,000), Vladivostok (674,000), Irkutsk (641,300) and Khabarovsk (625,900), were combined into one stratum with a total size of about 2.8 million. Using selection with probability proportional to population size (PPS), Khabarovsk was chosen to represent these four urban areas.

The primary sampling units (rayons) were thus selected independently in each geographical zone. The probability of selection for each rayon can be calculated by dividing the size of the PSU by the size of its stratum. The probability of selection for primary sampling units varies from 0.00044 (the minimum size of a primary sampling unit divided by the maximum size of a stratum, p = 1.5/3377, with both sizes in thousands) to 1.00 (for the self-representing areas of Moscow and St. Petersburg).
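A minimal sketch of a PPS draw of one rayon from a stratum, using the four urban rayons of the Zone 4 example. The cumulative-total method shown here is one common way to implement PPS and is not confirmed by the text as the authors' procedure; the function names and the random seed are illustrative. Under PPS, a rayon's selection probability is its population divided by the stratum population.

```python
import random

# Populations of the four urban rayons from the Zone 4 example.
stratum = {
    "Krasnoyarsk": 925_000,
    "Vladivostok": 674_000,
    "Irkutsk": 641_300,
    "Khabarovsk": 625_900,
}


def selection_probabilities(stratum):
    """PPS selection probability of each PSU: PSU size / stratum size."""
    total = sum(stratum.values())
    return {name: size / total for name, size in stratum.items()}


def pps_select_one(stratum, rng):
    """Draw one PSU with probability proportional to size.

    A random point is placed on the cumulative population scale and the
    PSU whose interval contains the point is selected.
    """
    total = sum(stratum.values())
    point = rng.random() * total
    cumulative = 0
    for name, size in stratum.items():
        cumulative += size
        if point < cumulative:
            return name
    return name  # guard against floating-point rounding at the upper end


probs = selection_probabilities(stratum)
chosen = pps_select_one(stratum, random.Random(1992))

for name, p in probs.items():
    print(f"{name:12s} {p:.4f}")
print("Selected:", chosen)

# The smallest probability quoted in the text: 1.5 / 3377.
min_probability = 1.5 / 3377
```

With these figures Khabarovsk has a selection probability of a little under 0.22; which rayon the sketch actually draws depends on the seed.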

Second stage of the selection - cities/towns/villages within PSU

The second stage of the sample involved the selection of municipalities (cities/towns) and rural communities (villages) within the 47 PSUs (rayons).

From the list of all areas (towns and villages) in each rayon the necessary number of areas was selected with probability proportional to the population size using a systematic random selection approach. The number of the selected areas (towns/villages) depended on the population size of the areas in the rayon.

Because the plan was to conduct 77 interviews in each chosen PSU, with about 15 interviews in each election district (the sampling unit at the third stage of selection), 5 election districts had to be chosen per PSU as sampling units for the third stage. The step size for the systematic selection was therefore set to the total population size of the rayon divided by 5. As a result, the number of towns/villages selected in each rayon varied from 1 to 5: it was 1 when the rayon consisted of only 1 area (town) and 5 when the rayon consisted of many areas of rather small size. This selection method results in a distribution of interviews proportional to the urban and rural population in each rayon.
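The systematic selection with a step of one fifth of the rayon population can be sketched as below. The random start within the first step, the function name `systematic_pps`, and the toy rayons are assumptions; the text only specifies the step size and the PPS principle. An area hit more than once would receive that many election districts.

```python
import random


def systematic_pps(areas, n_draws, rng):
    """Systematic PPS selection of areas within one rayon.

    areas: list of (name, population) in list order.
    Returns {name: number of hits}; the hits add up to n_draws.
    """
    total = sum(pop for _, pop in areas)
    step = total / n_draws              # rayon population divided by 5
    start = rng.random() * step         # random start (assumption)
    points = [start + k * step for k in range(n_draws)]

    hits = {}
    cumulative = 0
    idx = 0
    for name, pop in areas:
        cumulative += pop
        # Every selection point that falls inside this area's interval
        # on the cumulative population scale counts as one hit.
        while idx < n_draws and points[idx] < cumulative:
            hits[name] = hits.get(name, 0) + 1
            idx += 1
    if idx < n_draws:
        # Floating-point guard: assign any remaining point to the last area.
        last = areas[-1][0]
        hits[last] = hits.get(last, 0) + (n_draws - idx)
    return hits


rng = random.Random(5)

# A rayon consisting of a single town: all 5 hits fall in that town.
one_town = systematic_pps([("Town", 50_000)], 5, rng)

# A rayon of ten equally small villages: 5 different villages are hit.
villages = systematic_pps(
    [(f"Village {i}", 1_000) for i in range(1, 11)], 5, rng
)

print(one_town)
print(villages)
```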

In total 50 urban areas (cities/towns) and 69 rural areas (villages) were selected across Russia.

Third stage of the selection - electoral districts

In each selected area (town or city) the necessary number of electoral districts was randomly selected from the total list of electoral districts in the area. The probability of selection was proportional to the size of the electoral district. In total 239 electoral districts were selected.

Self-representing areas. In the self-representing cities of Moscow and St. Petersburg, electoral districts served as the primary sampling units. All electoral districts were listed separately within each of the 9 administrative areas of Moscow and the 13 administrative areas of St. Petersburg. Then the necessary number of electoral districts was selected from each area with a probability proportional to the size of the electoral district. In total 26 election districts were selected in the self-representing cities.

Fourth stage - selection of households

The population represented by the household samples included persons living in housing units within the Russian Federation. Family and student dormitories were included in the sample. Excluded from the household surveys were military reservations, monasteries, hospitals, rest or convalescent homes, homes for the aged, and rooms in hotels and motels.

The selection of households was different for urban and rural areas.

In the urban areas, after the electoral districts had been selected, interviewers were sent to compile a list of all apartments in each housing unit of the districts, a map of the investigated area showing the location of all housing, and a description of the type of dwellings and building usage (residential or non-residential). All this information about the selected electoral districts was sent to the central office to create a database of addresses. Each line in the database contained the ID of the city, the number of the electoral district, and the household address (street, house number, and apartment number). The necessary number of households was then selected at random. These addresses were sent back to the interviewers together with survey materials and instructions.

In the rural areas a random route procedure was used for selecting the households. This means that the interviewer was given a starting point in a village (electoral district) for the first household to visit. Then the next address was determined by skipping a fixed number of house doors going to the right from the visited house. The number of house doors to be skipped was determined by the average response rate in the given area and the assigned number of interviews.
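The random route walk itself can be sketched as follows. The text does not give the formula that turns the response rate and the assigned number of interviews into a skip interval, so `skip` is simply an input here; the function name `random_route` and the list of doors are hypothetical.

```python
def random_route(doors, start_index, skip, n_addresses):
    """Walk a route of house doors, skipping a fixed number between visits.

    doors: house doors in the order met when walking to the right.
    start_index: position of the assigned starting household.
    skip: number of doors skipped between two visited households.
    n_addresses: number of households to visit.
    """
    selected = []
    index = start_index
    while index < len(doors) and len(selected) < n_addresses:
        selected.append(doors[index])
        index += skip + 1  # pass over `skip` doors, visit the next one
    return selected


# Hypothetical village with 30 doors along the route.
doors = [f"House {n}" for n in range(1, 31)]
route = random_route(doors, start_index=0, skip=2, n_addresses=5)
print(route)
```

The sketch stops early if the route runs out of doors; how interviewers continued in that case is not described in the text.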

An equal number of interviews was to be conducted in each PSU. The assigned number of effective interviews in one PSU was 77, which gives about 4000 effective interviews in total. The number of households selected was defined as the number of effective interviews divided by the average response rate in the particular stratum (based on previous experience). This procedure eliminates the need to weight the sample to adjust for different response rates in different regions.

In total 5225 households were selected assuming the average response rate of 76 percent.
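Reaching a target number of completed interviews at a given response rate requires issuing the target divided by that rate. The sketch below applies this to one PSU; the function name `households_to_select` and the rounding up are assumptions. Because the actual rates differed by stratum, it does not reproduce the reported total of 5225 exactly.

```python
import math


def households_to_select(effective_interviews, response_rate):
    """Households to issue so that the expected number of completed
    interviews equals `effective_interviews`."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(effective_interviews / response_rate)


# One PSU with 77 assigned interviews at the average response rate of 76%.
per_psu = households_to_select(77, 0.76)
print(per_psu)
```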

Fifth stage - selection of respondent within a household

In the fifth stage of sampling, a particular respondent is selected within each household using Kish's standard procedure. The interviewer first lists all members of the household aged 18 and over: first all men from oldest to youngest, then all women, also by age. All adults living in that household are listed, including any relatives or people who rent rooms in the household. After this table has been completed, a respondent is selected on the basis of the standard rules of Kish (1965).
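The listing order used for the Kish procedure can be sketched as below. The selection row passed to `select_respondent` is a hypothetical example of the kind of rule pre-assigned to a questionnaire, not a table taken from this text; in practice the published Kish tables would be used. The function names, the household, and the age threshold of 18 and over (taken from the definition of the survey universe) are assumptions of the sketch.

```python
def kish_listing(members):
    """List adult household members: men from oldest to youngest,
    then women from oldest to youngest."""
    adults = [m for m in members if m["age"] >= 18]
    men = sorted((m for m in adults if m["sex"] == "M"),
                 key=lambda m: m["age"], reverse=True)
    women = sorted((m for m in adults if m["sex"] == "F"),
                   key=lambda m: m["age"], reverse=True)
    return men + women


def select_respondent(members, selection_row):
    """Pick the respondent from the listing.

    selection_row maps the number of adults to the line number (1-based)
    of the person to interview. Households larger than the largest key
    use the rule for the largest key.
    """
    listing = kish_listing(members)
    if not listing:
        return None
    n_adults = min(len(listing), max(selection_row))
    return listing[selection_row[n_adults] - 1]


# Hypothetical household and hypothetical selection row.
household = [
    {"name": "father", "sex": "M", "age": 50},
    {"name": "mother", "sex": "F", "age": 48},
    {"name": "son", "sex": "M", "age": 20},
    {"name": "daughter", "sex": "F", "age": 16},
]
example_row = {1: 1, 2: 2, 3: 2, 4: 3, 5: 4, 6: 4}

listing = kish_listing(household)
respondent = select_respondent(household, example_row)
print([m["name"] for m in listing], "->", respondent["name"])
```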

THE SAMPLE DESIGN FOR NEW PANEL (WAVE 5)

Wave 5 was dedicated to collecting about 2000 new respondents in order to restore the initial panel sample size of 4000 people (in the preceding wave the sample size was 2073 people).

According to the results of the previous waves, attrition did not generate a systematic bias with respect to demographics. That is, we obtained more or less similar distributions for the 4 samples with respect to the basic demographic variables (except age, where the sample ages naturally). We therefore decided to draw an independent random sample for Wave 5, so that the new sample can be analyzed separately as an independent sample and, at the same time, the two samples can be combined into one for analysis of the total sample of 4000.

The logic of the sample design was identical to the design of the original sample.