In September 2021, Statistics South Africa (Stats SA) implemented new samples for the following monthly surveys:
New samples for these surveys are implemented annually, although no new samples were drawn in 2020 because of the disruption caused by COVID-19.
The impact of each new sample is explained in its corresponding statistical release (reference month July 2021 and publication month September 2021). This note provides a summary of the new samples drawn in 2021 as well as the history of new samples.
Each sample is drawn from Stats SA’s business register, which Stats SA maintains using information provided by the South African Revenue Service. Maintenance of the business register includes changes related to new businesses, ceased businesses, merged businesses, and classification. Stats SA undertakes quality improvement investigations to keep its information up to date regarding the structures and activities of large and complex businesses. It is important, therefore, to draw new samples to keep surveys up to date with changes in the business register.
For each of the surveys listed above, the new samples were run in parallel with the old samples for three months, namely April, May and June 2021. This is the same methodology that has been in use for many years. Comparisons between new and old samples are provided in Table 1. For example, in 2021 the level of manufacturing sales was 1,6% higher based on the new sample, and the level of food and beverages income was 6% lower based on the new sample. The differences ranged from 5,9% (wholesale) to ‑6% (food and beverages). Table 1 shows the history of level changes between old and new samples going back to 2014.
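The kind of comparison reported in Table 1 can be sketched in a few lines. This is only an illustration: the note does not state exactly how the three parallel months are combined, so summing them is an assumption, and the function name and all figures below are hypothetical.

```python
def level_difference_pct(old_values, new_values):
    """Percentage by which the new-sample level differs from the old-sample level.

    Assumption: the parallel months are combined by summing them; the note
    does not specify the exact aggregation.
    """
    if len(old_values) != len(new_values):
        raise ValueError("both samples must cover the same months")
    old_total = sum(old_values)
    new_total = sum(new_values)
    return (new_total / old_total - 1.0) * 100.0


# Hypothetical estimates for April, May and June (illustrative numbers only).
old_sample = [100.0, 102.0, 98.0]
new_sample = [105.0, 108.0, 105.0]

diff = level_difference_pct(old_sample, new_sample)
print(f"{diff:.1f}%")  # 6.0%
```

A positive result means the new sample gives a higher level than the old sample over the same months; a negative result means a lower level.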
COVID-19 had a severe impact on the economy, and this was clearly shown in Stats SA’s monthly business cycle indicators, which dipped sharply in April 2020. The COVID-19 effect is picked up in Stats SA’s surveys by comparing different periods, e.g. April 2020 compared with March 2020 (month-on-month), or April 2020 compared with April 2019 (year-on-year). Note that this is very different from the comparisons shown in Table 1, which are comparisons between samples, not comparisons between periods. The percentage differences in Table 1 all refer to different samples run in parallel over the same period, namely April to June of each year.
For most users, the most useful and interesting statistics from monthly business cycle indicators are the growth rates rather than the actual values of sales or income. Putting the old series together with the new series without linking the two would distort growth rates, whether the growth rates are calculated month-on-month or year-on-year, at current or constant prices, or using seasonally adjusted or unadjusted data. The greater the difference between the two samples, the greater the distortion in growth rates if there is no linking.
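The distortion can be shown with a small, purely hypothetical example. Both samples below describe an economy with no growth at all, but the new sample sits at a level 5% higher. Joining the two without linking produces growth that is entirely an artefact of the change in sample.

```python
def growth_pct(series, lag=1):
    """Percentage change of each value relative to the value `lag` periods earlier."""
    return [(series[i] / series[i - lag] - 1.0) * 100.0
            for i in range(lag, len(series))]


# Hypothetical data: 12 months from the old sample, then 12 from the new one.
old_part = [100.0] * 12   # old sample, flat
new_part = [105.0] * 12   # new sample, also flat, but at a 5% higher level
spliced = old_part + new_part   # joined WITHOUT linking

mom = growth_pct(spliced, lag=1)    # month-on-month growth
yoy = growth_pct(spliced, lag=12)   # year-on-year growth

print(round(mom[11], 1))                 # 5.0 in the month the sample changes
print(sorted({round(g, 1) for g in yoy}))  # [5.0] for twelve consecutive months
```

The month-on-month rate is distorted once, at the break, while the year-on-year rate is distorted for a full year, even though neither sample shows any real growth.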
Consequently, Stats SA revises the historical data by linking the old series to the new series using a simple formula:
Linking produces a continuous series that joins the revised historical estimates to the new estimates, so that growth rates are not distorted by the change in sample.
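One common way to link two series is to rescale the old estimates by the ratio of new to old levels over the parallel period. The sketch below assumes that approach; it is not necessarily the exact formula used by Stats SA, and the function name, month labels and figures are all hypothetical.

```python
def link_series(old_series, new_series, parallel_months):
    """Rescale old-sample estimates to the level of the new sample.

    Assumption: a single ratio (new / old over the parallel months) is applied
    to every earlier old-sample estimate. Inputs are dicts of month -> value.
    """
    new_total = sum(new_series[m] for m in parallel_months)
    old_total = sum(old_series[m] for m in parallel_months)
    factor = new_total / old_total

    # Revised history: old estimates before the parallel period, rescaled.
    linked = {m: v * factor for m, v in old_series.items()
              if m not in parallel_months}
    # From the parallel period onwards, use the new-sample estimates as they are.
    linked.update(new_series)
    return linked, factor


# Hypothetical estimates (illustrative numbers only).
old = {"2021-01": 100.0, "2021-02": 102.0, "2021-03": 104.0,
       "2021-04": 100.0, "2021-05": 100.0, "2021-06": 100.0}
new = {"2021-04": 106.0, "2021-05": 106.0, "2021-06": 106.0,
       "2021-07": 108.0}

linked, factor = link_series(old, new, ["2021-04", "2021-05", "2021-06"])
print(round(factor, 2))             # 1.06
print(round(linked["2021-01"], 2))  # 106.0
```

Because every earlier estimate is multiplied by the same factor, growth rates within the revised history are unchanged, and the step from the last revised month to the first new-sample month no longer reflects the difference in level between the samples.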
Figure 1 – Wholesale trade levels
Figure 1 illustrates the effect of new samples in terms of levels, measured at current prices. Wholesale trade sales are shown for three different samples, namely 2018, 2019, and 2021. The 2018 sample runs until June 2019, because that is the last month in which it was used. The 2019 sample lies below the 2018 sample, which is consistent with the level difference of ‑3,1% shown for 2019 in Table 1. Normally the 2019 sample would have run until June 2020, but COVID-19 prevented the implementation of new samples in 2020, so the sample was extended to June 2021. The 2021 sample lies above the 2019 sample, again consistent with Table 1 (a level difference of 5,9% in 2021).
Note that in Figure 1 the estimates based on the 2021 sample begin in April 2021, and yet the 2021 time series is shown all the way back to 2015. That is because the estimates before April 2021 are the revised series, i.e. the old series linked to the new series using the linking formula.
Figure 2 shows wholesale trade annual growth rates corresponding to the three different samples, all based on estimates at constant prices. The linking described above is performed at current prices, and the linked series is then deflated to arrive at constant prices. The effect of linking carries through to constant prices automatically, with two important results for growth rates. First, the growth rates are not distorted by the introduction of new samples. Second, the newly calculated growth rates are very close to those published previously. Small differences in growth rates are partly the result of data cleaning processes necessitated by technical issues such as the late receipt of data from respondents, the replacement of imputed values with actual values, and changes in respondents’ reporting structures.
Figure 2 – Wholesale trade growth rates
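Why linking at current prices carries through to constant prices can be illustrated as follows. The sketch assumes that deflation means dividing each current-price value by a price index (base 100) and that linking multiplies the history by a single factor; the price index and sales figures are hypothetical, while the factor of 1,059 corresponds to the 5,9% wholesale level difference for 2021.

```python
def deflate(current_values, price_index):
    """Convert current-price values to constant prices using an index with base 100."""
    return [v / p * 100.0 for v, p in zip(current_values, price_index)]


def growth_pct(series):
    """Period-on-period percentage change."""
    return [(b / a - 1.0) * 100.0 for a, b in zip(series, series[1:])]


# Hypothetical current-price sales and price index (illustrative numbers only).
current = [200.0, 210.0, 231.0]
prices = [100.0, 102.0, 105.0]

factor = 1.059  # 5,9% level difference (wholesale, 2021)
linked_current = [v * factor for v in current]

before = growth_pct(deflate(current, prices))
after = growth_pct(deflate(linked_current, prices))

# The constant-price growth rates are the same with and without the factor.
print(all(abs(a - b) < 1e-9 for a, b in zip(before, after)))  # True
```

The factor scales every deflated value by the same proportion, so it cancels out of each growth rate.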
Figures 3 and 4 show the results for retail trade sales, and they work in the same way as Figures 1 and 2. In Figure 3, note that the sharp upward spikes in retail trade are the high festive season sales that we see every December (i.e. the data are not seasonally adjusted). Figure 3 also shows a sharp downward spike in April 2020, which is the impact of COVID‑19 and lockdown.
Figure 3 – Retail trade levels
In Figure 4 we see once again that new samples make very little difference to growth rates, which is a result of linking the old series to the new series.
Figure 4 – Retail trade growth rates
Figure 5 shows manufacturing sales based on the three different samples for 2018, 2019, and 2021. The 2021 sample lies above the 2019 sample; the 2019 sample lies below the 2018 sample; and the 2021 and 2018 samples are very close, because in Table 1 the level differences of ‑1,4% (2019) and 1,6% (2021) are largely offsetting.
Note that the sharp upward spikes around October / November of each year are the seasonally high manufacturing sales ahead of the festive season. Figure 5 also shows the large economic impact of COVID‑19 in April 2020. A seasonally adjusted version of Figure 5 would no longer show the upward spikes around October / November (because these are seasonal effects), but the downward spike in April 2020 would remain.
Figure 5 – Manufacturing sales levels
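The statement that the manufacturing level differences of ‑1,4% (2019) and 1,6% (2021) are largely offsetting can be checked with a short calculation. It assumes that successive level differences compound multiplicatively, which holds if each new sample rescales the previous series by a single ratio.

```python
diff_2019 = -1.4  # % level difference, 2019 sample versus 2018 sample
diff_2021 = 1.6   # % level difference, 2021 sample versus 2019 sample

# Compound the two changes to approximate the 2021 level relative to 2018.
net = ((1 + diff_2019 / 100) * (1 + diff_2021 / 100) - 1) * 100
print(f"{net:.2f}%")  # 0.18%
```

Under that assumption the 2021 sample lies only about 0,2% above the 2018 sample, which is why the two lines in Figure 5 are very close.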
Figure 6 shows annual growth rates for manufacturing production, calculated from the manufacturing production index. Note that no linking is required for the production index when new samples are introduced, because although manufacturing sales are an important input for calculating production, the level of the production index is anchored at its base-year average of 100 in 2015.
Figure 6 – Manufacturing production growth rates
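The reason an index anchored at a base-year average of 100 needs no linking can be illustrated with a simplified sketch. The actual production index is compiled from more inputs than shown here; this example isolates only the anchoring step, and the function name and figures are hypothetical.

```python
def to_index(values, base_positions):
    """Express values as an index whose average over the base period is 100."""
    base_avg = sum(values[i] for i in base_positions) / len(base_positions)
    return [v / base_avg * 100.0 for v in values]


# Hypothetical series: the first four observations form the base period.
old_level = [80.0, 100.0, 120.0, 100.0, 110.0, 121.0]
new_level = [v * 1.016 for v in old_level]  # same series at a 1,6% higher level

index_old = to_index(old_level, base_positions=range(4))
index_new = to_index(new_level, base_positions=range(4))

# A uniform change in level leaves the anchored index unchanged.
print(all(abs(a - b) < 1e-9 for a, b in zip(index_old, index_new)))  # True
print(round(sum(index_old[:4]) / 4, 1))  # 100.0
```

A proportional shift in level affects the numerator and the base-period average equally, so it disappears when the index is formed.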
New samples are implemented annually to keep monthly surveys up to date with changes in Stats SA’s business register, which is based on information provided by the South African Revenue Service. New samples typically result in changes in levels of sales and income, which would distort growth rates if the estimates based on different samples were not linked. Consequently, Stats SA runs parallel surveys to measure differences in levels caused by new samples, and uses this information to revise historical series such that the old series can be linked to the new series with no artificial breaks. Linking has two important results for growth rates. First, growth rates are not distorted by changes in sample, and second, newly calculated growth rates are very close to growth rates previously published.