10. Evaluation of coverage studies

10.1 Reverse Record Check

10.1.1 Introduction

The results of the largest coverage study, the Reverse Record Check (RRC), can be evaluated by comparing its estimates with data on the same characteristics from other sources, such as the 2011 Census database and administrative data. These comparisons serve both to evaluate the RRC estimates and to quantify conceptual and measurement differences between the sources.

Despite some conceptual differences between the RRC and the 2011 Census, the RRC estimate of persons enumerated in the 2011 Census can be compared with the census count. To make the two figures comparable, certain adjustments were made in the census count before the comparison was carried out.

Estimates of the components of intercensal growth can be compared with estimates from other sources. In particular, the RRC estimate of the number of persons who died between the 2006 Census and the 2011 Census can be compared with the count from vital statistics files. Estimates of net interprovincial migration based on Canada Revenue Agency data can be compared with RRC estimates. However, strict comparisons for this characteristic are impossible, since adequate adjustments for conceptual differences cannot be made. Lastly, RRC estimates of population growth components can be compared with similar estimates from administrative data.

10.1.2 Comparisons with census counts

Since the RRC's single-stage stratified sampling design produces unbiased estimators, differences between RRC estimates and census counts are due to sampling error in the RRC estimates, conceptual differences between the two sources, and/or systematic biases in the two sources, which result in an underestimate or overestimate of the characteristic being studied.

10.1.2.1 Persons enumerated

Provincial and national comparisons are presented in Table 10.1.2.1, along with the standard error of the RRC estimate and the t-value for testing the hypothesis that there is no difference between the RRC estimate and the comparable census count. The following adjustments were made to the published census counts to account for conceptual differences between the two sources:

  • Adjustments based on the Dwelling Classification Survey were excluded because, while they were included in the census counts, they were not part of the RRC estimate of enumerated persons.
  • The estimate of 2011 Census overcoverage was subtracted, because the census database contained overcovered persons whereas the RRC estimate was based on the number of unique persons enumerated (and not on the number of enumerations).
  • The estimate of the number of persons living outside Canada five years earlier (excluding intercensal immigrants and non-permanent residents) from the National Household Survey was also subtracted, because the RRC estimates did not include the majority of these persons.

Nationally, the RRC estimate of the number of persons enumerated in the 2011 Census was slightly higher (0.12%) than the comparable 2011 Census count. In the 2006 Census, the RRC estimate was slightly lower than the census count (-0.03%). In the 2001 Census, the RRC overestimated the census count by 0.07%, and in 1996, the RRC underestimated the census by 0.08%.

Provincially, Manitoba had the largest difference (t-value of 3.33); the RRC estimate of the number of persons enumerated exceeded the comparable census count by 31,894. This difference is statistically significant, and Manitoba was the only province with a statistically significant difference. Significant differences were observed in previous RRCs as well.

The most significant differences were investigated to make sure that there was no bias in the RRC classification (including, for example, province of residence on Census Day). Other factors may also play an important role. Apart from sampling error, biases in the adjustments (e.g., returning Canadians) applied to the published census count to obtain a conceptually comparable figure may be responsible for the observed differences. RRC non-response bias may also have had an impact, since the non-response adjustment was designed to obtain the best result for estimating missed persons rather than enumerated persons.

Regular checks and quality controls were performed for all steps in the RRC. In view of the significant difference for Manitoba, a more detailed investigation was conducted to ensure that the operations and the estimates were not affected by any of the above-mentioned errors or problems. No such errors or problems were detected.
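The t-value test described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the 1.96 cut-off is the usual two-sided 95% threshold, and the standard error shown is merely the value implied by the rounded figures quoted for Manitoba (31,894 and 3.33), not a figure taken from Table 10.1.2.1.

```python
def t_value(estimate: float, comparison_count: float, standard_error: float) -> float:
    """t-statistic for the hypothesis of no difference between the two figures."""
    return (estimate - comparison_count) / standard_error


def implied_standard_error(difference: float, t: float) -> float:
    """Back out the standard error from a published difference and t-value."""
    return difference / t


# Figures quoted in the text for Manitoba.
manitoba_difference = 31_894  # RRC estimate minus comparable census count
manitoba_t = 3.33

# Approximate only, because the published t-value is rounded.
manitoba_se = implied_standard_error(manitoba_difference, manitoba_t)

# Assumed two-sided 95% cut-off of 1.96.
is_significant = abs(manitoba_t) > 1.96

print(f"implied standard error: about {manitoba_se:,.0f}")
print(f"significant at 95%: {is_significant}")
```

The same `t_value` function applies to any province once the RRC estimate, the adjusted census count and the standard error are read from the table.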

10.1.3 Comparison with population estimates

10.1.3.1 Deceased persons

Table 10.1.3.1 provides a comparison of the estimated number of persons who died during the intercensal period (May 16, 2006, to May 9, 2011) by RRC province of classification and counts from vital statistics files. At the national level, the RRC estimate exceeded the vital statistics count by 15,063 (1.3%). The largest relative difference was in Newfoundland and Labrador: -1,905 / 22,438, or -8.5%. In absolute value terms, the differences ranged from 0.7% to 8.5%. None of these differences is statistically significant. In t-value terms, the highest values were observed in British Columbia (1.23), where the RRC estimate was higher than the vital statistics count, and Newfoundland and Labrador (-1.09), where the RRC estimate was lower than the vital statistics count. All other estimates were well within one standard error of difference.
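As a quick check of the relative difference quoted for Newfoundland and Labrador, the ratio can be computed directly. The function name is hypothetical, and treating 22,438 as the reference count is an assumption based on how the ratio is written in the text.

```python
def relative_difference_pct(difference: float, reference: float) -> float:
    """Difference expressed as a percentage of the reference count."""
    return 100.0 * difference / reference


# Figures quoted in the text for Newfoundland and Labrador.
nl_pct = relative_difference_pct(-1_905, 22_438)

print(f"{nl_pct:.1f}%")
```

The sign is negative because the RRC estimate was lower than the vital statistics count; the magnitude matches the 8.5% upper bound quoted for the absolute differences.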

10.1.3.2 Interprovincial migration

Table 10.1.3.2 provides a comparison of RRC estimates of net interprovincial migration for the intercensal period and corresponding figures based on Canada Revenue Agency (CRA) files. In general, in-migration and out-migration statistics were not comparable because the RRC only took into account migration flows that occurred between the sampling frame reference date (e.g., May 16, 2006, for the census frame) and Census Day 2011, while estimates based on CRA data took annual migration into account. Accordingly, only net migration estimates are presented.

None of the observed differences was significant. Alberta had the highest t-value, at 1.65, as the RRC estimate of the net migration gain was much higher than the estimate based on CRA data. While both sources estimated a large net migration gain, the size of the gain differed with the source. It is recognized that there was substantial migration to Alberta, and that it might be difficult to distinguish between permanent and temporary migration. Some people migrated to Alberta for work and then settled there permanently. Others went there to work, but kept their residence in their province of origin and returned to it with varying frequency. Census respondents do not always correctly identify the location where they should be enumerated. As a result, the respondent may have provided a temporary place of residence, which led to a misinterpretation of his or her mobility and may have affected the accuracy of RRC mobility estimates.

For all provinces except Nova Scotia and Newfoundland and Labrador, the two series of estimates agreed on the direction of net migration: both showed a net gain or both showed a net loss.

10.1.4 Components of population growth

An extensive comparison of RRC estimates of the components of intercensal population growth and population estimates derived from administrative data was carried out by the Demography Division. (This topic is also discussed in Section 10.3.) The RRC estimates of the demographic components are a by-product of the RRC and therefore not necessarily very precise. Estimates of total population growth from these two sources are presented in Table 10.1.4. The estimates of returning Canadians and persons living on Indian reserves or in Indian settlements that were incompletely enumerated in 2006 and enumerated in 2011 were added to the RRC estimates to make them comparable to the estimates from administrative sources.

The estimates from administrative sources are a combination of many estimates of population growth components (births, deaths, immigration, internal migration, emigration, net number of non-permanent residents, and growth of unenumerated Indian reserves). These estimates are subject to varying amounts of measurement error depending on the source. It is also important to keep in mind that the RRC was not designed to produce estimates of this type and that these estimates are by-products. Consequently, differences between the two series of estimates are to be expected.

Nationally, the RRC estimates differed by 8.6% from the administrative data estimates. The largest differences were observed in Ontario (-136,686) and British Columbia (56,885). As a percentage of the administrative data estimates, these differences were 19.0% and 18.2% respectively.

10.2 Census Overcoverage Study

Many changes were made in the methodology of the 2011 Census Overcoverage Study (COS) to improve the precision of the overcoverage estimates and identify more overcoverage cases than in 2006. To gauge the success of the 2011 COS, the evaluation had two objectives: measure overcoverage missed by the COS, and quantify the improvement attributable to the methodological changes made since 2006. The Automated Match Study (AMS) is a good tool to use for both objectives, since its methodology has remained essentially unchanged since 2001. It is particularly useful for addressing the non-trivial problem of breaking down any increase in the estimated overcoverage into two components: a higher overcoverage in the studied population and an additional overcoverage detected because of improvement in the COS methodology.

10.2.1 Comparison of the 2006 and 2011 AMSs

The 2011 AMS was carried out using the same methodology as for the 2006 AMS, and then the two studies were compared. This made it possible to estimate the relative differences in overcoverage for various domains (e.g., national, provincial/territorial) between 2006 and 2011. The results of the comparison are shown in Table 10.2.1.

For 2006, the AMS produced an estimate of 292,594 overcovered persons at the national level. For 2011, the estimate was 430,702 persons, a relative difference of more than 47% compared with 2006. At the provincial and territorial level, the relative difference was positive in all cases except the Northwest Territories and Nunavut, where it was -20% and -24% respectively. The relative differences for New Brunswick and Yukon were particularly high, at 119% and 180% respectively.
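The national relative difference can be reproduced from the two AMS estimates quoted above. This is a minimal sketch with a hypothetical function name; it assumes the 2006 estimate is the base of the comparison, as the wording "compared with 2006" suggests.

```python
def relative_change_pct(earlier: float, later: float) -> float:
    """Change from the earlier to the later value, as a percentage of the earlier value."""
    return 100.0 * (later - earlier) / earlier


ams_2006 = 292_594  # national AMS overcoverage estimate, 2006
ams_2011 = 430_702  # national AMS overcoverage estimate, 2011

national_change = relative_change_pct(ams_2006, ams_2011)

print(f"{national_change:.1f}%")
```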

10.2.2 Comparison of the 2011 COS and the 2011 AMS

The results of the 2011 COS were compared with the results of the 2011 AMS to estimate overcoverage missed by the COS but detected by the AMS, overcoverage missed by the AMS but detected by the COS, and overcoverage identified by both studies.

Differences of this kind are to be expected, because of the different approaches taken in the COS (person-based) and the AMS (household-based). The comparison was carried out in two steps.

The first step was to estimate the overcoverage detected by both the AMS and the COS in the COS sampling frames, i.e., overcoverage in the AMS domain of the COS. This overcoverage was estimated by matching person pairs that were in the AMS sampling frame with duplicates in the COS sample. It was estimated using the COS sample.

The second step was to estimate overcoverage detected by the AMS but missed by the COS. This overcoverage was equal to the total overcoverage for all AMS household pairs that contained no COS person pairs (from Step 1, Step 2 or the extension). It was estimated by matching the COS person pairs with the duplicates in the AMS sample. Unmatched AMS duplicates were the portion missed by the COS.

The two comparisons were carried out using the COS initial frame and the COS extended frame. The latter helped detect additional overcoverage cases missed by the COS initial frame, including overcoverage previously detected by the AMS and additional overcoverage not detected by the AMS or the COS initial frame.
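The logic of the two-step comparison can be pictured as a set comparison between the duplicates found by each study. The sketch below is a simplified, unweighted illustration: the person identifiers are made up, the function names are hypothetical, and the actual studies worked with weighted sample estimates and household pairs rather than simple sets.

```python
def person_pair(a: str, b: str) -> tuple:
    """Order-independent key for a pair of enumerations of the same person."""
    return tuple(sorted((a, b)))


def classify_overcoverage(cos_pairs: set, ams_pairs: set) -> dict:
    """Split detected duplicates into those found by both studies or by only one."""
    return {
        "both": cos_pairs & ams_pairs,      # detected by the COS and the AMS
        "cos_only": cos_pairs - ams_pairs,  # missed by the AMS
        "ams_only": ams_pairs - cos_pairs,  # missed by the COS
    }


# Hypothetical duplicate pairs, for illustration only.
cos_pairs = {person_pair("P001", "P002"), person_pair("P004", "P003"), person_pair("P005", "P006")}
ams_pairs = {person_pair("P003", "P004"), person_pair("P007", "P008")}

result = classify_overcoverage(cos_pairs, ams_pairs)
for category, pairs in result.items():
    print(category, sorted(pairs))
```

Normalizing each pair with `person_pair` ensures that the same duplicate is matched regardless of the order in which the two enumerations are listed.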

10.2.2.1 Evaluation of COS-initial compared with the AMS

The results of comparing COS-initial and the AMS are presented in Table 10.2.2.1a.

The left side of Table 10.2.2.1a contains the following national estimates based on the COS-initial sample:

  • overcoverage in the COS initial frame: 588,856
  • overcoverage in the COS initial frame and the AMS frame: 386,661, or 65.7% of the total overcoverage detected using COS-initial
  • overcoverage in the COS initial frame but not in the AMS frame: 202,195, or 34.3% of the total overcoverage detected using COS-initial.

The right side contains the following national estimates based on the AMS sample:

  • overcoverage in the AMS frame: 430,702
  • overcoverage in the COS initial frame and the AMS frame: 392,302, or 91.1% of the total overcoverage detected using the AMS
  • overcoverage in the AMS frame but not in the COS initial frame: 38,400, or 8.9% of the total overcoverage detected using the AMS.

The overcoverage of 38,400 that was detected by the AMS but was not in the COS initial frame was analyzed in more detail. The estimates, presented in Table 10.2.2.1b, were based on the AMS sample.

This overcoverage falls into one of the following two categories:

  • Overcoverage among AMS household pairs containing at least one person pair from the COS initial frame: 28,978, or 75.5% of the total of 38,400 missed. This large portion of missed overcoverage was the target of the COS extension.
  • Overcoverage among AMS household pairs that contain no person pairs from the COS initial frame: 9,422, or 24.5% of the total of 38,400 missed.
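The breakdowns quoted above for COS-initial and the AMS can be verified with simple arithmetic. The sketch uses only figures from the text; the dictionary keys and the `share_pct` helper are hypothetical names chosen for illustration.

```python
def share_pct(part: int, total: int) -> float:
    """Part as a percentage of the total, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


# Estimates based on the COS-initial sample.
cos_initial = {"in_ams_frame": 386_661, "not_in_ams_frame": 202_195}
cos_initial_total = sum(cos_initial.values())

# Estimates based on the AMS sample.
ams = {"in_cos_initial_frame": 392_302, "not_in_cos_initial_frame": 38_400}
ams_total = sum(ams.values())

# Breakdown of the overcoverage detected by the AMS but not in the COS initial frame.
missed = {"pairs_with_cos_initial_person_pair": 28_978, "pairs_without_cos_initial_person_pair": 9_422}
missed_total = sum(missed.values())

for label, parts, total in (("COS-initial", cos_initial, cos_initial_total),
                            ("AMS", ams, ams_total),
                            ("missed by COS-initial", missed, missed_total)):
    for key, value in parts.items():
        print(f"{label:22s} {key:38s} {value:>8,} {share_pct(value, total):5.1f}%")
```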

10.2.2.2 Evaluation of COS-extended compared with the AMS

The extension of the COS frame was formed independently of the AMS household pairs. It contains a large portion of the overcoverage detected using the AMS and missed in COS-initial. However, it also contains additional overcoverage not detected by either the AMS or COS-initial.

The results of comparing COS-extended and the AMS are presented in Table 10.2.2.2.

The left side of Table 10.2.2.2 contains the following national estimates based on the COS sample. The various percentages were calculated in relation to the adjusted COS-extended overcoverage estimate of 632,846 persons.

  • unadjusted overcoverage in the COS extended frame (COS initial estimate + extension estimate): 588,856 (Footnote 1) + 31,919 (Footnote 2) = 620,775
  • overcoverage detected by COS-initial and the AMS: 386,661
  • overcoverage detected by COS-initial but not by the AMS: 202,195
  • overcoverage detected by COS-extended and the AMS: 27,625
  • overcoverage detected by COS-extended but not by the AMS: 4,294
  • adjustment of the COS-extended overcoverage with the AMS: 12,051.

Hence, following adjustment using the AMS results, the total overcoverage (COS-initial estimate + extension estimate + estimate from the AMS adjustment) was 588,856 + 31,919 + 12,051 = 632,846.

The right side of Table 10.2.2.2 contains the following national estimates based on the AMS sample:

  • overcoverage in the AMS frame: 430,702
  • overcoverage common to the COS initial frame and the AMS frame: 392,302, or 91.1% of the total overcoverage detected by the AMS
  • overcoverage in the COS extension and in AMS household pairs containing at least one COS-initial person pair: 26,349, or 6.1% of the total overcoverage detected by the AMS
  • overcoverage in AMS household pairs containing at least one COS-initial person pair but not detected in the COS extension: 2,629, or 0.6% of the total overcoverage detected by the AMS. This overcoverage was missed by the extension because the rules applied for the COS extension were similar but not identical to the rules applied for the AMS
  • overcoverage in the AMS frame but not in the COS extended frame: 9,422, or 2.2% of the total overcoverage detected by the AMS.

The COS overcoverage estimate was adjusted using the total overcoverage detected by the AMS but not by COS-extended. This adjustment totalled 12,051 and consisted of the following components:

  • overcoverage in AMS household pairs containing at least one COS-initial person pair but not detected in the COS extension: 2,629
  • overcoverage in the AMS frame but not in the COS extended frame (containing no COS-initial person pairs): 9,422.
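The component figures in this subsection can be cross-checked against one another. The sketch below uses only numbers quoted in the text; the variable names are hypothetical.

```python
# Extension estimate, split by whether the AMS also detected the overcoverage.
extension_and_ams = 27_625
extension_not_ams = 4_294
extension_estimate = extension_and_ams + extension_not_ams

# Unadjusted COS-extended estimate: COS-initial estimate plus extension estimate.
cos_initial_estimate = 588_856
unadjusted_cos_extended = cos_initial_estimate + extension_estimate

# AMS-frame breakdown from the right side of Table 10.2.2.2.
ams_common_with_cos_initial = 392_302
ams_found_by_extension = 26_349
ams_missed_by_extension = 2_629
ams_not_in_cos_extended_frame = 9_422
ams_frame_total = (ams_common_with_cos_initial + ams_found_by_extension
                   + ams_missed_by_extension + ams_not_in_cos_extended_frame)

# Adjustment: overcoverage detected by the AMS but not by COS-extended.
ams_adjustment = ams_missed_by_extension + ams_not_in_cos_extended_frame

print(f"extension estimate:      {extension_estimate:,}")
print(f"unadjusted COS-extended: {unadjusted_cos_extended:,}")
print(f"AMS frame total:         {ams_frame_total:,}")
print(f"AMS adjustment:          {ams_adjustment:,}")
```

The two AMS components involving household pairs with a COS-initial person pair (26,349 and 2,629) add up to the 28,978 targeted by the extension in Table 10.2.2.1b.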

The above results show that the extension increased the coverage of the COS and detected additional overcoverage that would not have been identified by the AMS. Consequently, the extension eliminated a bias in the overcoverage estimate. Use of the extension also made it possible to assign the detected overcoverage to the appropriate domain when estimates were made for subpopulations. This improved the quality of the estimates for each domain. When the AMS was used for adjustment purposes, the additional overcoverage was distributed proportionally among all groups instead of being assigned to the particular domains in which overcoverage occurred.

10.3 Population estimates

10.3.1 Error of closure

Statistics Canada's Population Estimates Program (PEP) determines provincial and territorial population counts on Census Day by summing census population counts, estimates of census net undercoverage (CNU) and a population estimate for incompletely enumerated Indian reserves (IEIRs). The PEP then extends these adjusted census counts to July 1, at which point they become the base population for postcensal population estimates.

When determining the adjusted census counts, the PEP evaluates the quality of the postcensal estimates that it produced in the five-year period preceding the census. The evaluation focuses on the difference between the postcensal estimates for Census Day and the adjusted population count for this census. This difference is referred to as the error of closure. A detailed review of this error constitutes the main evaluation of the quality of the postcensal estimates.

Table 10.3.1 shows the errors of closure for 2011 and 2006 by province and territory. Note that a positive error of closure means that the postcensal estimate is higher than the adjusted census count. The 2011 error of closure for Canada was 171,115, an error rate of 0.50%. Hence, the national population estimates overestimated Canada’s population. The error and error rate were higher in 2011 than in 2006 (44,127, or 0.14%) (Footnote 3). Four provinces and two territories had errors of closure greater than 1% or less than -1% in 2011: Newfoundland and Labrador (-2.09%), Prince Edward Island (1.50%), Manitoba (1.79%), British Columbia (1.27%), the Northwest Territories (1.55%) and Nunavut (-1.40%). By comparison, in 2006, two provinces and all three territories had errors of closure of this magnitude. In 2011, six provinces and one territory had larger errors of closure (in absolute value terms) than in 2006.
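The error of closure and its rate can be sketched as follows. The population figures in the example are hypothetical, the function names are illustrative, and expressing the rate relative to the adjusted census count is an assumption, since the text does not state the denominator. The list of rates at the end is taken from the text.

```python
def error_of_closure(postcensal_estimate: float, adjusted_census_count: float) -> float:
    """Positive when the postcensal estimate is higher than the adjusted census count."""
    return postcensal_estimate - adjusted_census_count


def error_rate_pct(eoc: float, adjusted_census_count: float) -> float:
    """Error of closure as a percentage of the adjusted census count (assumed denominator)."""
    return 100.0 * eoc / adjusted_census_count


# Hypothetical jurisdiction, for illustration only.
eoc = error_of_closure(postcensal_estimate=1_010_000, adjusted_census_count=1_000_000)
rate = error_rate_pct(eoc, 1_000_000)
print(f"error of closure: {eoc:,.0f} ({rate:.2f}%)")

# 2011 error-of-closure rates quoted in the text (percent).
quoted_rates_2011 = {
    "Newfoundland and Labrador": -2.09,
    "Prince Edward Island": 1.50,
    "Manitoba": 1.79,
    "British Columbia": 1.27,
    "Northwest Territories": 1.55,
    "Nunavut": -1.40,
}
beyond_one_percent = [name for name, r in quoted_rates_2011.items() if abs(r) > 1.0]
print(len(beyond_one_percent), "jurisdictions beyond +/-1%")
```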

10.3.2 Accuracy of postcensal estimates

The results of the census coverage studies are used to adjust census counts for CNU. However, since the studies are based in part on sample surveys, the CNU results contain some statistical variability due to the samples. To determine whether the errors of closure discussed above are statistically significant, the standard error of the adjusted census count must be taken into account. Since the 2006 adjusted census count was used as the base population for the 2006-2011 postcensal estimates, a standard error that combines the statistical variability of the adjusted census counts for 2011 and 2006 was calculated for each province and territory.

Table 10.3.2 shows the 2011 error of closure by province and territory, the combined standard error of the 2006 and 2011 adjusted census counts, and the t-value (Footnote 4). The error of closure is statistically significant at a 95% confidence level for Canada, Newfoundland and Labrador, Prince Edward Island (though only very slightly), Ontario, Manitoba and British Columbia. In other words, the sampling variability of the 2006 and 2011 adjusted census counts does not explain the majority of the error of closure (Footnote 5).
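A significance test of this kind can be sketched as follows. The text does not give the formula used to combine the two standard errors; the root-sum-of-squares shown here is an assumption that treats the 2006 and 2011 errors as independent. All input values are hypothetical, and the 1.96 cut-off is the one given in Footnote 4.

```python
import math


def combined_standard_error(se_2006: float, se_2011: float) -> float:
    """Root-sum-of-squares combination, assuming the two errors are independent."""
    return math.sqrt(se_2006 ** 2 + se_2011 ** 2)


def closure_t_value(eoc: float, combined_se: float) -> float:
    """Error of closure divided by the combined standard error."""
    return eoc / combined_se


def is_significant(t: float, critical: float = 1.96) -> bool:
    """Two-sided test at the 95% confidence level."""
    return abs(t) > critical


# Hypothetical values, for illustration only.
se = combined_standard_error(se_2006=3_000, se_2011=4_000)
t = closure_t_value(eoc=12_000, combined_se=se)

print(f"combined SE: {se:,.0f}, t-value: {t:.2f}, significant: {is_significant(t)}")
```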

10.3.3 Postcensal estimates and error of closure

The contribution of the PEP components of growth to the error of closure was evaluated in particular for provinces where the error is significant. The evaluation method involves using information from the RRC to decompose the error of closure and compare the growth components estimated by the PEP and the RRC.

From a growth standpoint, the error of closure can be decomposed as follows:

EOC = (ΔPE - ΔRRC) + (ΔRRC - ΔAC)

where

EOC = error of closure
ΔPE = growth determined by the PEP component estimates
ΔRRC = growth determined by the RRC component estimates
ΔAC = growth based on the difference between the 2006 and 2011 adjusted census counts.

This decomposition does not strictly separate the effect of bias in the PEP growth components from the effect of RRC sampling variability. This variability is present in both comparison terms. However, a significant difference in the first term could confirm larger biases in the PEP postcensal estimates for a province and offers the potential for identifying which components might be the source of the bias. The second term results mainly from the statistical variability of the RRC sample, but it provides no direct information about the effect of that variability on the error of closure (Footnote 6). However, a significant difference in this term can affect the comparison of the PEP and RRC growth figures. In 2011, the error of closure is not equal to the sum of the two terms, mostly because the adjustment to account for the overcoverage of the 2006 Census sampling frame for the 2011 RRC is not equal to the 2006 Census Overcoverage estimate. However, the comparison of the relative importance of both terms is still valid when analysing the error of closure.
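In the idealized case where the identity holds exactly, the decomposition can be sketched as follows. The growth figures are hypothetical and the function name is illustrative. Because the postcensal estimates start from the 2006 adjusted census count, the error of closure reduces to the difference between PEP growth and adjusted-census growth, and the RRC growth term cancels out of the sum.

```python
def decompose_error_of_closure(delta_pe: float, delta_rrc: float, delta_ac: float) -> dict:
    """Split the error of closure into the two growth-difference terms."""
    pe_minus_rrc = delta_pe - delta_rrc   # first term: PEP growth vs RRC growth
    rrc_minus_ac = delta_rrc - delta_ac   # second term: RRC growth vs adjusted-census growth
    return {
        "pe_minus_rrc": pe_minus_rrc,
        "rrc_minus_ac": rrc_minus_ac,
        "eoc": pe_minus_rrc + rrc_minus_ac,
    }


# Hypothetical growth figures, for illustration only.
terms = decompose_error_of_closure(delta_pe=120_000, delta_rrc=105_000, delta_ac=100_000)

for name, value in terms.items():
    print(f"{name}: {value:,}")
```

In this example the first term accounts for three quarters of the error of closure, which is the kind of relative-importance comparison the analysis relies on.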

Table 10.3.3 shows the error of closure, the value of each of the two difference terms, the standard error and the t-value. For the provincial total and for Ontario and British Columbia, the error of closure is composed largely of the term consisting of the difference between PE growth and RRC growth, which is statistically significant at a 95% confidence level. For Newfoundland and Labrador and Prince Edward Island, neither term is statistically significant, but the PE-RRC growth difference term is dominant for the former while the RRC-AC growth difference term is dominant for the latter. For Manitoba, the RRC-AC difference term is larger and statistically significant.

These results suggest that the PEP estimates may have overestimated the growth for the provincial total, Ontario and British Columbia and underestimated the growth for Newfoundland and Labrador. The PEP estimates may therefore partly account for the significant positive error of closure for the provincial total, Ontario and British Columbia and the significant negative error of closure for Newfoundland and Labrador.

The (ΔPE - ΔRRC) term was analyzed by growth component for these provinces. For Ontario and British Columbia, the largest difference lies in the net international migration component, with the PEP estimate being much higher. For Newfoundland and Labrador, the net interprovincial migration component shows the largest difference, the PEP estimate being lower.

For international migration, the largest difference is in emigration. The PEP appears to underestimate emigration. The difference is statistically significant at a 95% confidence level for the provincial total, Ontario and British Columbia. It is also statistically significant for Newfoundland and Labrador, but for this province, the PEP emigration estimate is larger.

For interprovincial migration, no province has a statistically significant difference between the net figures estimated by the PEP and the RRC. This is due both to the variance associated with the RRC sample size and to the fact that, for some provinces, the PEP overestimates both in-migration and out-migration. Significant differences are observed in the in-migration and out-migration estimates for some provinces. These differences appear to be due largely to migration flows to Alberta and, to a lesser extent, to Ontario.

10.3.4 Conclusion

From a population growth standpoint, the PEP overestimated the growth of Canada's population between 2006 and 2011. This overestimate was larger than the overestimate for the preceding period (2001 to 2006). The error-of-closure rate was 0.50% in 2011, compared with 0.14% in 2006. After the statistical variability of the adjusted census counts is taken into account, the error of closure is statistically significant for Canada, Newfoundland and Labrador, Prince Edward Island, Ontario, Manitoba and British Columbia. Hence, statistical variability alone may not explain the errors of closure for these provinces.

A decomposition analysis of the error of closure shows that the error for the provincial total, Ontario and British Columbia may be due to an overestimate of growth by the PEP, mainly caused by an underestimate of emigration. For Newfoundland and Labrador, the PEP may have underestimated net interprovincial migration.

While the errors of closure were generally larger than in 2006, the 2011 PEP estimates were still consistent with the census counts adjusted for net undercoverage.

Footnote

Footnote 1

386,661 + 202,195 = 588,856

Footnote 2

27,625 + 4,294 = 31,919

Footnote 3

The 2006 error of closure is based on a 2006 postcensal estimate updated in 2013 following a revision of the components for 2001 to 2006.

Footnote 4

If the t-value is greater than 1.96 or less than -1.96, the PEP estimate is statistically different from the adjusted census count at a 95% confidence level.

Footnote 5

The analysis subsequently took account of the effect of a change in the RRC's undercoverage estimation method in 2011. Without that change, the error of closure for Prince Edward Island would not have been significant and the error for Ontario would have been just at the 1.96 cut-off level.

Footnote 6

This term is equivalent to comparing the RRC estimate of persons enumerated to the comparable 2011 Census count. Instead, the impact that the sampling variability of the RRC estimates has on the error of closure is estimated by the sampling errors affecting 2006 and 2011 net undercoverage.
