Exodus from Hubei Province, Statistical Adjustment or Big Data Failure?

Population of Hubei Province from 2002 to 2021, in units of 10,000 people (online data from the National Bureau of Statistics, 21 July 2022)

Plotting the population data of Hubei Province as given on the website of China’s National Bureau of Statistics 国家统计局 reveals a striking pattern: a sensational exodus from the province between the end of 2019 and 2020. With a drop of 3.07%, or, in absolute terms, from 59.27 to 57.45 million, a loss of 1.82 million people, Hubei is the province with the largest decline in 2020. Heilongjiang is second with 2.58%, followed by Jilin with 2.00%, but for those two provinces a decline is nothing new.
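The percentage quoted above can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic: the two population figures are the ones given in the text, while the helper name `pct_change` is my own choice for illustration.

```python
def pct_change(previous, current):
    """Year-on-year change in percent, relative to the earlier value."""
    return (current - previous) / previous * 100


# Hubei population in millions for 2019 and 2020, as quoted in the text
pop_2019 = 59.27
pop_2020 = 57.45

absolute_drop = pop_2019 - pop_2020           # in millions of people
growth_rate = pct_change(pop_2019, pop_2020)  # negative means decline

print(f"absolute drop: {absolute_drop:.2f} million")
print(f"growth rate:   {growth_rate:.2f}%")
```

The rate is computed relative to the 2019 figure, which is what reproduces the 3.07% decline; dividing by the 2020 figure instead would give a slightly larger number.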

In Chinese population statistics, negative population growth rates rarely occur in isolated years; for Hubei, a negative rate had never occurred at all during the last 20 years. The graph below shows population growth rates of several provinces and major cities in comparison.

Population growth rate of several Chinese provinces between 2003 and 2021

Spikes appear regularly in years when a census re-evaluates the Chinese population, in particular after the decennial population census in years ending in zero (in the graph above this can clearly be seen for the 2010 data), and to a lesser extent after the inter-census population surveys in years ending in five, when the sample size is about 1% of the total population. Such adjustments have become less common as China has improved its census data collection, which now rests on three pillars (enumeration, administrative registers and Big Data, using mobile data) and is complemented by annual population change sample surveys with a sample size of about 1‰ of the total population.

So why such a stark V-shaped pattern for Hubei? Wuhan, the provincial capital, comes to mind, with the Corona outbreak at the end of 2019 and the subsequent draconian lockdown. Did 1.82 million people leave the province, did they die, did an algorithm fail to process the data correctly, or was it, as legend has it for the iron content of spinach, the fault of a secretary who mistyped a number? Without access to archives, it is hard to tell, but watch out for the Shanghai numbers at the end of this year!

(posted by Andrea Bréard)