Sunday, February 28, 2010

Age distribution in household survey data

Indicators in the field of education statistics, such as those defined in the education glossary of the UNESCO Institute for Statistics, are typically calculated for specific age groups. For example, the youth literacy rate is for the population age 15 to 24 years, the adult literacy rate for the population age 15 and over, and the net attendance rates for primary and secondary education are for the population of primary and secondary school age, respectively. The net intake rate is an example of an indicator that is calculated for a single year of age, the official start age of primary school.

For a correct calculation of education indicators it is necessary to have precise age data. In the case of data collected with population censuses or household surveys this means that the ages recorded for each household member should be without error. However, census or survey data sometimes exhibit the phenomenon of age heaping, usually on ages ending in 0 and 5. Such heaping or digit preference occurs when survey respondents don't know their own age or the ages of other household members, or when ages are intentionally misreported.

The presence of age heaping can be tested with indices of age preference such as Whipple's index. Heaping can also be detected through visual inspection of the age distribution in household survey data. Figures 1 and 2 summarize the age distribution in survey data from Brazil, India, Indonesia and Nigeria. The data from Brazil were collected with a Pesquisa Nacional por Amostra de Domicílios or National Household Sample Survey in 2006. The data for the other three countries are from Demographic and Health Surveys conducted between 2005 and 2008.
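As an illustration of how such an index works, the following is a minimal sketch of Whipple's index in its commonly used form: the number of persons reporting ages 25, 30, ..., 60 is compared with one fifth of all persons aged 23 to 62. The function name whipple_index and the input format (a dictionary from single year of age to a count of persons) are assumptions made for this example; they do not come from the survey files discussed here, and sample weights are ignored.

```python
def whipple_index(counts):
    """Whipple's index for a mapping {single year of age: number of persons}.

    100 indicates no preference for ages ending in 0 or 5,
    500 means every reported age between 23 and 62 ends in 0 or 5.
    """
    # All persons aged 23 to 62 (the conventional reference range).
    total = sum(counts.get(age, 0) for age in range(23, 63))
    if total == 0:
        raise ValueError("no persons aged 23 to 62 in the data")
    # Persons reporting ages 25, 30, ..., 60.
    heaped = sum(counts.get(age, 0) for age in range(25, 61, 5))
    # Without heaping, one fifth of the range would report these ages.
    return 100 * heaped / (total / 5)


# Hypothetical data: the same number of persons at every age.
uniform = {age: 10 for age in range(100)}
print(whipple_index(uniform))
```

With real survey data the counts would first have to be tabulated from the household member records, ideally with the survey's sample weights applied.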

Figure 1 shows the share of single years of age in the total survey sample. A preference for ages ending in 0 and 5 is strikingly obvious in the data from India and Nigeria. In the data from Indonesia, age heaping is also present, but to a lesser extent than for India and Nigeria. Lastly, the graph for Brazil is relatively smooth, indicating a near absence of age heaping.

Figure 1: Age distribution in survey data by single-year age group
Line graph with age distribution in survey data by single-year age group
Data source: Brazil PNAD 2006, India DHS 2005-06, Indonesia DHS 2007, Nigeria DHS 2008.

In Figure 2, single ages are combined in five-year age groups, from 0-4 years and 5-9 years to 90-94 years and 95 years and over. Compared to Figure 1, the distribution lines are much smoother, including for India and Nigeria. We can conclude that age heaping is problematic for education indicators that are calculated for single years, for example all children of primary school entrance age, but less so for indicators that are calculated for a larger age group, for example all children of primary or secondary school age or all persons over 15 years of age.
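The grouping used for Figure 2 can be sketched as follows. This is only an illustration under stated assumptions: the function name five_year_shares, the dictionary input and the "95+" label are hypothetical, and the code is not the procedure used to produce the figures.

```python
def five_year_shares(counts, top=95):
    """Combine {single year of age: persons} into five-year age groups.

    Returns the share of each group in the total, with all ages of
    `top` years and over combined in one open-ended group.
    """
    groups = {}
    for age, persons in counts.items():
        # Lower bound of the five-year group, capped at the open-ended group.
        start = min(age // 5 * 5, top)
        label = f"{start}+" if start == top else f"{start}-{start + 4}"
        groups[label] = groups.get(label, 0) + persons
    total = sum(groups.values())
    if total == 0:
        raise ValueError("no persons in the data")
    return {label: persons / total for label, persons in groups.items()}


# Hypothetical counts with a heap at age 30.
sample = {28: 10, 29: 8, 30: 40, 31: 6, 32: 9, 33: 10, 34: 9}
print(five_year_shares(sample))
```

Because a heaped age such as 30 and its neighbors 31 to 34 fall into the same group, persons who rounded their age up or down by a year or two often stay in, or move only to the adjacent, five-year group, which is why the grouped distribution looks smoother.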

Figure 2: Age distribution in survey data by five-year age group
Line graph with age distribution in survey data by five-year age group
Data source: Brazil PNAD 2006, India DHS 2005-06, Indonesia DHS 2007, Nigeria DHS 2008.

Friedrich Huebler, 28 February 2010 (edited 30 September 2010), Creative Commons License
Permanent URL: http://huebler.blogspot.com/2010/02/age.html

Thursday, February 25, 2010

10 billion songs have been downloaded from iTunes

"Apple® today announced that music fans have purchased and downloaded over 10 billion songs from the iTunes® Store (www.itunes.com), the world’s most popular online music, TV and movie store. The 10 billionth song, “Guess Things Happen That Way” by Johnny Cash, was purchased by Louie Sulcer of Woodstock, Georgia. As the winner of the iTunes Countdown to 10 Billion Songs, Louie will receive a $10,000 iTunes Gift Card. iTunes is the number one music retailer in the world and features the world’s largest music catalog with over 12 million songs."
Source: Apple press release, 25th February 2010
Previously - 8 billion downloads reported on 23rd July 2009

Wednesday, February 24, 2010

50m tweets are posted every day


"Folks were tweeting 5,000 times a day in 2007. By 2008, that number was 300,000, and by 2009 it had grown to 2.5 million per day. Tweets grew 1,400% last year to 35 million per day. Today, we are seeing 50 million tweets per day—that's an average of 600 tweets per second. (Yes, we have TPS reports.)
Tweet deliveries are a much higher number because once created, tweets must be delivered to multiple followers. Then there's search and so many other ways to measure and understand growth across this information network. Tweets per day is just one number to think about. We'll make time to share more information so please stay tuned."
Source: Kevin Weil of Twitter, writing in the company's blog, 22nd February 2010
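Note - the per-second and growth figures in the quote can be checked with some quick arithmetic. The sketch below uses only the numbers quoted above; reading the 1,400% as growth from 2.5 million to 35 million tweets per day is an assumption about what the quote means.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# 50 million tweets per day, expressed per second.
tweets_per_day = 50_000_000
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

# Growth from 2.5 million to 35 million tweets per day.
growth_factor = 35_000_000 / 2_500_000
percent_increase = (growth_factor - 1) * 100

print(round(tweets_per_second), growth_factor, percent_increase)
```

On these numbers the average is closer to 580 tweets per second, so the quoted 600 looks like a rounded figure, and 35 million is 14 times 2.5 million, which is an increase of 1,300%.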

Friday, February 19, 2010

Fewer than 20% of the Top 100 videos on YouTube are user generated content


Source: Data from TubeMogul, reported by AllThingsDigital, 17th February 2010
There's a version of the data as a bar chart here
Note - the percentages are not whole numbers because they are based on a rolling list of the Top 100 over time. E.g. there might be 18 UGC videos one day, and 16 another day.

31% of American households have no internet connection

"Just 31 percent of American households lacked an Internet connection as of October 2009, down from 38 percent in 2007. The vast majority of those homes use a broadband connection, with the share of households with a dial-up connection slipping below 5 percent, according to data released by the National Telecommunications and Information Administration and the U.S. Census Bureau."
Source: National Telecommunications and Information Administration and the U.S. Census Bureau, reported in Mercury News, 19th February 2010

Farmville sells 800,000 virtual tractors every day

"FarmVille has nearly 81 million monthly uniques on Facebook, and [Gerd] Leonhard pointed to sales of 800,000 virtual tractors daily. Several months ago, BusinessWeek estimated annual revenues for parent Zynga at $100 million, though more recent estimates have pushed past $200 million."
Source: Gerd Leonhard, speaking at MIDEM, reported by DigitalMusicNews, 18th February 2010
Note - a virtual tractor costs $3.33