The hidden problem in Big Data: even infinite information does not guarantee consistent measurement

Research output: Contribution to journal › Article › peer-review

Abstract

The social sciences heavily depend on the measurement of abstract constructs for quantifying effects, identifying associations between variables, and testing hypotheses. In data science, constructs are also often used for forecasting, and thanks to the recent big data revolution, they promise to enhance their accuracy by leveraging the constantly increasing stream of digital information around us. However, the possibility of optimizing various social indicators implicitly hinges on our ability to reliably reduce complex and abstract constructs (such as life satisfaction or social trust) into numeric measures. While many scientists are aware of the issue of measurement error, there is widespread, implicit hope that access to more data will eventually render this issue irrelevant. This paper delves into the nature of measurement error under quasi-ideal conditions. We show mathematically and by employing simulations that single measurements fail to converge even when we can access progressively more information. Then, by using real-world data from the Social Capital Benchmark Surveys, we demonstrate how adding new information increases the dimensionality of the measured construct quasi-indefinitely, further contributing to measurement divergence. We conclude by discussing implications and future research directions to solve this problem.

Original language: English
Pages (from-to): 7-30
Number of pages: 24
Journal: Society Register
Volume: 8
Issue number: 4
Publication status: Published - 30 Dec 2024
Externally published: Yes

Keywords

  • Big Data
  • measurement
  • measurement crisis
  • measurement error
  • simulation
  • Social Capital Benchmark Surveys (SCBS)
