# Biostatistics: a fundamental discipline at the core of modern health data science

See https://www.mja.com.au/journal/2019/211/10/biostatistics-fundamental-discipline-core-modern-health-data-science

Statistical reasoning provides the theoretical basis for extracting knowledge from data in the presence of variability and uncertainty. It is a critical element of most empirical research in public health and clinical medicine, with the best studies incorporating biostatistical input on aspects from study design to data analysis and reporting. Biostatistical methods underpin key public health research disciplines, such as epidemiology and health services research, a role that reflects the core nature of the discipline of biostatistics. Similarly, bioinformatics and computational biology are important new areas in data‐intensive biomedical research that are underpinned by statistical concepts and methods, along with components heavily informed by other core disciplines such as computer science and mathematics. The critical role of biostatistics was affirmed in a recent review of the scale of waste and inefficiency in health research, which observed that, “These issues [of poor study design, conduct and analysis] are often related to misuse of statistical methods, which is accentuated by inadequate training in methods,” [3] echoing similar observations made over two decades earlier [4].

Importantly, biostatistics, as a subdiscipline of statistics (arguably, the original “data science” [5]), is an established scientific discipline of its own and is not simply a toolkit of techniques that need to be used correctly. Sound biostatistical work requires not only an understanding of mathematics, probability and sources of bias, which underpin statistical theory and methods, but also (and increasingly) extensive technical skills, including computing. In‐depth training is needed to develop these skills along with the understanding required to conceptualise problems and navigate the tricky waters between real‐world health questions and complex techniques. As noted in a recent review, such training would be very difficult to achieve for most clinicians [6]. Superficial understanding of statistics can easily lead to unscientific practice (recently characterised as “cargo‐cult statistics” [7]) and may be seen as responsible in large part for the current “crisis of reproducibility” in research [8]. A prominent example is the evolution of beliefs concerning the risk of cardiovascular disease associated with postmenopausal oestrogen therapy. Influential observational studies in the late 1990s claimed to demonstrate evidence of reduced risk of heart attacks, a conclusion that was contradicted by a major randomised trial [9]. Careful re‐analysis of the observational data, guided by contemporary statistical thinking about confounding and time‐dependent changes in risk, produced results that were similar to the randomised trial [10].

The emerging era of big data heightens the need for biostatistical expertise, with more decision makers and researchers aiming to extract value from complex, messy data, and increasing use of packaged software by individuals with insufficient understanding of the underlying methods. Big data require both an advanced understanding of fundamental statistical concepts and methods, including recent developments in causal reasoning [11], and enhanced capacity in computational tools such as dimensionality reduction, distributed processing, machine learning and natural language processing. More data do not necessarily mean better data, and more analytics does not necessarily mean better science, as the quality and reproducibility of research findings will remain highly dependent on the design of the data collection, an understanding of associated limitations and resulting biases, as well as appropriate analytical methods [12,13].
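The point that more data do not necessarily mean better data can be illustrated with a small simulation. The sketch below is not taken from the article. It assumes a hypothetical population in which 10% of people have some condition, and compares a simple random sample of 1,000 people with a much larger convenience dataset in which people with the condition are assumed to be three times as likely to be recorded. All names and probabilities are illustrative assumptions.

```python
import random


def prevalence(sample):
    """Proportion of people in the sample who have the condition."""
    return sum(sample) / len(sample)


def simulate(n_population=200_000, seed=0):
    rng = random.Random(seed)

    # Hypothetical population: each person has the condition with probability 0.10.
    population = [1 if rng.random() < 0.10 else 0 for _ in range(n_population)]

    # Small, well-designed study: a simple random sample of 1,000 people.
    small_random = rng.sample(population, 1_000)

    # Large convenience dataset: people with the condition are three times as
    # likely to be recorded (60% versus 20%). This selection mechanism is an
    # assumption made purely for illustration.
    big_biased = [
        y for y in population
        if rng.random() < (0.60 if y == 1 else 0.20)
    ]

    return {
        "true": prevalence(population),
        "small_random": prevalence(small_random),
        "big_biased": prevalence(big_biased),
        "n_small": len(small_random),
        "n_big": len(big_biased),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value}")
```

Under these assumptions the convenience dataset is expected to be dozens of times larger than the random sample, yet its estimate settles near 0.06 / (0.06 + 0.18) = 0.25 rather than the true 0.10. Collecting more records through the same selection mechanism narrows the uncertainty around the wrong value and does not remove the bias.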

bioestadistica/rol_de_la_bioestadistica_y_data_science.txt · Last modified: 2019/10/28 02:07 by admin