Biostatistics with R: An Introduction to Statistics Through Biological Data by Babak Shahbaba


Biostatistics with R is designed around the dynamic interplay among statistical methods, their applications in biology, and their implementation. The book explains basic statistical concepts in simple yet rigorous language. Ideas are developed in the context of real applied problems, for which step-by-step instructions for using R and R-Commander are provided. Topics include data exploration, estimation, hypothesis testing, linear regression analysis, and clustering, with appendices on installing and using R and R-Commander. A novel feature of this book is an introduction to Bayesian analysis.

The author discusses basic statistical analysis through a series of biological examples, using R and R-Commander as computational tools. The book is ideal for instructors of basic statistics for biologists and other health scientists. The step-by-step application of the statistical methods discussed in this book allows readers who are interested in statistics and its application in biology to use the book as a self-learning text.

Example text

Suppose, for example, that almost all BMI values in our sample are between 20 and 40. Observing a BMI value of 50 would be suspicious. Further investigation might reveal that in fact this is the correct value of BMI for an individual in our sample. In this case, this outlier is a legitimate value. However, a BMI value of 500 or –50 is clearly an erroneous observation, which is possibly due to a data entry mistake. We could identify outliers using data exploration techniques. As an example, we use the AsthmaLOS data collected by [12] to study the length of stay in hospital for asthmatic children in the USA.
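A check of this kind can be sketched in a few lines of R. This is a minimal illustration, not the book's own procedure: the vector `bmi`, its values, and the cutoffs `lower` and `upper` are hypothetical and do not come from the AsthmaLOS data.

```r
# Hypothetical BMI values; 500 and -50 mimic data entry mistakes,
# while 50 is unusual but possible.
bmi <- c(22.5, 31.0, 27.8, 50.0, 500, -50, 35.2)

# Summary statistics often expose impossible minimum or maximum values.
summary(bmi)

# Flag values outside an assumed plausible range (illustrative cutoffs).
lower <- 10
upper <- 80
suspect <- bmi < lower | bmi > upper
bmi[suspect]

# Mark erroneous values as missing instead of deleting the observations.
bmi_clean <- ifelse(suspect, NA, bmi)
```

The value 50 stays in `bmi_clean` because it falls inside the assumed range; whether it is legitimate still has to be confirmed by checking the original record.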

You can of course use the above approach to transform a variable in many other ways. For example, suppose that you want to apply the square transformation to a variable X. To do this, you can follow the above steps and simply enter X^2 under Expression to compute.

Creating New Variable Based on Two or More Existing Variables

In the previous chapter, we discussed creating new variables based on existing ones as a common data preprocessing step. Here, we show how we can create a new variable based on two or more existing variables.
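For readers working at the R console rather than in R-Commander, both kinds of derived variable can be sketched as follows. The data frame `d`, its column names, and the choice of BMI as the combined variable are assumptions made for illustration; they are not taken from the text.

```r
# Hypothetical data frame; column names are assumptions for illustration.
d <- data.frame(weight_kg = c(60, 82, 95),
                height_m  = c(1.65, 1.80, 1.75))

# Square transformation of a single variable: the console counterpart
# of entering X^2 under "Expression to compute".
d$weight_sq <- d$weight_kg^2

# New variable based on two existing variables: BMI = weight / height^2.
d$bmi <- d$weight_kg / d$height_m^2

d
```

Each assignment adds a new column and leaves the original variables unchanged, so the transformation can be rechecked or undone later.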

For example, if the variable age is denoted by X, then x5 = 23 means that the 5th individual in our sample is 23 years old. Based on the values a variable can take, we can classify it into one of two groups: numerical variables or categorical variables. […] tr data set are numerical variables since they take numerical values, and the numbers they take have their usual meaning. For example, we say that the second individual in our sample is older than the first individual since x2 = 55 is bigger than x1 = 24.
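The notation and the two variable types can be mirrored in R as a rough sketch. The vector `x` reuses the three ages quoted above (x1 = 24, x2 = 55, x5 = 23); the remaining ages and the categorical variable `smoker` are hypothetical.

```r
# Ages for five individuals; x[5] plays the role of x5 in the text.
# The third and fourth values are made up for illustration.
x <- c(24, 55, 31, 47, 23)
x[5]          # age of the 5th individual
x[2] > x[1]   # ordering is meaningful for a numerical variable

# A hypothetical categorical variable, stored as a factor.
smoker <- factor(c("no", "yes", "no", "no", "yes"))

is.numeric(x)       # check that x is numerical
is.factor(smoker)   # check that smoker is categorical
table(smoker)       # counts per category
```

Storing a categorical variable as a factor keeps R from treating its labels as quantities, so summaries report counts per category rather than means.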

