In research, especially in medical research, we describe characteristics of our study populations through Table 1. The Table 1 contain information about the mean for continue/scale variable, and proportion for categorical variable. For example: we say that the mean of systolic blood pressure in our study population is 145 mmHg, or 30% of participants are smokers. Since is called Table 1, means that is the first table in the manuscript.

To create the Table 1 sometimes it can be very time consuming. Imagine if we have 10 variables (e.g. age, gender.. etc) for 3 groups, and for each variable we compute mean (standard deviation) and number of participants (proportion); in the end we have to fill 60 numbers in the table. Moreover, we usually export the table from R to Microsoft Word, and we can be prone to making mistakes if we’re copy/pasting. Therefore, I did a search to find a simple and comprehensive way to make Table 1 with R. I found two very interesting packages named “Tableone” and “ReporteRs”. The TableOne package is created by Kazuki Yoshida and Justin Bohn and is used to create the Table 1 in R. The ReporteRs package is created by David Gohel and in this post I use for exporting Table from R to Microsoft Word.

I simulated a dataset by using functions `rnorm()`

and `sample()`

. You can download this simulated data set on you desktop to replicate this post. To learn how to upload your dataset into R read this post.

Load the dataset in R:

dt <- read.csv(file.choose(), header=TRUE, sep=",")

Load the package necessary for making Table 1:

`#Load package library(tableone) #Create a variable list which we want in Table 1 listVars <- c("Age", "Gender", "Cholesterol", "SystolicBP", "BMI", "Smoking", "Education") #Define categorical variables catVars <- c("Gender","Smoking","Education")`

My first interest is to make the Table 1 for total population.

#Total Population

table1 <- CreateTableOne(vars = listVars, data = dt, factorVars = catVars)

table1

But, often I am interested to create Table 1 for men and women and to compare their means and proportions. To compute this I run the code below.

`# Removing Gender from list of variables `

listVar <- c("Age", "Cholesterol", "SystolicBP", "BMI", "Smoking", "Education")

table1 <- CreateTableOne(listVars, dt, catVars, strata = c("Gender"))

table1

You can read the original post and more at DataScience+

© 2020 Data Science Central ® Powered by

Badges | Report an Issue | Privacy Policy | Terms of Service

**Upcoming DSC Webinar**

- Optimization and The NFL’s Toughest Scheduling Problem - June 23

At first glance, the NFL’s scheduling problem seems simple: 5 people have 12 weeks to schedule 256 games over the course of a 17-week season. The scenarios are potentially well into the quadrillions. In this latest Data Science Central webinar, you will learn how the NFL began using Gurobi’s mathematical optimization solver to tackle this complex scheduling problem. Register today.

**Most Popular Content on DSC**

To not miss this type of content in the future, subscribe to our newsletter.

- Book: Statistics -- New Foundations, Toolbox, and Machine Learning Recipes
- Book: Classification and Regression In a Weekend - With Python
- Book: Applied Stochastic Processes
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- How to Automatically Determine the Number of Clusters in your Data
- New Machine Learning Cheat Sheet | Old one
- Confidence Intervals Without Pain - With Resampling
- Advanced Machine Learning with Basic Excel
- New Perspectives on Statistical Distributions and Deep Learning
- Fascinating New Results in the Theory of Randomness
- Fast Combinatorial Feature Selection

**Other popular resources**

- Comprehensive Repository of Data Science and ML Resources
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- 100 Data Science Interview Questions and Answers
- Cheat Sheets | Curated Articles | Search | Jobs | Courses
- Post a Blog | Forum Questions | Books | Salaries | News

**Archives:** 2008-2014 |
2015-2016 |
2017-2019 |
Book 1 |
Book 2 |
More

**Upcoming DSC Webinar**

- Optimization and The NFL’s Toughest Scheduling Problem - June 23

At first glance, the NFL’s scheduling problem seems simple: 5 people have 12 weeks to schedule 256 games over the course of a 17-week season. The scenarios are potentially well into the quadrillions. In this latest Data Science Central webinar, you will learn how the NFL began using Gurobi’s mathematical optimization solver to tackle this complex scheduling problem. Register today.

**Most popular articles**

- Free Book and Resources for DSC Members
- New Perspectives on Statistical Distributions and Deep Learning
- Time series, Growth Modeling and Data Science Wizardy
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- How to Automatically Determine the Number of Clusters in your Data
- Fascinating New Results in the Theory of Randomness
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions

## You need to be a member of Data Science Central to add comments!

Join Data Science Central