
Klodian's Blog (6)

Map the Life Expectancy in United States with data from Wikipedia with R

Original post is published at DataScience+

Recently, I became interested in scraping data from webpages, such as Wikipedia, and visualizing it with R. As in my previous post, I use the `rvest` package to get the data from the webpage and…


Added by Klodian on August 5, 2016 at 10:30pm — No Comments

Visualizing obesity across United States by using data from Wikipedia

Original post published to DataScience+

In this post I will show how to collect data from a webpage and analyze or visualize it in R. For this task I will use the `rvest` package and get the data from Wikipedia. I got the idea for this post from Fisseha Berhane.
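As a rough sketch of the scraping step, the snippet below parses an HTML table into a data frame with `rvest`. To stay runnable offline it uses a tiny inline page built with `minimal_html()`; the table contents, the column names, and the `table.wikitable` selector are illustrative assumptions, not the data or code from the original post. For a live page, `read_html()` with the page URL would take the place of `minimal_html()`.

```r
library(rvest)

# Toy stand-in for a downloaded page; all values here are invented
page <- minimal_html('
  <table class="wikitable">
    <tr><th>State</th><th>Rate</th></tr>
    <tr><td>State A</td><td>30.1</td></tr>
    <tr><td>State B</td><td>25.4</td></tr>
  </table>')

# Select the table by CSS selector and convert it to a data frame (tibble)
tbl <- html_table(html_element(page, "table.wikitable"))

print(tbl)
```

Once the table is a data frame, it can be cleaned and passed to any plotting or mapping package.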

I will…


Added by Klodian on June 27, 2016 at 11:48am — No Comments

Handling missing data with MICE package

This is a quick, short, and concise tutorial on how to impute missing data. Previously, we published an extensive tutorial on imputing missing values with the MICE package. The current tutorial aims to be simple and user-friendly for those who are just starting to use R.

Preparing the dataset

I have created a simulated dataset, which you can load into your R environment by using the…


Added by Klodian on June 19, 2016 at 12:30pm — 3 Comments

Export Regression results from R to MS Word

In this post I will present a simple way to export your regression results (or output) from R into Microsoft Word. Previously, I wrote a tutorial on how to create Table 1 with study characteristics and export it into Microsoft Word. These posts are especially useful for researchers who are preparing manuscripts for publication in peer-reviewed journals.

Get the results…


Added by Klodian on June 9, 2016 at 11:43am — No Comments

Table 1 and the Characteristics of Study Population (rstats)

In research, especially medical research, we describe the characteristics of our study population in Table 1. Table 1 contains the mean for continuous (scale) variables and the proportion for categorical variables. For example, we say that the mean systolic blood pressure in our study population is 145 mmHg, or that 30% of participants are smokers. It is called Table 1 because it is the first table in the manuscript.
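The two kinds of summaries described above can be sketched in base R as follows. The data frame `study` and its columns `sbp` and `smoker` are hypothetical, and the values are invented for illustration; this is not necessarily how the full post builds the table.

```r
# Hypothetical toy data; values are invented for illustration only
study <- data.frame(
  sbp    = c(150, 140, 145, 135, 155),        # systolic blood pressure, mmHg
  smoker = c("yes", "no", "no", "yes", "no")  # smoking status
)

# Continuous variable: mean and standard deviation
sbp_mean <- mean(study$sbp)
sbp_sd   <- sd(study$sbp)

# Categorical variable: percentage of participants in each category
smoker_pct <- prop.table(table(study$smoker)) * 100
smoker_yes <- as.numeric(smoker_pct["yes"])

cat(sprintf("SBP (mmHg), mean (SD): %.1f (%.1f)\n", sbp_mean, sbp_sd))
cat(sprintf("Smokers, %%: %.0f\n", smoker_yes))
```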

To create the Table 1…


Added by Klodian on May 29, 2016 at 6:46am — No Comments

Identify, describe, plot, and remove the outliers from the dataset with R (rstats)

In statistics, an outlier is defined as an observation that lies far from most other observations. Often an outlier is due to measurement error. Therefore, one of the most important tasks in data analysis is to identify and, if necessary, remove outliers.

There are different methods to detect outliers, including the standard deviation approach and Tukey’s method, which uses the interquartile range (IQR). In this post I will use…
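For illustration, Tukey's rule flags values that fall more than 1.5 times the IQR below the first quartile or above the third quartile. The sketch below is a minimal base R version under that assumption; the helper name `tukey_outliers` and the sample values are hypothetical, not taken from the full post.

```r
# Flag values outside Tukey's fences: [Q1 - k * IQR, Q3 + k * IQR]
tukey_outliers <- function(x, k = 1.5) {
  q   <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  x < q[1] - k * iqr | x > q[2] + k * iqr
}

# Invented values; 95 is planted as an obvious outlier
x <- c(10, 12, 11, 13, 12, 11, 14, 95)

flags   <- tukey_outliers(x)
x[flags]               # the flagged observations
x_clean <- x[!flags]   # the data with flagged observations removed
```

Base R's `boxplot.stats(x)$out` applies a similar 1.5 rule based on the box hinges, so its results can differ slightly from the quantile-based version here.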


Added by Klodian on May 24, 2016 at 11:07pm — No Comments