.

Original post is published at DataScience+

Recently, I become interested to grasp the data from webpages, such as Wikipedia, and to visualize it with R. As I did in my previous post, I use `rvest`

package to get the data from webpage and…

Added by Klodian on August 5, 2016 at 10:30pm — No Comments

Original post published to DataScience+

In this post I will show how to collect data from a webpage and to analyze or visualize in R. For this task I will use the `rvest`

package and will get the data from Wikipedia. I got the idea to write this post from Fisseha Berhane.

I will…

Added by Klodian on June 27, 2016 at 11:48am — No Comments

This is a quick, short and concise tutorial on how to impute missing data. Previously, we have published an extensive tutorial on imputing missing values with MICE package. Current tutorial aim to be simple and user friendly for those who just starting using R.

I have created a simulated dataset, which you can load on your R environment by using the…

Added by Klodian on June 19, 2016 at 12:30pm — 3 Comments

In this post I will present a simple way how to export your regression results (or output) from R into Microsoft Word. Previously, I have written a tutorial how to create Table 1 with study characteristics and to export into Microsoft Word. These posts are especially useful for researchers who prepare their manuscript for publication in peer-reviewed journals.

Added by Klodian on June 9, 2016 at 11:43am — No Comments

In research, especially in medical research, we describe characteristics of our study populations through Table 1. The Table 1 contain information about the mean for continue/scale variable, and proportion for categorical variable. For example: we say that the mean of systolic blood pressure in our study population is 145 mmHg, or 30% of participants are smokers. Since is called Table 1, means that is the first table in the manuscript.

To create the Table 1…

ContinueAdded by Klodian on May 29, 2016 at 6:46am — No Comments

In statistics, a outlier is defined as a observation which stands far away from the most of other observations. Often a outlier is present due to the measurements error. Therefore, one of the most important task in data analysis is to identify and (if is necessary) to remove the outliers.

There are different methods to detect the outliers, including *standard deviation approach* and *Tukey’s method* which use interquartile (IQR) range approach. In this post I will use…

Added by Klodian on May 24, 2016 at 11:07pm — No Comments

- Map the Life Expectancy in United States with data from Wikipedia with R
- Visualizing obesity across United States by using data from Wikipedia
- Handling missing data with MICE package
- Export Regression results from R to MS Word
- Table 1 and the Characteristics of Study Population (rstats)
- Identify, describe, plot, and remove the outliers from the dataset with R (rstats)

- Identify, describe, plot, and remove the outliers from the dataset with R (rstats)
- Map the Life Expectancy in United States with data from Wikipedia with R
- Visualizing obesity across United States by using data from Wikipedia
- Export Regression results from R to MS Word
- Handling missing data with MICE package
- Table 1 and the Characteristics of Study Population (rstats)

- rstats (4)
- r (3)
- R (1)
- map (1)
- visualization (1)

- A History and Timeline of Big Data
- AI voice technology has benefits and limitations
- Strong data governance frameworks are fuel for analytics
- Top 12 most commonly used IoT protocols and standards
- What is the status of quantum computing for business?
- How parallelization works in streaming systems
- An Eggplant automation tool tutorial for Functional, DAI
- Circular economy model enables sustainability and resilience

Posted 29 March 2021

© 2021 TechTarget, Inc. Powered by

Badges | Report an Issue | Privacy Policy | Terms of Service

**Most Popular Content on DSC**

To not miss this type of content in the future, subscribe to our newsletter.

- Book: Applied Stochastic Processes
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- How to Automatically Determine the Number of Clusters in your Data
- New Machine Learning Cheat Sheet | Old one
- Confidence Intervals Without Pain - With Resampling
- Advanced Machine Learning with Basic Excel
- New Perspectives on Statistical Distributions and Deep Learning
- Fascinating New Results in the Theory of Randomness
- Fast Combinatorial Feature Selection

**Other popular resources**

- Comprehensive Repository of Data Science and ML Resources
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- 100 Data Science Interview Questions and Answers
- Cheat Sheets | Curated Articles | Search | Jobs | Courses
- Post a Blog | Forum Questions | Books | Salaries | News

**Archives:** 2008-2014 |
2015-2016 |
2017-2019 |
Book 1 |
Book 2 |
More

**Most popular articles**

- Free Book and Resources for DSC Members
- New Perspectives on Statistical Distributions and Deep Learning
- Time series, Growth Modeling and Data Science Wizardy
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- How to Automatically Determine the Number of Clusters in your Data
- Fascinating New Results in the Theory of Randomness
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions