All Blog Posts Tagged 'rstats' (11)

COVID-19 Risk Heat Maps with Location Data, Apache Arrow, Markov Chain Modeling, and R Shiny

This is the second of two articles about our recent participation in the Pandemic Response Hackathon. Our project (CoronaRank) was one of only 5 projects out of 230 submissions chosen as Spotlight Winners to present at the closing ceremony. Read on for the technical aspects of our solution; for a more general overview of the hackathon and our product, see our first article…


Added by Filip Stachura on April 29, 2020 at 1:00am — 1 Comment

Julia vs R: Freeing the data scientist mind from the curse of vectoRization

Nowadays, most data scientists use either Python or R as their main programming language. That was also true for me until I met Julia earlier this year. Julia promises performance comparable to statically typed compiled languages (like C) while keeping the rapid development features of interpreted languages (like Python, R or Matlab). This performance is achieved by just-in-time (JIT) compilation. Instead of interpreting code, Julia compiles code at runtime. While JIT compilation has…


Added by Daniel Moura on August 22, 2019 at 1:00pm — No Comments

Are you buying an apartment? How to hack competition in the real estate market with data monitoring

In the last couple of years, real estate companies have shifted their focus to the digital world, and now almost all investments have an online system showing what apartments are available. This is very convenient for their potential clients, as they can easily become familiar with the apartments on offer. Things become interesting when all available data is monitored on a weekly basis, and sales progress is analysed.

Why is this so important? As sales…

Added by Michał Frącek on October 11, 2018 at 1:30am — No Comments

R for hackers

Last Sunday at Trivadis Tech Event, I talked about R for Hackers. It was the first session slot on Sunday morning, it was a crazy, nerdy topic, and yet there were, like, 30 people attending! An emphatic thank you to everyone who came!

R, a crazy, nerdy topic? Why is that, you'll be asking. What's so nerdy about using R?

Well, it was about R. But it was neither an introduction ("how to get things done quickly with R"), nor was it even about data science. True, you…


Added by Sigrid Keydana on March 24, 2017 at 2:30am — No Comments

R for SQListas (1): Welcome to the Tidyverse

R for SQListas, what's that about?

This is the 2-part blog version of a talk I gave at DOAG Conference this week. I've also uploaded the slides (no ppt; just a pretty R presentation ;-) ) to the articles section, but if you'd like a little text, I encourage you to read on. That is, if you're in the target group for this…


Added by Sigrid Keydana on November 17, 2016 at 11:30pm — No Comments

Map the Life Expectancy in United States with data from Wikipedia with R

Original post is published at DataScience+

Recently, I became interested in grabbing data from webpages, such as Wikipedia, and visualizing it with R. As I did in my previous post, I use the rvest package to get the data from the webpage and…


Added by Klodian on August 5, 2016 at 10:30pm — No Comments

Export Regression results from R to MS Word

In this post I will present a simple way to export your regression results (or output) from R into Microsoft Word. Previously, I wrote a tutorial on how to create Table 1 with study characteristics and export it into Microsoft Word. These posts are especially useful for researchers who are preparing a manuscript for publication in a peer-reviewed journal.

Get the results…


Added by Klodian on June 9, 2016 at 11:43am — No Comments

Table 1 and the Characteristics of Study Population (rstats)

In research, especially in medical research, we describe the characteristics of our study population in Table 1. Table 1 contains the mean for each continuous (scale) variable and the proportion for each categorical variable. For example, we say that the mean systolic blood pressure in our study population is 145 mmHg, or that 30% of participants are smokers. It is called Table 1 because it is the first table in the manuscript.

To create the Table 1…


Added by Klodian on May 29, 2016 at 6:46am — No Comments

Identify, describe, plot, and remove the outliers from the dataset with R (rstats)

In statistics, an outlier is defined as an observation that lies far away from most of the other observations. Often an outlier is present due to measurement error. Therefore, one of the most important tasks in data analysis is to identify and (if necessary) remove the outliers.

There are different methods to detect outliers, including the standard deviation approach and Tukey's method, which uses the interquartile range (IQR) approach. In this post I will use…
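
As a rough illustration of the IQR approach mentioned in the excerpt, here is a minimal sketch. It is not the code from the original post: that post works in R, whereas this sketch uses Python's standard library. The function name `tukey_outliers`, the sample `readings`, and the conventional multiplier `k = 1.5` are all assumptions made for the example.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Split values into (outliers, kept) using Tukey's fences.

    Hypothetical helper for illustration only. k=1.5 is the
    conventional multiplier, assumed here rather than taken from the post.
    """
    # Quartiles with linear interpolation over the observed range
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    kept = [v for v in values if lower <= v <= upper]
    return outliers, kept


# Made-up sample data: nine ordinary readings and one extreme value
readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
outliers, kept = tukey_outliers(readings)
print(outliers)  # [102]
```

Whether a flagged value should actually be removed still depends on whether it reflects a measurement error, as the excerpt notes.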


Added by Klodian on May 24, 2016 at 11:07pm — No Comments

How to: Parallel Programming in R and Python [Video]

Learn how to utilize multi-core, high-memory machines to dramatically accelerate your computations in R and Python, without any complex or time-consuming setup.

You'll learn:

  1. How to…

Added by Anna Anisin on January 28, 2015 at 12:30pm — No Comments

Get Started with the Data Science Bowl

We’ve created a Domino project with starter code in R and Python for participating in the Data Science Bowl. 

Get a jump start in the competition with our starter project by training your models on massive hardware and running multiple experiments in parallel while keeping track…


Added by Anna Anisin on January 13, 2015 at 3:00pm — No Comments


© 2020 Data Science Central ®