Steve Miller's Blog (35)

Simulated Significance

I pulled out a dusty copy of Think Stats by Allen Downey the other day. I highly recommend this terrific little read, which teaches statistics with easily understood examples using Python. When I purchased the book eight years ago, the Python code proved invaluable as…

Added by steve miller on May 30, 2019 at 7:56am — No Comments

Nowcasting Chicago Crime with Python-Pandas, and R.

In my many years as a data scientist, I've spent more time doing forecast work than any other type of predictive modeling. As often as not, the challenges have involved forecasting demand for an organization's many products or lines of business a year or more out, based on five or more years of actual data, generally of daily…

Added by steve miller on May 7, 2019 at 5:34am — 1 Comment

Frequencies in Pandas Redux

A little less than a year ago, I posted a blog on generating multivariate frequencies with the Python Pandas data management library, at the same time showcasing Python/R graphics interoperability. For my…

Added by steve miller on April 25, 2019 at 5:33am — No Comments

March Madness, KenPom and Python/Pandas.

March Madness officially arrived at 6 PM CDT, Sunday 3/17/2019. 68 D1 schools -- 32 league champions and 36 at large selections -- received invitations to this year's tournament, which starts…

Added by steve miller on March 18, 2019 at 5:35am — No Comments

A Blast from Python Past -- Part 3

Last time, I posted Part 2 of a blog trilogy on data programming with Python. That article revolved around showcasing …

Added by steve miller on March 4, 2019 at 9:01am — No Comments

A Blast from Python Past -- Part 2

Last week I posted the first of a three-part series on basic data programming with Python. For that article, I resurrected scripts written 10 years ago that deployed core Python data structures and functions to assemble a Python list for…

Added by steve miller on February 5, 2019 at 7:55am — No Comments

A Blast from Python Past

I had an interesting discussion with one of my son's friends at a neighborhood gathering over the holidays. He's just reached the halfway point of a Chicago-area Masters in Analytics program and wanted to pick my brain on the state of the discipline.

Of the four major program foci of business, data, computation, and algorithms, he acknowledged…

Added by steve miller on January 28, 2019 at 8:24am — No Comments

Kicking Chicago with R.

Like most Chicago football fans, I was pretty distraught after the Bears lost last Sunday's playoff game courtesy of a missed field goal at the end -- a kick that first hit the goalpost and then the crossbar before ultimately failing miserably. While most local fans were grief-stricken like me, some were irrationally inconsolable, demanding the…

Added by steve miller on January 11, 2019 at 8:22am — No Comments

XGBoost with Python -- Part 0

After posting my last blog, I decided next to do a 2-part series on …

Added by steve miller on December 20, 2018 at 7:30am — 2 Comments

A So-So Second Date with Julia

A few months ago, I wrote a quite positive blog on the Julia analytics language,…

Added by steve miller on December 12, 2018 at 11:49am — No Comments

Matching the Exact Matching of MatchIt

I started a series on causal inference for data science a few weeks back. I think CI methodologies offer great potential for the DS discipline, given that much of our data is observational, i.e., outside…

Added by steve miller on November 19, 2018 at 1:31pm — No Comments

POTUS and the Stock Market

For those who follow the stock market, October's been a pretty rough month, with overall market levels, as measured by major indexes such as the Russell 3000 and the Wilshire 5000, now down into correction territory of 10 percent declines. The falls, unfortunately, closely follow a …

Added by steve miller on October 30, 2018 at 8:35am — No Comments

Mixing & Matching in R for Data Science

I've spent time over the last few months attempting to enhance my skills in the statistical sub-field of causal inference.

Overly simplified, causal inference comprises a series of methodologies and techniques to assist analysts in making the jump from association or correlation to cause and effect. How can one progress from noting a correlation between factors…

Added by steve miller on October 10, 2018 at 10:00am — 5 Comments

R, Python, Julia -- and Polyglot

A poll released recently showed Python increasing its lead over R as the language of choice for analytics professionals. Setting aside questions of the representativeness to the analytics practitioner population of a…

Added by steve miller on September 24, 2018 at 10:55am — 4 Comments

A Little College Sports Analysis, Part III.

This is the third and (I promise) last of the series "A Little College Sports Analysis", wherein I attempt to use data from the Learfield Directors' Cup to evaluate the prowess of college athletic conferences. The …

Added by steve miller on September 10, 2018 at 11:21am — No Comments

A Little College Sports Analysis II -- 2017-2018 Directors' Cup Conference Rankings

Last time, I wrote on wrangling data from a pdf file to assemble a data set of D1 college athletic performance in the Learfield Directors' Cup competition. In this blog, I embellish that data, calculating individual school ranks from scores…

Added by steve miller on August 27, 2018 at 10:53am — No Comments

A Little College Sports Analysis, but First a Little Data Wrangling

I'm a big college sports fan, especially active in debates about which D1 conference is best. Five years ago, I came across the Learfield Directors' Cup, an annual evaluation/ranking program of college sports performance based on hard numbers. In separate rankings, Division I, II, and…

Added by steve miller on August 14, 2018 at 12:35pm — No Comments

A Little SQL with a Little R

My nephew's a very impressive young man. Five years ago, he received a PhD in Biochemistry/Molecular Biology from a prestigious university, earning numerous teaching and research awards along the way. He then took a faculty…

Added by steve miller on July 16, 2018 at 12:02pm — 6 Comments

ff and Too-Big-for-Memory Data in R -- Part III

After my last blog on the use of relational databases PostgreSQL and MonetDB to help compensate for R's RAM limitations, I received an email from a reader who asked if I'd ever used the R …

Added by steve miller on July 2, 2018 at 11:30am — No Comments

PostgreSQL, MonetDB, and Too-Big-for-Memory Data in R -- Part II

In PostgreSQL, MonetDB, and Too-Big-for-Memory Data in R -- Part I, I began to discuss how data that is too big for RAM can be handled in R, a memory-constrained statistical platform. I attempted to demonstrate the potential of working…

Added by steve miller on June 13, 2018 at 10:00am — No Comments

© 2019 Data Science Central ®