steve miller
  • Male
  • Chicago, IL
  • United States

Steve miller's Friends

  • Kleanthis Koupidis
  • paul gureghian


steve miller's Page

Latest Activity

Cezar Baisanu liked steve miller's blog post Multi Gigabyte R data.table for Ohio Voter Registration/History
Jan 16
steve miller's blog post was featured

Multi Gigabyte R data.table for Ohio Voter Registration/History

Summary: This blog details R data.table programming to handle multi-gigabyte data. It shows how the data can be efficiently loaded, "normalized", and counted. Readers can readily copy and enhance the code below for their own analytic needs. An intermediate level of R coding sophistication is assumed. In my travels over the holidays, I came across an …
Jan 15
steve miller's blog post was featured

Using "record id's" to facilitate processing in Python-Pandas and R-data.table.

Both R and Python-Pandas are array-oriented platforms that support fast filtering through vectors of record-id's. In Python-Pandas, such vectors are implemented via Pandas's powerful index construct; in R-data.table, they're accessible through the "which" and "row.name" functions. In both instances, joins to record-id vectors generate fast subsetted access. How is the record-id vector approach helpful? For starters, the analyst can encapsulate common subsetting conditions once and use many…
Dec 15, 2019
steve miller's blog post was featured

Working with Control Breaks Data in R.

About a year ago, a young neighbor who's enrolled in an MS in Data Science program asked my help on an R coding exercise. The challenge was to compute several new category attributes based on columns in an initially loaded dataframe. His solution was to loop through each of the df rows, populating the new vars with basic if/then logic. Kind of reminded me of how I might…
Nov 5, 2019
Tim Matteson liked steve miller's blog post AWK -- a Blast from Wrangling Past.
Sep 24, 2019
John L. Ries commented on steve miller's blog post AWK -- a Blast from Wrangling Past.
"I never stopped using AWK.  If patterns work, then I use it and it works nicely.  If I need more complex text processing, I use Perl or (more recently) Python. I use sed too, but only for the simplest text processing. UNIX toys are my…"
Sep 23, 2019
Vincent Granville commented on steve miller's blog post AWK -- a Blast from Wrangling Past.
"Wow, that reminds me old memories, like SED (stream editor.)"
Sep 21, 2019
steve miller's blog post was featured

AWK -- a Blast from Wrangling Past.

I recently came across an interesting account by a practical data scientist on how to munge 25 TB of data. What caught my eye at first was the article's title: "Using AWK and R to parse 25tb". I'm a big R user now and made a living with AWK 30 years ago as a budding data analyst. I also empathized with the author's recountings of his painful but steady education on…
Sep 21, 2019
steve miller shared their blog post on Facebook
Sep 5, 2019
steve miller posted a blog post

Jobs, Unemployment and 45's Performance.

Despite the consuming controversy surrounding his presidency, POTUS 45 has been able to secure solid ratings on the performance of the economy over his so-far 30-month administration. And he certainly isn't bashful about taking credit for the successes, opining loudly and often that his tax cuts and de-regulation initiatives have significantly goosed the economy. 45 has a…
Sep 4, 2019
steve miller's blog post was featured

Jobs, Unemployment and 45's Performance.

Sep 4, 2019
steve miller posted a blog post

Using Python and R to Load Relational Database Tables, Part II

Last time I wrote on using Python/Pandas as an adjunct to loading PostgreSQL tables. In this sequel, I demo how R can be used to collaborate with the database in a similar way. The strategy adopted remains the…
Aug 8, 2019
steve miller's blog post was featured

Using Python and R to Load Relational Database Tables, Part II

Aug 8, 2019
Richard Morgan liked steve miller's profile
Aug 6, 2019
Richard Morgan liked steve miller's blog post Using Python and R to Load Relational Database Tables, Part I
Aug 6, 2019

Profile Information

Short Bio
40 years' experience in consulting services surrounding BI, statistics, analytics, and data science. Most recent position was President of Inquidia Consulting and EVP of BI/Analytics at Braun Consulting. Have been a writer for Information Management, Dataversity, and BeyeNetwork for 12 years.
My Web Site Or LinkedIn Profile
https://www.linkedin.com/in/steve-miller-58ab881/
Professional Status
Consultant
Years of Experience:
40
Industry:
Consulting
Your Job Title:
consultant
How did you find out about DataScienceCentral?
Researching data science topics. Looking to find a venue for some of my blogs.
Interests:
Other
What is your Favorite Data Mining or Analytical Website?
http://www.datasciencecentral.com
What Other Analytical Website do you Recommend?
http://www.oreilly.com

Steve miller's Blog

Multi Gigabyte R data.table for Ohio Voter Registration/History

Posted on January 15, 2020 at 5:29am 0 Comments

Summary: This blog details R data.table programming to handle multi-gigabyte data. It shows how the data can be efficiently loaded, "normalized", and counted. Readers can readily copy and enhance the code below for their own analytic needs. An intermediate level of R coding sophistication is assumed.

In my travels over the holidays, I…


Using "record id's" to facilitate processing in Python-Pandas and R-data.table.

Posted on December 13, 2019 at 5:51am 0 Comments

Both R and Python-Pandas are array-oriented platforms that support fast filtering through vectors of record-id's. In Python-Pandas, such vectors are implemented via Pandas's powerful index construct; in R-data.table, they're accessible through the "which" and "row.name" functions. In both instances, joins to record-id vectors generate fast subsetted access.

How is the record-id vector approach helpful? For starters, the analyst can encapsulate common…


Working with Control Breaks Data in R.

Posted on November 4, 2019 at 9:04am 0 Comments


AWK -- a Blast from Wrangling Past.

Posted on September 21, 2019 at 5:30am 2 Comments

I recently came across an interesting account by a practical data scientist on how to munge 25 TB of data. What caught my eye at first was the article's title: "Using AWK and R to parse 25tb". I'm a big R user now and made a living with AWK 30 years ago as a budding data analyst. I also empathized with the author's recountings of…



© 2020 Data Science Central ®
