R, Python, Julia — and Polyglot

A recently released poll showed Python increasing its lead over R as the language of choice for analytics professionals. Setting aside whether a sample drawn from online polling is representative of the analytics practitioner population, the findings have nonetheless sparked spirited discussion on the future of software for the trade.

My unscientific sample of opinion shows Python slightly ahead of R, with users of each quite passionate about their favorite. My take is that, given the mature ecosystems of both, Python and R will continue to develop, grow, and compete for the foreseeable future.

What I find particularly heartening are the significant developments surrounding interoperability of the two platforms — the ability to invoke R within Python programs as well as, conversely, Python within R. Indeed, I’ve written on both Python within R and R within Python for Data Science Central in recent months. Kudos to Python commercial vendor Anaconda and R commercial vendor RStudio for actively promoting these “polyglot” features.

Now complicate this analytics software divide even further by introducing Julia, a language designed from the ground up for performant analytics. With MIT bona fides, Julia has progressed significantly since work on it began in 2009. ‘“Julia has been revolutionizing scientific and technical computing since 2009,” says Edelman, the year the creators started working on a new language that combined the best features of Ruby, MatLab, C, Python, R, and others.’ I’m now on my third go-round with Julia and am finally beginning to feel it’s legit. The essential DataFrames package is the real deal.

A new competitor such as Julia is considerably behind from the get-go, remaining so until it can both attain a noticeable programmer presence and establish an open source ecosystem. Julia is approaching that point now, helped in no small part by star recognition and a polyglot commitment that allows it to co-exist in Python/R worlds. I just love the prospect of using R’s uber-productive ggplot in Python and Julia. And I must admit I’m quite impressed by the R-to-Julia package XRJulia, developed by venerable S architect/developer John Chambers, and the Julia-to-R library Rif, from R luminary Laurent Gautier, even though getting them to work is not for the faint of heart.

This Julia-kernel Jupyter notebook demonstrates interoperability from Julia to R and from Julia to Python, showcasing the RCall and Pandas packages. I first read a personal, daily-updated dataset of stock index levels into a Julia DataFrame. I then summarize the data for a subset of portfolios, “feeding” the resultant DataFrame to a series of R ggplot scripts. Finally, I invoke Python Pandas within Julia to read the data into a Python DataFrame, which is summarized and converted to a Julia DataFrame for similar R ggplot visualizations. A subsequent blog will examine R to Julia and Python to Julia functionality.
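
For a rough feel of the read-and-summarize step, here is a minimal sketch in plain Python pandas rather than Julia. It is not the notebook's code: the sample data, the column names `date`, `portfolio`, and `level`, and the helper `summarize_levels` are all assumptions made up for illustration, and the summary chosen (first level, last level, percent change) is just one plausible choice.

```python
import io

import pandas as pd

# Hypothetical sample data in "long" format. The real dataset, its file
# name, and its column names are not shown in the post.
SAMPLE_CSV = """date,portfolio,level
2018-01-02,INDEX_A,100.0
2018-01-03,INDEX_A,102.0
2018-01-04,INDEX_A,101.0
2018-01-02,INDEX_B,200.0
2018-01-03,INDEX_B,210.0
2018-01-04,INDEX_B,220.0
2018-01-02,INDEX_C,50.0
2018-01-03,INDEX_C,49.0
2018-01-04,INDEX_C,51.0
"""


def summarize_levels(csv_text, portfolios):
    """Summarize index levels for the selected portfolios (illustrative helper)."""
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

    # Keep only the requested portfolios, in date order within each one.
    subset = df[df["portfolio"].isin(portfolios)]
    subset = subset.sort_values(["portfolio", "date"])

    # One row per portfolio: first level, last level, number of observations.
    summary = (
        subset.groupby("portfolio")["level"]
        .agg(["first", "last", "count"])
        .reset_index()
        .rename(columns={"first": "first_level",
                         "last": "last_level",
                         "count": "n_days"})
    )

    # Percent change from the first to the last observed level.
    summary["pct_change"] = 100.0 * (
        summary["last_level"] / summary["first_level"] - 1.0
    )
    return summary


summary = summarize_levels(SAMPLE_CSV, ["INDEX_A", "INDEX_B"])
print(summary)
```

A long-format result like this, with one row per portfolio (or per portfolio and date), is the shape that plotting layers such as ggplot tend to work with most easily, which is why the sketch keeps the data long rather than pivoting it wide.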

The software used here is Julia 1.0.0, Python 3.6.5, Microsoft R Open 3.4.4, and JupyterLab 0.32.1.

Find the remainder of the post here.