All Blog Posts (7,243)

My data science journey

Here I describe the projects I have worked on and my career progress, from my start 25 years ago as a PhD student in statistics until today, along with my slow transformation from statistician to data scientist, which began more than 20 years ago. This also illustrates many applications of data science, most of which are still active.

Early years

My interest…


Added by Vincent Granville on May 17, 2014 at 12:00pm — 1 Comment

R as Ad Hoc

I was reading through my Twitter feed the other day and saw a comment about the R language being too ad hoc for users. It got me thinking: "Is that bad? Aren't most languages initially seen as ad hoc?"

The beauty of R as a data science tool is its "ad hocedness", in that its use can satisfy multiple interests. Initially I can see this as troublesome, in that learning the specifics of a tool's use can be daunting. But in the long run I think this benefits a…


Added by Justin on May 15, 2014 at 5:04pm — No Comments

Biggest Potential for Big Data: The Expanding Universe of Unknown Unknowns


Astrophysicist and data scientist Kirk Borne, Ph.D., a NASA scientist for almost two decades and now professor of Astrophysics and Computational Science at George Mason University, was among the first to comprehend the importance of vast increases in data. He’s among the top “influencers” on matters relating to “big data,” and IBM this year named him a…


Added by Ryan Montano on May 15, 2014 at 11:30am — No Comments

Reduce operational costs and improve data-driven decision making in the Big Data era!

Everyone is talking about data and big data. Whether it’s big or small, simple or complex, freely accessible or locked up in spreadsheets, everyone is worrying about how to get their hands on it. Every company has one or more servers, virtual in the cloud, on premises, or both, depending on the size of the organization. Those servers run applications, websites and other software, all of which generate data. Only a small number of people have access to it. Now let me try to explain in simple word…


Added by Prem sah on May 15, 2014 at 8:00am — No Comments

Weekly Digest - May 19

The full version is always published on Monday. Starred articles are new additions, posted between Thursday and Sunday.

Featured Contributions


Added by Vincent Granville on May 14, 2014 at 6:00pm — No Comments

50 free copies of data science book, signed by the author: get yours!

Fifty copies of my Wiley book are available for the first 50 bloggers posting an original, relevant, non-promotional article in our blog section.

Your article will be featured in our weekly…


Added by Vincent Granville on May 14, 2014 at 2:00pm — No Comments

The Science News Cycle

Interesting cartoon, epitomizing innumeracy (or simulated innumeracy). Necessary in today's academia to survive and get grants.


Added by Mirko Krivanek on May 13, 2014 at 6:30pm — 1 Comment

Employee Churn 202: Good and Bad Churn

Guest blog post by Pasha Roberts, Chief Scientist, Talent Analytics @pasharoberts

Our prior article on this venue began outlining the business value of solving “the other churn” - employee attrition. We introduced…


Added by Vincent Granville on May 13, 2014 at 9:00am — No Comments

Bad Data Science and Woody Allen

"Life imitates art far more than art imitates life." - Oscar Wilde

In Woody Allen's iconoclastic 1973 movie "Sleeper", a man (a health food store owner) wakes up two hundred years in the future. For breakfast…


Added by Michael Walker on May 12, 2014 at 3:00pm — 1 Comment

Business Intelligence Architecture

According to Asghar et al. (2009), Business Intelligence (BI) is divided into two main parts: (a) BI dimension and (b) BI process. Knowledge, functionality, technology, business and organisation are categorised under BI dimension. The performance of data sources, data warehousing, ETL, OLAP and other related tools is categorised under BI process. Basically, dimensions and processes are interrelated to form the complete life cycle of a BI system…


Added by Avesh Dhakal on May 12, 2014 at 12:30am — No Comments

Addition of Different Dimensions to Data

I was often the lone wolf among my peers in university because I supported a prominent place in society for corporations and an important social role for capital. I questioned whether the directors and executives of companies entered boardrooms really intending to “oppress” people such as minorities and people with disabilities. Did they deliberately make bathrooms inaccessible to people in wheelchairs, perhaps to advance their preconceptions of who gets to go to the bathroom, I pondered…


Added by Don Philip Faithful on May 10, 2014 at 9:44am — No Comments

iPad Program Lets You Touch Your Data


As more devices add touch capabilities, doesn't it make sense that your data should be flexible enough to push around?

Researchers at Carnegie Mellon University may be on to something big when it comes to manipulating Big Data.…


Added by Michael Singer on May 9, 2014 at 1:30pm — No Comments

Data Analytics in Government

“If it ain’t broke don’t fix it.”

Were that remark directed at government at any level for any function, the response would be predictable: could anything be more broke than government? Probably the f-uped conjunction would work its way into most responses. It’s hard to believe that anyone within or associated with government could react differently, even if their outward response were subdued.

Just experiment with it.…


Added by Ken Gold on May 9, 2014 at 12:00pm — 3 Comments

GUI software for database performance management

As a database grows, its performance becomes critical. Automation is a growing focus for data center operators facing increasingly complex environments. Database administration is complex, repetitive and time consuming. DBAs have to work long hours during off-hours downtime. Database outages cost companies heavily and damage their reputation.

Shopping engines and online shopping places are highly dependent on database performance. Slower application…


Added by Muhammad Saeed on May 9, 2014 at 4:00am — No Comments

Practical Applications of Locality Sensitive Hashing for Unstructured Data


The purpose of this article is to demonstrate how the practical Data Scientist can implement a Locality Sensitive Hashing system from start to finish in order to drastically reduce the search time typically required in high dimensional spaces when finding similar items.  Locality Sensitive Hashing accomplishes this efficiency by exponentially reducing the amount of data required for storage when collecting features for comparison between similar…


Added by Jake Drew Ph.D. on May 8, 2014 at 9:00am — No Comments

Weekly Digest, May 12

Starred articles are new additions, posted in the last three days.

Featured Contributions


Added by Vincent Granville on May 7, 2014 at 5:00pm — No Comments

How the gap between data science and statistics grew over time

Very interesting article published by the American Statistical Association. The picture below compares computer science with statistical science - before (I guess the early nineties) versus now. The column labeled CS3 (CS for Computer Science) represents modern computer science; actually, this is data science. What's left in statistics is for the reader to guess, I suppose.…


Added by Mirko Krivanek on May 6, 2014 at 7:36am — 1 Comment

Five must read analytics blogs

Whatever your role, there are broadly three ways to stay on a continuous learning track in your field of specialization. Say you are a doctor; for a doctor, it is essential to be up to date with all the latest pharmaceutical techniques and medicines. What avenues or channels can you take to keep yourself updated? First, you will read lots of the latest books and magazines on medicine. Second, you will learn from your peers. Third, you will learn from your own experience.…


Added by Tavish Srivastava on May 6, 2014 at 6:30am — No Comments

Benefits of Data warehouse and Business Intelligence

The prime benefit of data warehousing is simplicity. A data warehouse presents data as a single image, built by collecting data from the different departments of the organisation. As a result, the time needed to produce and operate on data falls, which in turn simplifies decision making. This reduction in data access time also leads to an increase in production and effectiveness. A data warehouse will also help to enhance the function of operational systems. It…


Added by Avesh Dhakal on May 4, 2014 at 9:30pm — No Comments

© 2020   Data Science Central ®