Here I describe the projects I have worked on and my career progression, from my start 25 years ago as a PhD student in statistics until today, including the slow transformation from statistician to data scientist that began more than 20 years ago. These projects also illustrate many applications of data science, most of which are still active.
I was reading through my Twitter feed the other day and saw a comment about the R language being too ad hoc for users. It got me thinking, "Is that bad? Aren't most languages initially seen as ad hoc?".
The beauty of R as a data science tool is its "ad hocedness," in that it can serve multiple interests. At first I can see this being troublesome, since learning the specifics of a tool's use can be daunting. But in the long run I think this benefits a…Continue
Added by Justin on May 15, 2014 at 5:04pm — No Comments
Astrophysicist and data scientist Kirk Borne, Ph.D., was among the first to comprehend the importance of vast increases in data, first as a NASA scientist for almost two decades and now as professor of Astrophysics and Computational Science at George Mason University. He’s among the top “influencers” on matters relating to “big data,” and IBM this year named him a…Continue
Added by Ryan Montano on May 15, 2014 at 11:30am — No Comments
From Slate, StatSoft, BBC, Venturebeat and niche publishers:
Added by Vincent Granville on May 15, 2014 at 11:30am — No Comments
Everyone is talking about data and Big Data. Whether it’s big or small, simple or complex, freely accessible or locked up in spreadsheets, everyone is worrying about how to get their hands on it. Every company has one or more servers, virtual in the cloud, on premises, or both, depending on the size of the organization. Those servers run applications, websites and other software, all of which generate data. Only a small number of people have access to it. Now let me try to explain in simple word…Continue
Added by Prem sah on May 15, 2014 at 8:00am — No Comments
The full version is always published on Monday. Starred articles are new additions, posted between Thursday and Sunday.
Added by Vincent Granville on May 14, 2014 at 6:00pm — No Comments
Your article will be featured in our weekly…Continue
Added by Vincent Granville on May 14, 2014 at 2:00pm — No Comments
Interesting cartoon, epitomizing innumeracy (or simulated innumeracy). Necessary in today's academia to survive and get grants.
Our prior article on this venue began outlining the business value for solving “the other churn” - employee attrition. We introduced…Continue
Added by Vincent Granville on May 13, 2014 at 9:00am — No Comments
"Life imitates art far more than art imitates life." - Oscar Wilde
In Woody Allen's 1973 iconoclastic movie "Sleeper" a man (health food store owner) wakes up two hundred years in the future. For breakfast…
According to Asghar et al. (2009), Business Intelligence (BI) is divided into two main parts: (a) BI dimension and (b) BI process. Knowledge, functionality, technology, business and organisation are categorised under BI dimension. The performance of data sources, data warehousing, ETL, OLAP and other related tools is categorised under BI process. Basically, dimensions and processes are interrelated to form the complete life cycle of a BI system…Continue
Added by Avesh Dhakal on May 12, 2014 at 12:30am — No Comments
I was often the lone wolf among my peers in university because I supported a prominent place in society for corporations and an important social role for capital. I questioned whether the directors and executives of companies entered into boardrooms really intending to “oppress” people such as minorities and people with disabilities. Did they deliberately make bathrooms inaccessible to people in wheelchairs perhaps to advance their preconceptions of who gets to go to the bathroom, I pondered…Continue
Added by Don Philip Faithful on May 10, 2014 at 9:44am — No Comments
As more devices add touch capabilities, doesn't it make sense that your data should be flexible enough to push around?
Researchers at Carnegie Mellon University may be on to something big when it comes to manipulating Big Data.…Continue
Added by Michael Singer on May 9, 2014 at 1:30pm — No Comments
Data Analytics in Government
“If it ain’t broke don’t fix it.”
Were that remark directed at government at any level, for any function, the response would be predictable: could anything be more broke than government? Probably the f-uped conjunction would work its way into most responses. It’s hard to believe that anyone within or associated with government could react differently, even if their outward response were subdued.
Just experiment with it.…Continue
As the size of a database grows, its performance becomes critical. Automation is a growing focus for data center operators facing increasingly complex environments. Database administration is complex, repetitive and time-consuming. DBAs have to work long hours during off-hours downtime. A database outage costs companies heavily and affects their reputation.
Shopping engines and online shopping sites are highly dependent on database performance. Slower application…Continue
Added by Muhammad Saeed on May 9, 2014 at 4:00am — No Comments
The purpose of this article is to demonstrate how the practical Data Scientist can implement a Locality Sensitive Hashing system from start to finish in order to drastically reduce the search time typically required in high dimensional spaces when finding similar items. Locality Sensitive Hashing accomplishes this efficiency by exponentially reducing the amount of data required for storage when collecting features for comparison between similar…Continue
Added by Jake Drew Ph.D. on May 8, 2014 at 9:00am — No Comments
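To make the idea in the excerpt above concrete, here is a minimal sketch of one common Locality Sensitive Hashing scheme: MinHash signatures combined with banding. This is not the article's implementation, which is not shown in this excerpt. The function names (`shingles`, `minhash_signature`, `candidate_pairs`) and the parameters (4-character shingles, 32 hash functions, 16 bands) are assumptions chosen only for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=4):
    """Return the set of k-character shingles of a lowercased text."""
    text = text.lower()
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _seeded_hash(seed, token):
    """Deterministic 64-bit hash of a token; the seed selects the hash function."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash_signature(tokens, num_hashes=32):
    """Keep, for each seeded hash function, the minimum hash over all tokens.

    Assumes `tokens` is non-empty (the text must have at least k characters).
    """
    return [min(_seeded_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def candidate_pairs(signatures, bands=16):
    """Split each signature into bands; items sharing any whole band are candidates."""
    sig_len = len(next(iter(signatures.values())))
    rows = sig_len // bands
    if rows == 0:
        raise ValueError("signature is shorter than the number of bands")
    buckets = defaultdict(set)
    for key, sig in signatures.items():
        for b in range(bands):
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(key)
    pairs = set()
    for members in buckets.values():
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs
```

Only the pairs returned by `candidate_pairs` need a full similarity comparison, which is where the search-time saving comes from. Using more rows per band makes the filter stricter, while more bands makes it more permissive.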
Starred articles are new additions, posted in the last three days.
Added by Vincent Granville on May 7, 2014 at 5:00pm — No Comments
Very interesting article published by the American Statistical Association. The picture below compares computer science with statistical science, before (I guess the early nineties) versus now. The column labeled CS3 (CS for Computer Science) represents modern computer science; this is actually data science. What's left in statistics is for the reader to guess, I suppose.…Continue
Whichever role you are in, there are broadly three ways to stay on a continuous learning track in your field of specialization. Say you are a doctor; for a doctor it is essential to stay up to date with the latest pharmaceutical techniques and medicines. What avenues or channels can you take to keep yourself updated? First, you read the latest editions of books and magazines on medicine. Second, you learn from your peers. Third, you learn from your own experience.…Continue
Added by Tavish Srivastava on May 6, 2014 at 6:30am — No Comments
The prime benefit of data warehousing is simplicity. A data warehouse presents data as a single image, built by collecting data from the different departments of the organisation. As a result, the time needed to produce and work with data falls, which in turn simplifies decision making. This reduction in data access time also leads to an increase in production and effectiveness. A data warehouse also helps to enhance the function of operational systems. It…Continue
Added by Avesh Dhakal on May 4, 2014 at 9:30pm — No Comments