Everybody talks about beautiful charts and graphs, usually produced by Tableau or R (see links below). They are definitely great, and I guess publishers (including us!) love to regularly show such graphs to our readers because they generate a lot of reactions.
But what about graphic diagrams (see examples below)…Continue
Added by Vincent Granville on March 17, 2013 at 1:16pm — No Comments
There are two types of data scientists:
In-Memory Data Grids (IDG) allow organizations to collect, store, analyze and distribute large, fast-changing data sets in near real-time. Organizations are increasingly using IDGs for the efficient sharing of…Continue
Added by Michael Walker on March 13, 2013 at 7:30pm — No Comments
2.7 zettabytes of data exist in the digital universe today, according to recent IBM case studies. Yes, you heard right. I said zettabytes: 10^21 bytes, or 1,000 times larger than an exabyte (which itself is a billion times larger than a gigabyte). From a visual perspective, if the cup of coffee on your desk were equivalent to one gigabyte, a zettabyte would have the same volume as the Great Wall of China. Now multiply that number…Continue
Added by Kyle Herring on March 13, 2013 at 1:15pm — No Comments
The stock market has soared despite no news to support the new highs. Is it time to sell or to short? Or to buy?
Here's my advice, both as a data scientist, and as a former day trader who survived 15 years of stock market volatility (and increased model-resistant price oscillations), mostly by not trading anymore since 2000,…Continue
Thanks to data mining techniques. This article was published in USA Today, but there's nothing new - it's just user segmentation / user profiling and clustering techniques applied to Facebook data. Also, I would not put too much trust in the "likes" - many are fake, especially those generated when you purchase advertising on Facebook.
Would you fake your Facebook "likes" to avoid being an advertiser target?…Continue
Added by Amy on March 12, 2013 at 3:30pm — No Comments
Unstructured Data Really Isn’t
Bradley S. Fordham, PhD (www.linkedin.com/in/drbradleyfordham)
The (ART+DATA) Institute
The term “unstructured data” is truly an oxymoron. All data has structure, and in fact most data has multiple structures…Continue
Added by Zach Piester on March 10, 2013 at 2:18pm — No Comments
I found it odd there was no way to automatically deskew data in R, so I wrote a short little function to do it. It noticeably improves the performance of linear models and linear support vector machines.
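The author's R function is not shown in this excerpt, so what follows is only a rough sketch of one common way to deskew a variable, written in Python rather than R. It assumes strictly positive data and searches a small grid of Box-Cox exponents for the one that leaves the least absolute skewness. The names `skewness`, `box_cox` and `deskew` and the grid of exponents are illustrative assumptions, not the original code.

```python
import math


def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant data has no skew
    return m3 / m2 ** 1.5


def box_cox(values, lam):
    """Box-Cox power transform; inputs must be strictly positive."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


def deskew(values, lambdas=None):
    """Return (best_lambda, transformed_values).

    Assumption: a grid search over Box-Cox exponents, keeping the one
    whose output has the smallest absolute skewness. This is a stand-in
    for whatever method the original R function used.
    """
    if min(values) <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lambdas is None:
        lambdas = [i / 10 for i in range(-20, 21)]  # -2.0 .. 2.0
    best = min(lambdas, key=lambda lam: abs(skewness(box_cox(values, lam))))
    return best, box_cox(values, best)


if __name__ == "__main__":
    # Exponentially growing values are strongly right-skewed.
    data = [math.exp(x / 10) for x in range(50)]
    lam, transformed = deskew(data)
    print("skew before:", skewness(data))
    print("chosen lambda:", lam)
    print("skew after:", skewness(transformed))
```

A grid search keeps the sketch dependency-free. In practice a library routine that fits the exponent by maximum likelihood would likely be preferable, and data containing zeros or negative values would need a shift or a different transform first.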
We are seeking comments and suggestions on a proposed "Data Science Code of Professional Conduct".
Data science is an independent profession. Data scientists have a higher calling than just technical skills. We have a duty to use data science to make life, business and government better.…
Contrary to the prevailing sentiments, I counsel patience with regard to hiring a data scientist. Firstly, I mean no disrespect to my data science colleagues, and many will likely agree with me because no one likes to enter an environment where sub-optimal results or failure are probable. Yes, it’s likely you need to add one or even a few data scientists to the team, but not as your first step into the wide, wide world of Big Data. Give me a few minutes and hopefully you’ll see it my…Continue
Added by Alan Nugent on March 5, 2013 at 4:01am — No Comments
This blog entry continues the topic of how a Data Scientist can convince colleagues to become more data driven. The previous blog covered office politics. This entry covers integration with the strategy and, more specifically, the process that creates the company's strategy.
Every company is unique and consequently so is its strategy process. At first glance, research on how companies develop strategies is complex and contradictory. There are simply too many ways to go about it:…Continue
Added by Vincent Granville on March 3, 2013 at 2:30pm — No Comments
This year, Predictive Analytics World San Francisco (April 14-19) features an incredible agenda filled with awesome keynotes and 35 keynotes from leading organizations.
Check out some of the headliners at PAW SF this year:…Continue
Added by Vincent Granville on March 3, 2013 at 1:00pm — No Comments
SQL is a database query and programming language for retrieving, updating, and managing data in relational databases. SQL was adopted as an ANSI standard in 1986 and became an international (ISO) standard in 1987. Nowadays, SQL has become a basic requirement for every programmer. However, its advantages cannot obscure its disadvantages. SQL is designed specifically for technical personnel. SQL syntax is highly abstract, the logic is hard to understand, and only those with a strong technical background…Continue
Added by Jim King on February 28, 2013 at 6:00pm — No Comments
Added by Vincent Granville on February 28, 2013 at 3:12pm — No Comments
The Berkeley Data Analytics Stack (BDAS) is an open source, next-generation data analytics stack under development at the…Continue
Added by Michael Walker on February 27, 2013 at 10:08am — No Comments
Added by Michael Walker on February 21, 2013 at 9:00am — No Comments
Added by Vincent Granville on February 17, 2013 at 9:00am — No Comments