Recently, I came across an interesting book on statistics that retells the Ugly Duckling story and relates it to today's world of data, or rather Big Data analytics. The story comes from the famous storyteller Hans Christian Andersen.

The story goes like this...

The duckling was a big ugly grey bird, so ugly that even a dog would not bite him. The poor duckling…

Added by Manish Bhoge on January 31, 2016 at 12:00pm — No Comments

I am back! Yes, I am back on my learning track. Sometimes it is really necessary to take a break and introspect on why we learn before learning. It was a nine-month safe refuge to learn how Big Data & Analytics can contribute to a Data Product.

Data strategy has always been expected to generate revenue. As Big Data and Hadoop enter the enterprise data strategy, the big data infrastructure is also expected to add revenue.…

Added by Manish Bhoge on December 12, 2015 at 9:53am — No Comments

Both R & Python should be measured by their effectiveness in advanced analytics & data science. Initially, as newcomers to the data science field, we spend a good amount of time understanding the pros and cons of these two. I too carried out this study solely for myself, to decide which tool I should pick to get into the depths of data science. Eventually, I started realizing that both R and Python have their own areas of mastery along with broad support for data science. Here some…

Added by Manish Bhoge on February 7, 2014 at 9:22pm — 4 Comments

The term "Data Science" has been evolving not only as a niche skill but as a niche process as well. It is interesting to study how Big Data analytics, data science, and analytics can be efficiently implemented in the enterprise. So, along with my usual study of analytics, viz. Big Data analytics, I have also been exploring methodologies to bring "Data Science" into the mainstream of existing enterprise data analysis, which we conventionally know as "Data Warehouse & BI". This…

Added by Manish Bhoge on December 12, 2013 at 7:30am — No Comments

A few days back I attended a good webinar conducted by Metascale on the topic “Are You Still Moving Data? Is ETL Still Relevant in the Era of Hadoop?” This post is about that webinar.

In summary, the webinar nicely explained how an enterprise can use Hadoop as a data hub alongside its existing data warehouse setup. The line “Hadoop as a Data Hub” itself raised a lot of questions in my…

Added by Manish Bhoge on November 17, 2013 at 8:16pm — 5 Comments

Practicing data science is indeed a long-term effort rather than a matter of learning a handful of skills. We ought to be academically good enough to take up this challenge. However, if you think you have come a long way since your academic days, but you still have the zeal & passion to extract the oil from the data and fill the data science skill gap, then here are the **warm-up** tips. The points below must be **exercised** before jumping into…

Added by Manish Bhoge on October 18, 2013 at 9:26am — No Comments

Text (word) analysis and tokenized text modeling always send a chill around the ears, especially when you are new to machine learning. Thanks to Python and its extension libraries for their warm support of text analytics and machine learning. Scikit-learn is a savior and an excellent aid in text processing, once you also understand concepts like "bag of words", "clustering", and "vectorization". Vectorization is a must-know technique for all machine learning learners, text miner…

Added by Manish Bhoge on September 25, 2013 at 9:47am — No Comments
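
The bag-of-words and TF-IDF ideas named in the excerpt above can be sketched by hand in a few lines. This is only an illustration, not the post's own code: the toy corpus, the whitespace tokenizer, and all function names are hypothetical, and the weighting is the plain textbook form (term count times log of N over document frequency), which differs from the smoothed, normalized variant scikit-learn applies by default.

```python
import math
from collections import Counter

# Hypothetical toy corpus, for illustration only.
docs = [
    "the duck swims in the pond",
    "the swan swims in the lake",
    "big data needs big tools",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


# Vocabulary: every distinct token, sorted, mapped to a column index.
vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
index = {tok: i for i, tok in enumerate(vocab)}


def bag_of_words(text):
    """Vectorize a document as raw term counts over the vocabulary."""
    counts = Counter(tokenize(text))
    return [counts.get(tok, 0) for tok in vocab]


bow = [bag_of_words(doc) for doc in docs]

# Document frequency: number of documents that contain each term.
df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]


def tfidf(row):
    """Textbook TF-IDF: term count times log(N / document frequency)."""
    n_docs = len(docs)
    return [tf * math.log(n_docs / df[j]) for j, tf in enumerate(row)]


weights = [tfidf(row) for row in bow]
```

Each row of `bow` is one document turned into a fixed-length count vector, which is what "vectorization" means here; `weights` rescales those counts so that terms appearing in fewer documents count for more. In practice, scikit-learn's `CountVectorizer` and `TfidfVectorizer` cover the same steps with better tokenization and sparse output.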

The data analysis ecosystem has grown all the way from SQL to NoSQL and from Excel analysis to visualization. Today, we are short of resources to process ALL (you know well what I mean by **ALL**) kinds of data coming into the enterprise. Data goes through profiling, formatting, munging or cleansing, pruning, and transformation steps on its way to analytics and predictive modeling. Interestingly, no single tool has proved to be an effective solution to run…

Added by Manish Bhoge on August 27, 2013 at 8:00am — 4 Comments

- Big Data Analytics: From Ugly Duckling to Beautiful Swan
- Where & Why Do You Keep Big Data & Hadoop?
- (R + Python)
- Operational Data Science: excerpt from 2 great articles
- ETL, ELT and Data Hub: Where Hadoop is the right fit ?
- Warm-up exercise before data science.
- Python Scikit-learn to simplify Machine learning : { Bag of words } To [ TF-IDF ]
- An indispensable Python : Data sourcing to Data science.


© 2020 Data Science Central ®
