**Sample forum questions**

Career Related

- ML Math Skills
- Are you smart enough to work at Google?
- What tools can make data scientists more productive?
- Transition between developer and data scientist
- Does it really matter from which college you graduated?
- Which courses should I take?
- Being a Data Science Contractor - UK: How to find work?
- Master's Program Resume
- Research Problem for PhD Student
- What pays most: R, Python, or SQL?

General and Business

- Which classifier has the best performance?
- Java versus Python
- Best method for revenue and sales forecasting
- Which one is best: R, SAS or Python, for data science?
- Is Python or Perl faster than R?
- What is Map-Reduce?
- Two very cool maps: how were they produced?
- Machine Learning Algorithm to Trade Bitcoin
- 8 Types of Data
- What are the differences between prediction, extrapolation, and int...
- More Free Data Sets
- 27 criteria to choose analytic tools
- Is it better to overpredict, or underpredict?
- What are some Data Scientist KPIs?
- Correlation vs. causation
- What is the difference between statistical computing and data mining?
- Regression Trees - What is the best reference?
- Attribution Modeling vs Market Mix Modeling
- How are hotel room rates determined?

Technical

- 40-year-old trick to clean data efficiently - and perform fuzzy matching
- High Precision Computing in Python or R
- K Means Clustering - Effect of random seed
- Anomaly detection in Time Series Data - Help Required
- Regression Analysis
- Python (and R) for Data Science - sample code, libraries, projects, tutorials
- 1.5 TB dataset of anonymized user interactions released by Yahoo
- Correlation Coefficient in Flat Line Model
- Constraint in a Linear Regression Algorithm
- Time series comparison
- Too many variables for a simple linear computation
- Addressing very low event rate for Logistic Regression
- Generalized Coefficient of Correlation for Non-Linear Relationships
- Handling Imbalanced data when building regression models
- How do I forecast a timeseries of data using GARCH(1,1)?
- Oversampling/Undersampling in Logistic Regression
- Discriminant Analysis on Categorical Variables
- Cluster analysis with categorical variables?
- Linearity assumption in Linear Regression
- Outlier detection using cluster analysis
- Clustering idea for very large datasets

