Ok. So there’s been a lot of coverage by various websites, data science gurus, and AI experts about what 2019 holds in store for us. Everywhere you look, we have new fads and concepts for the new year. This article is going to be rather different. We are going to highlight the dark horses – the trends that no one has thought about but will completely disrupt the working IT environment (for both good and bad – depends upon which side of…Continue
The short answer is: "No."
I started teaching myself programming in my 40s, and I am a strong advocate that everyone should learn to code. Even if you have no intention of becoming a developer or a full-stack data scientist, coding teaches you a couple of valuable lessons:
About a month ago, I posted a blog on “Technical Deconstruction.” I described this as a technique to break down aggregate data to distinguish between its contributing parts: these parts might contain unique characteristics compared to the aggregate. For instance, I suggested that it can be helpful to break down data by workday - that is to say, maintaining separate data for each day of the week. I said that the data could be further deconstructed perhaps by time period and employee: the…Continue
Added by Don Philip Faithful on April 14, 2018 at 8:00am — No Comments
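The workday breakdown described in the excerpt above can be sketched in a few lines of Java. This is only an illustration of the general idea, not the author's implementation: the class name WorkdayDeconstruction, the methods byWorkday and mean, and the array-based input are all hypothetical.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: splits an aggregate series into one series per day of the week.
class WorkdayDeconstruction {

    // dates[i] is the date on which values[i] was recorded (assumed input layout).
    static Map<DayOfWeek, List<Double>> byWorkday(LocalDate[] dates, double[] values) {
        Map<DayOfWeek, List<Double>> parts = new EnumMap<>(DayOfWeek.class);
        for (int i = 0; i < dates.length; i++) {
            parts.computeIfAbsent(dates[i].getDayOfWeek(), d -> new ArrayList<>())
                 .add(values[i]);
        }
        return parts;
    }

    // Arithmetic mean of one part; NaN when the part is empty.
    static double mean(List<Double> xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return xs.isEmpty() ? Double.NaN : sum / xs.size();
    }
}
```

Comparing mean(parts.get(DayOfWeek.MONDAY)) with the mean of the whole series is one simple way to see whether a given workday differs from the aggregate. The same grouping step could be repeated by time period or employee.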
Data mining is the process of looking at large banks of information to generate new information. Intuitively, you might think that data “mining” refers to the extraction of new data, but this isn’t the case; instead, data mining is about extrapolating patterns and new knowledge from the data you’ve already collected.
Relying on techniques and technologies from the intersection of database management, statistics, and machine learning, specialists in data mining have dedicated their…Continue
Added by Larry Alton on December 22, 2017 at 7:30am — No Comments
One of the best ways to learn about any topic is to start with very fundamental questions such as What, Why, and so on: the good old Socratic method. In this series of articles on data mining, I plan to approach this topic in a similar fashion.
Simply put, data mining is the process of sifting through large data sets to identify…Continue
Variable reduction is a crucial step for accelerating model building without losing the potential predictive power of the data. With the advent of Big Data and sophisticated data mining techniques, the number of variables encountered is often tremendous, making variable selection or dimension reduction techniques imperative to produce models with acceptable accuracy and generalization. The temptation to build an ecological model using all available information (i.e., all variables) is hard to…Continue
Added by Valiance Solutions on April 21, 2017 at 9:20pm — No Comments
About a month ago in a blog, I introduced what I described as a “spectral attenuation monitor.” At the time I only had an image from MS Works that…Continue
Added by Don Philip Faithful on April 9, 2017 at 6:30am — No Comments
In order to prevent my programs from freezing up while running long calculations, I generally run the calculations on separate threads. In Java, this process can be accomplished by separating the GUI from processing. In the code below, a thread for an instance of MyProcessing would be invoked using start(): e.g. “(new MyProcessing()).start();” would run indefinitely until T is made null. T can be made null by calling stop() or by directly making T null. Often when the GUI is closing, I…Continue
Added by Don Philip Faithful on March 25, 2017 at 9:42am — No Comments
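The thread-control pattern described in the excerpt above can be sketched as follows. This is a minimal reconstruction, not the author's listing: only the names MyProcessing, T, start() and stop() come from the text. The volatile modifier, the daemon flag, the iteration counter and the MyProcessingDemo helper are assumptions added for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Worker that keeps running until its thread reference T is set to null.
class MyProcessing implements Runnable {
    // volatile (assumption) so a write from the GUI thread is visible to the worker
    volatile Thread T;
    // Stand-in for the real long-running calculation: counts loop iterations
    final AtomicLong iterations = new AtomicLong();

    public void start() {
        Thread t = new Thread(this, "my-processing");
        t.setDaemon(true); // assumption: do not keep the JVM alive after the GUI closes
        T = t;
        t.start();
    }

    public void stop() {
        T = null; // the loop in run() notices this and exits
    }

    @Override
    public void run() {
        Thread me = Thread.currentThread();
        while (T == me) {
            iterations.incrementAndGet(); // one unit of "work"
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                return; // treat interruption as a request to finish
            }
        }
    }
}

// Hypothetical driver showing the start/stop cycle from another thread.
class MyProcessingDemo {
    // Runs the worker for about `millis` ms, stops it, and reports
    // whether it did some work and then exited.
    static boolean runAndStop(long millis) {
        MyProcessing p = new MyProcessing();
        p.start();
        Thread worker = p.T;
        try {
            Thread.sleep(millis);
            p.stop();
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive() && p.iterations.get() > 0;
    }

    public static void main(String[] args) {
        System.out.println("stopped cleanly: " + runAndStop(100));
    }
}
```

Checking the field against the current thread, rather than calling the deprecated Thread.stop(), lets the worker finish its current step before it exits. A GUI close handler would call stop() in the same way as the driver does here.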
I have been writing about the Crosswave Differential Algorithm for a number of years. I described in previous blogs how the algorithm emerged almost by accident while I was attempting to write an application intended to support quality control. In this blog I will be discussing the event model that powers the algorithm. Events are the details and circumstances…Continue
Added by Don Philip Faithful on January 14, 2017 at 5:27am — No Comments
This blog contains some snippets of code that I tend to use in Java. I acknowledge that somebody else writing this blog might include different code. Except for a short course at Sun Educational Services, most of my Java programming skills are self-taught. I’m unsure if people with formal backgrounds in computer science might have different styles and conventions. Mine have been shaped primarily by my needs.
Creating a Graphical User Interface…Continue
Added by Don Philip Faithful on November 6, 2016 at 8:00am — No Comments
BERLIN stands for Behavioural Event Reconstruction Linguistic Interface for Narratives. I introduced BERLIN a few blogs ago - in my "final blog." Theoretically after one's final blog, no further blogs are forthcoming. However, I am now posting bonus blogs reflecting aspects of the same closing subject. Today, I will be elaborating on BERLIN's syntax and how its searches are facilitated. As a general rule, the objective of BERLIN is to convert human-friendly narrative into computer-friendly…Continue
Added by Don Philip Faithful on March 5, 2016 at 10:12am — No Comments
I find that different types of surveys represent a large source of data for many organizations: client questionnaires; recruitment interviews; incident debriefings; interrogations; borehole drilling surveys; quality control checks; marketing surveys; security and patrol logs; and inventory audits. I believe that for many people, the idea of collecting information using surveys makes sense; and they recognize the need for the data. Problems arise in relation to the transition from survey to…Continue
Added by Don Philip Faithful on October 10, 2015 at 6:09am — No Comments
“If you treat an individual as he is, he will stay as he is, but if you treat him as if he were what he ought to be and could be, he will become what he ought to be and could be." —JOHANN WOLFGANG VON GOETHE
For the last few years I have been trying to get a handle on the field which encompasses analytics, big data, modeling, prediction, machine learning, algorithms, data mining techniques, rules, computational complexity, latency, data products, data engineering, statistical…Continue
I rarely get to use a walkie-talkie during a course in school. As the snapshot of my desktop shows on the image below, I had both a multi-line telephone and portable radio. Just before the exam, I participated in a simulation. Our tabletop exercise contained an emergency scenario: a train derailment involving the evacuation of residents. I served as the Social Services Director. Although I didn't choose this role for myself, I thought it made sense given my graduate degree in the area of…Continue
Added by Don Philip Faithful on May 3, 2015 at 6:04am — No Comments
Wisconsin Gov. Scott Walker may be surging as an early favorite for the Republican presidential nomination — but the numbers show the person leading among potential Iowa caucus-goers often doesn’t win. In the Democratic field, Hillary Clinton is the most dominant position …Continue
In this post, I discuss the basic characteristics of code that I have personally used to extract online data, a process these days often called data mining. I intend to cover some general features. Those who wish to do so can also compile the coding samples.
Over the years, I have programmed in a number of computer programming languages including Visual Basic, Perl, Python, and LISP (AutoLISP). The coding samples on this blog are written in Java, my language of…Continue
The easiest person in the world to fool is yourself. Data scientists sometimes fool themselves - in matters trivial and important. Thus, I strongly suggest that we acknowledge real or subconscious biases in ourselves, the data, the analysis and group think. It is prudent for data science teams to have…Continue
Added by Michael Walker on June 6, 2013 at 12:11pm — No Comments
The ‘Bell curve’ or the ‘Gaussian bell curve’ is one of the fundamental concepts on which most statistical analysis is based. From social sciences to astronomy to financial services, most real-world applications of statistics rely on the assumption that the data being analysed is distributed in the shape of the bell…Continue
Added by Gaurav Vohra on January 15, 2013 at 12:29am — No Comments
When creating a predictive model, data miners need to “tune” it to make the right kind of mistakes. Setting the cut-off point between ‘promising’ and ‘unpromising’ depends a lot on our client’s biggest concern -- missed opportunities or false alarms.
Data Mining Misconceptions #1: The 50/50 Problem…Continue
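The cut-off trade-off described in the excerpt above can be illustrated with a small Java sketch. It is not taken from the article: the class CutoffTuning, its methods, and the idea of assigning a numeric cost to each kind of mistake are assumptions made for the example. The sketch scores each candidate cut-off by the total cost of missed opportunities and false alarms and keeps the cheapest one.

```java
// Hypothetical example: choose a score cut-off by weighing two kinds of mistakes.
class CutoffTuning {

    // Total cost when every case with score >= cutoff is flagged "promising".
    // missCost: cost of a truly promising case that was not flagged.
    // falseAlarmCost: cost of an unpromising case that was flagged.
    static double cost(double[] scores, boolean[] actual, double cutoff,
                       double missCost, double falseAlarmCost) {
        double total = 0;
        for (int i = 0; i < scores.length; i++) {
            boolean flagged = scores[i] >= cutoff;
            if (actual[i] && !flagged) {
                total += missCost;
            }
            if (!actual[i] && flagged) {
                total += falseAlarmCost;
            }
        }
        return total;
    }

    // Tries "flag nothing" plus each observed score as a candidate cut-off
    // and returns the candidate with the lowest total cost.
    static double bestCutoff(double[] scores, boolean[] actual,
                             double missCost, double falseAlarmCost) {
        double best = Double.POSITIVE_INFINITY; // flag nothing
        double bestCost = cost(scores, actual, best, missCost, falseAlarmCost);
        for (double candidate : scores) {
            double c = cost(scores, actual, candidate, missCost, falseAlarmCost);
            if (c < bestCost) {
                bestCost = c;
                best = candidate;
            }
        }
        return best;
    }
}
```

When missed opportunities are given the higher cost, the chosen cut-off moves down so that more cases are flagged. When false alarms are given the higher cost, it moves up.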