DSC Weekly Digest 14 September 2021 - DataScienceCentral.com

Announcements

Marketing Analytics and Data Science: Join us on October 25-26, 2021 at the Eau Palm Beach Resort & Spa in Palm Beach, FL to meet with MADS speakers and like-minded peers for face-to-face discussions, sessions, and 1-on-1 expert consultations on overcoming your challenges. You will discover the infinite possibilities when marketing analytics and data science align to form a revenue-driving powerhouse.
On September 27th you are invited to the AWS Data Exchange Webinar: How to use external consumer insights and marketing data to build a customer-centric business. External consumer and marketing insights can result in higher customer satisfaction, better retention, and a stronger overall bottom line.

Making Do With Small Data

The 2010s could arguably be described as the era of Big Data, when it suddenly seemed that businesses were being deluged by huge amounts of data that had to be processed immediately. Part of this was an amplification by the IT hype mills, as Big Data required Big Servers (or lots of little ones), faster processors, and more programmers to do the heavy lifting of creating the Data Lakes and Enterprise Warehouses that were so integral to the zeitgeist. Part of it was the impact of mobile computing, which dramatically expanded the number of sensors in play.

Yet the reality on the ground was a bit different for most companies, even many in the IT space itself. Most of the really big data was coming from a few focused social media companies, not from businesses dramatically increasing their data streams elsewhere, and much of that data (most of it) was noise outside of the context it came from. Social media is actually a poor place to pick up on covert terrorist activities (high noise, subtle signals), though it’s great at identifying domestic terrorists who want to publicly high-five themselves with their buddies over their latest hijinks.

Most data is, at the end of the day, the trail that transactions leave over time. This information can be valuable, but from the perspective of a business, the metadata at the other end of the transaction is usually fragmentary and hard to quantify. This is one of the reasons that any comprehensive AI solution has to incorporate both algorithmic processes (machine learning) and annotational processes (semantics). Most analytics tools, even neural networks, tend to concentrate on data from the perspective of the transaction, while annotational processes are often far more useful to a company, as they are a critical source for what is colloquially called “labeling”.

Labeling is often considered bothersome by analysts because it is time-consuming and requires the collection of metadata rather than the analysis of data. This metadata also requires developing a conceptual model and distilling relationships, work that usually does require human intervention. It is possible to infer this metadata using statistical techniques, but doing so requires a huge amount of data while providing, at best, only a hint of the underlying structure.

The next generation of neural networks is beginning to take this small data into account, in essence focusing increasingly on not just the statistics of the data but also its shape. Known as labeled neural networks (LNNs) or graph neural networks (GNNs), these various convolutional neural nets replace brute-force analysis with what amount to Bayesian networks. These use probabilistic models to identify the schema (or model) implicit in the data. With that information (especially when combined with the contextual streaming that provides the working memory for these processes), GNNs can then become self-labeling, assigning not only value but also structure to the resulting function.
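
As a rough illustration of what taking the shape of the data into account can mean, the sketch below runs one round of neighbor aggregation (often called message passing), a basic operation in many graph neural networks. The toy graph, the feature values, and the function name aggregate_neighbors are all hypothetical, and nothing here is learned; a real GNN would apply trained transformations at each step.

```python
from typing import Dict, List


def aggregate_neighbors(
    features: Dict[str, List[float]],
    edges: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """One round of mean aggregation: each node's new feature vector is the
    average of its own vector and the vectors of its neighbors."""
    updated = {}
    for node, own in features.items():
        # Collect this node's vector together with its neighbors' vectors.
        group = [own] + [features[n] for n in edges.get(node, [])]
        # Average column by column across the collected vectors.
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated


# Hypothetical toy graph: one customer linked to two orders.
features = {
    "customer": [1.0, 0.0],
    "order_a": [0.0, 2.0],
    "order_b": [0.0, 4.0],
}
edges = {
    "customer": ["order_a", "order_b"],
    "order_a": ["customer"],
    "order_b": ["customer"],
}

result = aggregate_neighbors(features, edges)
print(result["customer"])
```

After one round, the customer's vector already reflects the orders it is connected to, even though no labels were supplied. Repeating the step spreads information further across the graph, which is the sense in which structure, and not only per-record statistics, shapes the result.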

The biggest benefit of this technology will be making it possible to get the benefits of big data systems without requiring big data. Put another way, artificial intelligence is becoming more intuitive, able to parse out valid patterns with far less raw input. By being able to make do with such small data, all users should be able to benefit from this technology, not simply the ones with the deepest pockets.

In medias res,

Kurt Cagle
Community Editor,
Data Science Central

To subscribe to the DSC Newsletter, go to Data Science Central and become a member today. It’s free!

Data Science Central Editorial Calendar

DSC is looking for editorial content specifically in these areas for September, with these topics having higher priority than other incoming articles.

Machine Learning and IoT
Data Modeling and Graphs
AI-Enabled Hardware (GPUs and similar tools)
JavaScript and AI
GANs and Simulations
ML in Weather Forecasting
UI, UX and AI
Jupyter Notebooks
No-Code Development
Metaverse
GNNs and LNNs

DSC Featured Articles

Neural Network Can Diagnose Covid-19 from Chest X-Rays
Stephanie Glen on 13 Sep 2021

How Businesses Are Using Data Analytics for Better Operational Effi…
Imenso Software on 13 Sep 2021

Interesting Ways Big Data Has Changed the Face of the eCommerce Ind…
Nirav Parmar on 13 Sep 2021

The Top Skills for a Career in Datascience in 2021
Michael Kevin Spencer on 13 Sep 2021

Are Data Scientists Becoming Obsolete?
Vincent Granville on 13 Sep 2021

Deep learning in biology and medicine
ajit jaokar on 12 Sep 2021

4 Ways of Monetization Your Data
Bill Schmarzo on 12 Sep 2021

Advantages of Using Trading Robots on Quantitative Hedge Funds
Rumzz Bajwa on 10 Sep 2021

Selecting the Best Big Data Platform
Prolay Ghosh on 10 Sep 2021

The Top 5 Reasons Why Most AI Projects Fail
Jaimin Dave on 09 Sep 2021

AI for Business Communication: How Effective Is It Really?
Gaurav Sharma on 09 Sep 2021

Technology Firms Are Racing to Make Their Own Chips
Michael Kevin Spencer on 08 Sep 2021

What is DevOps and How can it give a Boost to Software Development?
Varun Bhagat on 08 Sep 2021

Eight Reasons Why Custom Web Application Development Should be the …
Thomas Brown on 08 Sep 2021

Image Distribution
Clayton Davis on 07 Sep 2021

How Logical Data Fabric Accelerates Data Democratization
Saptarshi Sengupta on 07 Sep 2021

A Step By Step Guide To AI Model Development
Jaimin Dave on 07 Sep 2021

Interpreting Image Classification Models via Crowdsourcing
Daria Baidakova on 07 Sep 2021

Elevate your Employee Recognition Program with HR Tech
Aileen Scott on 07 Sep 2021

Big Data Analytics: The Role it Plays in the Banking and Finance Se…
Ryan Williamson on 07 Sep 2021

Data Labeling for Machine Learning Models
Roger Max on 07 Sep 2021

Understanding the Complexity of Metaclasses and their Practical App…
Monika Sangwan on 06 Sep 2021

Binding Cloud, PLM 2.0, and Industry 4.0 into cohesive digital tran…
Vinaksh on 30 Aug 2021

7 Reasons why you need a data integration strategy
Indhu on 02 Jun 2021

DSC Weekly Digest 7 September 2021
Kurt A Cagle on 08 Sep 2021

Picture of the Week

Hiring difficulties jump from last year.

This email, and all related content, is published by Data Science Central, a division of TechTarget, Inc.

275 Grove Street, Newton, Massachusetts, 02466 US