Can Machine Learning Do Symbolic Manipulation?
I spent some time over the holidays engaged in a fascinating online conversation. The gist of it was a variation of an argument that has been going on in the realm of artificial intelligence since the time of Marvin Minsky and Seymour Papert: whether it is possible for neural networks to do symbolic manipulation (i.e., algorithms). Back in the 1960s and early 1970s, the answer was simple: the computing power required to run neural networks far exceeded what was available at the time, so it was mostly a moot point. The algorithm would win hands down because neural programming was barely even in the race.
However, sixty years later, the situation has changed dramatically. There are certain areas, such as image classification, natural language processing, and multi-dimensional simulations (such as economic or weather modeling) where neural modeling tools are, in general, superior to taking an algorithmic approach – faster, more accurate, and requiring much less human intervention. The reason for this is that machine learning uses experiences, rendered as datasets, in order to determine those patterns that are closest to the shape of the test data (where shape here describes a surface in many dimensions simultaneously).
The reason that this works is that classification is ultimately about defining shapes that identify given entities, then finding the shapes that are most likely to be the defining ones. Each shape, in turn, corresponds to a given label – a cat, a dog, a car, a shoe, a child, an adult. The problem with this, of course, is that you have to label each known shape before you can classify it, at least for the most common type of machine learning, known as supervised learning. You can also let the system identify the shapes itself, then determine what these shapes are after the fact, which is how unsupervised learning generally works, but there’s a catch there – not all found shapes are meaningful to humans, and interpreting those ersatz shapes can make unsupervised learning complicated at the best of times.
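To make the idea of labeled shapes concrete, here is a minimal sketch of a nearest-centroid classifier, one of the simplest supervised methods. It is only an illustration: the measurements, the labels, and the function names `fit_centroids` and `classify` are all hypothetical, and real classifiers learn far richer shapes than a single averaged point per label.

```python
from math import dist
from statistics import fmean


def fit_centroids(samples):
    """Average the labeled feature vectors into one centroid ("shape") per label."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(fmean(axis) for axis in zip(*vectors))
        for label, vectors in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid lies closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical (weight in kg, height in cm) measurements with human-supplied labels.
training = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((30.0, 60.0), "dog"),
]

centroids = fit_centroids(training)
prediction = classify(centroids, (6.0, 27.0))
print(prediction)
```

Note that nothing here works until a human has supplied the labels, which is exactly the cost of supervised learning described above.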
The other problem that occurs with such methods is that you are in essence at the mercy of the source data that you use. If the data that you have is biased, then your model will be biased, and the reality is that there is no way that you can pick out a meaningful sample at the best of times without introducing some kind of bias. This has always been the fundamental flaw of most stochastic methods – they are highly sensitive to garbage in, garbage out.
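A toy sketch of that sensitivity, assuming a one-dimensional classifier that separates children from adults by height alone. The heights, the "basketball team" sample, and the midpoint rule in `learn_threshold` are all made up for illustration; the point is only that the same learning rule produces a different boundary when the sample is skewed.

```python
from statistics import fmean


def learn_threshold(samples):
    """Place the decision boundary midway between the two class means."""
    child_mean = fmean([h for h, label in samples if label == "child"])
    adult_mean = fmean([h for h, label in samples if label == "adult"])
    return (child_mean + adult_mean) / 2


def classify(height_cm, threshold):
    """Label anyone at or above the threshold an adult."""
    return "adult" if height_cm >= threshold else "child"


# Hypothetical heights in cm. The second sample draws its adults only from
# a basketball team, so it over-represents very tall adults.
representative = [(100, "child"), (120, "child"), (140, "child"),
                  (160, "adult"), (170, "adult"), (180, "adult")]
biased = [(100, "child"), (120, "child"), (140, "child"),
          (195, "adult"), (205, "adult")]

fair_threshold = learn_threshold(representative)
biased_threshold = learn_threshold(biased)

# The same 155 cm adult gets a different label depending on the sample used.
print(classify(155, fair_threshold), classify(155, biased_threshold))
```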
Where things become more complicated is when you start attempting to identify relationships between different features in the data. Any decent machine learning algorithm can readily derive whether a thing is inside another thing because such spatial relationships have visual components. However, identifying other relationships – such as whether a given child is related to a given adult (and how they are related) – is far more difficult if this information cannot be translated into something quantifiable (such as strands of DNA). Because we know something about genetics, it becomes possible to relate two people to a fairly high degree, because this can be expressed as probabilities that two shapes are similar in certain ways, but without this foreknowledge – without context – interpreting the results of a given classification will always be problematic.
Network graph theory is ultimately about how multiple things in a network are related to one another, whether that network be a map of roads between cities, a family tree, an organizational chart, or similar structures. In many respects, it is the algorithmic side of networks, describing not just data, but the shapes of data. Schemas are examples of such shapes, as are ontologies and taxonomies. What makes them distinct from neural networks is that they often represent curated relationships. Put another way, such shapes are prior, or contextual, knowledge. Many machine learning purists tend to be dismissive of such networked graphs primarily because they do represent “cured” knowledge (cure and curate both derive from the Latin cura, meaning care or concern), as opposed to implicit neural knowledge, but this may be short-sighted.
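As a small illustration of curated relationships, the sketch below stores a hypothetical family tree as labeled edges and uses a breadth-first search to recover how two people are related. The names, the `parent_of`/`child_of` labels, and the helper functions are assumptions made for the example, not part of any particular graph library.

```python
from collections import deque

# Hypothetical curated family tree: each pair is (parent, child).
PARENT_EDGES = [("Ann", "Bob"), ("Ann", "Cat"), ("Bob", "Dan"), ("Cat", "Eve")]


def build_graph(edges):
    """Store each curated fact in both directions with an explicit label."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, []).append(("parent_of", child))
        graph.setdefault(child, []).append(("child_of", parent))
    return graph


def relationship_path(graph, start, goal):
    """Breadth-first search; returns the edge labels on a shortest path, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [label]))
    return None


graph = build_graph(PARENT_EDGES)
cousins = relationship_path(graph, "Dan", "Eve")
print(cousins)
```

Two steps up and two steps down is the pattern we would call "first cousins". The relationship falls out of the curated structure directly, with no training data involved.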
Most graph neural networks use a particular encoding methodology to describe a particular shape (primarily focusing on graph traversal), but this also throws out a lot of potential knowledge in the process of trying to normalize content. Perhaps, instead, the two can be married by using a shape or schema language such as SHACL to describe a given shape, give that shape an identifier, then create shapes that describe the relationships between different kinds of shapes. If you can create very specific shapes, generalize these shapes, then encode the generalizations, you have what amounts to a very powerful tool for introducing symbolic manipulation into machine learning without having to use extensive (and expensive) models or be reliant upon bad (or insufficient) data.
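The sketch below is a rough Python analogy for that idea, not SHACL itself: the registry layout, the `extends` key, and the `conforms` function are hypothetical stand-ins. Each shape has an identifier, a specific shape can generalize to a parent shape, and a property can require that its value conform to another shape, which is how relationships between shapes are expressed here.

```python
# Hypothetical shape registry. Keys and field names are illustrative only
# and are not SHACL vocabulary.
SHAPES = {
    "PersonShape": {"extends": None, "properties": {"name": str}},
    "AdultShape": {"extends": "PersonShape", "properties": {"age": int}},
    "ChildShape": {
        "extends": "PersonShape",
        # A string value names another shape that the property must conform to.
        "properties": {"age": int, "guardian": "AdultShape"},
    },
}


def required_properties(shape_id):
    """Collect a shape's own properties plus those of the shapes it generalizes to."""
    shape = SHAPES[shape_id]
    inherited = required_properties(shape["extends"]) if shape["extends"] else {}
    return {**inherited, **shape["properties"]}


def conforms(node, shape_id):
    """Check that a dict-based node has every property the shape requires."""
    for prop, expected in required_properties(shape_id).items():
        if prop not in node:
            return False
        value = node[prop]
        if isinstance(expected, str):
            # Relationship constraint: the value must itself match a shape.
            if not isinstance(value, dict) or not conforms(value, expected):
                return False
        elif not isinstance(value, expected):
            return False
    return True


alice = {"name": "Alice", "age": 41}
ben = {"name": "Ben", "age": 9, "guardian": alice}
unlinked = {"name": "Cy", "age": 7}

print(conforms(ben, "ChildShape"), conforms(unlinked, "ChildShape"))
```

Because each shape carries an identifier, a model could in principle be handed the identifier (or an encoding of it) rather than having to rediscover the structure from raw data.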
This is a bit deeper than I ordinarily get in these editorials, but it is also something of a promise about what we here at DSC hope to bring in the new year – high-quality discussions, relevant and timely news, recommendations about approaches in managing and working with data, and techniques for managing your career in this rich and complex space. Welcome, one and all, to Data Science Central.
To subscribe to the DSC Newsletter, go to Data Science Central and become a member today. It’s free!
Data Science Central Editorial Calendar
DSC is looking for editorial content specifically in these areas for December, with these topics having higher priority than other incoming articles.
- AI-Enabled Hardware
- Knowledge Graphs
- GANs and Simulations
- ML in Weather Forecasting
- UI, UX and AI
- GNNs and LNNs
- Digital Twins
DSC Featured Articles
- Intelligent Gateways Applications For Greenfield and Brownfield Environments
  - Dinesh Bhol on 05 Jan 2022
- Can OTT platforms succeed with machine learning services? An insight
  - Divyesh Aegis on 05 Jan 2022
- Trends Towards 2022
  - Kurt A Cagle on 05 Jan 2022
- Outsourcing Data Annotation Work
  - Roger Max on 05 Jan 2022
- Why a Data Science Career Is Worth Pursuing
  - Kathie Adams on 04 Jan 2022
- Top Social Media Content Moderation Trends that Will Reign Supreme in 2022
  - Roger Max on 04 Jan 2022
- Could we Live in a Universe with Fewer than Three Dimensions?
  - Vincent Granville on 04 Jan 2022
- Data science and analytics: the future implications
  - Aileen Scott on 04 Jan 2022
- Data Monetization Approach for B2B2C Industries
  - Bill Schmarzo on 03 Jan 2022
- IoT Drives Growth of Intelligent Transportation Systems
  - Pragati Pa on 03 Jan 2022
- Artificial Intelligence for Mental Health
  - ajit jaokar on 02 Jan 2022
- Building a hypergraph-based semantic knowledge sharing environment for construction
  - Alan Morrison on 01 Jan 2022
- Lying to blockchains and other Web3 dilemmas
  - Alan Morrison on 31 Dec 2021
- Digital Workers in a Business Context
  - Sanjay Sharma on 31 Dec 2021
- Top 12 Software Development Trends in 2022 You Must Watch Out
  - Avani Trivedi on 30 Dec 2021
- AWS Cloud Security: Best Practices
  - Ryan Williamson on 29 Dec 2021
- DSC Weekly Digest 28 December 2021: An Auld Lang Syne (and Cosyne Too)
  - Kurt A Cagle on 28 Dec 2021
Picture of the Week