For the last few years, I have read the free State of AI Report.
Here is a list of the insights I found interesting.
The full report and the download link are at the end of this article.
AI research is less open than you think: Only 15% of papers publish their code
Facebook’s PyTorch is fast outpacing Google’s TensorFlow in research papers, which tends to be a leading indicator of production use down the line
PyTorch is also more popular than TensorFlow in paper implementations on GitHub
Language models: Welcome to the Billion Parameter club
Huge models, large companies and massive training costs dominate the hottest area of AI today, NLP.
Bigger models, datasets and compute budgets clearly drive performance
Empirical scaling laws of neural language models show smooth power-law relationships, which means that as model performance increases, the model size and amount of computation have to increase more rapidly.
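To make the power-law point concrete, here is a minimal sketch. It assumes the loss follows L(N) = (N_c / N) ** alpha in the parameter count N. The exponent value 0.076 is an illustrative assumption of roughly the size reported for language-model scaling laws; it is not a figure taken from this article, and the function name is hypothetical.

```python
def size_multiplier_for_loss_ratio(loss_ratio: float, alpha: float) -> float:
    """How many times larger the model must be to scale the loss by `loss_ratio`.

    Assumes a pure power law L(N) = (N_c / N) ** alpha, so that
    L2 / L1 = (N2 / N1) ** (-alpha)  =>  N2 / N1 = (L2 / L1) ** (-1 / alpha).
    """
    if not 0.0 < loss_ratio <= 1.0:
        raise ValueError("loss_ratio must be in (0, 1]")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return loss_ratio ** (-1.0 / alpha)


# Illustrative exponent (an assumption, not a value from the article).
ALPHA = 0.076

# Model-size multiplier needed for a 10% lower loss under this assumption.
multiplier_10_percent = size_multiplier_for_loss_ratio(0.9, ALPHA)
print(f"10% lower loss needs ~{multiplier_10_percent:.1f}x the parameters")
```

With a small exponent like this, even a modest improvement in loss calls for a model several times larger, which is the "increase more rapidly" effect described above.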
Tuning billions of model parameters costs millions of dollars
Based on variables released by Google et al., you’re paying circa $1 per 1,000 parameters. This means OpenAI’s 175B parameter GPT-3 could have cost tens of millions to train. Experts suggest the likely budget was $10M.
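Taken literally, the rule of thumb is a one-line calculation. The sketch below simply multiplies it out; the function and constant names are hypothetical. Note that the naive figure for 175B parameters lands well above the $10M budget experts suggest, so the rule is probably best read as a rough guide rather than a price list.

```python
# Rule of thumb quoted above: about $1 per 1,000 parameters.
COST_PER_1000_PARAMS_USD = 1.0


def naive_training_cost_usd(n_params: int) -> float:
    """Naive cost estimate: parameters / 1,000 times the per-thousand rate."""
    return n_params / 1000 * COST_PER_1000_PARAMS_USD


gpt3_params = 175_000_000_000
gpt3_cost = naive_training_cost_usd(gpt3_params)
print(f"${gpt3_cost:,.0f}")  # prints $175,000,000
```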
We’re rapidly approaching outrageous computational, economic, and environmental costs to gain incrementally smaller improvements in model performance
Without major new research breakthroughs, dropping the ImageNet error rate from 11.5% to 1% would require over one hundred billion billion dollars! Many practitioners feel that progress in mature areas of ML is stagnant.
A larger model needs less data than a smaller peer to achieve the same performance
This has implications for problems where training data samples are expensive to generate, which likely confers an advantage to large companies entering new domains with supervised learning-based models.
Even as deep learning consumes more data, it continues to get more efficient
Since 2012 the amount of compute needed to train a neural network to the same performance on ImageNet classification has been decreasing by a factor of 2 every 16 months.
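As a quick sketch of what "a factor of 2 every 16 months" compounds to, the helper below assumes the halving rate stays constant over the whole period. The function name is hypothetical.

```python
def compute_reduction_factor(months_elapsed: float,
                             halving_period_months: float = 16.0) -> float:
    """Factor by which required compute shrinks after `months_elapsed`,
    assuming it halves every `halving_period_months`."""
    return 2.0 ** (months_elapsed / halving_period_months)


# Three halving periods (48 months) give an 8x reduction.
four_year_factor = compute_reduction_factor(48)
print(f"After 4 years: {four_year_factor:.0f}x less compute")
```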
A new generation of transformer language models is unlocking new NLP use-cases
GPT-3, T5 and BART are driving a drastic improvement in the performance of transformer models on text-to-text tasks such as translation, summarization, text generation and text-to-code.
NLP benchmarks take a beating: Over a dozen teams outrank the human GLUE baseline
It was only 12 months ago that the human GLUE baseline was beaten by 1 point. Now SuperGLUE is in sight.
What’s next after SuperGLUE? More challenging NLP benchmarks zero-in on knowledge
A multi-task language understanding challenge tests for world knowledge and problem solving ability across 57 tasks including maths, US history, law and more. GPT-3’s performance is lopsided with large knowledge gaps.
The transformer’s ability to generalise is remarkable. It can be thought of as a new layer type that is more powerful than convolutions because it can process sets of inputs and fuse information more globally.
For example, GPT-2 was trained on text but can be fed images in the form of a sequence of pixels to learn how to autocomplete images in an unsupervised manner.
Biology is experiencing its “AI moment”: Over 21,000 papers in 2020 alone
Publications involving AI methods (e.g. deep learning, NLP, computer vision, RL) in biology have grown by more than 50% year-on-year since 2017. Papers published since 2019 account for 25% of all output since 2000.
From physical object recognition to “cell painting”: Decoding biology through images
Large labelled datasets offer huge potential for generating new biological knowledge about health and disease.
Deep learning on cellular microscopy accelerates biological discovery with drug screens
Embeddings from experimental data illuminate biological relationships and predict COVID-19 drug successes.
Ophthalmology advances as the sandbox for deep learning applied to medical imaging
After diagnosis of ‘wet’ age-related macular degeneration (exAMD) in one eye, a computer vision system can predict whether a patient’s second eye will convert from healthy to exAMD within six months. The system uses 3D eye scans and predicted semantic segmentation maps.
AI-based screening mammography reduces false positives and false negatives in two large, clinically-representative datasets from the US and UK
The AI system, an ensemble of three deep learning models operating on individual lesions, individual breasts and the full case, was trained to produce a cancer risk score between 0 and 1 for the entire mammography case. The system outperformed human radiologists and could generalise to US data when trained on UK data only.
Causal reasoning is a vital missing ingredient for applying AI to medical diagnosis
Existing AI approaches to diagnosis are purely associative, identifying diseases that are strongly correlated with a patient’s symptoms. The inability to disentangle correlation from causation can result in suboptimal or dangerous diagnoses.
Model explainability is an important area of AI safety: A new approach aims to incorporate causal structure between input features into model explanations
A flaw with Shapley values, one current approach to explainability, is that they assume the model’s input features are uncorrelated. Asymmetric Shapley Values (ASV) are proposed to incorporate this causal information.
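For readers who have not met Shapley values, here is a minimal sketch of the classical (symmetric) version only: each feature's attribution is its marginal contribution averaged over every ordering of the features, with all orderings weighted equally. As I understand it, the asymmetric variant changes which orderings count so that they respect the causal structure; that part is not implemented here. The toy value function and the feature names are made up for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(features, value):
    """Exact classical Shapley values by enumerating all feature orderings.

    `value` maps a frozenset of features to the model payoff for that subset.
    Only practical for a handful of features, since there are n! orderings.
    """
    totals = {f: 0.0 for f in features}
    for order in permutations(features):
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            # Marginal contribution of f given the features already present.
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in totals.items()}


def toy_value(subset):
    """Hypothetical payoff: two main effects plus one interaction term."""
    score = 0.0
    if "symptom_a" in subset:
        score += 2.0
    if "symptom_b" in subset:
        score += 1.0
    if "symptom_a" in subset and "symptom_b" in subset:
        score += 1.0  # interaction is split equally by the symmetric method
    return score


attributions = shapley_values(["symptom_a", "symptom_b"], toy_value)
print(attributions)
```

The interaction term is shared equally between the two features, whichever of them causally drives the other. That equal split is the behaviour the asymmetric approach sets out to change.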
Reinforcement learning helps ensure that molecules you discover in silico can actually be synthesized in the lab. This helps chemists avoid dead ends during drug discovery.
An RL agent designs molecules using step-wise transitions defined by chemical reaction templates.
American institutions and corporations continue to dominate NeurIPS 2019 papers
Google, Stanford, CMU, MIT and Microsoft Research own the Top-5.
The same is true at ICML 2020: American organisations cement their leadership position
The top 20 most prolific organisations by ICML 2020 paper acceptances further cemented their position vs. ICML 2019. The chart below shows their Publication Index position gains vs. ICML 2019.
Demand outstrips supply for AI talent
Analysis of Indeed.com US data shows almost 3x more job postings than job views for AI-related roles. Job postings grew 12x faster than job views from late 2016 to late 2018.
US states continue to legislate autonomous vehicles policies
Over half of all US states have enacted legislation related to autonomous vehicles.
Even so, driverless cars are still not so driverless: Only 3 of 66 companies with AV testing permits in California have been allowed to test without safety drivers since 2018
The rise of MLOps (DevOps for ML) signals an industry shift from technology R&D (how to build models) to operations (how to run models)
25% of the top-20 fastest growing GitHub projects in Q2 2020 concern ML infrastructure, tooling and operations. Google Search traffic for “MLOps” is now on an uptick for the first time.
As AI adoption grows, regulators give developers more to think about
External monitoring is transitioning from a focus on business metrics down to low-level model metrics. This creates challenges for AI application vendors, including slower deployments, IP sharing and more.
Berkshire Grey robotic installations are achieving millions of robotic picks per month
Supply chain operators realise a 70% reduction in direct labour as a result.
Algorithmic decision making: Regulatory pressure builds
Multiple countries and states start to wrestle with how to regulate the use of ML in decision making.
GPT-3, like GPT-2, still outputs biased predictions when prompted with topics of religion
Examples from GPT-3 (left) and GPT-2 (right) with prompts and the model’s predictions, which contain clear bias. Models trained on large volumes of language on the internet will reflect the bias in those datasets unless their developers make efforts to fix this. See the coverage in State of AI Report 2019 of how Google adapted their translation model to remove gender bias.
The free download link is on the State of AI Report website.