
Best Machine Learning Research of 2021 So Far

Machine Learning Modeling Research, posted by Daniel Gutierrez, ODSC, July 19, 2021

The start of 2021 saw many prominent research groups pushing machine learning science to new heights. In keeping pace with this accelerated progress, I’ve noticed a number of hot topics gaining researchers’ attention: explainable/interpretable ML, federated learning, gradient boosting, causal inference, ROC analysis, and many others. In this article, we’ll take a journey through my top picks of papers from the first half of 2021 that I found compelling and worthwhile, representing research directions I consider very promising. I hope you enjoy my selections as much as I have. (Check my lists from 2019 and 2020.)

Scaling Hierarchical Agglomerative Clustering to Billion-sized Datasets

Hierarchical Agglomerative Clustering (HAC) is one of the oldest and still most widely used clustering methods. However, HAC is notoriously hard to scale to large data sets, as the underlying complexity is at least quadratic in the number of data points and many algorithms for HAC are inherently sequential. This paper proposes Reciprocal Agglomerative Clustering (RAC), a distributed algorithm for HAC that uses a novel strategy to efficiently merge clusters in parallel. The paper proves theoretically that RAC recovers the exact solution of HAC. Furthermore, under clusterability and balancedness assumptions, the authors show provable speedups in total runtime due to parallelism, and that these speedups are achievable for certain probabilistic data models. Extensive experiments show that this parallelism is achieved on real-world data sets and that RAC can recover the HAC hierarchy on billions of data points connected by trillions of edges in less than an hour.
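The reciprocal-nearest-neighbor merge at the heart of RAC can be sketched in a few lines. The following is a hypothetical single-machine toy with centroid linkage, not the authors' distributed algorithm: in each round, clusters that pick each other as nearest neighbors are merged, which is the step RAC performs in parallel across machines.

```python
import math

def rac(points, k):
    """Toy reciprocal agglomerative clustering with centroid linkage.

    Each round, every cluster finds its nearest neighbor; pairs that
    choose each other (reciprocal nearest neighbors) are merged. In the
    distributed RAC algorithm these merges happen in parallel.
    """
    # Each cluster: (centroid, size, member indices).
    clusters = [(p, 1, [i]) for i, p in enumerate(points)]
    while len(clusters) > k:
        # Nearest neighbor of each cluster, by centroid distance.
        nn = []
        for i, (ci, _, _) in enumerate(clusters):
            j = min((j for j in range(len(clusters)) if j != i),
                    key=lambda j: math.dist(ci, clusters[j][0]))
            nn.append(j)
        merged, used = [], set()
        for i, j in enumerate(nn):
            if i in used or j in used:
                continue
            if nn[j] == i and i < j:  # reciprocal pair -> safe to merge
                (ci, si, mi), (cj, sj, mj) = clusters[i], clusters[j]
                c = tuple((si * a + sj * b) / (si + sj) for a, b in zip(ci, cj))
                merged.append((c, si + sj, mi + mj))
                used.update((i, j))
        clusters = merged + [c for i, c in enumerate(clusters) if i not in used]
    return [sorted(members) for _, _, members in clusters]

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(rac(points, 2))  # two tight groups: [[0, 1], [2, 3]]
```

The key property RAC exploits is that merging reciprocal nearest-neighbor pairs does not change the final hierarchy under reducible linkages, so all such pairs in a round can be merged concurrently.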

Deep ROC Analysis and AUC as Balanced Average Accuracy to Improve Model Selection, Understanding and Interpretation

Optimal performance is critical for decision-making tasks from medicine to autonomous driving, yet common performance measures may be too general or too specific. For binary classifiers, diagnostic tests, or prognosis at a timepoint, measures such as the area under the receiver operating characteristic (ROC) curve or the area under the precision-recall curve are too general because they include unrealistic decision thresholds. On the other hand, measures such as accuracy, sensitivity, or the F1 score are measured at a single threshold, reflecting a single probability or predicted risk rather than a range. This paper proposes a method in between, deep ROC analysis, which examines groups of probabilities or predicted risks for more insightful analysis. The research translates esoteric measures into familiar terms: AUC and the normalized concordant partial AUC are balanced average accuracy (a new finding); the normalized partial AUC is average sensitivity; and the normalized horizontal partial AUC is average specificity.
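To make the idea concrete, here is a toy sketch (my illustration, not the authors' implementation) of examining groups of thresholds rather than a single operating point or the whole curve: sensitivity and specificity are averaged within each threshold group, and their mean gives a balanced accuracy per group.

```python
def roc_group_report(scores, labels, groups=3):
    """Toy deep-ROC-style analysis: instead of one overall AUC, report
    average sensitivity and specificity within groups of thresholds."""
    thresholds = sorted(set(scores), reverse=True)  # ROC-style sweep
    P = sum(labels)
    N = len(labels) - P
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((tp / P, 1 - fp / N))  # (sensitivity, specificity)
    # Split the threshold sweep into contiguous groups; average within each.
    size = -(-len(points) // groups)  # ceiling division
    report = []
    for g in range(0, len(points), size):
        chunk = points[g:g + size]
        avg_sens = sum(p[0] for p in chunk) / len(chunk)
        avg_spec = sum(p[1] for p in chunk) / len(chunk)
        # Balanced accuracy at a threshold is (sensitivity + specificity) / 2.
        report.append((avg_sens, avg_spec, (avg_sens + avg_spec) / 2))
    return report

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   1,   0,   0,   0]
for sens, spec, bal in roc_group_report(scores, labels):
    print(f"avg sens={sens:.2f}  avg spec={spec:.2f}  balanced acc={bal:.2f}")
```

Grouping this way shows, for example, that a classifier may be strong in its high-confidence threshold range but weak elsewhere, which a single AUC number hides.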

Hardware Acceleration of Explainable Machine Learning using Tensor Processing Units

Machine learning (ML) has achieved human-level performance in various fields, yet its black-box nature limits its ability to explain an outcome. While existing explainable ML methods are promising, almost all of them cast interpretability as an optimization problem. Such a mapping leads to numerous iterations of time-consuming complex computations, which limits their applicability in real-time applications. This paper proposes a novel framework for accelerating explainable ML using Tensor Processing Units (TPUs). The proposed framework exploits the synergy between matrix convolution and the Fourier transform, and takes full advantage of TPUs’ natural ability to accelerate matrix computations. Specifically, the paper makes three important contributions: (i) the proposed work is the first attempt at enabling hardware acceleration of explainable ML using TPUs; (ii) the proposed approach is applicable across a wide variety of ML algorithms, and effective utilization of TPU-based acceleration can lead to real-time outcome interpretation; and (iii) extensive experimental results demonstrate that the proposed approach can provide an order-of-magnitude speedup in both classification time (25x on average) and interpretation time (13x on average) compared to state-of-the-art techniques.
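The "synergy between matrix convolution and Fourier transform" refers to the classical convolution theorem: convolution in the signal domain becomes pointwise multiplication in the frequency domain, which maps onto the dense matrix arithmetic TPUs excel at. A minimal pure-Python illustration with a naive DFT (a real system would use an FFT or the TPU's matrix units):

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform, O(n^2). Fast FFT kernels and TPU
    matrix units make this step cheap, which is what the paper exploits."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolve(a, b):
    """Convolution theorem: conv(a, b) = IDFT(DFT(a) * DFT(b))."""
    fa, fb = dft(a), dft(b)
    prod = [x * y for x, y in zip(fa, fb)]
    return [round(v.real, 6) + 0.0 for v in dft(prod, inverse=True)]

a = [1, 2, 3, 0]
b = [0, 1, 0, 0]  # a delta at index 1: convolving shifts a by one slot
print(circular_convolve(a, b))  # -> [0.0, 1.0, 2.0, 3.0]
```

Convolving with a shifted delta just rotates the input, which makes the result easy to verify by hand; the same identity holds for the large convolutions that arise in interpretation workloads.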

Learning to Optimize: A Primer and A Benchmark

Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods, aiming to reduce the laborious iterations of hand engineering. It automates the design of an optimization method based on its performance on a set of training problems. This data-driven procedure generates methods that can efficiently solve problems similar to those in the training set. In sharp contrast, typical traditional designs of optimization methods are theory-driven, so they obtain performance guarantees over the classes of problems specified by the theory. The difference makes L2O suitable for repeatedly solving a certain type of optimization problem over a specific distribution of data, while it typically fails on out-of-distribution problems. The practicality of L2O depends on the type of target optimization, the chosen architecture of the method to learn, and the training procedure. This new paradigm has motivated a community of researchers to explore L2O and report their findings. This paper is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization. The GitHub repo associated with this paper can be found HERE.
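Reduced to its simplest possible form, L2O means searching for an update rule that performs well on a distribution of training problems. The toy sketch below (my illustration, not the paper's method) "learns" only a scalar step size on random quadratics, then shows it working in-distribution and failing out-of-distribution, mirroring the caveat above:

```python
import random

random.seed(0)

def quadratic(a):
    """A training 'problem': minimize f(x) = a * x^2. Returns its gradient."""
    return lambda x: 2 * a * x

def loss_after_steps(step, grad, steps=20, x0=1.0):
    """Run gradient descent and return squared distance from the optimum 0."""
    x = x0
    for _ in range(steps):
        x -= step * grad(x)
    return x * x

# "Learning to optimize" in miniature: pick the update rule (here just a
# scalar step size) that works best on average over a training distribution.
train_problems = [quadratic(random.uniform(0.5, 1.5)) for _ in range(50)]
candidates = [i / 100 for i in range(1, 100)]
best_step = min(candidates,
                key=lambda s: sum(loss_after_steps(s, g) for g in train_problems))

# The learned step works well on a similar (in-distribution) problem...
in_dist = loss_after_steps(best_step, quadratic(1.0))
# ...but can fail badly on an out-of-distribution one (much larger curvature).
out_dist = loss_after_steps(best_step, quadratic(20.0))
print(best_step, in_dist, out_dist)
```

Real L2O methods learn far richer update rules (e.g., recurrent networks that map gradient histories to updates), but the in-distribution/out-of-distribution trade-off shown here is the same one the survey discusses.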

Federated Quantum Machine Learning

Distributed training across several quantum computers could significantly reduce training time, and sharing the learned model rather than the data could improve privacy, since training would happen where the data is located. To the authors’ knowledge, no prior work has addressed quantum machine learning (QML) in a federated setting. This paper presents federated training of hybrid quantum-classical machine learning models, although the framework could be generalized to pure quantum machine learning models. Specifically, the authors consider a quantum neural network (QNN) coupled with a classical pre-trained convolutional model. The distributed federated learning scheme achieved almost the same trained-model accuracy while training significantly faster, demonstrating a promising research direction for scaling and privacy.

Towards Causal Representation Learning

The two fields of machine learning and graphical causality arose and developed separately. However, there is now cross-pollination and increasing interest in each field to benefit from the advances of the other. This paper reviews fundamental concepts of causal inference and relates them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: the authors note that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning: the discovery of high-level causal variables from low-level observations. Finally, the paper delineates some implications of causality for machine learning and proposes key research areas at the intersection of both communities.

QuPeL: Quantized Personalization with Applications to Federated Learning

Traditionally, federated learning (FL) aims to train a single global model while collaboratively using multiple clients and a server. Two natural challenges that FL algorithms face are heterogeneity in data across clients and collaboration of clients with diverse resources. This machine learning research paper introduces a quantized and personalized FL algorithm QuPeL that facilitates collective training with heterogeneous clients while respecting resource diversity. For personalization, clients are allowed to learn compressed personalized models with different quantization parameters depending on their resources. Towards this, an algorithm is proposed for learning quantized models through a relaxed optimization problem, where quantization values are also optimized over. When each client participating in the (federated) learning process has different requirements of the quantized model (both in value and precision), a quantized personalization framework is formulated by introducing a penalty term for local client objectives against a globally trained model to encourage collaboration. 
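A heavily simplified sketch of the penalty idea follows (an illustration under assumed toy losses, not QuPeL itself, which also optimizes the quantization values): each client minimizes its local loss plus a penalty pulling it toward the global model, quantizes at its own precision, and the server aggregates the personalized models.

```python
def quantize(w, step):
    """Uniform quantization with a client-specific step size (resource level)."""
    return round(w / step) * step

def local_update(w_global, target, lam, step, lr=0.1, iters=100):
    """Minimize a toy local loss (w - target)^2 / 2 plus a penalty
    lam/2 * (w - w_global)^2 that encourages collaboration, then
    quantize the personalized model at this client's precision."""
    w = w_global
    for _ in range(iters):
        grad = (w - target) + lam * (w - w_global)
        w -= lr * grad
    return quantize(w, step)

# Heterogeneous clients: different data (targets) and different precisions.
clients = [(1.0, 0.25), (2.0, 0.5), (4.0, 1.0)]  # (local optimum, quant step)
w_global, lam = 0.0, 0.5
for _ in range(10):  # federated rounds
    personal = [local_update(w_global, t, lam, q) for t, q in clients]
    w_global = sum(personal) / len(personal)  # server aggregates

print(w_global, personal)  # personalized models settle at [1.25, 2.0, 3.0]
```

The penalty weight lam controls the personalization/collaboration trade-off: lam = 0 gives purely local models, while a large lam forces all clients toward the shared global model.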

Explainable Artificial Intelligence Approaches: A Survey

The lack of explainability of a decision from an Artificial Intelligence (AI) based “black box” system or model, despite its superiority in many real-world applications, is a key stumbling block for adopting AI in many high-stakes applications across domains and industries. While many popular Explainable Artificial Intelligence (XAI) methods are available to facilitate a human-friendly explanation of a decision, each has its own merits and demerits, with a plethora of open challenges. This machine learning research paper demonstrates popular XAI methods on a common case study (credit default prediction), analyzes them for competitive advantages from multiple perspectives (e.g., local, global), provides meaningful insight on quantifying explainability, and recommends paths toward responsible, human-centered AI using XAI as a medium. Practitioners can use this work as a catalog to understand, compare, and correlate the competitive advantages of popular XAI methods. In addition, the survey elicits future research directions toward responsible, human-centric AI systems, which is crucial for adopting AI in high-stakes applications.

A Machine Learning Data Processing Framework

Training machine learning models requires feeding input data for models to ingest. Input pipelines for machine learning jobs are often challenging to implement efficiently, as they require reading large volumes of data, applying complex transformations, and transferring data to hardware accelerators while overlapping computation and communication to achieve optimal performance. This machine learning research paper presents a framework for building and executing efficient input pipelines for machine learning jobs. The API provides operators which can be parameterized with user-defined computation, composed, and reused across different machine learning domains. These abstractions allow users to focus on the application logic of data processing, while the framework’s runtime ensures that pipelines run efficiently. The paper demonstrates that input pipeline performance is critical to the end-to-end training time of state-of-the-art machine learning models, and that the framework delivers the high performance required while avoiding the need for manual tuning of performance knobs.
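The operator-composition idea can be sketched with plain Python generators. This is an illustrative stand-in, not the paper's API: map and batch operators that compose and a prefetcher that overlaps data preparation with consumption.

```python
import queue
import threading

def source(n):
    """Root of the pipeline: yields raw examples (here, just integers)."""
    for i in range(n):
        yield i

def map_op(it, fn):
    """Apply a user-defined transformation to every element."""
    for x in it:
        yield fn(x)

def batch_op(it, size):
    """Group consecutive elements into fixed-size batches."""
    batch = []
    for x in it:
        batch.append(x)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def prefetch(it, buffer_size=2):
    """Run the upstream pipeline in a background thread so the consumer
    (e.g. a training step) does not wait on preprocessing."""
    q = queue.Queue(maxsize=buffer_size)
    DONE = object()
    def worker():
        for x in it:
            q.put(x)
        q.put(DONE)
    threading.Thread(target=worker, daemon=True).start()
    while (x := q.get()) is not DONE:
        yield x

pipeline = prefetch(batch_op(map_op(source(7), lambda x: x * x), size=3))
print(list(pipeline))  # -> [[0, 1, 4], [9, 16, 25], [36]]
```

Because each operator consumes any iterator, the stages compose freely and can be reused across jobs; a production runtime additionally handles parallel map execution, autotuning, and accelerator transfer.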

Boost then Convolve: Gradient Boosting Meets Graph Neural Networks

Graph neural networks (GNNs) are powerful models that have been successful in various graph representation learning tasks, whereas gradient boosted decision trees (GBDTs) often outperform other machine learning methods when faced with heterogeneous tabular data. But what approach should be used for graphs with tabular node features? Previous GNN models have mostly focused on networks with homogeneous sparse features and are suboptimal in the heterogeneous setting. This paper proposes a novel architecture that trains GBDT and GNN jointly to get the best of both worlds: the GBDT model deals with heterogeneous features, while the GNN accounts for the graph structure. The model benefits from end-to-end optimization by allowing new trees to fit the gradient updates of the GNN. With an extensive experimental comparison to leading GBDT and GNN models, the researchers demonstrate a significant increase in performance on a variety of graphs with tabular features. The GitHub repo associated with this paper can be found HERE.
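As a rough intuition only (not the paper's joint end-to-end training, in which new trees fit the GNN's gradient updates), the toy below simply chains the two stages: boosted stumps handle the tabular node features, then one neighbor-averaging "graph convolution" smooths the predictions over the graph.

```python
def stump(xs, residuals):
    """Fit the best single-feature threshold split minimizing squared error."""
    best = None
    for f in range(len(xs[0])):
        for t in sorted({x[f] for x in xs}):
            left = [r for x, r in zip(xs, residuals) if x[f] <= t]
            right = [r for x, r in zip(xs, residuals) if x[f] > t]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, f, t, lv, rv)
    _, f, t, lv, rv = best
    return lambda x: lv if x[f] <= t else rv

def gbdt(xs, ys, rounds=10, lr=0.5):
    """Tiny gradient boosting: each stump fits the current residuals."""
    preds, trees = [0.0] * len(xs), []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = stump(xs, residuals)
        trees.append(tree)
        preds = [p + lr * tree(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * t(x) for t in trees)

# Node features, labels, and an adjacency list for a tiny graph.
xs = [(0.0,), (0.2,), (0.9,), (1.0,)]
ys = [0.0, 0.0, 1.0, 1.0]
adj = {0: [1], 1: [0], 2: [3], 3: [2]}

model = gbdt(xs, ys)
boosted = [model(x) for x in xs]
# One "GNN layer": average each node's prediction with its neighbors'.
smoothed = [(boosted[i] + sum(boosted[j] for j in adj[i])) / (1 + len(adj[i]))
            for i in range(len(xs))]
print([round(v, 2) for v in smoothed])  # -> [0.0, 0.0, 1.0, 1.0]
```

The paper's contribution is precisely that these two stages need not be separate: training them jointly lets the trees specialize to what the graph layer cannot express, and vice versa.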

Notes from the editor:

How to Learn More about Machine Learning Research

At our upcoming event, ODSC West 2021, happening this November 16th-18th in San Francisco, we’ll feature a plethora of talks, workshops, and training sessions on machine learning and machine learning research. You can register now for 60% off all ticket types before the discount drops to 40% in a few weeks. Some highlighted sessions on machine learning include:

Sessions on MLOps:

Sessions on Deep Learning:

Daniel Gutierrez, ODSC

Daniel D. Gutierrez is a practicing data scientist who’s been working with data since long before the field came into vogue. As a technology journalist, he enjoys keeping a pulse on this fast-paced industry. Daniel is also an educator, having taught data science, machine learning, and R classes at the university level. He has authored four computer industry books on database and data science technology, including his most recent title, “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.” Daniel holds a BS in Mathematics and Computer Science from UCLA.

Machine Learning: Recently Published Documents

An Explainable Machine Learning Model for Identifying Geographical Origins of Sea Cucumber Apostichopus japonicus Based on Multi-Element Profile

A Comparison of Machine Learning- and Regression-Based Models for Predicting Ductility Ratio of RC Beam-Column Joints

Alexa, Is This a Historical Record?

Digital transformation in government has brought an increase in the scale, variety, and complexity of records and greater levels of disorganised data. Current practices for selecting records for transfer to The National Archives (TNA) were developed to deal with paper records and are struggling to deal with this shift. This article examines the background to the problem and outlines a project that TNA undertook to research the feasibility of using commercially available artificial intelligence tools to aid selection. The project AI for Selection evaluated a range of commercial solutions varying from off-the-shelf products to cloud-hosted machine learning platforms, as well as a benchmarking tool developed in-house. Suitability of tools depended on several factors, including requirements and skills of transferring bodies as well as the tools’ usability and configurability. This article also explores questions around trust and explainability of decisions made when using AI for sensitive tasks such as selection.

Automated Text Classification of Maintenance Data of Higher Education Buildings Using Text Mining and Machine Learning Techniques

Data-Driven Analysis and Machine Learning for Energy Prediction in Distributed Photovoltaic Generation Plants: A Case Study in Queensland, Australia

Modeling Nutrient Removal by Membrane Bioreactor at a Sewage Treatment Plant Using Machine Learning Models

Big Five Personality Prediction Based in Indonesian Tweets Using Machine Learning Methods

The popularity of social media has drawn the attention of researchers who have conducted cross-disciplinary studies examining the relationship between personality traits and behavior on social media. Most current work focuses on personality prediction analysis of English texts, but Indonesian has received scant attention. Therefore, this research aims to predict users’ personalities based on Indonesian text from social media using machine learning techniques. This paper evaluates several machine learning techniques, including naive Bayes (NB), K-nearest neighbors (KNN), and support vector machine (SVM), based on semantic features including emotion, sentiment, and publicly available Twitter profiles. We predict personality based on the Big Five personality model, the most appropriate model for predicting user personality in social media. We examine the relationships between the semantic features and the Big Five personality dimensions. The experimental results indicate that the Big Five personality types exhibit distinct emotional, sentimental, and social characteristics and that SVM outperformed NB and KNN for Indonesian. In addition, we observe several terms in Indonesian that specifically refer to each personality type, each of which has distinct emotional, sentimental, and social features.

Compressive Strength of Concrete with Recycled Aggregate: A Machine Learning-Based Evaluation

Temperature Prediction of Flat Steel Box Girders of Long-Span Bridges Utilizing In Situ Environmental Parameters and Machine Learning

Computer-Assisted Cohort Identification in Practice

The standard approach to expert-in-the-loop machine learning is active learning, where, repeatedly, an expert is asked to annotate one or more records and the machine finds a classifier that respects all annotations made until that point. We propose an alternative approach, IQRef, in which the expert iteratively designs a classifier and the machine helps him or her to determine how well it is performing and, importantly, when to stop, by reporting statistics on a fixed, hold-out sample of annotated records. We justify our approach based on prior work giving a theoretical model of how to re-use hold-out data. We compare the two approaches in the context of identifying a cohort of EHRs and examine their strengths and weaknesses through a case study arising from an optometric research problem. We conclude that both approaches are complementary, and we recommend that they both be employed in conjunction to address the problem of cohort identification in health research.


Machine Learning: Algorithms, Real-World Applications and Research Directions


In the current age of the Fourth Industrial Revolution (4IR or Industry 4.0), the digital world has a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data, business data, social media data, health data, and so on. To intelligently analyze these data and develop the corresponding smart and automated applications, knowledge of artificial intelligence (AI), particularly machine learning (ML), is the key. Various types of machine learning algorithms exist in the area: supervised, unsupervised, semi-supervised, and reinforcement learning. In addition, deep learning, which is part of a broader family of machine learning methods, can intelligently analyze data on a large scale. In this paper, we present a comprehensive view of these machine learning algorithms as they can be applied to enhance the intelligence and capabilities of an application. Thus, this study's key contribution is explaining the principles of different machine learning techniques and their applicability in various real-world application domains, such as cybersecurity systems, smart cities, healthcare, e-commerce, agriculture, and many more. We also highlight the challenges and potential research directions based on our study. Overall, this paper aims to serve as a reference point for both academia and industry professionals as well as for decision-makers in various real-world situations and application areas, particularly from the technical point of view.

Keywords: Artificial intelligence; Data science; Data-driven decision-making; Deep learning; Intelligent applications; Machine learning; Predictive analytics.

© The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2021.

Conflict of interest statement

The author declares no conflict of interest.

Figures in the paper illustrate: the worldwide popularity score of various types of ML algorithms (supervised, unsupervised, semi-supervised, and reinforcement); the various types of machine learning techniques; a general structure of a machine learning based predictive model; an example of a decision tree structure; an example of a random forest structure considering multiple decision trees; classification vs. regression, where in classification a dotted line represents a linear decision boundary; a graphical interpretation of the widely-used hierarchical clustering (bottom-up and top-down) technique; an example of principal component analysis (PCA) and the created principal components; machine learning and deep learning performance in general with the amount of data; a structure of an artificial neural network model with multiple processing layers; and an example of a convolutional neural network (CNN or ConvNet) including multiple convolution layers.


Trending Research

The Forward-Forward Algorithm: Some Preliminary Investigations


The aim of this paper is to introduce a new learning procedure for neural networks and to demonstrate that it works well enough on a few small problems to be worth further investigation.

Discovering faster matrix multiplication algorithms with reinforcement learning

Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time, to our knowledge, since its discovery 50 years ago.

LLaMA: Open and Efficient Foundation Language Models

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.


Composer: Creative and Controllable Image Synthesis with Composable Conditions

damo-vilab/composer • 20 Feb 2023

Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.


SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks

Spiking neural networks (SNNs) have emerged as an energy-efficient approach to deep learning that leverage sparse and event-driven activations to reduce the computational overhead associated with model inference.


More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models

greshake/lm-safety • 23 Feb 2023

In such attacks, an adversary can prompt the LLM to produce malicious content or override the original instructions and the employed filtering schemes.

ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Finally, ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance and achieve unprecedented zero-shot generalization performance to eight unseen datasets from both indoor and outdoor domains.


Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities


Adding Conditional Control to Text-to-Image Diffusion Models

Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on a personal device.


VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

nvlabs/voxformer • 23 Feb 2023

To enable such capability in AI systems, we propose VoxFormer, a Transformer-based semantic scene completion framework that can output complete 3D volumetric semantics from only 2D images.


MIT News | Massachusetts Institute of Technology

Using machine learning to predict high-impact research

Illustration: three sets of blue nodes surrounding a single orange node; in the first set, the orange node is labeled “Article of interest.”

An artificial intelligence framework built by MIT researchers can give an “early-alert” signal for future high-impact technologies, by learning from patterns gleaned from previous scientific publications.

In a retrospective test of its capabilities, DELPHI, short for Dynamic Early-warning by Learning to Predict High Impact, was able to identify all pioneering papers on an expert-curated list of key foundational biotechnologies, sometimes as early as the first year after their publication.

James W. Weis, a research affiliate of the MIT Media Lab, and Joseph Jacobson, a professor of media arts and sciences and head of the Media Lab’s Molecular Machines research group, also used DELPHI to highlight 50 recent scientific papers that they predict will be high impact by 2023. Topics covered by the papers include DNA nanorobots used for cancer treatment, high-energy density lithium-oxygen batteries, and chemical synthesis using deep neural networks, among others.

The researchers see DELPHI as a tool that can help humans better leverage funding for scientific research, identifying “diamond in the rough” technologies that might otherwise languish and offering a way for governments, philanthropies, and venture capital firms to more efficiently and productively support science.

“In essence, our algorithm functions by learning patterns from the history of science, and then pattern-matching on new publications to find early signals of high impact,” says Weis. “By tracking the early spread of ideas, we can predict how likely they are to go viral or spread to the broader academic community in a meaningful way.”

The paper has been published in Nature Biotechnology.

Searching for the “diamond in the rough”

The machine learning algorithm developed by Weis and Jacobson takes advantage of the vast amount of digital information that is now available with the exponential growth in scientific publication since the 1980s. But instead of using one-dimensional measures, such as the number of citations, to judge a publication’s impact, DELPHI was trained on a full time-series network of journal article metadata to reveal higher-dimensional patterns in their spread across the scientific ecosystem.

The result is a knowledge graph that contains the connections between nodes representing papers, authors, institutions, and other types of data. The strength and type of the complex connections between these nodes determine their properties, which are used in the framework. “These nodes and edges define a time-based graph that DELPHI uses to learn patterns that are predictive of high future impact,” explains Weis.

Together, these network features are used to predict scientific impact, with papers that fall in the top 5 percent of time-scaled node centrality five years after publication considered the “highly impactful” target set that DELPHI aims to identify. These top 5 percent of papers constitute 35 percent of the total impact in the graph. DELPHI can also use cutoffs of the top 1, 10, and 15 percent of time-scaled node centrality, the authors say.
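The target definition above can be mimicked with a toy score. This sketch is my illustration, using plain citation count divided by paper age as a stand-in for DELPHI's much richer time-series network features; it labels the top 5 percent by that score as the "highly impactful" set.

```python
def time_scaled_centrality(citations, pub_year, now):
    """Toy time-scaled centrality: citations per year since publication."""
    counts = {p: 0 for p in pub_year}
    for citing, cited in citations:
        counts[cited] += 1
    return {p: counts[p] / (now - pub_year[p]) for p in pub_year}

# 20 papers in four yearly cohorts; each paper is cited by every later paper.
pub_year = {f"p{i}": 2016 + i // 5 for i in range(20)}
citations = [(f"p{j}", f"p{i}") for i in range(20) for j in range(i + 1, 20)]

scores = time_scaled_centrality(citations, pub_year, now=2021)
k = max(1, int(0.05 * len(scores)))  # size of the top-5% target set
top = sorted(scores, key=scores.get, reverse=True)[:k]
print(top)  # -> ['p0']
# Time scaling matters: p5 (14 citations, 4 years old) outranks
# p4 (15 citations, 5 years old).
```

Raw citation counts would rank p4 above p5; dividing by age rewards papers whose influence is spreading quickly, which is the kind of early signal DELPHI is built to detect.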

DELPHI suggests that highly impactful papers spread almost virally outside their disciplines and smaller scientific communities. Two papers can have the same number of citations, but highly impactful papers reach a broader and deeper audience. Low-impact papers, on the other hand, “aren’t really being utilized and leveraged by an expanding group of people,” says Weis.

The framework might be useful in “incentivizing teams of people to work together, even if they don’t already know each other — perhaps by directing funding toward them to come together to work on important multidisciplinary problems,” he adds.

Compared to citation number alone, DELPHI identifies over twice the number of highly impactful papers, including 60 percent of “hidden gems,” or papers that would be missed by a citation threshold.

"Advancing fundamental research is about taking lots of shots on goal and then being able to quickly double down on the best of those ideas,” says Jacobson. “This study was about seeing whether we could do that process in a more scaled way, by using the scientific community as a whole, as embedded in the academic graph, as well as being more inclusive in identifying high-impact research directions."

The researchers were surprised at how early in some cases the “alert signal” of a highly impactful paper shows up using DELPHI. “Within one year of publication we are already identifying hidden gems that will have significant impact later on,” says Weis.

He cautions, however, that DELPHI isn’t exactly predicting the future. “We’re using machine learning to extract and quantify signals that are hidden in the dimensionality and dynamics of the data that already exist.”

Fair, efficient, and effective funding

The hope, the researchers say, is that DELPHI will offer a less-biased way to evaluate a paper’s impact, as other measures such as citations and journal impact factor number can be manipulated, as past studies have shown.

“We hope we can use this to find the most deserving research and researchers, regardless of what institutions they’re affiliated with or how connected they are,” Weis says.

As with all machine learning frameworks, however, designers and users should be alert to bias, he adds. “We need to constantly be aware of potential biases in our data and models. We want DELPHI to help find the best research in a less-biased way — so we need to be careful our models are not learning to predict future impact solely on the basis of sub-optimal metrics like h-index, author citation count, or institutional affiliation.”

DELPHI could be a powerful tool to help scientific funding become more efficient and effective, and perhaps be used to create new classes of financial products related to science investment.

“The emerging metascience of science funding is pointing toward the need for a portfolio approach to scientific investment,” notes David Lang, executive director of the Experiment Foundation. “Weis and Jacobson have made a significant contribution to that understanding and, more importantly, its implementation with DELPHI.”

It’s something Weis has thought about a lot after his own experiences in launching venture capital funds and laboratory incubation facilities for biotechnology startups.

“I became increasingly cognizant that investors, including myself, were consistently looking for new companies in the same spots and with the same preconceptions,” he says. “There’s a giant wealth of highly-talented people and amazing technology that I started to glimpse, but that is often overlooked. I thought there must be a way to work in this space — and that machine learning could help us find and more effectively realize all this unmined potential.”



JMLR Papers

Select a volume number to see its table of contents with links to the papers.

Volume 23 (January 2022 - Present)

Volume 22 (January 2021 - December 2021)

Volume 21 (January 2020 - December 2020)

Volume 20 (January 2019 - December 2019)

Volume 19 (August 2018 - December 2018)

Volume 18 (February 2017 - August 2018)

Volume 17 (January 2016 - January 2017)

Volume 16 (January 2015 - December 2015)

Volume 15 (January 2014 - December 2014)

Volume 14 (January 2013 - December 2013)

Volume 13 (January 2012 - December 2012)

Volume 12 (January 2011 - December 2011)

Volume 11 (January 2010 - December 2010)

Volume 10 (January 2009 - December 2009)

Volume 9 (January 2008 - December 2008)

Volume 8 (January 2007 - December 2007)

Volume 7 (January 2006 - December 2006)

Volume 6 (January 2005 - December 2005)

Volume 5 (December 2003 - December 2004)

Volume 4 (April 2003 - December 2003)

Volume 3 (July 2002 - March 2003)

Volume 2 (October 2001 - March 2002)

Volume 1 (October 2000 - September 2001)

Special Topics

Bayesian Optimization

Learning from Electronic Health Data (December 2016)

Gesture Recognition (May 2012 - present)

Large Scale Learning (July 2009 - present)

Mining and Learning with Graphs and Relations (February 2009 - present)

Grammar Induction, Representation of Language and Language Learning (November 2010 - April 2011)

Causality (September 2007 - May 2010)

Model Selection (April 2007 - July 2010)

Conference on Learning Theory 2005 (February 2007 - July 2007)

Machine Learning for Computer Security (December 2006)

Machine Learning and Large Scale Optimization (July 2006 - October 2006)

Approaches and Applications of Inductive Programming (February 2006 - March 2006)

Learning Theory (June 2004 - August 2004)

Special Issues

In Memory of Alexey Chervonenkis (September 2015)

Independent Components Analysis (December 2003)

Learning Theory (October 2003)

Inductive Logic Programming (August 2003)

Fusion of Domain Knowledge with Data for Decision Support (July 2003)

Variable and Feature Selection (March 2003)

Machine Learning Methods for Text and Images (February 2003)

Eighteenth International Conference on Machine Learning (ICML2001) (December 2002)

Computational Learning Theory (November 2002)

Shallow Parsing (March 2002)

Kernel Methods (December 2001)


Top Machine Learning Research Papers Released In 2021


Advances in machine learning and deep learning research are reshaping our technology. Machine learning and deep learning accomplished various astounding feats in 2021, and key research papers resulted in technical advances used by billions of people. Research in this field is advancing at a breakneck pace; to help you keep up, here is a collection of the most important recent research papers.

Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training

The authors of this work examined why ACGAN training becomes unstable as the number of classes in the dataset grows. They showed that the instability arises from a gradient explosion problem caused by the unboundedness of the input feature vectors and the classifier’s poor classification ability during the early stage of training. To alleviate the instability and reinforce ACGAN, the researchers presented the Data-to-Data Cross-Entropy loss (D2D-CE) and the Rebooted Auxiliary Classifier Generative Adversarial Network (ReACGAN). Extensive experiments demonstrate that ReACGAN is robust to hyperparameter selection and compatible with a variety of architectures and differentiable augmentations.

This paper is ranked #1 for Conditional Image Generation on CIFAR-10.
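The flavour of a data-to-data cross-entropy loss can be sketched in a few lines. The temperature, toy embeddings, and random proxies below are illustrative assumptions, not the paper's exact formulation (which additionally applies margin clamping to the similarities):

```python
import numpy as np

def d2d_ce_sketch(feats, labels, proxies, tau=0.1):
    """Sketch of a D2D-CE-style loss: each sample's similarity to its
    class proxy is contrasted against its similarities to same-batch
    samples from *other* classes (the data-to-data negatives)."""
    # L2-normalise embeddings and class proxies so similarities are cosines.
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    proxies = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)

    pos = np.sum(feats * proxies[labels], axis=1) / tau   # (N,) proxy similarities
    sim = feats @ feats.T / tau                           # (N, N) data-to-data
    neg_mask = labels[:, None] != labels[None, :]         # other-class pairs only

    losses = []
    for i in range(len(feats)):
        neg = sim[i, neg_mask[i]]
        denom = np.exp(pos[i]) + np.exp(neg).sum()
        losses.append(-np.log(np.exp(pos[i]) / denom))
    return float(np.mean(losses))

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))      # toy embeddings from a classifier head
proxies = rng.normal(size=(4, 16))    # one learnable proxy per class
labels = np.array([0, 1, 2, 3, 0, 1, 2, 3])
loss = d2d_ce_sketch(feats, labels, proxies)
```

Minimising a loss of this shape pulls each embedding toward its class proxy while pushing it away from other-class samples in the same batch.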


Dense Unsupervised Learning for Video Segmentation

The authors presented a straightforward and computationally efficient unsupervised approach for learning dense spacetime representations from unlabeled videos. The approach converges quickly during training and is highly data-efficient: the researchers obtain video object segmentation (VOS) accuracy superior to previous results despite using a fraction of the previously required training data. The researchers acknowledge that the findings could be misused, for example in unlawful surveillance. They are also keen to investigate how the method might learn a broader spectrum of invariances by exploiting larger temporal windows in videos with complex (ego-)motion, which are more prone to disocclusions.

This study is ranked #1 on DAVIS 2017 for Unsupervised Video Object Segmentation (val).
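Dense representations like these are typically evaluated on VOS by propagating labels from an annotated frame to later frames via feature affinity. A minimal sketch of that protocol, with random toy features standing in for the learned ones:

```python
import numpy as np

def propagate_labels(feat_ref, labels_ref, feat_tgt, tau=0.07):
    """Propagate per-pixel labels from a reference frame to a target frame
    via softmax feature affinity: each target pixel takes a weighted
    average of reference labels, weighted by feature similarity."""
    fr = feat_ref / np.linalg.norm(feat_ref, axis=1, keepdims=True)
    ft = feat_tgt / np.linalg.norm(feat_tgt, axis=1, keepdims=True)
    aff = np.exp(ft @ fr.T / tau)
    aff = aff / aff.sum(axis=1, keepdims=True)   # (N_tgt, N_ref) row-stochastic
    return aff @ labels_ref                       # soft labels for target pixels

rng = np.random.default_rng(0)
feat_ref = rng.normal(size=(32, 8))                     # 32 "pixels", 8-dim features
feat_tgt = feat_ref + 0.01 * rng.normal(size=(32, 8))   # a nearly identical frame
labels_ref = np.eye(2)[rng.integers(0, 2, size=32)]     # one-hot fg/bg labels
soft = propagate_labels(feat_ref, labels_ref, feat_tgt)
pred = soft.argmax(axis=1)
```

Because no labels are used during representation learning, the quality of this propagation depends entirely on how well the unsupervised features separate objects.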

Temporally-Consistent Surface Reconstruction using Metrically-Consistent Atlases

The authors offer an atlas-based technique for producing unsupervised, temporally consistent surface reconstructions by requiring that each point on the canonical shape representation map to metrically consistent 3D locations on the reconstructed surfaces. The researchers envisage many potential applications for the method. For example, by substituting an image-based loss for the Chamfer distance, the method could be applied to RGB video sequences, which they believe would spur progress in video-based 3D reconstruction.

This paper is ranked #1 for Surface Reconstruction on ANIM.
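The Chamfer distance mentioned above measures how closely two point sets agree. A minimal NumPy version, with toy point clouds for illustration:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3):
    the average squared distance from each point to its nearest neighbour
    in the other set, summed over both directions."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M) pairwise
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
dist = chamfer_distance(a, b)  # identical sets -> 0.0
```

Note that the Chamfer distance only compares geometry per frame, which is exactly why it provides no temporal correspondence on its own.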

EdgeFlow: Achieving Practical Interactive Segmentation with Edge-Guided Flow

The researchers propose a novel interactive architecture called EdgeFlow that fully utilises user interaction information without resorting to post-processing or iterative optimisation. Thanks to its coarse-to-fine network design, the proposed method achieves state-of-the-art performance on common benchmarks. The researchers also built an efficient interactive segmentation tool that lets the user incrementally refine the segmentation result through flexible options.

This paper is ranked #1 for Interactive Segmentation on PASCAL VOC.
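A common way to feed user clicks into an interactive segmentation network (not necessarily EdgeFlow's exact encoding) is to rasterise them into a distance map that is concatenated with the image channels:

```python
import numpy as np

def clicks_to_distance_map(clicks, shape, truncate=20.0):
    """Encode user clicks as an extra input channel: each pixel holds the
    truncated Euclidean distance to the nearest click, normalised to [0, 1]."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.full(shape, truncate, dtype=np.float64)
    for (cy, cx) in clicks:
        d = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
        dist = np.minimum(dist, d)  # keep the distance to the closest click
    return dist / truncate

dmap = clicks_to_distance_map([(4, 4)], (8, 8))
```

Positive and negative clicks are usually encoded as two separate channels of this form, so the network can distinguish "include this region" from "exclude this region".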

Learning Transferable Visual Models From Natural Language Supervision

The authors of this work examined whether the success of task-agnostic, web-scale pre-training in natural language processing can be transferred to another domain. The findings indicate that adopting this formula yields similar behaviours in the field of computer vision, and the authors examine the social ramifications of this line of research. CLIP models learn to accomplish a range of tasks during pre-training in order to optimise their training objective; using natural-language prompting, CLIP can then exploit this task learning for zero-shot transfer to many existing datasets. At sufficient scale, this technique can compete with task-specific supervised models, although there is still much room for improvement.

This paper is ranked #1 for Zero-Shot Transfer Image Classification on SUN.
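At inference time, the zero-shot mechanism reduces to cosine similarity between one image embedding and one text embedding per class prompt. In the sketch below, the random linear maps are hypothetical stand-ins for CLIP's real image and text encoders:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for CLIP's encoders: any functions mapping their
# input into the same shared 32-dimensional embedding space would do.
W_img = rng.normal(size=(64, 32))
W_txt = rng.normal(size=(16, 32))
encode_image = lambda x: x @ W_img
encode_text = lambda t: t @ W_txt

def zero_shot_classify(image, class_prompts):
    """Classify by cosine similarity between the image embedding and one
    text embedding per class prompt (e.g. 'a photo of a {label}')."""
    img = encode_image(image)
    img = img / np.linalg.norm(img)
    txt = encode_text(class_prompts)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = txt @ img                           # one cosine score per class
    probs = np.exp(logits) / np.exp(logits).sum()
    return int(np.argmax(probs)), probs

image = rng.normal(size=64)          # stand-in for pre-processed image pixels
prompts = rng.normal(size=(3, 16))   # stand-ins for 3 tokenised class prompts
pred, probs = zero_shot_classify(image, prompts)
```

The key point is that the classifier head is built from text alone, so new classes can be added by writing new prompts rather than collecting labelled images.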

CoAtNet: Marrying Convolution and Attention for All Data Sizes

The researchers in this article conduct a thorough examination of the properties of convolutions and transformers, resulting in a principled approach for combining them into a new family of models dubbed CoAtNet. Extensive experiments demonstrate that CoAtNet combines the advantages of ConvNets and Transformers, achieving state-of-the-art performance across a range of data sizes and compute budgets. Note that this paper concentrates on ImageNet classification for model development; the researchers believe their approach is applicable to a broader range of tasks, such as object detection and semantic segmentation.

This paper is ranked #1 on Image Classification on ImageNet (using extra training data).
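The core idea of stacking convolution stages (local inductive bias) before attention stages (global context) can be caricatured as follows. The 1-D token mixing, toy shapes, and shared weights are simplifying assumptions, not CoAtNet's actual MBConv and relative-attention blocks:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(x, W_q, W_k, W_v):
    """Plain scaled dot-product self-attention over a (tokens, dim) array."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    return softmax(q @ k.T / np.sqrt(k.shape[-1])) @ v

def conv_block(x, w):
    """Depthwise-style mixing of neighbouring tokens: a 1-D, 3-tap stand-in
    for a convolutional stage with edge padding."""
    pad = np.pad(x, ((1, 1), (0, 0)), mode="edge")
    return w[0] * pad[:-2] + w[1] * pad[1:-1] + w[2] * pad[2:]

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))                     # 16 tokens, 8 channels
w = rng.normal(size=3)
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))

h = conv_block(conv_block(x, w), w)                              # "C-C" stages
out = self_attention(self_attention(h, Wq, Wk, Wv), Wq, Wk, Wv)  # "T-T" stages
```

The paper's finding, roughly, is that this ordering (convolution early, attention late) is what lets the hybrid generalise well at small data sizes while still scaling at large ones.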

SwinIR: Image Restoration Using Swin Transformer

The authors of this article propose SwinIR, an image restoration model based on the Swin Transformer . The model comprises three modules: shallow feature extraction, deep feature extraction, and high-quality image reconstruction. For deep feature extraction, the researchers employ a stack of residual Swin Transformer blocks (RSTB), each composed of Swin Transformer layers, a convolution layer, and a residual connection.

This research article is ranked #1 on Image Super-Resolution on Manga109 – 4x upscaling.
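The residual structure of an RSTB (a stack of Swin layers, then a convolution, wrapped in a skip connection) can be sketched as below; the tanh "layers" and scaling "convolution" are hypothetical stand-ins for the real windowed-attention layers and conv layer:

```python
import numpy as np

def rstb(x, swin_layers, conv):
    """Residual Swin Transformer block: run the input through a stack of
    (Swin) layers, apply a convolution, and add the result back to the
    input via a residual connection."""
    y = x
    for layer in swin_layers:
        y = layer(y)
    return x + conv(y)

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))                    # 16 tokens, 8 channels
layers = [lambda t: np.tanh(t) for _ in range(2)]  # stand-ins for Swin layers
conv = lambda t: t * 0.1                        # stand-in for the conv layer
out = rstb(x, layers, conv)
```

The residual connections are what let SwinIR stack many such blocks for deep feature extraction while keeping training stable and preserving shallow features.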


Dr. Nivash Jeevanandam



© Analytics India Magazine Pvt Ltd & AIM Media House LLC 2023


