Best Machine Learning Research of 2021 So Far
Machine Learning Modeling Research | posted by Daniel Gutierrez, ODSC | July 19, 2021
The start of 2021 saw many prominent research groups extending the state of machine learning science to consistently greater heights. In my efforts to keep pace with this accelerated progress, I’ve noticed a number of hot topics gaining the attention of researchers: explainable/interpretable ML, federated learning, gradient boosting, causal inference, ROC analysis, and many others. In this article, we’ll take a journey through my top picks of compelling and worthwhile papers from the first half of 2021. I found the directions represented in these papers to be very promising, and I hope you enjoy my selections as much as I have. (Check my lists from 2019 and 2020.)
Scaling Hierarchical Agglomerative Clustering to Billion-sized Datasets
Hierarchical Agglomerative Clustering (HAC) is one of the oldest but still most widely used clustering methods. However, HAC is notoriously hard to scale to large data sets: the underlying complexity is at least quadratic in the number of data points, and many algorithms for HAC are inherently sequential. This paper proposes Reciprocal Agglomerative Clustering (RAC), a distributed algorithm for HAC that uses a novel strategy to efficiently merge clusters in parallel. The paper proves theoretically that RAC recovers the exact solution of HAC and, under clusterability and balancedness assumptions, shows provable speedups in total runtime due to the parallelism. These speedups are also shown to be achievable for certain probabilistic data models. In extensive experiments, the authors demonstrate that this parallelism is achieved on real-world data sets and that RAC can recover the HAC hierarchy on billions of data points connected by trillions of edges in less than an hour.
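The reciprocal nearest-neighbor idea behind RAC can be illustrated in a few lines: every cluster finds its nearest neighbor, and pairs that pick each other are merged simultaneously. The sketch below is a hypothetical single-machine toy (centroid linkage on tiny 2-D data), not the paper's distributed implementation:

```python
import numpy as np

def reciprocal_nn_merge_round(points, clusters):
    """One round of reciprocal-nearest-neighbour merging: clusters that
    pick each other as nearest neighbour are merged simultaneously."""
    centroids = np.array([points[c].mean(axis=0) for c in clusters])
    d = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)                       # each cluster's nearest neighbour
    merged, used = [], set()
    for i, j in enumerate(nn):
        if i in used or j in used:
            continue
        if nn[j] == i and i < j:                # reciprocal pair -> merge
            merged.append(clusters[i] + clusters[j])
            used.update((i, j))
    merged += [c for k, c in enumerate(clusters) if k not in used]
    return merged

rng = np.random.default_rng(0)
# Two tight, well-separated blobs of five points each
pts = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(3, 0.1, (5, 2))])
clusters = [[i] for i in range(len(pts))]
while len(clusters) > 2:
    clusters = reciprocal_nn_merge_round(pts, clusters)
print(len(clusters))  # 2
```

Because the globally closest pair of clusters is always reciprocal, at least one merge happens per round, and many non-conflicting merges can happen in parallel — the source of RAC's speedup.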
Deep ROC Analysis and AUC as Balanced Average Accuracy to Improve Model Selection, Understanding and Interpretation
Optimal performance is critical for decision-making tasks from medicine to autonomous driving, yet common performance measures may be too general or too specific. For binary classifiers, diagnostic tests, or prognosis at a time point, measures such as the area under the receiver operating characteristic (ROC) curve or the area under the precision-recall curve are too general because they include unrealistic decision thresholds. On the other hand, measures such as accuracy, sensitivity, or the F1 score are computed at a single threshold, reflecting a single probability or predicted risk rather than a range. This paper proposes a method in between, deep ROC analysis, which examines groups of probabilities or predicted risks for more insightful analysis. The research translates esoteric measures into familiar terms: the AUC and the normalized concordant partial AUC are balanced average accuracy (a new finding), the normalized partial AUC is average sensitivity, and the normalized horizontal partial AUC is average specificity.
Hardware Acceleration of Explainable Machine Learning using Tensor Processing Units
Machine learning (ML) has achieved human-level performance in various fields. However, it lacks the ability to explain an outcome due to its black-box nature. While existing explainable ML is promising, almost all of these methods focus on formulating interpretability as an optimization problem. Such a mapping leads to numerous iterations of time-consuming complex computations, which limits their applicability in real-time applications. This paper proposes a novel framework for accelerating explainable ML using Tensor Processing Units (TPUs). The proposed framework exploits the synergy between matrix convolution and the Fourier transform, and takes full advantage of TPUs' natural ability to accelerate matrix computations. Specifically, the paper makes three important contributions: (i) the proposed work is the first attempt to enable hardware acceleration of explainable ML using TPUs; (ii) the proposed approach is applicable across a wide variety of ML algorithms, and effective utilization of TPU-based acceleration can lead to real-time outcome interpretation; and (iii) extensive experimental results demonstrate that the proposed approach can provide an order-of-magnitude speedup in both classification time (25x on average) and interpretation time (13x on average) compared to state-of-the-art techniques.
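The synergy between convolution and the Fourier transform that the authors exploit rests on the convolution theorem: circular convolution becomes element-wise multiplication in the frequency domain, which maps well onto hardware built for dense matrix math. A minimal numpy check of the identity (not the paper's TPU code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=64)
k = rng.normal(size=64)

# Direct circular convolution, O(n^2)
direct = np.array([sum(x[i] * k[(n - i) % 64] for i in range(64))
                   for n in range(64)])

# Convolution theorem: conv(x, k) = IFFT(FFT(x) * FFT(k)), O(n log n)
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)).real

print(np.allclose(direct, via_fft))  # True
```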
Learning to Optimize: A Primer and A Benchmark
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods, aiming to reduce the laborious iterations of hand engineering. It automates the design of an optimization method based on its performance on a set of training problems. This data-driven procedure generates methods that can efficiently solve problems similar to those in the training set. In sharp contrast, the typical and traditional designs of optimization methods are theory-driven, so they obtain performance guarantees over the classes of problems specified by the theory. The difference makes L2O suitable for repeatedly solving a certain type of optimization problem over a specific distribution of data, while it typically fails on out-of-distribution problems. The practicality of L2O depends on the type of target optimization, the chosen architecture of the method to learn, and the training procedure. This new paradigm has motivated a community of researchers to explore L2O and report their findings. This paper is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization. A GitHub repository accompanies the paper.
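The data-driven flavor of L2O, and its out-of-distribution fragility, can be caricatured in a few lines: fit a parameter of the optimizer (here just a step size, found by grid search) on a distribution of training problems, then watch it work in-distribution and fail out-of-distribution. This is a deliberately simplified sketch, not any particular method from the survey:

```python
import numpy as np

rng = np.random.default_rng(0)

def run_gd(a, x0, lr, steps=20):
    """Gradient descent on f(x) = 0.5 * a * x^2 (gradient a*x)."""
    x = x0
    for _ in range(steps):
        x = x - lr * a * x
    return abs(x)  # distance from the optimum x* = 0

# "Training problems": quadratics with curvature drawn from a distribution
train_a = rng.uniform(0.5, 2.0, 50)

# "Learn" the optimizer (a single step-size parameter) from the data
candidate_lrs = np.linspace(0.01, 1.0, 100)
losses = [np.mean([run_gd(a, 1.0, lr) for a in train_a]) for lr in candidate_lrs]
learned_lr = float(candidate_lrs[int(np.argmin(losses))])

print(run_gd(1.0, 1.0, learned_lr) < 1e-3)   # in-distribution: converges
print(run_gd(50.0, 1.0, learned_lr) > 1.0)   # out-of-distribution: diverges
```

Real L2O methods replace the single step size with a learned update rule (often a recurrent network), but the in- vs out-of-distribution behavior is the same in spirit.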
Federated Quantum Machine Learning
Distributed training across several quantum computers could significantly improve training time, and sharing the learned model rather than the data could also improve privacy, since training happens where the data are located. To the authors' knowledge, no prior work has addressed quantum machine learning (QML) in a federated setting. This paper presents federated training of hybrid quantum-classical machine learning models, although the framework could be generalized to purely quantum models. Specifically, the authors consider a quantum neural network (QNN) coupled with a classical pre-trained convolutional model. The distributed federated learning scheme achieved almost the same trained-model accuracy with significantly faster distributed training, demonstrating a promising research direction for both scaling and privacy.
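The classical federated-averaging skeleton that such hybrid schemes build on is easy to sketch: clients train locally on private data, and only the model parameters are averaged by the server. The example below is plain (non-quantum) FedAvg on a toy linear model, an assumption for illustration rather than the paper's QNN setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Each client holds private data for y = 2x + noise; only models are shared
client_data = []
for _ in range(3):
    x = rng.normal(size=50)
    y = 2.0 * x + 0.1 * rng.normal(size=50)
    client_data.append((x, y))

def local_update(w, x, y, lr=0.1, steps=10):
    """A few local gradient steps on squared error; data never leaves the client."""
    for _ in range(steps):
        grad = np.mean(2.0 * (w * x - y) * x)
        w = w - lr * grad
    return w

w_global = 0.0
for _ in range(5):                                # federated rounds
    local_models = [local_update(w_global, x, y) for x, y in client_data]
    w_global = float(np.mean(local_models))       # FedAvg: average models, not data

print(round(w_global, 2))                         # close to the true slope 2
```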
Towards Causal Representation Learning
The two fields of machine learning and graphical causality arose and developed separately. However, there is now cross-pollination and increasing interest in each field to benefit from the advances of the other. This paper reviews fundamental concepts of causal inference and relates them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning: the discovery of high-level causal variables from low-level observations. Finally, the paper delineates some implications of causality for machine learning and proposes key research areas at the intersection of both communities.
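A tiny simulation makes the core distinction between statistical and causal dependence concrete: observational correlation induced by a confounder disappears under an intervention (Pearl's do-operator) that severs the confounder's influence on the treatment. The structural equations below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Confounded system: Z causes both X and Y; X does NOT cause Y
Z = rng.normal(size=n)
X = Z + 0.5 * rng.normal(size=n)
Y = Z + 0.5 * rng.normal(size=n)
obs_corr = np.corrcoef(X, Y)[0, 1]           # strong association, yet not causal

# Intervention do(X := x): set X externally, severing the Z -> X edge
X_do = rng.normal(size=n)
Y_do = Z + 0.5 * rng.normal(size=n)
int_corr = np.corrcoef(X_do, Y_do)[0, 1]     # near zero: no causal effect of X on Y

print(round(float(obs_corr), 2), round(float(int_corr), 2))
```

Causal representation learning asks the harder question of how to discover variables like X, Y, and Z in the first place from raw observations such as pixels.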
QuPeL: Quantized Personalization with Applications to Federated Learning
Traditionally, federated learning (FL) aims to train a single global model while collaboratively using multiple clients and a server. Two natural challenges that FL algorithms face are heterogeneity in data across clients and collaboration among clients with diverse resources. This machine learning research paper introduces QuPeL, a quantized and personalized FL algorithm that facilitates collective training with heterogeneous clients while respecting resource diversity. For personalization, clients are allowed to learn compressed personalized models with different quantization parameters depending on their resources. Towards this end, the authors propose an algorithm for learning quantized models through a relaxed optimization problem in which the quantization values themselves are also optimized. When each client participating in the (federated) learning process has different requirements for the quantized model (in both value and precision), the quantized personalization framework adds a penalty term to local client objectives against a globally trained model to encourage collaboration.
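The quantization side of the resource trade-off can be sketched directly: clients with smaller budgets use fewer bits and pay a larger approximation error against the global model. This uniform quantizer is a generic illustration, not QuPeL's optimized quantization values:

```python
import numpy as np

def quantize(w, bits):
    """Uniform quantizer: snap weights to 2**bits evenly spaced levels."""
    levels = 2 ** bits
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (levels - 1)
    return lo + step * np.round((w - lo) / step)

rng = np.random.default_rng(0)
w_global = rng.normal(size=1000)            # a globally trained model's weights

# Clients with different resource budgets choose different precisions
errs = []
for bits in (2, 4, 8):
    w_local = quantize(w_global, bits)      # compressed personalized copy
    errs.append(float(np.mean((w_local - w_global) ** 2)))
print([round(e, 6) for e in errs])          # error shrinks as bits grow
```

In QuPeL the local objective is additionally coupled to the global model through a penalty term (roughly a squared distance between local and global weights) so that personalization does not drift away from the collaborative solution.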
Explainable Artificial Intelligence Approaches: A Survey
The lack of explainability of decisions from Artificial Intelligence (AI) based “black box” systems, despite their superiority in many real-world applications, is a key stumbling block for adopting AI in high-stakes applications across domains and industries. While many popular Explainable Artificial Intelligence (XAI) methods are available to produce human-friendly explanations of decisions, each has its own merits and demerits, along with a plethora of open challenges. This machine learning research paper demonstrates popular XAI methods on a common case study (credit default prediction), analyzes their competitive advantages from multiple perspectives (e.g., local and global explanations), provides meaningful insight on quantifying explainability, and recommends paths towards responsible, human-centered AI using XAI as a medium. Practitioners can use this work as a catalog to understand, compare, and correlate the competitive advantages of popular XAI methods. In addition, the survey elicits future research directions towards responsible, human-centric AI systems, which is crucial for adopting AI in high-stakes applications.
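Among the model-agnostic XAI methods such surveys cover, permutation importance is simple enough to sketch from scratch: shuffle one feature and measure how much the model's accuracy drops. The black-box model and data below are synthetic assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=2000) > 0).astype(int)  # only feature 0 matters

def black_box(X):
    """Stand-in for any opaque model: thresholds a fixed linear score."""
    return (X @ np.array([1.0, 0.0, 0.0]) > 0).astype(int)

base_acc = (black_box(X) == y).mean()

# Permutation importance: shuffle one column and measure the accuracy drop
importances = []
for j in range(3):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importances.append(base_acc - (black_box(Xp) == y).mean())

print([round(float(v), 3) for v in importances])  # feature 0 dominates
```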
tf.data: A Machine Learning Data Processing Framework
Training machine learning models requires feeding input data for models to ingest. Input pipelines for machine learning jobs are often challenging to implement efficiently as they require reading large volumes of data, applying complex transformations, and transferring data to hardware accelerators while overlapping computation and communication to achieve optimal performance. This machine learning research paper presents tf.data, a framework for building and executing efficient input pipelines for machine learning jobs. The tf.data API provides operators which can be parameterized with user-defined computation, composed, and reused across different machine learning domains. These abstractions allow users to focus on the application logic of data processing, while tf.data’s runtime ensures that pipelines run efficiently. The paper demonstrates that input pipeline performance is critical to the end-to-end training time of state-of-the-art machine learning models. tf.data delivers the high performance required, while avoiding the need for manual tuning of performance knobs.
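The operator-composition idea (tf.data's Dataset.map, batch, prefetch, and friends) can be mimicked with plain Python generators; the sketch below illustrates the programming model only, not the tf.data runtime or its performance machinery:

```python
def dataset(source):
    """Source operator: yields raw elements."""
    yield from source

def map_op(ds, fn):
    """Transformation operator parameterized by user-defined computation."""
    for x in ds:
        yield fn(x)

def batch_op(ds, n):
    """Grouping operator: collects elements into batches of size n."""
    batch = []
    for x in ds:
        batch.append(x)
        if len(batch) == n:
            yield batch
            batch = []
    if batch:                      # final partial batch
        yield batch

# Compose operators, tf.data-style: read -> transform -> batch
pipeline = batch_op(map_op(dataset(range(10)), lambda x: x * x), 4)
result = list(pipeline)
print(result)  # [[0, 1, 4, 9], [16, 25, 36, 49], [64, 81]]
```

In tf.data the same composition is declarative, so the runtime can parallelize maps, pipeline I/O, and prefetch onto accelerators behind the scenes.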
Boost then Convolve: Gradient Boosting Meets Graph Neural Networks
Graph neural networks (GNNs) are powerful models that have succeeded in various graph representation learning tasks, whereas gradient boosted decision trees (GBDT) often outperform other machine learning methods on heterogeneous tabular data. So which approach should be used for graphs with tabular node features? Previous GNN models have mostly focused on networks with homogeneous sparse features and are suboptimal in the heterogeneous setting. This paper proposes a novel architecture that trains GBDT and GNN jointly to get the best of both worlds: the GBDT model deals with heterogeneous features, while the GNN accounts for the graph structure. The model benefits from end-to-end optimization by allowing new trees to fit the gradient updates of the GNN. In an extensive experimental comparison with leading GBDT and GNN models, the researchers demonstrate a significant increase in performance on a variety of graphs with tabular features. A GitHub repository accompanies the paper.
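A deliberately tiny stand-in for the "boost then convolve" idea: boosted stumps fit the tabular signal, then one graph-propagation step over the boosted predictions (a crude stand-in for a GNN layer) picks up the neighborhood structure the trees cannot see. Everything here (ring graph, single-feature stump booster) is a toy assumption, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)                       # tabular node feature
A = np.zeros((n, n))                         # ring graph, self-loops included
for i in range(n):
    A[i, [(i - 1) % n, i, (i + 1) % n]] = 1 / 3

y = A @ x                                    # each label depends on neighbours

def boost_stumps(feature, target, rounds=200, lr=0.1):
    """Toy gradient boosting: regression stumps on a single feature."""
    pred = np.zeros_like(target)
    for _ in range(rounds):
        resid = target - pred
        t = np.quantile(feature, rng.random())         # random split point
        left = feature <= t
        update = np.where(left, resid[left].mean(), resid[~left].mean())
        pred = pred + lr * update
    return pred

pred_boost = boost_stumps(x, y)              # "boost": tabular signal only
mse_boost = float(np.mean((y - pred_boost) ** 2))

# "Convolve": one propagation step over the boosted predictions, with
# mixing weights fit by least squares (a crude stand-in for a GNN layer)
H = np.column_stack([pred_boost, A @ pred_boost])
coef, *_ = np.linalg.lstsq(H, y, rcond=None)
mse_full = float(np.mean((y - H @ coef) ** 2))

print(mse_boost > mse_full)                  # graph step reduces the error
```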
Notes from the editor:
How to Learn More about Machine Learning Research
At our upcoming event this November 16th-18th in San Francisco, ODSC West 2021 will feature a plethora of talks, workshops, and training sessions on machine learning and machine learning research. You can register now for 60% off all ticket types before the discount drops to 40% in a few weeks. Some highlighted sessions on machine learning include:
- Towards More Energy-Efficient Neural Networks? Use Your Brain!: Olaf de Leeuw | Data Scientist | Dataworkz
- Practical MLOps: Automation Journey: Evgenii Vinogradov, PhD | Head of DHW Development | YooMoney
- Applications of Modern Survival Modeling with Python: Brian Kent, PhD | Data Scientist | Founder The Crosstab Kite
- Using Change Detection Algorithms for Detecting Anomalous Behavior in Large Systems: Veena Mendiratta, PhD | Adjunct Faculty, Network Reliability and Analytics Researcher | Northwestern University
Sessions on MLOps:
- Tuning Hyperparameters with Reproducible Experiments: Milecia McGregor | Senior Software Engineer | Iterative
- MLOps… From Model to Production: Filipa Peleja, PhD | Lead Data Scientist | Levi Strauss & Co
- Operationalization of Models Developed and Deployed in Heterogeneous Platforms: Sourav Mazumder | Data Scientist, Thought Leader, AI & ML Operationalization Leader | IBM
- Develop and Deploy a Machine Learning Pipeline in 45 Minutes with Ploomber: Eduardo Blancas | Data Scientist | Fidelity Investments
Sessions on Deep Learning:
- GANs: Theory and Practice, Image Synthesis With GANs Using TensorFlow: Ajay Baranwal | Center Director | Center for Deep Learning in Electronic Manufacturing, Inc
- Machine Learning With Graphs: Going Beyond Tabular Data: Dr. Clair J. Sullivan | Data Science Advocate | Neo4j
- Deep Dive into Reinforcement Learning with PPO using TF-Agents & TensorFlow 2.0: Oliver Zeigermann | Software Developer | embarc Software Consulting GmbH
- Get Started with Time-Series Forecasting using the Google Cloud AI Platform: Karl Weinmeister | Developer Relations Engineering Manager | Google
Daniel Gutierrez, ODSC
Daniel D. Gutierrez is a practicing data scientist who’s been working with data since long before the field came into vogue. As a technology journalist, he enjoys keeping a pulse on this fast-paced industry. Daniel is also an educator, having taught data science, machine learning, and R classes at the university level. He has authored four computer industry books on database and data science technology, including his most recent title, “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.” Daniel holds a BS in Mathematics and Computer Science from UCLA.
Machine Learning: Recently Published Documents
An explainable machine learning model for identifying geographical origins of sea cucumber Apostichopus japonicus based on multi-element profile
A Comparison of Machine Learning- and Regression-Based Models for Predicting Ductility Ratio of RC Beam-Column Joints
Alexa, Is This a Historical Record?
Digital transformation in government has brought an increase in the scale, variety, and complexity of records and greater levels of disorganised data. Current practices for selecting records for transfer to The National Archives (TNA) were developed to deal with paper records and are struggling to deal with this shift. This article examines the background to the problem and outlines a project that TNA undertook to research the feasibility of using commercially available artificial intelligence tools to aid selection. The project AI for Selection evaluated a range of commercial solutions varying from off-the-shelf products to cloud-hosted machine learning platforms, as well as a benchmarking tool developed in-house. Suitability of tools depended on several factors, including requirements and skills of transferring bodies as well as the tools’ usability and configurability. This article also explores questions around trust and explainability of decisions made when using AI for sensitive tasks such as selection.
Automated Text Classification of Maintenance Data of Higher Education Buildings Using Text Mining and Machine Learning Techniques
Data-Driven Analysis and Machine Learning for Energy Prediction in Distributed Photovoltaic Generation Plants: A Case Study in Queensland, Australia
Modeling Nutrient Removal by Membrane Bioreactor at a Sewage Treatment Plant Using Machine Learning Models
Big Five Personality Prediction Based in Indonesian Tweets Using Machine Learning Methods
The popularity of social media has drawn the attention of researchers who have conducted cross-disciplinary studies examining the relationship between personality traits and behavior on social media. Most current work focuses on personality prediction analysis of English texts, but Indonesian has received scant attention. Therefore, this research aims to predict users' personalities based on Indonesian text from social media using machine learning techniques. This paper evaluates several machine learning techniques, including naive Bayes (NB), K-nearest neighbors (KNN), and support vector machine (SVM), based on semantic features including emotion, sentiment, and publicly available Twitter profiles. We predict personality based on the Big Five personality model, the most appropriate model for predicting user personality in social media. We examine the relationships between the semantic features and the Big Five personality dimensions. The experimental results indicate that the Big Five personality types exhibit distinct emotional, sentimental, and social characteristics and that SVM outperformed NB and KNN for Indonesian. In addition, we observe several terms in Indonesian that specifically refer to each personality type, each of which has distinct emotional, sentimental, and social features.
Compressive strength of concrete with recycled aggregate; a machine learning-based evaluation
Temperature Prediction of Flat Steel Box Girders of Long-Span Bridges Utilizing In Situ Environmental Parameters and Machine Learning
Computer-Assisted Cohort Identification in Practice
The standard approach to expert-in-the-loop machine learning is active learning, where, repeatedly, an expert is asked to annotate one or more records and the machine finds a classifier that respects all annotations made until that point. We propose an alternative approach, IQRef , in which the expert iteratively designs a classifier and the machine helps him or her to determine how well it is performing and, importantly, when to stop, by reporting statistics on a fixed, hold-out sample of annotated records. We justify our approach based on prior work giving a theoretical model of how to re-use hold-out data. We compare the two approaches in the context of identifying a cohort of EHRs and examine their strengths and weaknesses through a case study arising from an optometric research problem. We conclude that both approaches are complementary, and we recommend that they both be employed in conjunction to address the problem of cohort identification in health research.
Machine Learning: Algorithms, Real-World Applications and Research Directions
- 1 Swinburne University of Technology, Melbourne, VIC 3122 Australia.
- 2 Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, 4349 Chattogram, Bangladesh.
- PMID: 33778771
- PMCID: PMC7983091
- DOI: 10.1007/s42979-021-00592-x
In the current age of the Fourth Industrial Revolution (4IR or Industry 4.0), the digital world has a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data, business data, social media data, health data, etc. To intelligently analyze these data and develop the corresponding smart and automated applications, knowledge of artificial intelligence (AI), particularly machine learning (ML), is the key. Various types of machine learning algorithms exist in the area, such as supervised, unsupervised, semi-supervised, and reinforcement learning. In addition, deep learning, which is part of a broader family of machine learning methods, can intelligently analyze data on a large scale. In this paper, we present a comprehensive view of these machine learning algorithms that can be applied to enhance the intelligence and the capabilities of an application. Thus, this study's key contribution is explaining the principles of different machine learning techniques and their applicability in various real-world application domains, such as cybersecurity systems, smart cities, healthcare, e-commerce, agriculture, and many more. We also highlight the challenges and potential research directions based on our study. Overall, this paper aims to serve as a reference point for both academia and industry professionals as well as for decision-makers in various real-world situations and application areas, particularly from the technical point of view.
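As a capsule illustration of two of the algorithm families the survey covers, the snippet below contrasts supervised learning (labels guide a nearest-centroid classifier) with unsupervised learning (k-means finds the same structure without labels) on synthetic blobs:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(0.0, 0.3, (50, 2))
b = rng.normal(2.0, 0.3, (50, 2))
X = np.vstack([a, b])
y = np.array([0] * 50 + [1] * 50)

# Supervised: labels guide the model (nearest class centroid)
centroids = np.array([X[y == k].mean(axis=0) for k in (0, 1)])
pred_sup = np.argmin(np.linalg.norm(X[:, None] - centroids, axis=2), axis=1)

# Unsupervised: structure found without labels (a few k-means iterations)
centers = X[[0, -1]].copy()                   # seed one centre per blob
for _ in range(3):
    assign = np.argmin(np.linalg.norm(X[:, None] - centers, axis=2), axis=1)
    centers = np.array([X[assign == k].mean(axis=0) for k in (0, 1)])

print((pred_sup == y).mean())                 # supervised accuracy
```

Both recover the two well-separated clusters; they differ in whether the labels were needed to do so.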
Keywords: Artificial intelligence; Data science; Data-driven decision-making; Deep learning; Intelligent applications; Machine learning; Predictive analytics.
© The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2021.
Conflict of interest statement
The author declares no conflict of interest.
Figures:
- The worldwide popularity score of various types of ML algorithms (supervised, unsupervised, semi-supervised, …)
- Various types of machine learning techniques
- A general structure of a machine learning based predictive model considering both the …
- An example of a decision tree structure
- An example of a random forest structure considering multiple decision trees
- Classification vs. regression. In classification the dotted line represents a linear boundary that …
- A graphical interpretation of the widely-used hierarchical clustering (bottom-up and top-down) technique
- An example of a principal component analysis (PCA) and created principal components PC1 …
- Machine learning and deep learning performance in general with the amount of data
- A structure of an artificial neural network modeling with multiple processing layers
- An example of a convolutional neural network (CNN or ConvNet) including multiple convolution …
Trending Research
The Forward-Forward Algorithm: Some Preliminary Investigations
The aim of this paper is to introduce a new learning procedure for neural networks and to demonstrate that it works well enough on a few small problems to be worth further investigation.
Discovering faster matrix multiplication algorithms with reinforcement learning
Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time, to our knowledge, since its discovery 50 years ago.
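For context, Strassen's classical trick multiplies 2×2 block matrices with seven block products instead of eight; AlphaTensor searches for such schemes automatically. A numpy sketch of the classical scheme:

```python
import numpy as np

def strassen_2x2_blocks(A, B):
    """Strassen's scheme: multiply via 7 block products instead of 8."""
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

rng = np.random.default_rng(0)
A, B = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
print(np.allclose(strassen_2x2_blocks(A, B), A @ B))  # True
```

Applying the scheme recursively on the blocks yields the O(n^2.81) algorithm that AlphaTensor's discovered schemes improve upon for certain sizes and fields.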
LLaMA: Open and Efficient Foundation Language Models
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.
Composer: Creative and Controllable Image Synthesis with Composable Conditions
damo-vilab/composer • 20 Feb 2023
Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.
SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
Spiking neural networks (SNNs) have emerged as an energy-efficient approach to deep learning that leverage sparse and event-driven activations to reduce the computational overhead associated with model inference.
More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models
greshake/lm-safety • 23 Feb 2023
In such attacks, an adversary can prompt the LLM to produce malicious content or override the original instructions and the employed filtering schemes.
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth
Finally, ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance and achieve unprecedented zero-shot generalization performance to eight unseen datasets from both indoor and outdoor domains.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Adding Conditional Control to Text-to-Image Diffusion Models
Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on personal devices.
VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion
nvlabs/voxformer • 23 Feb 2023
To enable such capability in AI systems, we propose VoxFormer, a Transformer-based semantic scene completion framework that can output complete 3D volumetric semantics from only 2D images.
MIT News | Massachusetts Institute of Technology
Using machine learning to predict high-impact research
An artificial intelligence framework built by MIT researchers can give an “early-alert” signal for future high-impact technologies, by learning from patterns gleaned from previous scientific publications.
In a retrospective test of its capabilities, DELPHI, short for Dynamic Early-warning by Learning to Predict High Impact, was able to identify all pioneering papers on an experts’ list of key foundational biotechnologies, sometimes as early as the first year after their publication.
James W. Weis, a research affiliate of the MIT Media Lab, and Joseph Jacobson, a professor of media arts and sciences and head of the Media Lab’s Molecular Machines research group, also used DELPHI to highlight 50 recent scientific papers that they predict will be high impact by 2023. Topics covered by the papers include DNA nanorobots used for cancer treatment, high-energy density lithium-oxygen batteries, and chemical synthesis using deep neural networks, among others.
The researchers see DELPHI as a tool that can help humans better leverage funding for scientific research, identifying “diamond in the rough” technologies that might otherwise languish and offering a way for governments, philanthropies, and venture capital firms to more efficiently and productively support science.
“In essence, our algorithm functions by learning patterns from the history of science, and then pattern-matching on new publications to find early signals of high impact,” says Weis. “By tracking the early spread of ideas, we can predict how likely they are to go viral or spread to the broader academic community in a meaningful way.”
The paper has been published in Nature Biotechnology .
Searching for the “diamond in the rough”
The machine learning algorithm developed by Weis and Jacobson takes advantage of the vast amount of digital information that is now available with the exponential growth in scientific publication since the 1980s. But instead of using one-dimensional measures, such as the number of citations, to judge a publication’s impact, DELPHI was trained on a full time-series network of journal article metadata to reveal higher-dimensional patterns in their spread across the scientific ecosystem.
The result is a knowledge graph that contains the connections between nodes representing papers, authors, institutions, and other types of data. The strength and type of the complex connections between these nodes determine their properties, which are used in the framework. “These nodes and edges define a time-based graph that DELPHI uses to learn patterns that are predictive of high future impact,” explains Weis.
Together, these network features are used to predict scientific impact, with papers that fall in the top 5 percent of time-scaled node centrality five years after publication considered the “highly impactful” target set that DELPHI aims to identify. These top 5 percent of papers constitute 35 percent of the total impact in the graph. DELPHI can also use cutoffs of the top 1, 10, and 15 percent of time-scaled node centrality, the authors say.
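As a loose illustration of the idea (DELPHI’s actual features are higher-dimensional and learned from the full metadata graph, and the helper below is hypothetical), one can sketch a “time-scaled” centrality as citations received per year since publication, flagging the top 5 percent:

```python
def time_scaled_top5(papers, citations, now):
    """papers: {paper_id: publication_year}; citations: list of (citing_id, cited_id).

    Returns the top 5 percent of papers by citations-per-year-since-publication.
    """
    indegree = {p: 0 for p in papers}
    for _, cited in citations:
        indegree[cited] += 1
    # Scale in-degree by paper age so older papers aren't favoured automatically
    score = {p: indegree[p] / max(now - year, 1) for p, year in papers.items()}
    ranked = sorted(score, key=score.get, reverse=True)
    k = max(1, round(0.05 * len(ranked)))
    return set(ranked[:k])
```

The real framework replaces this single hand-crafted score with many learned features over the time-evolving graph, but the cutoff logic — rank by a time-scaled centrality and take the top 5 percent as the target set — matches the setup described above.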
DELPHI suggests that highly impactful papers spread almost virally outside their disciplines and smaller scientific communities. Two papers can have the same number of citations, but highly impactful papers reach a broader and deeper audience. Low-impact papers, on the other hand, “aren’t really being utilized and leveraged by an expanding group of people,” says Weis.
The framework might be useful in “incentivizing teams of people to work together, even if they don’t already know each other — perhaps by directing funding toward them to come together to work on important multidisciplinary problems,” he adds.
Compared to citation number alone, DELPHI identifies over twice the number of highly impactful papers, including 60 percent of “hidden gems,” or papers that would be missed by a citation threshold.
"Advancing fundamental research is about taking lots of shots on goal and then being able to quickly double down on the best of those ideas,” says Jacobson. “This study was about seeing whether we could do that process in a more scaled way, by using the scientific community as a whole, as embedded in the academic graph, as well as being more inclusive in identifying high-impact research directions."
The researchers were surprised at how early in some cases the “alert signal” of a highly impactful paper shows up using DELPHI. “Within one year of publication we are already identifying hidden gems that will have significant impact later on,” says Weis.
He cautions, however, that DELPHI isn’t exactly predicting the future. “We’re using machine learning to extract and quantify signals that are hidden in the dimensionality and dynamics of the data that already exist.”
Fair, efficient, and effective funding
The hope, the researchers say, is that DELPHI will offer a less-biased way to evaluate a paper’s impact, as other measures, such as citations and journal impact factor, can be manipulated, as past studies have shown.
“We hope we can use this to find the most deserving research and researchers, regardless of what institutions they’re affiliated with or how connected they are,” Weis says.
As with all machine learning frameworks, however, designers and users should be alert to bias, he adds. “We need to constantly be aware of potential biases in our data and models. We want DELPHI to help find the best research in a less-biased way — so we need to be careful our models are not learning to predict future impact solely on the basis of sub-optimal metrics like h-index, author citation count, or institutional affiliation.”
DELPHI could be a powerful tool to help scientific funding become more efficient and effective, and perhaps be used to create new classes of financial products related to science investment.
“The emerging metascience of science funding is pointing toward the need for a portfolio approach to scientific investment,” notes David Lang, executive director of the Experiment Foundation. “Weis and Jacobson have made a significant contribution to that understanding and, more importantly, its implementation with DELPHI.”
It’s something Weis has thought about a lot after his own experiences in launching venture capital funds and laboratory incubation facilities for biotechnology startups.
“I became increasingly cognizant that investors, including myself, were consistently looking for new companies in the same spots and with the same preconceptions,” he says. “There’s a giant wealth of highly-talented people and amazing technology that I started to glimpse, but that is often overlooked. I thought there must be a way to work in this space — and that machine learning could help us find and more effectively realize all this unmined potential.”
JMLR Papers: select a volume number to see its table of contents with links to the papers.
Volume 23 (January 2022 - Present)
Volume 22 (January 2021 - December 2021)
Volume 21 (January 2020 - December 2020)
Volume 20 (January 2019 - December 2019)
Volume 19 (August 2018 - December 2018)
Volume 18 (February 2017 - August 2018)
Volume 17 (January 2016 - January 2017)
Volume 16 (January 2015 - December 2015)
Volume 15 (January 2014 - December 2014)
Volume 14 (January 2013 - December 2013)
Volume 13 (January 2012 - December 2012)
Volume 12 (January 2011 - December 2011)
Volume 11 (January 2010 - December 2010)
Volume 10 (January 2009 - December 2009)
Volume 9 (January 2008 - December 2008)
Volume 8 (January 2007 - December 2007)
Volume 7 (January 2006 - December 2006)
Volume 6 (January 2005 - December 2005)
Volume 5 (December 2003 - December 2004)
Volume 4 (April 2003 - December 2003)
Volume 3 (July 2002 - March 2003)
Volume 2 (October 2001 - March 2002)
Volume 1 (October 2000 - September 2001)
Learning from Electronic Health Data (December 2016)
Gesture Recognition (May 2012 - present)
Large Scale Learning (July 2009 - present)
Mining and Learning with Graphs and Relations (February 2009 - present)
Grammar Induction, Representation of Language and Language Learning (November 2010 - April 2011)
Causality (September 2007 - May 2010)
Model Selection (April 2007 - July 2010)
Conference on Learning Theory 2005 (February 2007 - July 2007)
Machine Learning for Computer Security (December 2006)
Machine Learning and Large Scale Optimization (July 2006 - October 2006)
Approaches and Applications of Inductive Programming (February 2006 - March 2006)
Learning Theory (June 2004 - August 2004)
In Memory of Alexey Chervonenkis (September 2015)
Independent Components Analysis (December 2003)
Learning Theory (October 2003)
Inductive Logic Programming (August 2003)
Fusion of Domain Knowledge with Data for Decision Support (July 2003)
Variable and Feature Selection (March 2003)
Machine Learning Methods for Text and Images (February 2003)
Eighteenth International Conference on Machine Learning (ICML2001) (December 2002)
Computational Learning Theory (November 2002)
Shallow Parsing (March 2002)
Kernel Methods (December 2001)
- Published on November 18, 2021
- In Opinions
Top Machine Learning Research Papers Released In 2021
- By Dr. Nivash Jeevanandam
Advances in machine learning and deep learning research are reshaping our technology. Machine learning and deep learning accomplished many astounding feats in 2021, and key research papers have led to technical advances used by billions of people. Research in this field is advancing at a breakneck pace, making it hard to keep up. Here is a collection of the year’s most important research papers.
Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training
The authors of this work examined why ACGAN training becomes unstable as the number of classes in the dataset grows. The researchers revealed that the unstable training occurs due to a gradient explosion problem caused by the unboundedness of the input feature vectors and the classifier’s poor classification capabilities during the early training stage. To alleviate the instability and reinforce ACGAN, the researchers presented the Data-to-Data Cross-Entropy loss (D2D-CE) and the Rebooted Auxiliary Classifier Generative Adversarial Network (ReACGAN). Additionally, extensive tests demonstrate that ReACGAN is robust to hyperparameter selection and is compatible with a variety of architectures and differentiable augmentations.
This article is ranked #1 on CIFAR-10 for Conditional Image Generation.
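A hedged sketch of the intuition behind a D2D-CE-style loss follows — the names and exact formulation here are illustrative, not the paper’s: L2-normalise embeddings so the similarity logits stay bounded (addressing the gradient explosion attributed to unbounded input features), then score the true class proxy against same-batch negatives with a softmax cross-entropy:

```python
import math

def l2norm(v):
    """Normalise a vector to unit length so similarities are bounded in [-1, 1]."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def d2dce_like_loss(emb, proxy, negatives, temperature=0.1):
    """-log softmax score of the true class proxy against negative samples."""
    emb = l2norm(emb)
    sims = [sum(a * b for a, b in zip(emb, l2norm(v)))
            for v in [proxy] + negatives]
    logits = [s / temperature for s in sims]
    # Numerically stable log-sum-exp, then negative log-likelihood of index 0
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]
```

The loss is near zero when the sample aligns with its proxy and large when it aligns with a negative instead — the pull/push behaviour the bounded formulation is designed to stabilise.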
Dense Unsupervised Learning for Video Segmentation
The authors presented a straightforward and computationally fast unsupervised strategy for learning dense spacetime representations from unlabeled videos. The approach demonstrates rapid convergence of training and a high degree of data efficiency. Furthermore, the researchers obtain VOS accuracy superior to previous results despite employing a fraction of the previously necessary training data. The researchers acknowledge that the findings could be misused, for example for unlawful surveillance. They also plan to investigate how the approach could learn a broader spectrum of invariances by exploiting larger temporal windows in videos with complex (ego-)motion, which are more prone to disocclusions.
This study is ranked #1 on DAVIS 2017 for Unsupervised Video Object Segmentation (val).
Temporally-Consistent Surface Reconstruction using Metrically-Consistent Atlases
The authors offer an atlas-based technique for producing unsupervised, temporally consistent surface reconstructions by requiring a point on the canonical shape representation to map to metrically consistent 3D locations on the reconstructed surfaces. Finally, the researchers envisage many potential applications for the method. For example, by substituting an image-based loss for the Chamfer distance, one may apply the method to RGB video sequences, which the researchers believe will spur development in video-based 3D reconstruction.
This article is ranked #1 on ANIM in the category of Surface Reconstruction.
EdgeFlow: Achieving Practical Interactive Segmentation with Edge-Guided Flow
The researchers propose a new interactive segmentation architecture called EdgeFlow that uses user interaction information without resorting to post-processing or iterative optimisation. The proposed method achieves state-of-the-art performance on common benchmarks thanks to its coarse-to-fine network design. Additionally, the researchers built an efficient interactive segmentation tool that lets the user incrementally refine the segmentation result through flexible options.
This paper is ranked #1 on Interactive Segmentation on PASCAL VOC.
Learning Transferable Visual Models From Natural Language Supervision
The authors of this work examined whether it is possible to transfer the success of task-agnostic web-scale pre-training in natural language processing to another domain. The findings indicate that adopting this formula resulted in the emergence of similar behaviours in the field of computer vision, and the authors examine the social ramifications of this line of research. CLIP models learn to accomplish a range of tasks during pre-training to optimise their training objective. Using natural language prompting, CLIP can then use this task learning to enable zero-shot transfer to many existing datasets. When applied at a large scale, this technique can compete with task-specific supervised models, while there is still much space for improvement.
This research is ranked #1 on Zero-Shot Transfer Image Classification on SUN.
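The zero-shot mechanism can be sketched as follows — the encoders here are hypothetical stand-ins for CLIP’s image and text towers, and the prompt template is one of the simple forms the paper discusses: embed the image and one text prompt per class in a shared space, then classify by cosine similarity, with no task-specific training:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def zero_shot_classify(image_emb, labels, text_encoder):
    """Pick the label whose prompt embedding is closest to the image embedding."""
    prompts = {lbl: text_encoder(f"a photo of a {lbl}") for lbl in labels}
    return max(prompts, key=lambda lbl: cosine(image_emb, prompts[lbl]))
```

Swapping the label set swaps the classifier: the same pre-trained embeddings transfer to a new dataset simply by writing new prompts, which is what makes the zero-shot transfer in the paper possible.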
CoAtNet: Marrying Convolution and Attention for All Data Sizes
The researchers in this article conduct a thorough examination of the properties of convolutions and transformers, resulting in a principled approach for combining them into a new family of models dubbed CoAtNet. Extensive experiments demonstrate that CoAtNet combines the advantages of ConvNets and Transformers, achieving state-of-the-art performance across a range of data sizes and compute budgets. Note that this work focuses on ImageNet classification for model development; however, the researchers believe their approach is applicable to a broader range of tasks, such as object detection and semantic segmentation.
This paper is ranked #1 on Image Classification on ImageNet (using extra training data).
SwinIR: Image Restoration Using Swin Transformer
The authors of this article propose the SwinIR image restoration model, which is based on the Swin Transformer . The model comprises three modules: shallow feature extraction, deep feature extraction, and high-quality image reconstruction. For deep feature extraction, the researchers employ a stack of residual Swin Transformer blocks (RSTB), each formed of Swin Transformer layers, a convolution layer, and a residual connection.
This research article is ranked #1 on Image Super-Resolution on Manga109 – 4x upscaling.