
Scientific discovery in the age of artificial intelligence

  • Hanchen Wang   ORCID: orcid.org/0000-0002-1691-024X 1, 2,
  • Tianfan Fu 3,
  • Yuanqi Du 4,
  • Wenhao Gao 5,
  • Kexin Huang 6,
  • Ziming Liu 7,
  • Payal Chandak   ORCID: orcid.org/0000-0003-1097-803X 8,
  • Shengchao Liu   ORCID: orcid.org/0000-0003-2030-2367 9, 10,
  • Peter Van Katwyk   ORCID: orcid.org/0000-0002-3512-0665 11, 12,
  • Andreea Deac 9, 10,
  • Anima Anandkumar 2, 13,
  • Karianne Bergen 11, 12,
  • Carla P. Gomes   ORCID: orcid.org/0000-0002-4441-7225 4,
  • Shirley Ho 14, 15, 16, 17,
  • Pushmeet Kohli   ORCID: orcid.org/0000-0002-7466-7997 18,
  • Joan Lasenby 1,
  • Jure Leskovec   ORCID: orcid.org/0000-0002-5411-923X 6,
  • Tie-Yan Liu 19,
  • Arjun Manrai 20,
  • Debora Marks   ORCID: orcid.org/0000-0001-9388-2281 21, 22,
  • Bharath Ramsundar 23,
  • Le Song 24, 25,
  • Jimeng Sun 26,
  • Jian Tang 9, 27, 28,
  • Petar Veličković 18, 29,
  • Max Welling 30, 31,
  • Linfeng Zhang 32, 33,
  • Connor W. Coley   ORCID: orcid.org/0000-0002-8271-8723 5, 34,
  • Yoshua Bengio   ORCID: orcid.org/0000-0002-9322-3515 9, 10 &
  • Marinka Zitnik   ORCID: orcid.org/0000-0001-8530-7228 20, 22, 35, 36

Nature volume 620, pages 47–60 (2023)

95k Accesses · 132 Citations · 599 Altmetric

Subjects

  • Computer science
  • Machine learning
  • Scientific community


Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.
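To make the abstract's mention of self-supervised learning concrete, the following is a minimal sketch, assuming PyTorch, of a masked-reconstruction objective of the kind such methods use: part of each unlabelled input is hidden and the model is scored only on how well it recovers the hidden part. The data, architecture and masking rate are invented for illustration and are not taken from the paper.

```python
import torch
import torch.nn as nn

# Stand-in for a large pool of unlabelled scientific measurements.
dim = 32
x = torch.randn(256, dim)

model = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, dim))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    # Hide roughly 15% of the input features at random.
    mask = (torch.rand_like(x) > 0.15).float()
    pred = model(x * mask)
    # Score the reconstruction only on the hidden (masked-out) entries.
    loss = ((pred - x) ** 2 * (1 - mask)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

No labels appear anywhere in this loop: the supervision signal is manufactured from the data itself, which is what allows such models to be trained on vast unlabelled corpora.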



Change history

30 August 2023: A Correction to this paper has been published: https://doi.org/10.1038/s41586-023-06559-7


Acknowledgements

M.Z. gratefully acknowledges the support of the National Institutes of Health under R01HD108794, U.S. Air Force under FA8702-15-D-0001, awards from Harvard Data Science Initiative, Amazon Faculty Research, Google Research Scholar Program, Bayer Early Excellence in Science, AstraZeneca Research, Roche Alliance with Distinguished Scientists, and Kempner Institute for the Study of Natural and Artificial Intelligence. C.P.G. and Y.D. acknowledge the support from the U.S. Air Force Office of Scientific Research under Multidisciplinary University Research Initiatives Program (MURI) FA9550-18-1-0136, Defense University Research Instrumentation Program (DURIP) FA9550-21-1-0316, and awards from Scientific Autonomous Reasoning Agent (SARA), and AI for Discovery Assistant (AIDA). Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funders. We thank D. Hassabis, A. Davies, S. Mohamed, Z. Li, K. Ma, Z. Qiao, E. Weinstein, A. V. Weller, Y. Zhong and A. M. Brandt for discussions on the paper.

Author information

Hanchen Wang

Present address: Department of Research and Early Development, Genentech Inc, South San Francisco, CA, USA

Present address: Department of Computer Science, Stanford University, Stanford, CA, USA

These authors contributed equally: Hanchen Wang, Tianfan Fu, Yuanqi Du

Authors and Affiliations

Department of Engineering, University of Cambridge, Cambridge, UK

Hanchen Wang & Joan Lasenby

Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA

Hanchen Wang & Anima Anandkumar

Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA

Department of Computer Science, Cornell University, Ithaca, NY, USA

Yuanqi Du & Carla P. Gomes

Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA

Wenhao Gao & Connor W. Coley

Department of Computer Science, Stanford University, Stanford, CA, USA

Kexin Huang & Jure Leskovec

Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA

Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA, USA

Payal Chandak

Mila – Quebec AI Institute, Montreal, Quebec, Canada

Shengchao Liu, Andreea Deac, Jian Tang & Yoshua Bengio

Université de Montréal, Montreal, Quebec, Canada

Shengchao Liu, Andreea Deac & Yoshua Bengio

Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA

Peter Van Katwyk & Karianne Bergen

Data Science Institute, Brown University, Providence, RI, USA

NVIDIA, Santa Clara, CA, USA

Anima Anandkumar

Center for Computational Astrophysics, Flatiron Institute, New York, NY, USA

Department of Astrophysical Sciences, Princeton University, Princeton, NJ, USA

Department of Physics, Carnegie Mellon University, Pittsburgh, PA, USA

Department of Physics and Center for Data Science, New York University, New York, NY, USA

Google DeepMind, London, UK

Pushmeet Kohli & Petar Veličković

Microsoft Research, Beijing, China

Tie-Yan Liu

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

Arjun Manrai & Marinka Zitnik

Department of Systems Biology, Harvard Medical School, Boston, MA, USA

Debora Marks

Broad Institute of MIT and Harvard, Cambridge, MA, USA

Debora Marks & Marinka Zitnik

Deep Forest Sciences, Palo Alto, CA, USA

Bharath Ramsundar

BioMap, Beijing, China

Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates

University of Illinois at Urbana-Champaign, Champaign, IL, USA

HEC Montréal, Montreal, Quebec, Canada

CIFAR AI Chair, Toronto, Ontario, Canada

Department of Computer Science and Technology, University of Cambridge, Cambridge, UK

Petar Veličković

University of Amsterdam, Amsterdam, Netherlands

Max Welling

Microsoft Research Amsterdam, Amsterdam, Netherlands

DP Technology, Beijing, China

Linfeng Zhang

AI for Science Institute, Beijing, China

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA

Connor W. Coley

Harvard Data Science Initiative, Cambridge, MA, USA

Marinka Zitnik

Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA


Contributions

All authors contributed to the design and writing of the paper, helped shape the research, provided critical feedback, and commented on the paper and its revisions. H.W., T.F., Y.D. and M.Z. conceived the study and were responsible for overall direction and planning. W.G., K.H. and Z.L. contributed equally to this work (equal second authorship) and are listed alphabetically.

Corresponding author

Correspondence to Marinka Zitnik.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Brian Gallagher and Benjamin Nachman for their contribution to the peer review of this work.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Wang, H., Fu, T., Du, Y. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023). https://doi.org/10.1038/s41586-023-06221-2

Received: 30 March 2022

Accepted: 16 May 2023

Published: 02 August 2023

Issue Date: 03 August 2023

DOI: https://doi.org/10.1038/s41586-023-06221-2


This article is cited by

Antimicrobial resistance crisis: could artificial intelligence be the solution?

  • Guang-Yu Liu
  • Xiao-Fen Liu

Military Medical Research (2024)

Embracing data science in catalysis research

  • Manu Suvarna
  • Javier Pérez-Ramírez

Nature Catalysis (2024)

Automated BigSMILES conversion workflow and dataset for homopolymeric macromolecules

  • Joonbum Lee
  • Junhee Seok

Scientific Data (2024)

Techniques for supercharging academic writing with generative AI

  • Zhicheng Lin

Nature Biomedical Engineering (2024)

Memristor-based hardware accelerators for artificial intelligence

  • Takashi Ando
  • Qiangfei Xia

Nature Reviews Electrical Engineering (2024)



AI-Based Modeling: Techniques, Applications and Research Issues Towards Automation, Intelligent and Smart Systems

  • Review Article
  • Open access
  • Published: 10 February 2022
  • Volume 3, article number 158 (2022)


  • Iqbal H. Sarker   ORCID: orcid.org/0000-0003-1740-5517 1, 2

86k Accesses · 195 Citations · 5 Altmetric

Artificial intelligence (AI) is a leading technology of the current age of the Fourth Industrial Revolution (Industry 4.0 or 4IR), with the capability of incorporating human behavior and intelligence into machines or systems. Thus, AI-based modeling is the key to building automated, intelligent, and smart systems according to today’s needs. To solve real-world issues, various types of AI such as analytical, functional, interactive, textual, and visual AI can be applied to enhance the intelligence and capabilities of an application. However, developing an effective AI model is a challenging task due to the dynamic nature of, and variation in, real-world problems and data. In this paper, we present a comprehensive view of “AI-based modeling”, covering the principles and capabilities of potential AI techniques that can play an important role in developing intelligent and smart systems in various real-world application areas, including business, finance, healthcare, agriculture, smart cities, cybersecurity and many more. We also highlight the research issues within the scope of our study. Overall, the goal of this paper is to provide a broad overview of AI-based modeling that can serve as a reference guide for academics, industry professionals and decision-makers in various real-world scenarios and application domains.


Introduction

Nowadays, we live in a technological age, the Fourth Industrial Revolution, known as Industry 4.0 or 4IR [59, 91], which envisions rapid change in technology, industries, and societal patterns and processes as a consequence of enhanced interconnectivity and smart automation. This revolution is impacting almost every industry in every country, causing tremendous, non-linear change at an unprecedented rate, with implications for all disciplines and economies. Three key terms have become fundamental criteria in designing today’s applications and systems in every sector of our lives, since the current world is more reliant on technology than ever before: automation, i.e., reducing human interaction in operations; intelligence, i.e., the ability to extract insights or usable knowledge from data; and smart computing, i.e., self-monitoring, analyzing and reporting, known as self-awareness. The use of modern smart technologies enables smarter, faster decisions about business processes, ultimately increasing the productivity and profitability of the overall operation, and Artificial Intelligence (AI) is the leading technology in this area. The AI revolution, like earlier industrial revolutions that launched massive economic activity in manufacturing, commerce, transportation and other areas, has the potential to lead the way of progress. As a result, the impact of AI on the Fourth Industrial Revolution motivates us to focus on “AI-based modeling” in this paper.

Artificial intelligence (AI) is a broad field of computer science concerned with building smart machines capable of performing tasks that typically require human intelligence. In other words, its aim is to make computers smart and intelligent by giving them the ability to think and learn using computer programs or machines, i.e., to think and function in much the same way that people do. From a philosophical perspective, AI has the potential to help people live more meaningful lives without having to work as hard, as well as manage the massive network of interconnected individuals, businesses, states, and nations in a way that benefits everyone. Thus, the primary goal of AI is to enable computers and machines to perform cognitive functions such as problem-solving, decision-making, perception, and comprehension of human communication. Therefore, AI-based modeling is the key to building automated, intelligent, and smart systems according to today's needs, and it has emerged as the next major technological milestone, influencing the future of practically every business by making every process better, faster, and more precise.

While today's Fourth Industrial Revolution typically focuses on technology-driven "automation, intelligent and smart systems", AI has become one of the core technologies for achieving this goal. However, developing an effective AI model is a challenging task due to the dynamic nature and variation in real-world problems and data. Thus, we take into account several AI categories: the first is "Analytical AI", with the capability of extracting insights from data to ultimately produce recommendations and thus contribute to data-driven decision-making; the second is "Functional AI", which is similar to analytical AI except that, instead of giving recommendations, it takes actions; the third is "Interactive AI", which typically allows businesses to automate communication without compromising on interactivity, as in smart personal assistants or chatbots; the fourth is "Textual AI", which covers textual analytics or natural language processing, through which businesses can enjoy text recognition, speech-to-text conversion, machine translation, and content generation capabilities; and the fifth is "Visual AI", which covers the computer vision and augmented reality fields. These categories are discussed briefly in "Why artificial intelligence in today's research and applications?".

Although the area of "artificial intelligence" is huge, we mainly focus on potential techniques for solving real-world issues, where the results are used to build automated, intelligent, and smart systems in various application areas. To build AI-based models, we classify various AI techniques into ten categories: (1) machine learning; (2) neural networks and deep learning; (3) data mining, knowledge discovery and advanced analytics; (4) rule-based modeling and decision-making; (5) fuzzy logic-based approach; (6) knowledge representation, uncertainty reasoning, and expert system modeling; (7) case-based reasoning; (8) text mining and natural language processing; (9) visual analytics, computer vision and pattern recognition; and (10) hybridization, searching, and optimization. These techniques can play an important role in developing intelligent and smart systems in various real-world application areas that include business, finance, healthcare, agriculture, smart cities, cybersecurity, and many more, depending on the nature of the problem and target solution. Thus, it is important to comprehend the concepts of these techniques, as well as their relevance in a variety of real-world scenarios, discussed briefly in " Potential AI techniques ".

Based on the importance and capabilities of AI techniques, in this paper we give a comprehensive view on "AI-based modeling" that can play a key role towards automation and intelligent and smart systems according to today's needs. Thus, the key focus is to explain the principles of various AI techniques and their applicability to the advancement of computing and decision-making to meet the requirements of the Fourth Industrial Revolution. Therefore, the purpose of this paper is to provide a fundamental guide for those academics and industry professionals who want to study, research, and develop automated, intelligent, and smart systems based on artificial intelligence techniques in relevant application domains.

The main contributions of this paper are therefore listed as follows:

To define the scope of our study in terms of automation, intelligent and smart computing, and decision-making in the context of today’s real-world needs.

To explore various types of AI, including analytical, functional, interactive, textual, and visual AI, to understand the power of artificial intelligence in computing and decision-making while solving various problems in today's Fourth Industrial Revolution.

To provide a comprehensive view on AI techniques that can be applied to build an AI-based model to enhance the intelligence and capabilities of a real-world application.

To discuss the applicability of AI-based solutions in various real-world application domains to assist developers as well as researchers in broadening their perspectives on AI techniques.

To highlight and summarize the potential research issues within the scope of our study for conducting future research, system development and improvement.

The rest of the paper is organized as follows. The next section provides background, highlighting why artificial intelligence matters in today's research and applications. In the subsequent section, we discuss and summarize how various AI techniques can be used for intelligence modeling in various application areas. Next, we summarize various real-world application areas where AI techniques can be employed to build automated, intelligent, and smart systems. The impact and future aspects of AI, highlighting research issues, are presented in the penultimate section, and the final section concludes this paper.

Why Artificial Intelligence in Today’s Research and Applications?

In this section, our goal is to motivate the study of various AI techniques that can be applied in various application areas in today’s interconnected world. For this, we explore Industry 4.0 and the revolution of AI, types of AI techniques, as well as the relation with the most prominent machine and deep learning techniques. Hence, the scope of our study in terms of research and applications is also explored through our discussion.

Industry 4.0 and the Revolution of AI

We are now in the age of the 4th Industrial Revolution, referred to as Industry 4.0 [ 59 , 91 ], which represents a new era of innovation in technology, particularly AI-driven technology. After the Internet and mobile Internet sparked the 3rd Industrial Revolution, AI technologies, fueled by data, are now creating the atmosphere of Industry 4.0. The term "Industry 4.0" typically refers to the present trend of leveraging modern technology to automate processes and exchange information. In a broad sense, Industry 4.0 has been defined as "A term used to describe the present trend of industrial technology automation and data exchange, which includes cyber-physical systems, the Internet of Things, cloud computing, and cognitive computing, as well as the development of the smart factory". The digital transition to Industry 4.0 begins with data collection, followed by artificial intelligence to interpret that data. Thus, the term "Intelligence Revolution" can be considered in the context of computing and services, as the world is being reshaped by AI that incorporates human behavior and intelligence into machines or systems.

AI is the buzzword these days as it is going to impact businesses of all shapes and sizes, across all industries. Existing products or services can be enhanced by industrial AI to make them more effective, reliable, and safe. For example, computer vision is used in the automotive industry to avoid collisions and allow vehicles to stay in their lane, making driving safer. The world’s most powerful nations are hurrying to jump on the AI bandwagon and are increasing their investments in the field. Similarly, the largest and most powerful corporations are working hard to build ground-breaking AI solutions that will put them ahead of the competition. As a result, its impact may be observed in practically every area including homes, businesses, hospitals, cities, and the virtual world, as summarized in “ Real-World Applications of AI ”.

Fig. 1: Various types of artificial intelligence (AI) considering the variations of real-world issues.

Understanding Various Types of Artificial Intelligence

Artificial intelligence (AI) is primarily concerned with comprehending and carrying out intelligent tasks such as thinking, acquiring new abilities, and adapting to new contexts and challenges. AI is thus considered a branch of science and engineering that focuses on simulating a wide range of issues and functions in the field of human intellect. However, due to the dynamic nature and diversity of real-world situations and data, building an effective AI model is a challenging task. Thus, to solve various issues in today's Fourth Industrial Revolution, we explore various types of AI, including analytical, functional, interactive, textual, and visual AI, to understand the power of AI, as shown in Fig. 1. In the following, we define the scope of each category in terms of computing and real-world services.

Analytical AI: Analytics typically refers to the process of identifying, interpreting, and communicating meaningful patterns of data. Thus, Analytical AI aims to discover new insights, patterns, and relationships or dependencies in data and to assist in data-driven decision-making. Therefore, in the domain of today’s business intelligence, it becomes a core part of AI that can provide insights to an enterprise and generate suggestions or recommendations through its analytical processing capability. Various machine learning [ 81 ] and deep learning [ 80 ] techniques can be used to build an analytical AI model to solve a particular real-world problem. For instance, to assess business risk, a data-driven analytical model can be used.

Functional AI: Functional AI works similarly to analytical AI because it also explores massive quantities of data for patterns and dependencies. Functional AI, on the other hand, executes actions rather than making recommendations. For instance, a functional AI model could be useful in robotics and IoT applications to take immediate actions.

Interactive AI: Interactive AI typically enables efficient and interactive communication automation, which is well established in many aspects of our daily lives, particularly in the commercial sphere. For instance, an interactive AI model could be useful for building chatbots and smart personal assistants. While building an interactive AI model, a variety of techniques such as machine learning, frequent pattern mining, reasoning, and AI heuristic search can be employed.

Textual AI: Textual AI typically covers textual analytics or natural language processing through which businesses can enjoy text recognition, speech-to-text conversion, machine translation as well as content generation capabilities. For instance, an enterprise may use textual AI to support an internal corporate knowledge repository to provide relevant services, e.g., answering consumers’ queries.

Visual AI: Visual AI is typically capable of recognizing, classifying, and sorting items, as well as converting images and videos into insights. Thus, visual AI can be considered a branch of computer science that trains machines to interpret images and visual data in much the same manner that humans do. This sort of AI is often used in fields such as computer vision and augmented reality.

As discussed above, each of the AI types has the potential to provide solutions to various real-world problems. However, to provide solutions by taking into account the target applications, various AI techniques and their combinations, including machine learning, deep learning, advanced analytics, knowledge discovery, reasoning, searching, and relevant others, can be used, as discussed briefly in " Potential AI techniques ". As most real-world issues need advanced analytics [ 79 ] to provide an intelligent and smart solution according to today's needs, analytical AI that uses machine learning (ML) and deep learning (DL) techniques can play a key role in the area of AI-powered computing and systems.

The Relation of AI with ML and DL

Artificial intelligence (AI), machine learning (ML), and deep learning (DL) are three prominent terminologies used interchangeably nowadays to represent intelligent systems or software. The position of machine learning and deep learning within the artificial intelligence field is depicted in Fig. 2. According to Fig. 2, DL is a subset of ML, which is in turn a subset of AI. In general, AI [ 77 ] incorporates human behavior and intelligence into machines or systems, whereas ML is a way of learning from data or experience [ 81 ] that automates analytical model building. Deep learning [ 80 ] refers to data-driven learning approaches that compute using multi-layer neural networks. In the deep learning approach, the term "Deep" refers to the concept of numerous levels or stages through which data is processed to develop a data-driven model.

Fig. 2: An illustration of the position of machine learning (ML) and deep learning (DL) within the area of artificial intelligence (AI).

Thus, both ML and DL can be considered essential AI technologies, as well as a frontier for AI, that can be used to develop intelligent systems and automate processes. They also take AI to a new level, termed "Smarter AI", with data-driven learning. There is a significant relationship with "Data Science" [ 79 ] as well, because both ML and DL learn from data. These learning methods can also play a crucial role in advanced analytics and intelligent decision-making in data science, which typically refers to the complete process of extracting insights from data in a certain problem domain. Overall, we can conclude that both ML and DL technologies have the potential to transform the current world, particularly as powerful computational engines, and to contribute to technology-driven automation and smart and intelligent systems. In addition to these learning techniques, several others can play a role in the development of AI-based models in various real-world application areas, depending on the nature of the problem and the target solution, discussed briefly in "  Potential AI techniques ".

Potential AI Techniques

In this section, we briefly discuss the principles and capabilities of potential AI techniques that can be used in developing intelligent and smart systems in various real-world application areas. For this, we divide AI techniques into ten potential categories, taking into account the various types of AI mentioned earlier in " Why artificial intelligence in today's research and applications? ". The following are the ten categories of AI techniques that can play a key role in automated, intelligent, and smart computer systems, depending on the nature of the problem.

Machine Learning

Machine learning (ML) is known as one of the most promising AI technologies, which is typically the study of computer algorithms that automate analytical model building [ 81 ]. ML models are often made up of a set of rules, procedures, or sophisticated “transfer functions” that can be used to discover interesting data patterns or anticipate behavior [ 23 ]. Machine learning is also known as predictive analytics that makes predictions about certain unknowns in the future through the use of data and is used to solve many real-world business issues, e.g., business risk prediction. In Fig.  3 , a general framework of a machine learning-based predictive model is depicted, where the model is trained from historical data in phase 1 and the outcome is generated for new test data in phase 2. For modeling in a particular problem domain, different types of machine learning techniques can be used according to their learning principles and capabilities, as discussed below.

Fig. 3: A general structure of a machine learning-based predictive model considering both the training and testing phases.

Supervised learning This is performed when particular goals are specified to be achieved from a set of inputs, i.e., a 'task-driven strategy' that uses labeled data to train algorithms to classify data or forecast outcomes, for example, detecting spam-like emails. The two most common supervised learning tasks are classification (predicting a label) and regression (predicting a quantity) analysis, discussed briefly in our earlier paper Sarker et al. [ 81 ]. Naive Bayes [ 42 ], K-nearest neighbors [ 4 ], Support vector machines [ 46 ], Decision Trees (e.g., ID3 [ 71 ], C4.5 [ 72 ], CART [ 15 ], BehavDT [ 84 ], IntrudTree [ 82 ]), Ensemble learning, Random Forest [ 14 ], Linear regression [ 36 ], Support vector regression [ 46 ], etc. [ 81 ] are popular techniques that can be used to solve various supervised learning tasks, according to the nature of the given data in a particular problem domain. For instance, classification models could be useful for detecting various types of cyber-attacks, while a regression model could be useful for cyber-crime trend analysis or estimating financial loss in the domain of cybersecurity, enabling enterprises to assess and manage their cyber-risk.
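To make the two-phase scheme of Fig. 3 concrete, the following is a minimal, illustrative sketch in Python using scikit-learn; the synthetic dataset and the choice of a decision-tree classifier are assumptions standing in for a real problem domain such as spam detection.

```python
# Minimal supervised-learning sketch with scikit-learn (illustrative only):
# a decision-tree classifier trained on synthetic, stand-in data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic labeled data standing in for a real problem domain,
# e.g., spam vs. non-spam feature vectors.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Phase 1: train the model on historical (training) data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
model = DecisionTreeClassifier(max_depth=5, random_state=42)
model.fit(X_train, y_train)

# Phase 2: generate outcomes for new, unseen test data.
y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.3f}")
```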

Unsupervised learning This is referred to as a 'data-driven method', in which the primary goal is to uncover patterns, structures, or knowledge from unlabeled data. Clustering, visualization, dimensionality reduction, finding association rules, and anomaly detection are some of the most common unsupervised tasks, discussed briefly in our earlier paper Sarker et al. [ 81 ]. Popular techniques for solving unsupervised learning tasks include clustering algorithms such as K-means [ 55 ], K-Medoids [ 64 ], CLARA [ 45 ], DBSCAN [ 27 ], hierarchical clustering with single linkage [ 92 ] or complete linkage [ 93 ], and BOTS [ 86 ]; association learning algorithms such as AIS [ 2 ], Apriori [ 3 ], Apriori-TID and Apriori-Hybrid [ 3 ], FP-Tree [ 37 ], RARM [ 18 ], Eclat [ 105 ], and ABC-RuleMiner [ 88 ]; as well as feature selection and extraction techniques such as Pearson correlation [ 81 ] and principal component analysis [ 40 , 66 ], according to the nature of the data. An unsupervised clustering model, for example, could be useful in customer segmentation, i.e., identifying different consumer groups around which to build marketing or other business strategies.
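As an illustration of the customer segmentation example above, the following is a minimal K-means sketch; the two features (annual spend, visit frequency) and the synthetic data are assumptions for demonstration only.

```python
# Minimal unsupervised-learning sketch: K-means clustering on synthetic
# customer-like data (annual spend, visit frequency) -- illustrative only.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three synthetic "customer groups" standing in for real transaction data.
customers = np.vstack([
    rng.normal(loc=[20, 2], scale=1.5, size=(50, 2)),   # low spend, rare visits
    rng.normal(loc=[50, 10], scale=2.0, size=(50, 2)),  # mid spend, regular visits
    rng.normal(loc=[90, 25], scale=2.5, size=(50, 2)),  # high spend, frequent visits
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print("Cluster sizes:", np.bincount(kmeans.labels_))
print("Cluster centres:\n", kmeans.cluster_centers_)
```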

Other learning techniques In addition to particular supervised and unsupervised tasks, semi-supervised learning can be regarded as a hybridization of the two techniques explained above, as it uses both labeled and unlabeled data to train a model. It can be effective for improving model performance when data must be labeled automatically without human interaction; for instance, a semi-supervised learning model could be useful for classifying Internet content or texts. Reinforcement learning is another machine learning training strategy that rewards desired behaviors while punishing unwanted ones. A reinforcement learning agent, in general, is capable of perceiving and interpreting its surroundings, taking actions, and learning through trial and error, i.e., an environment-driven approach, in which the environment is typically modeled as a Markov decision process and decisions are made using a reward function [ 10 ]. Monte Carlo learning, Q-learning, and Deep Q-Networks are among the most common reinforcement learning algorithms [ 43 ]. Trajectory optimization, motion planning, dynamic pathing, and scenario-based learning policies for highways are some of the autonomous driving activities where reinforcement learning could be used.
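The following is a minimal, illustrative tabular Q-learning sketch on a toy environment of our own construction (a one-dimensional corridor); it is not drawn from the cited works, but it shows the reward-driven update at the heart of the approach.

```python
# Minimal tabular Q-learning sketch (illustrative only): the agent walks a
# 1-D corridor of six states and is rewarded for reaching the rightmost one.
import numpy as np

n_states, n_actions = 6, 2              # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration
rng = np.random.default_rng(1)

for episode in range(300):
    s = 0
    while s != n_states - 1:
        # epsilon-greedy selection, with random tie-breaking while Q is flat
        if rng.random() < epsilon or Q[s, 0] == Q[s, 1]:
            a = int(rng.integers(n_actions))
        else:
            a = int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: current reward plus discounted best future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print("Learned policy (0=left, 1=right):", Q.argmax(axis=1))
```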

Overall, machine learning modeling [ 81 ] has been employed in practically every aspect of our lives, including healthcare, cybersecurity, business, education, virtual assistance, recommendation systems, smart cities, and many more. Blumenstock et al. [ 12 ], for example, provide a machine learning strategy for getting COVID-19 assistance to the people who need it most. Sarker et al. highlight numerous sorts of cyber anomalies and attacks that can be detected using machine learning approaches in the domain of cybersecurity [ 78 , 89 ]. Saharan et al. [ 76 ] describe a machine-learning-based strategy to develop an effective smart parking pricing system for smart city environments. In our earlier paper [ 81 ], we briefly discussed various types of machine learning techniques, including clustering, feature learning, classification, regression, and association analysis, highlighting their working principles, learning capabilities, and real-world applications. In Table  1 , we have outlined the above-mentioned machine learning techniques, emphasizing model building procedures and tasks. Overall, machine learning algorithms can build a model based on training data of a particular problem domain to make predictions or decisions without having to be explicitly programmed to do so. Thus, we can conclude that machine learning approaches can play a crucial part in the development of useful models in a variety of application areas, based on their learning capabilities, the nature of the data, and the desired outcome.

Neural Networks and Deep Learning

Deep learning (DL) [ 80 ] is another popular AI technique, which is based on artificial neural networks (ANN). Nowadays, DL has become a hot topic in the computing world due to its layer-wise learning capability from data. Multiple hidden layers, together with input and output layers, make up a typical deep neural network. Figure  4 shows the general structure of a deep neural network (hidden layers = N, with N ≥ 2) compared with a shallow network (hidden layer = 1). DL techniques can be divided into three major categories, highlighted in our earlier paper Sarker et al. [ 80 ]. These are as below:

Fig. 4: A general architecture of (a) a shallow network with one hidden layer and (b) a deep neural network with multiple hidden layers.

Fig. 5: A taxonomy of DL techniques [ 80 ], broadly divided into three major categories: (1) deep networks for supervised or discriminative learning, (2) deep networks for unsupervised or generative learning, and (3) deep networks for hybrid learning and relevant others.

Deep networks for supervised or discriminative learning In supervised or classification applications, this type of DL approach is used to provide a discriminative function. Discriminative deep architectures are often designed to provide pattern categorization discrimination by characterizing the posterior distributions of classes conditioned on observable data [ 20 ]. Multi-layer perceptron (MLP) [ 67 ], Convolutional neural networks (CNN or ConvNet) [ 53 ], Recurrent neural networks (RNN) [ 24 , 57 ], and their variants can be used to build the deep discriminative learning models to solve the relevant real-world issues.

Deep networks for unsupervised or generative learning This category of deep learning approaches is commonly used to identify high-order correlation qualities or features for pattern analysis or synthesis, as well as the joint statistical distributions of visible data and their associated classes [ 20 ]. The key notion of generative deep architectures is that specific supervisory information, such as target class labels, is unimportant throughout the learning process. Techniques in this category are mostly employed for unsupervised learning, as they are commonly used for feature learning or data generation and representation [ 19 , 20 ]. Thus, generative modeling can also be utilized as a preprocessing step for supervised learning tasks, ensuring discriminative model accuracy. The Generative Adversarial Network (GAN) [ 32 ], Autoencoder (AE) [ 31 ], Restricted Boltzmann Machine (RBM) [ 58 ], Self-Organizing Map (SOM) [ 50 ], and Deep Belief Network (DBN) [ 39 ], as well as their variants, can be used to build the deep generative learning models to solve the relevant real-world issues.

Deep networks for hybrid learning Generative models are versatile, learning from both labeled and unlabeled data. In contrast, discriminative models are unable to learn from unlabeled data yet outperform their generative versions in supervised tasks. Hybrid networks are motivated by a paradigm for simultaneously training deep generative and discriminative models. Multiple (two or more) deep basic learning models make up hybrid deep learning models, with the basic model being the discriminative or generative deep learning model outlined previously. For instance, a generative model followed by a discriminative model, or an integration of a generative or discriminative model followed by a non-deep learning classifier, may be effective for tackling real-world problems.

Figure  5 shows a taxonomy of these DL techniques, which can be employed in many application areas including healthcare, cybersecurity, business, virtual assistance, smart cities, visual analytics, and many more. For example, Aslan et al. [ 9 ] offer a CNN-based transfer learning strategy for COVID-19 infection detection. Islam et al. [ 41 ] describe a combined deep CNN-LSTM network for the identification of novel coronavirus (COVID-19) using X-ray images. Using transferable generative adversarial networks built on deep autoencoders, Kim et al. [ 48 ] propose a method for detecting zero-day malware. Anuradha et al. [ 8 ] propose a deep CNN-based stock trend prediction utilizing a reinforcement-LSTM model based on big data. Wang et al. [ 100 ] offer a real-time collision prediction technique for intelligent transportation systems based on deep learning. Dhyani et al. [ 22 ] propose an intelligent chatbot utilizing deep learning with a bidirectional RNN and an attention model. Overall, deep learning approaches can play a crucial role in the development of effective AI models in a variety of application areas, based on their learning capabilities, the nature of the data, and the target outcome.
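As a concrete, minimal illustration of the discriminative category above, the sketch below defines and trains a small multi-layer perceptron in PyTorch; the input width, layer sizes, and random stand-in data are all assumptions for demonstration.

```python
# Minimal PyTorch sketch of a deep discriminative model: an MLP with two
# hidden layers trained on random stand-in data (illustrative only).
import torch
import torch.nn as nn

model = nn.Sequential(                 # input -> two hidden layers -> output
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 3),                  # 3-class output (logits)
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

X = torch.randn(256, 20)               # synthetic features
y = torch.randint(0, 3, (256,))        # synthetic class labels

for epoch in range(100):               # standard supervised training loop
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
print(f"final training loss: {loss.item():.3f}")
```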

Data Mining, Knowledge Discovery and Advanced Analytics

Over the last decade, data mining has been a common word that is interchangeable with terms like knowledge mining from data, knowledge extraction, knowledge discovery from data (KDD), data or pattern analysis, etc. [ 79 ]. Figure  6 shows a general procedure of the knowledge discovery process. According to Han et al. [ 36 ], the term “knowledge mining from data” should have been used instead. Data mining is described as the process of extracting useful patterns and knowledge from huge volumes of data [ 36 ], which is related to another popular term “Data Science” [ 79 ]. Data science is typically defined as a concept that unites statistics, data analysis, and related methodologies to analyze and investigate realities through data.

Fig. 6: A general procedure of the knowledge discovery process.

In the area of data analytics, several key questions such as “What happened?”, “Why did it happen?”, “What will happen in the future?”, “What action should be taken?” are common and important [ 79 ]. Based on these questions, four types of analytics such as descriptive, diagnostic, predictive, and prescriptive analytics are highlighted below, which can be used to build the corresponding data-driven models.

Descriptive analytics It is the analysis of historical data to have a better understanding of how a business has changed. Thus, descriptive analytics answers the question, “What happened in the past?” by describing historical data such as sales and operations statistics, marketing tactics, social media usage, etc.

Diagnostic analytics It is a type of sophisticated analytics that explores data or content to figure out “Why did it happen?” The purpose of diagnostic analytics is to assist in the discovery of the problem’s root cause.

Predictive analytics This type of advanced analytics typically explores data to answer the question, “What will happen in the future?” Thus, the primary purpose of predictive analytics is to identify and, in most cases, answer this question with a high degree of confidence.

Prescriptive analytics This focuses on advising the optimal course of action based on data to maximize the total outcomes and profitability, answering the question, “What action should be taken?”

To summarize, both descriptive and diagnostic analytics examine the past to determine what happened and why it happened. Predictive and prescriptive analytics employ historical data to foresee what will happen in the future and what actions should be taken to mitigate such impacts. For a clear understanding, Table  2 shows a summary of these analytics as applied in various application areas. For example, Hamed et al. [ 35 ] build decision support systems in Arabic higher education institutions using data mining and business intelligence. Alazab et al. [ 5 ] provide a data mining strategy to maximize competitive advantage on E-business websites. From logs to stories, Afzaliseresht et al. [ 1 ] provide human-centered data mining for cyber threat information. Poort et al. [ 70 ] describe an automated diagnostic analytics workflow for the detection of production events, applied to mature gas fields. Srinivas et al. [ 94 ] provide a prescriptive analytics framework for optimizing outpatient appointment systems using machine learning algorithms and scheduling rules. Thus, we can conclude that data mining and analytics can play a crucial part in building AI models through the insights extracted from data.
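As a minimal, illustrative sketch of descriptive analytics ("What happened?"), the following summarizes synthetic sales records with pandas; the column names and figures are invented for demonstration.

```python
# Minimal descriptive-analytics sketch with pandas (illustrative only):
# summarizing historical sales to answer "What happened in the past?".
import pandas as pd

sales = pd.DataFrame({
    "region":  ["north", "south", "north", "south", "north"],
    "quarter": ["Q1", "Q1", "Q2", "Q2", "Q2"],
    "revenue": [120.0, 95.0, 140.0, 110.0, 133.0],
})

# Aggregate historical performance per region and quarter.
summary = sales.groupby(["region", "quarter"])["revenue"].sum()
print(summary)
```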

Rule-Based Modeling and Decision-Making

Typically, a rule-based system is used to store and modify knowledge to understand data in a meaningful way. A rule base is a sort of knowledge base that has a list of rules. In most cases, rules are written as IF-THEN statements of the form:

IF <antecedent> THEN <consequent>

Such an IF-THEN rule-based expert system model can have the decision-making ability of a human expert in an intelligent system designed to solve complex problems and perform knowledge reasoning [ 85 ]. The reason is that rules in such frameworks are easily understood by humans and are capable of representing relevant knowledge clearly and effectively. Furthermore, rule-based models may be quickly improved according to demand by adding, deleting, or updating rules based on domain expert information or on recency, i.e., recent trends [ 83 ].
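A minimal, illustrative sketch of such an IF-THEN rule base in Python is shown below; the facts and rules are hypothetical and merely demonstrate how antecedents select which consequents fire.

```python
# Minimal sketch of an IF-THEN rule base (illustrative only): each rule
# pairs an antecedent (a predicate over facts) with a consequent.
rules = [
    (lambda f: f["temp_c"] > 38.0,                     "flag: fever"),
    (lambda f: f["temp_c"] > 38.0 and f["cough"],      "flag: possible flu"),
    (lambda f: not f["cough"] and f["temp_c"] <= 38.0, "flag: no action"),
]

def infer(facts):
    """Fire every rule whose antecedent holds for the given facts."""
    return [consequent for antecedent, consequent in rules if antecedent(facts)]

print(infer({"temp_c": 38.6, "cough": True}))
# -> ['flag: fever', 'flag: possible flu']
```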

Previously, the term "rule-based system" was used to describe systems that used rule sets that were handcrafted or created by humans. However, rule-based machine learning approaches, which mainly include classification and association rule learning techniques, can be more effective in terms of automation and intelligence [ 85 ]. Several popular classification techniques, such as decision trees [ 72 ], IntrudTree [ 82 ], BehavDT [ 84 ], Ripple Down Rule learner (RIDOR) [ 101 ], and Repeated Incremental Pruning to Produce Error Reduction (RIPPER) [ 102 ], have the ability to generate rules. Association rules are built by searching data for frequent IF-THEN patterns and using support and confidence values to identify the most important relationships. Common association rule learning techniques such as AIS [ 2 ], Apriori [ 3 ], FP-Tree [ 37 ], RARM [ 18 ], Eclat [ 105 ], ABC-RuleMiner [ 88 ], and others can be used to build a rule-based model from a given data set. Sarker et al. [ 88 ], for example, provide a rule-based machine learning strategy for context-aware intelligent and adaptive mobile services. Borah et al. [ 13 ] propose a method for employing dynamic rare association rule mining to find risk variables for unfavorable illnesses. Using case-based clustering and weighted association rule mining, Bhavithra et al. [ 11 ] offer a personalized web page recommendation approach. Xu et al. [ 103 ] introduce a risk prediction and early warning system for air traffic controllers' risky behaviors utilizing association rule mining and random forest. Thus, we can conclude that rule-based modeling can play a significant role in building AI models, as well as in intelligent decision-making, in various application areas to solve real-world issues.

Fuzzy Logic-Based Approach

Fuzzy logic is a precise logic of imprecision and approximate reasoning [ 104 ]. It is a natural generalization of standard logic in which a concept's degree of truth, also known as its membership value or degree of membership, can range from 0.0 to 1.0. Standard logic only applies to concepts that are either completely true, i.e., degree of truth 1.0, or completely false, i.e., degree of truth 0.0. Fuzzy logic, on the other hand, deals with the concept of partial truth, in which the truth value may range from completely true to completely false, such as 0.9 or 0.5. For instance, "if x is very large, do y; if x is not very large, do z". Here the boundaries of "very large" and "not very large" may overlap, i.e., they are fuzzy. As a result, fuzzy logic-based models can recognize, represent, manipulate, understand, and use data and information that are vague and uncertain [ 104 ]. Figure  7 shows a general architecture of a fuzzy logic system. It typically has the following four parts:

Fig. 7: A general architecture of fuzzy logic systems.

Fuzzification It transforms inputs, i.e., crisp numbers, into fuzzy sets.

Knowledge-base It contains the set of rules and the IF-THEN conditions provided by the experts to govern the decision-making system, based on linguistic information.

Inference engine It determines the matching degree of the current fuzzy input concerning each rule and decides which rules are to be fired according to the input field. Next, the fired rules are combined to form the control actions.

Defuzzification It transforms the fuzzy sets obtained by the inference engine into a crisp value.

Although machine learning models are capable of differentiating between two (or more) object classes based on their ability to learn from data, the fuzzy logic approach is preferred when distinguishing features are vaguely defined and rely on human expertise and knowledge. Thus, the system may work with any type of input data, including imprecise, distorted, or noisy data, as well as with limited data. It is a suitable strategy to use in scenarios with real, continuous-valued elements because it uses data acquired in surroundings with such properties [ 34 ]. Fuzzy logic-based models are used to tackle problems in a variety of fields. Reddy et al. [ 74 ], for example, use a fuzzy logic classifier for heart disease detection, with the derived rules from fuzzy classifiers being optimized using an adaptive genetic algorithm. Krishnan et al. [ 51 ] describe a fuzzy logic-based smart irrigation system using IoT, which sends out periodic acknowledgment messages on task statuses such as soil humidity and temperature. Hamamoto et al. [ 34 ] describe a network anomaly detection method based on fuzzy logic for determining whether or not a given instance is anomalous. Kang et al. [ 44 ] propose a fuzzy weighted association rule mining approach for developing a customer satisfaction product form. Overall, we can infer that fuzzy logic can draw reasonable conclusions in a world of imprecision, uncertainty, and partial data, and thus might be useful in such scenarios while building a model.
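The sketch below gives a minimal, illustrative flavor of the fuzzification-inference-defuzzification pipeline described above, using hand-written triangular membership functions; the linguistic terms, thresholds, and fan-speed rule are assumptions for demonstration.

```python
# Minimal fuzzy-logic sketch (illustrative only): triangular membership
# functions assign a degree of truth in [0, 1] rather than a crisp true/false.
def triangular(x, a, b, c):
    """Membership rises from a to a peak at b, then falls to zero at c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

temp = 27.0
mu_warm = triangular(temp, 15, 25, 35)   # fuzzification: degree of "warm"
mu_hot  = triangular(temp, 25, 35, 45)   # degree of "hot"

# A toy inference step: fan speed as a membership-weighted average of the
# crisp actions tied to each rule (a simple defuzzification; assumes at
# least one membership is non-zero for the given input).
fan_speed = (mu_warm * 40 + mu_hot * 90) / (mu_warm + mu_hot)
print(f"warm={mu_warm:.2f}, hot={mu_hot:.2f}, fan speed={fan_speed:.1f}%")
```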

Knowledge Representation, Uncertainty Reasoning, and Expert System Modeling

Knowledge representation is the study of how an intelligent agent's beliefs, intents, and judgments may be expressed appropriately for automated reasoning, and it has emerged as one of the most promising topics of Artificial Intelligence. Reasoning is the process of using existing knowledge to draw conclusions, make predictions, or construct explanations. Many types of knowledge can be used in various application domains, including descriptive knowledge, structural knowledge, procedural knowledge, meta-knowledge, and heuristic knowledge [ 87 ]. Knowledge representation is more than just storing data in a database; it also allows an intelligent machine to learn from its knowledge and experiences so that it can act intelligently like a human. As a result, in designing an intelligent system, an effective method of knowledge representation is required. Several knowledge representation approaches exist in the field that can be utilized to develop a knowledge-based conceptual model, including logical representations, semantic networks, frames, and production rules [ 95 ]. In the following, we summarize the potential knowledge representation strategies, taking real-world issues into account.

Fig. 8: An example of ontology components for the entity University [ 26 ].

Fig. 9: A general architecture of an expert system.

Ontology-based In general, ontology is "an explicit specification of conceptualization and a formal way to define the semantics of knowledge and data" [ 56 ]. According to [ 56 ], formally, an ontology is represented as "O = {C, R, I, H, A}, where C = {C1, C2, ..., Cn} represents a set of concepts and R = {R1, R2, ..., Rm} represents a set of relations defined over the concepts; I represents a set of instances of concepts; H represents a Directed Acyclic Graph (DAG) defined by the subsumption relation between concepts; and A represents a set of axioms bringing additional constraints on the ontology". Ontology-based knowledge representation and reasoning techniques provide sophisticated knowledge about the environment for processing tasks or methods. Figure  8 shows an example of ontology components for the entity University [ 26 ]. By defining shared and common domain theories, ontologies help people and machines to communicate concisely by supporting semantic knowledge for a particular domain. In the area of semantic data mining, ontology-based approaches to classification, mining with association rules, clustering, link finding, etc. can play a significant role in building smart systems.

Rule-base It typically consists of pairs of conditions and corresponding actions, of the form "IF <condition> THEN <action>" [ 85 ]. As a result, an agent checks the condition first, and if the condition is satisfied, the related rule fires. The key benefit of a rule-based system like this is that the "condition" part can select which rule is appropriate for a given scenario. The "action" portion, on the other hand, is responsible for implementing the problem's solutions. Furthermore, in a rule-based system, we can easily insert, delete, or update rules as needed.

Uncertainty and probabilistic reasoning Probabilistic reasoning is a method of knowledge representation in which the concept of probability is used to signify the uncertainty in knowledge, and where probability theory and logic are combined to address the uncertainty [ 65 ]. Probability is the numerical measure of the possibility of an event occurring, and it can be defined as the chance that an uncertain event will occur. To deal with uncertainty in a model, probabilistic models, fuzzy logic, Bayesian belief networks, etc. can be employed.

A knowledge-based system, such as an expert system for decision-making, relies on these representations of knowledge. The inference engine and the knowledge base are two subsystems of the expert system, as represented in Fig.  9 . The information in the knowledge base is organized according to the knowledge representations discussed above. The inference engine looks for knowledge-base information and linkages and, like a human expert, provides answers, predictions, and recommendations. Such knowledge-based systems can be found in many application areas. For instance, Goel et al. [ 29 ] present an ontology-driven context-aware framework for smart traffic monitoring. Chukkapalli et al. [ 16 ] present ontology-driven AI and access control systems for smart fisheries. Kiran et al. [ 49 ] present an enhanced security-aware technique and ontology data access control in cloud computing. Syed et al. [ 97 ] present a conceptual ontology and cyber intelligence alert system for cybersecurity vulnerability management. An ontology-based cyber security policy implementation in Saudi Arabia has been presented by Talib et al. [ 98 ]. Recently, Sarker et al. [ 90 ] explored expert system modeling for personalized decision-making in mobile apps. Thus, knowledge representation and modeling are important for building AI models, as well as for intelligent decision-making, in various application areas to solve real-world issues.
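As a minimal illustration of the probabilistic reasoning described above, the sketch below applies Bayes' rule to update belief in a hypothesis after observing evidence; the prior and test characteristics are invented numbers.

```python
# Minimal sketch of uncertainty reasoning with Bayes' rule (illustrative
# only): updating belief in a hypothesis after observing positive evidence.
def posterior(prior, sensitivity, false_positive_rate):
    """P(hypothesis | positive evidence) via Bayes' rule."""
    p_evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_evidence

# e.g., a rare condition (1% prior) and a fairly reliable test.
print(f"{posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05):.3f}")
# -> 0.161: even a positive result leaves substantial uncertainty.
```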

Case-Based Reasoning

Case-based reasoning (CBR) is a cognitive science and AI paradigm that represents reasoning as primarily memory-based. CBR is concerned with the "smart" reuse of knowledge from previously solved problems ("cases") and its adaptation to new and unsolved problems. The inference is a problem-solving strategy based on the similarity of the current situation to previously solved problems recorded in a repository. Its premise is that the more similar two problems are, the more similar their solutions will be. Thus, case-based reasoners handle new problems by retrieving previously stored 'cases' that describe similar earlier problem-solving experiences and customizing their solutions to meet new requirements. For example, patient case histories and treatments are utilized in medical education to assist in diagnosing and treating new patients. Figure  10 shows a general architecture of case-based reasoning. CBR research looks at the CBR process as a model of human cognition as well as a method for developing intelligent systems.

Fig. 10: A general architecture of case-based reasoning.

CBR is utilized in a variety of applications. Lamy et al. [ 52 ], for example, provide a visual case-based reasoning strategy for explainable artificial intelligence for breast cancer. Gonzalez et al. [ 30 ] provide a case-based reasoning-based energy optimization technique. Khosravani et al. [ 47 ] offer a case-based reasoning application in a defect detection system for dripper manufacturing. Corrales et al. [ 17 ] provide a case-based reasoning system for data cleaning algorithm recommendation in classification and regression problems. As the number of stored cases grows, CBR becomes more intelligent and thus might be useful in such scenarios while building a model. However, as the time required to find and process relevant cases increases, the system's efficiency will decline.
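The following is a minimal, illustrative retrieve-and-reuse sketch in the spirit of CBR; the case base, features, and "treatments" are hypothetical stand-ins, and a real system would also revise and retain cases.

```python
# Minimal case-based reasoning sketch (illustrative only): retrieve the most
# similar stored case and reuse its solution for a new problem.
import math

case_base = [
    ({"age": 35, "bp": 120}, "treatment A"),
    ({"age": 62, "bp": 155}, "treatment B"),
    ({"age": 48, "bp": 140}, "treatment C"),
]

def distance(a, b):
    """Euclidean distance over shared numeric features (the similarity measure)."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

def retrieve(new_problem):
    """Return the solution of the nearest stored case (retrieve + reuse)."""
    return min(case_base, key=lambda case: distance(case[0], new_problem))[1]

print(retrieve({"age": 60, "bp": 150}))   # -> 'treatment B'
```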

Text Mining and Natural Language Processing

Text mining [ 7 ], also known as text data mining, similar to text analytics, is the process of extracting meaningful information from a variety of text or written resources, such as websites, books, emails, reviews, docs, comments, articles, and so on. Information retrieval, lexical analysis to investigate word frequency distributions, pattern recognition, tagging or annotation, information extraction, and data mining techniques such as link and association analysis, visualization, and predictive analytics are all part of text analysis. Text mining achieves this by employing several analysis techniques, such as natural language processing (NLP). NLP is a text analysis technique that allows machines to interpret human speech. NLP tasks include speech recognition, also known as speech-to-text, word segmentation or tokenization, lemmatization and stemming, part of speech tagging, parsing, word sense disambiguation, named entity recognition, sentiment analysis, topic segmentation and recognition, and natural language generation, which is the task of converting structured data into human language [ 21 ]. Fake news identification, spam detection, machine translation, question answering, social media sentiment analysis, text summarization, virtual agents and chatbots, and other real-world applications use NLP techniques.

Although many language-processing systems were built in the early days using symbolic approaches, such as hand-coding a set of rules and looking them up in a dictionary, NLP now blends computational linguistics with statistical, machine learning, and deep learning models [ 80 , 81 ]. These technologies, when used together, allow computers to process human language in the form of text or speech data and comprehend its full meaning, including the speaker's or writer's intent and sentiment. Much work has been done in this area. For example, using a feature ensemble model, Phan et al. [ 68 ] propose a method for improving the performance of sentiment analysis of tweets with fuzzy sentiment. Using weighted word embeddings and deep neural networks, Onan et al. [ 62 ] provide sentiment analysis of product reviews. Subramaniyaswamy et al. [ 96 ] present sentiment analysis of tweets for estimating event criticality and security. In [ 60 ], the efficacy of social media data in healthcare communication is discussed. Typically, learning techniques rather than static analysis are more effective in terms of automation and intelligence in textual modeling or NLP systems. In addition to standard machine learning algorithms [ 81 ], deep learning models and techniques, particularly those based on convolutional neural networks (CNNs) and recurrent neural networks (RNNs), enable such systems to learn as they go and extract progressively accurate meaning from large amounts of unstructured, unlabeled text and speech input. Thus, various deep learning techniques, including generative and discriminative models, can be used to build powerful textual or NLP models according to their learning capabilities, as discussed briefly in our earlier paper Sarker et al. [ 80 ]; this could also be a significant research direction in the area. Overall, we can conclude that by combining machine and deep learning techniques with natural language processing, computers can intelligently analyze, understand, and infer meaning from human speech or text, and thus these techniques could be useful for building textual AI models.
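As a minimal illustration of learning-based text analysis, the sketch below trains a bag-of-words sentiment classifier with scikit-learn; the four-document corpus is an invented stand-in for real review data.

```python
# Minimal text-classification sketch (illustrative only): a bag-of-words
# sentiment model built from a tiny, invented training corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts  = ["great product, loved it", "terrible service, very slow",
          "excellent quality", "awful experience, would not recommend"]
labels = ["positive", "negative", "positive", "negative"]

# Vectorize the text into word counts, then fit a linear classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["slow delivery but excellent product"]))
```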

Visual Analytics, Computer Vision and Pattern Recognition

Computer vision [ 99 ] is also a branch of AI that allows computers and systems to extract useful information from digital images, videos, and other visual inputs and act or make recommendations based on that data. From an engineering standpoint, it aims to comprehend and automate operations that the human visual system is capable of. As a result, this is concerned with the automated extraction, analysis, and comprehension of relevant information from a single image or a series of images. In terms of technology, it entails the creation of a theoretical and algorithmic foundation for achieving autonomous visual understanding by processing an image at the pixel level. Typical tasks in the field of visual analytics and computer vision include object recognition or classification, detection, tracking, picture restoration, feature matching, image segmentation, scene reconstruction, video motion analysis, and so on.

Fig. 11: A general architecture of a convolutional neural network (CNN or ConvNet).

Pattern recognition, the automated recognition of patterns and regularities in data, is the basis for today's computer vision algorithms. Pattern recognition often involves the categorization (supervised learning) and grouping (unsupervised learning) of patterns [ 81 ]. Although pattern recognition has its roots in statistics and engineering, due to the greater availability of huge data and a new wealth of processing power, some recent approaches to pattern recognition involve machine and deep learning. Convolutional neural networks (CNN or ConvNet) [ 53 , 80 ] have recently demonstrated considerable promise in a variety of computer vision tasks, including classification, object detection, and scene analysis. The general architecture of a convolutional neural network is depicted in Fig.  11 . Large datasets of thousands or millions of labeled training samples are typically used to train these algorithms. However, the lack of appropriate data limits the applications that can be developed: while enormous volumes of data can be obtained quickly, supervised learning also necessitates data that has been labeled, and data labeling takes a long time and costs a lot of money. A lot of work has been done in this area. Elakkiya et al. [ 25 ] develop a cervical cancer diagnostics healthcare system utilizing hybrid object detection adversarial networks. Harrou et al. [ 38 ] present an integrated vision-based technique for detecting human falls in a residential setting. Pan et al. [ 63 ] demonstrate visual recognition based on deep learning for navigation mark classification. Typically, learning techniques rather than static analysis are more effective in terms of automation and intelligence in such visual analytics. In addition to standard machine learning algorithms [ 81 ], various deep learning techniques, including generative and discriminative models, can be used to build powerful visual models according to their learning capabilities, as discussed briefly in our earlier paper Sarker et al. [ 80 ]; this could also be a significant research direction in the area. Thus, building effective visual AI models is important for solving real-world issues in various application areas in the current age of the Fourth Industrial Revolution, or Industry 4.0, according to the goal of this paper.
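To make the convolution-pooling-fully-connected pattern of Fig. 11 concrete, the following is a minimal PyTorch sketch; the input size (1 x 28 x 28 grayscale images) and layer widths are illustrative assumptions.

```python
# Minimal PyTorch sketch of a small CNN for image classification,
# mirroring the convolution -> pooling -> fully-connected pattern.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                      # 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                      # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),            # 10-class logits
)

images = torch.randn(8, 1, 28, 28)        # a synthetic batch of images
print(cnn(images).shape)                  # -> torch.Size([8, 10])
```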

Hybrid Approach, Searching, and Optimization

A "hybrid approach" is a blend of multiple approaches or systems designed to produce a new and superior model. As a result, a hybrid strategy integrates the necessary approaches outlined above depending on the demands. For instance, in our earlier publication, Sarker et al. [ 85 ], we used a hybridization of machine learning and a knowledge-based expert system to build an effective context-aware model for intelligent mobile services. In this hybrid context-aware model, context-aware rules are discovered using machine learning techniques and are used as the knowledge base of an expert system rather than traditional handcrafted static rules, making computing and decision-making processes more actionable and intelligent. Similarly, in another hybrid approach [ 68 ], the concepts of fuzzy logic, deep learning, and natural language processing were integrated to improve Twitter sentiment analysis accuracy. The authors in [ 33 ] present a deep convolutional neural network-based automated and robust object recognition approach for X-ray baggage inspection, where deep learning is integrated with computer vision analysis. Kang et al. [ 44 ] propose a fuzzy weighted association rule mining strategy to produce a customer satisfaction product form. Moreover, Sarker et al. discuss various machine learning [ 81 ] and deep learning [ 80 ] techniques and their hybridization that can be used to solve a variety of real-world problems in many application areas such as business, finance, healthcare, smart cities, cybersecurity, etc. Thus, the hybridization of multiple techniques can play a key role in building an effective AI model in the area.

Moreover, many AI problems can be solved, in principle, by searching through a large number of possible solutions, so that the reasoning process reduces to a search. Thus, search strategies, also known as universal problem-solving approaches in AI, can play a significant role in solving real-world issues, such as gaming or ranking web pages, videos, and other content in search results, thanks to properties such as completeness, optimality, time complexity, and space complexity. Depending on the nature of the problem, search algorithms can be uninformed (a.k.a. blind or brute-force) or informed (a.k.a. heuristic). Uninformed search [ 75 ] refers to a group of general-purpose search algorithms that generate search trees without relying on domain information, such as breadth-first search, depth-first search, uniform cost search, etc. Informed search [ 75 ] algorithms, on the other hand, use additional or problem-specific knowledge in the search process, such as greedy search, A* search, graph search, etc. For example, when searching on Google Maps, one provides information such as the current location so that the system can precisely determine the distance, travel time, and real-time traffic updates on a specific route. Informed search can solve a variety of complicated problems that cannot be handled any other way. Furthermore, evolutionary computation employs optimization-based search techniques, such as genetic algorithms, which have great potential to solve real-world issues. For instance, in the domain of cybersecurity, a genetic algorithm is used for effective feature selection to detect anomalies in a fog computing environment [ 61 ]. In [ 28 ], a genetic algorithm is used for optimized feature selection to detect Android malware using machine learning techniques. With AI-powered search, the platform learns from the data to provide the most accurate and relevant search results automatically. Thus, searching and optimization techniques can be used as part of a hybridization while building AI models to solve real-world problems.
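As a minimal illustration of uninformed search, the sketch below runs breadth-first search over a small hand-made graph, returning a path with the fewest edges; the graph itself is an invented stand-in for a real problem space.

```python
# Minimal sketch of uninformed (breadth-first) search over a small graph
# (illustrative only): BFS explores level by level, so the first path
# that reaches the goal has the fewest edges.
from collections import deque

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": []}

def bfs_path(start, goal):
    """Return a shortest path (fewest edges) from start to goal, or None."""
    frontier, visited = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for neighbour in graph[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                frontier.append(path + [neighbour])
    return None

print(bfs_path("A", "F"))   # -> ['A', 'B', 'D', 'F']
```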

Overall, we can conclude that the above explored ten potential AI techniques can play a significant role while building various AI models such as analytical, functional, interactive, textual, and visual models, depending on the nature of the problem and target application. In the next section, we summarize various real-world application areas, where these AI techniques are employed in today’s interconnected world towards automation, intelligent and smart systems.

Fig. 12: Several potential real-world application areas of artificial intelligence (AI).

Real-World Applications of AI

AI approaches have been effectively applied to a wide variety of issues across many application areas over the last several years. Healthcare, cybersecurity, business, social media, virtual reality and assistance, robotics, and many other application areas are common nowadays. We have outlined some potential real-world AI application areas in Fig.  12 . Various AI techniques, such as machine learning, deep learning, knowledge discovery, reasoning, natural language processing, expert system modeling, and many others, as detailed above in " Potential AI techniques ", are used in these application domains. We have also listed, in Table  3 , several AI tasks and techniques that are utilized in several real-world application areas. Overall, we can conclude from Fig.  12 and Table  3 that the future prospects of AI modeling in real-world application domains are vast and that there are many opportunities for work and research. In the following section, we discuss the future aspects of AI as well as research issues towards automation and intelligent and smart systems.

Future Aspect and Research Issues

Artificial intelligence is influencing the future of almost every sector and every person on the planet. AI has acted as the driving force behind developing technologies for industrial automation, medical applications, agriculture, IoT applications, cybersecurity services, etc., summarized in " Real-World Applications of AI ", and it will continue to do so for the foreseeable future. This interdisciplinary science comes with numerous advancements and approaches that are possible with the help of deep learning, machine learning algorithms, knowledge-based expert systems, natural language processing, visual recognition, etc., discussed briefly in " Potential AI techniques ". Thus, taking into account the capabilities of AI technologies, we illustrate the three essential terms mentioned in the "Introduction" within the scope of our study. These are:

Automation One of the main themes of today's applications is automation, which encompasses a wide range of technologies that reduce human interaction in operations. A program, a script, or batch processing is commonly used in computing to automate tasks. AI-based automation takes the insights gained through computational analytics to the next level, allowing for automated decision-making. As a result, we can describe automation as the development and implementation of technology to manufacture and deliver products and services, increasing the efficiency, dependability, and/or speed of various jobs traditionally handled by humans. In customer service, for example, virtual assistants can lower expenses while empowering both customers and human agents, resulting in a better customer experience. Artificial intelligence technology has the potential to bring automation to almost every industry and to affect every person on the planet.

Intelligent computing It is also known as computational intelligence, and it refers to a computer's or system's ability to extract insights or usable knowledge from data or experimental observation, or to learn a specific task. Intelligent computing methodologies include information processing, data mining, and knowledge discovery, as well as machine learning, pattern recognition, signal processing, natural language processing, fuzzy systems, knowledge representation, and reasoning. Transportation, industry, health, agriculture, business, finance, security, and other fields could all benefit from intelligent systems. Thus, the AI techniques discussed in “Potential AI techniques” are the main drivers for performing intelligent computing as well as decision-making.

Smart computing The word “Smart” can be described as self-monitoring, analyzing, and reporting technology in smart computing, and the word “Computing” can be defined as computational analysis. As a result, it can be thought of as the next generation of computing, used to create something self-aware: something that can sense the activities of its environment, process the gathered data, perform analytics, and provide the best decisions while also predicting future risks and challenges. In other words, it is a significant multidisciplinary area in which AI-based computational methods and technologies, as explained in “Potential AI techniques”, are integrated with engineering approaches to produce systems, applications, and new services that suit societal demands. Overall, it strives to construct a smart system by monitoring, analyzing, and reporting data in a faster and smarter manner, with AI-based modeling playing a vital part in system intelligence and decision-making.

The above terms are also the key focus of the current fourth industrial revolution (Industry 4.0). Business, health care, energy, transportation systems, the environment, security, surveillance, industrial systems, information retrieval and publication, entertainment and creativity, and social activities can all benefit from automation, intelligence, and smart computing systems. For example, chatbots, consumer personalization, image-based targeted advertising, and warehouse and inventory automation are all examples of how AI will continue to drive e-commerce. The potential benefits of using AI in medicine are now being investigated; the medical industry has a wealth of data that may be used to develop healthcare-related predictive models. Manufacturing, notably the automobile industry, will be significantly impacted by AI, and AI will have an impact on sales operations in a range of industries. Marketing tactics, such as business models, sales procedures, and customer service options, as well as customer behavior, are predicted to be significantly influenced by AI. AI and machine learning will be key technologies in cybersecurity for identifying and forecasting threats [ 77 , 89 ]. AI will be a vital tool for financial security because of its ability to analyze large amounts of data, foresee fraud, and identify it. In the near future, interacting with AI will surely become commonplace. Artificial intelligence can be used to solve incredibly difficult problems and find solutions that are vital to human well-being, developments with enormous economic and societal implications. Thus, we can say that AI's potential is limitless and that its future will be shaped by our decisions and actions. Our discussion so far has established a solid foundation on AI-based systems and applications; we therefore outline the following ten research issues.

Several potential AI techniques with the capability of solving real-world problems exist in the area, as discussed in “Potential AI techniques”. Understanding the nature of the problem through an in-depth analysis is important for finding a suitable solution, e.g., for detecting cyber-anomalies or attacks [ 78 ]. Thus, the challenge is: “Which AI technique is most suited to solving a specific real-world problem, taking into account the problem's nature?”

One promising research direction for AI-based solutions is to develop a general framework that can handle the issues involved. Designing such a framework well and evaluating it experimentally are both a crucial direction and a significant challenge. Thus, the question is: “How can we design an effective AI-based framework to achieve the target outcome by taking into account the issues involved?”

The digital world contains a wealth of data in this age of the Fourth Industrial Revolution (Industry 4.0 or 4IR), including IoT data, corporate data, health data, cellular data, urban data, cybersecurity data, and many more [ 79 ]. Extracting insights using various analytical methods is important for smart decision-making in a particular system. Thus, the question is: “How to extract useful insights or knowledge from real-world raw data to build an automated and intelligent system for a particular business problem?”

Nowadays, data are considered the most valuable resource in the world, and various machine learning [ 81 ] and deep learning [ 80 ] techniques are used to learn from data or past experience, which automates analytical model building. The growth of data and of such data-driven analytical modeling has driven the fastest expansion of AI in its history. It is therefore important to perform data pre-processing before feeding data into the ultimate machine learning model, so that the data behaves well for the model. The question is: “How to effectively feed data to a machine or deep learning model to solve a particular real-world problem?”
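As a minimal sketch of such pre-processing, assuming the scikit-learn library (Pedregosa et al., cited below) and a small hypothetical customer table, the pipeline below imputes missing values, scales numeric columns, and one-hot encodes a categorical column before the data ever reaches the classifier; bundling these steps into one pipeline also keeps the fitted transforms consistent between training and prediction.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical raw data with missing values and mixed feature types.
df = pd.DataFrame({
    "age": [25, 32, None, 51],
    "income": [30000, 54000, 41000, None],
    "device": ["mobile", "desktop", "mobile", "tablet"],
    "churned": [0, 1, 0, 1],
})
numeric, categorical = ["age", "income"], ["device"]

preprocess = ColumnTransformer([
    # Impute then scale numeric columns so they behave well for the model.
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    # Encode the categorical column as one-hot indicator features.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])
model.fit(df[numeric + categorical], df["churned"])
print(model.predict(df[numeric + categorical]))
```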

The traditional machine learning [ 81 ] and deep learning [ 80 ] techniques may not be directly applicable for the expected outcome in many cases. Thus, designing new techniques or variants of existing ones, taking into account model optimization, accuracy, and applicability according to the nature of the data and the target real-world application, could be a novel contribution in the area. Therefore, the question is: “How to design an effective learning algorithm or model allowing the application to learn automatically from the patterns or features in the data?”
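For intuition about what such an algorithm does internally, here is a deliberately simple learner written from scratch: a perceptron that adjusts its weights whenever it misclassifies a training example. It is a pedagogical baseline on a hypothetical toy dataset, not the kind of novel model variant the question above calls for.

```python
import random

def train_perceptron(samples, labels, epochs=20, lr=0.1, seed=0):
    """Learn weights w and bias b so that sign(w.x + b) matches labels in {+1, -1}."""
    rng = random.Random(seed)
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)  # stochastic pass over the training data
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, samples[i])) + b
            pred = 1 if score > 0 else -1
            if pred != labels[i]:  # update weights only on mistakes
                w = [wj + lr * labels[i] * xj for wj, xj in zip(w, samples[i])]
                b += lr * labels[i]
    return w, b

# Hypothetical, linearly separable data: label +1 iff x0 + x1 > 1.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [-1, -1, -1, 1]
print(train_perceptron(X, y))  # learned weights and bias
```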

In the domain of today's smart computing, the term “context-awareness” typically refers to a system's capacity to gather information about its surroundings at any given time and adapt its behavior accordingly. The concept of context-aware machine learning can therefore play a key role in building an intelligent context-aware application, as highlighted in our book Sarker et al. [ 85 ]. Thus, the question is: “How to effectively incorporate context-awareness in an AI-based smart system that can sense the surrounding environment and make intelligent decisions accordingly?”

Decision rules, represented as IF-THEN statements, can play an important role in the area of AI. Expert systems, a core part of AI, are typically used to solve many complex real-world problems by reasoning through knowledge, which is mostly represented by such IF-THEN rules rather than by traditional procedural code [ 85 ]. A rule-based system can thus manipulate knowledge and interpret information in a useful way. Therefore, the question is: “How can we design an automated rule-based system emulating the decision-making ability of a human expert through discovering a concise set of IF-THEN rules from the data?”
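One common way to obtain such a rule set, sketched below with scikit-learn on the standard Iris data, is to fit a shallow decision tree and read each root-to-leaf path as an IF-THEN rule; this is an illustration of the general idea, not the specific rule-mining method of [ 85 ].

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
# Keeping the tree shallow keeps the discovered rule set small and readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Each root-to-leaf path printed here is an IF-THEN rule over the features.
print(export_text(tree, feature_names=list(iris.feature_names)))
```

Each printed path reads as a rule of the form "IF feature <= threshold AND ... THEN class", which a domain expert can inspect, accept, or reject; this is exactly the kind of interpretable knowledge a rule-based system manipulates.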

A decision support system is a type of information system that aids the decision-making process of a business or organization. The AI techniques discussed in “Potential AI techniques” can play a key role in providing intelligent decisions across a wide range of sectors (e.g., business, education, healthcare, etc.), going beyond traditional systems, depending on the nature of the problem. Thus, the challenge is: “How can we design an AI-assisted decision-support system that aids a team or organization in making better decisions?”

Uncertainty refers to a lack of confidence or certainty about an event, e.g., when information comes from unreliable sources. Several strategies, such as probability-based models or fuzzy logic, discussed in “Potential AI techniques”, allow for the processing of uncertain and imprecise knowledge while also providing a sophisticated reasoning framework. The ability of AI to identify and handle uncertainty and risk is essential for applying AI to decision-making challenges. Thus, the question is: “How to manage uncertainty in AI-enabled decision-making applications?”
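As a minimal illustration of the probability-based approach, the snippet below applies Bayes' rule to weigh an alert from an unreliable detector; every probability in it is a hypothetical number chosen for the example, not a measured rate.

```python
def bayes_update(prior, p_alert_given_event, p_alert_given_no_event):
    """Posterior P(event | alert) via Bayes' rule."""
    evidence = (p_alert_given_event * prior
                + p_alert_given_no_event * (1 - prior))
    return p_alert_given_event * prior / evidence

# Hypothetical setting: a 1% base rate of intrusion, and a detector that
# flags 90% of real intrusions but also 5% of normal traffic.
posterior = bayes_update(prior=0.01,
                         p_alert_given_event=0.90,
                         p_alert_given_no_event=0.05)
print(round(posterior, 3))  # ~0.154: a single alert is only weak evidence
```

Even a fairly accurate detector yields a modest posterior when the event is rare, which is precisely the kind of calibrated reasoning under uncertainty that naive rule firing would miss.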

With the widespread availability of various IoT services, Internet of Things (IoT) devices are becoming more common in mobile networks. It is essential nowadays to have lightweight solutions that deliver high-performing artificial intelligence applications on mobile and IoT devices. Thus, the question is: “How to design AI-enabled lightweight models for intelligent decision-making on IoT and mobile devices?”

To summarize, AI is a relatively open topic to which academics can contribute by inventing new methods or refining existing ones to address the issues raised above and solve real-world problems in a range of application areas. AI will be employed in any context where large amounts of data need to be handled quickly and accurately and where cost savings are required. AI will affect the planet more than anything else in human history. Importantly, AI-powered automation need not pose a threat to jobs for individuals, businesses, or countries with the appropriate skills, and AI-certified professionals have access to a wide range of job prospects: AI Engineer, Artificial Intelligence Programmer, AI System Developer, Data Scientist, Machine Learning Engineer, Data Analyst, AI Architect, Deep Learning Engineer, AI Software Engineer, and many other roles.

Overall, AI technologies are driving a new wave of economic progress, resolving some of the world's most challenging issues and delivering solutions to some of humanity's most significant challenges. Many industries, including information technology, telecommunications, transportation, traffic management, health care, education, criminal justice, defense, banking, and agriculture, have the potential to be transformed by artificial intelligence. Without compromising the significant characteristics that define mankind, we can ensure that AI systems are deliberate, intelligent, and flexible, with adequate security. Governments and decision-makers need to focus on public policies that promote AI innovation while minimizing unexpected societal consequences, in order to realize its full potential in real-world scenarios.

Concluding Remarks

In this article, we have provided a comprehensive view of AI-based modeling, which is considered a key component of the fourth industrial revolution (Industry 4.0). It begins with the research motivation and proceeds to AI techniques and breakthroughs in many application domains; the important techniques in the area are then explored along numerous dimensions. In this thorough analysis, we take into account ten categories of popular AI techniques, including machine learning, deep learning, natural language processing, knowledge discovery, expert system modeling, etc., which can be applied in a variety of applications depending on current demands. In terms of machine intelligence, complex learning algorithms should be trained using data and knowledge from the target application before the system can help with intelligent decision-making.

Overall, AI techniques have proven to be beneficial in a variety of applications and research fields, including business intelligence, finance, healthcare, visual recognition, smart cities, IoT, cybersecurity, and many more, as explored in the paper. Finally, we explored the future aspects of AI towards automation, intelligence, and smart computing systems, highlighting several research issues within the scope of our study. This can also aid researchers in conducting more in-depth analyses, resulting in more reliable and realistic outcomes. Overall, we feel that our study and discussion on AI-based modeling point in the right direction and can be used as a reference guide for future research and development in relevant application domains by academics as well as industry professionals.

Afzaliseresht N, Miao Y, Michalska S, Liu Q, Wang H. From logs to stories: human-centred data mining for cyber threat intelligence. IEEE Access. 2020;8:19089–99.

Agrawal R, Imieliński T, Swami A. Mining association rules between sets of items in large databases. In: ACM SIGMOD record, ACM; vol 22, pp. 207–216 (1993).

Agrawal R, Srikant R. Fast algorithms for mining association rules. In: Proceedings of the international joint conference on very large data bases, Santiago Chile, vol 1215, pp. 487–499 (1994).

Aha DW, Kibler D, Albert MK. Instance-based learning algorithms. Mach Learn. 1991;6(1):37–66.

Alazab A, Bevinakoppa S, Khraisat A. Maximising competitive advantage on e-business websites: a data mining approach. In: 2018 IEEE conference on big data and analytics (ICBDA), IEEE; 2018. p. 111–116.

Ale L, Sheta A, Li L, Wang Y, Zhang N. Deep learning based plant disease detection for smart agriculture. In: 2019 IEEE globecom workshops (GC Wkshps), IEEE; 2019. p. 1–6.

Allahyari M, Pouriyeh S, Assefi M, Safaei S, Trippe ED, Gutierrez JB, Kochut K. A brief survey of text mining: classification, clustering and extraction techniques. arXiv:1707.02919 (arXiv preprint), 2017.

Anuradha J, et al. Big data based stock trend prediction using deep cnn with reinforcement-lstm model. Int J Syst Assur Eng Manage. 2021;2:1–11.

Aslan MF, Unlersen MF, Sabanci K, Durdu A. Cnn-based transfer learning-bilstm network: a novel approach for covid-19 infection detection. Appl Soft Comput. 2021;98:106912.

Bellman R. A markovian decision process. J Math Mech. 1957;2:679–84.

Bhavithra J, Saradha A. Personalized web page recommendation using case-based clustering and weighted association rule mining. Cluster Comput. 2019;22(3):6991–7002.

Blumenstock J. Machine learning can help get covid-19 aid to those who need it most. Nature. 2020;20:20.

Borah A, Nath B. Identifying risk factors for adverse diseases using dynamic rare association rule mining. Expert Syst Appl. 2018;113:233–63.

Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.

Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and regression trees. New York: CRC Press; 1984.

Chukkapalli SSL, Aziz SB, Alotaibi N, Mittal S, Gupta M, Abdelsalam M. Ontology driven ai and access control systems for smart fisheries. In: Proceedings of the 2021 ACM workshop on secure and trustworthy cyber-physical systems, 2021. p. 59–68.

Corrales DC, Ledezma A, Corrales JC. A case-based reasoning system for recommendation of data cleaning algorithms in classification and regression tasks. Appl Soft Comput. 2020;90:106180.

Das A, Ng W-K, Woon Y-K. Rapid association rule mining. In: Proceedings of the tenth international conference on Information and knowledge management, ACM; 2001. p. 474–481.

Da’u A, Salim N. Recommendation system based on deep learning methods: a systematic review and new directions. Artif Intell Rev. 2020;53(4):2709–48.

Deng L. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Trans Signal Inf Process. 2014;3:20.

Deng L, Liu Y. Deep learning in natural language processing. Berlin: Springer; 2018.

Dhyani M, Kumar R. An intelligent chatbot using deep learning with bidirectional rnn and attention model. Mater Today Proc. 2021;34:817–24.

Dua S, Du X. Data mining and machine learning in cybersecurity. 2016.

Dupond S. A thorough review on the current advance of neural network structures. Annu Rev Control. 2019;14:200–30.

Elakkiya R, Subramaniyaswamy V, Vijayakumar V, Mahanti A. Cervical cancer diagnostics healthcare system using hybrid object detection adversarial networks. IEEE J Biomed Health Inform. 2021;20:20.

Elliman D, Pulido JRG. Visualizing ontology components through self-organizing maps. In: Proceedings sixth international conference on information visualisation, IEEE; 2002. p. 434–438.

Ester M, Kriegel H-P, Sander J, Xu X. A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD. 1996;96:226–31.

Fatima A, Maurya R, Dutta MK, Burget R, Masek J. Android malware detection using genetic algorithm based optimized feature selection and machine learning. In: 2019 42nd international conference on telecommunications and signal processing (TSP), IEEE; 2019. p. 220–223.

Goel D, Pahal N, Jain P, Chaudhury S. An ontology-driven context aware framework for smart traffic monitoring. In: 2017 IEEE region 10 symposium (TENSYMP), IEEE; 2017. p. 1–5.

González-Briones A, Prieto J, De La Prieta F, Herrera-Viedma E, Corchado JM. Energy optimization using a case-based reasoning strategy. Sensors. 2018;18(3):865.

Goodfellow I, Bengio Y, Courville A, Bengio Y. Deep learning, vol. 1. Cambridge: MIT press; 2016.

Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. In: Advances in neural information processing systems, 2014. p. 2672–2680.

Gu B, Ge R, Chen Y, Luo L, Coatrieux G. Automatic and robust object detection in x-ray baggage inspection using deep convolutional neural networks. IEEE Trans Ind Electron. 2020;20:20.

Hamamoto AH, Carvalho LF, Sampaio LDH, Abrão T, Proença ML Jr. Network anomaly detection system using genetic algorithm and fuzzy logic. Expert Syst Appl. 2018;92:390–402.

Hamed M, Mahmoud T, Gómez JM, Kfouri G. Using data mining and business intelligence to develop decision support systems in Arabic higher education institutions. In: Modernizing academic teaching and research in business and economics. Berlin: Springer; 2017. p. 71–84.

Han J, Pei J, Kamber M. Data mining: concepts and techniques. Amsterdam: Elsevier; 2011.

Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation. In: ACM Sigmod Record, vol 29, ACM; 2000. p. 1–12.

Harrou F, Zerrouki N, Sun Y, Houacine A. An integrated vision-based approach for efficient human fall detection in a home environment. IEEE Access. 2019;7:114966–74.

Hinton GE. Deep belief networks. Scholarpedia. 2009;4(5):5947.

Hotelling H. Analysis of a complex of statistical variables into principal components. J Educ Psychol. 1933;24(6):417.

Islam MZ, Islam MM, Asraf A. A combined deep cnn-lstm network for the detection of novel coronavirus (covid-19) using x-ray images. Inform Med Unlocked. 2020;20:100412.

John GH, Langley P. Estimating continuous distributions in bayesian classifiers. In: Proceedings of the Eleventh conference on Uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc.; 1995. p. 338–345.

Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: a survey. J Artif Intell Res. 1996;4:237–85.

Kang X, Porter CS, Bohemia E. Using the fuzzy weighted association rule mining approach to develop a customer satisfaction product form. J Intell Fuzzy Syst. 2020;38(4):4343–57.

Kaufman L, Rousseeuw PJ. Finding groups in data: an introduction to cluster analysis, vol. 344. New York: Wiley; 2009.

Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK. Improvements to platt’s smo algorithm for svm classifier design. Neural Comput. 2001;13(3):637–49.

Khosravani MR, Nasiri S, Weinberg K. Application of case-based reasoning in a fault detection system on production of drippers. Appl Soft Comput. 2019;75:227–32.

Kim J-Y, Bu S-J, Cho S-B. Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders. Inf Sci. 2018;460:83–102.

Kiran GM, Nalini N. Enhanced security-aware technique and ontology data access control in cloud computing. Int J Commun Syst. 2020;33(15):e4554.

Kohonen T. The self-organizing map. Proc IEEE. 1990;78(9):1464–80.

Krishnan RS, Julie EG, Robinson YH, Raja S, Kumar R, Thong PH, et al. Fuzzy logic based smart irrigation system using internet of things. J Clean Prod. 2020;252:119902.

Lamy J-B, Sekar B, Guezennec G, Bouaud J, Séroussi B. Explainable artificial intelligence for breast cancer: a visual case-based reasoning approach. Artif Intell Med. 2019;94:42–53.

LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE. 1998;86(11):2278–324.

Li T-HS, Kuo P-H, Tsai T-N, Luan P-C. Cnn and lstm based facial expression analysis model for a humanoid robot. IEEE Access. 2019;7:93998–4011.

MacQueen J, et al. Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1. Oakland, CA, USA; 1967. p. 281–297.

Maedche A, Staab S. Ontology learning for the semantic web. IEEE Intell Syst. 2001;16(2):72–9.

Mandic D, Chambers J. Recurrent neural networks for prediction: learning algorithms, architectures and stability. New York: Wiley; 2001.

Marlin B, Swersky K, Chen B, Freitas N. Inductive principles for restricted boltzmann machine learning. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics, p. 509–516. JMLR Workshop and Conference Proceedings, 2010.

Maynard AD. Navigating the fourth industrial revolution. Nat Nanotechnol. 2015;10(12):1005–6.

Nawaz MS, Bilal M, Lali MIU, Mustafa RU, Aslam W, Jajja S. Effectiveness of social media data in healthcare communication. J Med Imaging Health Inform. 2017;7(6):1365–71.

Onah JO, Abdullahi M, Hassan IH, Al-Ghusham A, et al. Genetic algorithm based feature selection and naïve bayes for anomaly detection in fog computing environment. Mach Learn Appl. 2021;6:100156.

Onan A. Sentiment analysis on product reviews based on weighted word embeddings and deep neural networks. Concurr Comput Pract Exp. 2020;20:e5909.

Pan M, Liu Y, Cao J, Li Y, Li C, Chen C-H. Visual recognition based on deep learning for navigation mark classification. IEEE Access. 2020;8:32767–75.

Park H-S, Jun C-H. A simple and fast algorithm for k-medoids clustering. Expert Syst Appl. 2009;36(2):3336–41.

Pearl J. Probabilistic reasoning in intelligent systems: networks of plausible inference. Burlington: Morgan kaufmann; 1988.

Pearson K. Liii. on lines and planes of closest fit to systems of points in space. Lond Edinburgh Dublin Philos Mag J Sci. 1901;2(11):559–72.

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, et al. Scikit-learn: machine learning in python. J Mach Learn Res. 2011;12:2825–30.

Phan HT, Tran VC, Nguyen NT, Hwang D. Improving the performance of sentiment analysis of tweets containing fuzzy sentiment using the feature ensemble model. IEEE Access. 2020;8:14630–41.

Piccialli F, Giampaolo F, Prezioso E, Crisci D, Cuomo S. Predictive analytics for smart parking: a deep learning approach in forecasting of iot data. ACM Trans Internet Technol. 2021;21(3):1–21.

Poort J, Omrani PS, Vecchia AL, Visser G, Janzen M, Koenes J. An automated diagnostic analytics workflow for the detection of production events-application to mature gas fields. In: Abu Dhabi international petroleum exhibition and conference. OnePetro; 2020.

Quinlan JR. Induction of decision trees. Mach Learn. 1986;1(1):81–106.

Quinlan JR. C4.5: programs for machine learning. San Mateo: Morgan Kaufmann; 1993.

Ramzan B, Bajwa IS, Jamil N, Amin RU, Ramzan S, Mirza F, Sarwar N. An intelligent data analysis for recommendation systems using machine learning. Sci Programm. 2019;20:19.

Reddy GT, Reddy MPK, Lakshmanna K, Rajput DS, Kaluri R, Srivastava G. Hybrid genetic algorithm and a fuzzy logic classifier for heart disease diagnosis. Evol Intel. 2020;13(2):185–96.

Russell S, Norvig P. Artificial intelligence: a modern approach. 4th, global ed. Pearson; 2021.

Saharan S, Kumar N, Bawa S. An efficient smart parking pricing system for smart city environment: a machine-learning based approach. Future Gener Comput Syst. 2020;106:622–40.

Sarker IH. Ai-driven cybersecurity: an overview, security intelligence modeling and research directions. SN Comput Sci. 2021;20:21.

Sarker IH. Cyberlearning: effectiveness analysis of machine learning security modeling to detect cyber-anomalies and multi-attacks. Internet Things. 2021;100:393.

Sarker IH. Data science and analytics: an overview from data-driven smart computing, decision-making and applications perspective. SN Comput Sci. 2021;20:21.

Sarker IH. Deep learning: a comprehensive overview on techniques, taxonomy, applications and research directions. SN Comput Sci. 2021;2(6):1–20.

Sarker IH. Machine learning: algorithms, real-world applications and research directions. SN Comput Sci. 2021;2(3):1–21.

Sarker IH, Abushark YB, Alsolami F, Khan AI. Intrudtree: a machine learning based cyber security intrusion detection model. Symmetry. 2020;12(5):754.

Sarker IH, Colman A, Han J. Recencyminer: mining recency-based personalized behavior from contextual smartphone data. J Big Data. 2019;6(1):1–21.

Sarker IH, Colman A, Han J, Khan AI, Abushark YB, Salah K. Behavdt: a behavioral decision tree learning to build user-centric context-aware predictive model. Mob Netw Appl. 2019;20:1–11.

Sarker IH, Colman A, Han J, Watters P. Context-aware machine learning and mobile data analytics: automated rule-based services with intelligent decision-making. Berlin: Springer; 2022.

Sarker IH, Colman A, Kabir MA, Han J. Individualized time-series segmentation for mining mobile phone user behavior. Comput J. 2018;61(3):349–68.

Sarker IH, Hoque MM, Uddin MK, Alsanoosy T. Mobile data science and intelligent apps: concepts, AI-based modeling and research directions. Mobile Netw Appl. 2020;20:1–19.

Sarker IH, Kayes ASM. ABC-RuleMiner: user behavioral rule-based machine learning method for context-aware intelligent services. J Netw Comput Appl. 2020;10:2762.

Sarker IH, Kayes ASM, Badsha S, Alqahtani H, Watters P, Ng A. Cybersecurity data science: an overview from machine learning perspective. J Big Data. 2020;7(1):1–29.

Sarker IH, Khan AI, Abushark YB, Alsolami F. Mobile expert system: exploring context-aware machine learning rules for personalized decision-making in mobile applications. Symmetry. 2021;13(10):1975.

Ślusarczyk B. Industry 4.0: are we ready? Pol J Manage Stud. 2018;17:20.

Sneath PHA. The application of computers to taxonomy. J Gener Microbiol. 1957;17:1.

Sorensen T. A method of establishing groups of equal amplitude in plant sociology based on similarity of species. Biol Skr. 1948;5:20.

Srinivas S, Ravindran AR. Optimizing outpatient appointment system using machine learning algorithms and scheduling rules: a prescriptive analytics framework. Expert Syst Appl. 2018;102:245–61.

Grimm S, Hitzler P, Abecker A. Knowledge representation and ontologies. In: Semantic web services: concepts, technologies, applications. 2007. p. 51–105.

Subramaniyaswamy V, Logesh R, Abejith M, Umasankar S, Umamakeswari A. Sentiment analysis of tweets for estimating criticality and security of events. In: Improving the safety and efficiency of emergency services: emerging tools and technologies for first responders. IGI global; 2020. p. 293–319.

Syed R. Cybersecurity vulnerability management: a conceptual ontology and cyber intelligence alert system. Inform Manage. 2020;57(6):103334.

Talib AM, Alomary FO, Alwadi HF, Albusayli RR, et al. Ontology-based cyber security policy implementation in Saudi Arabia. J Inf Secur. 2018;9(04):315.

Voulodimos A, Doulamis N, Doulamis A, Protopapadakis E. Deep learning for computer vision: a brief review. Comput Intell Neurosci. 2018;20:18.

Wang W, Zhao M, Wang J. Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network. J Ambient Intell Humaniz Comput. 2019;10(8):3035–43.

Witten IH, Frank E. Data mining: practical machine learning tools and techniques. Burlington: Morgan Kaufmann; 2005.

Witten IH, Frank E, Trigg LE, Hall MA, Holmes G, Cunningham SJ. Weka: practical machine learning tools and techniques with java implementations. 1999.

Xu R, Luo F. Risk prediction and early warning for air traffic controllers' unsafe acts using association rule mining and random forest. Saf Sci. 2021;135:105125.

Zadeh LA. Is there a need for fuzzy logic? Inf Sci. 2008;178(13):2751–79.

Zaki MJ. Scalable algorithms for association mining. IEEE Trans Knowl Data Eng. 2000;12(3):372–90.

Open Access funding enabled and organized by CAUL and its Member Institutions.

Author information

Authors and Affiliations

Swinburne University of Technology, Melbourne, VIC, 3122, Australia

Iqbal H. Sarker

Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, Chittagong, 4349, Bangladesh

Corresponding author

Correspondence to Iqbal H. Sarker.

Ethics declarations

Conflict of Interest

The author declares no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications” guest-edited by Bhanu Prakash K N and M. Shivakumar.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Sarker, I.H. AI-Based Modeling: Techniques, Applications and Research Issues Towards Automation, Intelligent and Smart Systems. SN COMPUT. SCI. 3 , 158 (2022). https://doi.org/10.1007/s42979-022-01043-x

Received : 20 July 2021

Accepted : 21 January 2022

Published : 10 February 2022

DOI : https://doi.org/10.1007/s42979-022-01043-x


Keywords: Artificial intelligence · Machine learning · Data science · Advanced analytics · Intelligent computing · Smart systems · Industry 4.0 applications


The present and future of AI

Finale Doshi-Velez on how AI is shaping our lives and how we can shape AI

Finale Doshi-Velez, the John L. Loeb Professor of Engineering and Applied Sciences. (Photo courtesy of Eliza Grinnell/Harvard SEAS)

How has artificial intelligence changed and shaped our world over the last five years? How will AI continue to impact our lives in the coming years? Those were the questions addressed in the most recent report from the One Hundred Year Study on Artificial Intelligence (AI100), an ongoing project hosted at Stanford University that will study the status of AI technology and its impacts on the world over the next 100 years.

The 2021 report is the second in a series that will be released every five years until 2116. Titled “Gathering Strength, Gathering Storms,” the report explores the various ways AI is increasingly touching people's lives in settings that range from movie recommendations and voice assistants to autonomous driving and automated medical diagnoses.

Barbara Grosz, the Higgins Research Professor of Natural Sciences at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS), is a member of the standing committee overseeing the AI100 project, and Finale Doshi-Velez, Gordon McKay Professor of Computer Science, is part of the panel of interdisciplinary researchers who wrote this year's report.

We spoke with Doshi-Velez about the report, what it says about the role AI is currently playing in our lives, and how it will change in the future.  

Q: Let's start with a snapshot: What is the current state of AI and its potential?

Doshi-Velez: Some of the biggest changes in the last five years have been how well AIs now perform in large data regimes on specific types of tasks.  We've seen [DeepMind’s] AlphaZero become the best Go player entirely through self-play, and everyday uses of AI such as grammar checks and autocomplete, automatic personal photo organization and search, and speech recognition become commonplace for large numbers of people.  

In terms of potential, I'm most excited about AIs that might augment and assist people.  They can be used to drive insights in drug discovery, help with decision making such as identifying a menu of likely treatment options for patients, and provide basic assistance, such as lane keeping while driving or text-to-speech based on images from a phone for the visually impaired.  In many situations, people and AIs have complementary strengths. I think we're getting closer to unlocking the potential of people and AI teams.

Q: Over the course of 100 years, these reports will tell the story of AI and its evolving role in society. Even though there have only been two reports, what's the story so far?

There's actually a lot of change even in five years.  The first report is fairly rosy.  For example, it mentions how algorithmic risk assessments may mitigate the human biases of judges.  The second has a much more mixed view.  I think this comes from the fact that as AI tools have come into the mainstream — both in higher stakes and everyday settings — we are appropriately much less willing to tolerate flaws, especially discriminatory ones. There's also been questions of information and disinformation control as people get their news, social media, and entertainment via searches and rankings personalized to them. So, there's a much greater recognition that we should not be waiting for AI tools to become mainstream before making sure they are ethical.

Q: What is the responsibility of institutes of higher education in preparing students and the next generation of computer scientists for the future of AI and its impact on society?

First, I'll say that the need to understand the basics of AI and data science starts much earlier than higher education!  Children are being exposed to AIs as soon as they click on videos on YouTube or browse photo albums. They need to understand aspects of AI such as how their actions affect future recommendations.

But for computer science students in college, I think a key thing that future engineers need to realize is when to demand input and how to talk across disciplinary boundaries to get at often difficult-to-quantify notions of safety, equity, fairness, etc. I'm really excited that Harvard has the Embedded EthiCS program to provide some of this education. Of course, this is in addition to standard good engineering practices like building robust models, validating them, and so forth, which is all a bit harder with AI.

Q: Your work focuses on machine learning with applications to healthcare, which is also an area of focus of this report. What is the state of AI in healthcare? 

A lot of AI in healthcare has been on the business end, used for optimizing billing, scheduling surgeries, that sort of thing.  When it comes to AI for better patient care, which is what we usually think about, there are few legal, regulatory, and financial incentives to do so, and many disincentives. Still, there's been slow but steady integration of AI-based tools, often in the form of risk scoring and alert systems.

In the near future, two applications that I'm really excited about are triage in low-resource settings — having AIs do initial reads of pathology slides, for example, if there are not enough pathologists, or get an initial check of whether a mole looks suspicious — and ways in which AIs can help identify promising treatment options for discussion with a clinician team and patient.

Q: Any predictions for the next report?

I'll be keen to see where currently nascent AI regulation initiatives have gotten to. Accountability is such a difficult question in AI; it's tricky to nurture both innovation and basic protections. Perhaps the most important innovation will be in approaches for AI accountability.


AI Index Report

Welcome to the seventh edition of the AI Index report. The 2024 Index is our most comprehensive to date and arrives at an important moment when AI’s influence on society has never been more pronounced. This year, we have broadened our scope to more extensively cover essential trends such as technical advancements in AI, public perceptions of the technology, and the geopolitical dynamics surrounding its development. Featuring more original data than ever before, this edition introduces new estimates on AI training costs, detailed analyses of the responsible AI landscape, and an entirely new chapter dedicated to AI’s impact on science and medicine.

The AI Index report tracks, collates, distills, and visualizes data related to artificial intelligence (AI). Our mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI.

The AI Index is recognized globally as one of the most credible and authoritative sources for data and insights on artificial intelligence. Previous editions have been cited in major newspapers, including The New York Times, Bloomberg, and The Guardian, have amassed hundreds of academic citations, and have been referenced by high-level policymakers in the United States, the United Kingdom, and the European Union, among other places. This year's edition surpasses all previous ones in size, scale, and scope, reflecting the growing significance that AI is coming to hold in all of our lives.

Steering Committee Co-Directors: Jack Clark, Ray Perrault

Steering Committee Members: Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, Juan Carlos Niebles, Vanessa Parli, Yoav Shoham, Russell Wald

Staff Members: Loredana Fattorini, Nestor Maslej

Letter from the Co-Directors

A decade ago, the best AI systems in the world were unable to classify objects in images at a human level. AI struggled with language comprehension and could not solve math problems. Today, AI systems routinely exceed human performance on standard benchmarks.

Progress accelerated in 2023. New state-of-the-art systems like GPT-4, Gemini, and Claude 3 are impressively multimodal: They can generate fluent text in dozens of languages, process audio, and even explain memes. As AI has improved, it has increasingly forced its way into our lives. Companies are racing to build AI-based products, and AI is increasingly being used by the general public. But current AI technology still has significant problems. It cannot reliably deal with facts, perform complex reasoning, or explain its conclusions.

AI faces two interrelated futures. First, technology continues to improve and is increasingly used, having major consequences for productivity and employment. It can be put to both good and bad uses. In the second future, the adoption of AI is constrained by the limitations of the technology. Regardless of which future unfolds, governments are increasingly concerned. They are stepping in to encourage the upside, such as funding university R&D and incentivizing private investment. Governments are also aiming to manage the potential downsides, such as impacts on employment, privacy concerns, misinformation, and intellectual property rights.

As AI rapidly evolves, the AI Index aims to help the AI community, policymakers, business leaders, journalists, and the general public navigate this complex landscape. It provides ongoing, objective snapshots tracking several key areas: technical progress in AI capabilities, the community and investments driving AI development and deployment, public opinion on current and potential future impacts, and policy measures taken to stimulate AI innovation while managing its risks and challenges. By comprehensively monitoring the AI ecosystem, the Index serves as an important resource for understanding this transformative technological force.

On the technical front, this year’s AI Index reports that the number of new large language models released worldwide in 2023 doubled over the previous year. Two-thirds were open-source, but the highest-performing models came from industry players with closed systems. Gemini Ultra became the first LLM to reach human-level performance on the Massive Multitask Language Understanding (MMLU) benchmark; performance on the benchmark has improved by 15 percentage points since last year. Additionally, GPT-4 achieved an impressive 0.97 mean win rate score on the comprehensive Holistic Evaluation of Language Models (HELM) benchmark, which includes MMLU among other evaluations.

Although global private investment in AI decreased for the second consecutive year, investment in generative AI skyrocketed. More Fortune 500 earnings calls mentioned AI than ever before, and new studies show that AI tangibly boosts worker productivity. On the policymaking front, global mentions of AI in legislative proceedings have never been higher. U.S. regulators passed more AI-related regulations in 2023 than ever before. Still, many expressed concerns about AI’s ability to generate deepfakes and impact elections. The public became more aware of AI, and studies suggest that they responded with nervousness.

Ray Perrault, Co-Director, AI Index


Computer Science > Artificial Intelligence

Title: A Survey on the Memory Mechanism of Large Language Model Based Agents

Abstract: Large language model (LLM) based agents have recently attracted much attention from the research and industry communities. Compared with original LLMs, LLM-based agents are distinguished by their self-evolving capability, which is the basis for solving real-world problems that need long-term and complex agent-environment interactions. The key component supporting agent-environment interactions is the memory of the agents. While previous studies have proposed many promising memory mechanisms, they are scattered across different papers, and there has been no systematic review that summarizes and compares these works from a holistic perspective and abstracts common, effective design patterns to inspire future studies. To bridge this gap, in this paper we propose a comprehensive survey on the memory mechanism of LLM-based agents. Specifically, we first discuss “what is” and “why do we need” the memory in LLM-based agents. Then, we systematically review previous studies on how to design and evaluate the memory module. In addition, we present many agent applications where the memory module plays an important role. Finally, we analyze the limitations of existing work and show important future directions. To keep up with the latest advances in this field, we maintain a repository at this https URL .


Artificial Intelligence (Generative) Resources


About This Table

The resources described in the table represent an incomplete list of tools specifically geared towards exploring and synthesizing research. As generative AI becomes more integrated in online search tools , even the very early stages of research and topic development could incorporate AI. If you have any questions about using these tools for your research, please Email a Librarian .

AI tools for research can help you to discover new sources for your literature review or research assignment. These tools will synthesize information from large databases of scholarly output with the aim of finding the most relevant articles and saving researchers' time. As with our research databases or any other search tool, however, it's important not to rely on one tool for all of your research, as you will risk missing important information on your topic of interest.

Georgetown University's Center for New Designs in Learning and Scholarship (CNDLS) offers a list of additional AI tools with a range of different purposes including visual design, writing, time management, and more.


The Innovation (Camb), volume 2, issue 4, 28 November 2021

Artificial intelligence: A powerful paradigm for scientific research

Authors include Changping Huang, Xingchen Liu, Fengliang Dong, Cheng-Wei Qiu, Chenguang Fu, Zhigang Yin, Ronald Roepman, Sabine Dietmann, Marko Virta, Fredrick Kengara, Taolan Zhao, Jialiang Yang, Zhaofeng Liu, Xiaohong Liu, James P. Lewis, James M. Tiedje, Zhipeng Cai, Jiabao Zhang, and colleagues, with affiliations spanning the Chinese Academy of Sciences, the University of Chinese Academy of Sciences, Fudan University, Zhejiang University, the National University of Singapore, Dalian Polytechnic University, Peking University Third Hospital, Radboud University, Washington University School of Medicine, the University of Helsinki, Bomet University College, Shihezi University, Hong Kong Baptist University, Michigan State University, Zhejiang Lab, Aberystwyth University, Georgia State University, Geneis (Beijing) Co., Ltd, and other institutions.

Artificial intelligence (AI), coupled with promising machine learning (ML) techniques well known from computer science, is broadly affecting many aspects of various fields, including science and technology, industry, and even our day-to-day life. ML techniques have been developed to analyze high-throughput data with a view to obtaining useful insights, categorizing, predicting, and making evidence-based decisions in novel ways, which will promote the growth of novel applications and fuel the sustainable boom of AI. This paper undertakes a comprehensive survey of the development and application of AI in different aspects of fundamental sciences, including information science, mathematics, medical science, materials science, geoscience, life science, physics, and chemistry. The challenges that each discipline of science meets, and the potential of AI techniques to handle these challenges, are discussed in detail. Moreover, we shed light on new research trends entailing the integration of AI into each scientific discipline. The aim of this paper is to provide a broad research guideline on fundamental sciences with potential infusion of AI, to help motivate researchers to deeply understand the state-of-the-art applications of AI-based fundamental sciences, and thereby to help promote the continuous development of these fundamental sciences.


Public summary

  • “Can machines think?” The goal of artificial intelligence (AI) is to enable machines to mimic human thoughts and behaviors, including learning, reasoning, predicting, and so on.
  • “Can AI do fundamental research?” AI coupled with machine learning techniques is impacting a wide range of fundamental sciences, including mathematics, medical science, physics, etc.
  • “How does AI accelerate fundamental research?” New research and applications are emerging rapidly with the support of AI infrastructure, including data storage, computing power, AI algorithms, and frameworks.

Introduction

“Can machines think?” Alan Turing posed this question in his famous paper “Computing Machinery and Intelligence.” 1 He believed that to answer it, we first need to define what thinking is, yet thinking is difficult to define clearly because it is a subjective behavior. Turing therefore introduced an indirect method to verify whether a machine can think: the Turing test, which examines a machine's ability to show intelligence indistinguishable from that of human beings. A machine that succeeds in the test qualifies to be labeled as artificial intelligence (AI).

AI refers to the simulation of human intelligence by a system or a machine. The goal of AI is to develop machines that can think like humans and mimic human behaviors, including perceiving, reasoning, learning, planning, predicting, and so on. Intelligence is one of the main characteristics that distinguishes human beings from animals. With successive industrial revolutions, an increasing variety of machines has replaced human labor in all walks of life, and the replacement of human intellectual work by machine intelligence is the next big challenge. Numerous scientists are focusing on AI, which makes research in the field rich and diverse; its subfields include search algorithms, knowledge graphs, natural language processing, expert systems, evolutionary algorithms, machine learning (ML), deep learning (DL), and so on.

The general framework of AI is illustrated in Figure 1. The development of AI progresses through perceptual intelligence, cognitive intelligence, and decision-making intelligence. Perceptual intelligence means that a machine has the basic abilities of vision, hearing, touch, etc., which are familiar to humans. Cognitive intelligence is a higher-level ability of induction, reasoning, and acquisition of knowledge; inspired by cognitive science, brain science, and brain-like intelligence, it endows machines with thinking logic and cognitive abilities similar to those of human beings. Once a machine has the abilities of perception and cognition, it is expected to make optimal decisions as humans do, to improve people's lives, industrial manufacturing, etc. Decision intelligence requires applied data science, social science, decision theory, and managerial science to extend data science toward making optimal decisions. Achieving these three levels requires an AI infrastructure layer supported by data, storage and computing power, ML algorithms, and AI frameworks; trained models then learn the internal regularities of data, supporting and realizing AI applications. The application layer of AI is becoming ever more extensive and is deeply integrated with fundamental sciences, industrial manufacturing, human life, social governance, and cyberspace, with a profound impact on how we work and live.

[Figure 1: The general framework of AI]

History of AI

The beginning of modern AI research can be traced back to John McCarthy, who coined the term “artificial intelligence (AI)” at a conference at Dartmouth College in 1956, symbolizing the birth of AI as a scientific field. Progress in the following years was astonishing. Many scientists and researchers focused on automated reasoning and applied AI to proving mathematical theorems and solving algebraic problems. One famous example is Logic Theorist, a computer program written by Allen Newell, Herbert A. Simon, and Cliff Shaw, which proved 38 of the first 52 theorems in “Principia Mathematica” and provided more elegant proofs for some. 2 These successes made many AI pioneers wildly optimistic and underpinned the belief that fully intelligent machines would be built in the near future. However, they soon realized that there was still a long way to go before human-equivalent machine intelligence could come true. Many nontrivial problems could not be handled by logic-based programs. Another challenge was the lack of computational resources for increasingly complicated problems. As a result, organizations and funders stopped supporting these under-delivering AI projects.

AI returned to popularity in the 1980s, as several research institutions and universities built AI systems that distill a series of basic rules from expert knowledge to help non-experts make specific decisions. These systems are known as “expert systems”; examples include XCON, designed at Carnegie Mellon University, and MYCIN, designed at Stanford University. For the first time, expert systems derived logic rules from expert knowledge to solve problems in the real world. The core of AI research during this period was the knowledge that made machines “smarter.” However, expert systems gradually revealed several disadvantages, such as privacy issues, lack of flexibility, poor versatility, expensive maintenance, and so on. At the same time, the Fifth Generation Computer Project, heavily funded by the Japanese government, failed to meet most of its original goals. Once again, funding for AI research ceased, and AI entered the second low point of its history.

In 2006, Geoffrey Hinton and coworkers 3 , 4 made a breakthrough by proposing an approach for building deeper neural networks, together with a way to avoid vanishing gradients during training. This reignited AI research, and DL algorithms became one of the most active fields of AI. DL is a subset of ML based on multiple layers of neural networks with representation learning, 5 while ML is the part of AI that enables a computer or program to learn and acquire intelligence without human intervention. Thus, “learning” is the keyword of this era of AI research. Big data technologies and improved computing power have made it far more efficient to derive features and information from massive data samples. An increasing number of new neural network structures and training methods have been proposed to improve the representation-learning ability of DL and to extend it to general applications. Current DL algorithms match and even exceed human capabilities on specific datasets in computer vision (CV) and natural language processing (NLP). AI technologies have achieved remarkable successes in all walks of life and continue to prove their value as backbones of scientific research and real-world applications.

Within AI, ML is having a substantial and broad effect across many aspects of technology and science: from computer science to geoscience to materials science, from life science to medical science to chemistry, mathematics, and physics, from management science to economics to psychology, and other data-intensive empirical sciences. ML methods analyze high-throughput data to obtain useful insights, categorize, predict, and make evidence-based decisions in novel ways. Training a system by presenting it with examples of desired input-output behavior can be far easier than programming it manually by anticipating the desired response for every possible input. The following sections survey eight fundamental sciences, including information science (informatics), mathematics, medical science, materials science, geoscience, life science, physics, and chemistry, which develop or exploit AI techniques to promote the development of science and accelerate applications that benefit human beings, society, and the world.

AI in information science

AI aims to provide machines with the abilities of perception, cognition, and decision-making. At present, new research and applications in information science are emerging at an unprecedented rate, which is inseparable from the support of AI infrastructure. As shown in Figure 2, the AI infrastructure layer includes data, storage and computing power, ML algorithms, and the AI framework. The perception layer gives machines the basic abilities of vision, hearing, etc.; for instance, CV enables machines to “see” and identify objects, while speech recognition and synthesis help machines to “hear” and recognize speech elements. The cognitive layer provides higher-level abilities of induction, reasoning, and knowledge acquisition with the help of NLP, 6 knowledge graphs, 7 and continual learning. 8 In the decision-making layer, AI is capable of making optimal decisions through automatic planning, expert systems, and decision-support systems. Numerous applications of AI have had a profound impact on fundamental sciences, industrial manufacturing, human life, social governance, and cyberspace. The following subsections provide an overview of the AI framework, automated machine learning (AutoML) technology, and several state-of-the-art AI/ML applications in the information field.

[Figure 2: The knowledge graph of the AI framework]

The AI framework provides basic tools for AI algorithm implementation

In the past 10 years, applications based on AI algorithms have played a significant role in various fields and subjects, and on this basis the prosperity of DL frameworks and platforms has been founded. AI frameworks and platforms lower the barrier to AI technology by integrating the overall process of algorithm development, enabling researchers from different areas to apply it across fields and to focus on designing the structure of neural networks, thus providing better solutions to problems in their own domains. At the beginning of the 21st century, only a few tools, such as MATLAB, OpenNN, and Torch, were capable of describing and developing neural networks. However, these tools were not originally designed for AI models and thus faced problems such as complicated user APIs and lack of GPU support. During this period, using these frameworks demanded professional computer science knowledge and tedious work on model construction. As a solution, early DL frameworks, such as Caffe, Chainer, and Theano, emerged, allowing users to conveniently construct complex deep neural networks (DNNs), such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and LSTMs, which significantly reduced the cost of applying AI models. Tech giants then joined the march of AI-framework research. 9 Google developed the famous open-source framework TensorFlow, while Facebook's AI research team released another popular platform, PyTorch, based on Torch; Microsoft Research published CNTK, and Amazon announced MXNet. Among them, TensorFlow, the most representative framework, adopted Theano's declarative programming style, offering a larger space for graph-based optimization, while PyTorch inherited the imperative programming style of Torch, which is intuitive, user friendly, more flexible, and easier to trace. As modern AI frameworks and platforms become widely applied, practitioners can now assemble models swiftly and conveniently by adopting various building-block sets and languages specifically suitable for given fields. Polished over time, these platforms have gradually developed clearly defined user APIs, support for multi-GPU and distributed training, and a variety of model zoos and toolkits for specific tasks. 10 Looking forward, a few trends may become the mainstream of next-generation framework development:

(1) Capability of super-scale model training. With the emergence of models derived from the Transformer, such as BERT and GPT-3, the ability to train large models has become an essential feature of DL frameworks, requiring them to train effectively at the scale of hundreds or even thousands of devices.

(2) Unified API standards. The APIs of many frameworks are generally similar but differ slightly at certain points, causing difficulties and unnecessary learning effort when users attempt to shift from one framework to another. The APIs of some frameworks, such as JAX, are already compatible with the NumPy standard, which is familiar to most practitioners; a unified API standard for AI frameworks may therefore gradually emerge in the future.

(3) Universal operator optimization. At present, kernels of DL operators are implemented either manually or based on third-party libraries. Most third-party libraries are developed for specific hardware platforms, causing large unnecessary overheads when models are trained or deployed on other hardware platforms. Moreover, new DL algorithms usually develop much faster than libraries are updated, which often leaves new algorithms outside the scope of library support. 11
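To make the imperative, define-by-run style described above concrete, here is a minimal sketch (assuming PyTorch is installed; the network and tensor shapes are hypothetical), in which the model is an ordinary Python object and the forward pass is plain Python code that runs eagerly, line by line:

```python
# A minimal sketch of PyTorch's imperative style: no static graph is
# compiled; each line of forward() executes immediately, which makes the
# model easy to trace and debug.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * 28 * 28, num_classes)

    def forward(self, x):
        x = torch.relu(self.conv(x))   # executed eagerly
        return self.fc(x.flatten(1))

model = TinyCNN()
logits = model(torch.randn(4, 1, 28, 28))  # a dummy batch of 4 images
print(logits.shape)  # torch.Size([4, 10])
```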

To improve the execution speed of AI algorithms, much research focuses on hardware acceleration. The DianNao family is one of the earliest research efforts on AI hardware accelerators. 12 It includes DianNao, DaDianNao, ShiDianNao, and PuDianNao, which can accelerate the inference of neural networks and other ML algorithms. Among these, a 64-chip DaDianNao system can achieve a speedup of 450.65× over a GPU and reduce energy consumption by 150.31×. Prof. Chen and his team at the Institute of Computing Technology also designed an instruction set architecture for a broad range of neural-network accelerators, called Cambricon, which developed into a series of DL accelerators. After Cambricon, many AI-related companies, such as Apple, Google, and HUAWEI, developed their own DL accelerators, and AI accelerators became an important research field of AI.

AI for AI—AutoML

AutoML studies how to use evolutionary computing, reinforcement learning (RL), and other AI algorithms to automatically generate specified AI algorithms. Research on the automatic generation of neural networks existed before the emergence of DL, e.g., neuroevolution. 13 The main idea of neuroevolution is to let neural networks evolve according to the principle of survival of the fittest in the biological world: through selection, crossover, mutation, and other evolutionary operators, the quality of individuals in a population is continuously improved, and the individual with the greatest fitness finally represents the best neural network. The biological inspiration in this field lies in the evolution of neurons in the human brain: the brain's developed learning and memory functions depend on its complex neural network system, which is itself the product of a long evolutionary process rather than of gradient descent and backpropagation. In the era of DL, applying AI algorithms to automatically generate DNNs has attracted more attention and has gradually developed into an important direction of AutoML research: neural architecture search. Its implementation methods are usually divided into RL-based and evolutionary-algorithm-based methods. In the RL-based method, an RNN is used as a controller to generate a neural network structure layer by layer; the network is then trained, and the accuracy on the validation set is used as the reward signal for computing the policy gradient of the RNN. During the iterations, the controller assigns higher probability to neural networks with higher accuracy, ensuring that the policy function outputs the optimal network structure. 14 Neural architecture search through evolution resembles the neuroevolution method: it maintains a population and iterates according to the principle of survival of the fittest to obtain a high-quality neural network. 15 Through neural architecture search, the design of neural networks has become more efficient and automated, and the accuracy of the resulting networks gradually outperforms that of networks designed by AI experts. For example, Google's state-of-the-art network EfficientNet was realized on a baseline network obtained through neural architecture search. 16 (A toy sketch of the evolutionary search loop follows.)
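The sketch below is a toy illustration of that evolutionary loop in plain Python. The architecture encoding (a tuple of hidden-layer widths) and the fitness function are hypothetical stand-ins; a real system would train each candidate network and use its validation accuracy as the fitness:

```python
# Toy evolutionary neural-architecture search: survival of the fittest over
# a population of "architectures", here encoded as tuples of layer widths.
import random

def fitness(arch):
    # Hypothetical proxy rewarding moderate depth and width; replace with
    # real train-and-evaluate logic in practice.
    return -abs(len(arch) - 3) - sum(abs(w - 64) for w in arch) / 100.0

def mutate(arch):
    arch = list(arch)
    if random.random() < 0.5 and len(arch) > 1:
        arch.pop(random.randrange(len(arch)))        # remove a layer
    else:                                            # insert a layer
        arch.insert(random.randrange(len(arch) + 1), random.choice([16, 32, 64, 128]))
    return tuple(arch)

population = [tuple(random.choices([16, 32, 64, 128], k=2)) for _ in range(20)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)       # select the fittest
    survivors = population[:10]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(10)]

print("best architecture:", max(population, key=fitness))
```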

AI enabling networking design adaptive to complex network conditions

The application of DL in the networking field has received strong interest. Network design often relies on initial network conditions and/or theoretical assumptions to characterize real network environments. However, traditional network modeling and design, governed by mathematical models, can hardly cope with complex scenarios involving imperfect and highly dynamic network environments. Integrating DL into network research allows a better representation of such environments. Furthermore, DL can be combined with the Markov decision process to yield the deep reinforcement learning (DRL) model, which finds an optimal policy based on the reward function and the states of the system. Taken together, these techniques can be used to make better decisions that guide proper network design, thereby improving network quality of service and quality of experience. Across the layers of the network protocol stack, DL/DRL can be adopted for network feature extraction, decision-making, etc. In the physical layer, DL can be used for interference alignment; it can also classify modulation modes and design efficient network coding 17 and error-correction codes. In the data link layer, DL can be used for resource (e.g., channel) allocation, medium access control, traffic prediction, 18 link-quality evaluation, and so on. In the network (routing) layer, routing establishment and routing optimization 19 can help obtain an optimal routing path. In higher layers (such as the application layer), DL enables enhanced data compression and task allocation. Beyond the protocol stack, one critical use of DL is network security: DL can classify packets as benign or malicious, and it can be integrated with other ML schemes, such as unsupervised clustering, to achieve better anomaly detection. (A toy RL routing sketch follows.)
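As a toy illustration of the RL idea applied to routing, the sketch below runs tabular Q-learning on a hypothetical four-node topology; a real DRL system would replace the Q-table with a neural network and the fixed delays with measured network state:

```python
# Toy Q-learning for routing: the reward is the negative link delay, so the
# learned greedy policy yields a low-delay path from A to D.
import random

links = {"A": {"B": 2, "C": 5}, "B": {"C": 1, "D": 7}, "C": {"D": 2}, "D": {}}
Q = {(n, m): 0.0 for n in links for m in links[n]}
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(2000):
    node = "A"
    while node != "D":
        nbrs = list(links[node])
        a = random.choice(nbrs) if random.random() < eps else max(nbrs, key=lambda m: Q[(node, m)])
        reward = -links[node][a]                       # shorter delay = higher reward
        nxt = 0.0 if a == "D" else max(Q[(a, m)] for m in links[a])
        Q[(node, a)] += alpha * (reward + gamma * nxt - Q[(node, a)])
        node = a

# Greedy decode of the learned route
path, node = ["A"], "A"
while node != "D":
    node = max(links[node], key=lambda m: Q[(node, m)])
    path.append(node)
print(path)  # expected: ['A', 'B', 'C', 'D'] (total delay 5, vs. 7 for A-C-D)
```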

AI enabling more powerful and intelligent nanophotonics

Nanophotonic components have recently revolutionized the field of optics via metamaterials/metasurfaces, enabling the arbitrary manipulation of light-matter interactions with subwavelength meta-atoms or meta-molecules. 20 , 21 , 22 The conventional design of such components generally involves forward modeling, i.e., solving Maxwell's equations for empirical and intuitive nanostructures to find the corresponding optical properties, as well as the inverse design of nanophotonic devices given an on-demand optical response. The trans-dimensional nature of macro-optical components composed of complex nano-antennas makes the design process very time consuming, computationally expensive, and even numerically prohibitive as device size and complexity increase. DL is an efficient and automatic platform that enables novel approaches to designing nanophotonic devices with high performance and versatile functions. Here, we briefly present recent progress in DL-based nanophotonics and its wide-ranging applications. DL was first exploited for forward modeling using a DNN 23 : the transmission or reflection coefficients can be well predicted after training on huge datasets. To improve the prediction accuracy of DNNs on small datasets, transfer learning was introduced to migrate knowledge between different physical scenarios, greatly reducing the relative error. Furthermore, a CNN and an RNN were developed to predict the optical properties of arbitrary structures from images 24 ; the CNN-RNN combination successfully predicted absorption spectra from the given input structural images. For the inverse design of nanophotonic devices, there are three different paradigms of DL methods, i.e., supervised, unsupervised, and RL. 25 Supervised learning has been utilized to design structural parameters for pre-defined geometries, e.g., tandem DNNs and bidirectional DNNs. Unsupervised learning methods learn by themselves without a specific target and are thus better suited than supervised learning to discover new and arbitrary patterns 26 in completely new data. A generative adversarial network (GAN)-based approach, combining conditional GANs and Wasserstein GANs, was proposed to design freeform all-dielectric multifunctional metasurfaces. RL, especially double-deep Q-learning, powers the inverse design of high-performance nanophotonic devices. 27 DL has endowed nanophotonic devices with better performance and more emerging applications. 28 , 29 For instance, an intelligent microwave cloak driven by DL exhibits a millisecond, self-adaptive response to an ever-changing incident wave and background. 28 As another example, a DL-augmented infrared nanoplasmonic metasurface was developed to monitor the dynamics among four major classes of bio-molecules, with potential impact on biology, bioanalytics, and pharmacology, from fundamental research to disease diagnostics and drug development. 29 The potential of DL in the wide arena of nanophotonics is still unfolding: even end-users without an optics and photonics background can exploit DL as a black-box toolkit to design powerful optical devices. Nevertheless, how to interpret the intermediate DL process and determine the most dominant factors in the search for optimal solutions is worth investigating in depth. We optimistically envisage that advancements in DL algorithms and computation/optimization infrastructure will enable more efficient and reliable training approaches, more complex nanostructures with unprecedented shapes and sizes, and more intelligent and reconfigurable optic/optoelectronic systems. (A minimal forward-modeling sketch follows.)
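To make the forward-modeling idea concrete, here is a minimal sketch (assuming PyTorch; the geometry parameterization, spectrum discretization, and data are hypothetical stand-ins for full-wave electromagnetic simulation results):

```python
# A fully connected surrogate mapping hypothetical meta-atom geometry
# parameters (4 values) to a discretized transmission spectrum (50 bins).
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(4, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 50), nn.Sigmoid(),   # transmission coefficients in [0, 1]
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

geom = torch.rand(256, 4)       # stand-in for simulated geometry samples
spectra = torch.rand(256, 50)   # stand-in for simulated spectra
for epoch in range(100):
    loss = nn.functional.mse_loss(net(geom), spectra)
    opt.zero_grad(); loss.backward(); opt.step()
```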

AI in other fields of information science

We believe that AI has great potential in the following directions:

  • AI-based risk control and management in utilities can prevent costly or hazardous equipment failures by using sensors that detect and send information regarding the machine's health to the manufacturer, predicting possible issues so as to ensure timely maintenance or automated shutdown.
  • AI can be used to produce simulations of real-world objects, called digital twins. Applied to engineering, digital twins allow engineers and technicians to analyze the performance of equipment virtually, thus avoiding the safety and budget issues associated with traditional testing methods.
  • Combined with AI, intelligent robots are playing an important role in industry and human life. Unlike traditional robots, which work according to procedures specified by humans, intelligent robots have the abilities of perception, recognition, and even automatic planning and decision-making based on changes in environmental conditions.
  • AI of things (AIoT), or AI-empowered IoT applications, 30 has become a promising development trend. AI can empower connected IoT devices, embedded in various physical infrastructures, to perceive, recognize, learn, and act. For instance, smart cities constantly collect data regarding quality-of-life factors, such as the status of power supply, public transportation, air pollution, and water use, to manage and optimize systems in cities. Because these data, especially personal data, are collected from informed or uninformed participants, data security and privacy 31 require protection.

AI in mathematics

Mathematics always plays a crucial and indispensable role in AI. Decades ago, quite a few classical AI-related approaches, such as k-nearest neighbor, 32 support vector machine, 33 and AdaBoost, 34 were proposed and developed after their rigorous mathematical formulations had been established. In recent years, with the rapid development of DL, 35 AI has been gaining more and more attention in the mathematical community. Equipped with the Markov process, minimax optimization, and Bayesian statistics, RL, 36 GANs, 37 and Bayesian learning 38 became the most favorable tools in many AI applications. Nevertheless, there still exist plenty of open problems in mathematics for ML, including the interpretability of neural networks, the optimization problems of parameter estimation, and the generalization ability of learning models. In the rest of this section, we discuss these three questions in turn.

The interpretability of neural networks

From a mathematical perspective, ML usually constructs nonlinear models, with neural networks as a typical case, to approximate certain functions. The well-known Universal Approximation Theorem states that, under very mild conditions, any continuous function can be uniformly approximated on compact domains by neural networks, 39 which plays a vital role in the interpretability of neural networks. However, in real applications, ML models seem to admit accurate approximations of many extremely complicated functions, sometimes even black boxes, which are far beyond the scope of continuous functions. To understand the effectiveness of ML models, many researchers have investigated the function spaces that such models can approximate well, and the corresponding quantitative measures. This issue is closely related to classical approximation theory, but the approximation scheme is distinct. For example, Bach 40 found that the random feature model is naturally associated with the corresponding reproducing kernel Hilbert space. In the same way, the Barron space is identified as the natural function space associated with two-layer neural networks, and the approximation error is measured using the Barron norm. 41 The corresponding quantities for residual networks (ResNets) are defined via flow-induced spaces. For multi-layer networks, the natural function spaces for the purposes of approximation theory are the tree-like function spaces introduced by Wojtowytsch. 42 Several works also reveal the relationship between neural networks and numerical algorithms for solving partial differential equations. For example, He and Xu 43 discovered that CNNs for image classification have a strong connection with multi-grid (MG) methods: the pooling operation and feature extraction in CNNs correspond directly to the restriction operation and iterative smoothers in MG, respectively. Hence, the various convolution and pooling operations used in CNNs can be better understood.
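For reference, one standard statement of the Universal Approximation Theorem invoked above (a Cybenko/Hornik-type form; the precise conditions on the activation function vary across versions of the theorem):

```latex
% For f continuous on a compact set K \subset \mathbb{R}^d and a fixed
% continuous, nonpolynomial activation \sigma, a single hidden layer suffices:
\forall\, \varepsilon > 0 \ \ \exists\, N \in \mathbb{N},\ a_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^d : \quad
\sup_{x \in K} \Bigl|\, f(x) - \sum_{i=1}^{N} a_i\, \sigma\bigl(w_i^{\top} x + b_i\bigr) \Bigr| < \varepsilon .
```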

The optimization problems of parameter estimation

In general, the optimization problem of estimating the parameters of certain DNNs is in practice highly nonconvex and often nonsmooth. Can global minimizers be expected? What is the landscape of local minimizers? How does one handle the nonsmoothness? All these questions are nontrivial from an optimization perspective. Indeed, numerous works and experiments demonstrate that the optimization of parameter estimation in DL is itself a much nicer problem than once thought; see, e.g., Goodfellow et al. 44 As a consequence, the study of the solution landscape (Figure 3), also known as the loss surface of neural networks, is no longer considered inaccessible and can even provide guidance for global optimization. Interested readers can refer to the survey paper by Sun et al. 45 for recent progress in this area.

[Figure 3]

Recent studies indicate that nonsmooth activation functions, e.g., rectified linear units, are better than smooth ones at finding sparse solutions. However, the chain rule does not work when the activation functions are nonsmooth, which makes the widely used stochastic gradient (SG)-based approaches infeasible in theory. Taking approximated gradients at nonsmooth iterates as a remedy keeps SG-type methods in extensive use, but numerical evidence has also exposed their limitations. The penalty-based approaches proposed by Cui et al. 46 and Liu et al. 47 provide a new direction for solving nonsmooth optimization problems efficiently.
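Concretely, the prototypical nonsmooth activation is the rectified linear unit; its set-valued (Clarke) subdifferential below is a standard fact and shows why chain-rule-based SG methods must resort to approximated gradients at the kink:

```latex
\sigma(x) = \max(0, x), \qquad
\partial \sigma(x) =
\begin{cases}
\{0\}, & x < 0,\\
[0, 1], & x = 0,\\
\{1\}, & x > 0.
\end{cases}
```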

The generalization ability of learning models

A small training error does not always lead to a small test error; this gap is caused by the generalization ability of learning models. A key finding in statistical learning theory states that the generalization error is bounded by a quantity that grows with model capacity but shrinks as the number of training examples increases. 48 A common conjecture relating generalization to the solution landscape is that flat and wide minima generalize better than sharp ones. Thus, regularization techniques, including the dropout approach, 49 have emerged to force algorithms to bypass sharp minima. However, the mechanism behind this has not been fully explored. Recently, some researchers have focused on ResNet-type architectures with dropout inserted after the last convolutional layer of each modular building block, and have managed to explain the stochastic dropout training process and the ensuing dropout regularization effect from the perspective of optimal control. 50
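One typical form of the capacity bound described above is the following standard Rademacher-complexity bound (notation and constants vary across sources): with probability at least 1 − δ over an i.i.d. sample of size n, uniformly over hypotheses h in the class H,

```latex
% R(h): expected risk;  \hat{R}_n(h): empirical risk on n training examples;
% \mathfrak{R}_n(H): Rademacher complexity of H, which grows with model
% capacity and shrinks as n increases.
R(h) \;\le\; \hat{R}_n(h) \;+\; 2\,\mathfrak{R}_n(H) \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}} .
```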

AI in medical science

AI technology is becoming more and more significant in daily operations, including in the medical field. With the growing healthcare needs of patients, hospitals are evolving from informatization and networking to the Internet Hospital, and eventually to the Smart Hospital. At the same time, AI tools and hardware performance are improving rapidly with each passing day. Eventually, common AI algorithms, such as CV, NLP, and data mining, will be embedded in the medical equipment market (Figure 4).

[Figure 4]

AI doctor based on electronic medical records

For medical history data, it is inevitable to mention Doctor Watson, developed on IBM's Watson platform, and Modernizing Medicine, which targets oncology and is now adopted by CVS and Walgreens in the US as well as various medical organizations in China. Doctor Watson takes advantage of the NLP capabilities of the IBM Watson platform, which has already collected vast medical-history data as well as prior knowledge from the literature for reference. After a patient's case is input, Doctor Watson searches the medical-history reserve and forms an elementary treatment proposal, which is then further ranked against the prior-knowledge reserves. Using the multiple models stored, Doctor Watson gives a final proposal along with its confidence. However, there are still problems with such AI doctors 51 : because they rely on prior experience from US hospitals, the proposals may not be suitable for other regions with different medical insurance policies. Besides, knowledge updating on the Watson platform relies heavily on updating the knowledge reserve, which still requires manual work.

AI for public health: Outbreak detection and health QR code for COVID-19

AI can be used for public health purposes in many ways. One classical usage is to detect disease outbreaks using search engine query data or social media data, as Google did for the prediction of influenza epidemics 52 and the Chinese Academy of Sciences did for modeling the COVID-19 outbreak through multi-source information fusion. 53 After the COVID-19 outbreak, China developed a digital health Quick Response (QR) code system, first to detect potential contact with confirmed COVID-19 cases and, second, to indicate a person's health status using mobile big data. 54 Different colors indicate different health statuses: green means healthy and cleared for daily life, orange means at risk and requiring quarantine, and red means a confirmed COVID-19 patient. The system is easy for the general public to use and has been adopted by many other countries. The health QR code has made great contributions to the worldwide prevention and control of the COVID-19 pandemic.

Biomarker discovery with AI

High-dimensional data, including multi-omics data, patient characteristics, medical laboratory test data, etc., are often used to generate various predictive or prognostic models through DL or statistical modeling methods. For instance, a COVID-19 severity evaluation model was built through ML using proteomic and metabolomic profiling data of sera 55 ; using integrated genetic, clinical, and demographic data, Taliaz et al. built an ML model to predict patient response to antidepressant medications 56 ; and prognostic models for multiple cancer types (such as liver, lung, breast, gastric, colorectal, pancreatic, prostate, and ovarian cancer, lymphoma, leukemia, sarcoma, melanoma, bladder, renal, and thyroid cancer, head and neck cancer, etc.) were constructed through DL or statistical methods, such as the least absolute shrinkage and selection operator (LASSO) combined with the Cox proportional hazards regression model, using genomic data. 57 (A minimal sketch of the LASSO-Cox workflow follows.)
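Here is a minimal sketch of that LASSO-penalized Cox workflow (assuming numpy, pandas, and the lifelines package; the data are synthetic stand-ins for genomic features with survival outcomes):

```python
# LASSO-penalized Cox regression: the L1 penalty shrinks uninformative
# coefficients toward zero, combining feature selection and prognostic
# modeling in one step.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))               # 10 hypothetical gene features
risk = X[:, 0] - 0.5 * X[:, 1]               # only 2 features truly matter
T = rng.exponential(scale=np.exp(-risk))     # synthetic survival times
E = rng.random(200) < 0.8                    # ~80% observed events

df = pd.DataFrame(X, columns=[f"gene_{i}" for i in range(10)])
df["T"], df["E"] = T, E

cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)   # l1_ratio=1.0 -> pure LASSO
cph.fit(df, duration_col="T", event_col="E")
print(cph.summary[["coef"]])                      # most coefficients near zero
```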

Image-based medical AI

Medical image AI is one of the most mature areas, as there are numerous CV models for classification, detection, and segmentation tasks. In the clinical area, CV algorithms can be used for computer-aided diagnosis and treatment with ECG, CT, eye fundus imaging, etc. Whereas human doctors may tire and become prone to mistakes after viewing hundreds upon hundreds of images, AI readers can outperform human viewers thanks to their capacity for repeated work without fatigue. The first medical AI product approved by the FDA is IDx-DR, which uses an AI model to detect diabetic retinopathy. The smartphone app SkinVision can accurately detect melanomas 58 : it uses “fractal analysis” to identify moles and their surrounding skin based on size, diameter, and many other parameters, and to detect abnormal growth trends. The AI-ECG system of LEPU Medical can automatically detect heart disease from ECG images. Lianying Medical leverages its hardware to provide real-time, high-definition, image-guided, all-round radiotherapy technology, which achieves precise treatment. (A minimal classification sketch follows.)
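As a minimal sketch of the classification setup behind such products (assuming PyTorch and a recent torchvision; the data, two-class task, and label semantics are hypothetical), a pretrained CNN backbone can be fine-tuned for a binary diagnostic label:

```python
# Transfer learning for a hypothetical binary medical-image classifier.
import torch
import torch.nn as nn
from torchvision import models

# Downloads pretrained ImageNet weights; pass weights=None for an offline run.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)   # new 2-class head

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)  # tune head only

images = torch.randn(8, 3, 224, 224)            # stand-in for fundus images
labels = torch.randint(0, 2, (8,))              # stand-in diagnostic labels
loss = criterion(model(images), labels)
optimizer.zero_grad(); loss.backward(); optimizer.step()
```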

Wearable devices for surveillance and early warning

For wearable devices, AliveCor has developed an algorithm to automatically predict the presence of atrial fibrillation, an early warning sign of stroke and heart failure. The 23andMe company can test saliva samples at a small cost and provide customers with information based on their genes, including who their ancestors were or potential diseases they may be prone to later in life, offering accurate health-management solutions based on individual and family genetic data. In the next 20–30 years, we believe there are several directions for further research: (1) causal inference for real-time in-hospital risk prediction. Clinical doctors usually require reasonable explanations for certain medical decisions, but current AI models are usually black boxes; causal inference will help doctors explain AI decisions and even discover novel ground truths. (2) Devices, including wearable instruments, for multi-dimensional health monitoring. Multi-modality models are now a trend in AI research; with various devices collecting multi-modality data and a central processor fusing them, a model can monitor a user's overall real-time health condition and give more precise precautions. (3) Automatic discovery of clinical markers for diseases that are difficult to diagnose. Diseases such as ALS are still difficult for clinical doctors to diagnose because they lack an effective general marker; AI may discover common phenomena among these patients and find an effective marker for early diagnosis.

AI-aided drug discovery

Today we have entered the precision medicine era, and new targeted drugs are the cornerstones of precision therapy. However, over the past decades, it has taken an average of over one billion dollars and 10 years to bring a new drug to market. How to accelerate drug discovery and avoid late-stage failure are key concerns for all the big, fiercely competitive pharmaceutical companies. The emerging role of AI, including ML, DL, expert systems, and artificial neural networks (ANNs), has brought new insights and high efficiency to new drug discovery. AI has been adopted in many aspects of drug discovery, including de novo molecule design, structure-based modeling of proteins and ligands, quantitative structure-activity relationship research, and druggability assessment. DL-based AI applications demonstrate superior merit in addressing some challenging problems in drug discovery. Prediction of chemical synthesis routes and chemical process optimization are also valuable in accelerating new drug discovery, as well as in lowering production costs.

There has been notable progress in AI-aided new drug discovery in recent years, both in new chemical entity discovery and in the related business area. Based on DNNs, DeepMind built the AlphaFold platform to predict 3D protein structures, outperforming other algorithms. As an illustration of this achievement, AlphaFold successfully and accurately predicted 25 protein structures from scratch out of a 43-protein panel without using previously built protein models, and accordingly won the CASP13 protein-folding competition in December 2018. 59 Based on GANs and other ML methods, Insilico constructed a modular drug-design platform, the GENTRL system. In September 2019, they reported the discovery of the first de novo active DDR1 kinase inhibitor developed by the GENTRL system; it took the team only 46 days from target selection to an active drug candidate supported by in vivo data. 60 Exscientia and Sumitomo Dainippon Pharma developed a new drug candidate, DSP-1181, for the treatment of obsessive-compulsive disorder on the Centaur Chemist AI platform. In January 2020, DSP-1181 started phase I clinical trials, meaning that the comprehensive exploration from program initiation to phase I study took less than 12 months; in contrast, comparable drug discovery using traditional methods usually needs 4–5 years.

How AI transforms medical practice: A case study of cervical cancer

As the most common malignant tumor in women, cervical cancer has a clear cause and can be prevented, and even cured, if detected early. Conventionally, the screening strategy for cervical cancer mainly adopts the “three-step” model of cervical cytology, colposcopy, and histopathology. 61 However, limited by the available testing methods, the efficiency of cervical cancer screening is not high. In addition, owing to gaps in doctors' knowledge in some primary hospitals, patients cannot always be provided with the best diagnosis and treatment decisions. In recent years, with the advent of the era of computer science and big data, AI has gradually begun to extend into and blend with various fields. In particular, AI has been widely used in a variety of cancers as a new tool for data mining. For cervical cancer, a clinical database with millions of medical records and pathological data has been built, and an AI medical tool set has been developed. 62 Such an AI analysis algorithm gives doctors access to rapid, iterative AI model training. In addition, a prognostic prediction model established by ML, together with a web-based prognostic calculator, has been developed, which can accurately predict the risk of postoperative recurrence and death in cervical cancer patients and thereby better guide decision-making in postoperative adjuvant treatment. 63

AI in materials science

As the cornerstone of modern industry, materials have played a crucial role in the design of revolutionary forms of matter with targeted properties for broad applications in energy, information, biomedicine, construction, transportation, national security, spaceflight, and so forth. Traditional strategies rely on empirical trial-and-error experimental approaches as well as theoretical simulation methods, e.g., density functional theory, thermodynamics, or molecular dynamics, to discover novel materials. 64 These methods often face the challenges of long research cycles, high costs, and low success rates, and thus cannot meet the ever-growing demands of materials science. Accelerating the discovery and deployment of advanced materials will therefore be essential in the coming era.

With the rapid development of data processing and powerful algorithms, AI-based methods, such as ML and DL, are emerging with good potential for searching for and designing new materials prior to actually manufacturing them. 65 , 66 By integrating material property data, such as constituent elements, lattice symmetry, atomic radius, valence, binding energy, electronegativity, magnetism, polarization, energy bands, structure-property relations, and functionalities, machines can be trained to “think” about how to improve material design and even predict the properties of new materials in a cost-effective manner (Figure 5).

[Figure 5: AI is expected to power the development of materials science]

AI in discovery and design of new materials

Recently, AI techniques have made significant advances in the rational design and accelerated discovery of various materials, such as piezoelectric materials with large electrostrains, 67 organic-inorganic perovskites for photovoltaics, 68 molecular emitters for efficient light-emitting diodes, 69 inorganic solid materials for thermoelectrics, 70 and organic electronic materials for renewable-energy applications. 66 , 71 The power of data-driven computing and algorithmic optimization can promote comprehensive applications of simulation and ML (e.g., high-throughput virtual screening, inverse molecular design, Bayesian optimization, and supervised learning) in material discovery and property prediction across various fields. 72 For instance, using a DL Bayesian framework, attribute-driven inverse materials design has been demonstrated for efficient and accurate prediction of functional molecular materials with desired semiconducting properties or redox stability for applications in organic thin-film transistors, organic solar cells, or lithium-ion batteries. 73 It is valuable to adopt automation tools for quick experimental testing of potential materials and to utilize high-performance computing to calculate their bulk, interface, and defect-related properties. 74 The effective convergence of automation, computing, and ML can greatly speed up the discovery of materials. In the future, with the aid of AI techniques, it will be possible to design superconductors, metallic glasses, solder alloys, high-entropy alloys, high-temperature superalloys, thermoelectric materials, two-dimensional materials, magnetocaloric materials, polymeric bio-inspired materials, sensitive composite materials, and topological (electronic and phonon) materials, among others. In the past decade, topological materials have ignited the research enthusiasm of condensed matter physicists, materials scientists, and chemists, as they exhibit exotic physical properties with potential applications in electronics, thermoelectrics, optics, catalysis, and energy-related fields. According to the most recent predictions, more than a quarter of all inorganic materials in nature are topologically nontrivial. The establishment of topological electronic materials databases 75 , 76 , 77 and topological phononic materials databases 78 using high-throughput methods will help accelerate the screening and experimental discovery of new topological materials for functional applications. It is recognized that large-scale, high-quality datasets are required to practice AI, and great efforts have been expended in building high-quality materials science databases. As one of the top-ranking databases of its kind, the “atomly.net” materials data infrastructure 79 has calculated the properties of more than 180,000 inorganic compounds, including their equilibrium structures, electron energy bands, dielectric properties, simulated diffraction patterns, elasticity tensors, etc. As such, the atomly.net database has set a solid foundation for extending AI into materials science research. The X-ray diffraction (XRD) matcher model of atomly.net uses ML to match and classify experimental XRD patterns against simulated ones. Very recently, using the atomly.net dataset, an accurate AI model was built that rapidly predicts the formation energy of almost any given compound with fairly good predictive ability. 80 (A toy property-prediction sketch follows.)
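As a toy sketch of this kind of property-prediction workflow (assuming scikit-learn; the descriptors and labels below are synthetic stand-ins for composition-derived features and DFT-computed formation energies from databases such as atomly.net):

```python
# Toy regression from composition-derived descriptors to formation energy.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.random((500, 6))   # e.g., mean electronegativity, atomic radius, ...
y = 2 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=500)  # stand-in labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("held-out R^2:", round(model.score(X_te, y_te), 3))
```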

AI-powered Materials Genome Initiative

The Materials Genome Initiative (MGI) is a grand plan for the rational realization of new materials and related functions; it aims to discover, manufacture, and deploy advanced materials efficiently, cost-effectively, and intelligently. The initiative creates policy, resources, and infrastructure for accelerating materials development at a high level. It is a new paradigm for the discovery and design of next-generation materials: it runs from fundamental building blocks toward general materials development and accelerates that development through highly integrated, high-throughput efforts in theory, computation, and experiment. MGI sets an ambitious goal and standard for future materials development and materials science. The spirit of MGI is to design novel materials from data pools and powerful computation once the requirements or aspirations for functional usage appear. Theory, computation, and algorithms are the primary and substantial factors in the establishment and implementation of MGI. Advances in theories, computations, and experiments in materials science and engineering provide the foundation not only to accelerate the speed at which new materials are realized but also to shorten the time needed to push new products to market. AI techniques hold great promise for the developing MGI: the application of new technologies, such as ML and DL, directly accelerates both materials research and the establishment of MGI. Model construction and application to science and engineering, as well as the data infrastructure, are of central importance. When AI-powered MGI approaches are coupled with the ongoing autonomy of manufacturing methods, the potential future impact on society and the economy is profound. We are now beginning to see that AI-aided MGI, among other things, integrates experiments, computation, and theory, facilitates access to materials data, equips the next generation of the materials workforce, and enables a paradigm shift in materials development. Furthermore, AI-powered MGI could also design operational procedures and control equipment to execute experiments, further realizing autonomous experimentation in future materials research.

Advanced functional materials for generation upgrade of AI

The realization and application of AI techniques depend on computational capability and computer hardware, grounding physical functionality in the performance of computers or supercomputers. In current technology, the electric currents or carriers driving chips and devices consist of electrons with ordinary characteristics, such as heavy mass and low mobility. All chips and devices therefore emit considerable heat, consuming too much energy and lowering the efficiency of information transmission. Benefiting from the rapid development of modern physics, a series of advanced materials with exotic functional effects have been discovered or designed, including superconductors, quantum anomalous Hall insulators, and topological fermions. In particular, superconducting states or topologically nontrivial electrons will promote next-generation AI techniques once (near) room-temperature applications of these states are realized and implanted in integrated circuits. 81 In that case, central processing units, signal circuits, and power channels will be driven by electronic carriers that are massless, energy-diffusionless, ultra-high-mobility, or chirality-protected. Ordinary electrons will be removed from the physical circuits of future-generation chips and devices, leaving superconducting and topological chiral electrons running in future AI chips and supercomputers. The efficiency of transmission for information and logic computing will improve on a vast scale and at very low cost.

AI for materials and materials for AI

The coming decade will continue to witness the development of advanced ML algorithms, newly emerging data-driven AI methodologies, and integrated technologies for facilitating structure design and property prediction, as well as for accelerating the discovery, design, development, and deployment of advanced materials into existing and emerging industrial sectors. At this moment, we face challenges in achieving accelerated materials research through the integration of experiment, computation, and theory. The MGI, proposed for high-level materials research, helps to promote this process, especially when assisted by AI techniques. Still, there is a long way to go before these advanced functional materials can be used in future-generation chips and devices; more materials and functional effects need to be discovered or improved by developing AI techniques. Meanwhile, it is worth noting that materials are the core components of the devices and chips used to build computers and machines for advanced AI systems. The rapid development of new materials, especially the emergence of flexible, sensitive, and smart materials, is of great importance for a broad range of attractive technologies, such as flexible circuits, stretchable tactile sensors, multifunctional actuators, transistor-based artificial synapses, integrated networks of semiconductor/quantum devices, intelligent robotics, human-machine interaction, simulated muscles, biomimetic prostheses, etc. These promising materials, devices, and integrated technologies will greatly promote the advancement of AI systems toward wide application in human life. Once physical circuits are upgraded with advanced functional or smart materials, AI techniques will in turn promote developments and applications across all disciplines.

AI in geoscience

AI technologies are involved in a wide range of geoscience fields

Momentous challenges threatening current society require solutions to problems that belong to geoscience, such as evaluating the effects of climate change, assessing air quality, forecasting the effects of disasters on infrastructure, calculating the future consumption and availability of food, water, and soil resources, and identifying indicators of potential volcanic eruptions, tsunamis, floods, and earthquakes. 82 , 83 Much has become possible with the emergence of advanced technology products (e.g., deep-sea drilling vessels and remote sensing satellites), enhancements in computational infrastructure that allow large-scale, wide-range simulations of multiple geoscience models, and internet-based data analysis that facilitates the collection, processing, and storage of data in distributed and crowd-sourced environments. 84 The growing availability of massive geoscience data provides unlimited possibilities for AI—which has already permeated all aspects of our daily life (e.g., entertainment, transportation, and commerce)—to contribute significantly to geoscience problems of great societal relevance. As geoscience enters the era of massive data, AI, which has been extensively successful in other fields, offers immense opportunities for settling a series of problems in Earth systems. 85 , 86 Accompanied by diversified data, AI-enabled technologies, such as smart sensors, image visualization, and intelligent inversion, are being actively examined in a wide range of geoscience fields, such as marine geoscience, rock physics, geology, ecology, seismicity, environment, hydrology, remote sensing, ArcGIS, and planetary science. 87

Multiple challenges in the development of geoscience

Several traits of geoscience restrict the applicability of fundamental algorithms for knowledge discovery: (1) the inherent challenges of geoscience processes, (2) the limitations of geoscience data collection, and (3) uncertainty in samples and ground truth. 88 , 89 , 90 Amorphous boundaries in space and time generally exist between geoscience objects, which are not as well defined as objects in other fields. Geoscience phenomena are also significantly multivariate, obey nonlinear relationships, and exhibit spatiotemporal structure and non-stationary characteristics. Beyond the inherent challenges of geoscience observations, massive data across multiple dimensions of time and space, with varying levels of incompleteness, noise, and uncertainty, complicate geoscience processes. For supervised learning approaches, there are further difficulties owing to the lack of gold-standard ground truth and the “small size” of samples (e.g., a small amount of historical data with sufficient observations) in geoscience applications.

Usage of AI technologies as efficient approaches to promote the geoscience processes

Geoscientists continually strive to develop better techniques for simulating the present status of the Earth system (e.g., how much greenhouse gas is released into the atmosphere) and the connections between and within its subsystems (e.g., how elevated temperature influences the ocean ecosystem). Viewed from the perspective of geoscience, newly emerging AI-aided approaches fit these application areas well: (1) characterizing objects and events 91 ; (2) estimating geoscience variables from observations 92 ; (3) forecasting geoscience variables according to long-term observations 85 ; (4) exploring relationships in geoscience data 93 ; and (5) causal discovery and causal attribution. 94 While characterizing geoscience objects and events with traditional methods is primarily rooted in hand-coded features, algorithms can detect patterns in the data automatically, improving performance with pattern-mining techniques. However, because spatiotemporal targets have vague boundaries and associated uncertainties, it is necessary to advance pattern-mining methods that can explain the temporal and spatial characteristics of geoscience data when characterizing different events and objects. To address the non-stationarity of geoscience data, AI-aided algorithms have been extended to integrate the holistic results of professional predictors and produce robust estimations of climate variables (e.g., humidity and temperature). Furthermore, forecasting long-term trends of the current situation in the Earth system using AI-enabled technologies can simulate future scenarios and inform early resource planning and adaptation policies. Mining relationships in geoscience data can help us seize vital signs of the Earth system and deepen our understanding of geoscience developments. Of great interest is the advancement of AI decision methodology under uncertain prediction probabilities, which engenders vague risks with poorly resolved tails signifying the most extreme, transient, and rare events formulated by model sets, supporting various cases to improve accuracy and effectiveness.

AI technologies for optimizing the resource management in geoscience

Currently, AI can perform better than humans in some well-defined tasks. For example, AI techniques have been used in urban water resource planning, mainly due to their remarkable capacity for modeling, flexibility, reasoning, and forecasting of water demand and capacity. The design and application of an Adaptive Intelligent Dynamic Water Resource Planning system, a subset of AI for sustainable water resource management in urban regions, has largely promoted the optimization of water resource allocation and will ultimately minimize operating costs and improve the sustainability of environmental management 95 (Figure 6). Meteorology also requires collecting tremendous amounts of data on many different variables, such as humidity, altitude, and temperature, and dealing with such huge datasets is a big challenge. 96 AI-based techniques are being utilized to analyze shallow-water reef images and recognize coral color—to track the effects of climate change—and to collect humidity, temperature, and CO 2 data—to grasp the health of our ecological environment. 97 Beyond meteorology, AI can also play a critical role in decreasing greenhouse gas emissions from the electric-power sector. Comprising the production, transportation, allocation, and consumption of electricity, the electric-power sector offers many opportunities for AI applications, including speeding up the development of new clean energy, enhancing system optimization and management, improving electricity-demand forecasts and distribution, and advancing system monitoring. 98 New materials may even be found, with the aid of AI, for batteries to store energy or to absorb CO 2 from the atmosphere. 99 Although traditional fossil fuel operations have been widely used for thousands of years, AI techniques are helping to explore more potentially sustainable energy sources (e.g., fusion technology). 100

Figure 6. Applications of AI in hydraulic resource management.

In addition to the adjustment of energy structures in response to climate change (a core part of geoscience systems), a second, less obvious step can also be taken to reduce greenhouse gas emissions: using AI to target inefficiencies. A statistical report by the Lawrence Livermore National Laboratory pointed out that around 68% of the energy produced in the US is lost as waste rather than being put to purposeful use, such as electricity generation or transportation, adding to environmental burdens. 101 AI is primed to reduce these inefficiencies in current nuclear power plants and fossil fuel operations, as well as to improve the efficiency of renewable grid resources. 102 For example, AI can be instrumental in the operation and optimization of solar and wind farms, making these utility-scale renewable-energy systems far more efficient at producing electricity. 103 AI can also help reduce energy losses in electricity transmission and distribution. 104 A distribution system operator in Europe used AI to analyze load, voltage, and network distribution data to help "operators assess available capacity on the system and plan for future needs." 105 AI allowed the operator to deploy existing and new resources so that the distribution of energy assets became more readily available and flexible. The International Energy Agency has proposed that energy efficiency is core to the reform of energy systems and will play a key role in reducing the growth of global energy demand to one-third of the current level by 2040.

AI as a building block to promote development in geoscience

The Earth system is of significant scientific interest and affects all aspects of life. 106 The challenges, problems, and promising directions outlined here are by no means exhaustive but serve to illustrate the great potential for future AI research in this important field. The prosperity, development, and popularization of AI approaches in the geosciences are commonly driven by a posed scientific question, and the best way to succeed is for AI researchers to work closely with geoscientists at all stages of research. Geoscientists can better judge which scientific questions are important and novel, which sample collection processes can reasonably capture the underlying phenomena, which datasets and parameters can be used to answer a question, and which pre-processing operations should be conducted, such as removing seasonal cycles or smoothing. AI researchers, in turn, are better placed to decide which data analysis approaches are appropriate and available for the data, the advantages and disadvantages of these approaches, and what the approaches actually learn. Interpretability is also an important goal in geoscience because, if we can understand the basic reasoning behind the models, patterns, or relationships extracted from the data, they can be used as building blocks in scientific knowledge discovery. Hence, frequent communication between the researchers avoids long detours and ensures that analysis results are genuinely beneficial to both geoscientists and AI researchers.

AI in the life sciences

The developments of AI and the life sciences are intertwined. The ultimate goal of AI is to achieve human-like intelligence, as the human brain is capable of multi-tasking, learning with minimal supervision, and generalizing learned skills, all accomplished with high efficiency and low energy cost. 107

Mutual inspiration between AI and neuroscience

Over the past decades, neuroscience concepts have been introduced into ML algorithms and have played critical roles in triggering several important advances in AI. For example, the origins of DL methods lie directly in neuroscience, 5 which further stimulated the emergence of the field of RL. 108 Current state-of-the-art CNNs incorporate several hallmarks of neural computation, including nonlinear transduction, divisive normalization, and maximum-based pooling of inputs, 109 which were directly inspired by the unique processing of visual input in the mammalian visual cortex. 110 By introducing the brain's attentional mechanisms, a novel network has been shown to achieve greater accuracy and computational efficiency on difficult multi-object recognition tasks than conventional CNNs. 111 Other neuroscience findings, including the mechanisms underlying working memory, episodic memory, and neural plasticity, have inspired the development of AI algorithms that address several challenges in deep networks. 108 These algorithms can be directly applied in the design and refinement of brain-machine interfaces and neuroprostheses.

On the other hand, insights from AI research have the potential to offer new perspectives on the basics of intelligence in the brains of humans and other species. Unlike traditional neuroscientists, AI researchers can formalize concepts of neural mechanisms in a quantitative language and thereby probe their necessity and sufficiency for intelligent behavior. An important illustration of such exchange is the development of temporal-difference (TD) methods in RL models and the discovery of TD-like learning signals in the brain. 112 Reflecting this mutual inspiration, the China Brain Project covers both basic research on cognition and translational research for brain disease and brain-inspired intelligence technology. 113
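
To make the TD idea concrete, the following is a minimal sketch of tabular TD(0) value learning on a toy random-walk chain; the environment, reward scheme, and parameters are illustrative, not drawn from the cited studies.

```python
import numpy as np

# Tabular TD(0) on a 5-state random-walk chain: states 0..4, episodes start
# in state 2, moves are left/right at random; reaching state 4 gives reward 1,
# state 0 gives reward 0, and both ends terminate the episode.
n_states = 5
V = np.zeros(n_states)        # value estimates
alpha, gamma = 0.1, 1.0       # learning rate, discount factor
rng = np.random.default_rng(0)

for episode in range(5000):
    s = 2
    while s not in (0, n_states - 1):
        s_next = s + rng.choice([-1, 1])
        r = 1.0 if s_next == n_states - 1 else 0.0
        # TD(0) update: V(s) <- V(s) + alpha * [r + gamma * V(s') - V(s)]
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
        s = s_next

print(V.round(2))  # interior values approach the true [0.25, 0.5, 0.75]
```

The bracketed term is the TD error, the quantity whose neural correlate (dopaminergic prediction-error signaling) motivated the comparison with learning in the brain.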

AI for omics big data analysis

Currently, AI can perform better than humans in some well-defined tasks, such as omics data analysis and smart agriculture. In the big data era, 114 data come in many types (variety), in large volumes (volume), and at high generation speeds (velocity). This high variety, large volume, and fast velocity make data extremely valuable but also difficult to analyze. Unlike traditional statistics-based methods, AI can readily handle such big data and reveal hidden associations.

In genetics studies, there are many successful applications of AI. 115 One key question is determining whether a single amino acid polymorphism is deleterious. 116 Methods such as the sequence-conservation-based SIFT 117 and the network-based SySAP 118 have been developed, but they have hit bottlenecks and proved difficult to improve further. Sundaram et al. developed PrimateAI, which uses DNNs to predict the clinical outcome of mutations. 119 Another problem is how to call copy-number variations, which play important roles in various cancers. 120 , 121 Glessner et al. proposed DeepCNV, a DL-based tool whose area under the receiver operating characteristic (ROC) curve was 0.909, much higher than that of other ML methods. 122 In epigenetic studies, m6A modification is one of the most important mechanisms. 123 Zhang et al. developed an ensemble DL predictor (EDLm6APred) for mRNA m6A site prediction. 124 The area under the ROC curve of EDLm6APred was 86.6%, higher than that of existing m6A methylation site prediction models. There are many other DL-based omics tools, such as DeepCpG 125 for methylation, DeepPep 126 for proteomics, AtacWorks 127 for assay for transposase-accessible chromatin with high-throughput sequencing, and DeepTCR 128 for T cell receptor sequencing.
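
As a toy illustration of how such classifiers are benchmarked, the sketch below computes the ROC AUC for a binary variant classifier with scikit-learn; the data are synthetic and a simple linear model stands in for the deep networks described above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for variant features and deleterious/benign labels;
# real tools such as DeepCNV use far richer inputs and deep models.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
y = (X[:, :3].sum(axis=1) + 0.5 * rng.normal(size=2000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]          # predicted probability of class 1
print("ROC AUC:", roc_auc_score(y_te, scores))  # 1.0 = perfect, 0.5 = chance
```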

Another emerging application is DL for single-cell sequencing data. Unlike bulk data, in which the sample size is usually much smaller than the number of features, single-cell data can contain numbers of cells comparable to, or larger than, the number of genes, which makes DL algorithms applicable to most single-cell datasets. Because single-cell data are sparse, with many unmeasured (missing) values, tools such as DeepImpute are used to accurately impute these missing values in the large gene × cell matrix. 129 During quality control of single-cell data, it is also important to remove doublets; the tool Solo embeds cells using an autoencoder and then builds a feedforward neural network on the embedding to identify doublets. 130 The "potential energy underlying single-cell gradients" approach uses generative modeling to learn the underlying differentiation landscape from time-series single-cell RNA sequencing data. 131
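
A minimal sketch of the autoencoder-style imputation idea follows; this is not DeepImpute's actual architecture, and the matrix shapes, masking scheme, and hyperparameters are all illustrative.

```python
import torch
import torch.nn as nn

# Toy gene x cell imputation: train a small autoencoder on observed entries
# only, then fill unobserved entries with the reconstruction.
torch.manual_seed(0)
n_cells, n_genes = 500, 200
true_expr = torch.rand(n_cells, n_genes)
observed_mask = (torch.rand(n_cells, n_genes) > 0.3).float()  # ~30% dropout
X = true_expr * observed_mask                                  # zeros = missing

model = nn.Sequential(
    nn.Linear(n_genes, 64), nn.ReLU(),   # encoder to a low-dim bottleneck
    nn.Linear(64, n_genes),              # decoder back to gene space
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    recon = model(X)
    # Masked loss: fit only observed entries; the network learns gene-gene
    # correlations and uses them to fill in the unobserved ones.
    loss = ((recon - X) ** 2 * observed_mask).sum() / observed_mask.sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

imputed = torch.where(observed_mask.bool(), X, model(X).detach())
```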

In protein structure prediction, the DL-based AlphaFold2 can accurately predict the 3D structures of 98.5% of human proteins and is expected to predict the structures of 130 million proteins of other organisms in the coming months. 132 It has even been considered the second-largest breakthrough in the life sciences after the Human Genome Project 133 and will facilitate drug development, among other applications.

AI makes modern agriculture smart

Agriculture is entering a fourth revolution, termed agriculture 4.0 or smart agriculture, benefiting from the arrival of the big data era as well as rapid progress in many advanced technologies, in particular ML and modern information and communication technologies. 134 , 135 Applications of DL, information, and sensing technologies in agriculture cover all stages of agricultural production, including breeding, cultivation, and harvesting.

Traditional breeding usually exploits genetic variation by searching natural variation or using artificial mutagenesis. However, neither method can expose the whole mutation spectrum. Using DL models trained on existing variants, predictions can be made for multiple unidentified gene loci. 136 For example, an ML method, the multi-criteria rice reproductive gene predictor, was developed and applied to predict coding and lincRNA genes associated with reproductive processes in rice. 137 Moreover, models trained on species with well-studied genomic data (such as Arabidopsis and rice) can also be applied to species with limited genome information (such as wild strawberry and soybean). 138 In most cases, the links between genotypes and phenotypes are more complicated than expected: one gene often contributes to multiple phenotypes, and one trait is generally the product of synergy among multiple genes and developmental processes. For this reason, multi-trait DL models have been developed and have enabled genome editing in plant breeding. 139 , 140

It is well known that dynamic and accurate monitoring of crops throughout the whole growth period is vitally important to precision agriculture. In this new stage of agriculture, both remote sensing and DL play indispensable roles. Remote sensing (including proximal sensing) can produce agricultural big data from ground-based, airborne, and spaceborne platforms, offering an economical approach to non-destructive, timely, objective, synoptic, long-term, and multi-scale information for crop monitoring and management, thereby greatly assisting precision decisions regarding irrigation, nutrients, disease, pests, and yield. 141 , 142 DL makes it possible to discover knowledge from massive and complicated data simply, efficiently, and accurately, and is especially suited to remote sensing big data characterized by rich spatial-temporal-spectral information, owing to its strong capability for feature representation and its superiority in capturing the essential relations between observational data and agronomic parameters or crop traits. 135 , 143 The integration of DL and big data for agriculture could prove as disruptive as the green revolution. As shown in Figure 7 , in a possible smart agriculture scenario, multi-source satellite remote sensing data with various geometric and radiometric information, as well as abundant spectral information spanning the UV, visible, shortwave infrared, and microwave regions, can be collected. In addition, advanced aircraft systems, such as unmanned aerial vehicles with multi/hyperspectral cameras on board, and smartphone-based portable devices can be used to obtain multi/hyperspectral data in specific fields. All of these data can be integrated by DL-based fusion techniques for different purposes and then shared with all users via cloud computing. On the cloud computing platform, agricultural remote sensing models developed by combining data-driven ML methods and physical models can be deployed to retrieve a range of biophysical and biochemical crop parameters, which are then analyzed by a decision-making and prediction system to assess current water and nutrient stress and growth status and to predict future development. As a result, an automatic or interactive user service platform becomes accessible for making correct decisions and taking appropriate actions through an integrated irrigation and fertilization system.

Figure 7. Integration of AI and remote sensing in smart agriculture.

Furthermore, DL presents unique advantages in specific agricultural applications, such as dense scenes, which increase the difficulty of manual planting and harvesting. CNN and autoencoder models trained on image data are increasingly being used for phenotyping and yield estimation, 144 including counting fruits in orchards, recognizing and classifying grains, and diagnosing diseases. 145 , 146 , 147 Consequently, these methods may greatly reduce the demand for manual labor.

The application of DL in agriculture is just beginning, and many problems and challenges remain for its future development. We believe that, with the continuous acquisition of massive data and the optimization of algorithms, DL will have a bright future in agricultural production.

AI in physics

The scales of modern physics range from the size of a neutron to the size of the Universe ( Figure 8 ). Accordingly, physics can be divided into four categories: particle physics at the scale of neutrons, nuclear physics at the scale of atoms, condensed matter physics at the scale of molecules, and cosmic physics at the scale of the Universe. AI, in particular ML, plays an important role in physics at every scale, as AI algorithms are becoming a mainstay of data analysis, for example in the reconstruction and analysis of images.

Figure 8. The scales of physics.

Speeding up simulations and identifications of particles with AI

There are many applications, or explorations of applications, of AI in particle physics. We cannot cover all of them here; instead we use lattice quantum chromodynamics (LQCD) and the experiments at the Beijing Spectrometer (BES) and the Large Hadron Collider (LHC) to illustrate the power of ML in both theoretical and experimental particle physics.

LQCD studies the nonperturbative properties of QCD using Monte Carlo simulations on supercomputers, helping us understand the strong interaction that binds quarks together to form nucleons. The Markov chain Monte Carlo simulations commonly used in LQCD suffer from topological freezing and critical slowing down as the simulations approach physical conditions. New algorithms aided by DL are being proposed and tested to overcome these difficulties. 148 , 149 Physical observables are extracted from LQCD data, whose signal-to-noise ratio deteriorates exponentially. For non-Abelian gauge theories such as QCD, complicated contour deformations can be optimized using ML to reduce the variance of LQCD data; proof-of-principle applications in two dimensions have been studied. 150 ML can also be used to reduce the time cost of generating LQCD data. 151
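
To make the sampling step concrete, here is a minimal sketch of the Metropolis Markov chain Monte Carlo algorithm on the 2D Ising model, a standard classroom stand-in for the far heavier gauge-field sampling used in LQCD; the lattice size and temperature are illustrative.

```python
import numpy as np

# Metropolis MCMC on a 16x16 Ising lattice with periodic boundaries.
rng = np.random.default_rng(0)
L, beta = 16, 0.44          # lattice size, inverse temperature
spins = rng.choice([-1, 1], size=(L, L))

def delta_energy(s, i, j):
    # Energy change from flipping spin (i, j): 2 * s_ij * (sum of neighbors).
    nb = (s[(i + 1) % L, j] + s[(i - 1) % L, j]
          + s[i, (j + 1) % L] + s[i, (j - 1) % L])
    return 2 * s[i, j] * nb

for sweep in range(2000):
    for _ in range(L * L):
        i, j = rng.integers(L), rng.integers(L)
        dE = delta_energy(spins, i, j)
        # Metropolis rule: always accept downhill moves, otherwise accept
        # with probability exp(-beta * dE), preserving the Boltzmann measure.
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1

print("magnetization per spin:", abs(spins.mean()))
```

The "critical slowing down" mentioned above refers to the fact that successive configurations from such chains become increasingly correlated near criticality, which is precisely where ML-accelerated proposals aim to help.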

On the experimental side, particle identification (PID) plays an important role. Recently, several PID algorithms have been developed for BES-III, among them an ANN-based approach. 152 Extreme gradient boosting has also been used for multi-dimensional distribution reweighting, muon identification, and cluster reconstruction, and has improved muon identification. U-Net, a convolutional network for pixel-level semantic segmentation widely used in computer vision, has been applied at BES-III to solve the problem of multi-turn curling track finding in the main drift chamber; the average efficiency and purity for the first turn's hits are about 91% at a threshold of 0.85. Current (and future) particle physics experiments produce enormous amounts of data, and machine learning can be used to discriminate signal from overwhelming background events. Examples of data analyses at the LHC using supervised ML can be found in a 2018 collaboration paper. 153 To exploit the potential advantage of quantum computers, quantum ML methods are also being investigated; see, for example, Wu et al. 154 and references therein for proof-of-concept studies.
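
The signal-versus-background task reduces to supervised binary classification; below is a minimal sketch with a gradient-boosted classifier on synthetic kinematic-style features (real analyses use detector-level variables and far larger samples).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic "events": background features are exponential; signal events get
# an additional shift, mimicking a resonance sitting on a smooth background.
rng = np.random.default_rng(0)
n = 5000
background = rng.exponential(scale=1.0, size=(n, 4))
signal = rng.exponential(scale=1.0, size=(n, 4)) + rng.normal(1.0, 0.3, size=(n, 4))
X = np.vstack([background, signal])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 0 = background, 1 = signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(n_estimators=200, max_depth=3).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```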

AI makes nuclear physics powerful

Cosmic-ray muon tomography (muography) 155 is an imaging technology that uses natural cosmic-ray muon radiation rather than artificial radiation, thereby reducing radiation hazards. As an advantage, this technology can detect high-Z materials non-destructively, because muons are sensitive to high-Z materials. The Classification Model Algorithm (CMA) is based on supervised classification and gray system theory: it constructs a binary classifier and decision function that takes muon tracks as input and outputs whether material is present at a given location. AI thus helps users improve the efficiency of muon scanning.

Also, in nuclear detection, the Cs 2 LiYCl 6 :Ce (CLYC) scintillator responds to both electrons and neutrons, producing pulse signals, and can therefore be used to detect both particle types; 156 however, the two particles must be distinguished by analyzing the pulse shapes, that is, n-γ identification. The traditional approach has been pulse shape discrimination (PSD), which separates the pulses of the two particles by analyzing the distributions of pulse characteristics, such as amplitude, width, rise time, and fall time; the two particles can be separated when these distributions form two well-separated Gaussians. Traditional PSD can only analyze single-pulse waveforms, not the multi-pulse waveforms produced when two particles interact with the CLYC in quick succession. This limitation can be overcome by using an ANN to classify pulses into six categories (n, γ, n + n, n + γ, γ + n, γ + γ). Several pulse parameters can also be exploited by AI to improve the reconstruction algorithm, yielding higher efficiency and lower error.
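
A minimal sketch of the six-class ANN discrimination follows; the features mimic the pulse descriptors named above (amplitude, width, rise time, fall time), but the synthetic data generator is ours, not real CLYC waveforms.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
classes = ["n", "g", "n+n", "n+g", "g+n", "g+g"]
# Give each pulse category its own (illustrative) mean feature vector.
means = rng.uniform(0.5, 2.0, size=(6, 4))
X = np.vstack([rng.normal(m, 0.15, size=(1000, 4)) for m in means])
y = np.repeat(np.arange(6), 1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)

pred = clf.predict(X_te[:1])[0]
print("example prediction:", classes[pred],
      "| test accuracy:", clf.score(X_te, y_te))
```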

AI-aided condensed matter physics

AI opens up new avenues for physical science, especially when large amounts of data are available. Recent work demonstrates that ML provides useful insights for improving density functional theory (DFT), in which the single-electron picture of the Kohn-Sham scheme struggles to capture the exchange and correlation effects of many-body systems. Yu et al. proposed a Bayesian optimization algorithm to fit the Hubbard U parameter; the new method finds the optimal Hubbard U through a self-consistent process with good efficiency compared with the linear response method, 157 boosting accuracy to near the hybrid-functional level. Snyder et al. developed an ML density functional for a 1D non-interacting, non-spin-polarized fermion system that yields significantly improved kinetic energies. This method enables a direct approximation of the kinetic energy of a quantum system, can be used in orbital-free DFT modeling, and can even bypass solving the Kohn-Sham equations while maintaining quantum-chemical precision when a strong correlation term is included. Recently, FermiNet showed that the many-body Schrödinger equation can be solved via AI. AI models also show advantages in capturing interatomic force fields. In 2010, the Gaussian approximation potential (GAP) 158 was introduced as a powerful interatomic force field to describe the interactions between atoms. GAP uses kernel regression with invariant many-body representations and performs quite well; for instance, it can simulate the crystallization of amorphous material under high pressure fairly accurately. By employing the smooth overlap of atomic positions (SOAP) kernel, 159 the accuracy of the potential can be further enhanced, so SOAP-GAP can be viewed as a field-leading method for AI molecular dynamics simulation. Several other well-developed AI interatomic potentials exist: crystal graph CNNs provide a widely applicable way of vectorizing crystalline materials; SchNet embeds continuous-filter convolutional layers into its DNNs, easing molecular dynamics because the resulting potentials are spatially continuous; and DimeNet constructs a directional message-passing neural network that incorporates not only bond lengths but also bond angles, dihedral angles, and interactions between unconnected atoms, achieving good accuracy.
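
The kernel-regression idea behind GAP-style potentials can be shrunk to a toy: learn the energy of a diatomic system as a function of bond length. Real GAP uses invariant many-body descriptors (e.g., SOAP), not the raw distance used here, and the Lennard-Jones "ground truth" is purely illustrative.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def lj_energy(r):
    # Lennard-Jones potential standing in for expensive ab initio energies.
    return 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)

rng = np.random.default_rng(0)
r_train = rng.uniform(0.9, 2.5, size=(200, 1))            # sampled bond lengths
e_train = lj_energy(r_train).ravel() + rng.normal(0, 0.01, 200)  # noisy "DFT" data

# Kernel ridge regression with an RBF kernel: the ML surrogate for the
# potential energy surface. Hyperparameters are illustrative.
model = KernelRidge(kernel="rbf", gamma=10.0, alpha=1e-6).fit(r_train, e_train)

r_test = np.linspace(0.95, 2.4, 5).reshape(-1, 1)
print(np.c_[r_test, model.predict(r_test), lj_energy(r_test)])
```

Once fitted, such a surrogate is orders of magnitude cheaper to evaluate than the underlying quantum calculation, which is what makes ML-driven molecular dynamics practical.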

AI helps explore the Universe

AI is one of the newest technologies, while astronomy is one of the oldest sciences. When the two meet, new opportunities for scientific breakthroughs are often triggered. Observations and data analysis play a central role in astronomy. The amount of data collected by modern telescopes has reached unprecedented levels; even the most basic task of constructing a catalog has become challenging with traditional source-finding tools. 160 Astronomers have developed automated and intelligent source-finding tools based on DL, which not only offer significant advantages in operational speed but also facilitate a more comprehensive understanding of the Universe by identifying particular classes of objects that cannot be detected by traditional software and visual inspection. 160 , 161

More than a decade ago, a citizen science project called "Galaxy Zoo" was launched to help label one million images of galaxies collected by the Sloan Digital Sky Survey (SDSS) by posting the images online and recruiting volunteers. 162 Larger optical telescopes, in operation or under construction, produce data volumes several orders of magnitude larger than SDSS. Even with volunteers involved, there is no way to analyze such vast amounts of data manually. The advantages of ML are not limited to source-finding and galaxy classification; it has a much wider range of applications. For example, CNNs play an important role in detecting and decoding gravitational wave signals in real time, reconstructing all parameters within 2 ms, whereas traditional algorithms take several days to accomplish the same task. 163 Such DL systems have also been used to automatically generate alerts for transients and to track asteroids and other fast-moving near-Earth objects, improving detection efficiency by several orders of magnitude. In addition, astrophysicists are exploring the use of neural networks to measure galaxy clusters and study the evolution of the Universe.

Beyond their remarkable speed, neural networks appear to capture deeper structure in the data than expected and can recognize more complex patterns, suggesting that the "machine" is generalizing rather than merely learning the surface characteristics of the input data.

AI in chemistry

Chemistry plays an important "central" role among the sciences 164 because it investigates the structure and properties of matter and identifies the chemical reactions that convert substances into other substances. Accordingly, chemistry is a data-rich branch of science containing complex information resulting from centuries of experiments and, more recently, decades of computational analysis. This vast treasure trove of data is most apparent in the Chemical Abstracts Service, which has collected more than 183 million unique organic and inorganic substances, including alloys, coordination compounds, minerals, mixtures, polymers, and salts, and is expanding by thousands of new substances daily. 165 The unlimited complexity in the variety of material compounds explains why chemistry research remains a labor-intensive task. This level of complexity, together with the vast amounts of data within chemistry, provides a prime opportunity for significant breakthroughs through the application of AI. First, the types of molecules that can be constructed from atoms are almost unlimited, leading to an unlimited chemical space 166 ; the interconnection of these molecules with all possible combinations of factors, such as temperature, substrates, and solvents, is overwhelmingly large, giving rise to an unlimited reaction space. 167 Exploring the unlimited chemical and reaction spaces, and navigating to the optima with desired properties, is thus practically impossible by human effort alone. Second, in chemistry, the huge assortment of molecules and their interplay with external environments bring a new level of complexity that cannot simply be predicted from physical laws. While many concepts, rules, and theories have been generalized from centuries of experience studying trivial (i.e., single-component) systems, nontrivial complexity is more likely, as we have discovered that "more is different," in the words of Philip Warren Anderson, American physicist and Nobel laureate. 168 Nontrivial complexity arises when the scale changes and symmetry breaks in larger, increasingly complex systems, and the governing rules shift from quantitative to qualitative. Lacking a systematic and analytical theory of the structures, properties, and transformations of macroscopic substances, chemistry research has largely been guided by heuristics and fragmentary rules accumulated over the previous centuries, with progress proceeding through trial and error. ML can recognize patterns in large amounts of data, thereby offering an unprecedented way of dealing with this complexity and reshaping chemistry research by revolutionizing how data are used. Every subfield of chemistry currently utilizes some form of AI, including tools for chemistry research and data generation, such as analytical chemistry and computational chemistry, as well as applications in organic chemistry, catalysis, and medicinal chemistry, which we discuss herein.

AI breaks the limitations of manual feature selection methods

In analytical chemistry, the extraction of information has traditionally relied heavily on feature selection techniques based on prior human experience. Unfortunately, this approach is inefficient, incomplete, and often biased. Automated data analysis based on AI can break the limitations of manual variable selection by learning from large amounts of data. Feature selection through DL algorithms enables information extraction from datasets in NMR, chromatography, spectroscopy, and other analytical tools, 169 thereby improving model prediction accuracy for analysis. These ML approaches will greatly accelerate the analysis of materials, leading to the rapid discovery of new molecules or materials. Raman scattering, for instance, since its discovery in the 1920s, has been widely employed as a powerful vibrational spectroscopy technology capable of providing vibrational fingerprints intrinsic to analytes, thus enabling the identification of molecules. 170 Recently, ML methods have been trained to recognize features in Raman (or surface-enhanced Raman scattering, SERS) spectra and identify analytes by applying DL networks, including ANNs, CNNs, and fully convolutional networks for feature engineering. 171 For example, Leong et al. designed an ML-driven "SERS taster" to simultaneously harness useful vibrational information from multiple receptors for enhanced multiplex profiling of five wine flavor molecules at ppm levels. Principal-component analysis is employed to discriminate alcohols with varying degrees of substitution, and support vector machine discriminant analysis is used to quantitatively classify all flavors with 100% accuracy. 172 Overall, AI techniques offer the first glimmer of hope for a universal method of spectral data analysis that is fast, accurate, objective, and definitive, with attractive advantages in a wide range of applications.
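
A minimal sketch of the PCA-plus-support-vector-machine workflow described above follows, on synthetic "spectra" in which each class is a Gaussian peak at a different position plus noise; real SERS spectra are far richer, and all parameters here are illustrative.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
wavenumbers = np.linspace(0, 1, 300)
peaks = [0.2, 0.4, 0.5, 0.7, 0.85]   # one peak position per "flavor" class
X, y = [], []
for label, p in enumerate(peaks):
    for _ in range(100):
        spectrum = np.exp(-((wavenumbers - p) ** 2) / 0.002)  # Gaussian band
        X.append(spectrum + rng.normal(0, 0.05, wavenumbers.size))
        y.append(label)
X, y = np.array(X), np.array(y)

# PCA compresses the spectra to a few components; the SVM classifies them.
clf = make_pipeline(PCA(n_components=10), SVC(kernel="linear"))
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```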

AI improves the accuracy and efficiency for various levels of computational theory

Complementary to analytical tools, computational chemistry has proven to be a powerful approach for using simulations to understand chemical properties; however, it faces an accuracy-versus-efficiency dilemma that greatly limits its application to real-world chemistry problems. To overcome this dilemma, ML and other AI methods are being applied to improve the accuracy and efficiency of the various levels of theory used to describe effects arising at different time and length scales in the multi-scale modeling of chemical reactions. 173 Many open challenges in computational chemistry can be addressed by ML approaches, for example, solving Schrödinger's equation, 174 developing atomistic 175 or coarse-grained 176 potentials, constructing reaction coordinates, 177 developing reaction kinetics models, 178 and identifying key descriptors for computable properties. 179 Beyond analytical and computational chemistry, several other disciplines of chemistry have applied AI technology to chemical problems. We discuss organic chemistry, catalysis, and medicinal chemistry as examples of where ML has made a significant impact; many examples exist in the literature for other subfields, and AI will continue to deliver breakthroughs in a wide range of chemical applications.

AI enables robotics capable of automating the synthesis of molecules

Organic chemistry studies the structure, properties, and reactions of carbon-based molecules. The complexity of chemical and reaction space means that, for a given property, an essentially unlimited number of potential molecules could be synthesized. Further complications arise in deciding how to synthesize a particular molecule, since the process relies heavily on heuristics and laborious testing. Researchers have addressed these challenges using AI. Given enough data, any property of interest can be predicted by mapping molecular structure to the corresponding property using supervised learning, without resorting to physical laws. Beyond known molecules, new molecules can be designed by sampling chemical space 180 using methods such as autoencoders and CNNs, with molecules encoded as sequences or graphs. Retrosynthesis, the planning of synthetic routes, once considered an art, has become much simpler with the help of ML algorithms. The Chematica system, 181 for instance, is now capable of autonomously planning synthetic routes that have subsequently been proven to work in the laboratory. Once target molecules and synthetic routes are determined, suitable reaction conditions can be predicted or optimized using ML techniques. 182
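
The structure-to-property mapping can be sketched minimally: encode each molecule as a fingerprint and fit a supervised regressor. Below, RDKit's computed logP is used purely as a stand-in label for an experimental property, and the tiny molecule list and model choice are illustrative.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors
from sklearn.ensemble import RandomForestRegressor

# Encode molecules as Morgan (circular) fingerprints.
smiles = ["CCO", "CCCC", "c1ccccc1", "CC(=O)O", "CCN", "c1ccccc1O",
          "CCCCCC", "CC(C)O", "CCOCC", "c1ccc(cc1)C(=O)O"]
mols = [Chem.MolFromSmiles(s) for s in smiles]
X = np.array([list(AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=1024))
              for m in mols])
# Stand-in "property" labels: RDKit's calculated logP.
y = np.array([Descriptors.MolLogP(m) for m in mols])

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

query = Chem.MolFromSmiles("CCCO")  # an unseen molecule
fp = np.array(list(AllChem.GetMorganFingerprintAsBitVect(query, 2, nBits=1024)))
print("predicted logP:", model.predict(fp.reshape(1, -1))[0])
```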

The integration of these AI-based approaches with robotics has enabled fully AI-guided robotic systems capable of automating the synthesis of small organic molecules without human intervention ( Figure 9 ). 183 , 184

Figure 9. A closed-loop workflow enabling automatic and intelligent design, synthesis, and assay of molecules in organic chemistry by AI.

AI helps to search through vast catalyst design spaces

Catalytic chemistry originates from catalyst technologies in the chemical industry for the efficient and sustainable production of chemicals and fuels. It remains challenging to make novel heterogeneous catalysts with good performance (i.e., stable, active, and selective) because a catalyst's performance depends on many properties: composition, support, surface termination, particle size, particle morphology, atomic coordination environment, porous structure, and reaction conditions in the reactor. This inherent complexity makes discovering and developing catalysts with desired properties heavily dependent on intuition and experiment, which is costly and time consuming. AI technologies such as ML, combined with experimental and in silico high-throughput screening of combinatorial catalyst libraries, can aid catalyst discovery by helping to search through vast design spaces. With well-defined structures and standardized data, including reaction results and in situ characterization results, AI can reveal the complex associations between catalytic structure and catalytic performance. 185 , 186 Accurate descriptors of the effects of molecules, molecular aggregation states, and molecular transport on catalysts could also be predicted. With this approach, researchers can build virtual laboratories to develop new catalysts and catalytic processes.

AI enables screening of chemicals in toxicology with minimal ethical concerns

A more complicated subfield of chemistry is medicinal chemistry, a challenging field owing to the complex interactions between exogenous substances and the inherent chemistry of a living system. Toxicology, for instance, is a broad field that seeks to predict and eliminate substances (e.g., pharmaceuticals, natural products, food products, and environmental substances) that may harm a living organism. Because living organisms are inherently complex, nearly any known substance can cause toxicity at a high enough exposure. Moreover, toxicity depends on an array of other factors, including organism size, species, age, sex, genetics, diet, combination with other chemicals, overall health, and environmental context. Given the scale and complexity of toxicity problems, AI is likely the only realistic approach for meeting regulatory requirements for the screening, prioritization, and risk assessment of chemicals (including mixtures), and it is therefore revolutionizing the toxicology landscape. 187 In summary, AI is turning chemistry from a labor-intensive branch of science into a highly intelligent, standardized, and automated field in which much more can be achieved than under the limitations of human labor. Underlying knowledge, with new concepts, rules, and theories, is expected to advance with the application of AI algorithms, and a large portion of new chemistry knowledge leading to significant breakthroughs is expected to come from AI-based chemistry research in the decades ahead.

Conclusions

This paper has carried out a comprehensive survey of the development and application of AI across a broad range of fundamental sciences, including information science, mathematics, medical science, materials science, geoscience, life science, physics, and chemistry. Although AI is pervasively used in a wide range of applications, ML security risks remain, with both data and models serving as attack targets during the training and inference phases. First, because the performance of an ML system is highly dependent on the data used to train it, these input data are crucial for the security of the system. For instance, adversarial example attacks 188 supply malicious inputs with small, human-imperceptible perturbations that lead the ML system to make false judgments (predictions or categorizations); data poisoning, the intentional manipulation of raw, training, or testing data, can decrease model accuracy or serve other targeted attack goals. Second, attacks on ML models include backdoor attacks on DL, CNNs, and federated learning, which manipulate the model's parameters directly, as well as model stealing, model inversion, and membership inference attacks, which can steal model parameters or leak sensitive training data. While a number of defense techniques against these security threats have been proposed, new attack models targeting ML systems are constantly emerging. It is thus necessary to address the problem of ML security and develop robust ML systems that remain effective under malicious attack.
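
The adversarial-example idea can be shown in a few lines with the fast gradient sign method (FGSM): perturb each input feature a small step in the direction that increases the loss. The toy model below is untrained and epsilon is illustrative; real attacks target trained models.

```python
import torch
import torch.nn as nn

# FGSM sketch: x_adv = clamp(x + epsilon * sign(grad_x loss), 0, 1)
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 784, requires_grad=True)   # stand-in for a flattened image
y_true = torch.tensor([3])                   # its correct label
epsilon = 0.03                               # perturbation budget

loss = loss_fn(model(x), y_true)
loss.backward()                              # populates x.grad
# One signed gradient step per pixel, kept within the valid input range.
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

print("prediction before:", model(x).argmax().item(),
      "after:", model(x_adv).argmax().item())
```

The perturbation is bounded by epsilon in the infinity norm, which is why it can remain imperceptible to humans while still flipping the model's prediction.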

Owing to the data-driven character of ML methods, the training and testing data must be drawn from the same distribution, which is difficult to guarantee in practice: in deployment, the data source may differ from that of the training dataset, and the data distribution may drift over time, degrading model performance. Moreover, if the model is then retrained on new data alone, catastrophic forgetting occurs, meaning the model remembers only the new features and forgets those learned previously. To solve this problem, more and more researchers are studying how to give models the ability of lifelong learning, that is, shifting the computing paradigm from "offline learning + online inference" to "online continual learning," so that the model, like a human, can keep learning throughout its lifetime.

Acknowledgments

This work was partially supported by the National Key R&D Program of China (2018YFA0404603, 2019YFA0704900, 2020YFC1807000, and 2020YFB1313700), the Youth Innovation Promotion Association CAS (2011225, 2012006, 2013002, 2015316, 2016275, 2017017, 2017086, 2017120, 2017204, 2017300, 2017399, 2018356, 2020111, 2020179, Y201664, Y201822, and Y201911), NSFC (nos. 11971466, 12075253, 52173241, and 61902376), the Foundation of State Key Laboratory of Particle Detection and Electronics (SKLPDE-ZZ-201902), the Program of Science & Technology Service Network of CAS (KFJ-STS-QYZX-050), the Fundamental Science Center of the National Nature Science Foundation of China (nos. 52088101 and 11971466), the Scientific Instrument Developing Project of CAS (ZDKYYQ20210003), the Strategic Priority Research Program (B) of CAS (XDB33000000), the National Science Foundation of Fujian Province for Distinguished Young Scholars (2019J06023), the Key Research Program of Frontier Sciences, CAS (nos. ZDBS-LY-7022 and ZDBS-LY-DQC012), the CAS Project for Young Scientists in Basic Research (no. YSBR-005). The study is dedicated to the 10th anniversary of the Youth Innovation Promotion Association of the Chinese Academy of Sciences.

Author contributions

Y.X., Q.W., Z.A., Fei W., C.L., Z.C., J.M.T., and J.Z. conceived and designed the research. Z.A., Q.W., Fei W., Libo.Z., Y.W., F.D., and C.W.-Q. wrote the “ AI in information science ” section. Xin.L. wrote the “ AI in mathematics ” section. J.Q., K.H., W.S., J.W., H.X., Y.H., and X.C. wrote the “ AI in medical science ” section. E.L., C.F., Z.Y., and M.L. wrote the “ AI in materials science ” section. Fang W., R.R., S.D., M.V., and F.K. wrote the “ AI in geoscience ” section. C.H., Z.Z., L.Z., T.Z., J.D., J.Y., L.L., M.L., and T.H. wrote the “ AI in life sciences ” section. Z.L., S.Q., and T.A. wrote the “ AI in physics ” section. X.L., B.Z., X.H., S.C., X.L., W.Z., and J.P.L. wrote the “ AI in chemistry ” section. Y.X., Q.W., and Z.A. wrote the “Abstract,” “ introduction ,” “ history of AI ,” and “ conclusions ” sections.

Declaration of interests

The authors declare no competing interests.

Published Online: October 28, 2021
