U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings

Preview improvements coming to the PMC website in October 2024. Learn More or Try it out now .

  • Advanced Search
  • Journal List
  • Sensors (Basel)

Logo of sensors

Movie Recommender Systems: Concepts, Methods, Challenges, and Future Directions

Sambandam jayalakshmi.

1 Department of Computer Science and Engineering, Vel Tech Multi Tech Dr. Rangarajan Dr. Sakunthala Engineering College, Chennai 600 062, India; gro.hcetitlumhcetlev@imhskalayaj (S.J.); gro.hcetitlumhcetlev@nhsenag (N.G.)

Narayanan Ganesh

Robert Čep.

2 Department of Machining, Assembly and Engineering Metrology, Faculty of Mechanical Engineering, VSB-Technical University of Ostrava, 708 00 Ostrava, Czech Republic

Janakiraman Senthil Murugan

3 Department of Computer Science and Engineering, Vel Tech High Tech Dr. Rangarajan Dr. Sakunthala Engineering College, Chennai 600 062, India; [email protected]

Associated Data

Not applicable.

Movie recommender systems are meant to give suggestions to the users based on the features they love the most. A highly performing movie recommendation will suggest movies that match the similarities with the highest degree of performance. This study conducts a systematic literature review on movie recommender systems. It highlights the filtering criteria in the recommender systems, algorithms implemented in movie recommender systems, the performance measurement criteria, the challenges in implementation, and recommendations for future research. Some of the most popular machine learning algorithms used in movie recommender systems such as K -means clustering, principal component analysis, and self-organizing maps with principal component analysis are discussed in detail. Special emphasis is given to research works performed using metaheuristic-based recommendation systems. The research aims to bring to light the advances made in developing the movie recommender systems, and what needs to be performed to reduce the current challenges in implementing the feasible solutions. The article will be helpful to researchers in the broad area of recommender systems as well as practicing data scientists involved in the implementation of such systems.

1. Introduction

Modern technology has revolutionized the volume, variety, and velocity at which data are generated. Digitalization of day-to-day experiences has led to the big data era. However, the enormous data have also led to the problem of information overload. Information overload may be defined as the state of being overwhelmed by the sheer volume of data presented to an average human for processing and decision making. Data mining methods can aid in obtaining and processing the relevant data and deal with the issue of information overload. Perhaps the most widely exploited tool among data mining methods is recommender systems.

Recommender systems work by assessing the available information about the likely patterns of the users and making suggestions from the information available [ 1 ]. The suggestions from the recommender systems help the system users find what is most suitable for them. Recommender systems are designed to ease product or service searches based on the least information available about the features [ 2 ]. A combination of various factors is used to assess the correlations in patterns and user characteristics to determine the best product suggestions for the customers [ 3 ].

The development of recommender systems depends on the field of application. The major application is in e-commerce websites where they suggest to the users the products or services based on the information available such as past search, age, gender, and other preferences [ 4 ]. They are also applied in job search platforms where the website suggests to a candidate the best possible positions fit for the skills. Since various industries have moved from an age of little available data to the era of big data, the junk information available is so much that it can delay the decision-making process. The recommender systems are typically made to ease the information search over the online systems so that the users find a more convenient way to connect to their preferences [ 5 ].

One of the applications of recommender systems is suggesting movies to watch to customers based on their preferences data. Movie recommender systems work by assessing the characteristic features of the users to make endorsements to the customers on what is best suited for them. It works by assessing the age, the previous preferences, gender, the content, context, and other demographic data to propose the movies. It checks the similarity among the users and items in the system to determine what could best fit the new user [ 6 ]. For example, a child will most likely receive recommendations for movies that children watch such as cartoons and animations based on the best similarity index for the children. Apart from that, children of various ages have different types of cartoons/animations to watch, and the recommender systems will propose the best depending on what other children of the same age are watching.

Movie recommender systems have helped the users overcome the chunk of information online to find only what is suited for them [ 7 ]. They use data mining techniques that match the similarities and help users find what is best suited for them [ 8 ]. Various criteria determine how the recommender systems work. The criteria are based on machine learning or deep learning algorithms that are used in matching the similarities before the suggestions are made. The algorithms achieve different levels of accuracy and require different computational times to retrieve the suggestions. Various computational algorithms have been proposed and used to increase the efficiency of recommender systems e [ 9 ]. However, each algorithm has its advantages and disadvantages; these make using the systems meet various needs based on their strengths. To reduce the limitations of each, the algorithms may be combined so that they perform better in making the recommendations [ 10 ].

This review paper aims to assess the challenges of recommender systems and make propositions to increase the accuracy of the systems. It assesses the recommendation approaches, the evaluation criteria of their efficiency, the challenges of these approaches, and possible solutions. A systematic literature review is conducted to determine the findings of the operational characteristics of the various recommendation approaches used and the performance criteria. The author aims to suggest the best solutions to make the approaches work better to achieve the operational expectations of the users.

The rest of the paper is arranged as follows: Section 2 details the methodology followed in this article. Section 3 describes different types of recommendation systems. Section 4 highlights some of the most popular machine learning algorithms used in movie recommender systems. Section 5 details the commonly used metaheuristic algorithms in movie recommendation tasks. Model metrics used for verifying the accuracy of recommendation systems are discussed in Section 6 . Some common problems with recommendation systems are discussed in detail in Section 7 . A critical discussion is presented in Section 8 . Section 9 presents the concluding remarks and the limitations of the study.

2. Review Methodology

This section describes the method used in obtaining information for the literature review. Peer-reviewed sources were used to gather information about movie recommender systems. The databases used were EBSCO Academic Search Premier, ScienceDirect, IEEE Library, ResearchGate, SpringerLink, and the ACM Portal. Google Scholar was also used to find leads to specific aspects of recommender systems for review.

Search Descriptors: Some of the keywords used in finding information about the movie recommender systems were “movie recommender systems”, “movie personalization”, “algorithms used in movie recommender systems”, “filtering techniques in movie recommender systems”, and “machine learning model metrics and measurement criteria”.

Inclusion Criteria: The inclusion criteria were papers that had information about recommender systems, the information had to be from published peer-reviewed sources. The paper abstracts were read to verify the validity of their information for use in this study. The exclusion criteria were papers that had grey literature on recommendation systems. The inclusion criteria for the articles and the methodology steps are summarized in Table 1 and Figure 1 .

An external file that holds a picture, illustration, etc.
Object name is sensors-22-04904-g001.jpg

Steps in conducting the systematic review.

Selection criteria for including sources in this review.

3. Movie Recommendation Systems

Movie recommendation work by filtering out data that is irrelevant and including only that which have matching characteristics or features [ 11 ]. As highlighted earlier, the world has moved from an era of scarcity of data online to an exponential growth in data. The systems work by manipulating the data to make sure it is efficient to drive data-driven decisions. In the jungle of available information about products, the systems need to evaluate what fits a certain customer and what does not. The systems go further in target and retargeting marketing to increase product viewership and hence increase the chance of the customers purchasing [ 12 ].

It is important for the developers to come up with systems that have higher performance characteristics and efficiency in matching the similarities in customer wants to seal the product sales or movie viewership [ 7 ]. The major types of filtering methods are collaborative filtering, content-based filtering, context-based filtering, and hybrid filtering.

3.1. Collaborative Filtering

Collaborative filtering works by matching the similarities in items and users. It looks at the characteristics of the users and the characteristics of the items the users have watched or searched for before [ 13 ]. In general, latent features obtained from rating matrices are looked at. In movie recommender systems, the recommendations are made based on the user information and what other people with similar user information are watching. For example, collaborative filtering in movie recommender systems picks the user demographic characteristics such as age, gender, and ethnicity [ 14 ]. Through these features, movie recommendations are made that match other people with similar demographic characteristics and previous user search history. Collaborative filtering suffers from a cold start if the user has not input any information, or the information is too little for any accurate clustering. In these cases, it does not know what to suggest [ 15 ]. The accuracy of the suggestion is also limited because people with similar demographic characteristics may not have similar preferences [ 16 ].

3.2. Content-Based Filtering

In contrast to collaborative filtering, content-based techniques employ user and item feature vectors to make recommendations. The fundamental differences between the two approaches are that content-based systems recommend items based on content features (no need for data about other users; recommendations about niche items, etc.) whereas collaborative filtering is based on user behaviour only and recommends items based on users with similar patterns (no domain knowledge; serendipity, etc.). A content-based filtering method works by making movie proposals to the user based on the content in the movies. It recognizes that clustering in the collaborative filtering recommendations may not match the preferences of the users [ 7 ]. The tastes and preferences of people with similar demographic characteristics are very different; what person X likes may not be similar to what person Y likes to watch. To solve this problem, content-based filtering algorithms give recommendations based on the contents of the movies [ 17 ]. In movie recommendations, some of the contents are the key characters and the genre of the movie.

3.3. Context-Based Filtering

This filtering technology is an improvement of the collaborative filtering method. It assumes that if person A and person B hold the same opinion on issue X, it is most likely that the same people will hold the same opinion/preference/thinking on a different issue Z. For example, if both people are attracted to Christmas movies from Netflix, it is most likely that they will still like Christmas movies by Showmax. The context-based filtering method recommends items with similar features or characteristics because the applications have just been extended to a different context [ 12 ]. It makes the same suggestions though the contexts are different. In most cases, web browsers import bookmarks and other settings when one upgrades from one browser to the next. This represents a change in context, since most of the settings and other items are imported into the new context, and the data available are used in making useful suggestions. Similarly, movie recommender systems may make a similar recommendation based on data from the previous context [ 18 ]. It is worth mentioning here about context-aware recommender systems (CARS), where the concept of context is well defined [ 19 , 20 , 21 , 22 ]. CARS acclimatize to the exact condition in which the recommended item will be used [ 23 , 24 , 25 , 26 ]. In this respect, CARS could avoid recommending a very long film to a user after a stressful day at work or suggest a romantic film if he/she is in the company of his/her partner.

3.4. Hybrid Filtering

This is a filtering technique that applies the concepts of all the other algorithms. It combines both collaborative filtering, content-based filtering, and context-based filtering to overcome the challenges of each method [ 10 ]. It is superior because it achieves higher performance in making the suggestions and also a faster computational time [ 11 ]. For instance, collaborative filtering may lack information about domain dependencies while content-based filtering lacks information about the preferences of the people [ 6 , 9 , 27 ]. A combination of these overcomes these challenges since user behaviour data and the content data are used to come up with recommendations.

4. Machine Leaning Algorithms for Movie Recommendation Systems

These are the algorithms that are used in filtering information and data mining so that the desired outcomes can be achieved. It is essential to understand the working of the information filtering methods so that the right algorithm is selected for the specific task in recommender systems [ 28 ].

4.1. K-Means Clustering

This is one of the simplest collaborative filtering approaches that categorizes the users based on their interests [ 29 ]. It is common for someone who wants to purchase an item to ask someone who has already purchased the product for their opinion. There is a higher chance that the influence of the current owner will affect the preferences and the tastes of the potentially new owner. Similarly, the algorithm compares the interesting features that can be associated with individuals that are classified to be within a group [ 30 ].

K -means clustering uses interests that are common among the users such as age, gender, movie time, history of the previous movies watched, etc. K -means clustering aims to group the features into clusters that represent the characteristics of the group [ 31 ]. If the classification is based on age, the probable K -means clustering will use children, teens, youth, and adult clustering methods. If a client falls within any of these age groups, movies are recommended based on what other people within that age group do. If the clustering depends on age, the closer an age is to the centroid age, the better the classification recommendation. The steps in the classification are measuring the similarity between the user and item features, selection of the neighbours, computing the prediction, and suggesting it [ 14 ].

4.1.1. Measurement of the Similarities

The first step is finding the similarity in the user features that the new user has with the previous system users. The algorithm always has the basic classifications for a beginning, where the user can give inputs and the predictions can be made [ 32 ]. Common features in finding the similarities are age, previous history, and geographical locations. Other recommender systems in movie theatres, including the price, the time to watch the movies, etc., are used in coming up with the means (centroids) for clustering. The distance from the centroids can be based on a Pearson correlation, cosine-based similarities, or an adjustment of the cosine-based similarity. The calculation of the similarity may be item-based or user-based. Item-based computation finds the similarities based on the features in the movies that similar people liked. If it is user-based, the calculation of the centroids is based on the demographic features of the user [ 15 ].

The computation of the similarities between items or users is shown in the mathematical equations below:

The equation above computes the correlation between the user and the item; it computes the closeness of the value to the centroid value. It is assumed that the two items i ∩ j are the correlated features (items or users); the value r j ¯ is the centroid feature, while the value r i is the value of the new user or new feature to be compared through correlation [ 33 ].

4.1.2. Selection of the Neighbours

There is always a consideration when developing the algorithm. The key metrics are the accuracy to obtain and the running time of the algorithm. To increase the accuracy of an algorithm, a large number of neighbours, which increases the computational time of the algorithm, is required. If a smaller computational time is needed, accuracy will be compromised [ 34 ]. To strike a balance, the selection may be threshold-based or use the top-N technique. The threshold technique will run only a specific number (sample number that meets the threshold value) of assessments of the neighbours and predict if that threshold is reached. For example, if the population is 1000, the system will run a prediction from 100 samples and predict out of the 100 samples [ 35 ]. In the top-N technique, only the top number of similarities (N) is run rather than the whole population of neighbours. For example, it will select only the top 10 for suggestions based on the nearest neighbours rather than assessing the whole population [ 36 ].

4.1.3. Prediction Computation

The computation of the subsequent predictions is based on the closest neighbours found in the system database. The prediction is obtained by the formula below:

The prediction or the nearest neighbour to the centroid ( K -mean) is made. In the equation above, the K -means is represented by r u ¯ while the correlation of the other variable on the right-hand side of the equation gives the nearest neighbour, both used in making the suggestion prediction [ 27 ].

4.1.4. Limitations of K -Means Clustering

  • Cold-Start Problem : This is a prediction problem that happens with a new user to the system. There is very little information about the user; hence it is difficult for the system to make any predictions until the user starts feeding some information that can be correlated to and suggestions made based on the user or previous item characteristics. The negative impact is the system accuracy is greatly reduced [ 6 ]. It is because of this problem that new and excellent movies are not recommended to users, or new users do not find what is best for them.
  • Sparsity in the dataset : The recommendation system involves assessing a large amount of data in the movie database. The users only look for a few items in the database; they are not able to use and assess a significant portion of the database to effectively evaluate the features. Apart from that, the users do not rate the movies they watched in the system. It becomes hard for the system to determine if the user liked the movie they watched, or they never liked it because they never left any rating. The negative impact is leaving some of the best movies not recommended in the large dataset since they have not been rated by the user. Moreover, the threshold/top-N techniques leave out the best matching suggestions [ 37 ].
  • Scalability : One of the challenges cited in the selection of the neighbours was balancing the computational time and the accuracy of the system. The K -means filtering technique is accurate when the database has a small number of movies to recommend or few users [ 38 ]. However, with an increase in the number of users and the number of movies, the computational time or the threshold number of items increases; therefore, the computational time increases [ 39 ]. To overcome this disadvantage, computation and training of the algorithm are performed offline so that when the systems are back online, recommendations are made easily [ 40 ].

The K -means filtering algorithm is the most basic collaborative filtering technique. It is from this technique that other filtering concepts are developed. The computation technique to arrive at the predictions may not be the same; the mode of working mimics the K -means nearest neighbour [ 41 ]. The other algorithms are developed to overcome the K -means clustering limitations.

4.2. Principal Component Analysis K-Means

This is a content-based movie filtering technique that improves on the K -means clustering technique. The major components in the movie are used to classify the movies before recommendations can be made to the customers. The K -means algorithm calculates the closeness of a feature to the centroid using the distance from the mean point. However, the principal component analysis creates a covariance matrix to calculate the eigenvectors and eigenvalues [ 42 ]. Therefore, it widens the scalability to find better comparisons to make the movie suggestions [ 43 ]. To illustrate this, assume a K -means algorithm computes the similarity of a single feature at a time. This implies the computational time and accuracy are compromised. However, using PCA, a covariance matrix of various features is created; hence the scalability is increased and computed faster. If there are similarities that fall within the matrix, they can be found easily, and its eigenvector is computed. Suggestions close to such an eigenvector are then made to recommend the movies [ 44 ].

Steps in Conducting Principal Component Analysis

Structure of the tuples.

Where r i is the movie rating, u i is the user characteristics, and i i is the item characteristics (movie characteristics). A and B will give a 2D matrix of dimension m × n matrix, while C will give a 3D matrix of dimension m × n × o .

  • Calculation of the covariance matrix: A covariance matrix of the dimension of the data formulated in the previous step is computed.
  • Calculation of the eigenvectors and eigenvalues : The covariance matrix calculated will be a square matrix of the dimension of the data. It is used to compute the eigenvalues and eigenvectors which characterize the data. The computed eigenvectors are sorted in decreasing order according to the eigenvalues; a future vector is constructed [ 45 ].

4.3. Principal Component Analysis Self-Organizing Maps (PCA-SOM)

Self-organizing maps (SOMs) is a technique based on neural networks; it is an unsupervised learning technique, and there is no need for intervention of humans during the learning phase. It is vital in clustering data without knowing the class memberships in the input data [ 46 ]. The self-organizing feature map (SOFM) is known for detecting the features inherent in particular items which is important for the features in the movie recommender systems. SOM also uses topology-preserving mapping, which implies that the algorithm preserves the relative distance between all the points in the initial dataset [ 47 ]. Therefore, it effectively achieves the objective of transforming the arbitrary dimensions into a 1D or 2D discrete map. PCA is integrated with SOM because it is easier for the PCA to convert the matrices generated by SOM to eigenvectors and eigenvalues for ranking in the order of significance [ 48 ]. The steps in working out PCA-SOM are listed below:

  • Obtain data without any rankings or classifications;
  • Data modelling;
  • SOM classified the data using unsupervised learning to bring together that which has similarities in features;
  • PCA takes over from the classification achieved by SOM, checks the principal components, and comes up with further classifications of the dataset;
  • The decision to make the suggestion.

Initialization: Once the data are obtained, random values are chosen for the weight of the initial vectors. The weights of the vectors represent the neurons in the data, and their values are also computed [ 49 ].

Sampling: A known sample x is drawn from the input space with a known probability. This is the activation pattern that is applied to the lattice. This pattern maps the x dimension to be proportional to the m-pattern in the new lattice [ 49 ].

Similarity Matching: The best matching is found at time step-n using the minimum Euclidean distance between the neuron centroid.

Updating: The synaptic weight of the neurons is adjusted using the formula below:

where η n is the learning rate, h j i x n is the neighbourhood function of i x winner neuron. These two are dynamic to obtain the optimum results.

Apply PCA: After the synaptic weights are derived from the minimum Euclidian distance from the formula above, the PCA process in creating the eigenvalues and eigenvectors is used further in processing the data to obtain a more accurate estimation [ 50 ].

Decision: After similarities are matched, the suggestions are made.

The great features of SOM that make it a good tool in recommender systems are:

Insights into the input space: The method uses unsupervised learning to classify the data by weight vectors and give output in a feature map. The cold starting is significantly reduced [ 51 ]. The user can then input data in light of the initial output features shown.

Topological arrangement: The feature map of SOM works by mapping the field of the input pattern to a spatial location in the output grid [ 52 ].

Density Matching: Once the input is fed into the system, any alterations in input distribution are equally represented in the output grid so that there will be a good representation of the highest density areas with the most matches and lower density areas with fewer matches [ 49 ].

Feature Selection: The SOM algorithm selects the best attributes for the non-linear distribution in the input data so that it can effectively match the similarities to the grids [ 50 ].

4.3.1. Advantages

Since it is based on unsupervised learning, it automatically updates the features and functions [ 53 ]. It is flexible to new input because it learns by itself. It is suitable for unidentified new inputs, for example, new movies that have no ratings or new users where there is no data about them. The new movies may be recommended when the system extracts their features, and the new users will not experience a cold start because they have somewhere to begin on the output feature map [ 54 ]. It is also faster in computation since it easily organizes complex data and makes a good representation of the mapping for easy interpretation.

4.3.2. Disadvantages

The major drawback is that feature classification may not be according to the expected output; therefore, the unsupervised learning classification algorithms have to be initialized often to maintain the relevance of the clustering [ 55 ].

5. Metaheuristic Algorithms for Movie Recommendation Systems

Metaheuristic algorithms are high-level methods or heuristics which have been developed to search, create, or select a heuristic that may produce a satisfactory solution for optimization problems. Metaheuristics find wide usage in almost all aspects of optimization problems. For example, metaheuristics have been used in design optimization [ 56 , 57 , 58 , 59 ], process optimization [ 60 , 61 , 62 , 63 ], structural optimization [ 64 , 65 ], knapsack problems [ 66 , 67 ], workflow scheduling [ 68 ], image segmentation [ 69 , 70 , 71 ], etc.

5.1. Genetic Algorithm

This is a hybrid filtering algorithm that uses the improved K -means clustering and is combined with the genetic algorithm (GA). It uses the PCA technique to partition the high dimensional space into clusters hence reducing the complexity of computations when making intelligent recommendations. The method has higher performance characteristics; hence it makes better recommendations. The steps of the recommendation system are outlined below:

5.1.1. Data Preprocessing Using PCA

The first step is processing the data, extracting it from the original high dimensional space into a linear relatively low space with denser features that carry the information. The PCA feature extraction technique has been very effective. It combines the data represented by the principal component with the highest eigenvalue with the significant information after ranking them. The components with lower significance are ignored but components with higher significance are given prominence. After the linear reduction, only a selected number of components from the rank is fed to the GA-KM algorithm for classification.

5.1.2. Enhanced K -Means Clustering Optimization by Generic Algorithms (GA-KM)

The objective is to make sure that the users/neighbours with like-minded interests or features are grouped. Therefore, it performs it in two stages which are K -means clustering and GA algorithms.

5.1.3. K -Means Clustering

The technique, as discussed, centres its clusters around centroids based on the linear distance from the central feature. The correlation of distance from the central point determines the similarity index. If it is too similar, there is convergence; if there is a high dissimilarity, then the dataset is sparse. As discussed, it suffers a cold start, and its first centroid may be based on the local optimum rather than the global optimum. The steps in K -means clustering are selecting the centroids, assigning objects to the closest clusters, computing the sum of squared distances from the members in the cluster, and checking for convergence in the computed objects. The procedure for computation is similar to that discussed.

5.1.4. Genetic Algorithm

This mimics biological evolution as explained by Darwin’s theory of evolution. The algorithm uses the population of individuals as chromosomes; the chromosomes represent possible solutions to the evolution problem [ 29 ]. Each of the chromosomes contains the genes with the survival ability. Therefore, through natural selection, the chromosomes with the highest quality genes have the highest chance of survival and are fit for reproduction for the next generation. The iterations are based on selection, crossover, and mutation. Selection picks just a proportion of the genes to breed for the next generation. Crossover swaps two parent chromosomes to be recombined into the offspring. Mutation randomly alters the value of a gene to produce offspring. The processes extend the diversity of the offspring. The processes end when the fitness conditions in the environment/context are met.

The GA algorithm is used to prevent premature convergence in the K -means algorithms. The centroids in K -means are considered the chromosomes; the fitness function to evaluate the quality of the solution is:

The fitness value is the sum of the distances of the inner points to the cluster centres. The values are minimized to find the optimal partitions. To find the optimal partitions, the three generic operators precede the construction of the offspring based on the survival fitness principles; convergence occurs when the fitness criterion is satisfied [ 72 ]. The pseudocode of the algorithm is summarized below:

  • Initialization

Parameter initialization: Set the maximum iterations, population size, cluster numbers, probability crossover, probability mutation, and fitness function to minimize the total distance of every sample to its nearest centre;

Population initialization: Randomly generate the initial population for each of the k -centers.

Selection operation;

Cross-over operation;

Mutation operations;

Obtain the initial k -centres with optimal fitness values;

K -means optimization: generate new clusters with k -centres.

When tested with the MovieLens dataset, the algorithm has better performance features especially in reducing the cold-start problem [ 29 ].

5.2. Firefly Algorithm

The algorithm is also bio-inspired from the fireflies and combines it with a fuzzy C -means clustering technique. In the natural world, the fireflies are pulled to the brightest firefly using the light signal. Each firefly pulls the other, but the brightest has the highest attractiveness, and other fireflies are clustered around it. Similarly, the algorithm centres its suggestion on features of the users with the highest attractiveness (highest user ratings) [ 55 ]. If a movie has the highest rating from many users, the movie recommender system will make subsequent recommendations based on movies rated highest by users with similar characteristics [ 73 ]. The algorithm for the recommender system is highlighted below:

  • All the fireflies are unisexual, and every firefly pulls to another firefly.
  • The attractiveness of a firefly depends on its brightness, and the other fireflies will be pulled closer to the brighter one (feature reduction using the firefly algorithm).
  • This brightness is related to a primary function in the FCM.
  • FCM allocates memberships and utilizes them to show data elements from one cluster to another.
  • The FCM separates a finite set of elements X = X 1 … .   X n from the memberships into a set of c fuzzy clusters; hence it comes up with a list of cluster centres C = C 1 …   C 2   . The partition matrix W = W i   0 , 1 ;   i = 1 …   n ;   j = 1 … c expresses the degree to which each element X i is placed into a cluster C j . The aim is to reduce the objective function to optimal. C a r g   ∑ i = 1 C ∑ j = 1 C W i j M   X i − C j 2   (6)
  • Then the fuzzy C -means clustering f l m m = 1 ∑ K = 1 C   x i − c m x i − c k   2 n − 1   (7)

The recommender system efficiency and performance are generally higher than the traditional K -means clustering.

5.3. Artificial Bee Colony

This is a bioinspired algorithm that makes recommendations based on the workings of the bees in finding flowers for the best nectar [ 74 ]. The bees are mainly divided into two groups. Scouting bees go out to scout flowers with the best nectar, and the employee bees follow after the best flowers have been found [ 53 ]. It is worth noting that several scouting bees are sent out and come back with information to the hive regarding the quality of the nectar found. The employee bees will filter out the low-quality nectar from the information and follow the scout bee to the source of the best nectar. Similarly, the artificial bee colony in recommender systems works as an improvement of the K -means clustering algorithm [ 75 ]. In the K -means algorithm, an assumption is made that the data is based on a centroid where the closeness of the feature to the centroid feature determines the recommendation. In an artificial bee colony, there are many centroids (just as there are many flowers), and information from or to these centroids will bring a variety. From this variety, the user may choose what is best suited for them; henceforth, the recommendation system will bring recommendations close to the centroid chosen [ 76 ]. It is a good method to solve the sparsity, scalability and cold-starting problem. The user will choose the best suited feature from the first random set of options available. Subsequent recommendations will depend on the K -means around that particular choice. The steps are summarized below:

  • Initialize the system users and movies in a matrix;
  • Use the K -means clustering to find several centroids of various product features. This finds several centroids for clustering;
  • Selection of the nearest clusters;
  • Calculation of the estimated rating values from the user history;
  • Use the artificial bee colony to select the closest to user likes based on ratings and features;
  • Reclassification of the users for further iterations;
  • Coming up with the recommendations.

ABC determines the community of vectors that explore the similarities in the neighbours. The objective function is then continually reduced when narrowing down to the nearest possibilities [ 77 ]. The aim is to minimize the objective function below:

The objective function is controlled by succeeding iterations determined by the detecting vector z → as below:

From the succeeding iterations, the points with the most similar features are selected, and the system recommends the movies to the user [ 78 ]. For instance, an initial allocation of the centroids may be classified as horror movies, thrillers, comedy, or thrillers. If the client selects thriller movies, a further classification may be Hollywood thriller, Bollywood thriller, etc. If the client selects any from these, the subsequent recommendations will be based on this particular centroid [ 74 ]. As seen, it has an advantage because there are always initial centroids that the user can select that further narrow down the selection. Optimization is reduced by the detecting vector to optimize the suggestion to the most viable suggestions [ 79 ].

5.4. Cuckoo Search

The cuckoo search algorithm is a combination of K -means clustering and the use of Levy’s flight function. In this process, the K -means algorithm divides the MovieLens Dataset into different clusters. This is performed using randomly selected centroids [ 14 ]. Measures such as the Euclidian distance and cosines are used to find the distance between centroids, and the features and/or users are reassigned to the closest cluster with similar characteristics.

The cuckoo search algorithm gets its inspiration from the cuckoo bird. The cuckoo bird does not sit on eggs to hatch; rather, it searches for the best nest with optimal conditions and lays eggs for the host bird to sit on to hatch [ 80 ]. If the host bird identifies the egg, it may throw it away or abandon the whole nest. Similarly, in the recommender algorithm, if the centroid does not present the optimal solution, the centroid is abandoned for a new iteration until no re-assignment happens. The pseudocode for the recommender system is outlined in the procedure below:

5.4.1. K -Means Clustering

  • Initialize the number of k clusters;
  • Random selection of centroids using K -means clustering;
  • While no centroid is changed, assign each data point to the closest centroid and calculate the new centroids;
  • Assign data points to the closest cluster mean.

5.4.2. Cuckoo Search Algorithm

  • Begin the fitness function f x ,   X i =   x 1 ,   x 2 … ;
  • Initialize the random population of n host nests (centroids from K -means);
  • Calculate the fitness function value for each nest;
  • Find the ith cuckoo randomly by Levy flights, and calculate its fitness, Fi ;
  • Select a nest (centroid);
  • If ( F j > F i ) ; replace j with the new solution;
  • The unqualified nests (centroids) are abandoned and new ones built by Levy flights function;
  • The best solutions found are ranked and suggested to the client.

The cuckoo search algorithm has higher precision, recall, and a lower MAE than the PCA- K -means, hence higher performance characteristics. The computations can also be performed offline so that recommendations are made when the system is online, making it faster to make suggestions to the user.

5.5. Grey Wolf Optimizer

This is a recommendation system that is based on mimicking the leadership and hunting tactics of the grey wolfs [ 81 ]. The algorithm first conducts feature selection using the grey wolf optimizer (GWO) method, before clustering using the FCM method [ 82 ]. The algorithm pseudocode is listed below:

  • Load the GWO culture;
  • Initialize the coefficient points r, Q and R;
  • The appropriateness of each explorer is estimated X α   X β   X δ ;
  • Carry out iterations to determine the appropriateness of entire explorer negotiators;
  • Return X α representing the position of centroids by GWO;
  • Randomly select the cluster centres based on fuzzy means;
  • Load the fuzzy clustering formula matrix and estimate f l m m as in the formula below: f l m m = 1 ∑ K = 1 C   x i − c m x i − c k   2 n − 1 (10)
  • Determine midpoints B (k) = [ c m ] with F (k) ; continue with iterations until ||F( k + 1) − F( k ))) < ε;
  • Return to the newly formed cluster centres and make recommendations based on the cluster centres.

This recommender system has a relatively better performance.

5.6. Other Metaheuristic Algorithms

Some researchers have used other metaheuristic algorithms to develop movie recommender systems. For example, Papneja et al. [ 83 ] developed a movie recommendation using a whale optimization algorithm. Tripathi et al. [ 84 ] hybridized a map-reduce-based tournament along with a WOA to achieve a superior recommendation experience.

6. Model Metrics

Various aspects have to be measured apart from the accuracy to make sure that the algorism makes the right predictions. For example, the algorithms may be highly accurate but have too much logarithmic loss. Accuracy is not the only metric to determine the performance efficiency of a model. The metrics are discussed below:

6.1. Mean Absolute Error (MAE)

This is the average difference between the predicted values and the original values. In our case, it is the average difference between the choice of the movie by the customer (user) from the suggestion made (prediction). It gives the variation between the suggestion and what the customer chose. The only disadvantage is it does not give the direction of the error [ 85 ]. Generally, a low mean absolute error is desirable. The mathematical formula for MAE is shown below:

where n s is the number of samples, y ¯ i is the predicted suggestion, and y i is the true feature that the user picks/wants.

6.2. Mean Squared Error (MSE)

This gives the square of the MAE (the square of the average difference between the original values and the predicted values). The advantage is that it makes the large errors more pronounced so that the model focuses on the large errors and their causes [ 86 ]. In addition, it is easier to model the linear programming models in the computation of the slope using the mean absolute error since the differences will be clearer. The formula for the MSE is shown below:

6.3. Log Loss

This is a cross-entropy loss given by probability estimates. It is used in neural networks and recommender system optimizations. It calculates the probability of the suggestions rather than giving only discrete predictions, especially during the ranking of the suggestions [ 35 , 87 ].

6.4. Confusion Matrix

This is one of the most used metrics in determining the accuracy of a model. It is mainly used for classification problems, especially when the outputs expected should have more classifications [ 34 ]. The various characteristics of the confusion matrix are shown in Table 3 .

Characteristics of a confusion matrix.

As noted earlier, the movies are clustered based on the features or the users. In clustering, the true represents the actual classification of the movie, while the predicted gives the predicted classification of the movie before recommendation [ 88 ]. For example, a movie may be classified as a comedy when it is a thriller movie. The user may choose to think it is a comedy because of the characters only to find it is a thriller movie. The movie may be classified as a thriller, and it is a thriller; hence the users get what they want. Such variations happen in the movie classification; hence there is a need for accurate predictions.

True Positive: This points to a case in the recommender system where the actual suggestion was positive, and the client selection of the movie was positive, i.e., the system suggests what the client needed. An example is when the movie is classified as a comedy when the client needed comedy and selects it [ 89 ].

True Negative: This happens when the actual classification is negative, and the prediction is also negative. In our movie recommendation example, if the movie is not a comedy and our algorithm does not classify it as comedy, this output is termed as true negative [ 90 ].

False Positive: This happens when the actual classification is negative, but the system predicts it as positive [ 91 ]. In the movie recommendation system example, the actual case may be the movie is not a comedy; yet the prediction algorithm classifies it as comedy. The actual movie is not a comedy, hence negative, but the prediction is comedy, hence the term positive.

False Negative: This happens when the actual data is true (positive), but the prediction is negative (false) [ 87 ]. The actual classification is true; yet the system predicts it as negative. In the movie recommender systems, the actual movie is a comedy, but the system predicts not comedy.

6.5. Precision

This assesses how many of the true positives are true positives. It gives a fraction of the true positive predictions to the total positive predictions [ 92 ]. The mathematical format is shown below:

6.6. Recall/Sensitivity

This is used as the fraction of true values from the total true values [ 35 ]. In our recommender systems, it gives a fraction of what is classified as comedy out of a total of what is comedy. The fraction of the true values out of the total true values is mathematically modelled as below. Note that false negative is true, but the algorithm classified it as else:

Precision focuses on capturing the classifications correctly, while recall focuses on whether the system is able to capture the features we want though they may not be correctly captured [ 34 ].

6.7. Accuracy

This is the fraction of correct predictions with the total predictions. The accuracy is mathematically modelled as below:

Accuracy as a metric should be used only when the data are balanced and have various classes [ 93 ]. It should not be used as a metric when the data are skewed (have a majority of only one class). For example, if the data are made of 100 movies, and only 5 movies are comedies, the rest are different genres such as thrillers. If the algorithm wrongly predicts all the movies as thrillers, it will be 95% accurate because 95% are thrillers, and only five are comedies. However, from a rational standpoint, the algorithm failed to classify comedy movies. In a recommendation system, the client would think that none of the movies in the datasets is comedy and not watch; yet there are five top-rated comedy movies. Therefore, accuracy should be used when the dataset is well-balanced.

This is simply the harmonic mean of precision and recall. It shows how precise the system was and how it never missed significant instances. If the F 1 score is high, the model performance is high [ 94 ]. The mathematical formula for precision is shown below:

6.9. Computational Time

This is the time that the algorithm takes to come up with the final solution in the prediction. If the systems take long, they may be unreliable if the users want an immediate response before they select the movies to watch. The algorithms should ensure that the best results are found within the shorted period possible [ 53 , 95 ]. A high-performing system returns the most efficient results within a relatively short period. It increases its reliability and dependency. Sometimes the data to be analyzed may be too large to give immediate results. To overcome this limitation, the computations are performed when the system is offline so that the output is shown when it is back online for effective prediction suggestions [ 96 ].

7. Problems Associated with Movie Recommender Systems

7.1. cold start.

The best target audience for a recommendation always depends on the previous user characteristics and the features of the products they watched. A comparison is always made on the characteristics of the user was and the features of the movie and the rating given to the movie. However, in some instances, there are no user characteristics that can be used for a recommendation if the user is new, and nothing is known about them [ 97 ]. Sometimes the user may not be new but has used a different device when accessing the movies’ websites hence there are no stored cookies that can trace the user history.

A cold-start problem occurs when the recommender system is not able to make any suggestions to the user because the user is new or there is no information available about the user [ 98 ]. The problem is common in collaborative filtering which uses only user details to make recommendations of the best movie. The problem is overcome by using content-based filtering, context-based filtering and hybrid filtering. In content-based filtering, the movies are classified by the features such as the main characters, the genre etc. The new user will select any genre based on the content. A context-based filter is based on some of the user information derived from the device such as location, and the operating system, and correlates them with what other users from similar contexts are using. In hybrid filtering, the content, context, and user characteristics are used, therefore, if the recommender system does not have any information about the user, it will use the content and context to make the first recommendations [ 99 ]. The subsequent recommendations will depend on the hints of information available.

7.2. Accuracy

If the database for the recommender systems has few movies, the system will have higher accuracy. If the database is large, there tends to be a lower accuracy because the pool of information searched is too large. To counter the problem, the K -means algorithm reduces the computational time by restricting the computation to a certain number of iterations or selecting only the top-N number of movies for recommendations [ 100 ]. However, if some of the movies have never been rated, they are likely to be biased in the searches [ 101 ].

To increase the system accuracy in making the recommendations, some of the algorithms have employed sophisticated search criteria that will conduct a thorough search and match the product features to user and item characteristics [ 102 ]. In addition, two or more algorithms are combined to allow the perfect user and feature analysis and come up with the desired output. Some of the classification processes such as the PCA-SOM conducts the logarithmic computations offline so that they give recommendations easily when they are online. It reduces the computational time and increases the recommender system accuracy. In modern recommender systems, cold-start problems and the accuracy is solved by having a dialog box where the users can type in the features they need, and recommendations will be given according to what matches the search words [ 103 ].

7.3. Diversity

New movies in the recommender systems rarely appear among those that are suggested to the users. The new excellent movies may end up not being watched because of the lack of being rated by the users. Some of the excellent movies also may not be rated, leaving the recommendation system blank about whether the movies are great for a specific class of watchers or not. To overcome some of these challenges for new movies and movies that are not rated by the users, the diversity aspect is introduced by the recommender systems. In diversification, the new movies or unrated movies are given priority so that they can be noticed by the users [ 104 ]. If they are pleasing, they will be rated and watched more. From this information, the recommender systems will make subsequent decisions on whether to recommend the movie or archive it. From the number of watches, it will also classify the movie according to the features or user characteristics [ 102 ]. Diversity is often used to recommend new debut movies to increase their marketability and presence. It increases the diversity of the user to try out new features or new products.

7.4. Scalability

While sparsity and diversity aim to increase the chances of movies with new features appearing in top searches, scalability aims to solve the problem of increased computational time and increase the performance of the recommender system. Scalability ensures that there is a balance obtained between accuracy and computational time. If it is necessary, some of the classification computations are performed beforehand so that by the time the user comes to select an item to watch, the system makes an almost immediate recommendation with high levels of efficiency [ 105 ].

7.5. Sparsity

Sparsity in the movie data relates to the large volume of movie data in the system, but the users only utilize a few of the features or resources. It is common in K -means clustering where the data is interpolated linearly and gives fewer perspectives to the non-linear data [ 106 ]. Recommendation systems may sometimes be biased by only suggesting the most rated or the most liked movies based on a limited assessment of all the possible cluster features. By using the top-N theory to make the recommendation, those that do not meet the threshold of these can find better algorithms that consider the sparsity of information available [ 107 ]. Some of the methods such as PCA-SOM map all the features on a lattice; hence the user can find most of the features. WOA also widens the scope of the search by using both linear and orthogonal systems to find the desirable features and make recommendations that are sparser and more diverse. Generally, implementing systems that consider non-linearly related data is efficient [ 108 , 109 ].

8. Discussions

The current movie recommendation systems have to work in contexts where there is so much data to be considered before making recommendations. Both user and context information are so varied that the accuracy and precision of the systems are brought to real tests. For example, most of the user information is shared through social media platforms to generate interest in the movies. The MovieLens dataset was created approximately 20 years ago when there was little or no developments in the use of social media where users share movie information to create interest. However, current technologies need to analyze the content, context. and user characteristics in social media platforms to recommend the right movies to the customers. Some companies have taken steps to integrate analytics in their recommender system algorithms. They ask the customer to connect to their social media accounts such as Twitter, YouTube, and Meta not only for advertising but also to analyze the activity of the user on these social media accounts to recommend the best movies for them. Through connecting to these platforms, they analyze the previous history of the user and recommend appropriate movies. This significantly reduces the cold-start problem since new user information can be obtained.

Context-based filtering is gaining traction in the movie recommender systems. It has been adequately used in product recommendations on e-commerce platforms., for example, the most discounted products during black Fridays, the holiday products during Christmas seasons, etc. Movie recommender systems that integrate time stamps to recommend the best movies in various contexts should be studied and developed. For example, it will help recommend movies for children learning during the day and children lullaby movies when it is time to sleep at night.

There are various advances in the use of blockchain technology, and some of these applications may affect the efficacy of algorithms in movie recommender systems. Blockchain technology enhances user privacy through user data encryption; yet collaborative filtering depends on the availability of user information so that it can match the features and characteristics before making recommendations. If user information is concealed by the blockchain systems, the algorithms have to use advanced methods to prevent a decline in the accuracies such as the use of context and content-based filtering.

9. Conclusions

In this article, movie recommender systems have been described and classified. The various types of recommender systems are introduced and discussed. Special emphasis is given to explain in detail the various machine learning and metaheuristic algorithms commonly deployed in movie recommendation research. The various model metrics that summarize the quality of the model are discussed at length. The problems associated with movie recommender systems are also summarized in a structured way and discussed. A total of 77 articles strictly on the area of movie recommender systems are included in the study, and their major conclusions are presented. In addition, 32 other related articles on metaheuristics and recommender systems (not for movies) are also introduced in various sections to present a coherent and meaningful review. One of the limitations of the study is that the Scopus and Web of Science databases were not directly used for selecting the articles for review. In contrast, EBSCO Academic Search Premier, ScienceDirect, IEEE Library, ResearchGate, SpringerLink and the ACM Portal were used for the literature search. Nevertheless, more than 80% of the reviewed papers were found to be indexed in Scopus while more than 60% were available in the Web of Science database.

Funding Statement

This research received no external funding.

Author Contributions

Conceptualization, S.J., N.G. and R.Č.; data curation, S.J., N.G. and J.S.M.; formal analysis, S.J., N.G., R.Č. and J.S.M.; investigation, J.S.M.; methodology, S.J., N.G. and R.Č.; supervision, R.Č.; writing—original draft, S.J., N.G. and J.S.M.; writing—review and editing, R.Č. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Informed consent statement, data availability statement, conflicts of interest.

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

AIP Publishing Logo

  • Previous Article
  • Next Article

Movie recommendation system using machine learning

[email protected]

[email protected]

[email protected]

[email protected]

  • Article contents
  • Figures & tables
  • Supplementary Data
  • Peer Review
  • Reprints and Permissions
  • Cite Icon Cite
  • Search Site

S. Vidya Sagar Appaji , M. Kusuma Patnaik , M. Dileep Nagendra Kumar , K. Manoj Kumar , K. Satish; Movie recommendation system using machine learning. AIP Conf. Proc. 4 October 2023; 2794 (1): 020028. https://doi.org/10.1063/5.0165752

Download citation file:

  • Ris (Zotero)
  • Reference Manager

One of the most common uses of machine learning is recommendation systems. A broad description of recommendation systems is that they aid in predicting user ratings or preferences for an item. Their major purpose is to provide suggestions based on previously collected data or user preferences. Recommendation systems study how a user interacts with a system or application and forecast the user’s future behavior. They analyze information using information filtering approaches to offer consumers possibly more relevant content. They have grown in popularity in recent years due to their goal of providing individualized experiences to each user and assisting them in making better decisions. In this study, we looked at some of the most often used movie recommendation systems, such as content-based techniques, goods, or products recommended to consumers by comparing and contrasting them based on a variety of characteristics. The collaborative filtering approach looks for commonalities between users and suggests things or goods that are likely to pique their interest. Hybrid filtering is a hybrid of the two methods. We also go through the different similarity metrics and approach performance indicators briefly. This study is concerned with comprehending various strategies for movie suggestions and how recommender systems shape today’s economy.

Sign in via your Institution

Citing articles via, publish with us - request a quote.

movie recommendation system using machine learning research paper

Sign up for alerts

  • Online ISSN 1551-7616
  • Print ISSN 0094-243X
  • For Researchers
  • For Librarians
  • For Advertisers
  • Our Publishing Partners  
  • Physics Today
  • Conference Proceedings
  • Special Topics

pubs.aip.org

  • Privacy Policy
  • Terms of Use

Connect with AIP Publishing

This feature is available to subscribers only.

Sign In or Create an Account

Help | Advanced Search

Computer Science > Information Retrieval

Title: comprehensive movie recommendation system.

Abstract: A recommender system, also known as a recommendation system, is a type of information filtering system that attempts to forecast a user's rating or preference for an item. This article designs and implements a complete movie recommendation system prototype based on the Genre, Pearson Correlation Coefficient, Cosine Similarity, KNN-Based, Content-Based Filtering using TFIDF and SVD, Collaborative Filtering using TFIDF and SVD, Surprise Library based recommendation system technology. Apart from that in this paper, we present a novel idea that applies machine learning techniques to construct a cluster for the movie based on genres and then observes the inertia value number of clusters were defined. The constraints of the approaches discussed in this work have been described, as well as how one strategy overcomes the disadvantages of another. The whole work has been done on the dataset Movie Lens present at the group lens website which contains 100836 ratings and 3683 tag applications across 9742 movies. These data were created by 610 users between March 29, 1996, and September 24, 2018.

Submission history

Access paper:.

  • Download PDF
  • Other Formats

References & Citations

  • Google Scholar
  • Semantic Scholar

DBLP - CS Bibliography

Bibtex formatted citation.

BibSonomy logo

Bibliographic and Citation Tools

Code, data and media associated with this article, recommenders and search tools.

  • Institution

arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs .

A Review on Movie Recommendation System Using Machine Learning

Ieee account.

  • Change Username/Password
  • Update Address

Purchase Details

  • Payment Options
  • Order History
  • View Purchased Documents

Profile Information

  • Communications Preferences
  • Profession and Education
  • Technical Interests
  • US & Canada: +1 800 678 4333
  • Worldwide: +1 732 981 0060
  • Contact & Support
  • About IEEE Xplore
  • Accessibility
  • Terms of Use
  • Nondiscrimination Policy
  • Privacy & Opting Out of Cookies

A not-for-profit organization, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity. © Copyright 2024 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.

Our Journals

TIJER

TIJER - INTERNATIONAL RESEARCH JOURNAL

  • ISSN Approved Journal No: 2349-9249
  • Impact Factor: 8.57 | ESTD Year: 2014
  • Scholarly open access journals, Peer-reviewed and Refereed Journals, Multidisciplinary, Monthly, Indexing in all major database & Metadata, Citation Generator, Digital Object Identifier(DOI)

JNRID

JOURNAL OF NOVEL RESEARCH AND INNOVATIVE DEVELOPMENT

  • ISSN Approved Journal No: 2984-8687
  • Impact Factor: 9.57 | ESTD Year: 2023

IJEDR

INTERNATIONAL JOURNAL OF ENGINEERING DEVELOPMENT AND RESEARCH

  • ISSN Approved Journal No: 2321-9939
  • Impact Factor: 7.37 | ESTD Year: 2013

ARCH

INTERNATIONAL THE ARCH RESEARCH JOURNAL

  • ISSN Approved Journal No: 2795-3173
  • Impact Factor: 8.85 | ESTD Year: 2021

Book cover

International Conference on Intelligent Human Computer Interaction

IHCI 2016: Intelligent Human Computer Interaction pp 94–110 Cite as

Intelligent Movie Recommender System Using Machine Learning

  • Abhishek Mahata 17 ,
  • Nandini Saini 17 ,
  • Sneha Saharawat 17 &
  • Ritu Tiwari 17  
  • Conference paper
  • First Online: 24 January 2017

1472 Accesses

8 Citations

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10127))

Recommender systems are a representation of user choices for the purpose of suggesting items to view or purchase. The Intelligent movie recommender system that is proposed combines the concept of Human-Computer Interaction and Machine Learning. The proposed system is a subclass of information filtering system that captures facial feature points as well as emotions of a viewer and suggests them movies accordingly. It recommends movies best suited for users as per their age and gender and also as per the genres they prefer to watch. The recommended movie list is created by the cumulative effect of ratings and reviews given by previous users. A neural network is trained to detect genres of movies like horror, comedy based on the emotions of the user watching the trailer. Thus, proposed system is intelligent as well as secure as a user is verified by comparing his face at the time of login with one stored at the time of registration. The system is implemented by a fully dynamic interface i.e. a website that recommends movies to the user [ 22 ].

  • Recommendation systems
  • Sentimental analysis
  • Emotion analysis
  • Age and gender detection
  • Collaborative filtering
  • Content-based filtering
  • Supervised learning

This is a preview of subscription content, log in via an institution .

Buying options

  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
  • Available as EPUB and PDF
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Roy, D., Kundu, A.: Design of movie recommendation system by means of collaborative filtering. Int. J. Emerg. Technol. and Adv. Eng. 3 (2013) http://www.ijetae.com/files/Volume3Issue4/IJETAE-0413-12.pdf

Ho, A.T., Menezes, I.L.L., Tagmouti, Y.: E-mrs: emotion based movie recommender system. In: Proceedings of IADIS eCommerce Conference. University of Washington Both-ell, USA, pp. 1–8 (2006)

Google Scholar  

ANN: Definition of artificial neural networks (2013). http://briandolhansky.com/blog/artificial-neural-networks-linear-regression-part-1

API: Microsoft API for emotion detection (2015a). https://www.microsoft.com/cognitive-services/en-us/emotion-api/documentation

API: Microsoft API for face analysis (2015b). https://dev.projectoxford.ai/docs/services/563879b61984550e40cbbe8d/operations/563879b61984550f30395236

Pang, B., Lee, L., Vaithyanathan, S.: sentiment classification using machine learning techniques. Comput. Intell. 22 , 110–125 (2006)

Article   Google Scholar  

ENCOG: ENCOG framework for training data-set (2016). http://www.heatonresearch.com/encog/

ENCOG: Encog framework for training data-set (2016)

Arora, G., Kumar, A., Devre, G.S., Ghumare, A.: Movie recommendation system based on user’s similarity. IJCSMC J. 3 , 765–770 (2014). Springer

Adomavicius, G., Tuzhilin, A.: Context-aware recommender systems. In: Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (eds.) Recommender Systems Handbook, pp. 217–253. Springer, New York (2010). http://link.springer.com/chapter/10.10072F978-0-387-85820-3_7

HAIDER: review from haider movie (2014). http://www.imdb.com/title/tt3390572/reviews?ref_=tt_ql_3

IMDB: Data-set for sentiment and review analysis (1990). http://www.imdb.com

Wakil, K., Ali, K., Bakhtyar, R., Alaadin, K.: Improving web movie recommender system based on emotions. (IJACSA) Int. J. Adv. Comput. Sci. Appl. 6 (2015)

Madadipouya, K.: A location-based movie recommender system using collaborative filtering. Int. J. Found. Comput. Sci. Technol. (IJFCST) 5 (2015)

Kennedy, A., Inkpen, D.: Sentiment classification of movie reviews using contextual valence shifters. Int. J. Comput. Appl. 22 , 1467–8640 (2006)

MathSciNet   Google Scholar  

Kumar, M., Yadav, D.K., Singh A., Gupta, V.K.: A movie recommender system: Movrec. In: International Journal Of Computer Applications, ACL 2002 Conference on Empirical Methods in Natural Language Processing, vol. 124, pp. 0975– 8887 (2015)

Amini, M., Nasiri, M., Afzali, M.: Sentiment classification using machine learning techniques. (IJCSIS) Int. J. Comput. Sci. Inf. Secur. 12 (2014)

Samples, Data-set for making neural network (2016). www.recommendme.in/abhishek/neuralnetwork

Science Direct. Definition of cold start problem (2016). http://www.sciencedirect.com/science/article/pii/S0306437914001525

Kim, S.-M., Hovy, E.: Determining the sentiment of opinions. In: Proceedings of the COLING conference (2004). http://www.isi.edu/natural-language/people/hovy/papers/04Coling-opinionvalences.pdf

Sentiment analysis API. In: Natural Language Processing, vol. 124, pp. 0975–8887. http://www.intellexer.com/sentiment_analyzer.html/methods

Link of the website that we have created: http://www.recommendme.in/abhishek/startpage/

Download references

Author information

Authors and affiliations.

ABV-Indian Institute of Information Technology and Management, Gwalior, India

Abhishek Mahata, Nandini Saini, Sneha Saharawat & Ritu Tiwari

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Nandini Saini .

Editor information

Editors and affiliations.

IIT Kharagpur , Kharagpur, India

Anupam Basu

IIT Madras , Chennai, India

Sukhendu Das

Telecom SudParis , Evry, France

Patrick Horain

IIT Guwahati , Guwahati, India

Samit Bhattacharya

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper.

Mahata, A., Saini, N., Saharawat, S., Tiwari, R. (2017). Intelligent Movie Recommender System Using Machine Learning. In: Basu, A., Das, S., Horain, P., Bhattacharya, S. (eds) Intelligent Human Computer Interaction. IHCI 2016. Lecture Notes in Computer Science(), vol 10127. Springer, Cham. https://doi.org/10.1007/978-3-319-52503-7_8

Download citation

DOI : https://doi.org/10.1007/978-3-319-52503-7_8

Published : 24 January 2017

Publisher Name : Springer, Cham

Print ISBN : 978-3-319-52502-0

Online ISBN : 978-3-319-52503-7

eBook Packages : Computer Science Computer Science (R0)

Share this paper

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Publish with us

Policies and ethics

  • Find a journal
  • Track your research

COMMENTS

  1. Movie recommendation and sentiment analysis using machine learning

    This paper describes an approach to a movie recommendation system using Cosine Similarity to recommend similar movies based on the one chosen by the user. Although the existing recommendation systems get the job done, it does not justify if the movie is worth spending time on. ... This paper implements five machine learning classifiers ...

  2. Machine Learning Model for Movie Recommendation System

    CineMatch is software that is embedded in the Netflix website where it applies machine learning and data mining to analyze customers' preferences on movie selection and recommend other movies that ...

  3. Movie Recommender Systems: Concepts, Methods, Challenges, and Future

    Some of the most popular machine learning algorithms used in movie recommender systems such as K-means clustering, principal component analysis, and self-organizing maps with principal component analysis are discussed in detail. Special emphasis is given to research works performed using metaheuristic-based recommendation systems.

  4. Movie recommendation system using machine learning

    The development of a movies recommendation system by considering the history of the movie watched by the users through the collaborative filtering approach and the result revealed that two movies; Get Shorty and Bad Boys are highly recommended by the Users. 1. Highly Influenced. 3 Excerpts.

  5. Hybrid Movie Recommendation System Using Machine Learning

    Every year new movies are released with a varied story-line or a genre which could be of potential interest to viewers. Various online movie or video streaming platforms can keep the customers engaged by recommending movies of the viewer's preference. A key research challenge for Recommender engines is make more targeted recommendations. This paper presents Filtering approaches including ...

  6. Movie recommendation system using machine learning

    Movie recommendation system using machine learning. One of the most common uses of machine learning is recommendation systems. A broad description of recommendation systems is that they aid in predicting user ratings or preferences for an item. Their major purpose is to provide suggestions based on previously collected data or user preferences.

  7. Machine Learning-Based Hybrid Movie Recommendation System

    The paper evaluates the algorithm using data from the MovieLens website, with data divided into training as well as testing sets, and LSTM-CNN algorithm is found to be effective in extracting data features for generating accurate movie recommendations. A machine learning-based movie recommendation model was developed by Ahmed et al. , who used ...

  8. Machine Learning Based Movie Recommendation System

    Recommendation systems plays are an important aspect nowadays due to its wide range of applications. It is widely used by many Companies to deliver better user experience and achieve growth by using it in a right manner. This paper presents information about methodologies and techniques used in recommendation systems. Recommendation systems are classified into 3 distinct types: Collaborative ...

  9. [2112.12463] Comprehensive Movie Recommendation System

    A recommender system, also known as a recommendation system, is a type of information filtering system that attempts to forecast a user's rating or preference for an item. This article designs and implements a complete movie recommendation system prototype based on the Genre, Pearson Correlation Coefficient, Cosine Similarity, KNN-Based, Content-Based Filtering using TFIDF and SVD ...

  10. A Smart Movie Recommendation System Using Machine Learning Predictive

    In this paper, we propose a machine learning-based smart movie recommendation system. We conducted research and made basic recommendations based on similar genres and films that users enjoy. The goal of this work is to learn how to execute feature engineering, missing values handling, data manipulation according to requirements, and ...

  11. Movie Recommendation System using Machine Learning

    The recommendation engine filters information using specific algorithms and recommends high quality content to customers. It starts capturing more consumer behavior and based on that, recommends products that consumers can purchase. Three key strategies are used in our recommendation structures. One Demographic Filtering i.e. They offer general suggestions for each individual, based entirely ...

  12. Movie recommendation system using machine learning

    The algorithms can recommend movies based on one or a combination of twoor more criteria by using collaborativeand content-based filtering techniques in machine learning.The cold-start problem, or ...

  13. (PDF) Movie Recommendation System

    The recommendation system derived into Collaborative Filtering, Content-based, and hybrid-based approaches. This paper classifies collaborative filtering using various approaches like matrix ...

  14. Movie Recommendation System Using Machine Learning and MERN Stack

    4 System Architecture. The system is divided into two units. First being the machine learning recommendation server made using flask framework and is written in Python. Second part is made using the MERN stack which constitutes the web server, front end, and the database.

  15. Movie Recommendation Models Using Machine Learning

    A recommendation system is essentially a kind of data filtering. Recommendation systems are very important in today's world since they give users suggestions based on their interests and needs when they are looking for something. Different types of recommendation systems are utilized on numerous platforms and have become an integral feature of a variety of applications. Movie recommendation ...

  16. Enhancing Sequence Movie Recommendation System Using Deep Learning and

    A flood of information has occurred, making it challenging for people to find and filter their favorite items. Recommendation systems (RSs) have emerged as a solution to this problem; however, traditional Appenrecommendation systems, including collaborative filtering, and content-based filtering, face significant challenges such as data scalability, data scarcity, and the cold-start problem ...

  17. PDF Movie Recommendation System Using Machine Learning, Nlp

    Abstract: In today's digital landscape, recommendation systems play a pivotal role in enhancing user experiences and driving business success. This research delves into the intricate workings of recommendation systems, with a particular focus on the Movie Recommendation System using Machine Learning and Natural Language Processing (NLP).

  18. (PDF) A Movie Recommender System: MOVREC

    Movie recommendation systems are one of the most widely used applications of recommendation systems. ... These systems use data analytics, machine learning algorithms and temporal analysis ...

  19. A Review on Movie Recommendation System Using Machine Learning

    System which gives recommendations usually filters the given data using various methodology and suggests the relevant one to the customer benefit. In day-to-day life people usually use a powered recommendation system in many areas like movies, books, music, news, items etc. In this paper a wide range of work is reviewed in the field of a recommender system for movies where dataset source ...

  20. PDF Movie Recommendation System Using Machine Learning"

    In the proposed system, a movie recommendation system is built using a Machine Learning algorithm that is Cosine Similarity. Based on the user preference that is entered by the user the movies will be recommended. In this recommendation system, we are going to recommend the top 5 movies related to their preference

  21. Movie Recommender System using Machine Learning

    "A Novel Scheme for Movie Recommendation System using User Similarity and Opinion Mining", International Journal of Innovative Technology and Exploring Engineering, vol: 8, 2019, pp: 316-322

  22. Intelligent Movie Recommender System Using Machine Learning

    Abstract. Recommender systems are a representation of user choices for the purpose of suggesting items to view or purchase. The Intelligent movie recommender system that is proposed combines the concept of Human-Computer Interaction and Machine Learning. The proposed system is a subclass of information filtering system that captures facial ...

  23. PDF A Research Paper on Machine Learning based Movie Recommendation System

    International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 08 Issue: 03 | Mar 2021 www.irjet.net p-ISSN: 2395-0072 ... A Research Paper on Machine Learning based Movie Recommendation System ... So recommendation system make the things easy. In this paper we are building a movie recommendation system with ...

  24. (PDF) Machine Learning: Movie Recommendation System

    Machine learning is a sub-category of Artificial Intelligence where a system is programmed to read and analyze datasets. Through Machine learning, systems can be programmed to make predictions in ...