<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Efficient Fusion Techniques for Result Diversification and Image Interestingness Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Prabavathy Balasundaram</string-name>
          <email>prabavathyb@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G Gnana Sai</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kishore N</string-name>
          <email>kishore2110289@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olirva M</string-name>
          <email>olirva2110544@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Makesh Vaibhav A.G</string-name>
          <email>makesh2110629@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Naren Srinivasan Murali</string-name>
          <email>naren2110695@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Parlapalli Sai Harshith</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty, Department of Computer Science, Sri Sivasubramaniya Nadar College of Engineering</institution>
          ,
          <addr-line>Chennai, Tamil Nadu</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>UG Student, Sri Sivasubramaniya Nadar College of Engineering</institution>
          ,
          <addr-line>Chennai, Tamil Nadu</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Result diversification aims to retrieve a set of images that is both relevant and diverse, effectively capturing the essence of a given query. Image interestingness prediction aims to assess and predict the level of interest in images accurately, enabling a better user experience and better content organization. Both tasks can use inducer fusion, which combines the outputs of multiple inducers to improve the accuracy and robustness of prediction models. In this work, independent and ensemble ML techniques were used to address the challenges in inducer fusion. Experimental validation was carried out on the result diversification and image interestingness datasets of the ImageCLEFfusion 2023 task. Our research contributes to advancing the field of inducer fusion and improving the performance of result diversification and image interestingness tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Result diversification</kwd>
        <kwd>Image interestingness</kwd>
        <kwd>Inducer fusion</kwd>
        <kwd>Machine learning techniques</kwd>
        <kwd>Ensemble Machine learning techniques</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>With the exponential growth of digital imagery on the internet, effective image retrieval systems
have become indispensable for users seeking visual content. Traditional image search engines
primarily rely on content-based features and textual metadata to generate a ranked list of visually
similar images. However, this approach often falls short in providing diverse search results,
leading to redundancy and limited exploration of the search space. The ImageCLEFfusion result
diversification task [1], [2] was introduced to address this limitation and encourage the
development of techniques that enhance result diversification.</p>
      <p>The proliferation of visual content on various platforms necessitates effective techniques for
predicting image interestingness. Accurately determining the level of interestingness associated
with images holds immense value in applications such as image search, recommendation systems,
and content curation. The ability to automatically rank and retrieve interesting images not
only enhances user satisfaction but also streamlines information retrieval processes. In recent
years, substantial progress has been made in the development of computational models and
techniques for image interestingness prediction. However, the diverse and subjective nature
of interestingness poses significant challenges. To address these challenges, Constantin et al.
[9] introduced the Interestingness10k dataset, which serves as a standardized benchmark for
evaluating image interestingness prediction methods.</p>
      <p>This paper presents a study on result diversification and image interestingness predictions
using fusion techniques. The research objective is to investigate the effectiveness
of inducer fusion, a technique that combines the outputs of multiple inducers, in enhancing
prediction performance. Inducer fusion aims to leverage the strengths of individual inducers
and mitigate their weaknesses, ultimately resulting in a more accurate and robust prediction
model.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>The existing work related to result diversification and image interestingness is
summarised below.</p>
      <p>Hai-Tao Yu [3] proposes a framework dubbed MO4SRD for search result diversification.
While existing methods rely on a sequential selection procedure, MO4SRD adopts
a score-and-sort approach based on direct metric optimization. It represents the diversity
score of each document using probability distributions, enabling the development of different
variations of diversity metrics. A probabilistic neural scoring function that takes into account
cross-document interaction and permutation equivariance is incorporated into the system.
MO4SRD is tested on the four standard test datasets released for the diversity tasks of the TREC Web
Track from 2009 to 2012, and the results suggest that it performs better than existing approaches.</p>
      <p>Shreya Sriram et al. [4] proposed an ensemble approach for web search result diversification
using neural network models. The data was obtained from the Retrieving Diverse Social Images
Task dataset. Different networks, namely Multilayer Perceptron, Ridge Regressor with grid
search, and Keras Regressor with a Sequential model, were ranked by MAE. These ranks
and the models were fed as input to the Voting Regressor, whose performance was again
measured with MAE. Among the ten best submissions to ImageCLEF
2022, the best F1 score and CR score were 0.5604 and 0.4384 respectively.</p>
      <p>Lekshmi Kalinathan et al. [5] presented a fusion approach for web search result diversification
using machine learning algorithms. The data was obtained from the Retrieving Diverse Social
Images Task dataset. A voting regressor built from three predictor models (k-Nearest Neighbors
Regressor, Decision Tree Regressor and SVM) was used to predict the similarity scores in the
validation dataset. Among the ten best submissions to ImageCLEF 2022, the best F1 score and CR
score were 0.5634 and 0.4414 respectively.</p>
      <p>Maria Shoukat et al. [6] investigated the prediction of media interestingness scores
using a novel late fusion framework. The individual inducers’ scores are extracted from the
Interestingness10k dataset and are provided by the task organizers. The proposed framework
combines multiple algorithms and employs two fusion strategies: naive fusion and merit-based
fusion. The results revealed that the proposed late fusion framework consistently outperformed
alternative approaches, exhibiting superior predictive accuracy and robustness. Overall, this
paper offers a comprehensive exploration of media interestingness prediction, providing a
valuable contribution to the existing literature.</p>
      <p>Ying Dai [7] proposed two image interestingness models with different
convolutional neural network architectures and improved their image aesthetic score (AS) prediction
with an ensemble. The models are trained on two datasets, CUHK-PQ and XiheAA. One
model extracts the subject of the image for predicting the image’s aesthetic score, and the other
extracts the holistic composition for the prediction. It is found that the models trained on the
XiheAA dataset seem to learn the latent photography principles, though it cannot be said that
they learn the aesthetic sense. The aggregated model improves the F1 value by 5.4% and 33.1%
compared to the first and second model respectively.</p>
      <p>V. Kalakota et al. [8] proposed a model to retrieve diverse images of a particular landmark
location that cover different aspects of a query. The required images are obtained from the Flickr
Div150Cred dataset. The Flickr baseline ranking algorithm and a re-ranking strategy are applied to
retrieve the most relevant images from the set of candidate images using the provided textual
metadata. A fusion-based strategy is employed to ensemble several cluster models, and a final
summary of the query location is produced by selecting images from different clusters. The
model is evaluated using P@10, CR@10, F1@10, P@20, CR@20 and F1@20. The proposed
method achieved state-of-the-art performance on precision and F1 score when 30 or more
images are retrieved. Cluster recall scores still need slight improvement when 10 or 20 images
are retrieved. Future work will be devoted to improving the cluster recall metric without affecting
the initial precision scores.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Task and Dataset Description</title>
      <sec id="sec-3-1">
        <title>3.1. Result Diversification</title>
        <p>The dataset used for this task is taken from [2] and corresponds to the Retrieving
Diverse Social Images Task dataset [10]. An inducer is a model that predicts images related to
a query. The outputs of 56 inducers, covering a total of 123 queries, are split into a devset
(56 inducers for 60 queries) for training and a testset (56 inducers for 63 queries) for testing. The
task is to diversify the results of image search. This fusion task is a retrieval task, in which a
similarity score between each image and the query is generated. Each entry (row) in these files
follows the format given in Table 1.</p>
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Field</th><th>Description</th></tr>
            </thead>
            <tbody>
              <tr><td>query id</td><td>the unique id of the query</td></tr>
              <tr><td>photo id</td><td>the unique id of the photo represented by the entry</td></tr>
              <tr><td>rank</td><td>rank of the photo</td></tr>
              <tr><td>sim</td><td>similarity score of the photo to the query</td></tr>
              <tr><td>run name</td><td>a general name for the inducer</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Media Interestingness</title>
        <p>The media interestingness fusion task corresponds to the problem of predicting the
interestingness of a particular image. An inducer is responsible for determining the interestingness
of the given images. The output of an inducer consists of the relevant images together with their
interestingness classification and score. However, a single inducer can be unsuitable in certain
application areas because of low precision and weak performance. To tackle this problem,
ensembling, a technique that aggregates the predictions of several inducers, is used. The
ensembled system is expected to outperform the highest-performing individual inducer. The data
for this task corresponds to the Interestingness10k dataset [9].
The output data from 29 inducers, representing visual interestingness predictions for 2435
images, is stored in separate text files for each inducer. Each entry of these files follows the
format given in Table 2.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodologies used</title>
      <p>Different machine learning algorithms were employed for the two tasks: Elastic Net, Gradient
Boosting Regressor and Decision Tree for the result diversification task, and XGBoost Classifier,
k-Nearest Neighbors Classifier and Decision Tree for the image interestingness task.</p>
      <sec id="sec-4-1">
        <title>4.1. XGBoost Classifier</title>
        <p>XGBoost is an efficient machine learning algorithm known for its ensembling-based approach.
It combines multiple decision tree models sequentially, leveraging gradient boosting to
improve predictions continuously by addressing errors made by previous trees. XGBoost mitigates
overfitting through regularisation and exposes a range of hyperparameters for tuning.
It has advanced capabilities such as handling missing values and parallel processing,
and it can handle large-scale datasets effectively. Metrics including accuracy, precision, recall,
and F1-score are used in evaluation.</p>
      </sec>
      <sec>
        <title>4.2. K-Nearest Neighbors Classifier</title>
        <p>k-Nearest Neighbors (k-NN) is a machine learning algorithm used for classification and
regression tasks. The k nearest neighbours in the training set are taken into account when predicting
the value or class of a new instance. For classification, it chooses the majority class among the
neighbours, and for regression, it takes the average of their values. The bias-variance trade-off
and the complexity of the model are influenced by the choice of k. Since it is non-parametric,
k-NN can be applied to a variety of situations, although it is sensitive to irrelevant features and
to the choice of distance metric.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.3. Elastic Net</title>
        <p>The Elastic Net is a regression technique that combines L1 (Lasso) and L2 (Ridge) regularisation
methods to achieve a balance between feature selection and feature grouping. The model
introduces two hyperparameters, alpha and l1 ratio, which control the extent of L1 and L2
regularisation applied during training. By adjusting these hyperparameters, the Elastic Net
model can effectively handle both feature selection and grouping, resulting in more accurate
and interpretable regression models. After training the model with the chosen hyperparameters,
predictions are made on the test set. The performance of the model is evaluated using mean
absolute error (MAE). This metric provides insights into the accuracy and goodness of fit of the
Elastic Net model.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.4. Gradient Boosting</title>
        <p>Gradient Boosting is a powerful machine learning algorithm used for regression tasks. It
combines multiple weak predictive models like decision trees in an ensemble to make accurate
predictions. The algorithm works by sequentially fitting the models to the residuals of the
previous model, allowing it to gradually improve its performance by focusing on the remaining
errors. This iterative process effectively captures complex patterns and relationships in the
data. The Gradient Boosting Regressor utilises gradient descent optimization to minimise a
loss function, such as mean squared error, and find the best fitting model. The algorithm also
incorporates regularisation techniques to prevent over-fitting and enhance generalisation.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.5. Decision Tree</title>
        <p>The decision tree algorithm is a powerful supervised learning method that constructs a tree-like
model as shown in Figure 1. In this model, internal nodes represent features or attributes,
branches represent decision rules, and leaf nodes correspond to predicted values. The algorithm
initiates by selecting the best attribute to split the dataset, evaluating various attributes and
measuring their impact on reducing the target variable’s impurity. This attribute selection
process is recursively applied to subsets of data until a predefined stopping criterion is satisfied.
Upon constructing the tree, each leaf node is assigned a predicted value based on the average of
the target variable. This enables the model to make predictions on new, unseen instances by
traversing the tree from the root node to a leaf node, guided by the instance’s attribute values.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.6. Voting Regressor</title>
        <p>The Voting Regressor shown in Figure 2 is a machine learning ensemble technique. It combines
multiple base models, including ElasticNet, Gradient Boosting Regressor, and Decision Tree
Regressor, to make predictions. The final prediction is the average of the individual predictions
of the base models. The Voting Regressor leverages the strengths of each
base model, resulting in improved overall prediction accuracy and robustness. Here, the Voting
Regressor is trained on the training data and used to predict the target variable on the test set.
The performance of the model is evaluated using the Mean Squared Error (MSE).</p>
      </sec>
      <sec id="sec-4-6">
        <title>4.7. Stack Ensemble Regressor</title>
        <p>The Stack Ensemble combines multiple base models, including ElasticNet,
GradientBoostingRegressor, and DecisionTreeRegressor, to create a more robust and accurate predictive model. The
base models are trained individually on the training data, and their predictions are then used as
input features for a meta-model. The meta-model learns to combine the predictions of the base
models to make the final prediction. By leveraging the strengths of different base models, the
Stack Ensemble aims to improve the overall predictive performance. The ensemble model is
trained on the training data and evaluated on the test data using Mean Squared Error.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Result Analysis of Fusion Technique for Result Diversification Task</title>
      <p>This section discusses the implementation of the fusion techniques and analyses the
results using the evaluation metrics Mean Squared Error (MSE), Root Mean Squared Error
(RMSE) and Mean Absolute Error (MAE).</p>
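      <p>As a minimal sketch of how these three error metrics relate, the snippet below computes them from
their standard definitions. The function name regression_errors and the toy values are illustrative
assumptions and are not taken from the dataset.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def regression_errors(y_true, y_pred):
    """Return (MSE, RMSE, MAE) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / len(residuals)   # mean of squared errors
    mae = sum(abs(r) for r in residuals) / len(residuals)  # mean of absolute errors
    return mse, math.sqrt(mse), mae                        # RMSE is the root of MSE


# Toy similarity scores (hypothetical values).
mse, rmse, mae = regression_errors([0.9, 0.5, 0.1], [0.8, 0.5, 0.3])
print(f"MSE={mse:.4f} RMSE={rmse:.4f} MAE={mae:.4f}")
```
]]></preformat>
      <p>Because RMSE is the square root of MSE, the two metrics always rank models identically; MAE can
rank them differently because it penalises large errors less strongly.</p>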
      <sec id="sec-6-1">
        <title>5.1. Implementation</title>
        <p>The inducers’ data contains the fields query_id, inter, photo_id, rank, sim and run_name.
The rank and sim attributes are extracted from it, and their missing values
are filled using the SimpleImputer class of scikit-learn. The data is then split into an 80% training
set and a 20% testing set. Three regression models M1, M2 and M3 are built to study how the
similarity scores are assigned.</p>
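        <p>The preprocessing described above can be sketched as follows. This is an illustration under stated
assumptions rather than the code used in the experiments: the toy rows, the mean imputation strategy,
the use of rank as the input feature and sim as the target, and the random seed are all assumed.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# Hypothetical inducer rows: column 0 is rank, column 1 is sim (NaN = missing).
rows = np.array([
    [1.0, 0.92], [2.0, np.nan], [3.0, 0.75], [np.nan, 0.61], [5.0, 0.48],
    [6.0, 0.40], [7.0, np.nan], [8.0, 0.22], [9.0, 0.15], [10.0, 0.05],
])

# Fill each missing entry with the mean of its column (assumed strategy).
filled = SimpleImputer(strategy="mean").fit_transform(rows)

X = filled[:, [0]]  # rank as the single input feature (assumption)
y = filled[:, 1]    # sim as the regression target (assumption)

# 80% of the rows for training, 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)
```
]]></preformat>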
        <p>The model M1 is implemented using the ElasticNet regressor, where the alpha parameter
controls the regularization strength. It helps to prevent overfitting by shrinking the coefficients
towards zero. The l1_ratio parameter determines the balance between the L1 and L2 penalties. An
l1_ratio of 1 corresponds to pure L1 regularization (Lasso) and an l1_ratio of 0 to pure L2
regularization (Ridge). A value between 0 and 1 represents a combination of both penalties. The model
is trained on the training data using the fit method, which estimates the coefficients that best fit
the data.</p>
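        <p>To illustrate the effect of l1_ratio, the following sketch fits scikit-learn's ElasticNet twice on
synthetic data in which only two of five features carry signal. The data, the alpha value and the two
l1_ratio settings are assumptions chosen for illustration only.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features carry signal; the remaining three are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# l1_ratio=1.0 gives a pure L1 (Lasso) penalty; l1_ratio=0.1 is mostly L2 (Ridge).
lasso_like = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, y)
ridge_like = ElasticNet(alpha=0.5, l1_ratio=0.1).fit(X, y)

print("L1-dominated coefficients:", lasso_like.coef_.round(3))
print("L2-dominated coefficients:", ridge_like.coef_.round(3))
```
]]></preformat>
        <p>With the L1-dominated penalty the coefficients of the noise features are driven exactly to zero,
which is the feature selection behaviour described above, while both settings shrink the informative
coefficients towards zero.</p>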
        <p>The model M2 is implemented using GradientBoostingRegressor which calculates the gradients
of the loss function with respect to the predictions made by the weak learners. In the code, the
gradients are implicitly computed during the training process of the GradientBoostingRegressor.</p>
        <p>The model M3 is implemented using the Decision Tree Regressor. The fit(X, y) method is used to
train the Decision Tree Regressor on the given training data. The predict(X) method is used to
make predictions on new data using the trained Decision Tree Regressor.</p>
        <p>The model M4 is implemented using the Voting Regressor. The VotingRegressor model is created
by passing the base models M1, M2 and M3 as estimators to the VotingRegressor class. Unlike a
voting classifier, VotingRegressor has no hard or soft voting mode: the final prediction is the
(optionally weighted) average of the predictions of the base models.</p>
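        <p>A minimal sketch of such an ensemble with scikit-learn's VotingRegressor is given below. The
synthetic rank and sim values and all hyperparameters are assumptions made for illustration; the
final lines compare the ensemble output with the plain mean of the fitted base models.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, VotingRegressor
from sklearn.linear_model import ElasticNet
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(1, 50, size=(300, 1))                  # stand-in for rank
y = 1.0 / X[:, 0] + rng.normal(scale=0.01, size=300)   # stand-in for sim

base_models = [
    ("m1", ElasticNet(alpha=0.001, l1_ratio=0.5)),
    ("m2", GradientBoostingRegressor(random_state=0)),
    ("m3", DecisionTreeRegressor(max_depth=5, random_state=0)),
]
m4 = VotingRegressor(estimators=base_models).fit(X, y)

# Without weights, the ensemble output is the mean of the base-model outputs.
ensemble_pred = m4.predict(X)
per_model = np.column_stack([est.predict(X) for est in m4.estimators_])
manual_mean = per_model.mean(axis=1)
print(np.allclose(ensemble_pred, manual_mean))
```
]]></preformat>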
        <p>The stacked ensemble model is created using the StackingRegressor from scikit-learn. The estimators
are defined as a list of tuples, where each tuple contains the name of a base model (M1, M2 or
M3) and the model itself. The final estimator, a gradient boosting regressor, is used to build the
meta model. The ensemble model M5 is thus obtained from the models M1, M2 and M3 together with the
final estimator.</p>
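        <p>The stacking setup can be sketched in the same way. The example below is an illustration, not the
experimental code: the synthetic data, the hyperparameters and the 5-fold internal cross-validation
used to produce the meta-model inputs are assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(1, 50, size=(300, 1))                  # stand-in for rank
y = 1.0 / X[:, 0] + rng.normal(scale=0.01, size=300)   # stand-in for sim
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Each tuple pairs a name with a base model.
estimators = [
    ("m1", ElasticNet(alpha=0.001, l1_ratio=0.5)),
    ("m2", GradientBoostingRegressor(random_state=0)),
    ("m3", DecisionTreeRegressor(max_depth=5, random_state=0)),
]

# The meta-model is trained on out-of-fold predictions of the base models.
m5 = StackingRegressor(
    estimators=estimators,
    final_estimator=GradientBoostingRegressor(random_state=0),
    cv=5,
)
m5.fit(X_train, y_train)

rmse = mean_squared_error(y_test, m5.predict(X_test)) ** 0.5
print(f"RMSE on the held-out split: {rmse:.4f}")
```
]]></preformat>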
      </sec>
      <sec id="sec-6-2">
        <title>5.2. Results and discussion</title>
        <p>The built models M1, M2, M3, M4 and M5 are used to predict on the test set. The predicted
values are compared with the actual values, and various evaluation metrics are computed as
shown in Table 3 to assess the performance of these models. These metrics provide insights
into each model’s ability to predict the correct results based on rank and
similarity scores. The evaluation metrics in Table 3 show that the M1 model
yields the best results among all the models.</p>
        <p>The M1 model has been tested with the CLEF test data, and the F1@20 and CR@20 metrics
are used to compare and analyse the results. F1@20 is the harmonic mean of precision and
cluster recall at a cutoff of 20, providing a balanced measure of relevance and diversity. CR@20
(cluster recall at 20) measures the proportion of ground-truth clusters that are represented among
the top 20 ranked results, so a higher CR@20 score indicates a more diverse result list.
Among the ten best submissions, the best F1@20 is 0.5708 and the best CR@20 is 0.449.
Table 4 lists the F1@20 and CR@20 values for the 10 best file
submissions.</p>
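        <p>The sketch below shows one way to compute these cutoff metrics for a single query. It assumes the
definitions commonly used with the Retrieving Diverse Social Images benchmark [10], namely that CR@k
is cluster recall and F1@k is the harmonic mean of P@k and CR@k. The function name, its arguments and
the toy ids are hypothetical, and the official evaluation tool may differ in detail.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def diversity_metrics_at_k(ranked_ids, cluster_of, n_clusters, k=20):
    """Return (P@k, CR@k, F1@k) for one query.

    ranked_ids : photo ids ordered by decreasing fused score.
    cluster_of : dict mapping each relevant photo id to its ground-truth cluster.
    n_clusters : total number of ground-truth clusters for the query.
    """
    top = ranked_ids[:k]
    relevant = [pid for pid in top if pid in cluster_of]
    precision = len(relevant) / k
    # Cluster recall: share of distinct clusters covered by the top-k results.
    cluster_recall = len({cluster_of[pid] for pid in relevant}) / n_clusters
    if precision + cluster_recall == 0:
        return precision, cluster_recall, 0.0
    f1 = 2 * precision * cluster_recall / (precision + cluster_recall)
    return precision, cluster_recall, f1


# Toy query: photos a, b and d are relevant; a and b share cluster 0.
p_at_4, cr_at_4, f1_at_4 = diversity_metrics_at_k(
    ["a", "b", "c", "d"], {"a": 0, "b": 0, "d": 1}, n_clusters=4, k=4
)
print(p_at_4, cr_at_4, round(f1_at_4, 4))
```
]]></preformat>
        <p>In the toy query, retrieving two photos from the same cluster raises precision but not cluster
recall, which is why a fusion method has to balance relevance against diversity.</p>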
      </sec>
    </sec>
    <sec id="sec-8">
      <title>6. Result Analysis of Fusion Technique for Image Interestingness Task</title>
      <p>This section discusses the implementation of the fusion techniques and analyses the
results using the evaluation metrics Accuracy, Precision, Recall, F1 score, Mean Absolute
Error and Balanced Accuracy.</p>
      <sec id="sec-8-1">
        <title>6.1. Implementation</title>
        <p>The inducers’ data contains information about video and image identifiers, classification labels,
and interestingness scores. The interestingness scores and classification labels are extracted
from the inducers’ data. The interestingness scores are stored in a NumPy array, while the
classification labels are stored in a separate array. The data is then split into an 80% training
dataset and a 20% testing dataset. Three classifier models M1, M2 and M3 were built to study the
nature of the classification of the images.</p>
        <p>The M1 classifier is implemented using the XGBoost algorithm with grid search to find the
best combination of hyperparameters such as max_depth.
The GridSearchCV class from scikit-learn is used to perform the grid search, with the F1 score
as the evaluation metric.</p>
        <p>The M2 classifier is implemented using the decision tree algorithm with grid search to
find the optimal combination of hyperparameters such as max_depth.</p>
        <p>The M3 classifier is implemented using the k-nearest neighbours algorithm with grid search
to find the optimal combination of the hyperparameters n_neighbors, weights and p,
representing the number of neighbors, the weight function used in prediction, and the power
parameter for the Minkowski distance, respectively.</p>
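        <p>A grid search of this kind can be sketched with scikit-learn as follows. The synthetic data, the
candidate values in the grid, the F1 scoring and the 5-fold cross-validation are assumptions made for
illustration and do not reproduce the grid used in the experiments.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the inducers' scores and binary interestingness labels.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

param_grid = {
    "n_neighbors": [3, 5, 7, 9],          # number of neighbours
    "weights": ["uniform", "distance"],   # weight function used in prediction
    "p": [1, 2],                          # Minkowski power: 1 = Manhattan, 2 = Euclidean
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, scoring="f1", cv=5)
search.fit(X_train, y_train)

m3 = search.best_estimator_  # refitted on the full training split
print(search.best_params_, round(m3.score(X_test, y_test), 3))
```
]]></preformat>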
        <p>A Voting Classifier is created with all the models M1, M2, and M3. A grid search is performed
to find the optimal combination of the voting scheme and weights. The best Voting Classifier
model (M4) is obtained based on the grid search results.</p>
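        <p>The search over the voting scheme and the weights can be sketched as below. To keep the example
dependent on scikit-learn only, GradientBoostingClassifier is used as a stand-in for the XGBoost
classifier; this substitution, the synthetic data, the candidate weights and the scoring are all
assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

voter = VotingClassifier(estimators=[
    ("m1", GradientBoostingClassifier(n_estimators=50, random_state=0)),  # XGBoost stand-in
    ("m2", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("m3", KNeighborsClassifier(n_neighbors=5)),
])

param_grid = {
    "voting": ["hard", "soft"],               # majority vote vs. averaged probabilities
    "weights": [None, [2, 1, 1], [1, 1, 2]],  # None means equal weights
}
search = GridSearchCV(voter, param_grid, scoring="f1", cv=3)
search.fit(X_train, y_train)

m4 = search.best_estimator_
test_accuracy = m4.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```
]]></preformat>
        <p>Soft voting requires every base model to provide class probabilities, which holds for the three
classifiers used in this sketch.</p>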
        <p>The stacked ensemble model is created using the StackingClassifier from scikit-learn. The estimators
are defined as a list of tuples, where each tuple contains the name of a base model (M1, M2 or
M3) and the corresponding best model instance. The final estimator, a decision tree
classifier, is used to build the meta model. The ensemble model M5 is thus obtained from the models
M1, M2 and M3 together with the final estimator.</p>
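        <p>A corresponding sketch with scikit-learn's StackingClassifier is shown below. As an assumption made
to avoid an extra dependency, GradientBoostingClassifier again stands in for the XGBoost classifier;
the data, the hyperparameters and the 5-fold internal cross-validation are likewise illustrative.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Each tuple pairs a model name with a (here untuned) model instance.
estimators = [
    ("m1", GradientBoostingClassifier(n_estimators=50, random_state=0)),  # XGBoost stand-in
    ("m2", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("m3", KNeighborsClassifier(n_neighbors=5)),
]

# A decision tree acts as the meta-model over the base-model outputs.
m5 = StackingClassifier(
    estimators=estimators,
    final_estimator=DecisionTreeClassifier(max_depth=3, random_state=0),
    cv=5,
)
m5.fit(X_train, y_train)

balanced = balanced_accuracy_score(y_test, m5.predict(X_test))
print(f"Balanced accuracy on the held-out split: {balanced:.3f}")
```
]]></preformat>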
      </sec>
      <sec id="sec-8-2">
        <title>6.2. Results and discussion</title>
        <p>The built models M1, M2, M3, M4 and M5 are used to classify the test dataset. The predicted
labels are compared with the actual labels, and various evaluation metrics are computed as
shown in Table 5 to assess the performance of these models. These metrics provide insights
into each model’s ability to correctly classify and predict the interestingness of media content.
The evaluation metrics in Table 5 show that the M4
model yields the best results among all the models.</p>
        <p>The M4 model has been tested with the CLEF test data, and the MAP@10 metric is used to
compare and analyse the results. The Mean Average Precision at 10 ranges
from 0 to 1, where a higher value indicates better performance. It considers the order and
relevance of the retrieved items, giving more weight to relevant items appearing at higher
positions in the ranking. Among the ten best submissions, the best MAP@10 is 0.1331. Table
6 lists the MAP@10 scores for the 10 best file submissions.</p>
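        <p>One common way to compute MAP@10 is sketched below. Several normalisations of average precision
at a cutoff are in use; this sketch assumes division by the smaller of k and the number of relevant
items, which may differ from the official evaluation. The function names and toy ids are
hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def average_precision_at_k(ranked_ids, relevant_ids, k=10):
    """AP@k for one ranked list, normalised by min(k, number of relevant items)."""
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant_ids:
            hits += 1
            precision_sum += hits / position  # precision at this relevant position
    denom = min(k, len(relevant_ids))
    return precision_sum / denom if denom else 0.0


def mean_average_precision_at_k(rankings, relevant_sets, k=10):
    """Mean of AP@k over several ranked lists."""
    scores = [
        average_precision_at_k(ranked, relevant, k)
        for ranked, relevant in zip(rankings, relevant_sets)
    ]
    return sum(scores) / len(scores)


# Two toy ranked lists with their sets of interesting items.
map_at_10 = mean_average_precision_at_k(
    [["a", "b", "c", "d"], ["x", "y"]],
    [{"a", "c"}, {"y"}],
    k=10,
)
print(round(map_at_10, 4))
```
]]></preformat>
        <p>In the first toy list the relevant items sit at positions 1 and 3, so its average precision is
(1/1 + 2/3) / 2; moving the second relevant item up to position 2 would raise it to 1.</p>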
      </sec>
    </sec>
    <sec id="sec-9">
      <title>7. Conclusion</title>
      <p>To improve the predictions of the inducers in the result diversification
task, three base regressors and two ensemble models were implemented. The models were trained
on data from 56 different inducers, containing 134,400 training values, and tested on data from
56 inducers, containing 33,600 testing values. The base regressors obtained RMSE values of
0.0070, 0.1274 and 0.0360 respectively. The ensemble models obtained RMSE values of 0.0445 and
0.1287 respectively. The best model is chosen based on the RMSE. This model is then used to
predict the values of 56 inducers containing 176,400 values, and among the ten best submissions,
the best F1 score and CR score are 0.5708 and 0.4295 respectively.</p>
      <p>To improve the predictions of the inducers in the image interestingness
task, three base classifiers and two ensemble models were implemented. The models were trained
on data from 29 different inducers, containing 43,546 training values, and tested on data from
29 inducers, containing 10,886 testing values. The base classifiers obtained accuracies of 0.8705,
0.8637 and 0.8691 respectively. The ensemble models obtained accuracies of 0.8756 and 0.8461
respectively. The best model is chosen based on the accuracy. This model is then used to predict
the values of 29 inducers containing 16,182 values, and among the ten best submissions, the best
MAP@10 score is 0.1331.</p>
    </sec>
    <sec id="sec-10">
      <title>8. References</title>
      <p>[1] Bogdan Ionescu, Henning Müller, Ana-Maria Drăgulinescu, Wen-wai Yim, Asma Ben
Abacha, Neal Snider, Griffin Adams, Meliha Yetisgen, Johannes Rückert, Alba García
Seco de Herrera, Christoph M. Friedrich, Louise Bloch, Raphael Brüngel, Ahmad
Idrissi-Yaghir, Henning Schäfer, Steven A. Hicks, Michael A. Riegler, Vajira Thambawita, Andrea
Storås, Pål Halvorsen, Nikolaos Papachrysos, Johanna Schöler, Debesh Jha,
Alexandra-Georgiana Andrei, Ahmedkhan Radzhabov, Ioan Coman, Vassili Kovalev, Alexandru Stan,
George Ioannidis, Hugo Manguinhas, Liviu-Daniel Ștefan, Mihai Gabriel Constantin,
Mihai Dogariu, Jérôme Deshayes, Adrian Popescu, “Overview of the ImageCLEF 2023:
Multimedia Retrieval in Medical, Social Media and Recommender Systems Applications,”
Experimental IR Meets Multilinguality, Multimodality, and Interaction, Proceedings of the
14th International Conference of the CLEF Association (CLEF 2023), Thessaloniki, Greece,
September 18-21, 2023.</p>
      <p>[2] Liviu-Daniel Ștefan, Mihai Gabriel Constantin, Mihai Dogariu, Bogdan Ionescu, “Overview
of ImageCLEFfusion 2023 Task - Testing Ensembling Methods in Diverse Scenarios,”
Experimental IR Meets Multilinguality, Multimodality, and Interaction, Proceedings of the
14th International Conference of the CLEF Association (CLEF 2023), Thessaloniki, Greece,
September 18-21, 2023.</p>
      <p>[3] Hai-Tao Yu, “Optimize What You Evaluate With: Search Result Diversification Based on
Metric Optimization,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36,
no. 9, pp. 10399–10407, 2022.</p>
      <p>[4] Shreya Sriram, Ramachandran Balasundaram P, L. Kalinathan, “Ensembled Approach for
Web Search Result Diversification Using Neural Networks,” CLEF2022 Working Notes, CEUR
Workshop Proceedings, CEUR-WS.org, Bologna, Italy, 2022.</p>
      <p>[5] L. Kalinathan, P. Balasundaram, Sriram, “A Fusion Approach for Web Search Result
Diversification Using Machine Learning Algorithms,” CLEF2022 Working Notes, CEUR Workshop
Proceedings, CEUR-WS.org, Bologna, Italy, 2022.</p>
      <p>[6] Maria Shoukat, Khubaib Ahmad, Naina Said, Nasir Ahmad, Mohammed Hassanuzaman,
Kashif Ahmad, “A Late Fusion Framework with Multiple Optimization Methods for Media
Interestingness,” arXiv preprint arXiv:2207.04762, 2022.</p>
      <p>[7] Ying Dai, “Building CNN-Based Models for Image Aesthetic Score Prediction Using an
Ensemble,” Journal of Imaging, vol. 9, no. 2, pp. 2–30, 2023.</p>
      <p>[8] Vaibhav Kalakota, Ajay Bansal, “Diversifying Relevant Search Results from Social
Media Using Community Contributed Images,” IEEE 45th Annual Computers, Software, and
Applications Conference (COMPSAC), 2021.</p>
      <p>[9] Mihai Gabriel Constantin, Liviu-Daniel Ștefan, Bogdan Ionescu, Ngoc QK Duong,
Claire-Hélène Demarty, and Mats Sjöberg, “Visual interestingness prediction: A benchmark
framework and literature review,” International Journal of Computer Vision, 129:1526–1550,
2021.</p>
      <p>[10] Bogdan Ionescu, Mircea-Radu Rohm, Bogdan Boteanu, Adrian L. Gînscă, Mihai Lupu, and
Henning Müller, “Benchmarking Image Retrieval Diversification Techniques for Social
Media,” IEEE Transactions on Multimedia, 23:677–691, 2020.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>