<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>G Wireless Networks,
July</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>FedTCS: Federated Learning with Time-based Client Selection to Optimize Edge Resources</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saira Bano</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicola Tonellotto</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pietro Cassarà</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Gotta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CNIT - National Inter-University Consortium for Telecommunications</institution>
          ,
          <addr-line>Parma</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Information Engineering, University of Pisa</institution>
          ,
          <addr-line>Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Information Science and Technology Institute "A. Faedo", National Research Council</institution>
          ,
          <addr-line>Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>21</volume>
      <issue>2022</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Client sampling in federated learning (FL) is a significant problem, especially in massive cross-device scenarios where communication with all devices is not possible. In this work, we study the client selection problem using a time-based back-off system in federated learning for a MEC-based network infrastructure. In the FL paradigm, a group of nodes jointly trains a machine learning model with the help of a central server; client selection is expected to have a significant impact on FL applications deployed in future 6G networks, given the increasing number of connected devices. Our timer settings are based on an exponential distribution to obtain an expected number of clients for the FL process. Empirical results show that our technique is scalable and robust for a large number of clients and keeps data queues stable at the edge.</p>
      </abstract>
      <kwd-group>
        <kwd>Federated Learning</kwd>
        <kwd>Clients Selection</kwd>
        <kwd>Mobile Edge Computing (MEC) framework</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Mobile telecommunications is undergoing a digital transformation driven by the growing
use of artificial intelligence (AI) based services and by massive connection requests from
a plethora of smart devices, autonomous vehicles and industrial IoT-based systems. In this
scenario, a report published by Cisco [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] estimates that the number of IoT devices will reach
20 billion by 2023. The massive communication of these devices is generating ever-increasing
distributed traffic as they are integrated into many AI-based application scenarios such as
computer vision [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or autonomous vehicles [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The growth of these AI applications continues
to drive the development of wireless networks, and 6G is expected to bring the evolution of
mobile devices from "connected things" to "connected intelligence" [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. One of the fundamental
goals of 6G is to create a holistic system in which communication, computation and control
are jointly orchestrated to achieve unprecedented levels of reliability, energy efficiency and
sustainability [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In this 6G ecosystem, machine learning (ML) and AI play a critical role,
particularly through their deployment at the edge of wireless networks, close to end users,
and through the use of user data to train AI models. However, trustworthiness and
scalability, and especially user privacy and security, are among the most important requirements for
6G smart services and applications: the General Data Protection Regulation (GDPR)
guidelines must be met, and the direct transmission or collection of user data is prohibited [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        To address the problem of privacy, Google has proposed Federated Learning (FL) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], an
approach to distributed learning problems in which a shared global model is learned
from locally generated models of remote clients without sharing private data. This approach
has two advantages: first, it removes the need for a shared data repository for learning, and
second, it preserves user privacy. In FL, clients willing to participate in the
learning process collaboratively execute training tasks, which are orchestrated by edge
or cloud entities. Each participating client must download the current global shared
model, improve it through training with its local data, and summarise the changes as
a small, focused update in the form of weights or gradients. These client-generated updates
are sent to the entity in charge of aggregating them into the global
model. The updated model is then sent back to the clients, which replace their local model with
the updated model.
      </p>
      <p>
        In this paper, we focus on the implementation of an FL-based protocol suitable for network
infrastructure relying on Mobile Edge Computing (MEC). In particular, we address the problem
of client selection in the FL process. Even if an increasing number of clients
can generate a more accurate shared global model, the amount of data these clients generate to update the
global model can lead to overflows and instability of the data queues at the
edge of the MEC-based network infrastructure. One solution to prevent this overflow
and to keep the data queues stable is to select a subset of clients to send their models to the
edge. Many client selection techniques have been proposed in the literature,
such as [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], in which the authors try to overcome the challenge of selecting clients with
heterogeneous resources. In the proposed system, random clients receive the MEC operator’s
request to participate in the FL, then these clients inform the operator about their resources.
Finally, the MEC operator selects only those clients that can complete the tasks within a certain
time frame. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the authors proposed a biased client selection technique and showed that
selecting clients with higher losses of their learning model can improve the speed of error
convergence, which significantly reduces the communication overhead. However, the above
techniques are difficult to implement in many application scenarios, especially in the case of
mobile nodes.
      </p>
      <p>In this paper, we propose a new protocol called FedTCS, i.e., Federated Learning with
Time-based Client Selection. In this protocol, the control agent at the edge of the MEC-based
infrastructure selects the clients involved in the FL process to enable more efficient updating of
the shared global model. We assume that the buffers for storing the model updates generated
by the clients are placed at the edge. We propose a time-based protocol for selecting
clients involved in the FL process to avoid buffer overflows in the storage at the edge caused by
excessive data flow due to the increasing number of clients. We use Kafka, a distributed event
streaming platform for processing data streams. Specifically, clients store their local updates
in the Kafka broker, from where the entity in charge of averaging the updates can read them
to generate the global model. After averaging, the global model is again stored in the
broker so that clients can download it once it is available.</p>
      <p>The main contributions of this paper are summarised below:</p>
      <list list-type="bullet">
        <list-item><p>We propose a probabilistic time-based client selection scheme for the FL procedure based
on exponentially distributed timers. Through simulation-based analysis, we show that
our scheme maintains queue stability even as the number of clients increases,
thus avoiding the implosion of model updates at the edge.</p></list-item>
        <list-item><p>We propose a client selection scheme to deal with straggling edge devices by using an
equal selection probability during the FL procedure.</p></list-item>
        <list-item><p>We provide a preliminary performance analysis to demonstrate the scalability of the proposed
selection scheme, while keeping the expected number of clients optimal according to the
size of the queue.</p></list-item>
      </list>
      <p>The remainder of this article is organized as follows: in Section 2, we present our proposed
system for the client selection based on exponentially distributed timers; the experimental setup
and numerical results for FL are discussed in Section 3. Finally, in Section 4 we provide the
conclusions and future developments.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Time-based Clients Selection</title>
      <p>Federated learning can involve communication between numerous clients and the model
averaging entity, resulting in excessive latency and network congestion. We propose a client
selection approach that reduces congestion and latency while keeping buffer queues stable
at the edge. Our approach allows computing the average number of transmitting clients that
maximises the time-averaged accuracy of the federated learning procedure.</p>
      <p>
        In this work, a two-tier distributed network infrastructure based on the MEC architecture is
used to select an optimal subset from a large number of clients for FL, as shown in Figure 1. In
this distributed architecture, the FL server is located in the cloud, while a control agent (CA) is placed
at the edge for efficient client selection. Since there may be many edge systems serving different
areas, each of these systems, which are closer to the users, hosts its own CA. The
CA uses a short acknowledgment message (ACK), which it sends to all clients and to the server.
The ACK is triggered by the arrival of the first message from any client in the given round of FL,
as detailed in the steps below. The edge-based system also makes use of a Kafka broker,
which acts as a buffer to allow asynchronous reception of model updates from clients. The broker also
controls the flow of updates to and from clients and makes them available to the aggregation
server in the cloud. The goal of this work is to select a subset of clients such that the flow of
client updates toward the edge is reduced and does not saturate the
Kafka buffer at the edge. For this purpose, we propose a timer-based algorithm
in which the number of selected clients is matched to the size of the queue. The
algorithm works as follows:
      </p>
      <list list-type="order">
        <list-item><p>We assume that all clients train their model on locally available data and that all clients
are available for all rounds of FL. Before the FL task starts, the server broadcasts a
configuration message in the form of a tuple (<italic>r</italic>, <italic>T</italic>, λ) and sends the initial model
parameters to all <italic>N</italic> clients, where <italic>r</italic> is the incremental FL round number, <italic>T</italic> is the
timer interval, and λ is the exponential distribution parameter.</p></list-item>
        <list-item><p>Each client <italic>i</italic> receives the tuple from the server after a delay of 2<italic>d</italic>. In this work, due
to space constraints, we assume that all clients have heterogeneous training times but
homogeneous delays <italic>d<sub>i</sub></italic> = <italic>d</italic> from the edge, and that each edge has the same delay <italic>d</italic> from the
server.</p></list-item>
        <list-item><p>After receiving the request from the server, each client <italic>i</italic> draws a random exponentially
distributed timer <italic>t<sub>i</sub></italic>, also called a backoff timer, based on the received parameters, i.e., the
interval size, and delays its training until the timer expires. Each backoff timer value
lies in [0, <italic>T</italic>] and follows a truncated exponential distribution.</p></list-item>
        <list-item><p>The client with the smallest sum of backoff timer and training time sends its model parameters to
the edge first. Upon receiving them, the CA at the edge is
triggered and sends an ACK message to both the server and all the clients connected
to the edge via a control topic managed by Kafka.</p></list-item>
        <list-item><p>When clients send their model updates after their backoff timer and training have finished,
some control parameters are also appended to the message header, namely the round
number, the training time, and the timer value (<italic>r</italic>, <italic>τ<sub>i</sub></italic>, <italic>t<sub>i</sub></italic>). These parameters are used
by the server to adjust λ for the next round and to check whether a received message
belongs to the current round.</p></list-item>
        <list-item><p>Any client <italic>i</italic> that receives the ACK before it has finished its training, or while its timer
has not yet expired, suppresses training to save computational resources. This happens when
the sum of the client's training time <italic>τ<sub>i</sub></italic> and its timer <italic>t<sub>i</sub></italic> is larger than the
sum of the minimum training time, timer, and the round-trip delay 2<italic>d</italic> for receiving an ACK:
<italic>τ<sub>i</sub></italic> + <italic>t<sub>i</sub></italic> &gt; <italic>τ</italic><sub>min</sub> + <italic>t</italic><sub>min</sub> + 2<italic>d</italic>.</p></list-item>
        <list-item><p>After receiving an ACK, the server reads the messages from the broker at the edge and
calculates the average number of clients based on the timer values it receives from each
client for a given round of FL.</p></list-item>
        <list-item><p>After evaluating the expected number of clients <italic>E</italic>(<italic>n</italic>), the server performs global aggregation
based on the FedProx algorithm [<xref ref-type="bibr" rid="ref10">10</xref>], which achieves higher accuracy than FedAvg
in heterogeneous networks. The updated model is then distributed to all <italic>N</italic> clients.</p></list-item>
        <list-item><p>The server also sends λ and <italic>T</italic> for the next round along with the updated model, based
on the average number of clients it wants to receive and the maximum latency it can
tolerate in a round.</p></list-item>
        <list-item><p>All the above steps form a complete round of FL. This procedure continues until the desired
accuracy of the global model is reached.</p></list-item>
      </list>
      <p>This mechanism of sending model updates based on probabilistic timers requires
very low computational complexity at every IoT edge device.</p>
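      <p>The selection rule in the steps above can be sketched as one simulated round. This is a minimal illustration rather than the authors' implementation: the names <monospace>draw_timer</monospace> and <monospace>run_round</monospace> are hypothetical, training times are passed in as plain numbers, every client is assumed to share the same one-way delay, and the timer is assumed to follow a truncated exponential distribution on the interval from 0 to T whose density grows toward T.</p>
      <preformat>
```python
import math
import random


def draw_timer(lam, T, rng):
    """Inverse-CDF draw of a backoff timer on [0, T] (assumed timer law)."""
    return (T / lam) * math.log(rng.random() * math.expm1(lam) + 1.0)


def run_round(training_times, lam, T, d, rng):
    """Return the indices of clients whose updates reach the edge in one round.

    Client i would finish at t_i + tau_i. The first finisher triggers the ACK,
    which reaches the other clients one round trip (2 * d) later; clients that
    are still waiting or training at that point suppress their update.
    """
    finish = [draw_timer(lam, T, rng) + tau for tau in training_times]
    ack_time = min(finish) + 2.0 * d
    return [i for i, f in enumerate(finish) if ack_time >= f]


# Hypothetical setting: 1000 clients, training times between 1 and 3 time units.
rng = random.Random(1)
training_times = [rng.uniform(1.0, 3.0) for _ in range(1000)]
selected = run_round(training_times, lam=10.0, T=8.0, d=1.0, rng=rng)
print(len(selected), "of", len(training_times), "clients deliver an update")
```
      </preformat>
      <p>With a delay of zero only the first finisher is selected, and with a very large delay no client is suppressed, which brackets the behaviour the server tunes through the timer parameters.</p>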
      <sec id="sec-2-1">
        <title>2.1. Exponential Distributed Time-based Client Selection</title>
        <p>Suppose that a random variable <italic>t</italic> with a probability density function (PDF) has a given mean 1/λ.
The PDF of <italic>t</italic> under the exponential distribution truncated on
the right at <italic>T</italic> is then given by:
          <disp-formula><label>(1)</label><tex-math><![CDATA[f(t; \lambda, T) = \begin{cases} \dfrac{1}{e^{\lambda} - 1}\,\dfrac{\lambda}{T}\,e^{\lambda t / T} & \text{if } 0 \le t \le T \\ 0 & \text{otherwise} \end{cases}]]></tex-math></disp-formula>
        </p>
        <p>
          where <italic>T</italic> is known and sent by the server. The exponential distribution has often been used to
estimate the mean of the desired population belonging to a particular group. The densities of
truncated distributions are significant in modelling such populations [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. To find the
timer value <italic>t<sub>i</sub></italic> for each client <italic>i</italic>, we use inverse cumulative distribution function (CDF)
sampling and compute <italic>t<sub>i</sub></italic> from the following expression:
          <disp-formula><label>(2)</label><tex-math><![CDATA[t_i(u; \lambda, T) = \frac{T}{\lambda}\,\log\bigl(u\,(e^{\lambda} - 1) + 1\bigr)]]></tex-math></disp-formula>
        </p>
        <p>
          where <italic>u</italic> is drawn from the uniform distribution on [0, 1]. The FL server adjusts λ and <italic>T</italic>
based on the number of clients expected for a given number of rounds and the desired accuracy,
while keeping the queue stable. However, as λ increases, the weight of the density
shifts towards <italic>T</italic>, resulting in a dense timer setting [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. To find <italic>E</italic>(<italic>n</italic>) for <italic>N</italic> = 1000
clients, we vary <italic>T</italic> and λ as shown in Figure 2a.
        </p>
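        <p>The inverse-CDF sampling step can be sketched in a few lines of Python. The sketch assumes that the timer CDF on the interval from 0 to T is F(t) = (exp(λt/T) - 1) / (exp(λ) - 1), which is one form consistent with a truncated exponential whose density shifts toward T as λ grows; the function names are hypothetical.</p>
        <preformat>
```python
import math
import random


def timer_from_uniform(u, lam, T):
    """Map u in [0, 1) to a backoff timer in [0, T] by inverting the CDF."""
    return (T / lam) * math.log(u * math.expm1(lam) + 1.0)


def timer_cdf(t, lam, T):
    """Assumed CDF of the truncated exponential timer (used as a check)."""
    return math.expm1(lam * t / T) / math.expm1(lam)


def sample_backoff_timer(lam, T, rng=random):
    """Draw one backoff timer using a uniform random number."""
    return timer_from_uniform(rng.random(), lam, T)


rng = random.Random(0)
timers = [sample_backoff_timer(10.0, 8.0, rng) for _ in range(10000)]
print(min(timers), max(timers))  # all values fall inside [0, T]
```
        </preformat>
        <p>Applying <monospace>timer_cdf</monospace> to the output of <monospace>timer_from_uniform</monospace> returns the original uniform value, which is a quick way to check that the two expressions are inverses of each other.</p>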
        <p>The results in Figure 2a show the expected number of clients <italic>E</italic>(<italic>n</italic>) for <italic>T</italic> equal to 4<italic>d</italic>, 6<italic>d</italic>, and 8<italic>d</italic>,
where <italic>d</italic> is the delay between edge and client devices. For simplicity, we chose the same delay
between each of the <italic>N</italic> devices and the edge when formulating the problem. The results
show that for the different values of <italic>T</italic>, the optimal number of clients is almost reached at λ = 10.
A further increase in λ leads to a higher number of transmitting clients, so fewer clients
suppress their model updates by discarding their training. For λ &gt; 10 the larger timer
values are closer to <italic>T</italic>, and when λ &lt; 10 the timer values are closer to zero. In both cases, we
have a large number of clients because the timer values are distributed in a narrow range. So,
for the given scenario, if we set the λ parameter close to or equal to 10, our queue will stay
below the overflow threshold. Also, all clients whose training time is close to or equal to the minimum
will complete their training and send their model updates before receiving the ACK from
the edge. Only the clients whose training time is greater suppress both training and model
updates. It is therefore not feasible to set the timer interval equal to the delay, as this would completely
ignore the straggler devices.</p>
        <p>Figure 2: (a) <italic>E</italic>(<italic>n</italic>) vs. λ for <italic>T</italic> = 4<italic>d</italic>, 6<italic>d</italic>, 8<italic>d</italic>, and number of clients <italic>N</italic> = 1000. (b) <italic>E</italic>(<italic>n</italic>) vs. number of clients <italic>N</italic> for λ = 10 and <italic>T</italic> = 4<italic>d</italic>, 6<italic>d</italic>, 8<italic>d</italic>.</p>
        <p>We repeat the experiment by fixing the value of the parameter λ and changing the values of
<italic>T</italic>. From Figure 2b, we can see that for the smaller values of <italic>T</italic>, the expected number of clients
in the queue <italic>E</italic>(<italic>n</italic>) increases with the total number of clients. However, when we move to the
larger values of <italic>T</italic>, <italic>E</italic>(<italic>n</italic>) remains stable even with a larger number of clients. This is because
for larger values of <italic>T</italic>, clients will receive the ACK before their timer expires, so a large number
of clients will suppress their training.</p>
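        <p>The trend described above can be explored with a small Monte Carlo experiment. The sketch below is not the simulation behind Figure 2: it ignores training times, measures time in units of the edge delay (so the delay is 1), and assumes the same truncated exponential timer law as a working hypothesis. The function name <monospace>expected_senders</monospace> and all parameter values are illustrative.</p>
        <preformat>
```python
import math
import random


def expected_senders(n_clients, lam, T, d, trials=100, seed=0):
    """Monte Carlo estimate of the mean number of updates sent per round.

    Training times are ignored (an assumption of this sketch), so a client
    sends whenever its timer expires within 2 * d of the smallest timer.
    """
    rng = random.Random(seed)
    scale = math.expm1(lam)
    total = 0
    for _ in range(trials):
        timers = [(T / lam) * math.log(rng.random() * scale + 1.0)
                  for _ in range(n_clients)]
        cutoff = min(timers) + 2.0 * d
        total += sum(1 for t in timers if cutoff >= t)
    return total / trials


# Timer interval T in units of the delay d = 1, for growing client populations.
for T in (4.0, 6.0, 8.0):
    row = [round(expected_senders(n, 10.0, T, 1.0), 1) for n in (100, 500, 1000)]
    print("T =", T, row)
```
        </preformat>
        <p>Because the same seed is reused, the runs for different timer intervals see the same uniform draws, so any difference between rows comes only from stretching the timers over a longer interval.</p>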
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Experimental Environment and Results for Federated Learning</title>
      <p>In the process of federated learning, each client sends its model updates to the server, which
aggregates them in each round of communication. However, when the number of clients is
large, the cost and overhead of communication between clients and edge can be a significant
bottleneck. In this paper, we propose a time-based technique that selects only a small subset
of clients and communicates only the updates of this subset to the server. Our goal is to aggregate
only a subset of client updates, where the subset of clients selected in each round is different
and depends on the exponential timer, so that the final model update approximates an aggregation
of all client updates.</p>
      <p>We run the experiments with 50 user nodes participating in the FL process. Increasing the
number of nodes would be possible if a more powerful test environment were available. To
select a different number of clients using the probabilistic time-based method based on
exponentially distributed timers, we set the timer parameters λ and <italic>T</italic> for 5, 10, 15, 20,
30, and 40 clients to test how changing the number of clients affects the accuracy. Furthermore,
during the FL process we assigned each client a weight corresponding to the number of
samples it used for training. Each experiment is repeated 10 times to find the
average number of rounds for the selected clients. We then calculate the confidence interval for
the specified rounds and for the selected number of clients, as shown in Figure 3.</p>
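      <p>The sample-count weighting mentioned above can be illustrated with a short sketch. It covers only the weighting of client updates during aggregation, not the full FedProx procedure; the function name and the toy parameter vectors are hypothetical.</p>
      <preformat>
```python
def weighted_average(updates, sample_counts):
    """Average client parameter vectors, weighting each by its sample count."""
    total = float(sum(sample_counts))
    dim = len(updates[0])
    return [
        sum(n * u[k] for u, n in zip(updates, sample_counts)) / total
        for k in range(dim)
    ]


# Three hypothetical clients with two parameters each.
updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sample_counts = [1200, 1200, 2400]
global_params = weighted_average(updates, sample_counts)
print(global_params)  # [3.5, 4.5]
```
      </preformat>
      <p>The third client holds twice as many samples as each of the others, so it contributes half of the total weight and pulls the average toward its own parameters.</p>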
      <sec id="sec-4-1">
        <title>3.1. Dataset Distribution</title>
        <p>We started with the MNIST dataset for convenience. The MNIST dataset contains 60,000
training examples and 10,000 test examples. Each image has a fixed size of 28x28 pixels
and a label between 0 and 9. Each image is converted into an array of 784 features. We require
each client to randomly select images from a different subset of the training data, so that the local datasets are not
independently and identically distributed (non-iid). For the training dataset of the experiment,
each user selects 1200 images at random, without repetitions.</p>
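        <p>One possible way to build such a non-iid split is sketched below. The text does not specify how the per-client subsets are formed, so sorting the samples by label before slicing is an assumption of this sketch, as are the function name and the randomly generated stand-in labels used in place of the real MNIST labels.</p>
        <preformat>
```python
import random


def partition_non_iid(labels, n_clients, per_client, seed=0):
    """Split sample indices into disjoint, label-skewed client subsets.

    Sorting by label before slicing is an assumption of this sketch; it makes
    each client see only a few digit classes.
    """
    rng = random.Random(seed)
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    parts = []
    for c in range(n_clients):
        shard = order[c * per_client:(c + 1) * per_client]
        rng.shuffle(shard)  # random order inside the client's own subset
        parts.append(shard)
    return parts


# Stand-in labels: 60,000 random digits instead of the real MNIST labels.
label_rng = random.Random(1)
labels = [label_rng.randrange(10) for _ in range(60000)]
parts = partition_non_iid(labels, n_clients=50, per_client=1200)
print(len(parts), len(parts[0]), sorted({labels[i] for i in parts[0]}))
```
        </preformat>
        <p>With 50 clients and 1200 images each, the subsets cover all 60,000 training examples exactly once.</p>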
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Experimental Results</title>
        <p>For MNIST classification we use a neural network consisting of three layers, with 200
neurons in each of the first two layers and 10 output neurons in the third layer, a learning rate of
0.01, and a batch size of 32; each client performs 10 local epochs before sending its model updates.
As shown in Figure 3, the number of FL rounds decreases as we increase the number of clients.
With 5 clients, we need a maximum of 20 rounds, and with 50 clients, 13 rounds to
achieve 97% accuracy. In these experiments, we distribute the updated model to all clients and
not only to the clients that participated in the aggregation process, so whichever clients we
select have the same updated global model. Selecting only a subset of clients to deliver
their local parameters also reduces the communication cost and saves the computational resources
of the remaining clients, leading to fast convergence in a small number of rounds. Moreover, the
communication cost incurred by the server to distribute the model to all clients is compensated by a
smaller number of rounds. In Figure 3, the intersection of the two lines shows the
optimal working point, at which we expect <italic>E</italic>(<italic>n</italic>) clients to be selected for the FL process based on the
parameters λ and <italic>T</italic> discussed above, together with the corresponding number of rounds for a given number of
clients to achieve the desired accuracy, which in these tests was up to 97%. After repeating each
test more than 10 times, we also computed a 95% confidence interval for the number of rounds
needed to achieve the desired accuracy. According to the empirical results, our approach improves
learning efficiency and enables more uniform and fair performance among clients.</p>
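        <p>Since every selected client places one model update in the edge buffer, it is useful to know how large a single update is. The short calculation below assumes fully connected layers with bias terms on top of the 784 input features and 32-bit floating-point parameters; both are assumptions of this sketch, and serialization overhead is ignored.</p>
        <preformat>
```python
layer_sizes = [784, 200, 200, 10]  # input features, two hidden layers, output

# Each dense layer holds a weight matrix plus one bias per output neuron.
n_params = sum(
    fan_in * fan_out + fan_out
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:])
)
update_bytes = n_params * 4  # assuming 32-bit floats

print(n_params, update_bytes)  # 199210 796840
```
        </preformat>
        <p>Under these assumptions each update is just under 0.8 MB, so the number of clients admitted per round translates directly into the buffer space required at the edge.</p>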
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion and Future Works</title>
      <p>In this work, we have investigated a technique for selecting clients in the FL process by using
the probabilistic timers for a large number of clients. The proposed technique is robust and
scalable for a large number of clients. It controls the number of updates to be received to control
the queue length at the edge by adjusting the parameters for the average number of clients
and the latency for each round. We also investigate the relationship between adjusting the
parameters for the exponential distribution and the convergence time for FL. In future work,
we will refine the methods and details of the client selection technique by also considering
the heterogeneous delays between edge and client devices and analyzing the heterogeneous
computational constraints between devices. We will also try to implement the timers with
different probability models to see which distribution is more effective. While the focus of
this research is on the impact of client sampling on the global optimization goal of FL, future
developments could investigate the impact of the client selection technique on the performance
of individual users, especially in the context of heterogeneity.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research was partially supported by TEACHING project funded by the EU Horizon 2020
research and innovation programme under GA n. 871385, by the Italian Ministry of Education
and Research in the framework of the CrossLab project (Departments of Excellence), and by the
University of Pisa in the framework of the PRA 2020 program (AUTENS project, Sustainable
Energy Autarky).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <source>Cisco annual internet report</source>
          ,
          <year>2021</year>
          . URL: https://www.cisco.com/c/en/us/solutions/executive-perspectives/annual-internet-report/index.html.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Simonyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zisserman</surname>
          </string-name>
          ,
          <article-title>Very deep convolutional networks for large-scale image recognition</article-title>
          ,
          <source>arXiv preprint arXiv:1409.1556</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bacciu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Akarmazyan</surname>
          </string-name>
          , E. Armengaud,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bacco</surname>
          </string-name>
          , G. Bravos,
          <string-name>
            <given-names>C.</given-names>
            <surname>Calandra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Carlini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Carta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cassarà</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Coppola</surname>
          </string-name>
          , et al.,
          <article-title>Teaching-trustworthy autonomous cyber-physical applications through human-centred intelligence</article-title>
          ,
          <source>in: 2021 IEEE International Conference on Omni-Layer Intelligent Systems (COINS)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K. B.</given-names>
            <surname>Letaief</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-J. A.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>The roadmap to 6G: AI empowered wireless networks</article-title>
          ,
          <source>IEEE Communications Magazine</source>
          <volume>57</volume>
          (
          <year>2019</year>
          )
          <fpage>84</fpage>
          -
          <lpage>90</lpage>
          . doi:10.1109/MCOM.2019.1900271.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. B.</given-names>
            <surname>Letaief</surname>
          </string-name>
          ,
          <article-title>Communication-efficient edge AI: Algorithms and systems</article-title>
          ,
          <source>IEEE Communications Surveys &amp; Tutorials</source>
          <volume>22</volume>
          (
          <year>2020</year>
          )
          <fpage>2167</fpage>
          -
          <lpage>2191</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>K. B.</given-names>
            <surname>Letaief</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Edge artificial intelligence for 6G: Vision, enabling technologies, and applications</article-title>
          ,
          <source>IEEE Journal on Selected Areas in Communications</source>
          <volume>40</volume>
          (
          <year>2021</year>
          )
          <fpage>5</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>McMahan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ramage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hampson</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. A. y Arcas</surname>
          </string-name>
          ,
          <article-title>Communication-efficient learning of deep networks from decentralized data</article-title>
          ,
          <source>in: Artificial Intelligence and Statistics</source>
          , PMLR,
          <year>2017</year>
          , pp.
          <fpage>1273</fpage>
          -
          <lpage>1282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Nishio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Yonetani</surname>
          </string-name>
          ,
          <article-title>Client selection for federated learning with heterogeneous resources in mobile edge</article-title>
          , in:
          <source>ICC 2019 - 2019 IEEE International Conference on Communications (ICC)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y. J.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Yağan</surname>
          </string-name>
          ,
          <article-title>Bandit-based communication-efficient client selection strategies for federated learning</article-title>
          , in:
          <source>2020 54th Asilomar Conference on Signals, Systems, and Computers</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>1066</fpage>
          -
          <lpage>1069</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Sahu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanjabi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zaheer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Talwalkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <article-title>On the convergence of federated optimization in heterogeneous networks</article-title>
          ,
          <source>arXiv preprint arXiv:1812.06127</source>
          <volume>3</volume>
          (
          <year>2018</year>
          )
          <fpage>3</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. F. M.</given-names>
            <surname>Al-Athari</surname>
          </string-name>
          ,
          <article-title>Estimation of the mean of truncated exponential distribution</article-title>
          ,
          <source>Journal of Mathematics and Statistics</source>
          <volume>4</volume>
          (
          <year>2008</year>
          )
          <fpage>284</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Nonnenmacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. W.</given-names>
            <surname>Biersack</surname>
          </string-name>
          ,
          <article-title>Optimal multicast feedback</article-title>
          , in:
          <source>Proceedings. IEEE INFOCOM'98, the Conference on Computer Communications. Seventeenth Annual Joint Conference of the IEEE Computer and Communications Societies. Gateway to the 21st Century (Cat. No. 98</source>
          , volume
          <volume>3</volume>
          , IEEE,
          <year>1998</year>
          , pp.
          <fpage>964</fpage>
          -
          <lpage>971</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>