
Studying the feasibility of serverless actors

Daniel Barcelona Pons

daniel.barcelona@urv.cat

Álvaro Ruiz Ollobarren

alvaro.ruiz@urv.cat

David Arroyo Pinto

david.arroyop@estudiants.urv.cat

Pedro García López

pedro.garcia@urv.cat

Universitat Rovira i Virgili, Tarragona, Spain


Abstract

Serverless is a very promising model with many benefits that simplify the development of cloud applications. However, many applications are not easy to build on a serverless environment due to the lack of built-in key features, such as state and function coordination. In this paper we focus on the actor model. As this popular computational model challenges many aspects of the current serverless tools, the feasibility of building a serverless actor framework is unclear.

Our goal is to study whether the actor model can be successfully deployed on top of a Functions as a Service (FaaS) environment like AWS Lambda. To do that, we design and build a prototype to evaluate the requirements and performance of serverless actors. We conclude with a successful prototype implementation and state the run-time extensions to the serverless core that are necessary to improve the support for serverless actors.

1 Introduction

The serverless model benefits from several features that make the developer's life easier, such as scalability, minimal deployment process, minimal hardware and operating system configuration, and sub-second billing. The current main offer of serverless services comes in the form of Functions as a Service, where the user codes small functions that respond to events.

Unfortunately, not all applications have a straightforward migration to the serverless architecture that easily benefits from its advantages. This is the case of the actor model, which is a highly popular computational pattern for building concurrent applications. The model simplifies the job of composing parallel and distributed executions by using a basic unit of computation: the actor.

An actor is an isolated, independent unit of compute and state with single-threaded execution. Many actors can execute simultaneously and independently of each other to build complex applications. Actors communicate with each other by sending messages and act upon the messages they receive.

The actor model benefits from the serverless computing framework in two main aspects:

• Billing. FaaS platforms charge per compute time (in fine-grained periods) and per amount of resources consumed by the application, which in this case is more cost-effective than having to purchase or rent servers with much more coarse-grained billing time units. We pay only for actual actor run time.

• Scalability. Users do not need to spend their time managing servers and setting up auto-scaling systems, as cloud providers are responsible for seamlessly scaling the capacity on demand. We can have a virtually infinite number of concurrent actors [1].

A simple yet convenient use case is the counter example. Here, there is no need to have a server running an application to handle the counter increments, as the state only changes in reaction to events. This counter example can't be efficiently implemented with stateless functions, as they don't guarantee state persistence between invocations, so a remote external storage must be used. In contrast, serverless actors are a better fit as a result of their combination of state persistence and fine-grained billing.

In this paper we discuss the feasibility of migrating the actor model to a serverless environment and the series of challenges that arise. The study begins by defining the main challenges that we have to solve to implement serverless actors. After that, we design and implement a solution on top of the current serverless offering, explaining how we solve those challenges. Finally, as a result of evaluating the solution, we discuss the run-time extensions to the serverless core that are necessary to improve the support for serverless actors.

2 Challenges

The basis of our work is that a serverless function (FaaS) constitutes an actor. Actors can receive messages and act upon them and their own state. They only process one message at a time.
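As a rough illustration of this model (not the prototype's code), the counter use case can be sketched as a plain in-memory actor. The class name `CounterActor`, its `tell` and `run` methods, and the message strings are hypothetical names chosen for this sketch; a local deque stands in for whatever message channel a real deployment would use.

```python
from collections import deque


class CounterActor:
    """Hypothetical in-memory actor: private state plus a mailbox."""

    def __init__(self, name):
        self.name = name          # unique identifier other actors would address
        self.count = 0            # private state, only touched by receive()
        self.mailbox = deque()    # stand-in for a real message channel

    def tell(self, message):
        """Enqueue a message; the sender never touches the actor's state."""
        self.mailbox.append(message)

    def receive(self, message):
        """Handle exactly one message and update the state accordingly."""
        if message == "increment":
            self.count += 1
        elif message == "reset":
            self.count = 0
        else:
            raise ValueError(f"unknown message: {message!r}")

    def run(self):
        """Process pending messages strictly one at a time."""
        processed = 0
        while self.mailbox:
            self.receive(self.mailbox.popleft())
            processed += 1
        return processed


if __name__ == "__main__":
    actor = CounterActor("counter-1")
    for _ in range(3):
        actor.tell("increment")
    actor.run()
    print(actor.name, actor.count)
```

The single `run` loop is what gives the one-message-at-a-time guarantee in this sketch; the challenges below arise when that loop has to live inside a cloud function.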

Serverless functions are not designed to support the actor model. As a consequence, there are certain requirements of implementing actors that are not offered in current FaaS cloud services.

Addressing. The most important element of the actor model is that actors can receive messages from, and send messages to, other actors. An incoming message implies the execution of the actor, which processes the received message by performing an action.

Although most cloud providers offer some kind of invocation endpoint, these are all limited to invocation requests. In other words, once a lambda function is running, it can't receive external data unless it makes an explicit request. Therefore, the usage of external communication services is required.

In addition, the serverless model works better with events, so the service should hold them until they are processed.

Naming is another issue. Each actor instance needs a unique identifier that other actors use to establish communication and send messages.

Atomicity. The actor model works on the basis of atomicity of actors. To maintain a consistent state and execute correct actions in response to messages, there cannot be more than one instance of the same actor executing at the same time.

Serverless functions scale automatically by spawning concurrent containers. This breaks the atomicity of actors if two concurrent functions consume from the same message channel. Therefore, we need to limit function concurrency.

State. Actors are stateful and maintain a mutable state that takes part in action decisions and logic. On the other hand, serverless functions are built stateless, in a way that consecutive calls to the same function may not maintain previous state. Cloud providers don't offer built-in state management for their FaaS services, and external storage services must be used.

Passivation. A key question for serverless actors is what happens during the time with no incoming messages. In a fully event-driven system, actors should only execute on message reception, as a reaction. This approach would be unfeasible in a FaaS environment, as each actor invocation would imply a cloud function request and accesses to an external storage to load and dump state, severely affecting performance. However, keeping the actor running to avoid these latencies greatly increases computation cost, as we would be billed for unproductive function execution.

Performance. A serverless actor framework must be usable in practice. Therefore, a minimum level of performance is mandatory for its adoption to be viable. This is a demanding requirement given the high network latencies of remote components such as distributed storage and communication.

3 Design and Implementation

In this section we present a solution for serverless actors on top of AWS (prototype available at github.com/danielBCN/faasactors). Nevertheless, the structure is similar on other platforms like Azure.

Fig. 1 depicts an overview of the solution. We use AWS Lambda as computation power for the serverless actors. Each actor instance is a new function, so spawning an actor means deploying and creating a new Lambda function. Actors' state is persisted in DynamoDB. Finally, SQS queues are used to enable communication between lambdas.

The problems discussed in the previous section are solved in the following ways.

Addressing. Actors should react to messages, and messages should be held until they are processed. This could be done with a messaging queue or an event bus. Messaging queues store the messages persistently until the actor processes them.

A queuing service also solves the naming issue, since actors and queues can share an identifier. A message m that should be sent to an actor with identifier aid is queued to the queue with name aid. This approach allows communication between actors (and from anywhere) by only knowing the actor's identifier.

In our implementation, actors (and their queues) are identified by a unique string. This makes it easy for an actor to know its own name at any moment and simplifies addressing. We use Simple Queue Service (SQS) for our queues, as it is integrated with the other cloud systems and offers good performance. Each actor has its own queue and waits for its messages on it. To communicate with another actor, one only needs to know the other actor's name and send a message to the corresponding queue.

Atomicity. We could solve this issue by limiting function concurrency to one. This way, while there will not be more than one invocation of the same function running at the same time, we can still exploit the FaaS scalability and deploy a virtually infinite number of different functions concurrently.

There are several ways of doing so. AWS Lambda offers a configuration parameter for reserved concurrency [2]. When the function is created with a concurrency reservation of one, multiple invocations are throttled and only one function is executed at a time. However, throttling can cause significant delays in executions.

State. Our approach uses a disaggregated storage service for persisting state. The state is retrieved from the store when the function is invoked and stored back before the invocation finishes. This allows persisting the actor for an indefinite time at a reduced cost (that of the storage service). The drawback is that actors aren't always listening for messages and need to be awakened, which adds latency. The approach is closely related to the passivation of actors.

In particular, we use Amazon DynamoDB. The service presents latencies below 10 ms for puts and gets of small strings.

Passivation. We are at a crossroads between complete passivation of actors to avoid extra billing and keeping them running to avoid extra latency. We propose a hybrid solution where actors' state is persisted on a storage system, which allows passivation of actors when they haven't received a message for a while, solving the extra billing problem. When invoked, however, they process all available messages in a single execution, minimizing extra latency. Once the actor is passivated, processing new messages requires invoking it again with a special event, in which case it recovers its state from the remote storage.

This approach requires an event system with two main properties: 1) it triggers a new execution when the actor's underlying function has been passivated; 2) when the function is running, it notifies the function without enqueueing more function invocations. Unfortunately, as far as we know, cloud providers do not offer this kind of event system. Such is the case for AWS Lambda and SQS. A lambda function with the SQS trigger enabled consumes all the available messages while trying to enqueue invocations. As a result, when the function starts running and tries to receive messages, they are no longer available. This behavior requires an external client that schedules the execution of actors. This client is notified when an actor passivates. After that, the client listens to the passivated actor's queue, so that when the first message arrives, the client invokes the actor with the message in the payload.

3.1 Adapting actor's code to FaaS

Actors usually communicate by calling each other's methods. Unfortunately, neither the underlying FaaS nor SQS supports remote method calls. Thus, we implement an abstraction layer following the Active Object pattern [3] to seamlessly handle the needed reflection.

First, we inspect the code dependencies of every actor and zip them into the AWS Lambda deployment package; a similar approach is found in [4]. Then, we serialize every remote actor method call and send it over SQS. Finally, we deserialize every SQS message and call the appropriate actor method.
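The serialize, send, deserialize, and dispatch steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype's implementation: the names `Proxy`, `dispatch`, `drain`, and `Counter` are hypothetical, JSON is assumed as the wire format, and an in-memory `queue.Queue` stands in for the actor's SQS queue.

```python
import json
import queue


class Counter:
    """Example actor: a plain class whose public methods handle messages."""

    def __init__(self):
        self.value = 0

    def increment(self, amount=1):
        self.value += amount


class Proxy:
    """Turns method calls into serialized messages on the target's queue."""

    def __init__(self, mailbox):
        self._mailbox = mailbox

    def __getattr__(self, method):
        # Only reached for names not found on the proxy itself,
        # i.e. the remote actor's method names.
        def send(*args, **kwargs):
            body = json.dumps({"method": method, "args": args, "kwargs": kwargs})
            self._mailbox.put(body)  # assumption: a queue service send goes here
        return send


def dispatch(actor, body):
    """Deserialize one message and call the matching actor method."""
    call = json.loads(body)
    if call["method"].startswith("_"):
        raise ValueError("private methods are not remotely callable")
    handler = getattr(actor, call["method"])
    return handler(*call["args"], **call["kwargs"])


def drain(actor, mailbox):
    """Process every queued message, one at a time; return how many ran."""
    handled = 0
    while True:
        try:
            body = mailbox.get_nowait()
        except queue.Empty:
            return handled
        dispatch(actor, body)
        handled += 1


if __name__ == "__main__":
    mailbox = queue.Queue()
    remote = Proxy(mailbox)     # what a sending actor would hold
    remote.increment()
    remote.increment(amount=4)
    counter = Counter()         # what the receiving function would hold
    drain(counter, mailbox)
    print(counter.value)
```

The caller writes ordinary method calls while the receiver only ever sees messages, which is the decoupling the Active Object pattern is meant to provide. Rejecting underscore-prefixed names is a design choice of this sketch to avoid exposing internals through reflection.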

4 Related work

Microsoft's Azure cloud offers features that seem to be the most helpful to build a serverless actor framework. Azure Durable Functions (ADF) [5] is an extension of Azure Functions and Azure WebJobs that lets you write stateful serverless functions. These special Functions as a Service can orchestrate other functions, providing state management, checkpointing, and sync/async function calling and chaining. ADF also includes eternal orchestration functions [6]. This feature aims to maintain a long execution of the same function so that state is kept throughout. To avoid the FaaS execution time limit, functions detect when the end time is near and create a new invocation with the state as payload. Additionally, ADF provides singleton orchestrator functions, which ensure that only one invocation of them runs at a time. When invoked, functions detect whether there is already another instance of the same function running, in which case the function ends immediately. Unfortunately, these singleton orchestrators are not atomic at the time of writing [7], so they can't be used to build serverless actors.

In [8] we find an example of an actor model implementation using Azure Durable Functions. In this example, each orchestrator function (ADF) is an actor with a specific instance identifier. The author makes use of singleton orchestrators and their capability of waiting for external events to perform actor operations. An actor receives messages from others by waiting for external events. When the event is created, the orchestrator function awakens, processes the message, and calls itself with its state as payload by using the eternal orchestration feature. Orchestrator instances (actors) are created, queried or terminated through HTTP-triggered functions, which, in turn, raise special events specifying the instance identifier and the desired actor operation. However, these orchestrator functions don't guarantee message delivery: events can be lost depending on the function activity [9].

As for singleton functions, extra invocations that do not perform any work imply additional costs. Moreover, the implementation requires extra steps in function invocation (HTTP) that involve additional complexity and latency.

In addition, the state offered is limited for two reasons: it uses eternal functions, so the state is transferred by payload; and it uses an external service for persisting state when the function waits for message events. That is, ADF persists the state between different function activations by an Event Sourcing replay mechanism [5, 6]. Consequently, not only is there an overhead penalty when recreating the state, but function executions must also be deterministic [5].

In summary, ADF seems to offer great tools to build serverless actors. However, they still fall short of providing atomicity, state and invocation performance, and guaranteed event delivery.

5 Evaluation

We evaluate our prototype and compare it to AWS Lambda to prove the usefulness of serverless actors. Many actor use cases require communication between actors, which is not possible to implement using AWS Lambda or any other FaaS available. Thus, we pick the counter example, a simple use case that can be implemented with both serverless actors and AWS Lambda. The experiment consists of measuring the amount of time it takes to process different loads of messages. The implementations used in the experiment are as follows:

• Serverless actors: each actor's message is sent through SQS. The message is then read by an already running actor, or a new actor invocation handles the new message and the upcoming burst. Each message modifies a counter variable in the actor's local memory.

• FaaS: each message implies a new function invocation, which in turn makes a read and an update request to a remote DynamoDB.

In order to make a fair comparison, both implementations use a single concurrent lambda, 3 GB of memory, warm containers, and the same invocation process. The experiment was repeated 10 times.

[Figure 2: Serverless actor vs FaaS. Execution time (s) against number of messages (100 to 700) for the Serverless Actor and FaaS implementations.]

Fig. 2 shows the average and standard deviation of the processing time for both the Serverless Actor and FaaS implementations. We see that our serverless actor prototype is up to 5.95× faster than the AWS Lambda implementation. This is due to the high latency of the invocation and remote storage, which is incurred for every request. We also observe a significant and increasing execution time deviation as a result of the Lambda throttling process, which seems to postpone lambda invocations if it receives frequent invocations.

6 Discussion

During the design and implementation of serverless actors, we have found several restrictions that deserve a discussion. We believe they are a consequence of the services' lack of built-in features needed for the application.

One important aspect is addressing. Current FaaS offerings do not provide any kind of direct communication between functions once they are invoked. Therefore, remote services such as SQS must be used. While performant enough for some cases, the distributed nature of this service implies suboptimal performance due to the latency. A serverless system with complete support for function intercommunication would get rid of this performance penalty.

Passivation and event processing are the most important elements in a serverless actor system. The current offering allows two major approaches. The one with complete passivation can be fully implemented without external support but has a great penalty in performance. The hybrid solution for passivation, which processes messages directly from a queue, is performant, but it still requires an external controller to wake up passivated actors.

The solution here is, once more, in the cloud service itself. We need run-time support for functions capable of awakening when messages arrive at a queue, but able to read all available messages from that queue in a single execution, without requiring two different sources of events. This would require an internal manager that knows the function's state. This process would send messages to a running function or create a new invocation otherwise.

7 Conclusions

This paper studied the feasibility of building actors on top of the current serverless offering. We presented the main challenges that we must overcome and how we implemented a prototype that successfully solves them. The evaluation compared our prototype to a serverless function and showed that our implementation processes up to 5.95× more messages than its FaaS counterpart.

We have observed that, indeed, serverless actors are possible. However, we also argue that run-time extensions to the serverless core would be necessary. In particular, we claim that serverless functions would need support for intercommunication and an event system capable of processing messages efficiently and triggering new functions when necessary.

References

[1] S. Tasharofi, P. Dinges, and R. E. Johnson, "Why do scala developers mix the actor model with other concurrency models?" in European Conference on Object-Oriented Programming. Springer, 2013, pp. 302–326.

[2] "AWS Lambda - Managing Concurrency," https://docs.aws.amazon.com/lambda/latest/dg/concurrent-executions.html, 2018.

[3] R. G. Lavender and D. C. Schmidt, "Active object – an object behavioral pattern for concurrent programming," 1995.

[4] J. Spillner, "Transformation of python applications into function-as-a-service deployments," arXiv preprint arXiv:1705.08169, 2017.

[5] "Durable Functions overview," https://docs.microsoft.com/azure/azure-functions/durable-functions-overview, 2018.

[6] "Eternal orchestrations in Durable Functions (Azure Functions)," https://docs.microsoft.com/azure/azure-functions/durable-functions-eternal-orchestrations, 2017.

[7] "Add singleton support for functions to ensure only one function running at a time," https://github.com/Azure/azure-functions-host/issues/912, 2018.

[9] "External event message loss due to async activity in orchestration," https://github.com/Azure/azure-functions-durable-extension/issues/515, 2018.