The Client API

The Client API is the portion of the Frontier API with which client applications deal directly when creating, monitoring, and controlling jobs and tasks. The Client API is implemented by the client library and provides a set of functionality used by a client application to communicate with the Frontier server. The functionality it encapsulates falls into several different categories:

This section describes the modes in which the Client API can be operated, how interaction with the Frontier environment occurs, job and task attributes, and how jobs and tasks are launched, monitored, and controlled.

 

Remote vs. Local Mode

The client library can be operated in either remote or local mode. The former manages a session with the Frontier server, relaying requests and responses back and forth between server and client application via the Frontier messaging protocol. This is the mode that allows a client application to actually command the resources of Frontier. Local mode, on the other hand, provides an efficient virtual session using only resources local to the client running the application, which is important for running applications using the Frontier API transparently without bringing Frontier itself into the equation. This can be especially useful for debugging client applications.

The choice of remote or local is made when a session is created via a SessionManager, the primary interface to the client library. Local sessions execute tasks exclusively on the client application's host computer, making them useful for small jobs or debugging applications. Remote sessions execute tasks across a Frontier grid. The fact that the interface to both modes is identical, plus the ease with which modes can be toggled, means that a single application can be written to the Frontier API and yet only bring the power of a Frontier grid to bear when necessary. This section is written with remote job execution in mind; however, nuances of local mode execution will be mentioned where noteworthy.

 

SessionManager

An interaction with the Frontier environment, be it the actual Frontier platform or the local execution of tasks, takes place within the context of a session. Sessions are created and managed by an instance of com.parabon.client.SessionManager. Thus, this class becomes the entry point into the Client portion of the Frontier API. The SessionManager is responsible for acting as a proxy to manage the resources of the Frontier platform and the set of jobs a user is running. Two implementations of SessionManager exist: com.parabon.client.RemoteSessionManager and com.parabon.client.LocalSessionManager, which operate the client library in either remote or local mode, respectively.

The interface to both implementations of SessionManager is identical, but the behavior may differ in some cases. For example, sessions can be destroyed via the destroy() method. Destroying a session in remote mode means that the client library will disconnect from the Frontier server while submitted jobs and tasks continue to run. Destroying a session in local mode means the client library will wait until all tasks have completed before control is returned to the application.
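The difference in destroy() behavior can be illustrated with a minimal, self-contained sketch. Every type below (SessionSketch, LocalSessionSketch, RemoteSessionSketch, SessionModeDemo) is a hypothetical stand-in written for this illustration and is not part of com.parabon.client; the real SessionManager signatures should be taken from the API reference.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the session interface; NOT com.parabon.client.SessionManager.
interface SessionSketch {
    void startTask(Runnable task);
    void destroy();
}

// Local-mode stand-in: tasks run in this JVM, and destroy() waits for them to finish.
class LocalSessionSketch implements SessionSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    public void startTask(Runnable task) {
        pool.submit(task);
    }

    public void destroy() {
        pool.shutdown(); // accept no new tasks
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES); // block until started tasks complete
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// Remote-mode stand-in: work is only counted as handed off; destroy() merely disconnects.
class RemoteSessionSketch implements SessionSketch {
    final AtomicInteger handedOff = new AtomicInteger();
    boolean connected = true;

    public void startTask(Runnable task) {
        handedOff.incrementAndGet(); // a real session would submit the task to the server
    }

    public void destroy() {
        connected = false; // submitted work would keep running on the server side
    }
}

class SessionModeDemo {
    // The mode is chosen once, at creation; callers use the same interface afterwards.
    static SessionSketch create(boolean remote) {
        return remote ? new RemoteSessionSketch() : new LocalSessionSketch();
    }

    // Starts n trivial tasks locally and reports how many had finished when destroy() returned.
    static int runLocally(int n) {
        AtomicInteger done = new AtomicInteger();
        SessionSketch session = create(false);
        for (int i = 0; i < n; i++) {
            session.startTask(done::incrementAndGet);
        }
        session.destroy();
        return done.get();
    }
}
```

Because the calling code depends only on the shared interface, switching modes is a one-line change at session creation, which is the property the text above relies on.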

 

Job and Task Attributes

As jobs and tasks are created via the Client API, it is important to have a means of later identifying them. Although the client library uses internally generated identifiers to track jobs and tasks, these identifiers are not exposed through the API. Instead, a more flexible scheme is provided to allow client applications to identify jobs and tasks. At creation time, a set of attributes (i.e., name <-> value pairs in a format similar to those used for task parameters and results) can be assigned to a job or task. These attributes are then propagated through the library and Frontier server. Attributes can encode information such as simple unique identifiers, the username of the person who initiated a job, the launch date of a task, a set of email addresses to send results to, or the name of a file stored locally to keep track of more complex job information and statistics. The important point is that attributes are user-defined, immutable, and simply passed through the Frontier system as identifying information to be later used by the client application. Note that any assigned attributes are not sent to provider nodes actually running the tasks.

In addition to observing the attributes of any given task or job, the client library provides mechanisms to select jobs and tasks based on a simple attribute template. This facility is useful when, for instance, a client application wishes to observe or listen to a subset of tasks in a large job without incurring the memory and communications overhead of obtaining information about all the tasks in that job.

 

Creation of Jobs and Tasks

Jobs

Jobs are created via the SessionManager.createJob() method, which is passed an instance of com.parabon.common.NamedParameterMap specifying the attributes to be associated with the new job. The job itself is represented by a com.parabon.client.Job instance. After creating a job—in effect an empty repository—a client application will generally 'fill' it with job-level elements (via the addDataElement() and addExecutableElement() methods) and tasks, as described below. It is important to note that once a task or element is added to a job, ownership of the object used to specify that task or element passes to the job. Subsequent modification of that object by the client application will result in undefined behavior.

Tasks

Once a job has been created, tasks can be added to it. This is done via the Job.addTask() method. This method takes as arguments a com.parabon.client.TaskSpec instance, which describes the specification of the task to be run, and a NamedParameterMap, which specifies task attributes. The method will return a com.parabon.client.TaskProxy instance, which can be subsequently used by the client application to control the task. Note that ownership of the TaskSpec is transferred to the client library. Any attempts to modify a TaskSpec instance after using it to create a new task will result in undefined behavior, and further, the contents of the TaskSpec itself are not guaranteed to remain unchanged by the client library.

A task specification, as described by an instance of com.parabon.client.TaskSpec, consists of four primary pieces:

Elements

Data and executable elements are represented by instances of the interfaces com.parabon.runtime.DataElement and com.parabon.runtime.ExecutableElement. For the purpose of adding an element to a job or task spec, a client application must either use an existing implementation of these interfaces (e.g., com.parabon.client.JarFileExecutableElement) or implement them itself. The interfaces are relatively straightforward. Most importantly, implementations must provide a getStream() method, which returns a stream representing the contents of the element itself. This method may be invoked multiple times, and each time must return an independent stream that produces data starting from the beginning of the element. The getStreamLength() method must also be implemented, either to return the length of the element if known (which can greatly increase the efficiency of some portions of the client library), or -1 otherwise. In addition, executable elements must provide other pieces of information about the contents of the executable element, namely language and packing, for which only "Java" and "jar", respectively, are currently supported.

It is important to note that the DataElement instances passed to tasks via a request to TaskContext.getDataElement() are not necessarily the same objects or even classes used to create the data element, and so no such assumptions about the implementation of DataElement or its underlying streams should be made within tasks. In particular, although in local mode the same instances are often passed through from client application to task for reasons of efficiency, this is not the case in remote mode.
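The getStream() and getStreamLength() contract described above can be sketched with an in-memory element. The interface DataElementSketch below is a hypothetical stand-in: the method names come from the text, but the return types (InputStream and long) are assumptions, and the real com.parabon.runtime.DataElement interface may declare additional methods.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in; return types are assumed, not taken from the Frontier API.
interface DataElementSketch {
    InputStream getStream();   // must return a fresh stream positioned at the start
    long getStreamLength();    // element length in bytes, or -1 if unknown
}

// In-memory element: every getStream() call wraps the same bytes in a new stream.
class ByteArrayDataElementSketch implements DataElementSketch {
    private final byte[] contents;

    ByteArrayDataElementSketch(byte[] contents) {
        this.contents = contents.clone(); // defensive copy so later caller edits have no effect
    }

    public InputStream getStream() {
        return new ByteArrayInputStream(contents); // independent read position per call
    }

    public long getStreamLength() {
        return contents.length; // length is known here, so -1 is never returned
    }
}

class DataElementDemo {
    // Reads two bytes from one stream, then one byte from a second stream.
    // Returns {second byte of the first stream, first byte of the second stream}.
    static int[] readFromTwoStreams(DataElementSketch element) {
        try (InputStream first = element.getStream();
             InputStream second = element.getStream()) {
            first.read();
            int secondByteOfFirst = first.read();
            int firstByteOfSecond = second.read();
            return new int[] { secondByteOfFirst, firstByteOfSecond };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reading from one stream does not advance the other, so the client library can safely request the element's contents more than once.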

Submitting Data to the Frontier Server

In remote mode, as soon as a manager is connected to the Frontier server, the manager will start to submit candidate jobs and applicable portions of their contents. Such information will continue to be submitted as it is created and becomes a candidate for submission. When the manager is destroyed, it will continue attempting to submit all information that became a candidate for submission any time before its destruction. This means that SessionManager.destroy() will block until pending submissions have completed successfully.

Three types of data can be submitted to the server: jobs, tasks, and elements. A job becomes a candidate for submission as soon as it is started. Tasks become candidates as soon as they are started. Elements become candidates as soon as at least one task that references them has been started. This means that a task that was created but never started or an element that was never referenced will never be sent to a server and will be lost when a session ends (that is, when the manager is destroyed).
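The candidacy rules for tasks and elements can be restated as a small model. This is only an illustration of the rules in the paragraph above: TaskRecordSketch and SubmissionCandidatesSketch are hypothetical names, job candidacy is not modeled, and nothing here reflects how the client library tracks submissions internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical record of a task: its name, the elements it references, and whether it was started.
class TaskRecordSketch {
    final String name;
    final List<String> elementNames;
    final boolean started;

    TaskRecordSketch(String name, List<String> elementNames, boolean started) {
        this.name = name;
        this.elementNames = elementNames;
        this.started = started;
    }
}

class SubmissionCandidatesSketch {
    // A task is a candidate for submission once it has been started.
    static List<String> candidateTasks(List<TaskRecordSketch> tasks) {
        List<String> result = new ArrayList<>();
        for (TaskRecordSketch task : tasks) {
            if (task.started) {
                result.add(task.name);
            }
        }
        return result;
    }

    // An element is a candidate once at least one started task references it.
    static Set<String> candidateElements(List<TaskRecordSketch> tasks) {
        Set<String> result = new LinkedHashSet<>();
        for (TaskRecordSketch task : tasks) {
            if (task.started) {
                result.addAll(task.elementNames);
            }
        }
        return result;
    }
}
```

In this model, an element referenced only by tasks that were never started is absent from both results, matching the statement that such data is never sent and is lost when the session ends.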

 

Monitoring and Controlling

Once a job is created and its tasks started, the client application needs to listen to results and status and, when a job or task is complete, remove it. There are three ways to perform these functions:

These operations are performed for a job via the Job instance representing the job, or for a task via the corresponding TaskProxy instance supplied by a job or event.

Run Mode

A task's run mode represents which phase of execution it is in at a particular time as one of a small set of enumerated values, specified in com.parabon.common.TaskRunMode. A task's last known run mode can be obtained through the task's proxy as embodied by a corresponding TaskProxy instance obtained through a job or event. The mode a task was in when it generated a particular event is specified in the event itself, as described in Listening to Events. Possible run modes include the following:

Listening to Events

Both jobs and tasks allow listeners to be added and removed. Listeners are user-created classes that implement one of a set of listener interfaces, each of which receives via a method call one or more types of events generated by a job or task. A job listener will receive events corresponding to all the tasks in that job, while a task listener will receive only events corresponding to that particular task. Note that while all applicable job- and task-level listeners will be called for any given event, no guarantee is made about the order in which they are called.

All events generated by a task are described via a com.parabon.client.TaskEvent instance. This interface supplies a set of methods to query some common aspects of the task that generated an event:

Five types of task event listeners are defined, each of which extends the otherwise empty com.parabon.client.TaskEventListener interface. A listener class should implement one or more of these interfaces to perform application-specific operations when an event is received. The events sent to any given listener are determined by which interfaces it implements—for instance, if the listener implements com.parabon.client.TaskResultListener, it will be sent result events. Each listener receives one set of events through a method that takes as a parameter an instance of a particular class derived from TaskEvent. Note that it is possible for a given event to correspond to two listener methods within a single listener object (e.g., a result listener and a status listener). In this case the event will be sent to both applicable methods in an unspecified order.

Also note that events passed to listeners, and any objects obtained from them with the exception of TaskProxy instances, are valid only for the length of the method call. For instance, results obtained from result events must be copied during the listener event method call if a client application wishes to reference them after the method returns. Also, as ownership of these objects is not transferred to the client application, any changes to event objects or their children will result in undefined behavior for the remainder of the session.
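The need to copy event data inside the listener call can be sketched as follows. All types here are hypothetical stand-ins rather than the real TaskEvent or listener interfaces, and the dispatcher's clearing of the event after the callback is just one assumed way in which an event might become invalid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a result event; NOT com.parabon.client.TaskEvent.
class ResultEventSketch {
    private final Map<String, Object> results = new HashMap<>();

    Map<String, Object> getResults() {
        return results;
    }
}

// Hypothetical stand-in for a result listener interface.
interface ResultListenerSketch {
    void resultsPosted(ResultEventSketch event);
}

// Recommended pattern: copy what is needed before the callback returns.
class CopyingListenerSketch implements ResultListenerSketch {
    final List<Map<String, Object>> saved = new ArrayList<>();

    public void resultsPosted(ResultEventSketch event) {
        saved.add(new HashMap<>(event.getResults())); // shallow copy made during the call
    }
}

// Anti-pattern: keeps a reference to the event's own map beyond the callback.
class AliasingListenerSketch implements ResultListenerSketch {
    final List<Map<String, Object>> saved = new ArrayList<>();

    public void resultsPosted(ResultEventSketch event) {
        saved.add(event.getResults()); // still owned by the dispatcher
    }
}

class EventDispatchSketch {
    // Simulates a library that invalidates the event once the listener method returns.
    static void dispatch(ResultListenerSketch listener, int value) {
        ResultEventSketch event = new ResultEventSketch();
        event.getResults().put("value", value);
        listener.resultsPosted(event);
        event.getResults().clear(); // event contents are no longer valid
    }
}
```

The copy here is shallow; if result values are themselves mutable structures, a deeper copy would be needed to be safe.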

The five types of task events are:

Removing Tasks and Jobs

As soon as a job or task is complete and its results have been recorded locally by the client application—or sooner if it is no longer required—it should be removed by calling either the Job.remove() or TaskProxy.remove() method, respectively. Removal of a job includes removal of all the tasks it contains. This action will stop execution if necessary, clean up all related records and elements from the Frontier server in remote mode, and release all local objects associated with that job or task from the client library. Until a job or task has been removed, all records of its structure, contents, and results will be maintained on the Frontier server.

 

Releasing and Reestablishing

In several circumstances, in remote mode, the local mirror of the jobs and tasks on the Frontier server may be incomplete. Any such information that exists on the server but not locally is referred to as released. The process used to selectively obtain information from the server once again is known as reestablishing. Reestablishing is useful not only for regaining information that has been released over the course of a session, but also during new sessions, for monitoring and controlling jobs created during previous sessions. Note that everything in this section pertains to remote mode. Although some of the methods described are common to both modes, they are merely stubs in local mode.

Release Behavior

Client applications must be aware of the possibility of data being released. Specifically, this means two things. First, just because a piece of data is not made available directly—for instance, a job doesn't appear in a job list or a task has null results—doesn't necessarily mean that it does not exist. When faced with such an absence, a client application should not necessarily behave as though a job cannot be accessed or a task failed. Second, a client application should know how to reestablish such data from the server either when faced with a required bit of data being released or as a general preventative measure.

Entire jobs might be released, in which case they will not be included in job listings until the session is reestablished (as described in the next section). Finer-grained pieces of data can also be released. For instance, task progress might be given as -1, indicating progress is not known, after a task has been released and reestablished. The API reference documentation lists the behavior of particular methods when dealing with released data.

Reestablishing

Reestablishing is a crucial piece of the launch-and-listen paradigm when the listening half of the equation is in a different session than the launching portion. Specifically, reestablishing refers to obtaining information about the contents and status of released jobs and tasks, most often those created in previous sessions, and attaching to them in order to monitor status changes. This is somewhat nontrivial because of the wealth of information that can be associated with a single user's jobs. It is not generally efficient or even feasible to simply transfer all possibly relevant data to a client application upon establishing a new session. Hence, the process of reestablishing involves a series of Client API methods used to bring a new session into synchronization with the server with respect to a subset of jobs and tasks without incurring additional overhead, as well as a set of rules governing how a reestablished session behaves from a client application standpoint.

There are three primary levels of information that a client application may want to reestablish: jobs, tasks, and task contents. For instance, to find the results of a particular task, a client application must:

  1. Start a new session. Create a RemoteSessionManager as usual.

  2. Find the correct job. Reestablish the job list (RemoteSessionManager.reestablish()), and iterate through it (RemoteSessionManager.getJobIterator()), using each job's attributes to identify it, until the desired job is found.

  3. Find the correct task. Either (1) reestablish the job's contents (Job.reestablish()), iterate through its task list (Job.getTaskIterator()), and look at the tasks' attributes, or (2) reestablish a partial set of tasks matching a pre-constructed attribute template (Job.findRemoteTasksByAttribute()) and iterate through this reduced list. The former method is more straightforward and somewhat more flexible, but requires transferring and storing the entire contents of the job (possibly tens of thousands of tasks, most of which may not be of interest). The latter method allows the Frontier server to cull the list of tasks transferred based on a simple attribute comparison mechanism, reducing the amount of data that must be transferred over the wire. A hybrid approach is to use Job.reestablishPartial(), which reestablishes the part of a job's task list matching a given attribute template but doesn't attempt to create an iterator through these tasks. Rather, it merely makes them available through later calls to Job.getTaskIterator(), at least until the tasks are released once again.

  4. Get the results of the task. Either reestablish the task itself (TaskProxy.reestablish()) and query the desired fields via the TaskProxy (e.g., via TaskProxy.getResults()), or attach a listener to the task, which will then receive any further results sent by the task in addition to the current status, via the listener interface.

Many common operations require only a subset of these steps. For instance, to listen to the results of all tasks in a job, the job itself doesn't need to be reestablished; the application merely needs to reestablish the manager, find the relevant job, and attach a listener.

It should be noted that because of the hierarchical structure of the Client API, the fact that a Job instance exists, for instance, does not mean that the Job is fully reestablished. RemoteSessionManager.reestablish() generally obtains only enough information to identify each of its jobs and make their attributes available, and hence attributes will be the only information accessible through the otherwise empty initial Job instance provided. Attributes are, in fact, the only piece of information guaranteed to always be available when a given Job or TaskProxy instance exists.

The attribute query mechanism bears further description. The template is in the form of a NamedParameterMap, similar to the attributes themselves. A task matches the template if and only if each of the entries in the template exists as an attribute in the task and their respective values match as well. The definition of matching values is somewhat different for each type. For most types (e.g., strings, integers, etc.), the attribute and template value must match exactly. For structures, equality is defined as applying these matching criteria recursively; that is, the template entry value is itself treated as a template that the task's corresponding attribute value must match. Note that a task can contain additional attributes not present in the template and still match the template. This mechanism allows a set of simple, logical queries to be executed, depending on the attributes a client application employs. For instance, if each task corresponds to one entry in a two-dimensional table, and the row and column indices are stored as attributes for each task, a list of the tasks corresponding to a single row or column can be obtained by constructing a template with a single entry containing the desired row or column index.
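The matching rule can be sketched in a few lines. The sketch below is an assumption-laden illustration: it uses plain java.util.Map in place of NamedParameterMap, represents structures as nested maps, and the class name AttributeTemplateSketch is hypothetical. The actual comparison is performed by the Frontier server and client library and may treat some types differently.

```java
import java.util.Map;
import java.util.Objects;

class AttributeTemplateSketch {
    // True if every template entry is present in the attributes with a matching value.
    // Attributes absent from the template are ignored.
    static boolean matches(Map<String, ?> template, Map<String, ?> attributes) {
        for (Map.Entry<String, ?> entry : template.entrySet()) {
            if (!attributes.containsKey(entry.getKey())) {
                return false; // required attribute is missing
            }
            if (!valueMatches(entry.getValue(), attributes.get(entry.getKey()))) {
                return false; // attribute present but value differs
            }
        }
        return true;
    }

    // Structures (modeled as nested maps) match recursively; other values must be equal.
    @SuppressWarnings("unchecked")
    private static boolean valueMatches(Object templateValue, Object attributeValue) {
        if (templateValue instanceof Map) {
            return attributeValue instanceof Map
                    && matches((Map<String, ?>) templateValue, (Map<String, ?>) attributeValue);
        }
        return Objects.equals(templateValue, attributeValue);
    }
}
```

For the two-dimensional table example, a template containing only a "row" entry with value 3 would match a task whose attributes are row 3 and column 7, and would not match a task in row 4; the attribute names "row" and "column" are illustrative.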

 

Running Locally

When running in local mode, tasks are executed using local resources, within the same JVM as the client library itself is running. As noted earlier, local mode's primary raison d'etre is not a simulation of running a job on Frontier itself, but rather a low-overhead mechanism for direct, efficient execution of tasks written using the Frontier API within a client application. As a secondary consideration, this mode can be used for task debugging, with the caveat that bugs involved with the precise behavior of remote execution or that depend on the particular behavior of local mode may not be made apparent.

The serial or parallel nature of task execution can be controlled by a set of methods in LocalSessionManager. In particular, setMaxRunningTasks() can be used to set the number of tasks that the client library will attempt to run in parallel. Setting this to a large number will result in the client library attempting to execute all tasks in a job simultaneously, each in its own thread. This situation is generally far from optimal and can easily run up against resource limits, including those imposed by the operating system. At the other extreme, setting this value to 1 will result in tasks being executed in a completely serial fashion, each being started after the previous one has completed. The method getMaxRunningTasks() can be used to query the current value for this field, while getNumRunningTasks() will report the number of tasks currently being executed in parallel, each in its own thread. This number will always be less than or equal to the number of tasks that are candidates for execution (i.e., those that have been created and started but have not yet completed).
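The effect of a running-task limit can be demonstrated with the standard java.util.concurrent library. This sketch does not use LocalSessionManager; BoundedRunnerSketch is a hypothetical class that uses a fixed-size thread pool as an analogue of the limit and measures the peak number of tasks observed running at once.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedRunnerSketch {
    // Runs taskCount short tasks with at most maxRunningTasks threads
    // and returns the highest number of tasks seen running simultaneously.
    static int peakConcurrency(int maxRunningTasks, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(maxRunningTasks);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // record the high-water mark
                try {
                    Thread.sleep(5); // stand-in for real task work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }

        pool.shutdown(); // no further tasks; queued ones still run
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }
}
```

With a limit of 1 the tasks run strictly one after another, so the peak is 1; with a larger limit the peak never exceeds the limit, although it may be lower if tasks finish quickly.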

 

Size Considerations

The amount of data involved with a single job is often large enough to tax the resources of a client-side machine, and so it is often desirable to launch, monitor, and control a job without the burden of keeping all job-related data resident and up-to-date locally. This section contains a number of considerations that, when followed carefully, can allow jobs of arbitrary size to be created and monitored using limited local memory resources. Note that all of these considerations apply only to remote mode.

Message Queue Sizes

Incoming and outgoing message queues can often grow large very quickly. The former is often because task results can be retrieved more quickly than they can be processed, while the latter is because tasks can often be created more quickly than they can be serialized and sent over the wire to the server. These conditions can both be easily mitigated by using the RemoteSessionManager.setMaxIncomingMessageQueueSize() and RemoteSessionManager.setMaxOutgoingMessageQueueSize() methods, respectively. Both methods specify hints rather than hard limits. For instance, a batch of messages may be retrieved at once, resulting in the incoming message queue temporarily growing beyond its specified bounds. The price of using these limits is that efficiency may in some cases be reduced because of underutilization of available communications bandwidth. In addition, if an outgoing queue size limit is used, arbitrary methods in the Client API may block without warning, as they often have to enqueue messages in order to successfully complete. Thus, these methods may be forced to wait until a slot is available in the outgoing queue before returning.
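Why a bounded outgoing queue can stall a caller is easy to see with a standard bounded queue. The sketch below uses java.util.concurrent.ArrayBlockingQueue and a hypothetical OutgoingQueueSketch class; it is a simplification, since it enforces a hard limit whereas the Frontier queue sizes are described above as hints, and it uses a timeout so that the demonstration cannot hang.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class OutgoingQueueSketch {
    private final BlockingQueue<String> queue;

    OutgoingQueueSketch(int maxSize) {
        this.queue = new ArrayBlockingQueue<>(maxSize); // fixed capacity
    }

    // Waits up to timeoutMillis for a free slot; returns false if none became available.
    boolean enqueue(String message, long timeoutMillis) {
        try {
            return queue.offer(message, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Stand-in for the sender removing one message to transmit; null if the queue is empty.
    String sendNext() {
        return queue.poll();
    }

    int size() {
        return queue.size();
    }
}
```

An API method that must enqueue a message behaves like enqueue() without the timeout: it cannot return until the sender has drained at least one slot, which is the blocking behavior the paragraph above warns about.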

Task Release

Jobs by default attempt to keep records of all tasks they contain and are aware of—both those created locally and those created in earlier sessions and reported by explicitly reestablishing the job or implicitly via status reports. As task records can be large in general (containing, among other items, last known results) and jobs can contain large numbers of tasks, it makes sense to optionally ease this restriction. This can be done by using the Job.setAllowTaskRelease() method. When task release is enabled, jobs may 'forget about' tasks arbitrarily in order to reduce resource requirements, as long as external references to them are not maintained via TaskProxy instances. Events will still be propagated reliably to listeners, and records will be maintained at least during the course of event consumption.

While this alternative behavior is powerful, the impact of using it is significant and twofold. The first issue is efficiency. As information can be forgotten about arbitrarily, it may not be available when a client application needs it. Thus, a reestablish process, which requires a messaging round trip to the server and a transfer of the resulting information, may occasionally be required to obtain information that was already known by the client library earlier in a session. How often this will occur depends on the structure of the client application and how many extra resources are actually available. In the best case, a client application will act only on information contained in events and so no reduction in efficiency will occur. In the worst case, the effective reduction of local cache could result in a 'thrashing' situation seen commonly in other caching situations (e.g., virtual memory usage) and reduce the performance of an application severely.

The second issue involves software development. Put simply, when utilizing remote mode, client applications will not be able to assume that information that was available a few milliseconds ago will still be available, unless the application itself keeps references to it. This means, for example, that even if a job was reestablished earlier in a session, the current list of tasks reported by a job may be incomplete. What this means in terms of programming complexity is entirely dependent on the structure of the client application itself. Be warned, however, that unless assumptions are carefully examined, unexpected and difficult-to-track application bugs could result.

Job Listeners

Applications that run with constant-order memory requirements must employ constant-order mechanisms, including listeners. If many tasks are each listened to explicitly—even if by the same listener object—then the client library must keep track of an explicit data structure for each task. Hence while listening to large numbers of tasks, a large amount of resources will be required. Furthermore, listening to a task may represent an implicit reference to that task, and so—although this behavior is not in any way guaranteed—the client library may keep track of all data relating to that task, including results, even if task release (as described above) is enabled.

The solution is to use a single job listener, which will receive events from all tasks in that job. The tradeoff is that if only a subset of tasks are relevant, then the listener will have to examine each event to determine whether it corresponds to a relevant task. In addition, all task events for that job will be propagated, possibly requiring additional unnecessary communication overhead if the subset of tasks being considered is relatively small. In summary, if most tasks within a job are being listened to, it is generally a good idea to listen to the entire job via a job listener rather than to each task via individual task listeners.

Referenced Structures

The referenced structure consideration is twofold. First, even if task release is enabled, tasks will not be released if references to them are maintained external to the client library, through TaskProxy instances, or any other implicit references to task information (e.g., attribute maps). The second consideration may be somewhat obvious, but it bears saying nonetheless. Applications should not, when it is possible to avoid, keep large, O(N) data structures—that is, structures in which some amount of information is kept for each task. Although doing so tends to be common practice, developers should keep in mind the scale of application made possible through Frontier and make every attempt to ensure that the client application itself and the resources available on local machines do not become the bottlenecks preventing the execution of arbitrarily large jobs.