
# Research proposal

## Approach

The candidate is aware that a large body of literature is available on stochastic multi-class queueing networks, assuming stochastic inter-arrival and service times. Nevertheless, the main emphasis of this proposal is on deterministic flow networks, since it is necessary to first understand the essence of controller design for networks of switched servers. That is, we consider the jobs as a continuum and assume deterministic arrival and processing rates. Furthermore, as we would like to derive tailor-made controllers, we assume that the network is completely known a priori and that its structure is not subject to change. Typical examples of such networks are manufacturing systems with machines serving different types of jobs, and urban road networks with traffic lights serving cars from different directions, as illustrated in Figure [vidifig1].

The proposed approach is based on ideas from nonlinear control theory. A parallel can be drawn with an autopilot for an airplane. For a given airplane, first a feasible reference trajectory should be specified. Next, the autopilot uses this desired reference trajectory to determine what control actions to apply given the current state of the airplane (position, speed, orientation, etc.). No matter what reference trajectory is given, the autopilot comes up with suitable control actions. Similarly, for a given network of switching servers, first feasible network behavior should be specified. Next, this desired reference can be used to determine what servers should be doing, based on the current state of the system. Also in this case, no matter what reference trajectory is given, suitable control actions for the network should be determined.

An approach for deriving these control actions for an autopilot is by means of Lyapunov's direct method, cf. references. The basic idea behind Lyapunov's direct method is that whenever the total energy of a mechanical or electrical system is continuously dissipated, the system must eventually settle down to an equilibrium point. So if we are able to come up with a kind of `energy-function' for the error-dynamics of the system under consideration, and if we are able to design our controller such that this energy is decreasing all the time, the system should settle down to a constant energy level and we may conclude stability of our system.
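This mechanism can be made concrete with a minimal sketch. The scalar system, feedback law, and step size below are illustrative choices, not taken from the proposal: the feedback $u=-x$ makes the energy-like function $V(x)=\frac{1}{2}x^2$ decrease along the closed-loop trajectory, so the state settles at the equilibrium.

```python
# Minimal sketch of Lyapunov's direct method (illustrative system only):
# scalar integrator x' = u with stabilizing feedback u = -x, simulated
# with explicit Euler steps.
def simulate(x0, dt=0.01, steps=500):
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        u = -x                 # controller chosen so that V decreases
        xs.append(x + dt * u)
    return xs

def V(x):
    return 0.5 * x * x         # energy-like Lyapunov function candidate

xs = simulate(2.0)
energies = [V(x) for x in xs]

# V decreases monotonically along the trajectory, so the state
# settles down towards the equilibrium x = 0.
assert all(b < a for a, b in zip(energies, energies[1:]))
assert abs(xs[-1]) < 0.05
```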

When controlling a network of switching servers towards feasible desired reference behavior, we can apply similar ideas, as the applicant introduced in references. In that paper we showed that, starting from given desired network behavior, it is possible to derive a Lyapunov function candidate by considering the difference between the amount of work in the system and the amount of work of the desired behavior. By making this Lyapunov function candidate decrease as much as possible at all times, a network controller can be derived which makes the network behavior converge to the desired network behavior. In references we showed that the same idea can also be applied when buffer constraints need to be taken into account.

The controllers resulting from the approach introduced in references are central controllers (or state-feedback controllers) which determine for each server when to switch to what job type, based on global state information. In a manufacturing setting, global information usually is available and a central controller can be implemented relatively easily. However, for controlling urban road networks with traffic lights, implementing a central controller is almost infeasible. In the latter case, distributed controllers are needed, i.e. each intersection should have its own controller which, based on local information only, decides what to do. This relates to the concept of dynamic output feedback from control theory.

Consider again the problem of controlling an airplane, or any other mechanical system. Usually a controller needs both position and velocity for determining the required control action. However, often only position measurements are available. Nevertheless, after measuring positions for a while, usually the velocity is rather well known. So to overcome the problem of missing velocity measurements an observer for the velocity is built based on both the system dynamics and the position measurements. Subsequently, instead of velocity measurements the controller uses the estimates for the velocity obtained from the observer.

As an example of this observer design, consider the dynamics

$x(k+1) = A x(k), \qquad y(k) = C x(k),$

where $x(k)\in R^n$ denotes the state-vector at time $k\in Z$, $y(k)\in R^m$ the available measurements, and $A$ and $C$ are matrices of appropriate dimensions. An observer for $x$ is given by

$\hat x(k+1) = A \hat x(k) + L\,[y(k)-\hat y(k)], \qquad \hat y(k) = C \hat x(k),$

where $\hat x(k)$ denotes an estimate for the state-vector at time $k$, $\hat y(k)$ denotes the estimated outcome of the measurements, and $L$ is a matrix of appropriate dimensions. Notice that whenever $\hat x(k)=x(k)$, the observer dynamics reduce to the system dynamics. Whenever $\hat x(k)\neq x(k)$, this can only be noticed through a difference between $\hat y(k)$ and $y(k)$. This difference is used for updating the state estimate. If we define the observation error $e(k)=x(k)-\hat x(k)$, we obtain for the observation error dynamics

$e(k+1)=[A-LC]\,e(k).$
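Substituting the system dynamics and the observer dynamics into the definition of the observation error makes the cancellation explicit:

```latex
\begin{aligned}
e(k+1) &= x(k+1) - \hat x(k+1) \\
       &= A x(k) - A \hat x(k) - L\,[C x(k) - C \hat x(k)] \\
       &= [A - LC]\, e(k).
\end{aligned}
```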

Therefore, the observation error converges to zero if and only if it is possible to choose $L$ such that all eigenvalues of the matrix $A-LC$ lie strictly inside the complex unit disc. This can be done if

$\text{rank}\begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{bmatrix} = n.$
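As a small illustration of this observer design, consider a deadbeat observer for a double integrator; the matrices $A$ and $C$ and the gain $L$ below are illustrative choices, not part of the proposal. With $L$ placing both eigenvalues of $A-LC$ at zero, the estimation error vanishes after $n=2$ steps:

```python
import numpy as np

# Illustrative Luenberger observer for a double integrator. Only the
# first state is measured (C = [1, 0]); the rank condition above holds,
# so L can place the eigenvalues of A - LC. Here L = [2, 1] puts both
# eigenvalues at zero (a "deadbeat" observer).
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[2.0],
              [1.0]])

# Observability matrix [C; CA] has full rank n = 2.
O = np.vstack([C, C @ A])
assert np.linalg.matrix_rank(O) == 2

x = np.array([[3.0], [-1.0]])    # true state (unknown to the observer)
xhat = np.zeros((2, 1))          # observer starts with no information
for k in range(3):
    y = C @ x                    # measurement
    yhat = C @ xhat              # predicted measurement
    xhat = A @ xhat + L @ (y - yhat)
    x = A @ x

# A - LC is nilpotent, so the estimation error vanishes after 2 steps.
err = np.linalg.norm(x - xhat)
assert err < 1e-9
```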

We propose to use the idea of observers for deriving distributed policies for switching networks. Since the layout of the network is known, as well as the controllers used in each server, under certain conditions it is possible for a server to reconstruct the global state of the system based on local information only. To illustrate this, consider the example introduced in references.

A single job-type is considered, which first visits Machine A, then Machine B, then Machine B again, and finally Machine A again. The successive buffers visited are denoted by 1, 2, 3, and 4, respectively. The constant input rate $\lambda$ into buffer 1 is 1 job/time-unit, while the maximal process rates at the buffers are $\mu_1=1/0.3$, $\mu_2=1/0.6$, $\mu_3=1/0.3$, and $\mu_4=1/0.6$, respectively. Lastly, the times for setting up buffers 1 and 4 at Machine A are $\sigma_{41}=50$ and $\sigma_{14}=50$, while the times for setting up buffers 2 and 3 at Machine B are $\sigma_{32}=50$ and $\sigma_{23}=50$.
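These rates imply that, setups aside, each machine is loaded at 90% of its capacity; the following quick check (variable names are ours) verifies this feasibility condition:

```python
# Feasibility check for the example network: each machine must be able
# to keep up with the arriving work, setups aside. Rates as given above.
lam = 1.0                               # arrival rate into buffer 1 [jobs/time-unit]
mu = {1: 1/0.3, 2: 1/0.6, 3: 1/0.3, 4: 1/0.6}

# Machine A serves buffers 1 and 4, Machine B serves buffers 2 and 3.
rho_A = lam / mu[1] + lam / mu[4]       # work offered to Machine A per time unit
rho_B = lam / mu[2] + lam / mu[3]       # work offered to Machine B per time unit

assert abs(rho_A - 0.9) < 1e-9
assert abs(rho_B - 0.9) < 1e-9
# Both utilizations are below 1, so steady operation is possible,
# provided the setups (50 time-units each) occur infrequently enough.
```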

As shown in references, the orbit which minimizes both the mean amount of jobs and the mean amount of work in the system (as well as the mean flow time of jobs in the system) is depicted in Figure [fig:orbit].

As shown in references, the following controller can be derived, which achieves convergence towards this desired behavior:

• If initially in mode (1,3) (i.e. Machine A set up for buffer 1 and Machine B for buffer 3), switch to mode (1,2).
• If in mode (1,2), stay in this mode until both $x_1=0$ and $x_2+x_3\geq 1000$. Then switch to mode (4,2). Both machines serve at the highest possible rate (which might be the arrival rate).
• If in mode (4,2), stay in this mode until both $x_2=0$ and $x_4\leq 83\frac{1}{3}$. Then switch to mode (4,3). Both machines serve at the highest possible rate (which might be 0).
• If in mode (4,3), stay in this mode until $x_3=0$. Then switch to mode (1,2). Both machines serve at the highest possible rate (which might be 0).
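The switching rules above can be written down as a small decision function; the sketch below covers the mode-switching logic only (the state and mode encodings are ours, and serving rates and setup times are not modelled):

```python
# Sketch of the central switching rules; mode (i, j) means Machine A is
# set up for buffer i and Machine B for buffer j, and x = (x1, x2, x3, x4)
# is the global buffer-content vector.
def central_switch(mode, x):
    x1, x2, x3, x4 = x
    if mode == (1, 3):                                  # initial mode only
        return (1, 2)
    if mode == (1, 2) and x1 == 0 and x2 + x3 >= 1000:
        return (4, 2)
    if mode == (4, 2) and x2 == 0 and x4 <= 83 + 1/3:
        return (4, 3)
    if mode == (4, 3) and x3 == 0:
        return (1, 2)
    return mode                                         # otherwise: stay

assert central_switch((1, 3), (0, 0, 0, 0)) == (1, 2)
assert central_switch((1, 2), (0, 600, 400, 50)) == (4, 2)
assert central_switch((1, 2), (5, 600, 400, 50)) == (1, 2)  # x1 not yet empty
assert central_switch((4, 2), (0, 0, 900, 80)) == (4, 3)
assert central_switch((4, 3), (0, 100, 0, 80)) == (1, 2)
```

The test cases make the coupling explicit: in mode (1,2) the switch of Machine A depends on $x_2+x_3$, which is information held at Machine B.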

Clearly, this controller is a central controller, since switching Machine A from buffer 1 to buffer 4 or vice versa requires state information at Machine B. Also, switching Machine B from buffer 2 to buffer 3 requires state information at Machine A.

Nevertheless, using the concept of observers, a distributed implementation can be made. It is relatively easy to see that for Machine B the following controller can be used as well:

• Serve step 2 at the highest possible rate (which might be at the arrival rate or even idling) until both $x_2=0$ and at least 1000 jobs have been served. Then switch to step 3.
• Serve step 3 at maximal rate until $x_3=0$. Then switch to step 2.
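Machine B's two rules can be sketched as a local state machine that uses nothing beyond its own buffer contents and a local counter of jobs served; the class and its method names are illustrative:

```python
# Sketch of Machine B's distributed controller: purely local information
# (own buffer contents and a service counter), no knowledge of Machine A.
class MachineB:
    def __init__(self):
        self.step = 2
        self.served = 0          # jobs served in the current mode

    def record_service(self, jobs):
        self.served += jobs

    def update(self, x2, x3):
        if self.step == 2 and x2 == 0 and self.served >= 1000:
            self.step, self.served = 3, 0
        elif self.step == 3 and x3 == 0:
            self.step, self.served = 2, 0
        return self.step

b = MachineB()
b.record_service(1000)
assert b.update(x2=0, x3=400) == 3    # both conditions met: switch to step 3
assert b.update(x2=50, x3=10) == 3    # keep serving step 3
assert b.update(x2=50, x3=0) == 2     # buffer 3 empty: switch back to step 2
```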

At first sight, the controller for Machine A seems to need global state information. However, using knowledge of the layout of the system as well as of the controller implemented at Machine B, the required state information can be reconstructed. For instance, when jobs stop arriving from Machine B to Machine A at step 4, Machine A knows that apparently buffer 3 has become empty. Since Machine A also knows when service of buffer 3 started, it can even determine when Machine B started and stopped serving buffer 2 before serving buffer 3. Finally, since Machine A has been sending jobs to Machine B, Machine A is able to reconstruct the global state information, which can then be used in the controller for Machine A. The estimate at Machine A can be updated as soon as jobs start or stop arriving from Machine B. Applying these insights results in the following controller for Machine A, as explained in references:

• If serving step 1, continue until both $x_1=0$ and at least 1000 jobs have been served. Then switch to step 4. Let ${\bar x}_1$ be the number of jobs served during this mode.
• Let ${\bar x}_4$ denote the number of jobs in buffer 4 when the setup to serving step 4 has completed. If serving step 4, continue until ${\bar x}_4+\frac{1}{2}{\bar x}_1$ jobs have been served. Then switch to step 1.
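Similarly, Machine A's rules can be sketched as a local state machine tracking ${\bar x}_1$ and ${\bar x}_4$ as described above; again this is an illustrative sketch of the decision logic only, not the implementation from references:

```python
# Sketch of Machine A's distributed controller. It tracks xbar1 (jobs
# served while set up for step 1) and xbar4 (buffer-4 content when the
# setup to step 4 completes) -- all quantities Machine A can measure or
# reconstruct locally.
class MachineA:
    def __init__(self):
        self.step = 1
        self.served = 0          # jobs served in the current mode
        self.xbar1 = 0
        self.xbar4 = 0

    def record_service(self, jobs):
        self.served += jobs

    def setup_done(self, x4):
        self.xbar4 = x4          # buffer 4 content when the setup completes

    def update(self, x1):
        if self.step == 1 and x1 == 0 and self.served >= 1000:
            self.xbar1, self.step, self.served = self.served, 4, 0
        elif self.step == 4 and self.served >= self.xbar4 + self.xbar1 / 2:
            self.step, self.served = 1, 0
        return self.step

a = MachineA()
a.record_service(1200)
assert a.update(x1=0) == 4       # x1 empty and >= 1000 served: switch to 4
a.setup_done(x4=150)
a.record_service(700)
assert a.update(x1=30) == 4      # 700 < 150 + 1200/2: keep serving step 4
a.record_service(100)
assert a.update(x1=30) == 1      # 800 >= 750: switch back to step 1
```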

It has been shown in references that using these distributed controllers, convergence of the system behavior towards the behavior depicted in Figure [fig:orbit] is guaranteed.

This example shows that it is possible to use the notion of observers from control theory to derive tailor made distributed controllers for a deterministic flow network of switched servers with setup times.

vidi/approach.txt · Last modified: Thursday, 02 April 2009 : 16:34:48 by hvrooy