Modular Neural Network Modelling for Long-range Prediction of an Evaporator. Article in Control Engineering Practice, January 2000. DOI: 10.1016/S0967-0661(99)00123-9. Available at: https://www.researchgate.net/publication/263502603
Paper for publication in Control Engineering Practice
Paper Number: CD 807
Title: Modular Neural Network Modelling for Long-Range Prediction of an Evaporator
Running title: Modular Neural Network Modelling for an Evaporator
Authors: N.T. Russell†, H.H.C. Bakker‡ & R.I. Chaplin‡
Authors’ Affiliation:
‡ Institute of Technology & Engineering, Massey University, Palmerston North, New Zealand
† Predictive Control Ltd, Highbank House, Exchange St, Stockport, Cheshire SK3 0ET, UK. Fax: +44-1606-44592
Contact Author: Dr. H. Bakker, Institute of Technology & Engineering, Massey University, Private Bag 11222, Palmerston North, NEW ZEALAND. Fax: +64-6-3505604. Email: H.H.Bakker@massey.ac.nz

Abstract

This paper presents the development of a modular neural network model of a three-effect, falling-film evaporator. The model comprises a number of sub-networks each modelling a specific element of the overall system. The modular structure was employed in order to provide benefits in terms of improved model training and performance. The performance of the modular neural model is demonstrated for long-range prediction by comparing it with process data, an analytical simulation and a linear ARX model.
The results show that the modular neural model can satisfactorily predict over a horizon of arbitrary length and is suited for implementation within a predictive control scheme. Benefits in terms of model flexibility and interpretability are also discussed.

Keywords: Neural networks; simulation; prediction; modular modelling; evaporators; model-based predictive control.

1. Introduction

Artificial neural networks (NNs) are a parallel processing paradigm inspired by the way biological nervous systems learn and process information. They have become useful tools for process identification because of their ability to represent nonlinear systems and their easy development (relative to conventional methods) using measured process data.
With the advent of artificial NNs the cause of nonlinear process control has been significantly advanced. Neural modelling techniques have been implemented within a range of control strategies including inferential control (Willis et al., 1991), model reference adaptive control (Narendra & Parthasarathy, 1990), internal model control and model-based predictive control (Hunt et al., 1992; Morris et al., 1994). In recent chemical process control research, model-based predictive control (MBPC) schemes have become one of the most common control methodologies to which NNs have been applied.

In general, model-based predictive control is a receding-horizon strategy which aims to optimise the control moves over a future horizon based on a desired control objective. The controller makes use of an explicit dynamic process model to estimate the process outputs over the prediction horizon. The model plays a vital role in the operation of the controller and, in effect, determining an appropriate model constitutes the majority of the controller design.

A model used within MBPC is required to be a dynamic model with the ability to predict ahead in time, referred to as n-step-ahead prediction, where n is the length of the finite prediction horizon. Some dynamic models perform well in estimating over a short horizon but, when faced with a more realistic and useful horizon of, say, 20 or more time steps, their performance can degrade rapidly. This is particularly true for NN models. The ability to predict ahead must therefore be built into the training methodology. Often the exact length of the prediction horizon is not known prior to the training of the NN model; it is therefore sensible to train the network to predict over the entire range of the available data.
A data set which spans N time steps would then be used to train a network to predict up to N steps ahead. For the purposes of this paper this is referred to as long-range prediction. Using the network to produce long-range predictions is equivalent to implementing the network as a pure simulation model where only past information up to time t is used to predict up to time t + N.

In general NNs are time-independent models. To capture the dynamics of a system, time-delay networks are commonly used (Narendra & Parthasarathy, 1990). These networks can, however, dramatically increase the dimensions of the NN, which can decrease the speed and performance of the training method. The use of modular modelling approaches can alleviate these problems. Additionally, by combining prior knowledge of the system to be modelled within a modular structure, one can increase the transparency and interpretability of the trained NN model.

2. The evaporation process

The evaporation process is a good example of a process that requires accurate control. A large proportion of the energy used in industry is given to drying processes, and evaporators play a significant role in the industrial drying of a number of food products such as milk powders. Good evaporator control is particularly important since evaporators are often located directly upstream from energy-intensive processes such as spray drying. Tight control of the evaporator leads directly to better control in the dryer, which results in better energy efficiency and a more consistent product.

The subject of this study is a pilot-scale evaporator resident within the Institute of Technology & Engineering, Massey University. The evaporator is a three-effect, falling-film evaporator with two preheater stages and a condenser. A schematic of a single evaporation stage (effect) of the pilot-plant is shown in Figure 1.
Industrial-scale evaporators have many evaporation tubes per effect; the pilot-plant, however, has just a single tube. The product flow enters the top of the effect through a distribution nozzle and plate arrangement and flows down the evaporation tube as a film, boiling off as it descends. The vapour is separated from the liquid and is drawn off to be used as the heating medium for the downstream effect. The concentrated product forms a level in the reservoir at the base of the effect and is fed to the next effect for further concentration.

The pilot-plant evaporator has yet to be operated with liquids containing a solute, using instead pure water as the product stream. The product concentration is therefore not considered in this paper. Incorporating equations for the concentration of a solute would increase the nonlinear characteristic of the model. The variables of interest are the product temperature in the effect, the concentrate level in the reservoir and the product flowrate out of the effect.

An analytical model of the pilot-plant evaporator was developed as a simulation tool (Russell, 1997). This model was used for comparison with the empirical models described later in this paper. The approach used for the analytical evaporator model development was taken from the work of Quaak and Gerritsen (1990). They developed a dynamic model for a multi-effect evaporator and used a systems approach in which the process was divided into sub-systems for the purpose of analysis. The sub-systems included the model distribution plate, evaporation tube, product transport, and energy flows for each effect. The states of the model are the concentrated product temperatures, flowrates and levels. The model of the Massey University evaporator extended Quaak and Gerritsen’s model to include additional sub-systems relating to feed, preheater and condenser systems.
The settling times for the pilot-plant were approximately 1 minute for the concentrate levels and of the order of 2 to 4 minutes for the effect temperatures. It was assumed that the time constants for the flowrates were negligible. Data from the plant was sampled at 5-second intervals.

3. Neural network model development

3.1 Modular modelling approach

The objectives in the structuring of the NN model were to
• simplify the training task as much as possible, and
• develop a model with a more meaningful structure.

To this end it was decided to create a modular model which made use of prior knowledge of the system. More specifically, the NN model uses prior knowledge to create a modular model in two ways:
i) The modelling task was decomposed into smaller elements or modules representing sub-units within the total system, similar to that used in the analytical model. The sub-networks were combined to form the full model according to the structure of the actual process. Each network was used to predict only one output variable. The inputs were chosen to be those variables that were known to strongly influence the output and could include outputs of other sub-networks.
ii) Following the approach of Mavrovouniotis & Chang (1992), each sub-network was further simplified through localised computation by grouping related inputs together.

The structure of the sub-networks was determined using the following guidelines:
• The delayed inputs for each variable were connected to two hidden neurons.
• All the inputs relating to a particular time instant were also connected to a single hidden neuron.
• All the hidden neurons were connected to a single output neuron.

These guidelines represent a simple and compact approach to sub-network structuring. The two hidden neurons coupled to each input attempt to capture the dynamics inherent within each input variable.
Using only one neuron for this task would not be sufficient to describe complex behaviour, whereas using more than two neurons is unlikely to significantly improve the characterisation. The additional hidden neurons attempt to capture the inter-relationship between the inputs at each time instant. These time-related hidden neurons also prevent the network structure becoming simply a parallel combination of the inputs and can be considered to provide a ‘snap-shot’ of the system inputs at a particular time.

An example of the structure of a sub-network with localised connections and output feedback is illustrated in Figure 2. The output feedback is necessary to produce n-step-ahead predictions since past outputs need to be available for the model at future time steps. This structure of network is known as an externally-recurrent network (Su et al., 1992).

The benefits of a modular modelling approach using prior knowledge are three-fold:
i) to provide an efficient method for tackling large modelling problems, resulting in improved training and generalisation;
ii) to provide flexibility so that the model can be easily updated and modified as the need arises, and to allow differing model characteristics and methodologies to be included within the single overall model;
iii) to give structure and an element of transparency to a NN so that the model can be more easily analysed and understood with respect to the actual system.

Point iii) is especially appealing since NNs generally have an amorphous structure whose relevance to the physical system they represent is not apparent. It is also difficult to analyse the dependencies occurring within a network and how the outputs are calculated.
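The structuring guidelines of this section can be illustrated with a short sketch that builds the input-to-hidden connection mask of one locally-connected sub-network. This is only an illustration under stated assumptions, not the authors' implementation: the function name `local_connection_mask`, the input ordering, and the example values (four input variables with delays of 1, 2 and 4 samples) are chosen here for demonstration.

```python
import numpy as np

def local_connection_mask(n_vars, delays):
    """Return a 0/1 mask of shape (n_hidden, n_inputs) for one sub-network.

    Inputs are ordered variable by variable: all delays of variable 0,
    then all delays of variable 1, and so on (an assumed ordering).
    """
    n_delays = len(delays)
    n_inputs = n_vars * n_delays
    n_hidden = 2 * n_vars + n_delays
    mask = np.zeros((n_hidden, n_inputs), dtype=int)
    # Two hidden neurons per variable, each seeing that variable's delayed values.
    for v in range(n_vars):
        cols = slice(v * n_delays, (v + 1) * n_delays)
        mask[2 * v, cols] = 1
        mask[2 * v + 1, cols] = 1
    # One hidden neuron per time instant, seeing every variable at that delay.
    for d in range(n_delays):
        mask[2 * n_vars + d, d::n_delays] = 1
    return mask

# Illustrative values only: 4 input variables, delays of 1, 2 and 4 samples.
mask = local_connection_mask(n_vars=4, delays=[1, 2, 4])
print(mask.shape)        # hidden neurons x inputs
print(int(mask.sum()))   # input-to-hidden weights that are kept
print(mask.size)         # weights a fully-connected layer of the same size would have
```

Multiplying a full weight matrix element-wise by such a mask keeps only the localised connections, which is one simple way to see why the locally-connected structure has far fewer parameters than a fully-connected layer with the same number of hidden neurons.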
3.2 Sub-network training methodology

The network training method developed for this study combined a backpropagation through time (BPTT) procedure with the Levenberg-Marquardt optimisation technique (Marquardt, 1963; Levenberg, 1944). The name Levenberg-Marquardt through time (LMTT) was chosen to describe this combined methodology.

The BPTT approach (Rumelhart et al., 1986; Werbos, 1990) uses data consisting of N time steps to train an externally-recurrent network by unfolding the network N times to create a feedforward network where the weights in each layer are identical. A modified version of the standard backpropagation (BP) algorithm is used to update the weights of the network. Previous work of a similar nature to this current study has employed conjugate gradient and random search techniques with BPTT to overcome poor convergence (Su et al., 1992).

The Levenberg-Marquardt (LM) optimisation method offers a more efficient solution for network training. LM can easily be incorporated into the backpropagation method, replacing steepest descent minimisation, and its superior performance over steepest descent and conjugate gradient learning in NNs has been demonstrated (Hagan & Menhaj, 1994). However, its main disadvantage is that significantly more processing memory is required compared with other methods.

When using such a recurrent training approach it is necessary during the presentation phase that the input matrix is updated with the network predictions. The training data is presented in batch mode and the network predictions are used to update the input vectors. This ensures that the network is trained using a parallel method (Narendra & Parthasarathy, 1990) and the resulting errors are calculated based on a recurrent implementation.
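The recurrent presentation described above, in which predictions replace measured outputs in the input vectors, can be sketched as a free-run simulation loop. The sketch below is a minimal illustration, not the LMTT algorithm itself: `free_run` and `toy_step` are hypothetical names, and the first-order lag stands in for a trained sub-network.

```python
import numpy as np

def free_run(step_fn, u, y_init, n_lags):
    """Simulate a one-step model recurrently over len(u) samples.

    step_fn(past_y, past_u) returns the next output; past_y and past_u hold
    the n_lags most recent values, oldest first.
    u is the exogenous input sequence; y_init holds the n_lags measured
    outputs used to start the simulation.
    """
    y = list(y_init)
    for t in range(n_lags, len(u)):
        past_y = np.array(y[t - n_lags:t])   # model predictions, not measurements
        past_u = np.array(u[t - n_lags:t])
        y.append(float(step_fn(past_y, past_u)))
    return np.array(y)

# Hypothetical stand-in for a trained sub-network: a stable first-order lag.
def toy_step(past_y, past_u):
    return 0.9 * past_y[-1] + 0.1 * past_u[-1]

u = np.ones(50)                               # unit step input
y_hat = free_run(toy_step, u, y_init=[0.0], n_lags=1)
print(len(y_hat), y_hat[-1])
```

Because only the initial outputs are measured values, the error of such a run accumulates over the whole data set, which is the quantity the parallel training method is intended to minimise.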
There are alternative approaches to BPTT for training recurrent networks (Pearlmutter, 1990) which have been developed for fully recurrent networks. These alternative methods are able to train networks without the need for unfolding the network and are therefore more memory efficient than BPTT, although they have a higher computational cost. A BPTT approach was chosen because of its simplicity and easy adaptation from the standard BP algorithm.

4. The evaporator neural network model

4.1 Process data

Pseudo-random sequences were applied to the inputs of the evaporator simulation in order to excite the model states and demonstrate the dynamics of the evaporator. For nonlinear identification the response data needs to represent the full range of the system dynamics and therefore a larger number of operating levels were applied to each input variable. Three separate trials (carried out on different days) were run in order to obtain three independent data sets. These three data sets were then assigned to be the training, testing and validation sets for the evaporator modelling task.

The random inputs were simultaneously applied to the feed pump, each of the effect pumps and the steam valve to produce dynamically rich response data. A sampling interval of 10 seconds for the training data was considered appropriate for the short time constants present in the evaporator process.

The training data set was selected as the set with the widest variation in process outputs. This data set covered an interval of over two hours. The testing set was 90 minutes long and was also used during the training phase. The validation data covered an interval of 53 minutes and was used to give an unbiased estimate of the model performance.
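A multi-level pseudo-random input of the kind described in this section can be generated along the following lines. This is a sketch under assumptions: the paper does not give the number of levels, the hold times or the generator used, so the function name `multilevel_random_sequence` and every numerical setting below are hypothetical.

```python
import numpy as np

def multilevel_random_sequence(n_samples, levels, min_hold, max_hold, seed=0):
    """Piecewise-constant sequence that jumps between the given levels.

    Each level is held for a random number of samples in [min_hold, max_hold].
    """
    rng = np.random.default_rng(seed)
    seq = np.empty(n_samples)
    t = 0
    while t < n_samples:
        hold = int(rng.integers(min_hold, max_hold + 1))
        seq[t:t + hold] = rng.choice(levels)   # the slice is clipped at the end
        t += hold
    return seq

# Assumed settings: 720 samples (two hours at a 10 s interval), five levels in
# scaled units, holds of 6 to 30 samples (1 to 5 minutes).
levels = np.linspace(-1.0, 1.0, 5)
pump_speed = multilevel_random_sequence(720, levels, min_hold=6, max_hold=30)
print(pump_speed[:10])
```

Hold times comparable to the settling times of the plant let each step response develop before the next change, while several distinct levels exercise the process over a range of operating points.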
4.2 Structure selection

The network structures were chosen on the basis of known relationships through experience with the process and the analytical model development. Cross-correlation tests were also performed, both to assist in selecting the correct cause-to-effect relationships and to determine the delay spread of the time-delayed inputs into the networks. Similar network structures have been used in previous work related to the pilot-plant evaporator which made use of test data from the analytical simulation (Russell & Bakker, 1997). The model structures used were found to be successful and that work proved a useful preliminary study.

The choice of the number of hidden neurons was not necessary for most of the sub-networks since their structures were determined by the guidelines described in Section 3.1. The number of hidden neurons is therefore dependent on the number of input neurons. The only exceptions to this were the flowrate sub-networks, which were only first order systems, had no time-delayed input streams and therefore did not follow the structure rules outlined in Section 3.1.

Some examples of the second effect sub-networks will help to demonstrate the structures used. Each network was used to predict only one output variable and its inputs are variables that are known to strongly influence that particular output. The resulting sub-nets have multi-input, single-output structures. The following equations define the functionality of the three sub-networks that make up the effect model.

• The effect two temperature was selected to be a function of the upstream and downstream effect temperatures (T1 and T3), the flowrate into the effect (Q1) and previous values of the effect two temperature.
T2 = f(T1, T3, Q1, T2)    (1)

These variables were selected since the upstream and downstream temperatures directly influence the effect temperature and the mass of product in the evaporation tube strongly influences the rate at which the temperature changes. After investigating cross-correlation results and testing a number of model structures it was decided that each input to the network would have the same delay spread of four samples. The number of hidden neurons used was determined by the guidelines in Section 3.1.

• The effect two concentrate level was chosen to be a function of the flow in (Q1) and out (Q2) of the effect, the temperature of the effect (T2) and previous values of the level.

L2 = f(Q1, Q2, T2, L2)    (2)

These variables were selected since, according to the analytical model, the change in the level is a function of the product flow into the effect and the product and vapour flows out of the effect. The temperature has also been included as a variable to represent the vapour flow since the higher the temperature the greater the evaporation. After investigating cross-correlation results and testing a number of model structures it was decided that each input to the network would have the same delay spread of three samples. The number of hidden neurons used was determined by the guidelines in Section 3.1.

• The flow out of effect two was determined as a function of the differential pressure between effects two and three, which is estimated using the temperature difference T2 – T3, the pump speed (N2), the concentrate level (L2) and the previous value of the flowrate.

Q2 = f(T2 – T3, N2, L2, Q2)    (3)

These variables were selected after considering the energy balance across the product transport system. Using an input for the pressure difference in this network, rather than the temperature difference, did not produce an improved performance.
Since the flowrate dynamics are fast it was decided that each input to the network would have a delay spread of 1 sample, essentially producing a static model. This meant that the flowrate network topology would not be determined according to the rules of Section 3.1. Instead the number of hidden neurons used was determined by performing selection trials where the number of neurons was varied between two and twelve, with at least ten networks tested at each number of hidden neurons. The structure that minimised the testing error was selected. For the four flowrate sub-nets in the evaporator model either five or six hidden neurons were used.

Each sub-network had sigmoidal hidden-layer neurons and linear output neurons. Sigmoidal type networks were used since similar recurrent network training had been carried out using this type of network (Su et al., 1992).

4.3 Network training

All training and testing data was scaled to lie within the range –1 to +1. The network training was applied individually to each sub-network. As the network weights were updated the testing data was passed through the network to provide a measure of the network’s generalisation capabilities during the training process. The network with the lowest testing prediction error was selected as the ‘best’ model. The test for each network was to perform a long-range prediction over the whole testing data set. This equates to 90 minutes of data.

4.4 Complete evaporator network model

The complete evaporator neural model consists of thirteen sub-networks each describing a separate state of the three-effect system. The model states include the product temperatures, concentrate levels and product flowrates throughout the evaporator. The inputs are the steam valve position and the speeds for the feed pump and each of the effect pumps. Once trained the sub-networks were combined to form the complete evaporator model.
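The way trained sub-networks feed one another can be sketched for the second effect, following the input and output choices of Equations (1) to (3). This is a simplified illustration rather than the authors' model: each sub-network is reduced to a single-sample delay, the three callables in `nets` are arbitrary toy functions standing in for trained networks, and all values are hypothetical scaled quantities.

```python
def step_effect_two(state, inputs, nets):
    """Advance the effect-two states by one sample.

    state:  previous T2, L2 and Q2 (model predictions fed back)
    inputs: T1, T3 and Q1 from neighbouring sub-networks, and pump speed N2
    nets:   callables standing in for the sub-networks of Eqs. (1) to (3)
    """
    T2 = nets["T2"](inputs["T1"], inputs["T3"], inputs["Q1"], state["T2"])
    L2 = nets["L2"](inputs["Q1"], state["Q2"], state["T2"], state["L2"])
    Q2 = nets["Q2"](state["T2"] - inputs["T3"], inputs["N2"],
                    state["L2"], state["Q2"])
    return {"T2": T2, "L2": L2, "Q2": Q2}

# Toy stand-ins (assumptions, not identified models).
nets = {
    "T2": lambda T1, T3, Q1, T2: 0.8 * T2 + 0.1 * T1 + 0.05 * T3 + 0.05 * Q1,
    "L2": lambda Q1, Q2, T2, L2: L2 + 0.05 * (Q1 - Q2) - 0.01 * T2,
    "Q2": lambda dT, N2, L2, Q2: 0.5 * Q2 + 0.3 * N2 + 0.1 * L2 + 0.1 * dT,
}

state = {"T2": 0.0, "L2": 0.0, "Q2": 0.0}
inputs = {"T1": 0.5, "T3": 0.1, "Q1": 0.2, "N2": 0.3}
trajectory = [state]
for _ in range(20):
    state = step_effect_two(state, inputs, nets)
    trajectory.append(state)
print(trajectory[1])
```

Because each sub-network is only a callable with a fixed set of inputs, any one of them could be swapped for a retrained network or a different model type without touching the others.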
The general structure of the complete evaporator model is illustrated in Figure 3. The complete model consists of the thirteen sub-networks connected in parallel and is equivalent to a sparsely-connected two-layered network with an additional input layer. Any secondary relationships between the system variables that are not modelled within the sub-networks are still accounted for in the manner in which they are joined together, since the outputs of some sub-networks are used as inputs for others.

5. Alternative models

Additional models of the evaporator were developed which employed alternative structures or modelling methods in order to provide a comparison for the modular NN model.

5.1 Other neural network models

NNs with alternative topologies were developed to compare with the locally-connected, modular sub-networks to determine whether the proposed modular structures produced improved results. The alternative networks included:
• Fully connected sub-networks to compare with the locally-connected sub-networks for a sub-section of the evaporator.
• Locally-connected, three-output NN to compare with the modular-structured NN model for a sub-section of the evaporator.

The second effect of the evaporator was selected as the sub-section to be modelled for the above comparisons. The alternative models were fed the identical inputs used by the corresponding modular sub-networks. The number of hidden nodes used in the fully-connected sub-networks was chosen so as to give a similar number of weights as the locally-connected sub-networks (Table 1). The number of weights used in the three-output network with a locally connected structure was less than that of the modular structured network, as shown in Table 2.
5.2 Linear regression model

A second modular model was developed for the full evaporator system with an equivalent overall structure to that of the modular NN model of Figure 3 but with the nonlinear sub-networks being replaced with linear ARX (AutoRegressive with eXogenous inputs) regression sub-models. These sub-models were identified using the least squares approach of Ljung (1987).

6. Results and discussion

All the models were cross-validated using an independent data set from the actual process to perform an unbiased test. All the following results are based on comparing the model performances to this validation data set.

6.1 Comparison of locally-connected sub-networks and alternative networks

The locally-connected sub-networks exhibited improved training and superior long-range prediction performance over the sub-networks with a fully-connected topology. The training of the locally-connected sub-networks generally converged to more reliable solutions and more rapidly than the training of the alternative models (Russell, 1997). The scaled mean-square errors (MSEs) for the second effect product temperature and concentrate level predictions are shown in Figure 4. As well as the improved performance, the locally-connected networks have fewer network parameters and allow the network behaviour to be more readily analysed and understood. Results for the flowrate sub-network are not shown since it is a first order model and therefore does not have a locally-connected structure.

Similar results were obtained when comparing the training and performance of a single three-output NN and the modular model consisting of three sub-networks for the second effect (Russell, 1997). The MSEs for the long-range predictions are plotted in Figure 5. Despite having fewer network parameters, the three-output NN model does not generalise as well as the modular network model.
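For readers who wish to reproduce this kind of comparison, a linear ARX sub-model of the type described in Section 5.2 can be fitted by ordinary least squares and then scored by the MSE of its long-range (free-run) prediction. The sketch below uses NumPy's generic least-squares solver and synthetic, noise-free data from a known first-order system; the function names, model orders and data are assumptions for illustration and do not reproduce the procedure of Ljung (1987) in detail or any result of this paper.

```python
import numpy as np

def fit_arx(y, u, na, nb):
    """Least-squares fit of y[t] = sum_i a_i y[t-i] + sum_j b_j u[t-j]."""
    n = max(na, nb)
    rows = [np.concatenate((y[t - na:t][::-1], u[t - nb:t][::-1]))
            for t in range(n, len(y))]
    theta, *_ = np.linalg.lstsq(np.array(rows), y[n:], rcond=None)
    return theta[:na], theta[na:]

def simulate_arx(a, b, u, y_init):
    """Long-range prediction: past outputs are the model's own predictions."""
    na, nb = len(a), len(b)
    n = max(na, nb)
    y = list(y_init[:n])                      # only the first n values are measured
    for t in range(n, len(u)):
        y_prev = np.array(y[t - na:t][::-1])
        y.append(float(a @ y_prev + b @ u[t - nb:t][::-1]))
    return np.array(y)

# Synthetic data from a known first-order system (illustration only).
rng = np.random.default_rng(0)
u = rng.uniform(-1, 1, 300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.9 * y[t - 1] + 0.2 * u[t - 1]

a, b = fit_arx(y, u, na=1, nb=1)
y_hat = simulate_arx(a, b, u, y_init=y)
mse = float(np.mean((y - y_hat) ** 2))
print(a, b, mse)
```

Scoring the free-run output rather than the one-step-ahead residual matches the long-range prediction test applied to all the models in this section.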
6.2 Performance of complete evaporator models

Implementing the modular NN and ARX models to produce long-range predictions allows a direct comparison to be made with a nonlinear analytical simulation of the evaporator (Russell, 1997). The MSE results for the long-range prediction of the validation data for each variable of interest are plotted in Figure 6. Generally the performance of all the models was similar in character in that the temperature (Ti) and flowrate (Qi) estimates were closer to the plant data while the concentrate levels (Li) were the hardest to model accurately. It is clear from the results that both the NN and ARX models out-perform the analytical model in estimating the plant data over the majority of the variables.

It is also interesting to note that the linear ARX model error results are generally similar to or better than those of the nonlinear NN. This leads one to conclude that the data may not, in fact, be highly nonlinear (since the process was not operated with a product containing solids) and that some of the sub-networks did not capture all the nonlinear behaviour. However, one must investigate plots of the model responses before making such conclusions since the MSE results only give a partial indication of the model fit.

By way of example, two plots of the model predictions are illustrated below. Figure 7 shows a plot of the actual second effect temperature (T2) compared with the predicted outputs of the NN and analytical models. Both models perform well in matching the actual data. The ARX model has a similar response to the two that are plotted.

All the models had difficulty in predicting the second effect concentrate level (L2) accurately over the entire time range. The empirical models, however, out-perform the analytical model.
While the NN model has a worse MSE result than the ARX model, this is mainly due to its poor response at the low values of the level during the period of 30 to 40 minutes (Figure 8). The remainder of the NN model estimate is superior to that of the ARX model, having a closer fit. Subsequent analysis of the training data for the L2 sub-network showed that it did not span an adequate region of the variable range. This highlights one problem when training NN models to predict a number of variables: it can be difficult to obtain process data containing adequate excitation of all the variables of interest. Figure 7 is an example of a typical prediction result whereas L2 (Figure 8) is the poorest predictor from the complete model.

6.3 Network performance versus prediction horizon

To test whether the modular neural model could predict over an arbitrary length horizon, the model was simulated with varying lengths of prediction horizon in order to predict the validation data. Figure 9 shows a plot of MSE prediction error versus the horizon length, n, for the temperature variables. The trends show in general that as the prediction horizon is increased the errors initially increase but then reach a plateau where the error remains approximately constant for increased horizons. The exception to this appears to be the second effect temperature. However, the plots appear to indicate that the network does allow an arbitrary length prediction horizon to be used without significant degradation of the model predictions. Plots exhibiting similar trends were also obtained for the flowrate and level predictions.

6.4 Model flexibility

The flexible nature of the model structure allows the development of a mixed-model representation of a system where various alternative modelling techniques can be applied to represent different portions of the system.
Some sub-systems may be modelled with nonlinear NNs while others may only require a linear model. The sub-networks could also be combined with analytically derived models to enhance the model where theoretical process knowledge is lacking. A modular structure also allows modifications and extensions to the model to be carried out easily if changes in the process need to be taken into account, thus reducing future model development effort. If a particular sub-network is performing poorly then this element can be singled out for re-development.

6.5 Model transparency

As an example, the T2 locally-connected sub-network can be used to illustrate how a measure of interpretation can be given to the relationships between the inputs and the T2 output. The outputs of the three time-based hidden-layer neurons are plotted in Figure 10. The inputs of one neuron are all the input variables at time t – 1, another connects with the variables at time t – 2 and the third with those at time t – 4. From this plot it appears that different neurons are characterising different variations in T2. The inputs at t – 2 produce a fairly constant output while the inputs at t – 4 model the major variations in the output, T2. While it is not obvious from Figure 10 what the precise relationships are, the structure allows one to break down the relationships within the overall network more easily and to discriminate between the effects of the different inputs.

6.6 Training effort

The locally-connected sub-networks trained better than fully-connected networks, which were prone to becoming trapped in local minima, causing the LMTT method to terminate prematurely. Training of the locally-connected sub-networks generally took fewer than 100 epochs and converged to a reasonable solution, whereas the fully-connected sub-networks typically took more than 400 epochs and still failed to converge.
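The locally-connected hidden layer described in Section 6.5, in which each hidden neuron sees the inputs at one time lag only, can be sketched as a connection mask applied to a hidden-layer weight matrix. This is an assumed implementation for illustration, not the authors' code: the number of input variables (4), the sigmoid activation, the random weights and the function names `local_mask` and `hidden_outputs` are all hypothetical, and the weight counts below do not correspond to those in Table 1.

```python
import numpy as np

def local_mask(n_vars, lags):
    """Connection mask of shape (len(lags), n_vars * len(lags)).

    The input vector is assumed to be ordered in lag blocks:
    [all variables at lags[0], all variables at lags[1], ...].
    Hidden neuron j is connected only to the block for lags[j].
    """
    n_lags = len(lags)
    mask = np.zeros((n_lags, n_vars * n_lags))
    for j in range(n_lags):
        mask[j, j * n_vars:(j + 1) * n_vars] = 1.0
    return mask

def hidden_outputs(weights, mask, x):
    """Sigmoid outputs of the masked (locally-connected) hidden layer."""
    z = (weights * mask) @ x
    return 1.0 / (1.0 + np.exp(-z))

n_vars, lags = 4, (1, 2, 4)          # hypothetical sizes; lags follow Section 6.5
mask = local_mask(n_vars, lags)

rng = np.random.default_rng(0)
weights = rng.normal(size=mask.shape)     # illustrative, untrained weights
x = rng.normal(size=n_vars * len(lags))   # illustrative input vector

h = hidden_outputs(weights, mask, x)

# Perturb only the inputs at the second lag (t - 2).
x_perturbed = x.copy()
x_perturbed[n_vars:2 * n_vars] += 1.0
h_perturbed = hidden_outputs(weights, mask, x_perturbed)

local_weights = int(mask.sum())   # input-to-hidden weights kept by the mask
full_weights = mask.size          # weights of a fully-connected layer

print(local_weights, full_weights)
print(h, h_perturbed)
```

Because a change in the t – 2 inputs can only affect the t – 2 neuron, each hidden output can be read as the contribution of one time lag, which is the basis of the interpretation offered for Figure 10. The mask also removes two thirds of the input-to-hidden weights in this hypothetical layer.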
Similar problems were experienced with the three-output network despite its localised structure. However, the training and generalisation capabilities of this network were superior to those of a fully-connected, three-output network which was also developed.

7. Conclusions

A framework for developing a dynamic NN model, made up of several sub-networks, has been presented which enables prior process knowledge to be built into the model topology. This approach of breaking the model into sub-systems removes the need to optimise a problem of high dimension. The modular structure of the model also provides benefits in terms of model flexibility and transparency.

It was found that, generally, the more complex the structure, the more difficult it was to train effectively, and hence networks with poorer predictions were produced. The suggestion that ‘the more network weights, the harder to train’ holds true when comparing the training and performance of the fully-connected and locally-connected sub-networks. However, the modular model out-performed the three-output network despite having more weights. This is because the sub-networks were easier to train individually and, when linked together, performed better than the single network structure.

Based on observations and results from this research it can be stated that using:
• localised connections reduces the number of network parameters, which benefits training and provides an element of improved model interpretability, while
• modularisation improves network prediction performance and offers flexibility.

The complete evaporator NN model performed long-range predictions satisfactorily when presented with a validation data set and out-performed a nonlinear analytical simulation of the pilot plant. Compared with an equivalent linear ARX model, the NN model had comparable or slightly worse prediction errors.
For a nonlinear chemical process one would expect that the nonlinear NN would perform better. These results lead to the conclusion that either:
• the process is not highly nonlinear and linear models are sufficient for this application, or
• some of the sub-networks failed to converge to a satisfactory solution.

The absence of highly nonlinear data could be due to the fact that the evaporator was operated with pure water rather than a product like milk, which has complex physical properties.

The second point raises the question of how to improve the convergence of recurrent training. Some improvement has been observed when a teacher-forcing technique is applied during training (Williams & Zipser, 1989; Pearlmutter, 1990). With this scheme the target output, in place of the fed-back network output, is used to drive the network dynamics. This can help to improve network convergence but can also cause instability in the responses. Another solution which may improve the accuracy of the modular NN is to apply a global optimisation to the model after individually training the sub-networks (Gallinari, 1995). This could help reduce the accumulation of errors through the model but would require a more efficient training algorithm than that used here.

Either way, more work is required to investigate these possibilities further, both in terms of using data from trials with a more realistic product stream and in investigating alternative modular network-training strategies. These will be the focus of further research.

Despite this, the prediction results shown here do demonstrate that the modular NN model can model a chemical process and generate satisfactory predictions over an arbitrary prediction horizon. This type of model therefore has the potential to be implemented within a model-based predictive control strategy.

8. References

Gallinari, P. (1995). Training of modular neural net systems.
The Handbook of Brain Theory and Neural Networks, (M.A. Arbib, Ed.), MIT Press, MA, USA, pp. 582-585.
Hagan, M.T. & Menhaj, M.B. (1994). Training feedforward networks with the Marquardt algorithm. IEEE Trans. on Neural Networks, 5, 989-993.
Hunt, K.J., Sbarbaro, D., Zbikowski, R. & Gawthrop, P.J. (1992). Neural networks for control systems - a survey. Automatica, 28, 1083-1112.
Levenberg, K. (1944). A method for the solution of certain nonlinear problems in least squares. Quarterly of Applied Mathematics, 2, 164-168.
Ljung, L. (1987). System Identification - Theory for the User, Prentice-Hall, Englewood Cliffs, NJ, USA.
Marquardt, D.W. (1963). An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Indust. Appl. Math., 11, 431-441.
Mavrovouniotis, M.L. & Chang, S. (1992). Hierarchical neural networks. Computers & Chem. Engng., 16, 347-369.
Morris, A.J., Montague, G.A. & Willis, M.J. (1994). Artificial neural networks: Studies in process modelling and control. Trans. Inst. Chem. Eng., 72, 3-19.
Narendra, K.S. & Parthasarathy, K. (1990). Identification and control of dynamical systems using neural networks. IEEE Trans. on Neural Networks, 1, 4-27.
Pearlmutter, B.A. (1990). Dynamic recurrent neural networks. Technical Report CMU-CS-90-196, Carnegie Mellon University, Pittsburgh, PA.
Quaak, P. & Gerritsen, J.B.M. (1990). Modelling dynamic behaviour of multiple-effect falling-film evaporators, Computer Applications in Chemical Engineering, (H.T. Bussemaker & P.D. Iedema, Eds.), Elsevier, Amsterdam, pp. 59-64.
Rumelhart, D.E., Hinton, G.E. & Williams, R.J. (1986). Learning internal representations by error propagation, Parallel Distributed Processing, (D.E. Rumelhart & J.L. McClelland, Eds.), MIT Press, Cambridge, MA, USA, Vol. 1, pp. 318-362.
Russell, N.T. (1997).
Dynamic Modelling of a Falling-Film Evaporator for Model Predictive Control, PhD Thesis, Massey University, New Zealand.
Russell, N.T. & Bakker, H.H.C. (1997). Modular modelling of an evaporator for long-range prediction. Artificial Intelligence in Engineering, 11, 347-355.
Su, H.-T., McAvoy, J. & Werbos, P. (1992). Long-term predictions of chemical processes using recurrent neural networks: A parallel training approach. Industrial & Engineering Chemistry Research, 31, 1338-1352.
Werbos, P.J. (1990). Backpropagation through time: what it is and how to do it. Proc. of the IEEE, 78, 1550-1560.
Williams, R.J. & Zipser, D. (1989). A learning algorithm for continually running fully recurrent neural networks. Neural Computation, 1, 270-280.
Willis, M.J., Di Massimo, C., Montague, G.A., Tham, M.T. & Morris, A.J. (1991). Artificial neural networks in process engineering. IEE Proceedings - D, Control Theory Appl., 138, 256-266.

Fig. 1. A single effect of the three-effect evaporator

Fig. 2. An externally-recurrent sub-network with localised computation
[Diagram labels: inputs u1(t), u1(t-1), u1(t-2), u2(t), u2(t-1), u2(t-2); u1 nodes, u2 nodes and y nodes; t, t-1 and t-2 nodes; z^-1 delays; output ŷ(t+1)]

Fig. 3. General structure of the NN evaporator model
[Diagram labels: input vectors [u1(t-1)...u1(t-Ku)]^T to [um(t-1)...um(t-Ku)]^T; TDL blocks; fan-out blocks; NN sub-nets; outputs ŷ1(t) to ŷn(t)]

Fig. 4. Long-range prediction errors for the second effect sub-networks
[Bar chart: MSE (x 10^-2) for Level and Temperature; series Fully-connected and Locally-connected; bar labels as extracted: 0.4, 0.3, 3.3, 1.8]

Fig. 5. Long-range prediction errors for the second effect NN models
[Bar chart: MSE (x 10^-2) for Level, Temperature and Flowrate; series Three-output network and Modular network; bar labels as extracted: 0.5, 0.3, 0.8, 4.4, 3.5, 2.0]

Fig. 6. Comparison of prediction errors for the ARX, NN and analytical models
[Horizontal bar chart: mean-squared error (x 10^-2), axis 0 to 10, for the variables Ts, T1, T2, T3, Tph1, Tph2, L1, L2, L3, Q0, Q1, Q2, Q3; series ARX Model, Modular NN and Analytical Model]

Fig. 7. Comparison of actual data with NN and analytical model predictions for the second effect temperature (T2)
[Plot: temperature T2 (°C) against time (min), 0 to 50 min; traces Actual, Modular NN (MSE = 0.8e-2) and Analytical Model (MSE = 2.7e-2)]

Fig. 8. Comparison of actual data with NN and ARX model predictions for the second effect concentrate level (L2)
[Plot: level L2 (m) against time (min), 0 to 50 min; traces Actual, Modular NN (MSE = 8.4e-2) and Linear ARX Model (MSE = 5.8e-2)]

Fig. 9. MSE vs prediction horizon for the evaporator temperatures
[Plot: MSE (x 10^-3) against prediction horizon n, 0 to 100; traces Tph1, Tph2, T1, T2 and T3]

Fig. 10. Outputs from the three time-input hidden neurons
[Plot: neuron output (0 to 1) against time (min), 0 to 120 min; traces t-1, t-2 and t-4]

Table 1. Number of parameters for second effect sub-networks

Output    Locally-connected    Fully-connected
T2        59                   57
L2        59                   57

Table 2. Number of parameters in the Effect 2 models

Modular network    Locally-connected 3-output network
137                88