1: \documentclass[acus]{JAC2000}
2:
3: %%
4: %% Use \documentclass[boxit]{JAC2000}
5: %% To draw a frame with the correct margins on the output.
6: %%
7: %% Use \documentclass[acus]{JAC2000}
8: %% For US letter paper layout
9: %%
10:
11: \usepackage{graphicx}
12:
13:
14: %%
15: %% VARIABLE HEIGHT FOR THE TITLE BOX (default 35mm)
16: %%
17:
18: \setlength{\titleblockheight}{45mm}
19:
20: \begin{document}
21: \title{\flushright{WEAP023}\\[15pt] \centering Modernising the ESRF control system
22: with GNU/Linux}
23:
24: \author{A.G\"otz, A.Homs, B.Regad, M.Perez, P.M\"akij\"arvi, W-D.Klotz\\
25: ESRF, 6 rue Jules Horowitz, Grenoble 38043, FRANCE}
26:
27:
28: \maketitle
29:
30: \begin{abstract}
The ESRF control system is in the process of being modernised. The present control
system is based on VME, 10 Mbit/s Ethernet, OS-9, Solaris, HP-UX, NFS/RPC,
Motif and C.
The new control system will be based on CompactPCI, 100 Mbit/s Ethernet,
Linux, Windows, Solaris, CORBA/IIOP, C++, Java and Python. The main frontend
operating system will be GNU/Linux running on Intel/x86 and Motorola/68k.
Linux will also be used on handheld devices for mobile control.
38: This poster describes how GNU/Linux is being used to modernise the control
39: system and what problems have been encountered so far\footnote{work supported by J.Klora, J.M.Chaize and P.Fajardo}.
40:
41: \end{abstract}
42:
43: \section{Introduction}
The ESRF control systems control 3 accelerators and 32 beamlines.
They have been built using the same technology and are completely compatible.
They were built ten years ago using what was then state-of-the-art technology:
VME, 10 Mbit/s Ethernet, OS-9, Solaris, HP-UX, NFS/RPC, Motif
and C. Most of these technologies have not evolved over the last few
years.
In our search for better tools, support, ease of programming, and overall
stability and quality we have put all our old technologies to the test.
Our main criterion was which technology or tool would allow us to offer
users a better control system.
A better control system means one which offers more features to users
without losing any of the present good features.
57: %Some of the features will be immediately visible to users whilst others
58: %will be indirect like an easier programming environment for the developers,
59: %better support, more stability, and prolonged life of existing park.
60:
The result of this technology survey was 100 Mbit/s Ethernet, VME (for the
existing hardware), CompactPCI (cPCI) and PCI for new hardware,
Linux as the main frontend operating system, Windows for commercially supported
hardware and software, Solaris and GNU/Linux as the main desktop operating systems,
CORBA/IIOP as the new network protocol, and C++, Java and Python as the main programming
languages.
67:
68: %This paper will describe why we have chosen GNU/Linux and what we have done
69: %so far with it. It will also go into the problems we have experienced
70: %and take a look at the future.
71:
\section{Why GNU/Linux?}
73: %Over the last few years there has been a phenomenal growth in the so-called
74: %{\em sourceware} movement. Sourceware means software for which the source
75: %code is freely available. It often means that the software itself is also
76: %free. It includes the well-known GNU software but also software under
77: %other licences. GNU/Linux is perhaps the most popular operating system of
78: %the sourceware projects. It consists of the GNU/Linux kernel and a huge
79: %suite of software most of it under the GNU licence (GPL) hence the collective
80: %name GNU/Linux.
81:
What does GNU/Linux offer that other systems do not?
83: \begin{enumerate}
84: \item
FREEDOM! Freedom in this context means access to all the source code
so that it can be compiled, understood and improved.
An additional freedom is the freedom from supplier pressure and fees.
88: \item
89: Technology we know well (Unix) and
90: which is conceptually simple to understand and program.
91: This is an important feature in our case because we need to develop device
92: drivers. In addition to being easy to understand it is well-documented.
93: \item
A rich set of software packages. Almost all known
sourceware packages have been developed on or ported to GNU/Linux.
\item
It is easy to manage in a network environment and has
excellent support for all network protocols. Because our control
systems are distributed over the network this played a strong role in
our choice of GNU/Linux.
101: \end{enumerate}
102:
103: \section{Linux/m68k + VME}
The ESRF has over 200 VME crates installed.
This represents an investment of millions of Euros as well as many
tens of years of hardware and software development work.
Any modernization project must take this investment into account.
The modernization foresees two ways to do this: using the Motorola CPUs (MVME-162)
to run GNU/Linux directly, or replacing the CPU with a bus extender which allows
the VME bus to be controlled from a PC running Linux/x86. This section describes
the first option. The bus extender solution is discussed in the next section.
112:
For Linux/m68k we use the Debian distribution 2.1.
It can be downloaded from the Debian website\footnote{http://www.debian.org} and
is available in source code and binary format.
The standard kernel (we are running kernel 2.2.10) includes support for the
Motorola CPU port (originally done by Richard Hirst\footnote{rhirst@sleepie.demon.co.uk}).
We run all our Linux/m68k crates without a hard disk (diskless). The root file system
is NFS-mounted read-only. In addition there is a RAM disk for /etc, /dev, /var and /tmp.
This means crates can be switched on and off without risk of losing data, and we never
have to run fsck.
We have rewritten the device drivers for all our main VME cards.
For many of them we subcontracted the writing of the first version
to Richard Hirst (later Linuxcare).
Maintenance and further development are now done in house.
Client programs communicate with the hardware via the network using TACO/TANGO
device servers (see the section on device servers).
We use the GNU tools for compiling and debugging (g++ and gdb).
129:
Our experience with Linux/m68k compared to OS-9 (the commercial operating system
we were using previously) is that it is at least as stable, if not more so,
that the TCP/IP implementation is more efficient and robust, and that it is
easy to add new features to our software using standard techniques like
multithreading.
135:
136: \section{Linux/x86 + bus extenders}
137:
The GNU/Linux-based modernization of instrument control at the ESRF
supports two main hardware platforms: PCI/cPCI and VME.
The former provides access to the most recent interface boards, which are developed
for a highly demanding market and hence offer better performance/price ratios.
The latter is needed for a gradual transition from the current VME
instrumentation to PCI technology.
VME boards can be controlled from a Motorola MVME CPU or from a PC
through a PCI/VME bus extender, both running GNU/Linux as the OS.
146:
As mentioned above, the modernization project also includes the cPCI
platform, which, in combination with PCI/cPCI bus extenders, notably increases
the flexibility of the hardware configuration.
Because the PCI specification configures resources dynamically,
identical boards can only be distinguished by their
slot position on the bus.
Most of the drivers available for PCI boards enumerate the boards in the
same order in which they are found by the BIOS or OS at boot time.
This means that the identification number of a board will change when a
similar board located before it in the PCI bus structure is removed.
Moreover, the same logic applies to slave cPCI crates: their
bus numbers will change when another crate is removed.
To solve this problem we distinguish between physical numbers,
which are used by the drivers, and logical numbers, which are used by applications.
The mechanism responsible for this mapping keeps track of the boards
present in the system and detects any change in the bus configuration.
Any non-trivial change is reported to the user, which prevents boards from
being addressed wrongly.
The position of each board is presented to the user in terms of chassis
and slot, which are translated to PCI device numbers by hardware-specific
mappings.
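
The distinction between physical and logical numbers can be illustrated with
a minimal Python sketch. It is not the ESRF implementation: the class
{\tt BoardMap}, its methods and the example chassis and slot values are
assumptions made purely for illustration.

```python
# Illustrative sketch only: names and positions are hypothetical.

class BoardMap:
    """Keep logical board numbers stable when physical enumeration changes."""

    def __init__(self, scan):
        # scan: dict mapping (chassis, slot) -> physical index found at boot
        self.logical = {}          # logical number -> (chassis, slot)
        for number, position in enumerate(sorted(scan)):
            self.logical[number] = position
        self.physical = dict(scan)

    def rescan(self, scan):
        """Record a new bus scan and return the positions that disappeared."""
        missing = sorted(set(self.physical) - set(scan))
        self.physical = dict(scan)
        return missing

    def resolve(self, number):
        """Translate a logical number into the current physical index."""
        position = self.logical[number]
        if position not in self.physical:
            raise LookupError(
                "board at chassis {}, slot {} is absent".format(*position))
        return self.physical[position]


# Boot-time scan: three identical boards, enumerated in discovery order.
boards = BoardMap({(0, 2): 0, (0, 5): 1, (1, 3): 2})

# The board in chassis 0, slot 5 is removed; the driver renumbers the rest.
missing = boards.rescan({(0, 2): 0, (1, 3): 1})

print(missing)            # positions to report to the user
print(boards.resolve(2))  # logical board 2 is still the board in chassis 1
```

In this sketch an application keeps using logical board 2 after the removal,
while the physical index used by the driver has changed underneath it.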
168:
Setups based on the PCI architecture have been built using both a
desktop PC and an industrial PC that implements the PICMG standard.
Remote VME crates are controlled through SBS Technologies PCI/VME bus
extenders, and cPCI crates are directly linked to the main PCI bus by
means of National Instruments MXI-3 PCI/cPCI/PXI bus adapters.
These adapters greatly increase the amount of hardware that
can be managed by a single host. Furthermore, an MVME CPU and a PC, both running GNU/Linux,
can independently control boards in the same crate, providing even
more possibilities for the transition from VME to PCI.
178:
179: %\begin{figure*}[t]
180: %\centering
181: %\begin{tabular}{|c|c|c|c|} \hline
182: %{\em card} & {\em bus} & {\em supplier} & {\em description} \\ \hline
183: %CC133 & VME & Compcontrol & 12 channel incremental encoder \\ \hline
184: %ICV196 & VME & ADAS & 96 channel digital input/output \\ \hline
185: %ICV150 & VME & ADAS & 32 channel analog input \\ \hline
186: %ICV712 & VME & ADAS & 16 channel ananlog output \\ \hline
187: %VPAP & VME & ESRF & 8 channel stepper motor controller \\ \hline
188: %VCT6 & VME & ESRF & 6 channel counter timer \\ \hline
189: %PCI/PXI-7344 & PCI/cPCI & National Instruments & 4 channel fast motor controller \\\hline
190: %RocketPort & PCI & Comtrol & 16 channel serial line controller \\ \hline
191: %\end{tabular}
192: %\caption{Table of hardware supported under Linux} \label{l2ea4-f2}
193: %\end{figure*}
194:
195:
196: \section{Device Drivers}
In order to use the same device driver code on both systems (Linux/m68k on
the MVME CPU and Linux/x86 with a bus extender), an interface
layer was implemented to manage I/O addresses and IRQs, which are taken from the
module parameters at load time.
In the bus coupler configuration, this interface performs the necessary PCI to
VME address mappings during the initialization of the VME board drivers,
allowing boards on different (remote) VME buses to be controlled from
the same host as if they were local.
This interface also automatically exports the state and configuration of
each board to the {\tt /proc} virtual file system.
206:
In experiment automation it is often very useful to record the values of
several quantities when an event occurs.
Such an event can be generated by a hardware signal or by a software condition.
To provide this functionality a buffering mechanism was developed in the
kernel. It is named {\em hook}, after a similar facility developed at the ESRF for OS-9
drivers, which hooks data on hardware interrupts.
The values to be written to the buffer are configurable at run time
by specifying the driver name, the board and the channel to be read.
Each driver that can export its channels registers with the hook
module during initialization.
When an application wants to read one of these channels, the hook asks
the driver which actions are needed.
If the actions are just simple register read/write operations
(one single read is very common), they are returned in the form of a ``program''.
Otherwise, if the process is more complicated, a pointer to a function is saved.
One source of hook events is a timer provided by the hook itself,
which attaches to the system software timer, and hence has a minimum
repetition period of 10 ms on standard installations.
Higher rates can be achieved with hardware interrupts generated by
counter/timer boards like the ESRF VCT6.
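
The registration scheme described above can be sketched in user-space Python.
The real hook is a kernel module, so this is only a model of the idea: the
class {\tt Hook}, the tuple format of the register ``program'' and the
simulated register addresses are all assumptions, not the actual driver
interface.

```python
# Illustrative sketch only: the real hook lives in the Linux kernel; every
# name here is hypothetical and registers are simulated with a dict.

REGISTERS = {0x10: 1234, 0x14: 7}      # fake board registers


def run_program(program):
    """Execute a list of simple register operations, return the last read."""
    value = None
    for op, address, *data in program:
        if op == "read":
            value = REGISTERS[address]
        elif op == "write":
            REGISTERS[address] = data[0]
    return value


class Hook:
    def __init__(self):
        self.channels = {}     # (driver, board, channel) -> action
        self.selected = []     # actions to run on each event
        self.buffer = []       # one row of values per event

    def register(self, driver, board, channel, action):
        """Called by a driver at initialisation for each exported channel."""
        self.channels[(driver, board, channel)] = action

    def select(self, driver, board, channel):
        """Run-time configuration: add a channel to the sampled set."""
        self.selected.append(self.channels[(driver, board, channel)])

    def event(self):
        """Called on a timer tick or a hardware interrupt."""
        row = []
        for action in self.selected:
            if callable(action):
                row.append(action())             # complicated case: function
            else:
                row.append(run_program(action))  # simple case: register ops
        self.buffer.append(row)


hook = Hook()
hook.register("counter", 0, 1, [("read", 0x10)])         # single register read
hook.register("adc", 0, 3, lambda: REGISTERS[0x14] * 2)  # needs a function
hook.select("counter", 0, 1)
hook.select("adc", 0, 3)
hook.event()
print(hook.buffer)
```

Keeping the common case as a list of register operations means the event
handler can execute it directly, without calling back into the driver.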
227:
Not all boards allow their registers to be read quickly, and
the system should not wait in an interrupt handler (actually a bottom half handler).
This problem can be overcome by writing to the buffer asynchronously, as long as the
write completes before the next event arrives.
Finally, the hook buffer can be filled in linear mode, which stops the acquisition
when the end of the buffer is reached, or in circular mode for continuous measurement.
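
The difference between the two filling modes can be shown with a short Python
sketch. The class {\tt HookBuffer} and its methods are hypothetical names
chosen for illustration; the kernel implementation is not shown in this paper.

```python
# Illustrative sketch only: class and method names are hypothetical.

class HookBuffer:
    def __init__(self, size, circular=False):
        self.size = size
        self.circular = circular
        self.data = []
        self.start = 0          # index of the oldest sample (circular mode)

    def write(self, sample):
        """Store one sample; return False if the acquisition has stopped."""
        if len(self.data) < self.size:
            self.data.append(sample)
            return True
        if not self.circular:
            return False        # linear mode: buffer full, stop acquiring
        self.data[self.start] = sample      # circular mode: overwrite oldest
        self.start = (self.start + 1) % self.size
        return True

    def read(self):
        """Return the samples in acquisition order, oldest first."""
        return self.data[self.start:] + self.data[:self.start]


linear = HookBuffer(3)
ring = HookBuffer(3, circular=True)
for sample in range(5):
    linear.write(sample)
    ring.write(sample)

print(linear.read())   # the first three samples are kept
print(ring.read())     # the last three samples are kept
```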
234:
235: \section{Device Servers}
236:
237: The device drivers are the first layer in our control system architecture.
238: The second layer is called the {\em device server} layer and provides
239: transparent network access.
240: This means hardware can be shared transparently between geographically
241: separated parts of the accelerator and/or beamline, thereby adding a
242: layer of flexibility which would otherwise not be available (except by
243: recabling).
The device servers at the ESRF come in two flavours. The original flavour,
called TACO\footnote{http://www.esrf.fr/taco}, uses ONC/RPC as its network
protocol, which is lightweight. It has the advantage that
ONC/RPC runs wherever NFS runs.
The second flavour, called TANGO\footnote{http://www.esrf.fr/tango},
is based on CORBA and uses the IIOP protocol for the network layer.
CORBA is slightly more heavyweight than ONC/RPC but offers
more high-level services.
252: Both flavours of device servers offer synchronous, asynchronous and
253: event-driven communication paradigms and a database for permanent storage.
254: A large number (hundreds) of device servers have been written at
255: the ESRF and other sites (FRM II, Lure, HartRAO). Refer to the
256: websites for more information.
257:
258: \section{Administration}
259:
The challenge we are facing with the modernization of the control system is
not only to provide the best combination of
operating system and hardware, but also to administer
the systems installed all over the site.
Administration means two important things:
\begin{itemize}
\item
quick recovery of a system after a failure;
\item
deployment of a new release of the system.
\end{itemize}
271:
Our present control system is based on diskless VME/OS-9 systems.
These OS-9 systems are served by BOOTP servers, which give them an
identity and then download the kernel onto the VME crate at startup.
The VME crates then all mount the same remote file system using NFS.
Thus, for example, if a CPU fails we just have to replace it and press
the ON button.
This type of intervention takes less than 15 minutes.
279:
With the modernized control system, BOOTP is
replaced by DHCP, which is based on BOOTP but has more powerful features.
In particular, DHCP allows dynamic allocation of network addresses.
283:
The CompactPCI crates that are being installed are equipped with a hard disk.
It is inconceivable that the numerous systems that will be running in the
future should each have their own configuration.
In that case it would be almost impossible to administer the systems
and to provide a good service to their users.
289:
The decision was taken to build a base system and to duplicate it as many times
as needed.
This technique is also called ``cloning''.
The ESRF has bought a commercial product called REMBO (for REMote BOot).
This tool uses DHCP to give a network identity
to a client that broadcasts a DHCP request.
Moreover this tool can be used to make a base image of any system
(Linux or Windows) and to upload that image to a computer on which a
REMBO server has been installed.
REMBO is a cloning tool and a backup utility as well. The administration of the CompactPCI
crates can be improved by using the ability of REMBO to make differential images.
We use this feature to manage different hardware configurations starting from
a base system.
For example, the crates dedicated to vacuum and the crates dedicated to the
front end system have the same base hardware (same crate, same CPU, same
network card) and therefore will have the same base image (same
operating system, network drivers, etc.).
But since these crates do not have the same I/O boards installed,
a differential image will be made for each type of hardware configuration.
In case of failure (crash of the hard disk) the tool will be able to
rebuild the whole system from the base image and the differential one.
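
The principle of rebuilding a system from a base image plus a differential
image can be modelled in a few lines of Python. This sketch does not use or
describe the REMBO interface: an image is simply modelled as a dictionary of
files, and the function and file names are assumptions made for illustration.

```python
# Illustrative sketch only: this does not use or describe the REMBO API.
# An "image" is modelled as a dict mapping file path -> content.

def make_differential(base, configured):
    """Return what must change to turn the base image into `configured`."""
    changed = {path: data for path, data in configured.items()
               if base.get(path) != data}
    removed = sorted(path for path in base if path not in configured)
    return {"changed": changed, "removed": removed}


def rebuild(base, differential):
    """Restore a full system from the base image plus one differential."""
    system = dict(base)
    for path in differential["removed"]:
        del system[path]
    system.update(differential["changed"])
    return system


# Hypothetical file names, for illustration only.
base = {"/boot/kernel": "kernel", "/etc/network": "dhcp"}
vacuum = dict(base)
vacuum["/lib/modules/io_board_a.o"] = "driver A"   # extra I/O board driver

diff = make_differential(base, vacuum)
restored = rebuild(base, diff)
print(restored == vacuum)
```

Only the differential has to be stored per hardware configuration, while the
base image is shared by all crates of the same type.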
311:
312: \section{Handheld devices}
313:
We have long had the need for handheld portable controllers for motors
and other hardware which needs local tuning far from the control room.
Enter the new generation of handheld devices, wireless Ethernet and
Linux. We have used the iPAQ from Compaq running GNU/Linux and an X11-based
client (LabVIEW for example) to control motors remotely
using a simple one-click interface.
Choosing GNU/Linux makes networking and graphical
display much easier than if we were using Windows CE, for example.
322:
323: \section{Realtime}
{\em ``Linux isn't realtime''.}
This is true, and therefore we do not claim to do any realtime with GNU/Linux.
However we have found that our applications need little or no
realtime. Most realtime needs are delegated to hardware or DSPs.
Where we do need soft realtime (i.e. a 99\% guarantee) we use interrupt
routines in device drivers. We measured an interrupt response time
of $<50\,\mu$s for Linux on 68k and x86.
Using a driver interrupt routine we achieve a
soft realtime response of $500\,\mu$s for a function generator
(provided we do not recompile the kernel at the same time!).
For the rest of our applications {\em ``as-fast-as-possible''} is good enough.
And for that GNU/Linux on commodity hardware is surprisingly good.
336:
337: %\section{Online Data Reduction}
338: %Because GNU/Linux comes with a complete set of applications to build websites,
339: %databases, scientific programs, you-name-it, it is very easy to use GNU/Linux
340: %not online to control accelerators and experiments but also to do data analysis
341: %online and offline.
342: %We are using our own data reduction programs to do online data binning,
343: %smoothing
344: %and peak fitting and then plotting in Matlab. The programs communicate via
345: %shared memory or the network.
346: %Paramers and results are stored in a MySQL database for archiving.
347: %The advantage of GNU/Linux is the rich set of tools it offers both free
348: %and commercial and the fact that it runs on low-cost commodity hardware.
349: %GNU/Linux is also used for offline data analysis
350:
351: \section{Problems}
352: GNU/Linux is not without problems.
The main problems we have identified so far are:
\begin{itemize}
\item
the standard GNU/Linux distributions are not easily adapted to running on
diskless systems;
\item
most commercial hardware comes with Windows drivers but no GNU/Linux drivers.
\end{itemize}
361:
362: \section{Conclusion}
There is a viable alternative to Windows 95/98/ME/NT/2000/CE/XP for
building control systems and it is called GNU/Linux!
Linux is sufficiently mature for the task and even offers some advantages:
it is easier to program, is better adapted to distributed control
and is free of commercial pressure.
368:
369: %\begin{thebibliography}{9}
370: %
371: %\bibitem{linuxjournal}
372: %A.G\"otz. P.M\"akij\"arvi, B.Regad, M.Perez and P.Mangiagalli, ``Embedding Linux
373: %to control Accelerators and Experiments'', Linux Journal, October 1999
374: %
375: %\end{thebibliography}
376:
377: \end{document}
378: