0111:cs0111033/linux.tex

1: \documentclass[acus]{JAC2000}

2:

3: %%

4: %%  Use \documentclass[boxit]{JAC2000}

5: %%  To draw a frame with the correct margins on the output.

6: %%

7: %%  Use \documentclass[acus]{JAC2000}

8: %%  For US letter paper layout

9: %%

10:

11: \usepackage{graphicx}

12:

13:

14: %%

15: %%   VARIABLE HEIGHT FOR THE TITLE BOX (default 35mm)

16: %%

17:

18: \setlength{\titleblockheight}{45mm}

19:

20: \begin{document}

21: \title{\flushright{WEAP023}\\[15pt] \centering Modernising the ESRF control system

22: with GNU/Linux}

23:

24: \author{A.G\"otz, A.Homs, B.Regad, M.Perez, P.M\"akij\"arvi, W-D.Klotz\\

25: ESRF, 6 rue Jules Horowitz, Grenoble 38043, FRANCE}

26:

27:

28: \maketitle

29:

30: \begin{abstract}

The ESRF control system is in the process of being modernised. The present control
system is based on VME, 10 Mbit/s Ethernet, OS-9, Solaris, HP-UX, NFS/RPC,
Motif and C.
The new control system will be based on CompactPCI, 100 Mbit/s Ethernet,
Linux, Windows, Solaris, CORBA/IIOP, C++, Java and Python. The main frontend
operating system will be GNU/Linux running on Intel/x86 and Motorola/68k.
Linux will also be used on handheld devices for mobile control.

38: This poster describes how GNU/Linux is being used to modernise the control

39: system and what problems have been encountered so far\footnote{work supported by J.Klora, J.M.Chaize and P.Fajardo}.

40:

41: \end{abstract}

42:

43: \section{Introduction}

The ESRF control systems control 3 accelerators and 32 beamlines.
They are built with the same technology and are completely compatible with one another.
They were built ten years ago using what was then state-of-the-art technology:
VME, 10 Mbit/s Ethernet, OS-9, Solaris, HP-UX, NFS/RPC, Motif and C.
Most of these technologies have not evolved over the last few years.
In our search for better tools, support, ease of programming, and overall
stability and quality we have put all our old technologies to the test.
Our main criterion was which technology or tool would allow us to offer
users a better control system, meaning one which offers more features
without losing any of the present good ones.

57: %Some of the features will be immediately visible to users whilst others

58: %will be indirect like an easier programming environment for the developers,

59: %better support, more stability, and prolonged life of existing park.

60:

The result of this technology survey was the following selection:
100 Mbit/s Ethernet; VME for the existing hardware; CompactPCI (cPCI) and PCI
for new hardware; GNU/Linux as the main frontend operating system; Windows for
commercially supported hardware and software; Solaris and GNU/Linux as the main
desktop operating systems; CORBA/IIOP as the new network protocol; and C++,
Java and Python as the main programming languages.

67:

68: %This paper will describe why we have chosen GNU/Linux and what we have done

69: %so far with it. It will also go into the problems we have experienced

70: %and take a look at the future.

71:

72: \section{Why GNU/Linux ?}

73: %Over the last few years there has been a phenomenal growth in the so-called

74: %{\em sourceware} movement. Sourceware means software for which the source

75: %code is freely available. It often means that the software itself is also

76: %free. It includes the well-known GNU software but also software under

77: %other licences. GNU/Linux is perhaps the most popular operating system of

78: %the sourceware projects. It consists of the GNU/Linux kernel and a huge

79: %suite of software most of it under the GNU licence (GPL) hence the collective

80: %name GNU/Linux.

81:

What does GNU/Linux offer that other systems do not?

83: \begin{enumerate}

84: \item

85: FREEDOM ! Freedom in this context means access to all the source code

86: so that it can be compiled, understood and improved.

An additional freedom is the freedom from supplier pressure and fees.

88: \item

89: Technology we know well (Unix) and

90: which is conceptually simple to understand and program.

91: This is an important feature in our case because we need to develop device

92: drivers. In addition to being easy to understand it is well-documented.

93: \item

A rich set of software packages. Almost all known
sourceware packages have been developed on or ported to GNU/Linux.

96: \item

It is easy to manage in a network environment and has
excellent support for the standard network protocols. Because our control
systems are distributed over the network, this played a strong role in
our choice of GNU/Linux.

101: \end{enumerate}

102:

103: \section{Linux/m68k + VME}

104: The ESRF has over 200 VME crates installed.

This represents an investment of millions of euros as well as many
tens of years of hardware and software development work.
Any modernization project must take this investment into account.
The modernization foresees two ways to do this: using the Motorola CPUs (MVME-162)
to run GNU/Linux directly, or replacing the CPU with a bus extender which allows
the VME bus to be controlled from a PC running Linux/x86. This section describes
the first option. The bus extender solution is discussed in the next section.

112:

For Linux/m68k we use the Debian 2.1 distribution.
It can be downloaded from the Debian website\footnote{http://www.debian.org} and
is available in source code and binary format.
The standard kernel (we are running kernel 2.2.10) includes support for the
Motorola CPU port (originally done by Richard Hirst\footnote{rhirst@sleepie.demon.co.uk}).
We run all our Linux/m68k crates without a hard disk (diskless). The root file system
is NFS-mounted read-only. In addition there is a RAM disk for /etc, /dev, /var and /tmp.
This means crates can be switched on and off without risk of losing data and
without having to run {\tt fsck}.

122: We have rewritten device drivers for all our main VME cards.

For many of them we subcontracted the writing of the first driver version
to Richard Hirst (later Linuxcare).
Maintenance and further development are now done in house.
Client programs communicate with the hardware via the network using TACO/TANGO
device servers (see the section on device servers below).

128: We use the GNU tools for compiling and debugging (g++ and gdb).

129:

Our experience with Linux/m68k compared to OS-9 (the commercial operating system
we were using previously) is that it is at least as stable, if not more so,
that the TCP/IP implementation is more efficient and robust, and that it is
easy to add new features to our software using standard techniques like
multithreading.

135:

136: \section{Linux/x86 + bus extenders}

137:

The GNU/Linux-based modernization of instrument control at the ESRF
supports two main hardware platforms: PCI/cPCI and VME.
The former provides access to the most recent interface boards, which are
developed for a highly demanding market and hence offer better performance/price ratios.
The latter is needed for a gradual transition from the current VME
instrumentation to PCI technology.
VME boards can be controlled from a Motorola MVME CPU or from a PC
through a PCI/VME bus extender, both running GNU/Linux.

146:

As mentioned above, the modernization project also includes the cPCI
platform which, in combination with PCI/cPCI bus extenders, notably increases
the flexibility of the hardware configuration.
Because the PCI specification configures resources dynamically,
identical boards can only be distinguished by their slot position on the bus.
Most of the drivers available for PCI boards enumerate the boards in the
order in which they are found by the BIOS or OS at boot time.
This means that the identification number of a board changes when a
similar board located before it in the PCI bus structure is removed.
Likewise, the bus numbers of slave cPCI crates change when another
crate is removed.
To solve this problem we distinguish between physical numbers,
which are used by the drivers, and logical numbers, which are used by applications.
The mechanism that performs this mapping keeps track of the boards
present in the system and detects any change in the bus configuration.
The user is informed of any non-trivial change, which avoids addressing
the wrong board.
Board positions are presented to the user in terms of chassis
and slot, which are translated to PCI device numbers by hardware-specific
mappings.
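
The distinction between physical and logical board numbers can be illustrated
with a minimal Python sketch. It is not the ESRF implementation: the names
{\tt Board}, {\tt enumerate\_physical} and {\tt LogicalMap} are hypothetical,
and the bus scan is simulated by sorting boards on chassis and slot.

\begin{verbatim}
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    """Physical position of a board: chassis and slot (hypothetical model)."""
    chassis: int
    slot: int


def enumerate_physical(boards):
    """Number boards in scan order, as a typical driver would (0, 1, 2, ...)."""
    ordered = sorted(boards, key=lambda b: (b.chassis, b.slot))
    return {board: number for number, board in enumerate(ordered)}


class LogicalMap:
    """Give applications stable logical numbers while physical numbers shift."""

    def __init__(self, boards):
        # Logical numbers are fixed once, from the reference configuration.
        self.logical_of = enumerate_physical(boards)

    def rescan(self, boards):
        """Return (logical -> physical map, missing logical numbers, new boards)."""
        physical_of = enumerate_physical(boards)
        mapping = {self.logical_of[b]: p
                   for b, p in physical_of.items() if b in self.logical_of}
        missing = sorted(n for b, n in self.logical_of.items()
                         if b not in physical_of)
        unknown = [b for b in physical_of if b not in self.logical_of]
        return mapping, missing, unknown


if __name__ == "__main__":
    reference = [Board(0, 2), Board(0, 5), Board(1, 3)]
    logical = LogicalMap(reference)
    # The board in chassis 0, slot 2 is removed: physical numbers shift down.
    print(logical.rescan([Board(0, 5), Board(1, 3)]))
```
\end{verbatim}

In this sketch an application that addresses logical board 1 keeps reaching
the same hardware after a removal, and the list of missing logical numbers is
what would be reported to the user.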

168:

Setups based on the PCI architecture have been built using both a
desktop PC and an industrial PC that implements the PICMG standard.
Remote VME crates are controlled through SBS Technologies PCI/VME bus
extenders, and cPCI crates are directly linked to the main PCI bus by
means of National Instruments MXI-3 PCI/cPCI/PXI bus adapters.
These adapters greatly expand the amount of hardware that
can be managed by a single host. Furthermore, an MVME CPU and a PC, both
running GNU/Linux, can independently control boards in the same crate,
providing even more possibilities for the VME to PCI transition.

178:

179: %\begin{figure*}[t]

180: %\centering

181: %\begin{tabular}{|c|c|c|c|} \hline

182: %{\em card} & {\em bus} & {\em supplier} & {\em description} \\ \hline

183: %CC133 & VME & Compcontrol & 12 channel incremental encoder \\ \hline

184: %ICV196 & VME & ADAS & 96 channel digital input/output \\ \hline

185: %ICV150 & VME & ADAS & 32 channel analog input \\ \hline

186: %ICV712 & VME & ADAS & 16 channel ananlog output \\ \hline

187: %VPAP & VME & ESRF & 8 channel stepper motor controller \\ \hline

188: %VCT6 & VME & ESRF & 6 channel counter timer \\ \hline

189: %PCI/PXI-7344 & PCI/cPCI & National Instruments & 4 channel fast motor controller \\\hline

190: %RocketPort & PCI & Comtrol & 16 channel serial line controller \\ \hline

191: %\end{tabular}

192: %\caption{Table of hardware supported under Linux} \label{l2ea4-f2}

193: %\end{figure*}

194:

195:

196: \section{Device Drivers}

In order to use the same device driver code on both systems (Linux/m68k on
the VME CPU and Linux/x86 with a bus extender), an interface
layer was implemented to manage I/O addresses and IRQs, which are taken from the
module parameters at load time.
In the bus extender configuration, this interface performs the necessary PCI to
VME address mappings during the initialization of the VME board drivers,
allowing boards on different (remote) VME buses to be controlled from
the same host as if they were local.
This interface also automatically exports the state and configuration of
each board to the {\tt /proc} virtual file system.

206:

In experiment automation it is often very useful to record the values of
several quantities when an event occurs.
Such an event can be generated by a hardware signal or by a software condition.
To provide this functionality a buffering mechanism was developed in the
kernel. It is named {\em hook}, after a similar facility developed at the ESRF
for OS-9 drivers which hooks data on hardware interrupts.
The values to be written to the buffer are configurable at run time
by specifying the driver name, the board and the channel to be read.
Each driver that can export its channels registers with the hook
module during initialization.
When an application wants to read one of the driver's channels, the hook asks
the driver which actions are needed to do so.
If the actions are just simple register read/write operations
(a single read is very common), they are returned in the form of a ``program''.
Otherwise, if the process is more complicated, a pointer to a function is saved.
One source of hook events is a timer provided by the hook itself,
which attaches to the system software timer and hence has a minimum
repetition period of 10 ms on standard installations.
Higher rates can be achieved with hardware interrupts generated by
counter/timer boards like the ESRF VCT6.
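
The registration scheme described above can be sketched in user-space Python.
This is only an illustration of the idea, not the kernel module: the class
{\tt Hook}, the tuple format of a ``program'', the driver names and the
register addresses are all assumptions made for the example.

\begin{verbatim}
```python
def run_program(program, registers):
    """Execute a list of simple register steps and return the last value read.

    Each step is ("read", address) or ("write", address, value); this tuple
    format is an assumption made for this sketch.
    """
    value = None
    for step in program:
        if step[0] == "read":
            value = registers[step[1]]
        elif step[0] == "write":
            registers[step[1]] = step[2]
        else:
            raise ValueError("unknown step: %r" % (step,))
    return value


class Hook:
    """Record one row of channel values every time an event occurs."""

    def __init__(self):
        self.drivers = {}   # driver name -> function describing how to read
        self.actions = []   # one action per configured channel
        self.buffer = []    # recorded rows

    def register(self, name, describe_read):
        """Called by a driver at initialization to export its channels."""
        self.drivers[name] = describe_read

    def add_channel(self, driver, board, channel):
        """Ask the driver for the action needed to read one channel."""
        self.actions.append(self.drivers[driver](board, channel))

    def on_event(self, registers):
        """Run every configured action and append the resulting row."""
        row = []
        for action in self.actions:
            if callable(action):
                row.append(action())   # complicated case: saved function
            else:
                row.append(run_program(action, registers))  # simple case
        self.buffer.append(tuple(row))


if __name__ == "__main__":
    registers = {0x104: 42}            # simulated hardware registers
    hook = Hook()
    hook.register("counter", lambda b, c: [("read", b * 0x100 + c)])
    hook.register("slow", lambda b, c: (lambda: b * 10 + c))
    hook.add_channel("counter", 1, 4)
    hook.add_channel("slow", 2, 3)
    hook.on_event(registers)
    print(hook.buffer)
```
\end{verbatim}

Returning a short list of register operations instead of a function lets the
event handler perform the common single-read case without calling back into
the driver.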

227:

Not all boards allow their registers to be read quickly, and
the system should not wait in an interrupt handler (actually a bottom-half handler).
This problem can be overcome by writing to the buffer asynchronously, as long as
the write completes before the next event arrives.
Finally, the hook buffer can be filled in linear mode, which stops acquisition
when the end of the buffer is reached, or in circular mode for continuous measurement.
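
The difference between the two filling modes can be shown with a small Python
sketch. The class name {\tt HookBuffer} and its methods are hypothetical and
only model the behaviour described above, not the kernel data structure.

\begin{verbatim}
```python
class HookBuffer:
    """Fixed-size buffer filled either in linear or in circular mode."""

    def __init__(self, size, circular=False):
        self.size = size
        self.circular = circular
        self.rows = [None] * size
        self.count = 0         # number of rows accepted so far
        self.stopped = False   # set when a linear buffer is full

    def write(self, row):
        """Store one row; return False if acquisition has stopped."""
        if self.stopped:
            return False
        self.rows[self.count % self.size] = row
        self.count += 1
        if not self.circular and self.count == self.size:
            self.stopped = True   # linear mode: stop at the end of the buffer
        return True

    def read_all(self):
        """Return the stored rows, oldest first."""
        if self.count <= self.size:
            return self.rows[:self.count]
        start = self.count % self.size   # position of the oldest row
        return self.rows[start:] + self.rows[:start]


if __name__ == "__main__":
    linear = HookBuffer(3)
    ring = HookBuffer(3, circular=True)
    for value in range(5):
        linear.write(value)
        ring.write(value)
    print(linear.read_all(), ring.read_all())
```
\end{verbatim}

In linear mode the first rows are kept and later events are refused, whereas
in circular mode the oldest rows are overwritten so that the most recent
measurements are always available.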

234:

235: \section{Device Servers}

236:

237: The device drivers are the first layer in our control system architecture.

238: The second layer is called the {\em device server} layer and provides

239: transparent network access.

240: This means hardware can be shared transparently between geographically

241: separated parts of the accelerator and/or beamline, thereby adding a

242: layer of flexibility which would otherwise not be available (except by

243: recabling).

The device servers at the ESRF come in two flavours. The original flavour,
called TACO\footnote{http://www.esrf.fr/taco}, uses ONC/RPC as its network
protocol, which is lightweight and has the advantage that it
runs everywhere NFS runs.
The second flavour, called TANGO\footnote{http://www.esrf.fr/tango},
is based on CORBA and uses the IIOP protocol for the network layer.
CORBA is slightly more heavyweight than ONC/RPC but offers
more high-level services.
Both flavours of device servers offer synchronous, asynchronous and
event-driven communication paradigms and a database for permanent storage.
A large number (hundreds) of device servers have been written at
the ESRF and other sites (FRM II, Lure, HartRAO). Refer to the
websites for more information.

257:

258: \section{Administration}

259:

The challenge we are facing with the modernization of the control system is
not only to provide the best combination of
operating system and hardware, but also to be able to administer
the systems installed all over the site.
Administration means two important things:
\begin{itemize}
\item
quick recovery of a system after a failure;
\item
deployment of new releases of the system.
\end{itemize}

271:

Our present control system is based on diskless VME/OS-9 systems.
These OS-9 systems are served by BOOTP servers, which give them an
identity and then download the kernel onto the VME crate at startup.
The VME crates then all mount the same remote file system using NFS.
Thus, for example, if a CPU fails we just have to replace it and press
the ON button.
This type of intervention takes less than 15 minutes.

279:

With the modernized control system, BOOTP is
replaced by DHCP, which is based on BOOTP but has more powerful features.
In particular, DHCP allows for dynamic allocation of network addresses.

283:

The CompactPCI crates that are being installed are equipped with a hard disk.
It is inconceivable that each of the numerous systems that will be running in the
future should have its own configuration.
In that case it would be almost impossible to administer the systems
and to provide a good service to their users.

289:

The decision was taken to build a base system and to duplicate it as many times
as needed.
This technique is also called ``cloning''.
The ESRF has bought a commercial product called REMBO (for REMote BOot).
This tool uses DHCP to give a network identity
to a client that broadcasts a DHCP request.
Moreover, it can be used to make a base image of any system
(Linux or Windows) and to upload that image to a computer on which a
REMBO server has been installed.
REMBO is a cloning tool and a backup utility as well. The administration of the
CompactPCI crates is further improved by REMBO's ability to make differential images.
We use this feature to manage different hardware configurations starting from
a base system.
For example, the crates dedicated to vacuum and the crates dedicated to the
front end system have the same base hardware (same crate, same CPU,
same network card) and therefore have the same base image (same
operating system, network drivers, etc.).
But since these crates do not have the same I/O boards installed,
a differential image is made for each type of hardware configuration.
In case of failure (e.g. a hard disk crash) the tool is able to
rebuild the whole system from the base image and the differential image.
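
The idea of a base image plus a differential image can be illustrated with a
short Python sketch. It does not describe how REMBO works internally: images
are modelled here simply as dictionaries mapping file paths to contents, and
the function names and file paths are hypothetical.

\begin{verbatim}
```python
def make_differential(base, configured):
    """Describe a configured system as a difference against the base image."""
    changed = {path: content for path, content in configured.items()
               if base.get(path) != content}
    removed = sorted(path for path in base if path not in configured)
    return {"changed": changed, "removed": removed}


def rebuild(base, differential):
    """Rebuild a complete system from the base image and one differential."""
    system = dict(base)                  # start from a copy of the base image
    for path in differential["removed"]:
        system.pop(path, None)
    system.update(differential["changed"])
    return system


if __name__ == "__main__":
    base = {"/boot/vmlinuz": "kernel", "/etc/modules": "net"}
    vacuum = {"/boot/vmlinuz": "kernel",
              "/etc/modules": "net\nvacuum_io",
              "/lib/modules/vacuum_io.o": "driver"}
    diff = make_differential(base, vacuum)
    # After a disk crash the crate is restored from base + differential.
    print(rebuild(base, diff) == vacuum)
```
\end{verbatim}

Only the files that distinguish one hardware configuration from another are
stored per configuration, so a new release of the base system has to be
prepared once rather than once per crate type.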

311:

312: \section{Handheld devices}

313:

We have long had the need for handheld portable controllers for motors
and other hardware which needs local tuning far from the control room.
Enter the new generation of handheld devices, wireless Ethernet and
Linux. We have used the iPAQ from Compaq running GNU/Linux and an X11-based
client (LabVIEW for example) to control motors remotely
using a simple one-click interface.
Choosing GNU/Linux makes networking and graphical
display much easier than if we were using Windows CE, for example.

322:

323: \section{Realtime}

324: {\em ``Linux isn't realtime''.}

This is true, and we therefore do not claim to do any realtime with GNU/Linux.
However, we have found that our applications need little or no
realtime. Most realtime needs are delegated to hardware or DSPs.
Where we do need soft realtime (i.e. a 99\% guarantee) we use interrupt
routines in device drivers. We measured an interrupt response time
of $<50\,\mu$s for Linux on 68k and x86.
Using a driver interrupt routine we achieve a
soft realtime response of $500\,\mu$s for a function generator
(provided we do not recompile the kernel at the same time!).
For the rest of our applications {\em ``as-fast-as-possible''} is good enough,
and for that GNU/Linux on commodity hardware is surprisingly good.
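
What a soft realtime guarantee of 99\% means can be made concrete with a
small Python sketch that checks a set of measured response times against a
deadline. The function name and the sample values are hypothetical; they are
not measurements from our systems.

\begin{verbatim}
```python
def meets_soft_deadline(latencies_us, deadline_us, required_fraction=0.99):
    """Return True if enough of the measured responses met the deadline.

    latencies_us: measured response times in microseconds.
    deadline_us: the response time that should normally not be exceeded.
    required_fraction: share of responses that must be on time (0.99 = 99%).
    """
    if not latencies_us:
        raise ValueError("no measurements given")
    on_time = sum(1 for latency in latencies_us if latency <= deadline_us)
    return on_time / len(latencies_us) >= required_fraction


if __name__ == "__main__":
    # Hypothetical samples: mostly fast, with a few late responses.
    samples = [300] * 995 + [2000] * 5
    print(meets_soft_deadline(samples, deadline_us=500))
```
\end{verbatim}

A hard realtime system would require every single response to meet the
deadline, which is why such needs are delegated to hardware or DSPs.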

336:

337: %\section{Online Data Reduction}

338: %Because GNU/Linux comes with a complete set of applications to build websites,

339: %databases, scientific programs, you-name-it, it is very easy to use GNU/Linux

340: %not online to control accelerators and experiments but also to do data analysis

341: %online and offline.

342: %We are using our own data reduction programs to do online data binning,

343: %smoothing

344: %and peak fitting and then plotting in Matlab. The programs communicate via

345: %shared memory or the network.

346: %Paramers and results are stored in a MySQL database for archiving.

347: %The advantage of GNU/Linux is the rich set of tools it offers both free

348: %and commercial and the fact that it runs on low-cost commodity hardware.

349: %GNU/Linux is also used for offline data analysis

350:

351: \section{Problems}

352: GNU/Linux is not without problems.

353: The main problems we have identified so far are :

354: \begin{itemize}

355: \item

356: the standard GNU/Linux distributions are not easily adapted to running on

357: diskless systems

358: \item

most commercial hardware is supplied with Windows drivers but not with GNU/Linux drivers

360: \end{itemize}

361:

362: \section{Conclusion}

There is a viable alternative to Windows 95/98/ME/NT/2000/CE/XP for
building control systems, and it is called GNU/Linux!
Linux is sufficiently mature for the task and even offers some advantages:
it is easier to program, is better adapted to distributed control
and is free of commercial pressure.

368:

369: %\begin{thebibliography}{9}

370: %

371: %\bibitem{linuxjournal}

372: %A.G\"otz. P.M\"akij\"arvi, B.Regad, M.Perez and P.Mangiagalli, ``Embedding Linux

373: %to control Accelerators and Experiments'', Linux Journal, October 1999

374: %

375: %\end{thebibliography}

376:

377: \end{document}

378: