
1: \documentclass{aadebug}

2: \corr{0309027}{87}

3: \hyphenation{time-stamp}

4:

5: \begin{document}

6:

7: \runningheads{Kazutaka Maruyama et al.}{Timestamp Based Execution Control for C and Java Programs}

8:

9: \title{Timestamp Based Execution Control for C and Java Programs}

10:

11: \author{

12: Kazutaka~Maruyama\addressnum{1}\comma\extranum{1},

13: Minoru~Terada\addressnum{2}\comma\extranum{2}

14: }

15:

16: \address{1}{

17: Dept. of Mechano-Informatics,

18: Grad. School of Information Science and Technology,

19: The University of Tokyo,

20: Japan

21: }

22:

23: \address{2}{

24: Dept. of Information and Communication Engineering,

25: The University of Electro-Communications,

26: Japan

27: }

28:

29: \extra{1}{E-mail:kazutaka@acm.org}

30: \extra{2}{E-mail:terada@ice.uec.ac.jp}

31:

32: \pdfinfo{

33: /Title (Timestamp Based Execution Control for C and Java Programs)

34: /Author (Kazutaka Maruyama, Minoru Terada)

35: }

36:

37:

38: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

39:

40: \begin{abstract}

41: Many programmers have had to deal with an overwritten variable

42: resulting for example from an aliasing problem.

43: The culprit is obviously the last write-access to that memory

44: location before the manifestation of the bug.

45: The usual technique for removing such bugs starts with the debugger

46: by (1) finding the last write and (2) moving the control point

47: of execution back to that time by re-executing the program from

48: the beginning.

49: We wish to automate this.

50: Step (2) is easy if we can somehow mark the last write found in

51: step (1) and control the execution-point to move it back to this time.

52:

53: In this paper we propose a new concept, \textit{position}, that is,

54: a point in the program execution trace, as needed for step (2) above.

55: The position enables debuggers to automate the control of program

56: execution to support common debugging activities.

57: We have implemented position in C by modifying GCC and in Java with

58: a bytecode transformer.  Measurements show that position can

59: be provided with an acceptable amount of overhead.

60: \end{abstract}

61:

62: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

63:

64: \keywords{debug, debugger, reverse execution, Java bytecode transformation}

65:

66: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

67:

68: \section{Introduction}\label{sec:intr}

69: Research on

70: formal system specification with mathematical notation\cite{ince-z},

71: automatic testing based on semantics\cite{meyer}, and so on,

72: has continued for many years.

73: However,

74: these techniques are still evolving, cannot yet be applied to real problems,

75: and are not sufficient to eliminate all bugs.

76: Programmers must debug programs by hand.

77:

78: In debugging,

79: we usually use debuggers to observe the behavior of programs.

80: Debuggers have various functions supported by hardware

81: and operating system\cite{howdebuggerswork},

82: but these functions are exposed as commands that are too low-level.

83: Operations that programmers want to perform in debugging

84: are more abstract than the raw commands of debuggers,

85: and programmers must break their operations down into such commands.

86: Programmers have to waste energy thinking about how to use debugger commands

87: while they examine the behavior of programs

88: because of a gap between what debuggers can do

89: and what programmers need.

90: On the other hand,

91: there are some patterns of the operations

92: which programmers want debuggers to do.

93: Automating them makes debugging more efficient.

94:

95: In this paper,

96: we propose a new idea, \textit{position},

97: as a basic technique for execution control

98: useful for automating some typical debugging operations

99: that programmers want to do.

100: In order to implement it,

101: we introduce a counter \textit{timestamp}

102: which increases whenever the control point jumps backward

103: and we insert code that updates the timestamp

104: into the programs to be debugged.

105: We describe the implementation details for C and Java programs.

106: Overhead measurements of programs with the inserted code

107: are also included.

108:

109: The rest of this paper is organized as follows.

110: Section \ref{sec:pos} proposes the notion of ``position''

111: and describes the advantages of its applications.

112: Sections \ref{sec:implc} and \ref{sec:implj}

113: describe the implementation details for C and Java respectively,

114: along with the results of the overhead measurements.

115: Section \ref{sec:vs} describes another representation of position

116: without timestamp and the difference between the two.

117: Section \ref{sec:relw} discusses related work.

118: Sections \ref{sec:conc} and \ref{sec:futu}

119: present the conclusion and future work.

120:

121: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

122:

123: \section{Position: New Idea for Execution Control}\label{sec:pos}

124: Figure \ref{fig:layer} shows the structure of our proposal.

125: In this section,

126: we first propose the idea of the position

127: and introduce ``timestamp'' as its base.

128: Next, we implement a simple application of the position,

129: ``dynamic breakpoint''.

130: Finally, we describe three applications of the position,

131: ``bookmarking positions'', ``reverse watchpoint''

132: and ``binary search method''.

133:

134: \begin{figure}

135: \centering

136: \includegraphics{layer}

137: \caption{Structure of our proposal}

138: \label{fig:layer}

139: \end{figure}

140:

141: \subsection{Timestamp and Position}\label{subsec:pos-ts}

142: We introduce a new idea, \textit{position},

143: in order to specify one point in the program trace,

144: the series of statements executed in order of time.\footnote{

145: The \textit{point} is really \textit{one statement} of the source code.}

146: The position introduces an absolute coordinate in program traces

147: and indicates a target point of the execution control.

148:

149: In debugging, the control of program execution used so far

150: is based on static information such as line numbers in the source code

151: and cannot express the position

152: because the backward jumps of the control point

153: may cause multiple executions of one statement.

154: To distinguish them from each other,

155: we introduce a new counter into debuggees.

156: We call the counter \textit{timestamp},

157: which increases whenever the control point jumps backward.

158: The position is expressed as the pair of

159: the line number and the value of the timestamp.

160:

161: An example code with a loop structure

162: is shown in figure \ref{fig:ts-sample}.

163: There are multiple appearances of three lines (1) to (3)

164: in the program trace (figure \ref{fig:position}).

165: We call the static point of execution expressed as the line number

166: \textit{location}.

167:

168: \begin{figure}

169: {\small

170: \ \hrule \

171: \begin{verbatim}

172:         :

173: (1) while(i < a){

174: (2)   i += b;

175: (3) }

176:         :

177: \end{verbatim}

178: \ \hrule \ }

179: \caption{Code with a loop structure}

180: \label{fig:ts-sample}

181: \end{figure}

182:

183: \begin{figure}

184: \centering

185: \includegraphics{position}

186: \caption{Location and position in program trace}

187: \label{fig:position}

188: \end{figure}

189:

190: Timestamp increases whenever the loop body is repeated,

191: so the pair of the line number and the value of timestamp

192: expresses the position, the dynamic point of execution.

193: Using the value of timestamp,

194: we can distinguish each of multiple appearances of a location

195: in the trace.

196: For Java programs,

197: timestamp should increase in the following cases:

198: \begin{itemize}

199: \item entrance and exit of method invocations,

200: \item loop body,

201: \item exception.

202: \end{itemize}
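
Returning to the loop of figure \ref{fig:ts-sample}, the following C fragment is a minimal sketch of how the pair of line number and timestamp identifies each appearance of a statement in the trace. The loop is instrumented by hand, and the names \texttt{struct position}, \texttt{trace}, \texttt{visit} and \texttt{run\_loop} are hypothetical and exist only for this illustration. We also assume that the counter is incremented each time control reaches the loop head.

```c
/* Hypothetical illustration: a position is a (line, timestamp) pair. */
struct position { int line; int ts; };

static int timestamp = 0;             /* the counter described in the text */
static struct position trace[64];     /* recorded program trace */
static int trace_len = 0;

/* Record that the statement on `line` is executed at the current timestamp. */
static void visit(int line) {
    if (trace_len < 64) {
        trace[trace_len].line = line;
        trace[trace_len].ts = timestamp;
        trace_len++;
    }
}

/* The loop "while (i < a) { i += b; }" with hand-written instrumentation. */
static int run_loop(int i, int a, int b) {
    for (;;) {
        timestamp++;                  /* control reaches the loop head */
        visit(1);                     /* (1) while (i < a) {            */
        if (!(i < a)) break;
        visit(2);                     /* (2)   i += b;                  */
        i += b;
        visit(3);                     /* (3) }                          */
    }
    return i;
}
```

Calling \texttt{run\_loop(0, 3, 1)} records ten entries. Line (1) appears four times, with timestamps 1 to 4, so each appearance of the same location is a distinct position.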

203:

204: Position could be expressed as the whole history of debugger commands,

205: rather than the pair.

206: We discuss this topic in section \ref{sec:vs}.

207:

208: \subsection{Dynamic Breakpoint}\label{subsec:pos-dbp}

209: We here describe how to move the control point to a position.

210: We propose a new breakpoint facility,

211: \textit{dynamic breakpoint},

212: to be set at a position, rather than at a location.\footnote{

213: We call the normal breakpoint \textit{static breakpoint}.}

214: Because its implementation relies on existing breakpoints,

215: we build on the support that debuggers already provide.

216:

217: A simple implementation of dynamic breakpoint

218: could use ``conditional breakpoint'' of debuggers.

219: We show an instruction example

220: with the notation of GDB\cite{gdb}.

221: For example,

222: if we want to stop the debuggee at the position,

223: \texttt{(test.java:2, 8)},\footnote{

224: The debugger already stopped at the position and recorded the timestamp

225: in a previous run.

226: If the second execution path unexpectedly differs from the first,

227: the recorded position becomes meaningless, and

228: the debuggee has no way to detect such a change by itself.

229: Non-deterministic debuggees are discussed in section \ref{sec:futu}.}

230: we would instruct the Java debugger, JDB, as follows:\footnote{

231: JDB does not support conditional breakpoints yet.}

232: \begin{verbatim}

233: break test.java:2 if Timestamp.ts == 8

234: \end{verbatim}

235:

236: This implementation has a performance problem,

237: since many context switches may occur

238: in order to evaluate the given condition.

239: We show a more efficient implementation later.

240:

241: \subsection{Applications of Position}\label{subsec:pos-app}

242: We describe three applications of position.

243: We assume that debuggees are deterministic including

244: the execution environments;

245: we discuss non-deterministic cases in section \ref{sec:futu}.

246:

247: \subsubsection{Bookmarking Positions}\label{subsubsec:pos-app-mark}

248: When we know only roughly where the cause of the bug is,

249: we use breakpoints to move the control point to a point before the target position,

250: and then single-step repeatedly.

251: If a programmer who walks through a large program

252: mistakenly passes over the desired position,

253: he must recall and replay all the commands

254: he has given from the beginning.

255:

256: Bookmarking positions is like a mountaineer

257: placing anchors for his rope as he goes along.

258: When we believe that the behavior of the debuggee is correct

259: up to the current point,

260: we bookmark the position by using a dynamic breakpoint.

261: If we make a mistake later,

262: we can go back to the last anchor point

263: before things went wrong.

264:

265: Furthermore,

266: if a programmer attaches a comment to the position

267: and uses it as the identifier instead of an ID number,

268: the position becomes easier to remember.

269: For example,

270: the comment might be:

271: ``Just read a right brace; the parser is about to process

272: a compound statement.''

273:

274: \subsubsection{Reverse Watchpoint}\label{subsubsec:pos-app-rw}

275: Suppose that a program allocates memory dynamically.

276: If the program writes beyond the allocated area,

277: it may destroy the header information used by

278: allocator functions \texttt{malloc} and \texttt{free}.

279: But the destruction operation itself does not cause

280: the bug manifestation immediately,

281: and the result is usually manifested much later.

282: Similarly,

283: if an object is unexpectedly pointed to

284: by two different references (called \textit{aliasing}),

285: the content of the object could be destroyed.

286:

287: It is difficult to fix these bugs

288: which are caused by writing an invalid value to a variable unexpectedly

289: because their manifestation occurs later than the write.

290: To catch these invalid writes,

291: debuggers provide a data access breakpoint facility,

292: \textit{watchpoint} in GDB,

293: which traps all write accesses to a certain variable.

294: First, in debugging,

295: we determine which variable is invalid.

296: Second, we look for the operation that destroyed it

297: by using a watchpoint.

298:

299: When we use a watchpoint,

300: the debuggee stops many times,

301: and we must examine the output at every stop

302: to decide whether the write access is relevant to the bug.

303: Such work takes so long

304: that it is hard to stay focused for the whole debugging session

305: and to notice the sign of the bug we are looking for.

306:

307: On the other hand,

308: the last write to the variable obviously causes the bug manifestation.

309: We do not know which write is the last one until the bug manifests.

310: To find that write,

311: we must set a watchpoint on the variable,

312: restart the debuggee while counting the stops at the watchpoint

313: until the control point reaches the point where the bug manifests,

314: and restart it again to go back to the last write.

315: We aim to automate this procedure.

316:

317: We propose the ``reverse watchpoint'' facility

318: which automatically moves the control point of the debuggee

319: to the last write to a certain variable.

320: This new debugger command takes a variable name

321: to be observed as its argument and does such control movement.

322: Using the reverse watchpoint,

323: programmers do not have to care about each stop by watchpoint

324: and can concentrate their attention on more intelligent work in debugging.

325:

326: Reverse watchpoint is easily implemented

327: by using the dynamic breakpoint described above

328: and an existing debugger.

329: The procedure is as follows.

330:

331: \begin{enumerate}

332: \item

333: Set a dynamic breakpoint at the position

334: where reverse watchpoint is instructed

335: (\textbf{S} in figure \ref{fig:revwatch}).

336: \item

337: Pass 1:

338: \begin{enumerate}

339: \item

340: set a normal watchpoint at the target variable

341: and re-start the debuggee.

342: Whenever it stops by the trap of the watchpoint,

343: collect the information for its position

344: (actually the value of the timestamp)

345: in order to mark the position

346: \textbf{W1} to \textbf{Wn} in figure \ref{fig:revwatch}.

347: \item

348: Repeat the collection

349: until the control point reaches the position \textbf{S}.\footnote{

350: We can know it by stopping at the dynamic breakpoint

351: set at step 1.}

352: \end{enumerate}

353: \item

354: Pass 2:

355: \begin{enumerate}

356: \item

357: go back to the beginning again,

358: set another dynamic breakpoint at the most recent position \textbf{Wn},

359: and re-start it.

360: \item

361: The program stops at the target position \textbf{Wn}.

362: \end{enumerate}

363: \end{enumerate}
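
The two passes above can be sketched as a small simulation in C. This is only an illustration under stated assumptions: the ``debuggee'' is a hypothetical deterministic loop inside \texttt{run\_debuggee}, the watchpoint trap and both dynamic breakpoints are modelled by plain comparisons rather than by a real debugger, and all names (\texttt{run\_debuggee}, \texttt{reverse\_watchpoint}, \texttt{NO\_STOP}) are invented for this sketch.

```c
enum { NO_STOP = -1 };

static int timestamp;   /* counter incremented at the loop head */
static int watched;     /* the variable under observation */

/* Runs the toy debuggee from the beginning.
 *  - It stops when timestamp reaches s_ts (dynamic breakpoint at S).
 *  - If wn_ts != NO_STOP, it stops at the write whose timestamp is wn_ts.
 *  - *last_write receives the timestamp of the most recent trapped write.
 * Returns the timestamp at which execution stopped. */
static int run_debuggee(int s_ts, int wn_ts, int *last_write) {
    timestamp = 0;
    watched = 0;
    *last_write = NO_STOP;
    for (int i = 0; i < 100; i++) {
        timestamp++;                        /* loop head */
        if (timestamp == s_ts) break;       /* reached position S */
        if (i % 7 == 3) {                   /* a write to the watched variable */
            watched = i;
            *last_write = timestamp;        /* watchpoint trap: record W_k */
            if (timestamp == wn_ts) break;  /* reached position Wn */
        }
    }
    return timestamp;
}

/* Pass 1 collects the write positions up to S; pass 2 re-executes and
 * stops at the last one. Returns the timestamp of Wn, or NO_STOP if the
 * variable was never written before S. */
static int reverse_watchpoint(int s_ts) {
    int wn, ignored;
    run_debuggee(s_ts, NO_STOP, &wn);          /* pass 1 */
    if (wn == NO_STOP) return NO_STOP;
    return run_debuggee(s_ts, wn, &ignored);   /* pass 2 */
}
```

With \texttt{s\_ts = 20}, the writes trapped in pass 1 have timestamps 4, 11 and 18, so pass 2 stops at timestamp 18, just after the last write before \textbf{S}.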

364:

365: \begin{figure}

366: \centering

367: \includegraphics{revwatch}

368: \caption{Procedure of reverse watchpoint}

369: \label{fig:revwatch}

370: \end{figure}

371:

372: This procedure corresponds to

373: computing a dynamic slice\cite{dynamic-slicing} manually.

374: The advantage of our proposal is

375: that the control point of execution can be moved to

376: interesting positions where a variable is assigned

377: without problems about ``unconstrained pointers''\cite{slicing-survey},

378: such as those found in C.

379: Slicing techniques are based on analysis of source code,

380: but the watchpoints of debuggers receive support from hardware

381: and therefore give fully precise information about assignments.

382:

383: \subsubsection{Binary Search Method for Locating Bugs}\label{subsubsec:pos-app-bs}

384: We propose a binary search method for locating bugs

385: using timestamp as a facility for automatically locating

386: the position where a condition becomes false for the first time.

387:

388: Suppose that a program constructs a doubly linked list and

389: we want to find the position

390: where the consistency of the list is lost.

391: We can do this more easily by using timestamp

392: as an index of the binary search method.

393: A driver program which implements the method

394: should control a debuggee as follows:

395: \begin{enumerate}

396: \item

397: move the control point to the position where the timestamp value

398: is the middle of the left end

399: (initially, the beginning of the execution)

400: and the right end

401: (first, the end),

402: \item

403: evaluate the condition,

404: and

405: \begin{itemize}

406: \item

407: if it is true then go to the position

408: between the current position and the right end,

409: and narrow the target area by dealing with

410: the new position as the new left end,

411: \item

412: if it is false then go back to the position between

413: the left end and the current,

414: and narrow the target area by dealing with

415: the new position as the new right end,

416: \end{itemize}

417: \item

418: repeat above.

419: \end{enumerate}
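
The driver loop above can be sketched in C as follows. The function \texttt{condition\_holds\_at} is a hypothetical stand-in for ``re-execute the debuggee up to the given timestamp and evaluate the condition''; here it simply pretends that the condition is violated from timestamp 37 onward. The counter \texttt{replays} and the function name \texttt{first\_failure} are likewise assumptions made for this sketch, which also assumes that the condition is monotonic.

```c
#include <stdbool.h>

static int replays = 0;   /* number of re-executions, for illustration */

/* Stand-in for moving the control point to timestamp `ts` (by re-executing
 * the debuggee) and evaluating the condition there. In this toy model the
 * condition is true before timestamp 37 and false from then on. */
static bool condition_holds_at(long ts) {
    replays++;
    return ts < 37;
}

/* Returns the smallest timestamp in (left, right] at which the condition is
 * false, assuming it is true at `left`, false at `right`, and monotonic. */
static long first_failure(long left, long right) {
    while (right - left > 1) {
        long mid = left + (right - left) / 2;
        if (condition_holds_at(mid))
            left = mid;    /* still true: the failure lies later */
        else
            right = mid;   /* already false: the failure is here or earlier */
    }
    return right;
}
```

For the interval from 0 to 1000, the search re-executes the debuggee at most ten times, in line with $\log_2 1000 \approx 10$, and reports timestamp 37.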

420:

421: Inserting assertions at various source lines

422: and evaluating them repeatedly

423: might seem to achieve the same effect as ours.

424: But our method has two advantages.

425: First,

426: \texttt{assert} must be inserted manually

427: at all the locations where it may be necessary.

428: The binary search method does not require this.

429: Second,

430: the inserted \texttt{assert}s cause

431: a condition evaluation every time they are reached.

432: Using our method,

433: the maximum number of evaluations is about $\log_2 n$,

434: where $n$ is the timestamp value at the end of the execution.

435: So, if the condition is complex,

436: the performance of our method may be better than that

437: of \texttt{assert}.

438:

439: This method has already been proposed

440: by Tolmach and Appel\cite{sml-debugger}.

441: We bring it

442: from the world of functional programming

443: to that of procedural programming.

444:

445: Because the binary search method requires the condition

446: to be monotonic (or at least to be false at the end),

447: we might have to repeat the process

448: in order to arrive at the moment of the true bug.

449: Suppose the following scenario:

450: the true bug is a temporarily invalid value of a variable in a condition C1;

451: the invalid value is propagated to another variable in a condition C2;

452: C1 recovers the correct state;

453: C2 causes the crash.

454: If we use C1 to test the debuggee,

455: we cannot find the moment C1 becomes invalid

456: because the condition is not monotonic.

457: But C2, the direct cause of the crash,

458: leads us to the moment when it becomes invalid.

459: After this step,

460: we use C1, with that moment as the right end of the search, to arrive at the true bug.

461:

462: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

463:

464: \section{Implementation for C}\label{sec:implc}

465: We need to modify the target program

466: so that it includes code for updating the timestamp.

467:

468: \subsection{Discussion about Target of Transformation}\label{subsec:implj-gcc}

469: We have implemented the transformer at the intermediate code level.

470: There are three possible levels of C program transformation:

471: \begin{enumerate}

472: \item source code level,

473: \item intermediate code level,

474: \item assembly code level.

475: \end{enumerate}

476:

477: The first one has the advantages of

478: independence from hardware platforms,

479: operating systems, and compilers.

480: However,

481: the code that programmers see in debugging

482: then differs from the code they wrote,

483: and the implementation is somewhat difficult

484: because the transformer has to analyze the output of the C preprocessor.

485:

486: The advantage of the second one

487: is independence from target architectures,

488: as long as the compiler supports the platform,

489: while the disadvantage is dependence on a particular compiler.

490:

491: The target of the third level is assembly code emitted by compilers.

492: We first implemented a simple transformer

493: at this level\cite{kazutaka-revfunc},

494: because it is the easiest method,

495: but it depends on target architectures

496: and lacks portability.

497:

498: GNU C Compiler (GCC)\cite{gcc} is chosen as the target compiler

499: because it is used on various platforms.

500: GCC generates the intermediate code,

501: called Register Transfer Language (RTL),

502: from the source code,

503: and we modified a part of GCC to insert the code that increments the timestamp

504: at the RTL generation stage.

505: The target of our implementation is GCC-2.95.2,

506: which was the latest release of GCC at the time.

507: The details of the modification are described

508: in section \ref{subsec:implc-gcc}.

509:

510: During compilation,

511: the macro \texttt{INC\_TS} shown in figure \ref{fig:incts}

512: is inserted into debuggees at the appropriate locations.

513:

514: \begin{figure}

515: \centering

516: \ \hrule \

517: \begin{verbatim}

518: int timestamp = -1;

519: int ref = -1;

520:

521: void brake(void){}

522:

523: #define INC_TS if(++timestamp == ref) brake();

524: \end{verbatim}

525: \ \hrule \

526: \caption{Codes for updating timestamp}

527: \label{fig:incts}

528: \end{figure}

529:

530: \subsection{Implementation of Dynamic Breakpoint for C}\label{subsec:implc-dbp}

531: The implementation of dynamic breakpoint

532: described in section \ref{subsec:pos-dbp}

533: has a performance problem.

534: A conditional breakpoint evaluates the given condition

535: whenever the control point reaches the location

536: shown as $\triangle$ in figure \ref{fig:dbp},

537: so the execution usually slows down seriously.

538:

539: The implementation with the least overhead (only 2 breaks)

540: is as follows.

541:

542: \begin{description}

543: \item[Step 1]

544: Prior to the execution,

545: set a static breakpoint in the function \texttt{brake()}

546: and assign the target value of timestamp

547: to the debuggee's variable \texttt{ref}.

548: Run the debuggee

549: until it stops at the breakpoint,

550: which happens when the value of timestamp reaches \texttt{ref}.

551: \item[Step 2]

552: Set another static breakpoint at the target location

553: and continue the execution; the debuggee then stops at the target position.

554: \end{description}
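
The mechanism behind Step 1 can be sketched as a small C fragment built around the code of figure \ref{fig:incts}. This is an illustration under stated assumptions, not the output of the modified GCC: the instrumentation points in the hypothetical function \texttt{sum\_to} are placed by hand and do not reproduce exactly where the compiler emits them, the macro is wrapped in \texttt{do \{ \} while (0)} so that it behaves as a single statement, and \texttt{brake\_hits} and \texttt{ts\_at\_brake} are added only to make the effect observable without a debugger.

```c
static int timestamp = -1;
static int ref = -1;             /* target timestamp, set by the debugger */

static int brake_hits = 0;       /* illustration only: calls to brake() */
static int ts_at_brake = -1;     /* illustration only: timestamp at the call */

/* A debugger would set a static breakpoint inside this function. */
static void brake(void) {
    brake_hits++;
    ts_at_brake = timestamp;
}

/* Same test as in the figure, wrapped so it is safe in any statement context. */
#define INC_TS do { if (++timestamp == ref) brake(); } while (0)

/* Hypothetical debuggee function with hand-placed instrumentation. */
static long sum_to(int n) {
    long s = 0;
    INC_TS;                      /* function entry */
    for (int i = 0; i < n; i++) {
        INC_TS;                  /* loop head */
        s += i;
    }
    INC_TS;                      /* function exit */
    return s;
}
```

Setting \texttt{ref} to 5 before calling \texttt{sum\_to(10)} makes \texttt{brake()} run exactly once, when the timestamp reaches 5. No condition has to be evaluated by the debugger itself, which is why only two stops are needed.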

555:

556: \begin{figure}

557: \centering

558: \includegraphics{dbp}

559: \caption{Procedure of dynamic breakpoint}

560: \label{fig:dbp}

561: \end{figure}

562:

563: \subsection{Modifications of GCC}\label{subsec:implc-gcc}

564: In GCC,

565: the function of the parser, \texttt{yyparse},

566: invokes its actions which generate RTL.

567: We modify GCC

568: so that it has a new command line option, \texttt{-pg2},

569: and emits the RTL for updating timestamp

570: if the option is given.

571: In C programs,

572: timestamp should increase in the following cases.\footnote{

573: Our implementation does not increase timestamp

574: when \texttt{longjmp()} is called.

575: This matters only

576: when \texttt{setjmp()} and \texttt{longjmp()} are called

577: in the same function,

578: a pattern that is usually written with a \texttt{goto} statement instead.}

579:

580: \begin{description}

581: \item[loops]

582: The increment code is emitted

583: just after the label, \texttt{start\_label},

584: which is placed at the start of loops in \texttt{expand\_start\_loop}.

585: The label is the target of jumps from tails of loop bodies

586: and is emitted after the initialization of \texttt{for} statements

587: (figure \ref{fig:loop-expand}).

588: This also works well for \texttt{while} statements

589: and \texttt{do/while} statements.\footnote{

590: Timestamp should increase

591: just after \texttt{continue\_label}

592: in terms of backward jumps.

593: But we do not choose this approach

594: for the unification of the implementations

595: for three kinds of loop statements.

596: Therefore one extra increment of timestamp occurs

597: at each loop structure,

598: but it is negligible.}

599: \item[\texttt{goto} statements]

600: The increment code is emitted

601: just before the invocation of\\

602: \texttt{expand\_goto\_internal}

603: in \texttt{expand\_goto}.

604: Jumps whose destination label lies forward

605: do not need to increment timestamp.

606: But we do not check whether a jump is backward,

607: for the sake of keeping the implementation simple.\footnote{

608: Updating timestamp too frequently does not destroy

609: the consistency of position,

610: but causes a drop in performance.}

611: GCC has its own extensions,

612: ``nonlocal goto'' and ``computed goto''.

613: Our modified GCC does not treat these \texttt{goto}s

614: as targets of the increment of timestamp,

615: in order to keep the implementation simple.

616: \item[\texttt{return} statements]

617: A \texttt{return} with a return value

618: puts that value in CPU registers.

619: The increment code is emitted

620: just before the computation of arguments of \texttt{return}s

621: in \texttt{c\_expand\_return}

622: in order to prevent the value in the registers from being destroyed.

623: \item[entrance of function calls]

624: The code is emitted

625: just after the invocation of \texttt{store\_parm\_decls},

626: which registers the name and type of the arguments of the function call.

627: \item[exit of function calls]

628: This is relevant to the tail of \texttt{void} functions

629: without explicit \texttt{return}.

630: In order to prevent extra increments

631: after one at \texttt{return} statements,

632: the code is emitted

633: just before \texttt{return\_label}

634: which is emitted in epilogues of functions.\footnote{

635: \texttt{return\_label} is the destination of jumps

636: from \texttt{return} statements.}

637: \end{description}

638:

639: \begin{figure}

640: \centering

641: \includegraphics{loop-expand}

642: \caption{Expansion of \texttt{for} statements}

643: \label{fig:loop-expand}

644: \end{figure}

645:

646: Our modifications to GCC amount to around 70 lines.

647: We have confirmed the behavior of our GCC

648: on four platforms.

649: In order to port it to other platforms,

650: all we have to do is modify

651: a macro which is relevant to command line options

652: in a header file for each platform.

653:

654: \begin{itemize}

655: \item i386-linux

656: \item alpha-linux

657: \item sparc-sunos4

658: \item sparc-sun-solaris2.8

659: \end{itemize}

660:

661: Figure \ref{fig:addedcode} shows

662: the inserted codes on i386-linux and sparc-sunos4

663: in assembly code notation.

664: The codes are generated with options

665: for debugging, optimizations and timestamp.

666:

667: \begin{figure*}[t]

668: \centering

669: \ \hrule \

670: \begin{verbatim}

671: incl timestamp                 sethi   %hi(_timestamp), %o1

672: movl timestamp,%eax            ld      [%o1+%lo(_timestamp)], %o0

673: cmpl ref,%eax                  add     %o0, 1, %o0

674: jne .L6                        st      %o0, [%o1+%lo(_timestamp)]

675: call brake                     sethi   %hi(_ref), %o1

676:                                ld      [%o1+%lo(_ref)], %o1

677:                                cmp     %o0, %o1

678:                                bne     L21

679:                                mov     0, %l2

680:                                call    _brake, 0

681: \end{verbatim}

682: \ \hrule \

683: \caption{Inserted assembly codes for i386-linux (left) and sparc-sunos4 (right)}

684: \label{fig:addedcode}

685: \end{figure*}

686:

687: \subsection{Overhead Measurement of C Implementation}\label{subsec:implc-over}

688: Table \ref{tab:overheadc} shows the result of measurement

689: of runtime overhead.

690: The debuggee of this benchmark is gawk-2.15.6.

691: Overhead of other C applications is also measured (table \ref{tab:overheadc2}).

692: The benchmark ``empty loop'' has only one loop structure

693: whose body is empty and shows the worst case of overhead of our system,

694: because both increment of timestamp and comparing it to \texttt{ref} occur

695: whenever the loop is repeated.

696: The ``sed'' benchmark is sed-1.18.

697: Platforms in these tables are shown in table \ref{tab:platforms}.\footnote{

698: Unfortunately, the SPARC-1 platform is no longer available to us,

699: so we cannot determine its CPU clock frequency.}

700:

701: The number of increments of timestamp is

702: 500000002 in ``loop'', 85888810 in ``sed'' and

703: 96817515 in ``gawk''.

704: The average time required per increment is about

705: $13.7$, $9.76$ and $16.5$ nanoseconds respectively.

706: We think it is acceptable for practical use.
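
As a check on these figures, they appear to correspond to the additional run time on Intel-2 in table \ref{tab:overheadc2} divided by the number of increments. This reading is our assumption, based only on the numbers in that table:
\[
\frac{9.632\,\mathrm{s} - 2.762\,\mathrm{s}}{500000002} \approx 13.7\,\mathrm{ns}, \qquad
\frac{10.02\,\mathrm{s} - 9.182\,\mathrm{s}}{85888810} \approx 9.76\,\mathrm{ns}, \qquad
\frac{4.466\,\mathrm{s} - 2.87\,\mathrm{s}}{96817515} \approx 16.5\,\mathrm{ns}.
\]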

707:

708: \begin{table}

709: \centering

710: \caption{Overhead of gawk with timestamp system}

711: \label{tab:overheadc}

712: \hbox to\hsize{\hfil

713: \begin{tabular}{|l|r|r|r|}

714: \hline

715: GCC options & Intel-1 & Alpha & SPARC-1 \\

716: \hline\hline

717: -g -O & 2.57 & 4.64 & 17.44 \\

718: & (1.00) & (1.00) & (1.00) \\

719: \hline

720: -g -O -pg2 & 3.78 & 6.34 & 24.40 \\

721: & (1.47) & (1.37) & (1.40) \\

722: \hline

723: \multicolumn{4}{r}{seconds (ratio)}\\

724: \end{tabular}\hfil}

725: \end{table}

726:

727: \begin{table}

728: \centering

729: \caption{Overhead of other C applications and platforms}

730: \label{tab:overheadc2}

731: \hbox to\hsize{\hfil

732: \begin{tabular}{|l|l|r|r|}

733: \hline

734: App. & GCC options & Intel-2 & SPARC-2 \\

735: \hline\hline

736: empty loop & -g -O & 2.762 & 1.012 \\

737: & & (1.00) & (1.00) \\

738: \cline{2-4}

739: & -g -O -pg2 & 9.632 & 10.112 \\

740: & & (3.49) & (9.99) \\

741: \hline\hline

742: sed & -g -O & 9.182 & 3.182 \\

743: & & (1.00) & (1.00) \\

744: \cline{2-4}

745: & -g -O -pg2 & 10.02 & 4.916 \\

746: & & (1.09) & (1.54) \\

747: \hline\hline

748: gawk & -g -O & 2.87 & 2.842 \\

749: & & (1.00) & (1.00) \\

750: \cline{2-4}

751: & -g -O -pg2 & 4.466 & 4.598 \\

752: & & (1.56) & (1.62) \\

753: \hline

754: \multicolumn{4}{r}{seconds (ratio)}\\

755: \end{tabular}\hfil}

756: \end{table}

757:

758: \begin{table}

759: \centering

760: \caption{Platforms of C applications}

761: \label{tab:platforms}

762: \hbox to\hsize{\hfil

763: \begin{tabular}{|l|lr|l|}

764: \hline

765: \multicolumn{1}{|c|}{Platform} & \multicolumn{2}{c|}{Architecture} &

766:   \multicolumn{1}{c|}{OS} \\

767: \hline

768: Alpha   & Alpha   & 500MHz & Linux (glibc2) \\

769: Intel-1 & Celeron & 400MHz & Linux (glibc1) \\

770: Intel-2 & Celeron & 366MHz & Linux (glibc2) \\

771: SPARC-1 & SPARC   & N/A    & SunOS-4 \\

772: SPARC-2 & UltraSPARC & 500MHz & Solaris-8 \\

773: \hline

774: \end{tabular}\hfil}

775: \end{table}

776:

777: Table \ref{tab:overheadcsize} shows the total size of the files of

778: each benchmark.

779: The increase ranges from 9\% to 40\%, which we consider acceptable.

780:

781: \begin{table}

782: \centering

783: \caption{Increase of file size of C applications}

784: \label{tab:overheadcsize}

785: \hbox to\hsize{\hfil

786: \begin{tabular}{|l|l|r|r|}

787: \hline

788: App. & GCC options & Intel-2 & SPARC-2 \\

789: \hline\hline

790: empty loop & -g -O & 1144 & 2160 \\

791: & & (1.00) & (1.00) \\

792: \cline{2-4}

793: & -g -O -pg2 & 1248 & 2416 \\

794: & & (1.09) & (1.12) \\

795: \hline\hline

796: sed & -g -O & 41030 & 55263 \\

797: & & (1.00) & (1.00) \\

798: \cline{2-4}

799: & -g -O -pg2 & 52430 & 77215 \\

800: & & (1.28) & (1.40) \\

801: \hline\hline

802: gawk & -g -O & 140046 & 169310 \\

803: & & (1.00) & (1.00) \\

804: \cline{2-4}

805: & -g -O -pg2 & 169414 & 223078 \\

806: & & (1.21) & (1.32) \\

807: \hline

808: \multicolumn{4}{r}{bytes (ratio)}\\

809: \end{tabular}\hfil}

810: \end{table}

811:

812: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

813:

814: \section{Implementation for Java}\label{sec:implj}

815: Based on our experience with C,

816: we have chosen an implementation using bytecode transformation.

817: We considered four levels at which to implement the transformer for Java

818: before the decision:

819: \begin{enumerate}

820: \item compilers (source code level),

821: \item class files (bytecode level),

822: \item virtual machines (bytecode execution level),

823: \item Java Platform Debugger Architecture\cite{jpda}

824: (debugger interface level).\footnote{

825: Java Platform Debugger Architecture (JPDA) is included in the JDK.

826: The Java debugger JDB also uses it.}

827: \end{enumerate}

828:

829: We excluded the first and the third

830: for the same reason (portability) as in the case of C.

831: The fourth is very portable,

832: but we excluded this approach

833: because of its expected high overhead.

834:

835: Although bytecode plays the role that assembly code plays in the case of C,

836: its format is defined

837: by the Java virtual machine specification\cite{javavmspec},

838: and it is independent of target architectures.

839:

840: \subsection{List of Bytecodes which Cause Increment of Timestamp}\label{subsec:implj-bc}

841: As explained in section \ref{subsec:pos-ts},

842: the timestamp must be updated at several kinds of program structures.

843:

844: \begin{description}

845: \item[entrance and exit of method invocations]

846: No bytecode corresponds to the entrance of a method invocation,

847: so the top of the method body is used instead.

848: The \texttt{return} bytecodes (opcodes 172--177)

849: correspond to the exit of methods.\\

850: Bytecodes that invoke methods, such as \texttt{invokevirtual},

851: could be used instead of the top of the method body.

852: We selected the top of the method body,

853: because it adds less code than instrumenting every call site.

854: Another advantage of this choice

855: is that the timestamp overhead is only added

856: to the methods of the modified classes.

857: There is no overhead for calling non-modified classes

858: (such as those in the system library).

859: \item[branches]

860: \texttt{ifeq} (153) to \texttt{if\_acmpne} (166),

861: \texttt{ifnull} (198) and \texttt{ifnonnull} (199).

862: \item[goto]

863: \texttt{goto} (167) and \texttt{goto\_w} (200).

864: \item[other jumps]

865: \texttt{jsr} (168), \texttt{ret} (169) and \texttt{jsr\_w} (201).

866: \item[exceptions]

867: The \texttt{athrow} bytecode throws exceptions,

868: but certain exceptions

869: are not thrown explicitly by \texttt{athrow}

870: (such as NullPointerException).

871: Instead, the code that increments the timestamp is inserted

872: at the entry of each \texttt{catch} block.

873: \end{description}

874:

875: Our transformer inserts a bytecode that updates the timestamp

876: just before each of the bytecodes described above.

877: The inserted code is a single \texttt{invokestatic}

878: followed by the constant pool index of the Methodref entry

879: that refers to \texttt{Timestamp.inc()}.

880: The bytecodes of \textit{branches}, \textit{goto}s and \textit{other jumps}

881: described above have an operand that gives the offset

882: to the target address.

883: For these, the timestamp update is inserted only when the operand is negative,

884: i.e. for a backward jump.

885: The current implementation of the \texttt{Timestamp} class

886: is shown in figure \ref{fig:ts-class}.

887:

888: \begin{figure}

889: {\small

890: \ \hrule \

891: \begin{verbatim}

892: public final class Timestamp{

893:   private static long ts, ref;

894:

895:   static{

896:     ts = 0;

897:     ref = 0;

898:   }

899:

900:   public static void inc(){

901:     if(++ts == ref) brake();

902:   }

903:

904:   private static void brake(){}

905: }

906: \end{verbatim}

907: \ \hrule \ }

908: \caption{\texttt{Timestamp} class}

909: \label{fig:ts-class}

910: \end{figure}
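
The insertion rule described above can be sketched as a small predicate over opcodes.
This is a minimal illustration rather than the actual transformer:
the class \texttt{IncrementRule} and its method names are hypothetical,
and treating \texttt{ret} as always instrumented (it carries no offset operand)
is an assumption of this sketch.
Method entry and \texttt{catch} block entry are not tied to an opcode and are therefore not covered here.

```java
/** Hypothetical sketch: decides before which bytecodes Timestamp.inc() would be inserted. */
final class IncrementRule {
    // Opcode values as listed in the Java virtual machine specification.
    static final int IFEQ = 153, JSR = 168, RET = 169;
    static final int IRETURN = 172, RETURN = 177;
    static final int IFNULL = 198, JSR_W = 201;

    /** True for branches (153-166), goto (167), jsr (168), and opcodes 198-201. */
    static boolean hasBranchOffset(int opcode) {
        return (opcode >= IFEQ && opcode <= JSR)
            || (opcode >= IFNULL && opcode <= JSR_W);
    }

    /**
     * @param opcode the opcode of the instruction
     * @param offset the branch offset operand (ignored for opcodes without one)
     * @return true if a timestamp update would be inserted before the instruction
     */
    static boolean needsIncrement(int opcode, int offset) {
        if (opcode >= IRETURN && opcode <= RETURN) {
            return true;                      // exit of a method
        }
        if (opcode == RET) {
            return true;                      // assumption: no offset operand, always instrument
        }
        return hasBranchOffset(opcode) && offset < 0;  // backward jumps only
    }
}
```

For example, \texttt{needsIncrement(167, -5)} (a backward \texttt{goto}) is true,
while \texttt{needsIncrement(167, 5)} (a forward \texttt{goto}) is false.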

911:

912: \subsection{Bytecode Transformer}\label{subsec:implj-conv}

913: Our bytecode transformer, which transforms Java class files,

914: is written in Java using the Bytecode Engineering Library\cite{bcel} (BCEL)

915: and is around 160 lines long.

916:

917: It is not necessary to transform all class files of a program:

918: it is possible to transform only those classes the user considers suspicious.

919: This can reduce the overhead significantly.

920:

921: \subsection{Overhead Measurement of Java Implementation}\label{subsec:implj-over}

922: We show the results of measuring

923: the runtime overhead and the increase in size of transformed class files.

924: The target class files include an empty loop

925: and the benchmark programs of SPEC JVM98\cite{specjvm98}.

926: We ran them under JDK-1.4.0 on a Linux PC

927: (Pentium III 733MHz, 640MB memory, and Linux-2.4.7).

928:

929: We ran each benchmark seven times, discarded the best and the worst results,

930: and took the mean of the remaining five.

931: The results are shown in table \ref{tab:overhead}.

932: In the case of the empty loop,

933: every iteration of the loop

934: invokes \texttt{Timestamp.inc()}.

935: This benchmark shows the worst-case overhead of our system:

936: our current implementation slows it down by a factor of around 4.4.

937: For the other benchmarks,

938: the slowdown ranges from 1.05 to 2.49 times and is below 2 times for all but \texttt{\_227\_mtrt}.

939: We think this is acceptable for practical use.

940: Once the implementation of reverse watchpoint is completed,

941: its overhead will be the sum of two components:

942: the timestamp system overhead, which is

943: around 1.5 to 2 times for each of the two passes,

944: and the watchpoint overhead that JDB produces.

945: We estimate the overhead of reverse watchpoint

946: to be less than around 4 times of slowing down in most cases.

947: Note that the \texttt{\_227\_mtrt} benchmark is a multi-threaded program

948: and we added \texttt{synchronized} to the \texttt{Timestamp.inc()} method

949: only for this benchmark,

950: so its overhead is heavier than that of the others.
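
The effect of adding \texttt{synchronized} for a multi-threaded debuggee can be sketched as follows.
This is an illustration under assumptions, not the code that was measured:
the class name \texttt{SyncTimestamp} and the helper \texttt{hammer} are hypothetical,
and only the body of \texttt{inc()} mirrors figure \ref{fig:ts-class}.

```java
/** Hypothetical thread-safe variant of the Timestamp class (a sketch only). */
final class SyncTimestamp {
    private static long ts = 0;   // current timestamp value
    private static long ref = 0;  // value of ts at which brake() is called

    // synchronized makes the increment-and-compare atomic across threads.
    static synchronized void inc() {
        if (++ts == ref) brake();
    }

    static synchronized long current() {
        return ts;
    }

    private static void brake() {}

    /** Starts `threads` threads that each call inc() `perThread` times; returns ts afterwards. */
    static long hammer(int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) inc();
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return current();
    }
}
```

With the lock, no increment is lost, so the counter advances by exactly the number of calls.
The price is lock acquisition on every update, which is consistent with the heavier overhead observed for the multi-threaded benchmark.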

951:

952: \begin{table*}[t]

953: \caption{Overhead of Java Programs with Timestamp System}

954: \label{tab:overhead}

955: \hbox to\hsize{\hfil

956: \begin{tabular}{|l|rr|rr|}

957: \hline

958: Benchmark & \multicolumn{2}{c|}{Original} &

959:  \multicolumn{2}{c|}{Backward Jumps}\\

960: \hline

961: \hline

962: empty loop & 12.774 & (1.00) & 56.416 & (4.42)\\

963: \hline

964: \_201\_compress & 1.4006 & (1.00) & 2.7754 & (1.98)\\

965: \_202\_jess & 0.2126 & (1.00) & 0.271 & (1.27)\\

966: \_209\_db & 0.412 & (1.00) & 0.433 & (1.05)\\

967: \_222\_mpegaudio & 0.1826 & (1.00) & 0.2792 & (1.52)\\

968: \_227\_mtrt & 0.363 & (1.00) & 0.9038 & (2.49)\\

969: \_228\_jack & 0.603 & (1.00) & 0.7212 & (1.19)\\

970: \hline

971: \multicolumn{5}{r}{seconds (ratio)}\\

972: \end{tabular}\hfil}

973: \end{table*}
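
The averaging procedure used for these measurements (seven runs, discard the best and the worst, mean of the remaining five) can be sketched as below.
The class name \texttt{TrimmedMean} is hypothetical, and the run times in the usage note are made-up values, not measured data.

```java
import java.util.Arrays;

/** Hypothetical helper: mean of the runs after dropping the smallest and the largest. */
final class TrimmedMean {
    static double of(double[] runs) {
        if (runs.length < 3) {
            throw new IllegalArgumentException("need at least three runs");
        }
        double[] sorted = runs.clone();   // do not modify the caller's array
        Arrays.sort(sorted);
        double sum = 0.0;
        // Skip index 0 (best time) and the last index (worst time).
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return sum / (sorted.length - 2);
    }

    /** Slowdown ratio of the transformed program relative to the original one. */
    static double ratio(double[] original, double[] transformed) {
        return of(transformed) / of(original);
    }
}
```

For the illustrative run times \texttt{\{100, 3, 1, 5, 2, 6, 4\}}, the values 1 and 100 are discarded and the mean of the rest is 4.0.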

974:

975: Table \ref{tab:filesize} shows the total size of files of each benchmark.

976: The increases for the empty loop benchmark, whose file is very small,

977: and for the \texttt{\_201\_compress} benchmark are relatively large (19\% and 29\%),

978: but the others grow by at most 7\%.

979:

980: \begin{table*}[t]

981: \caption{Increase of Java class file size}

982: \label{tab:filesize}

983: \hbox to\hsize{\hfil

984: \begin{tabular}{|l|rr|rr|}

985: \hline

986: Benchmark & \multicolumn{2}{c|}{Original} &

987:  \multicolumn{2}{c|}{Backward Jumps}\\

988: \hline

989: \hline

990: empty loop & 276 & (1.00) & 328 & (1.19)\\

991: \hline

992: \_201\_compress & 14443 & (1.00) & 18640 & (1.29)\\

993: \_202\_jess & 396536 & (1.00) & 407240 & (1.03)\\

994: \_209\_db & 10156 & (1.00) & 10588 & (1.04)\\

995: \_222\_mpegaudio & 120182 & (1.00) & 124438 & (1.04)\\

996: \_227\_mtrt & 859 & (1.00) & 920 & (1.07)\\

997: \_228\_jack & 132516 & (1.00) & 138109 & (1.04)\\

998: \hline

999: \multicolumn{5}{r}{bytes (ratio)}\\

1000: \end{tabular}\hfil}

1001: \end{table*}

1002:

1003: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1004:

1005: \section{Debugger Command History as a Position}\label{sec:vs}

1006: We chose the \textit{timestamp representation},

1007: a pair of line number and timestamp, for positions.

1008: An alternative is the \textit{command-history representation}:

1009: when we arrive at a certain position in a debugging session,

1010: the whole history of debugger commands enables us

1011: to come back to the position by re-execution from the beginning.

1012: The advantages of our representation are as follows.

1013:

1014: \begin{description}

1015: \item[Total order]

1016: The timestamp representation is totally ordered:

1017: any two positions can be compared in order of time,

1018: so we can use the binary search method described

1019: in section \ref{subsubsec:pos-app-bs}.

1020: \item[Uniqueness]

1021: Our choice gives a unique representation for a position

1022: while there may be many command histories

1023: leading to a position.

1024: This enables a 1-to-1 correspondence

1025: between a position and a bookmark for it.

1026: \item[Performance]

1027: In most cases,

1028: playing back debugger commands incurs higher overhead

1029: than our method,

1030: and the overhead grows with the length of the history.

1031: \end{description}
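
The total order on positions can be sketched as a comparable pair.
This is an illustration under assumptions: the class \texttt{Position} and the method \texttt{firstFailing} are hypothetical names,
the tie-break by line number assumes that control only moves forward while the timestamp stays the same,
and the predicate stands for any check that is false before some position and true from it onward.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of a position: a (timestamp, line number) pair with a total order. */
final class Position implements Comparable<Position> {
    final long timestamp;
    final int line;

    Position(long timestamp, int line) {
        this.timestamp = timestamp;
        this.line = line;
    }

    @Override
    public int compareTo(Position other) {
        int c = Long.compare(timestamp, other.timestamp);
        // Assumption: within one timestamp value, a larger line number is later.
        return c != 0 ? c : Integer.compare(line, other.line);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Position
            && ((Position) o).timestamp == timestamp
            && ((Position) o).line == line;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(timestamp) * 31 + line;
    }

    /**
     * Binary search over positions sorted in execution order.
     * Returns the index of the first position for which `fails` holds,
     * or sorted.size() if it holds for none.
     */
    static int firstFailing(List<Position> sorted, Predicate<Position> fails) {
        int lo = 0, hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (fails.test(sorted.get(mid))) {
                hi = mid;        // first failing position is at mid or earlier
            } else {
                lo = mid + 1;    // first failing position is after mid
            }
        }
        return lo;
    }
}
```

A command history offers no such comparison: two different histories can reach the same point, so there is nothing to sort or bisect on.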

1032:

1033: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1034:

1035: \section{Related Work}\label{sec:relw}

1036: Boothe\cite{boothe-bdb} built a C debugger with

1037: reverse execution capability

1038: using a step counter, which counts the number of step executions,

1039: and re-execution from the beginning of debuggees.

1040: The capability could also be implemented

1041: with our timestamp counter and re-execution.

1042: The difference comes from the purpose of each project.

1043: Boothe made reverse-execution versions of existing debugger commands

1044: such as ``backward step'', ``backward finish'', and so on.

1045: Since we aim at more abstract control of program execution

1046: than raw debugger commands,

1047: a counter of step executions is too expensive for our purpose.

1048:

1049: Feldman et al.\cite{igor}, Moher\cite{provide}

1050: and Wilson et al.\cite{demonic}

1051: save the complete memory history of a process

1052: to achieve fully random accessibility to program states.

1053: Their systems have to deal with a large ``log''.

1054: Our system, however, saves only a pair of line number and

1055: value of timestamp to obtain the same capability

1056: by assuming the determinism of debuggees.

1057:

1058: Lieberman et al.\cite{zstep,ecoop87} developed a reversible,

1059: animated source code stepper, ZStep95.

1060: Its modified interpreter saves the order

1061: of evaluation of the S-expressions of Lisp programs

1062: to provide fully reversible execution.

1063: ZStep95 also provides a correspondence between

1064: an S-expression and the graphical output

1065: which is produced by the expression.

1066: The correspondence is similar to position,

1067: but the interpreter supports only a subset of Lisp

1068: and runs very slowly.

1069:

1070: Bertot\cite{occurrences} introduced ``occurrences'' into

1071: the lazy $\lambda$-calculus,

1072: which makes copies of a subtree in reduction.

1073: For example, when the lambda function

1074: $\lambda x.x + x$ is applied to an expression $e$, two copies of $e$ will be made and used for

1075: both operands of $+$.

1076: This can be regarded as creation of multiple positions

1077: from one corresponding location at the time of execution.

1078: Bertot achieved a breakpoint capability,

1079: in which a breakpoint is set at an expression and the program stops

1080: whenever any copy of the expression is evaluated.

1081: In procedural languages, the identification is very easy;

1082: use the address of the instruction as a breakpoint.

1083: Bertot's purpose is to unify multiple positions into one location;

1084: our purpose is to distinguish positions from each other.

1085:

1086: Zeller et al.\cite{ddmin,cechain} propose

1087: ``Delta Debugging'', which automatically finds the data or variables

1088: that are relevant to an error

1089: by comparing the input data or variables of runs

1090: that exit normally with those of runs that cause errors.

1091: It is a very useful method

1092: and is similar to ours in that it

1093: reduces the labor of programmers by using the power of recent computers.

1094: We, however, want to establish

1095: a more interactive and flexible debugging method,

1096: and our method is complementary to theirs.

1097:

1098: Ducass\'{e}\cite{coca-icse} allows the programmer

1099: to control the execution

1100: not in terms of source statements,

1101: but in terms of events

1102: such as assignments, function calls, loops, and so on.

1103: Users write Prolog-like forms

1104: to designate breakpoints which have complex conditions.

1105: This mechanism is complementary to our system

1106: and suitable as a front end for it

1107: in order to designate appropriate positions

1108: to which we would move the control point.

1109:

1110: Templer et al.\cite{cci,cci-light}

1111: developed an event-based instrumentation tool, CCI,

1112: which inserts instrumentation code into C source code.

1113: The converted code is platform independent.

1114: The execution slowdown, however, is

1115: 2.09 times in the case of \texttt{laplace.c}

1116: and 5.85 times in the case of \texttt{life.c}\cite{cci-light}.

1117: To implement a position system,

1118: only events about control flow need to be generated.

1119:

1120: Larus et al.\cite{eel} made EEL,

1121: which is a library for building tools to

1122: analyze and modify an executable program.

1123: Using EEL, we could implement the insertion of

1124: code to maintain the timestamp at the executable code level.

1125: The solution, however, depends on a specific platform,

1126: so we chose the intermediate code level and

1127: modified GCC.

1128:

1129: Binder et al.\cite{prcjava}

1130: integrated a resource management system for CPU and memory

1131: into the J-SEAL2 mobile agent system

1132: using bytecode transformation for complete portability.

1133: They use a counter of executed statements

1134: for CPU resource management, and

1135: each running thread updates the counter at each basic block.

1136: They reduce the frequency of the update using control flow analysis,

1137: and the overhead of the system, including other components

1138: of the agent system, is a slowdown of $1.41$ times in the worst case.

1139: Hayami et al.\cite{bctrans-javacpu}

1140: also inserted similar counter-update code

1141: using bytecode transformation for the same purpose.

1142: They implemented finer-grained management,

1143: and the overhead is a slowdown of $1.63$ times.

1144: These results are better than ours;

1145: we believe this is because our implementation uses a method invocation for each update,

1146: whose overhead is significant.

1147:

1148:

1149: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1150:

1151: \section{Conclusion}\label{sec:conc}

1152: We proposed a new idea, position,

1153: as the base of execution control.

1154: It introduces an absolute coordinate into program traces

1155: and indicates a point in traces.

1156: In order to implement it,

1157: we introduced a counter, timestamp,

1158: as a global variable of the debuggee,

1159: which increases whenever the control point jumps backward.

1160: Position is expressed

1161: as a pair of the line number and the timestamp value.

1162: We introduced the idea of dynamic breakpoint

1163: as ``breakpoint at a position''

1164: and described three applications.

1165:

1166: We also described the implementation details of

1167: the timestamp system for C and Java programs,

1168: using a modification to GCC and bytecode transformation respectively.

1169: Our modified GCC and bytecode transformer insert code that increments the timestamp

1170: at certain locations which cause control jumps.

1171: We measured the runtime overhead of the C and Java implementations

1172: and the increase in size of the transformed files,

1173: and showed that they are acceptable for practical use.

1174:

1175:

1176: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1177:

1178: \section{Future Work}\label{sec:futu}

1179: The bytecode transformer transforms all the methods in given class files.

1180: We should make it able to transform them selectively.

1181:

1182: The driver program of reverse watchpoint

1183: for deterministic C programs is complete,

1184: but the one for Java, based on JDB, is still under development.

1185: Although JDB included in JDK-1.4.1 does not

1186: support ``watchpoint to individual objects'',

1187: the WatchpointRequest class of JPDA in JDK-1.4.1

1188: now has the capability of adding instance filters

1189: by using the \texttt{addInstanceFilter} method.

1190: We expect JDB to support this soon.

1191:

1192: For non-deterministic programs,

1193: the applications described in section \ref{subsec:pos-app}

1194: do not work well without appropriate replay mechanisms.

1195: If the non-determinism of a debuggee stems from

1196: the external environment such as input data,

1197: we could use tools to record and replay the environment.

1198: For example, Xlab\cite{xlab} could be used

1199: to record and replay X window system events.

1200: If debuggees have internal non-determinism,

1201: as multi-threaded programs do,

1202: tools to replay the timing of thread switches

1203: would be needed.

1204: Choi et al.\cite{dejavu} implemented a modified Java VM

1205: which can replay multi-threaded Java programs.

1206: Another approach is to record the timing via JPDA.

1207:

1208: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1209:

1210: \section{Acknowledgments}

1211: Thanks to Naoshi Higuchi for the wealth of his knowledge

1212: about Java programming and its APIs.

1213:

1214:

1215: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1216:

1217: \bibliography{bibliography-e}

1218:

1219: \end{document}

1220: