Do While Continue
Posted on May 10, 2023 by Michael Keane Galloway
I was working on processing records from DynamoDB. I had a query that would get me a few records at a time and process them so that we would have new attributes for the objects we were storing. My usual pattern for writing such code is to use a do while loop. That way we always make the initial request to Dynamo and terminate when there’s nothing left to process.
In this scenario, I almost made a mistake. There was a condition that, when detected, meant I wanted to use continue to skip the rest of the loop body. Since the statement that updates the request object before querying again was at the end of the loop body, this would have resulted in an infinite loop like the following:
int iterationsToContinue = 45;
int iterationsToStopTheLoop = 5;
int iteration = 0;
do
{
    Console.WriteLine(iteration);
    if (iteration < iterationsToContinue) continue;
    iteration++;
} while (iteration < iterationsToStopTheLoop);
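One way to stay out of this trap is to advance the loop state before any continue can run. The following is a minimal C sketch of that idea, not the DynamoDB code I was working on: page_t, fetch_page, should_skip, TOTAL_PAGES, and NO_MORE_PAGES are all hypothetical stand-ins for a paged query, and the cursor is just an integer.

```c
#include <assert.h>
#include <stdbool.h>

#define TOTAL_PAGES 5        /* hypothetical: pretend the table has five pages */
#define NO_MORE_PAGES (-1)   /* hypothetical end-of-results marker */

/* Hypothetical page: its own id plus the cursor for the next request. */
typedef struct {
    int id;
    int next_cursor;
} page_t;

/* Hypothetical stand-in for a paged query; returns the page at `cursor`. */
static page_t fetch_page(int cursor) {
    assert(cursor >= 0 && cursor < TOTAL_PAGES);
    page_t page;
    page.id = cursor;
    page.next_cursor = (cursor + 1 < TOTAL_PAGES) ? cursor + 1 : NO_MORE_PAGES;
    return page;
}

/* Hypothetical skip rule: odd-numbered pages need no processing. */
static bool should_skip(const page_t *page) {
    return page->id % 2 != 0;
}

/* Returns how many pages were processed (that is, not skipped). */
int process_all_pages(void) {
    int cursor = 0;
    int processed = 0;
    do {
        page_t page = fetch_page(cursor);
        /* Advance the cursor first, so `continue` cannot skip the update. */
        cursor = page.next_cursor;
        if (should_skip(&page)) continue;
        processed++;
    } while (cursor != NO_MORE_PAGES);
    return processed;
}
```

Because the cursor update sits above the continue, every pass through the loop makes progress, whether or not the page is skipped.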
Seeing that I almost created an infinite loop, I got curious as to where the program counter jumps to for a continue in a do while. I first wrote up a class to encapsulate a do while without a continue:
public class DoWhile
{
    public static int DoSomething(int iterations){
        int iteration = 0;
        do
        {
            Console.WriteLine(iteration++);
        } while (iteration < iterations);
        return iteration;
    }
}
I then used ilspycmd to dump out the IL into a text file. I’ve copied just the method for DoSomething into the following code listing.
.method public hidebysig static
    int32 DoSomething (
        int32 iterations
    ) cil managed
{
    // Method begins at RVA 0x2050
    // Header size: 12
    // Code size: 30 (0x1e)
    .maxstack 3
    .locals init (
        [0] int32,
        [1] bool,
        [2] int32
    )
    IL_0000: nop
    IL_0001: ldc.i4.0
    IL_0002: stloc.0
    // loop start (head: IL_0003)
        IL_0003: nop
        IL_0004: ldloc.0
        IL_0005: dup
        IL_0006: ldc.i4.1
        IL_0007: add
        IL_0008: stloc.0
        IL_0009: call void [System.Console]System.Console::WriteLine(int32)
        IL_000e: nop
        IL_000f: nop
        IL_0010: ldloc.0
        IL_0011: ldarg.0
        IL_0012: clt
        IL_0014: stloc.1
        IL_0015: ldloc.1
        IL_0016: brtrue.s IL_0003
    // end loop
    IL_0018: ldloc.0
    IL_0019: stloc.2
    IL_001a: br.s IL_001c
    IL_001c: ldloc.2
    IL_001d: ret
} // end of method DoWhile::DoSomething
I haven’t taken the time to learn all of the ops in IL, but my experience with x86 has helped. It looks like our loop starts at IL_0003 with a nop (a blank command: no operation). Then come load and store operations, an addition, and a call into other code. At the end of the loop, IL_0010 to IL_0016, we load the relevant integer values and compare them with clt. After the comparison, we execute brtrue.s to either return to IL_0003 or fall out of the loop and execute IL_0018.
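To make that shape concrete, here is a rough C sketch of the same loop written twice: once as a normal do while, and once lowered by hand into a label and a conditional jump. The function names are hypothetical, and the goto version is only my reading of the IL, not something produced by a compiler.

```c
#include <stdio.h>

/* The do while loop in its usual structured form. */
int do_while_structured(int iterations) {
    int iteration = 0;
    do {
        printf("%d\n", iteration++);
    } while (iteration < iterations);
    return iteration;
}

/* Hand-lowered equivalent: one label at the head, one conditional jump back. */
int do_while_lowered(int iterations) {
    int iteration = 0;
loop_head:                        /* plays the role of IL_0003 */
    printf("%d\n", iteration++);
    if (iteration < iterations)   /* plays the role of clt followed by brtrue.s */
        goto loop_head;
    return iteration;
}
```

Both functions run the body at least once, so calling either with 0 still prints a single line and returns 1.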
I then wrote a second treatment of the same loop, this time with an if statement that continues if iteration is less than 5:
public class DoWhileContinue
{
    private const int iterationsToContinue = 5;
    public static int DoSomething(int iterations){
        int iteration = 0;
        do
        {
            Console.WriteLine(iteration++);
            if (iteration < iterationsToContinue) continue;
        } while (iteration < iterations);
        return iteration;
    }
}
I have extracted the IL for the C# code above in the same manner as the previous method:
.class public auto ansi beforefieldinit DoWhileContinue.DoWhileContinue
    extends [System.Runtime]System.Object
{
    // Fields
    .field private static literal int32 iterationsToContinue = int32(5)
    // Methods
    .method public hidebysig static
        int32 DoSomething (
            int32 iterations
        ) cil managed
    {
        // Method begins at RVA 0x2084
        // Header size: 12
        // Code size: 40 (0x28)
        .maxstack 3
        .locals init (
            [0] int32,
            [1] bool,
            [2] bool,
            [3] int32
        )
        IL_0000: nop
        IL_0001: ldc.i4.0
        IL_0002: stloc.0
        // loop start (head: IL_0003)
            IL_0003: nop
            IL_0004: ldloc.0
            IL_0005: dup
            IL_0006: ldc.i4.1
            IL_0007: add
            IL_0008: stloc.0
            IL_0009: call void [System.Console]System.Console::WriteLine(int32)
            IL_000e: nop
            IL_000f: ldloc.0
            IL_0010: ldc.i4.5
            IL_0011: clt
            IL_0013: stloc.1
            IL_0014: ldloc.1
            IL_0015: brfalse.s IL_0019
            IL_0017: br.s IL_001a
            IL_0019: nop
            IL_001a: ldloc.0
            IL_001b: ldarg.0
            IL_001c: clt
            IL_001e: stloc.2
            IL_001f: ldloc.2
            IL_0020: brtrue.s IL_0003
        // end loop
        IL_0022: ldloc.0
        IL_0023: stloc.3
        IL_0024: br.s IL_0026
        IL_0026: ldloc.3
        IL_0027: ret
    } // end of method DoWhileContinue::DoSomething
The logic for updating the value of iteration and calling Console::WriteLine is still the same. However, we have a new set of operations that compare iteration with the constant value 5. Once that clt has been executed, we either brfalse.s to IL_0019, or we br.s to IL_001a. This may seem redundant. There’s nothing between IL_0019 and IL_001a. It seems like something that could be optimized out. But this is the enforcement of standard behavior: continue should jump to the location where the controlling logic for the loop is evaluated (break, by contrast, jumps past the end of the loop). The redundant brfalse.s in this case lets us put more logic after our branch.
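The same idea can be sketched in C by replacing continue with an explicit goto to a label placed right before the loop condition. This is a hand-written illustration under my own assumptions: the function names, the tail_runs counter, and the ITERATIONS_TO_CONTINUE macro are hypothetical, and the counter stands in for "more logic after our branch".

```c
#define ITERATIONS_TO_CONTINUE 5   /* hypothetical, mirrors the constant 5 */

/* Structured version: returns how often the statement after the `if` ran. */
int tail_runs_with_continue(int iterations) {
    int iteration = 0;
    int tail_runs = 0;
    do {
        iteration++;
        if (iteration < ITERATIONS_TO_CONTINUE) continue;
        tail_runs++;              /* only reached when we did not continue */
    } while (iteration < iterations);
    return tail_runs;
}

/* Hand-lowered version: `continue` becomes a jump to the condition check. */
int tail_runs_with_goto(int iterations) {
    int iteration = 0;
    int tail_runs = 0;
loop_head:
    iteration++;
    if (iteration < ITERATIONS_TO_CONTINUE)
        goto loop_condition;      /* plays the role of br.s IL_001a */
    tail_runs++;
loop_condition:
    if (iteration < iterations)   /* plays the role of clt and brtrue.s IL_0003 */
        goto loop_head;
    return tail_runs;
}
```

Both versions skip the counter for the first four passes and still evaluate the loop condition every time, which is the behavior the branch pair in the IL preserves.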
This could be a topic for another blog, but I wonder what the JIT does with this useless branch? I should at the very least be able to pre-JIT my assembly and then object dump the resulting Linux binary.
While I was at it, I also wrote up a version of my loops in C. That way I could compile it and see what a native binary might look like for comparison (especially since I have more familiarity with Intel/AMD assembler).
#include<stdio.h>
int DoSomething1(int iterations){
    int iteration = 0;
    do
    {
        printf("%d\n", iteration++);
    } while (iteration < iterations);
    return iteration;
}
const int iterationsToContinue = 5;
int DoSomething2(int iterations){
    int iteration = 0;
    do
    {
        printf("%d\n", iteration++);
        if (iteration < iterationsToContinue) continue;
    } while (iteration < iterations);
    return iteration;
}
I compiled the above with GCC at the default optimization level, and then used objdump to create a text file with the following assembler:
0000000000000000 <DoSomething1>:
   0: f3 0f 1e fa             endbr64
   4: 55                      push   %rbp
   5: 48 89 e5                mov    %rsp,%rbp
   8: 48 83 ec 20             sub    $0x20,%rsp
   c: 89 7d ec                mov    %edi,-0x14(%rbp)
   f: c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
  16: 8b 45 fc                mov    -0x4(%rbp),%eax
  19: 8d 50 01                lea    0x1(%rax),%edx
  1c: 89 55 fc                mov    %edx,-0x4(%rbp)
  1f: 89 c6                   mov    %eax,%esi
  21: 48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 28 <DoSomething1+0x28>
  28: b8 00 00 00 00          mov    $0x0,%eax
  2d: e8 00 00 00 00          callq  32 <DoSomething1+0x32>
  32: 8b 45 fc                mov    -0x4(%rbp),%eax
  35: 3b 45 ec                cmp    -0x14(%rbp),%eax
  38: 7c dc                   jl     16 <DoSomething1+0x16>
  3a: 8b 45 fc                mov    -0x4(%rbp),%eax
  3d: c9                      leaveq
  3e: c3                      retq

000000000000003f <DoSomething2>:
  3f: f3 0f 1e fa             endbr64
  43: 55                      push   %rbp
  44: 48 89 e5                mov    %rsp,%rbp
  47: 48 83 ec 20             sub    $0x20,%rsp
  4b: 89 7d ec                mov    %edi,-0x14(%rbp)
  4e: c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
  55: 8b 45 fc                mov    -0x4(%rbp),%eax
  58: 8d 50 01                lea    0x1(%rax),%edx
  5b: 89 55 fc                mov    %edx,-0x4(%rbp)
  5e: 89 c6                   mov    %eax,%esi
  60: 48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 67 <DoSomething2+0x28>
  67: b8 00 00 00 00          mov    $0x0,%eax
  6c: e8 00 00 00 00          callq  71 <DoSomething2+0x32>
  71: b8 05 00 00 00          mov    $0x5,%eax
  76: 39 45 fc                cmp    %eax,-0x4(%rbp)
  79: 8b 45 fc                mov    -0x4(%rbp),%eax
  7c: 3b 45 ec                cmp    -0x14(%rbp),%eax
  7f: 7c d4                   jl     55 <DoSomething2+0x16>
  81: 8b 45 fc                mov    -0x4(%rbp),%eax
  84: c9                      leaveq
  85: c3                      retq
Interestingly, we have an extra cmp op in our do while with a continue, but unlike our IL code, we just drop into the operations for controlling the loop. I suppose that even with a default optimization level of 0, GCC doesn’t include the extra jumps since they both take the program counter to the same location. If I had more statements after the if, I would imagine that there would be at least a jump to the controlling logic for the loop.
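A follow-up experiment along those lines could start from something like the sketch below. DoSomething3 and its processed counter are hypothetical additions, not part of the code I compiled above, and I am not making any claim here about the assembler GCC would emit for it.

```c
#include <stdio.h>

const int iterationsToContinue = 5;

/* Hypothetical third variant: same loop, but with work after the `if`. */
int DoSomething3(int iterations) {
    int iteration = 0;
    int processed = 0;   /* hypothetical counter, not in the original code */
    do
    {
        printf("%d\n", iteration++);
        if (iteration < iterationsToContinue) continue;
        processed++;     /* the statement that `continue` has to jump over */
    } while (iteration < iterations);
    return processed;
}
```

Compiling this the same way as before (for example gcc -c followed by objdump -d on the object file) would show whether a jump to the loop's controlling logic appears once there is something for continue to skip.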
As a final thought, I’d like to share what I actually did when I encountered this question for myself. Since I was working, and didn’t want to take the time to write all of this extra code and dump it out, I simply wrote a loop and ran it through the debugger. That way I could see the behavior quickly, jot down the thought in my blog notes, and get back to what I’m being paid for. Though now that I’ve gone through this exercise, I see some more experiments to run with the C# compiler (and more blogs to write).