Do While Continue
Posted on May 10, 2023 by Michael Keane Galloway
I was working on processing records from DynamoDB. I had a query that would get me a few records at a time and process them so that we would have new attributes for the objects we were storing. My usual pattern for writing such code is to use a do while loop. That way we always make the initial request to Dynamo and terminate when there’s nothing left to process.
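The shape of that loop can be sketched in C (the language used for the native comparison later in this post), with an in-memory array standing in for DynamoDB. The names Page, fetch_page, process_all, and NO_MORE_PAGES are hypothetical stand-ins for illustration and are not part of any DynamoDB SDK.

```c
#include <stddef.h>

#define PAGE_SIZE 3
#define NO_MORE_PAGES ((size_t)-1)

/* Hypothetical page of results; stands in for a query response. */
typedef struct {
    int items[PAGE_SIZE];
    size_t count;        /* number of valid items in this page */
    size_t next_cursor;  /* where to resume, or NO_MORE_PAGES */
} Page;

/* Hypothetical query: copies up to PAGE_SIZE records starting at cursor. */
Page fetch_page(const int *table, size_t table_len, size_t cursor)
{
    Page page = { {0}, 0, NO_MORE_PAGES };
    while (page.count < PAGE_SIZE && cursor < table_len) {
        page.items[page.count++] = table[cursor++];
    }
    if (cursor < table_len) {
        page.next_cursor = cursor;
    }
    return page;
}

/* Processes every record and returns how many requests were made. */
int process_all(const int *table, size_t table_len, long *sum_out)
{
    size_t cursor = 0;
    int requests = 0;
    long sum = 0;
    do {
        /* The first request always happens, even for an empty table. */
        Page page = fetch_page(table, table_len, cursor);
        requests++;
        for (size_t i = 0; i < page.count; i++) {
            sum += page.items[i];  /* placeholder for the real per-record work */
        }
        cursor = page.next_cursor;  /* update the request state for the next pass */
    } while (cursor != NO_MORE_PAGES);
    *sum_out = sum;
    return requests;
}
```

Calling process_all on a seven-element array makes three requests with this page size, and an empty array still makes exactly one, which is the property that makes do while a natural fit here.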
In this scenario, I almost made a mistake. There was a condition that, when detected, should skip the rest of the loop body, and I wanted to use continue for that. Since the statement to update the request object before querying again was at the end of the loop body, this would have resulted in an infinite loop like the following:
int iterationsToContinue = 45;
int iterationsToStopTheLoop = 5;
int iteration = 0;
do
{
    Console.WriteLine(iteration);
    // continue jumps straight to the while condition, so iteration++ is skipped
    if (iteration < iterationsToContinue) continue;
    iteration++;
} while (iteration < iterationsToStopTheLoop);
Seeing that I almost created an infinite loop, I got curious as to where the program counter jumps when continue executes inside a do while. I first wrote up a class to encapsulate a do while without a continue:
using System;

public class DoWhile
{
    public static int DoSomething(int iterations)
    {
        int iteration = 0;
        do
        {
            Console.WriteLine(iteration++);
        } while (iteration < iterations);
        return iteration;
    }
}
I then used ilspycmd to dump out the IL into a text file. I’ve copied just the method for DoSomething into the following code listing.
.method public hidebysig static
int32 DoSomething (
int32 iterations
) cil managed
{
// Method begins at RVA 0x2050
// Header size: 12
// Code size: 30 (0x1e)
.maxstack 3
.locals init (
[0] int32,
[1] bool,
[2] int32
)
IL_0000: nop
IL_0001: ldc.i4.0
IL_0002: stloc.0
// loop start (head: IL_0003)
IL_0003: nop
IL_0004: ldloc.0
IL_0005: dup
IL_0006: ldc.i4.1
IL_0007: add
IL_0008: stloc.0
IL_0009: call void [System.Console]System.Console::WriteLine(int32)
IL_000e: nop
IL_000f: nop
IL_0010: ldloc.0
IL_0011: ldarg.0
IL_0012: clt
IL_0014: stloc.1
IL_0015: ldloc.1
IL_0016: brtrue.s IL_0003
// end loop
IL_0018: ldloc.0
IL_0019: stloc.2
IL_001a: br.s IL_001c
IL_001c: ldloc.2
IL_001d: ret
} // end of method DoWhile::DoSomething
I haven’t taken the time to learn all of the ops in IL, but some of my experience with x86 has helped. It looks like our loop starts at IL_0003 with a nop (a blank command: no operation). Then come load and store operations, an addition, and a call to other code. At the end of the loop, from IL_0010 to IL_0016, we load the relevant integer values and compare them with clt. After the comparison, we execute brtrue.s to either return to IL_0003 or fall out of the loop and execute IL_0018.
I then wrote a treatment of the same loop, this time with an if statement to continue if the iteration is less than 5:
using System;

public class DoWhileContinue
{
    private const int iterationsToContinue = 5;

    public static int DoSomething(int iterations)
    {
        int iteration = 0;
        do
        {
            Console.WriteLine(iteration++);
            // skip to the loop condition while iteration is below the threshold
            if (iteration < iterationsToContinue) continue;
        } while (iteration < iterations);
        return iteration;
    }
}
I have extracted the IL for the C# code above in the same manner as the previous method:
.class public auto ansi beforefieldinit DoWhileContinue.DoWhileContinue
extends [System.Runtime]System.Object
{
// Fields
.field private static literal int32 iterationsToContinue = int32(5)
// Methods
.method public hidebysig static
int32 DoSomething (
int32 iterations
) cil managed
{
// Method begins at RVA 0x2084
// Header size: 12
// Code size: 40 (0x28)
.maxstack 3
.locals init (
[0] int32,
[1] bool,
[2] bool,
[3] int32
)
IL_0000: nop
IL_0001: ldc.i4.0
IL_0002: stloc.0
// loop start (head: IL_0003)
IL_0003: nop
IL_0004: ldloc.0
IL_0005: dup
IL_0006: ldc.i4.1
IL_0007: add
IL_0008: stloc.0
IL_0009: call void [System.Console]System.Console::WriteLine(int32)
IL_000e: nop
IL_000f: ldloc.0
IL_0010: ldc.i4.5
IL_0011: clt
IL_0013: stloc.1
IL_0014: ldloc.1
IL_0015: brfalse.s IL_0019
IL_0017: br.s IL_001a
IL_0019: nop
IL_001a: ldloc.0
IL_001b: ldarg.0
IL_001c: clt
IL_001e: stloc.2
IL_001f: ldloc.2
IL_0020: brtrue.s IL_0003
// end loop
IL_0022: ldloc.0
IL_0023: stloc.3
IL_0024: br.s IL_0026
IL_0026: ldloc.3
IL_0027: ret
} // end of method DoWhileContinue::DoSomething
The logic for updating the value of iteration and calling Console::WriteLine is still the same. However, we have a new set of operations that compare iteration with the constant value 5. Once that clt has been executed, we either brfalse.s to IL_0019, or we br.s to IL_001a. This may seem redundant. There’s nothing but a nop at IL_0019 before we reach IL_001a. It seems like something that could be optimized out. But this is the enforcement of standard behavior: continue should jump to the location where the controlling logic for the loop is evaluated (break, by contrast, jumps past the end of the loop). The seemingly redundant brfalse.s in this case lets us put more logic after our branch.
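To make that branch structure easier to see, here is a rough hand translation of the IL's control flow into C with goto, using labels named after the IL offsets. This is only an illustrative sketch, not decompiler output, and the function name DoSomethingLikeTheIL is made up.

```c
#include <stdio.h>

/* Hand-written mirror of the IL control flow; labels follow the IL offsets. */
int DoSomethingLikeTheIL(int iterations)
{
    int iteration = 0;
IL_0003:
    printf("%d\n", iteration++);
    if (!(iteration < 5)) goto IL_0019;        /* brfalse.s IL_0019 */
    goto IL_001a;                              /* br.s IL_001a: the continue */
IL_0019:
    ;                                          /* nop: statements after the if would land here */
IL_001a:
    if (iteration < iterations) goto IL_0003;  /* brtrue.s IL_0003 */
    return iteration;
}
```

Written this way, the two branch targets are visibly distinct places: IL_0019 is the rest of the loop body (empty here), while IL_001a is the loop condition that continue has to reach.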
This could be a topic for another blog, but I wonder what the JIT does with this useless branch? I should at the very least be able to pre-JIT my assembly and then object dump the resulting Linux binary.
While I was at it, I also wrote up a version of my loops in C. That way I could compile it and see what a native binary might look like for comparison (especially since I have more familiarity with Intel/AMD assembler).
#include <stdio.h>

int DoSomething1(int iterations)
{
    int iteration = 0;
    do
    {
        printf("%d\n", iteration++);
    } while (iteration < iterations);
    return iteration;
}

const int iterationsToContinue = 5;

int DoSomething2(int iterations)
{
    int iteration = 0;
    do
    {
        printf("%d\n", iteration++);
        /* skip to the loop condition while iteration is below the threshold */
        if (iteration < iterationsToContinue) continue;
    } while (iteration < iterations);
    return iteration;
}
I compiled the above with GCC at the default optimization level, and then used objdump to create a text file with the following assembler:
0000000000000000 <DoSomething1>:
0: f3 0f 1e fa endbr64
4: 55 push %rbp
5: 48 89 e5 mov %rsp,%rbp
8: 48 83 ec 20 sub $0x20,%rsp
c: 89 7d ec mov %edi,-0x14(%rbp)
f: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
16: 8b 45 fc mov -0x4(%rbp),%eax
19: 8d 50 01 lea 0x1(%rax),%edx
1c: 89 55 fc mov %edx,-0x4(%rbp)
1f: 89 c6 mov %eax,%esi
21: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 28 <DoSomething1+0x28>
28: b8 00 00 00 00 mov $0x0,%eax
2d: e8 00 00 00 00 callq 32 <DoSomething1+0x32>
32: 8b 45 fc mov -0x4(%rbp),%eax
35: 3b 45 ec cmp -0x14(%rbp),%eax
38: 7c dc jl 16 <DoSomething1+0x16>
3a: 8b 45 fc mov -0x4(%rbp),%eax
3d: c9 leaveq
3e: c3 retq
000000000000003f <DoSomething2>:
3f: f3 0f 1e fa endbr64
43: 55 push %rbp
44: 48 89 e5 mov %rsp,%rbp
47: 48 83 ec 20 sub $0x20,%rsp
4b: 89 7d ec mov %edi,-0x14(%rbp)
4e: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
55: 8b 45 fc mov -0x4(%rbp),%eax
58: 8d 50 01 lea 0x1(%rax),%edx
5b: 89 55 fc mov %edx,-0x4(%rbp)
5e: 89 c6 mov %eax,%esi
60: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 67 <DoSomething2+0x28>
67: b8 00 00 00 00 mov $0x0,%eax
6c: e8 00 00 00 00 callq 71 <DoSomething2+0x32>
71: b8 05 00 00 00 mov $0x5,%eax
76: 39 45 fc cmp %eax,-0x4(%rbp)
79: 8b 45 fc mov -0x4(%rbp),%eax
7c: 3b 45 ec cmp -0x14(%rbp),%eax
7f: 7c d4 jl 55 <DoSomething2+0x16>
81: 8b 45 fc mov -0x4(%rbp),%eax
84: c9 leaveq
85: c3 retq
Interestingly, we have an extra cmp op in our do while with a continue, but unlike our IL code, no jump follows it: we just drop into the operations for controlling the loop. I suppose that even with a default optimization level of 0, GCC doesn’t include the extra jumps since they both take the program counter to the same location. If I had more statements after the if, I would imagine that there would be at least a jump to the controlling logic for the loop.
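One way to check that guess would be a third variant with a statement after the if, compiled and dumped the same way. The sketch below is such a variant. DoSomething3 and the processed counter are hypothetical additions, and iterationsToContinue is redeclared only so the snippet compiles on its own. No disassembly is shown for it, since the output depends on the compiler version and flags.

```c
#include <stdio.h>

const int iterationsToContinue = 5;

/* Hypothetical third variant: work after the if gives continue something to skip. */
int DoSomething3(int iterations, int *processed)
{
    int iteration = 0;
    *processed = 0;
    do
    {
        printf("%d\n", iteration++);
        if (iteration < iterationsToContinue) continue;
        (*processed)++;  /* only reached when the continue is not taken */
    } while (iteration < iterations);
    return iteration;
}
```

With iterations set to 8, the counter ends at 4, because the first four passes take the continue. Compiling with something like gcc -O0 -c and running objdump -d on the object file should show whether a jump to the loop condition appears.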
As final thoughts, I’d like to share what I actually did when I encountered this question for myself. Since I was working, and wouldn’t have wanted to take the time to write all of this extra code and dump it out, I simply wrote a loop and ran it through the debugger. That way I could see the behavior quickly, jot down the thought in my blog notes, and get back to what I’m being paid for. Though now that I’ve gone through this exercise, I see some more experiments to run with the C# compiler (and more blogs to write).
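A throwaway experiment in that spirit might look like the following C sketch. It mirrors the shape of the near-miss at the top of the post, with a safety cap (an assumption added purely so the demo terminates) and a fixed variant that advances the loop state before any continue. The names buggy_passes, fixed_passes, and SAFETY_CAP are made up for illustration.

```c
#define SAFETY_CAP 1000  /* assumption: stops the demo instead of spinning forever */

/* Mirrors the buggy shape: the increment sits after the continue. */
int buggy_passes(int skip_below, int stop_at)
{
    int iteration = 0;
    int passes = 0;
    do {
        passes++;
        if (passes >= SAFETY_CAP) break;       /* demo-only escape hatch */
        if (iteration < skip_below) continue;  /* jumps to the while test */
        iteration++;                           /* never reached while skipping */
    } while (iteration < stop_at);
    return passes;
}

/* Fixed shape: advance the loop state before any continue. */
int fixed_passes(int skip_below, int stop_at)
{
    int iteration = 0;
    int passes = 0;
    do {
        passes++;
        int current = iteration++;
        if (current < skip_below) continue;
        /* work for non-skipped items would go here */
    } while (iteration < stop_at);
    return passes;
}
```

With the values from the first listing (45 and 5), buggy_passes only stops because of the cap and returns 1000, while fixed_passes returns 5. Stepping through either one in a debugger shows continue landing on the while test.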