[Update: fixes for TMonitor were made in XE2 update 4. Consequently, this post should be considered for historical information only.]
Probably out of masochism or something, I’ve been looking at TMonitor again. For anyone who doesn’t recall the earlier episodes, TMonitor is the ‘new’ (and rather low-level) threading device introduced in D2009. Alas, but it proved buggy as hell, as well as slow. Given the publicity however, it received some love for XE2 RTM. So, all right in the end? Er no, since Darian Miller on StackOverflow quickly tweaked one of the test projects that demonstrated its brokeness in XE and… found it still broken in XE2!
More exactly, the test project in question uses the TThreadedQueue class. Having experimented a bit more with TMonitor and found it seemingly OK, I thought I’d investigate the TThreadedQueue implementation. My initial idea was to test its logic distinct from TMonitor’s by replacing use of TMonitor with calls to equivalent Windows APIs (a monitor combines a critical section and a ‘conditional variable’, and the Windows API acquired a condition variable primitive in Vista). However, on inspecting the code, I found it using the (to me) rather tricky-to-comprehend 3 parameter overload of TMonitor.Wait – and in my usage, I’ve stuck to the simpler, 2 parameter overload. This made me think: if I just rewrote TThreadedQueue similarly, would that fix the problem? I think it has, but I would be grateful if anyone else could verify:
1. Download the problem project from the Windows QC client – browse to report number 101114, and get the DPR from the ‘attachments’ tab.
2. Open the project in the IDE, and add a new unit to it.
3. Copy the code for TThreadedQueue from System.Generics.Collections.pas. As it is almost completely self-contained, you need only System.SyncObjs in the new unit’s uses clause.
4. In the copied class definition, remove FQueueNotEmpty and FQueueNotFull. Everything else in the interface stays the same however.
5. Remove the lines constructing and freeing FQueueNotEmpty and FQueueNotFull in Create and Destroy.
6. Change the implementation of the substantive PopItem variant to this:
function TThreadedQueue<T>.PopItem(var AQueueSize: Integer; var AItem: T): TWaitResult; begin AItem := Default(T); TMonitor.Enter(FQueueLock); try while (FQueueSize = 0) and not FShutDown do if not TMonitor.Wait(FQueueLock, FPopTimeout) then Exit(wrTimeout); if FShutDown then Exit(wrAbandoned); Result := wrSignaled; AItem := FQueue[FQueueOffset]; if FQueueSize = Length(FQueue) then TMonitor.PulseAll(FQueueLock); Dec(FQueueSize); Inc(FQueueOffset); Inc(FTotalItemsPopped); if FQueueOffset = Length(FQueue) then FQueueOffset := 0; finally AQueueSize := FQueueSize; TMonitor.Exit(FQueueLock); end; end;
The main thing here is that we are now using the simpler variant of Wait. As an aside, it’s OK to ‘pulse’ as early as we do, since waiting threads only get notified once we’ve exited the monitor.
7. Change the implementation of the substantive PushItem overload to this:
function TThreadedQueue<T>.PushItem(const AItem: T; var AQueueSize: Integer): TWaitResult; begin TMonitor.Enter(FQueueLock); try while (Length(FQueue) = FQueueSize) and not FShutDown do if not TMonitor.Wait(FQueueLock, FPushTimeout) then Exit(wrTimeout); if FShutDown then Exit(wrAbandoned); Result := wrSignaled; if FQueueSize = 0 then TMonitor.PulseAll(FQueueLock); FQueue[(FQueueOffset + FQueueSize) mod Length(FQueue)] := AItem; Inc(FQueueSize); Inc(FTotalItemsPushed); finally AQueueSize := FQueueSize; TMonitor.Exit(FQueueLock); end; end;
8. Change DoShutdown to this:
procedure TThreadedQueue<T>.DoShutDown; begin FShutDown := True; TMonitor.PulseAll(FQueueLock); end;
I’ve removed the locking around setting FShutdown since I don’t see what good it does (no one will be trying to set it to False).
9. Rerun the problem project: hopefully, it will now work (meaning, it will count up to 1000 without crashing), notwithstanding the fact we haven’t actually fixed the bug in TMonitor itself…