c# - Async execution: Investigating intermittent delay between two log statements - Stack Overflow

admin2025-04-21  2

Application: We have a multithreaded library prepared with NetStandard2.0 using the C# programming language.

Additional Info (Update) : This library is consumed by multiple processes at same time. For example; we might have 40 tools, and every tool will be executed in same physical or virtual machine and used this library.

Deployment: Windows Server

Problem Description: I'm experiencing an intermittent blocking issue when executing the following code block in my library. Under normal conditions, the execution completes within the same milliseconds, but sometimes there's a delay, (lastly appr. 45 sec) between two log statements. I want to understand what might be causing this delay.

The problematic code block is within the using statement in the following function:

public class ToolCheckLayer
{
    private async Task<bool> CheckAsync(CancellationToken cancellationToken)
    {
        Logger?.Info("KeepAlive is working...");
        using (var messageListener = await _bus.CreateListenerAsync(cancellationToken)
            .ConfigureAwait(false))
        {
            // Some operations
            return true;
        }
        return false;
    }
}

Code Snippet:

Here is the relevant code related to the bus listener creation:

public class MessageBus
{
    private readonly ILogger _logger;
    private readonly SemaphoreSlim _listenerSemaphore;        
  
    public class MessageBusListener : IDisposable
    {
        private readonly ILogger _logger;        
        private readonly bool _privileged;        
        private readonly BufferBlock<MessageBase> _queue;        

        public MessageBusListener(ILogger logger, bool privileged,
            Action disposeAction)
        {
            _logger = logger;
            _priviledged = privileged;
            _queue = new BufferBlock<MessageBase>();
        }

        // Some Functions...
    }

    public MessageBus(ILogger logger, int maxConcurrentListeners)
    {
        _logger = logger;
        _listenerSemaphore = new SemaphoreSlim(maxConcurrentListeners);
    }

    public async Task<MessageBusListener> CreateListenerAsync(
        CancellationToken cancellationToken)
    {
        _logger?.Debug("Creating listener...");        
        await _listenerSemaphore.WaitAsync(cancellationToken);

        var listener = new MessageBusListener(_logger, privileged: false, () =>
        {
            _listenerSemaphore.Release();
        });
        
        // Operations        

        return listener;
    }
}

Investigations So Far:

The issue does not occur consistently; it happens randomly after weeks or months of smooth operation. There are no CPU, memory, or I/O bottlenecks observed during the blocking period.

Latest Findings (Update-2)

Tool 1
10:20:16.4951|INFO|ToolCheckLayer [1]: KeepAlive is working...
--- Here: ~45 seconds blocking sometimes less... -----------> 
10:21:00.0168|DEBUG|MessageBus [1]: Creating listener...

Tool 2
10:20:15.2180|INFO|ToolCheckLayer [1]: KeepAlive is working...
--- Here: ~45 seconds blocking sometimes less... -----------> 
10:21:00.0171|DEBUG|MessageBus [1]: Creating listener...

Tool 3
2025-01-23 10:20:16.4638|INFO|ToolCheckLayer [1]: KeepAlive is working...
--- Here: ~45 seconds blocking sometimes less... ----------->
2025-01-23 10:21:00.0144|DEBUG|MessageBus [1]: Creating listener...
...

We encountered the issue again on 24 tools but this time the blocking duration is quite long. According to log timestamps, CreateListenerAsync() for every tool is blocked until a specific time. (10:21:00.01xx) Then all blocks are free.

Zabbix metrics and traces are clean. No data was obtained to raise suspicion. CPU Utilization %3, Memory %40, etc.

Many thanks to everyone who contributed in advance.

Application: We have a multithreaded library prepared with NetStandard2.0 using the C# programming language.

Additional Info (Update) : This library is consumed by multiple processes at same time. For example; we might have 40 tools, and every tool will be executed in same physical or virtual machine and used this library.

Deployment: Windows Server

Problem Description: I'm experiencing an intermittent blocking issue when executing the following code block in my library. Under normal conditions, the execution completes within the same milliseconds, but sometimes there's a delay, (lastly appr. 45 sec) between two log statements. I want to understand what might be causing this delay.

The problematic code block is within the using statement in the following function:

public class ToolCheckLayer
{
    private async Task<bool> CheckAsync(CancellationToken cancellationToken)
    {
        Logger?.Info("KeepAlive is working...");
        using (var messageListener = await _bus.CreateListenerAsync(cancellationToken)
            .ConfigureAwait(false))
        {
            // Some operations
            return true;
        }
        return false;
    }
}

Code Snippet:

Here is the relevant code related to the bus listener creation:

public class MessageBus
{
    private readonly ILogger _logger;
    private readonly SemaphoreSlim _listenerSemaphore;        
  
    public class MessageBusListener : IDisposable
    {
        private readonly ILogger _logger;        
        private readonly bool _privileged;        
        private readonly BufferBlock<MessageBase> _queue;        

        public MessageBusListener(ILogger logger, bool privileged,
            Action disposeAction)
        {
            _logger = logger;
            _priviledged = privileged;
            _queue = new BufferBlock<MessageBase>();
        }

        // Some Functions...
    }

    public MessageBus(ILogger logger, int maxConcurrentListeners)
    {
        _logger = logger;
        _listenerSemaphore = new SemaphoreSlim(maxConcurrentListeners);
    }

    public async Task<MessageBusListener> CreateListenerAsync(
        CancellationToken cancellationToken)
    {
        _logger?.Debug("Creating listener...");        
        await _listenerSemaphore.WaitAsync(cancellationToken);

        var listener = new MessageBusListener(_logger, privileged: false, () =>
        {
            _listenerSemaphore.Release();
        });
        
        // Operations        

        return listener;
    }
}

Investigations So Far:

The issue does not occur consistently; it happens randomly after weeks or months of smooth operation. There are no CPU, memory, or I/O bottlenecks observed during the blocking period.

Latest Findings (Update-2)

Tool 1
10:20:16.4951|INFO|ToolCheckLayer [1]: KeepAlive is working...
--- Here: ~45 seconds blocking sometimes less... -----------> 
10:21:00.0168|DEBUG|MessageBus [1]: Creating listener...

Tool 2
10:20:15.2180|INFO|ToolCheckLayer [1]: KeepAlive is working...
--- Here: ~45 seconds blocking sometimes less... -----------> 
10:21:00.0171|DEBUG|MessageBus [1]: Creating listener...

Tool 3
2025-01-23 10:20:16.4638|INFO|ToolCheckLayer [1]: KeepAlive is working...
--- Here: ~45 seconds blocking sometimes less... ----------->
2025-01-23 10:21:00.0144|DEBUG|MessageBus [1]: Creating listener...
...

We encountered the issue again on 24 tools but this time the blocking duration is quite long. According to log timestamps, CreateListenerAsync() for every tool is blocked until a specific time. (10:21:00.01xx) Then all blocks are free.

Zabbix metrics and traces are clean. No data was obtained to raise suspicion. CPU Utilization %3, Memory %40, etc.

Many thanks to everyone who contributed in advance.

Share Improve this question edited Jan 24 at 14:14 Theodor Zoulias 44.5k7 gold badges107 silver badges145 bronze badges asked Jan 22 at 23:39 KhanHKhanH 111 silver badge2 bronze badges 3
  • 1 As a side note, the await semaphore.WaitAsync(); is typically placed before the try/finally, not inside the try. It shouldn't make any difference in your case though. – Theodor Zoulias Commented Jan 23 at 0:13
  • As for the delay, it is totally inexplicable based on the code that you are showing us. The two log statements are executed the one after the other, synchronously, on the same thread. The operating system might suspend a thread for milliseconds, not seconds. I would look at the logging software, and at a potential malfunction of the storage device where the logs are written. – Theodor Zoulias Commented Jan 23 at 0:21
  • have you observed any case that a process holding the lock crashed and did not release until there was a timeout to automatically release the lock? Maybe we can try changing the timeout and observe again the see if there is any difference – Khanh TO Commented Jan 25 at 5:10
Add a comment  | 

1 Answer 1

Reset to default 1

As to why it happens, there are two possibilities in my opinion:

  1. Infrastructure problem - server with Service Bus is slow and creation of listener takes 10 seconds

  2. Semaphore blocking code from entering - try placing log directly around await _listenerSetSemaphore.WaitAsync();. If all semaphore "slots" are taken, then the waiting will take some time.

For me second possibility is more likely and I would investigate it.

转载请注明原文地址:http://anycun.com/QandA/1745223582a90429.html