QNX pthread_mutex_lock causing deadlock error (45 = EDEADLK)
I am implementing an asynchronous log-writing mechanism for my project's multithreaded application. Below is a partial listing of the code where the error occurs.
void CTraceFileWriterThread::run()
{
    bool fShoudIRun = shouldThreadsRun(); // Some global function that decides whether operations should stop. Not really relevant here; assume it returns true.
    while( fShoudIRun )
    {
        std::string nextMessage = fetchNext();
        if( !nextMessage.empty() )
        {
            process(nextMessage);
        }
        else
        {
            fShoudIRun = shouldThreadsRun();
            condVarTraceWriter.wait();
        }
    }
}
// This is the consumer. It runs in my thread with lower priority.
std::string CTraceFileWriterThread::fetchNext()
{
    // When there are a lot of logs, I mean A LOT, I believe
    // control stays in this function for a long time and another
    // thread calling the "add" function is not able to acquire the lock
    // since it's held here.
    std::string message;
    if( !writeQueue.empty() )
    {
        writeQueueMutex.lock(); // Object of our wrapper around pthread_mutex_lock
        message = writeQueue.front();
        writeQueue.pop(); // std::queue
        writeQueueMutex.unLock();
    }
    return message;
}
// This is the producer and is called from multiple threads.
void CTraceFileWriterThread::add( std::string outputString )
{
    if ( !outputString.empty() )
    {
        // Crashes here while trying to acquire the lock when there are lots of
        // logs on prod systems.
        writeQueueMutex.lock();
        const size_t writeQueueSize = writeQueue.size();
        if ( writeQueueSize == maximumWriteQueueCapacity )
        {
            outputString.append( "\n queue full, discarding traces, traces are incomplete" );
        }
        if ( writeQueueSize <= maximumWriteQueueCapacity )
        {
            bool wasEmpty = writeQueue.empty();
            writeQueue.push( outputString );
            condVarTraceWriter.post(); // will be waiting in a function which calls "fetchNext"
        }
        writeQueueMutex.unLock();
    }
}
int wrapperMutex::lock()
{
    //#[ operation lock()
    int iRetval;
    int iRetry = 10;
    do
    {
        iRetry--;
        tRfcErrno = pthread_mutex_lock( &tMutex );
        if ( (tRfcErrno == EINTR) || (tRfcErrno == EAGAIN) )
        {
            iRetval = RFC_ERROR;
            (void)sched_yield();
        }
        else if ( tRfcErrno != EOK )
        {
            iRetval = RFC_ERROR;
            iRetry = 0;
        }
        else
        {
            iRetval = RFC_OK;
            iRetry = 0;
        }
    } while ( iRetry > 0 );
    return iRetval;
    //#]
}
I generated a core dump and analysed it with GDB; here are some findings:
Program terminated with signal 11, Segmentation fault.
Errno=45 (EDEADLK) in the add function, where I am trying to acquire the lock. The wrapper we have around pthread_mutex_lock retries about 10 times before it gives up.
The code works fine when there are fewer logs. Also, we do not have C++11 or later, so we are restricted to QNX (POSIX) mutexes. Any help is appreciated, as I have been looking at this issue for over a month with little progress. Please ask if any more info is required.
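For context, here is a minimal standalone sketch (not our production code) of one way pthread_mutex_lock can return EDEADLK (45 on QNX): an error-checking mutex reports it when the thread that already owns the mutex tries to lock it again. Whether our wrapper's tMutex is actually created with the PTHREAD_MUTEX_ERRORCHECK attribute is an assumption here.

    #include <cstdio>
    #include <cstring>
    #include <cerrno>
    #include <pthread.h>

    int main()
    {
        pthread_mutexattr_t attr;
        pthread_mutex_t     mtx;

        // An error-checking mutex returns EDEADLK instead of blocking
        // when the owning thread calls lock on it a second time.
        pthread_mutexattr_init( &attr );
        pthread_mutexattr_settype( &attr, PTHREAD_MUTEX_ERRORCHECK );
        pthread_mutex_init( &mtx, &attr );

        int rc = pthread_mutex_lock( &mtx );   // first lock: returns 0 (EOK)
        std::printf( "first lock : %d (%s)\n", rc, std::strerror( rc ) );

        rc = pthread_mutex_lock( &mtx );       // relock by the same thread: EDEADLK
        std::printf( "second lock: %d (%s)\n", rc, std::strerror( rc ) );

        pthread_mutex_unlock( &mtx );
        pthread_mutex_destroy( &mtx );
        pthread_mutexattr_destroy( &attr );
        return 0;
    }

With PTHREAD_MUTEX_NORMAL the same relock would, per POSIX, simply deadlock rather than return an error.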
c++ queue mutex deadlock qnx
asked Jan 2 at 9:54 by Abhilash Anand; edited Jan 8 at 7:11
We need to see any code that touches the queue, mutex, or condvar. Also, if it's safe to call writeQueue.empty without the mutex, what does the mutex protect exactly? – David Schwartz, Jan 2 at 10:27
I have added another relevant function. Only the empty() function was called without the mutex lock. The thought was that it wasn't worth making the other, higher-priority threads wait in the "add" function even when the queue was empty. It doesn't really matter if the queue is deemed empty while an entry is being written, since log writing is low priority and can wait a few milliseconds. – Abhilash Anand, Jan 2 at 10:43
You never check the value returned by wrapperMutex::lock, so it's quite possible you're accessing a non-thread-safe queue without the lock in place. Also, what type of mutex is tMutex -- normal, recursive, something else? If it's a normal pthread mutex, then a deadlock might suggest a thread is trying to relock a mutex that it already owns. – G.M., Jan 2 at 11:35
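For illustration, a minimal sketch of the guard G.M. is describing, reusing the question's wrapperMutex, RFC_OK and unLock names and also moving the empty() test under the lock. The early-return policy on lock failure is an assumption, not the project's actual error handling.

    std::string CTraceFileWriterThread::fetchNext()
    {
        std::string message;

        // If the lock could not be taken, skip this round instead of
        // touching the std::queue unprotected.
        if ( writeQueueMutex.lock() != RFC_OK )
        {
            return message;
        }

        // Test the queue only while the mutex is held.
        if ( !writeQueue.empty() )
        {
            message = writeQueue.front();
            writeQueue.pop();
        }
        writeQueueMutex.unLock();
        return message;
    }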
@AbhilashAnand You wait if the queue was empty, not if the queue is empty. Your run function does not hold the mutex all the way from testing the queue to deciding to wait! (After fetchNext releases the mutex, you call wait, having no idea if the queue is still empty or if the add has already happened.) – David Schwartz, Jan 2 at 12:24
@AbhilashAnand The whole point of a condition variable is to provide an atomic "unlock and wait" operation so that you don't wait for something that has already happened. Misusing a condition variable will definitely lead to deadlock. Maybe you didn't really want a condition variable, though. – David Schwartz, Jan 3 at 6:03
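For reference, the conventional pthreads producer/consumer shape David Schwartz is describing, where the predicate is tested and the wait happens while the mutex is held. This uses plain pthreads since C++11 isn't available; the names below are illustrative placeholders, not the project's condVarTraceWriter wrapper.

    #include <pthread.h>
    #include <queue>
    #include <string>

    static pthread_mutex_t queueMutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  queueCond  = PTHREAD_COND_INITIALIZER;
    static std::queue<std::string> queueData;

    // Producer: push under the lock, then signal.
    void addMessage( const std::string &msg )
    {
        pthread_mutex_lock( &queueMutex );
        queueData.push( msg );
        pthread_cond_signal( &queueCond );
        pthread_mutex_unlock( &queueMutex );
    }

    // Consumer: test the predicate and wait while holding the mutex.
    // pthread_cond_wait atomically releases the mutex and blocks, so a
    // push that happens just before the wait can never be missed.
    std::string fetchMessage()
    {
        pthread_mutex_lock( &queueMutex );
        while ( queueData.empty() )
        {
            pthread_cond_wait( &queueCond, &queueMutex );
        }
        std::string msg = queueData.front();
        queueData.pop();
        pthread_mutex_unlock( &queueMutex );
        return msg;
    }

pthread_cond_wait re-acquires queueMutex before returning, and the while loop re-checks the predicate after every wakeup, including spurious ones.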