QNX pthread_mutex_lock causing deadlock error (45 = EDEADLK)

I am implementing an asynchronous log-writing mechanism for my project's multithreaded application. Below is the relevant part of the code where the error occurs.



void CTraceFileWriterThread::run()
{
    // shouldThreadsRun(): some global function which decides whether operations
    // need to stop. Not really relevant here; assume it returns true.
    bool fShoudIRun = shouldThreadsRun();

    while (fShoudIRun)
    {
        std::string nextMessage = fetchNext();
        if (!nextMessage.empty())
        {
            process(nextMessage);
        }
        else
        {
            fShoudIRun = shouldThreadsRun();
            condVarTraceWriter.wait();
        }
    }
}

// This is the consumer. It runs in my lower-priority thread.
std::string CTraceFileWriterThread::fetchNext()
{
    // When there are a lot of logs, I mean A LOT, I believe control stays
    // in this function for a long time, and another thread calling the
    // "add" function is not able to acquire the lock since it's held here.

    std::string message;

    if (!writeQueue.empty())
    {
        writeQueueMutex.lock();   // object of our wrapper around pthread_mutex_lock
        message = writeQueue.front();
        writeQueue.pop();         // std::queue
        writeQueueMutex.unLock();
    }
    return message;
}




// This is the producer; it is called from multiple threads.
void CTraceFileWriterThread::add(std::string outputString)
{
    if (!outputString.empty())
    {
        // Crashes here while trying to acquire the lock when there are lots
        // of logs in prod systems.
        writeQueueMutex.lock();
        const size_t writeQueueSize = writeQueue.size();

        if (writeQueueSize == maximumWriteQueueCapacity)
        {
            outputString.append("\n queue full, discarding traces, traces are incomplete");
        }

        if (writeQueueSize <= maximumWriteQueueCapacity)
        {
            bool wasEmpty = writeQueue.empty();
            writeQueue.push(outputString);

            condVarTraceWriter.post();   // the consumer waits in a function which calls "fetchNext"
        }
        writeQueueMutex.unLock();
    }
}


int wrapperMutex::lock()
{
    //#[ operation lock()
    int iRetval;
    int iRetry = 10;

    do
    {
        iRetry--;
        tRfcErrno = pthread_mutex_lock(&tMutex);
        if ((tRfcErrno == EINTR) || (tRfcErrno == EAGAIN))
        {
            iRetval = RFC_ERROR;
            (void)sched_yield();
        }
        else if (tRfcErrno != EOK)
        {
            iRetval = RFC_ERROR;
            iRetry = 0;
        }
        else
        {
            iRetval = RFC_OK;
            iRetry = 0;
        }
    } while (iRetry > 0);

    return iRetval;
    //#]
}
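Because the project is pre-C++11 (so std::lock_guard is not available, as noted further down), a minimal scoped guard around this wrapper could look roughly like the sketch below. It relies only on the lock(), unLock(), and RFC_OK names shown above and is illustrative, not part of the original code.

// Sketch only: RAII-style guard for the wrapper above (pre-C++11 idiom).
class ScopedWrapperLock
{
public:
    explicit ScopedWrapperLock(wrapperMutex &m)
        : m_mutex(m)
        , m_locked(m.lock() == RFC_OK)   // remember whether the lock call actually succeeded
    {
    }

    ~ScopedWrapperLock()
    {
        if (m_locked)
            m_mutex.unLock();            // never unlock a mutex we failed to acquire
    }

    bool locked() const { return m_locked; }

private:
    // non-copyable (declare, don't define: the pre-C++11 replacement for "= delete")
    ScopedWrapperLock(const ScopedWrapperLock &);
    ScopedWrapperLock &operator=(const ScopedWrapperLock &);

    wrapperMutex &m_mutex;
    bool m_locked;
};

Call sites can then check guard.locked() before touching writeQueue, instead of ignoring the value returned by lock().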



I generated a core dump and analysed it with GDB; here are some findings:


  1. Program terminated with signal 11, Segmentation fault.


  2. "Errno=45" at the add function where I am trying to acquire the lock. The wrapper we have around pthread_mutex_lock tries to acquire the lock about 10 times before it gives up.


The code works fine when there are fewer logs. Also, we do not have C++11 or later, so we are restricted to QNX's mutexes. Any help is appreciated, as I have been looking at this issue for over a month with little progress. Please ask if any more info is required.


c++ queue mutex deadlock qnx

asked Jan 2 at 9:54, edited Jan 8 at 7:11 – Abhilash Anand

  • We need to see any code that touches the queue, mutex, or condvar. Also, if it's safe to call writeQueue.empty without the mutex, what does the mutex protect exactly? – David Schwartz, Jan 2 at 10:27

  • I have added another relevant function. Only the empty() function was called without the mutex lock. The thought was that it wasn't worth making the other, higher-priority threads wait in the "add" function even when the queue was empty. It doesn't really matter if the queue is deemed empty while an entry is being written, since the log-writing operation is low priority and can wait a few milliseconds. – Abhilash Anand, Jan 2 at 10:43

  • You never check the value returned by wrapperMutex::lock, so it's quite possible you're accessing a non-thread-safe queue without the lock in place. Also, what type of mutex is tMutex -- normal, recursive, something else? If it's a normal pthread mutex, then a deadlock might suggest a thread is trying to relock a mutex that it already owns. – G.M., Jan 2 at 11:35
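
    For reference, errno 45 on QNX is EDEADLK, which pthread_mutex_lock typically reports when it detects that the calling thread already owns a non-recursive mutex. One way to make such a self-relock fail loudly during testing is to build the mutex with the standard POSIX error-checking attribute. This is a minimal sketch of that setup, not the poster's actual wrapper; the function name is illustrative.

    #include <pthread.h>

    // Sketch only: PTHREAD_MUTEX_ERRORCHECK makes pthread_mutex_lock return
    // EDEADLK when the owning thread relocks the mutex, instead of hanging.
    int initErrorCheckingMutex(pthread_mutex_t *mutex)
    {
        pthread_mutexattr_t attr;
        int rc = pthread_mutexattr_init(&attr);
        if (rc != 0)
            return rc;

        rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
        if (rc == 0)
            rc = pthread_mutex_init(mutex, &attr);

        (void)pthread_mutexattr_destroy(&attr);
        return rc;   // 0 (EOK) on success, otherwise an errno-style code
    }

    Checking the value returned by the wrapper, as suggested above, then turns a silent relock into an explicit RFC_ERROR path instead of an unprotected access to the queue.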






  • @AbhilashAnand You wait if the queue was empty, not if the queue is empty. Your run function does not hold the mutex all the way from testing the queue to deciding to wait! (After fetchNext releases the mutex, you call wait, having no idea if the queue is still empty or if the add has already happened.) – David Schwartz, Jan 2 at 12:24

  • @AbhilashAnand The whole point of a condition variable is to provide an atomic "unlock and wait" function so that you don't wait for something that has already happened. Misusing a condition variable will definitely lead to deadlock. Maybe you didn't really want a condition variable, though. – David Schwartz, Jan 3 at 6:03
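
    To illustrate the atomic unlock-and-wait the last two comments describe, the usual POSIX pattern looks roughly like the sketch below. It uses raw pthread primitives rather than the project's wrappers, with the queue, mutex, and condition-variable names borrowed from the question; the key points are that the emptiness check and the wait happen under the same lock, and that the producer signals while holding it.

    #include <pthread.h>
    #include <queue>
    #include <string>

    // Sketch only: a hypothetical raw-pthread version of fetchNext/add,
    // not the poster's wrapper API.
    static pthread_mutex_t writeQueueMutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  writeQueueCond  = PTHREAD_COND_INITIALIZER;
    static std::queue<std::string> writeQueue;

    std::string fetchNextBlocking()
    {
        std::string message;

        pthread_mutex_lock(&writeQueueMutex);
        // Re-test the predicate every time we wake up; pthread_cond_wait
        // atomically releases the mutex while waiting and reacquires it
        // before returning, so no wakeup can slip in between.
        while (writeQueue.empty())
        {
            pthread_cond_wait(&writeQueueCond, &writeQueueMutex);
        }
        message = writeQueue.front();
        writeQueue.pop();
        pthread_mutex_unlock(&writeQueueMutex);

        return message;
    }

    void add(const std::string &outputString)
    {
        pthread_mutex_lock(&writeQueueMutex);
        writeQueue.push(outputString);
        pthread_cond_signal(&writeQueueCond);   // wake the consumer while holding the lock
        pthread_mutex_unlock(&writeQueueMutex);
    }

    A real version would also fold the shouldThreadsRun() shutdown flag into the waited-on predicate so the writer thread can be woken up to exit.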















