A better approach for throttling external processes in C#


























I need to write a GUI application that processes batches of files with external command-line tools. I want to run the tools in parallel, one process per file, and throttle the number of concurrent processes to the number of CPU threads so that CPU usage and throughput are maximized. Here is what I have tried and researched so far:



Parallel.ForEach



When I first asked this question on Stack Overflow, someone advised me to use Parallel.ForEach. It does work, but it ties up threads that do nothing except wait for the external processes to exit. And if the external processes run for a long time, it reduces the degree of parallelism! So I gave up on it and looked for other solutions.



SemaphoreSlim



I simply use



SemaphoreSlim sem = new SemaphoreSlim(Environment.ProcessorCount);


to throttle the number of external processes running at once, and then



await Task.WhenAll(tasks);


to wait for all of them without blocking my GUI.
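The throttling pattern described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from my application: `RunProcessAsync` and `ForEachThrottledAsync` are hypothetical helper names introduced here for illustration, and the sketch assumes .NET 4.6 or later. The process is awaited through its `Exited` event, so no thread is blocked while the tool runs.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ProcessThrottle
{
    // Hypothetical helper: starts an external tool and returns a task that
    // completes with the exit code once the process has exited.
    public static Task<int> RunProcessAsync(string fileName, string arguments)
    {
        var tcs = new TaskCompletionSource<int>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        var process = new Process
        {
            StartInfo = new ProcessStartInfo(fileName, arguments)
            {
                UseShellExecute = false,
                CreateNoWindow = true
            },
            EnableRaisingEvents = true // required for the Exited event to fire
        };
        process.Exited += (sender, e) =>
        {
            tcs.TrySetResult(process.ExitCode);
            process.Dispose();
        };
        try
        {
            process.Start();
        }
        catch (Exception ex)
        {
            process.Dispose();
            tcs.TrySetException(ex);
        }
        return tcs.Task;
    }

    // Hypothetical helper: runs `work` for every item, with at most
    // `maxConcurrency` invocations in flight at any time.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                // Asynchronous wait: the caller is queued, not blocked.
                await gate.WaitAsync().ConfigureAwait(false);
                try { await work(item).ConfigureAwait(false); }
                finally { gate.Release(); }
            }).ToList();

            await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

From a GUI event handler this would be used along the lines of `await ProcessThrottle.ForEachThrottledAsync(files, Environment.ProcessorCount, f => ProcessThrottle.RunProcessAsync("tool.exe", f));`, where `tool.exe` and `files` are placeholders.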



Now I am using this. It works very well.



But there is one problem: MSDN says that SemaphoreSlim is designed for use within a single process when wait times are expected to be very short.
My external processes, however, often run for a very long time (the run time depends on the input file's type and size), so I assume the spin-wait wastes CPU in my case. I have looked for a way to avoid this spin-wait, but so far I have not found one. Some may suggest the traditional Semaphore. I have tried it, but Semaphore is not awaitable, so it blocked my GUI, and if I wrap it in



await Task.Run()


then it performs no better than SemaphoreSlim.



TPL Dataflow



The other solution I found is the TPL Dataflow library. It performs slightly better than SemaphoreSlim, but some of my specific use cases cannot be implemented with it.



For example, I have a bunch of archives. I need to decompress each one, process the files inside it, and then re-compress it. With TPL Dataflow, I planned to split the work into a "decompress" block (parallelism: 1), a "file process" block (parallelism: 12), and a "compress" block (parallelism: 1). But I don't know how to wait for only a subset of the work items in TPL Dataflow. If my understanding is correct, TPL Dataflow only lets me wait for a whole block to complete. In my case, when all the files of the first archive have been processed, the compress block has no way of knowing it. It has to wait until every file from every archive has been processed.



But with SemaphoreSlim, I can use



await Task.WhenAll(someoftasks);


in each foreach iteration over the archives to await only that archive's tasks. That gives me higher throughput, so I gave up on TPL Dataflow.
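The per-archive waiting described above can be sketched like this. It is a hedged illustration under stated assumptions, not my real code: `ArchivePipeline`, `ProcessArchiveAsync`, `ProcessAllAsync` and the three delegates (`decompress`, `processFile`, `compress`) are hypothetical names, and the delegates stand in for whatever launches the external tools. Only the file-processing step is throttled here, and the archives are started concurrently, which is one possible variation on the foreach loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ArchivePipeline
{
    // Handles one archive: decompress, process its files under the shared
    // gate, wait for this archive's files only, then re-compress.
    public static async Task ProcessArchiveAsync(
        string archive,
        SemaphoreSlim gate,
        Func<string, Task<IReadOnlyList<string>>> decompress,
        Func<string, Task> processFile,
        Func<string, Task> compress)
    {
        IReadOnlyList<string> files = await decompress(archive).ConfigureAwait(false);

        var fileTasks = files.Select(async file =>
        {
            await gate.WaitAsync().ConfigureAwait(false);
            try { await processFile(file).ConfigureAwait(false); }
            finally { gate.Release(); }
        }).ToList();

        // Waits for the files of this archive, not for every file overall.
        await Task.WhenAll(fileTasks).ConfigureAwait(false);

        await compress(archive).ConfigureAwait(false);
    }

    // Starts all archives; the shared gate keeps the total number of
    // concurrently processed files within the limit.
    public static Task ProcessAllAsync(
        IEnumerable<string> archives,
        SemaphoreSlim gate,
        Func<string, Task<IReadOnlyList<string>>> decompress,
        Func<string, Task> processFile,
        Func<string, Task> compress)
    {
        return Task.WhenAll(archives.Select(a =>
            ProcessArchiveAsync(a, gate, decompress, processFile, compress)));
    }
}
```

Because each archive awaits only its own file tasks, its compress step can start as soon as its last file is done, even while files from other archives are still being processed.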



Conclusion



So after my research, I am still using SemaphoreSlim. It works very well, but I am concerned that its spin-wait wastes CPU. Is there a better approach for throttling external processes in C#?


































  • There's very little point in worrying about the overhead of your waiting mechanism, since the cost of creating and destroying processes dwarfs all of that. Busy waiting would still be something to worry about, but SemaphoreSlim only uses spinwaits for very short intervals before switching to a "proper" wait. The confusion stems from thinking SemaphoreSlim is only appropriate for short waits -- that's not true. It's just that when you have a short wait, it's less expensive than Semaphore. When you have a long wait, the difference is negligible.
    – Jeroen Mostert
    Nov 19 '18 at 13:25










  • You should have a look at this implementation using ActionBlock<T>, I think it does what you want
    – Liam
    Nov 19 '18 at 13:27










  • Confusingly stated, it is not what they meant. Note that the article compares Semaphore against SemaphoreSlim. "Very short" wait times give SemaphoreSlim an edge over Semaphore. If they are not short then it is pretty likely that the semaphore needs to block, and then you can't see the difference anymore.
    – Hans Passant
    Nov 19 '18 at 18:44

























c# .net async-await semaphore














edited Nov 19 '18 at 21:14









Richardissimo











asked Nov 19 '18 at 13:16









Syun































