Quite a few of my tasks end with 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED
Author | Message |
---|---|
Send message Joined: 15 Jan 17 Posts: 1 Credit: 1,317 RAC: 0 |
Quite a few of my tasks running on a W10 host seem to end with an error indicating an exceeded time limit. An example is https://boinc.nanohub.org/nanoHUB_at_home/result.php?resultid=440412 (host ID 115). I can't see why, though. Any thoughts? Edit: The last 2 seem OK. |
Send message Joined: 11 Jan 17 Posts: 99 Credit: 224,673 RAC: 0 |
I have not seen that error message, but I have had several that apparently got stuck. I am running on an i7-3770 (Ubuntu 16.04), and one work unit now is at the 9 hour point; most finish in about 2 to 4 hours. Also, the Progress % is shown at 100% on BoincTasks, and the CPU% is now decreasing. https://boinc.nanohub.org/nanoHUB_at_home/result.php?resultid=441643 I will let it go for a few more hours, and then get rid of it. It may be the same thing you see, but without the error message. |
Send message Joined: 11 Jan 17 Posts: 99 Credit: 224,673 RAC: 0 |
"one work unit now is at the 9 hour point; most finish in about 2 to 4 hours." I ended it at the 11 hour point, as the CPU% continued to drop. Clearly, it was not doing any work. |
Send message Joined: 8 Feb 18 Posts: 9 Credit: 64,768 RAC: 0 |
I've just started getting some of the WUs to finish successfully today. |
Send message Joined: 31 Mar 19 Posts: 1 Credit: 2,962 RAC: 0 |
|
Send message Joined: 19 Jun 19 Posts: 1 Credit: 25,868 RAC: 0 |
Maybe useful for debugging: The WUs that fail for me (with time exceeded) throw a lot of tar errors upon booting and initialization (something along the lines of "cannot open file, file exists"). Obviously, something goes wrong with preparing the environment. Additionally, these WUs will never print the final two lines following the "Running..." one. Healthy WUs have {wuname}.boinc {wuname}.sh at the end. The faulty ones haven't. Work never starts. |
Send message Joined: 24 Apr 19 Posts: 53 Credit: 114,639 RAC: 0 |
"one work unit now is at the 9 hour point; most finish in about 2 to 4 hours." Those WUs running endlessly are p...g me right off. By letting those WUs run like that, you lose valuable time that could be used to crunch for other projects. I hope this will be resolved soon. |
Send message Joined: 7 Apr 17 Posts: 60 Credit: 26,471 RAC: 0 |
Some of these wus <message> |
Send message Joined: 29 Dec 18 Posts: 7 Credit: 13,559 RAC: 0 |
Since the 30th of October I have produced 8 of these errors on my single contributing machine (ID 706). This machine otherwise produces error-free results in most cases (only these 8 errors plus 3 more which I caused by playing around with the machine settings), indicating that it is a systematic error not related to my machine. And yes: ALL of these errors have occurred on virtually all other machines trying to process these tasks: https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5889752 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5870545 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5814681 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5920751 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5876333 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5876330 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5922819 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5871189 The tasks are being resent with a replication of 5, although the cause of the error seems clear: somehow these tasks do not "converge to a result", and it appears to me that the project lead has (validly) decided to run these for around an hour and then abort them (a safety measure?). This would actually be OK with me, BUT THEN THESE TASKS SHOULD RECEIVE CREDITS PROPORTIONAL TO THE CPU POWER INVESTED FOR AROUND THAT HOUR. Anything else makes no sense and is unfair, because with correctly running tasks I can complete one approx. every 1-3 minutes, and the cause of this error is unrelated to the machine hardware. These are valid runs which appear to indicate scientifically unfavorable starting conditions? Michael. President of Rechenkraft.net e.V. - Germany. |
Send message Joined: 8 Oct 19 Posts: 5 Credit: 35,026 RAC: 0 |
Yeah, I just had a bunch of them give errors after running for 50 minutes or so. I ran memtest86+ for 3 hours afterwards just in case, but no RAM faults were detected. I just let them keep running for hours on end so that the underlying algorithm or mathematics can be adjusted later in the event that it's an actually useful error. |
Send message Joined: 7 Apr 17 Posts: 60 Credit: 26,471 RAC: 0 |
Same here, a lot of "EXIT_DISK_LIMIT_EXCEEDED" errors. At first I thought it was a problem with my PC, but it seems it's not. |
Send message Joined: 29 Dec 18 Posts: 7 Credit: 13,559 RAC: 0 |
"Same here, a lot of 'EXIT_DISK_LIMIT_EXCEEDED' errors." The "196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED" is one of the two other errors I mentioned above. Since tasks of this error type were completed successfully by others, I did not go into further details. One thing, however, is clear: it is not caused by insufficient disk space on my HD. I checked the BOINC settings for max. hard disk usage and there were still >>10 GB free. I have now re-adjusted the settings to allow for 100 GB of disk usage, yet I still occasionally get the same error. Errors on all machines: https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5910677 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5866696 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5913490 Error on my machine but successfully validated by another: https://boinc.nanohub.org/nanoHUB_at_home/result.php?resultid=7180261 https://boinc.nanohub.org/nanoHUB_at_home/result.php?resultid=7180246 Error on my machine, currently under validation by another: https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5916623 https://boinc.nanohub.org/nanoHUB_at_home/result.php?resultid=7243981 https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5930751 Michael. P.S.: Another WU will soon cross the 1 hr runtime limit. I guess it will then give the "197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED" error... President of Rechenkraft.net e.V. - Germany. |
Send message Joined: 29 Dec 18 Posts: 7 Credit: 13,559 RAC: 0 |
"P.S.: Another WU will soon cross the 1 hr runtime limit. I guess it will then give the '197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED' error..." As expected: https://boinc.nanohub.org/nanoHUB_at_home/workunit.php?wuid=5781326 Michael. President of Rechenkraft.net e.V. - Germany. |
Send message Joined: 29 Dec 18 Posts: 7 Credit: 13,559 RAC: 0 |
...in the meantime, around 67 faulty tasks have accumulated, of course consuming considerable compute resources. I find it quite problematic that, apparently, nobody from the project team even comments on these issues. In my experience, error reports like those posted above are a very valuable (and free!) tool for improving a project. Michael. President of Rechenkraft.net e.V. - Germany. |
Send message Joined: 27 Mar 20 Posts: 1 Credit: 0 RAC: 0 |
Has this problem been replicated when running 90% of available BOINC CPUs? |
Send message Joined: 24 Apr 19 Posts: 53 Credit: 114,639 RAC: 0 |
"Has this problem been replicated when running 90% of available BOINC CPUs?" The project hasn't spluttered out any WUs in weeks. Without new work, no one can tell if the problem still exists. |
Send message Joined: 11 Jan 17 Posts: 99 Credit: 224,673 RAC: 0 |
I am getting this error on about 10% of my work units on an i9-10900F running Ubuntu 20.04.2. Here is a typical one: https://boinc.nanohub.org/nanoHUB_at_home/result.php?resultid=10208839 But the errors are not the problem. The long one-hour timeout is the problem. It should be shortened to no more than 30 minutes; even 10 minutes would be enough. The successful work units are running less than 3 minutes on this machine. This project is wasting a lot of time that could be used for COVID. They need to wake up. |
Send message Joined: 27 Apr 18 Posts: 1 Credit: 663 RAC: 0 |
So many of the tasks I've downloaded recently have errored out (with "Validate error") and there's been no easy way of understanding why. Here's an example: https://boinc.nanohub.org/nanoHUB_at_home/result.php?resultid=10123526 I'm guessing it must be something to do with VirtualBox, as I don't have any issues with other projects that use VB...so I cannot for sure say where the problem is. So, I've sadly aborted many tasks so that other crunchers can download them and they may have better luck than my hosts. regards Tim |
Send message Joined: 11 Jan 17 Posts: 99 Credit: 224,673 RAC: 0 |
"I'm guessing it must be something to do with VirtualBox, as I don't have any issues with other projects that use VB...so I cannot for sure say where the problem is." Do you have enough memory? I can't see your machine. Unfortunately, looking at the result will not show much, since your PC completed it OK insofar as it was concerned. Another possibility is overclocking. That is never good on scientific projects. And make sure that the BOINC Data folder is excluded from your anti-virus program. |
Send message Joined: 7 Apr 17 Posts: 60 Credit: 26,471 RAC: 0 |
Same here, a LOT of "EXIT_TIME_LIMIT_EXCEEDED" errors. Is it a VirtualBox problem? |
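
For a rough sense of what the one-hour timeout discussed in the thread costs, here is a minimal sketch. It assumes the figures reported above (about 10% of work units timing out, successful ones taking about 3 minutes) and treats every failed task as running until the timeout. The function name `wasted_fraction` is made up for illustration and is not part of BOINC.

```python
def wasted_fraction(failure_rate, timeout_min, success_min):
    """Fraction of total CPU time spent on tasks that end in a timeout.

    Assumes every failing task runs for the full timeout and every
    successful task takes success_min minutes (both are simplifications).
    """
    wasted = failure_rate * timeout_min
    useful = (1.0 - failure_rate) * success_min
    return wasted / (wasted + useful)


# Figures taken from the thread: ~10% failures, ~3-minute successful tasks.
for timeout in (60, 30, 10):
    share = wasted_fraction(0.10, timeout, 3)
    print(f"timeout {timeout:2d} min -> {share:.0%} of CPU time lost")
```

Under these assumptions, a 60-minute timeout loses roughly 69% of CPU time to failed tasks, a 30-minute timeout about 53%, and a 10-minute timeout about 27%. Since successful tasks were reported as taking less than 3 minutes, the real share could be higher.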