Posts by Jim1348

1) Message boards : Number crunching : Long work units run better with Linux (and so do short ones) (Message 711)
Posted 21 Apr 2021 by Jim1348
Post:
Thanks for the info. I really did not have the statistics.
But you might try letting the longer ones just run until they time out. I am seeing longer ones that actually work, which is a change.
2) Message boards : Number crunching : Long work units run better with Linux (and so do short ones) (Message 708)
Posted 20 Apr 2021 by Jim1348
Post:
I have noticed some long (over 1 hour, and even over 2 hours) work units in the last day or so that completed successfully. Those always used to error out for me. But in checking the results, I noticed that they had all failed on several other machines, all running Windows 10. So it appears that Windows 10 VirtualBox does not run the same as Linux VirtualBox.
https://boinc.nanohub.org/nanoHUB_at_home/results.php?hostid=11493&offset=0&show_names=0&state=4&appid=

Then I looked at some of the short ones too, which ran for the usual couple of minutes. I don't see any valid Windows ones there either.

So I will leave this machine running. It seems to be one of the few that works. Good luck to all.

PS - I am seeing only the ones that failed on Windows. Apparently "validate" does not mean comparing two machines, as on most projects. I get a failure rate of about 10%. Maybe someone with Windows could report on what they see.
3) Message boards : Number crunching : 2 machines, thousands of WU's, 10 credit in 36 hours. What's wrong? (Message 706)
Posted 19 Apr 2021 by Jim1348
Post:
Most of mine are _1 through _4. But I am getting a number of _0 too, so it appears that they are generating new work.
It may not show up on the server list, since the demand about equals supply.
4) Message boards : Number crunching : 2 machines, thousands of WU's, 10 credit in 36 hours. What's wrong? (Message 698)
Posted 17 Apr 2021 by Jim1348
Post:
These WU's thrash the drives terribly upon copying a fresh VM to each new BOINC data slot folder.

I posted on that shortly after the project started several years ago. They will kill an SSD fast. I have never seen such high writes.
In fact, I measured it a couple of days ago on an i9-10900F (the low-power version, not the fastest) running four virtual cores on nanoHUB under Ubuntu 20.04.2.
The writes were 52 GB/hour, or 1.25 TB/day. I never allow more than 70 GB/day, though that may be a bit conservative.

But I always use a write-cache anyway. On that machine, I set it to 8 GB and 1 hour latency (write delay), using the built-in cache that Linux provides.
https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
You would probably be OK with less memory, maybe 2 GB and 10 minutes latency, since the work units are so short, but use something.
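
For anyone who wants to work out the numbers, here is a rough sketch of the arithmetic and of the kernel settings involved. The exact sysctl values are not listed in this post, so the use of the `vm.dirty_*` knobs (the ones discussed in the linked article), the background threshold at half the cache size, and the helper names are all assumptions for illustration only.

```python
# Sketch only: the exact settings used on the machine are not given in the post.
GIB = 1024 ** 3


def daily_writes_gb(gb_per_hour):
    """Convert an hourly write rate into GB written per day."""
    return gb_per_hour * 24


def write_cache_sysctls(cache_gib, delay_minutes):
    """Return vm.dirty_* values for a given cache size and write delay.

    Assumption: background flushing starts at half the cache size.
    """
    cache_bytes = cache_gib * GIB
    return {
        "vm.dirty_bytes": cache_bytes,
        "vm.dirty_background_bytes": cache_bytes // 2,
        # The kernel expects this delay in hundredths of a second.
        "vm.dirty_expire_centisecs": delay_minutes * 60 * 100,
    }


if __name__ == "__main__":
    print(daily_writes_gb(52), "GB/day at 52 GB/hour")
    for key, value in write_cache_sysctls(8, 60).items():
        print(f"{key} = {value}")
```

The printed `key = value` lines are in the form that `/etc/sysctl.conf` accepts. For the smaller setup, `write_cache_sysctls(2, 10)` gives the 2 GB and 10 minute variant.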

On Windows, I use PrimoCache. You need it.
https://www.romexsoftware.com/en-us/primo-cache/index.html
5) Message boards : Number crunching : Quite a few of my tasks end with 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED (Message 691)
Posted 16 Apr 2021 by Jim1348
Post:
I'm guessing it must be something to do with VirtualBox, as I don't have any issues with other projects that use VB...so I cannot for sure say where the problem is.
Do you have enough memory? I can't see your machine.
Unfortunately, looking at the result will not show much, since your PC completed it OK insofar as it was concerned.

Another possibility is overclocking. That is never good on scientific projects.
And make sure that the BOINC Data folder is excluded from your anti-virus program.
6) Message boards : Number crunching : Quite a few of my tasks end with 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED (Message 689)
Posted 16 Apr 2021 by Jim1348
Post:
I am getting this error on about 10% of my work units on an i9-10900F running Ubuntu 20.04.2.
Here is a typical one:
https://boinc.nanohub.org/nanoHUB_at_home/result.php?resultid=10208839

But the errors are not the problem. The long one-hour timeout is the problem.
It should be shortened to no more than 30 minutes; even 10 minutes would be enough.

The successful work units are running less than 3 minutes on this machine.
This project is wasting a lot of time that could be used for COVID. They need to wake up.
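
To put a rough number on the waste, here is a small sketch. It assumes a 10% failure rate, that every failed work unit runs until the full timeout, and that every successful one takes 3 minutes (the post only says "less than 3 minutes"). The function name is made up for illustration.

```python
def wasted_fraction(failure_rate, timeout_min, success_min):
    """Fraction of total run time spent on work units that end in failure.

    Assumes each failure runs for the whole timeout before erroring out.
    """
    failed = failure_rate * timeout_min
    succeeded = (1.0 - failure_rate) * success_min
    return failed / (failed + succeeded)


if __name__ == "__main__":
    # Compare the current one-hour timeout with the shorter ones suggested.
    for timeout in (60, 30, 10):
        share = wasted_fraction(0.10, timeout, 3)
        print(f"{timeout:2d} min timeout: {share:.0%} of run time wasted")
```

Under those assumptions, a one-hour timeout means roughly two thirds of the run time goes to work units that fail, and a 10-minute timeout cuts that to about a quarter.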
7) Message boards : Number crunching : Export statistics (Message 687)
Posted 15 Apr 2021 by Jim1348
Post:
But the project is still not exporting statistics.
I haven't looked at this in years, but today BOINC Stats says:
nanoHUB@home 214,366.43 0.01 4,287 149 149 149 18
https://www.boincstats.com/stats/-1/user/detail/256953/projectList
8) Message boards : News : Maintenance complete (Message 676)
Posted 1 Jan 2021 by Jim1348
Post:
Even the new year might not do it. Maybe the new school year will help, though I wouldn't bet on it.
9) Questions and Answers : Getting started : Downloading task (Message 670)
Posted 26 Nov 2020 by Jim1348
Post:
Sure. Check the server status. They have no work.
https://boinc.nanohub.org/nanoHUB_at_home/server_status.php
10) Message boards : News : Request for feedback (Message 640)
Posted 6 Oct 2020 by Jim1348
Post:
They need a plan for reducing their error rate also. It is practically unusable.
11) Message boards : Number crunching : New app 1.14 (vbox64_mt) (Message 623)
Posted 21 Sep 2020 by Jim1348
Post:
The good news is that the error rate appears to be lower, though after less than one day I can't say too much.
The bad news is that the ones that do error take longer to do it. One group takes 1 1/2 hours, another group takes 2 1/2 hours.
https://boinc.nanohub.org/nanoHUB_at_home/results.php?hostid=8898&offset=0&show_names=0&state=6&appid=

Why not just get on with it?
12) Message boards : Number crunching : New app 1.14 (vbox64_mt) (Message 622)
Posted 21 Sep 2020 by Jim1348
Post:
I am just starting to run this new version on a Ryzen 3600 (Ubuntu 20.04.1, VirtualBox 6.1.6 machine), and they are OK for the moment.
https://boinc.nanohub.org/nanoHUB_at_home/results.php?hostid=8898
13) Message boards : Number crunching : Message: VM VM Hypervisor failed to enter an online state in a timely fashion (Message 617)
Posted 25 Aug 2020 by Jim1348
Post:
Currently, I had this set at 60% and I still got this message. Checking, disk usage was pegged at 100%, meaning the current batch of tasks from nanohub were slamming my disk drive.

Yes, nanoHUB does hammer the disk drive. I posted on it a long time ago. The write rate is high, but the work units are short. I don't know if it will damage the drive or not. I use SSDs, which are fast enough to avoid this problem but may need to be protected. In Linux, I increase the built-in write buffer to around 4 GB with a one-hour write delay. In Windows, I use the Samsung Rapid Mode cache (included in their Magician utility), or even better PrimoCache, which allows for a bigger write buffer (similar to the Linux values). I think the Crucial SSDs have a cache included in their utility too.
14) Message boards : Number crunching : Message: VM VM Hypervisor failed to enter an online state in a timely fashion (Message 615)
Posted 25 Aug 2020 by Jim1348
Post:
I usually limit it to a maximum of 8 work units at a time

How can I do it?

I use a simple version:

<app_config>
  <app>
    <name>boinc2docker</name>
    <max_concurrent>8</max_concurrent>
  </app>
</app_config>

It runs a maximum of 8 work units, with one CPU core per work unit.
You can add the other items as mikey suggests if you wish.
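
If you want to generate the file for a different limit, something like this sketch would do it. The function name is hypothetical, and it only writes the two fields shown above. As far as I know the result has to be saved as app_config.xml in the project's folder under the BOINC data directory, and then re-read from the BOINC Manager (or BOINC restarted).

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent):
    """Return app_config.xml text that caps concurrent tasks for one app."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    if hasattr(ET, "indent"):  # pretty-printing is only available in Python 3.9+
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Same app name and limit as in the hand-written version.
    print(build_app_config("boinc2docker", 8))
```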
15) Message boards : Number crunching : Message: VM VM Hypervisor failed to enter an online state in a timely fashion (Message 612)
Posted 24 Aug 2020 by Jim1348
Post:
45 WUs are stuck in the "Postponed: VM Hypervisor failed..." (33% of progress).

In BOINC settings, are you allowing enough memory to be used? I set it to 95%.

I haven't seen a problem with my Ryzen 2700, with 8/16 cores and 32 GB (Ubuntu 18.04 and VBox 5.2.42),
though I usually limit it to a maximum of 8 work units at a time.
16) Message boards : Number crunching : Message: VM VM Hypervisor failed to enter an online state in a timely fashion (Message 609)
Posted 22 Aug 2020 by Jim1348
Post:
How many work units are you trying to run? They take about 2 GB each.
You have 24 cores and 20 GB of memory.
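
A quick way to check it, as a sketch: the 2 GB per work unit figure is my rough estimate, the function name is made up, and it ignores whatever memory the operating system and other programs need, so treat the result as an upper bound.

```python
def max_work_units(cores, memory_gb, gb_per_task=2.0, memory_fraction=1.0):
    """Upper bound on concurrent work units, limited by cores and memory.

    memory_fraction is the share of RAM that BOINC is allowed to use.
    """
    usable_gb = memory_gb * memory_fraction
    by_memory = int(usable_gb // gb_per_task)
    return min(cores, by_memory)


if __name__ == "__main__":
    # 24 cores and 20 GB: memory is the limit, not the core count.
    print(max_work_units(24, 20))
    # Same machine with BOINC allowed to use 95% of memory.
    print(max_work_units(24, 20, memory_fraction=0.95))
```

With 24 cores and 20 GB, memory runs out at about 10 work units, well before the cores do.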
17) Message boards : Number crunching : Nanohub and wrapper (Message 598)
Posted 11 Aug 2020 by Jim1348
Post:
As such, and since most of the other projects I'm involved with require an updated v6+ VBox, sadly I'll have to go "no new tasks" with this project until this new wrapper is adopted.

I use VirtualBox 5.2.42 on several Ubuntu 18.04.4 machines on Cosmology and LHC, as well as nanoHUB. It usually causes fewer problems than 6.x from the reports I have seen. It seems that the extra delay in VBox 6.x causes "file not found" and similar errors.

I have used 5.2.x on Win7 64-bit without problems also. I don't recall a project where it did not work, but you must know of something.
18) Message boards : Number crunching : New app version (1.13) (Message 595)
Posted 10 Aug 2020 by Jim1348
Post:
They are back to their old behavior. Failures occur at the 1 hour point, about as many as before.
The new version did not change that. I will let it run, on two cores only.
19) Message boards : Number crunching : New app version (1.13) (Message 594)
Posted 9 Aug 2020 by Jim1348
Post:
Maybe it is unrelated, but I may be seeing fewer errors. But the main difference is that the timeout is now down to about 47 minutes.
Often they fail in only 2 or 3 minutes. That is progress.
20) Message boards : Number crunching : A lot of errors (Message 589)
Posted 2 Aug 2020 by Jim1348
Post:
Yes. The main problem is that it takes too long to fail - 2 hours 30 minutes on my Ryzen 2700.
That is even longer than before, when they failed in about an hour.

They are going backwards.

