Thread 'Server problems'

Message boards : News : Server problems
ice00
Project administrator
Project developer

Joined: 28 Oct 17
Posts: 220
Credit: 59,125
RAC: 12
Message 3692 - Posted: 9 Sep 2023, 11:09:55 UTC

Hi,

During the holidays one of the server's disks suddenly died, so the server became unavailable.

Unfortunately I came back from holiday a week later, so by then all workunits and credits had already timed out.

The disk died on 20 Aug during the day. The DB is the one from 8:00, as it comes from a clean backup (no WUs or results were touched by the disk damage; however, some WUs may be missing from the DB for credit purposes, though not from the collected results).

As the server was running an old OS that will soon stop receiving updates, I am also installing the new version of the server, to get years of updates.

There are still some problems configuring all the services, because of incompatibilities between the old and new libraries, so you should plan on reactivating your ODLK1 calculations next week at the earliest.

However, bookmark this page: http://boinc-status.multi-pool.info/

It is hosted on another server, so in case of problems you can check the state of ODLK1 there (unless the internet connection is down, as both servers use the same one).

Thanks
kotenok2000
Joined: 29 Dec 19
Posts: 2
Credit: 8,424
RAC: 0
Message 3693 - Posted: 9 Sep 2023, 13:12:01 UTC - in response to Message 3692.  

What is the difference between this project and https://boinc.progger.info/odlk/ ?
ice00
Project administrator
Project developer

Joined: 28 Oct 17
Posts: 220
Credit: 59,125
RAC: 12
Message 3694 - Posted: 9 Sep 2023, 16:52:38 UTC - in response to Message 3693.  

> What is the difference between this project and https://boinc.progger.info/odlk/ ?


Essentially, ODLK is supported in Russian, while ODLK1 is supported in English. ODLK1 was created to give more hardware resources to the Latin squares research.

The algorithms used are the same in the two projects, but they use different input spaces (so the two projects should not find duplicates).

Results are then analysed by the same mathematicians.
Steve Dodd

Joined: 5 Dec 17
Posts: 8
Credit: 3,274,863
RAC: 2,194
Message 3695 - Posted: 9 Sep 2023, 17:34:18 UTC - in response to Message 3692.  

So, I have hundreds of work units that are going to fail due to time out - no response. Should I just delete them all from my work queue and lose all of the credit?
Greger

Joined: 14 Jan 18
Posts: 6
Credit: 16,600,886
RAC: 4,786
Message 3696 - Posted: 9 Sep 2023, 18:31:31 UTC

Not able to fetch new work
latinsquares	Sep 09, 2023, 08:28:15 PM	Scheduler request failed: Error 403
ice00
Project administrator
Project developer

Joined: 28 Oct 17
Posts: 220
Credit: 59,125
RAC: 12
Message 3697 - Posted: 9 Sep 2023, 18:53:16 UTC - in response to Message 3695.  

> So, I have hundreds of work units that are going to fail due to time out - no response. Should I just delete them all from my work queue and lose all of the credit?


Unfortunately the BOINC timeout is one week, so I think it has already internally marked the WUs to be resent. However, I do not know whether it grants credit (maybe reduced) for a WU that arrives after the timeout.
ice00
Project administrator
Project developer

Joined: 28 Oct 17
Posts: 220
Credit: 59,125
RAC: 12
Message 3698 - Posted: 9 Sep 2023, 19:10:16 UTC - in response to Message 3696.  

> Not able to fetch new work
> latinsquares	Sep 09, 2023, 08:28:15 PM	Scheduler request failed: Error 403


No, the server is still under maintenance: every service must be reactivated without errors caused by the changed OS and libraries, and the DB and the WU filesystem need to be synchronized.

Also, since the SSL certificate has been replaced with a new one, I would expect the project to need reattaching in BOINC Manager once everything is working (because of its security checks, although so far this does not seem to be happening, which would be good).
Conan

Joined: 10 Nov 17
Posts: 15
Credit: 2,125,847
RAC: 1,954
Message 3699 - Posted: 11 Sep 2023, 9:05:41 UTC
Last modified: 11 Sep 2023, 9:12:47 UTC

Getting ERROR 403 when trying to contact the project. Is the SSL certificate change causing this?

Conan

PS:- Just tried detaching and re-attaching and still getting the same error.
Demis

Joined: 7 Nov 17
Posts: 29
Credit: 13,770,308
RAC: 154
Message 3700 - Posted: 11 Sep 2023, 15:23:47 UTC - in response to Message 3699.  

Yes.
This problem is known.
The restoration work has not yet been fully completed.
Just need to wait.
Demis

Joined: 7 Nov 17
Posts: 29
Credit: 13,770,308
RAC: 154
Message 3701 - Posted: 11 Sep 2023, 21:18:30 UTC - in response to Message 3699.  

Hey guys, please check it out.
Everything seemed to work as it should.
Conan

Joined: 10 Nov 17
Posts: 15
Credit: 2,125,847
RAC: 1,954
Message 3702 - Posted: 11 Sep 2023, 21:21:53 UTC

Thank you, all uploaded and credited.

You have done a great job, all appears to be working.

Conan
[AF>Le_Pommier] Jerome_C2005

Joined: 10 Feb 18
Posts: 3
Credit: 1,002,282
RAC: 0
Message 3703 - Posted: 11 Sep 2023, 21:51:52 UTC

Great job guy(s) !

Thanks :)
kinjirra

Joined: 13 Jul 23
Posts: 1
Credit: 4,539,628
RAC: 5,176
Message 3705 - Posted: 12 Sep 2023, 21:59:15 UTC

Getting upload failures and tasks aborted by the project. However, it's still processing work? Confused.
ice00
Project administrator
Project developer

Joined: 28 Oct 17
Posts: 220
Credit: 59,125
RAC: 12
Message 3707 - Posted: 13 Sep 2023, 17:31:06 UTC - in response to Message 3705.  

> Getting upload failures and tasks aborted by the project. However, it's still processing work? Confused.


In the last 24h:
119543 tasks were reported correctly with a valid result
0 tasks could not be sent from the server to a client
16097 tasks had calculation errors
0 tasks got no reply
26 tasks were unnecessary
5 tasks had validation errors
414 tasks were abandoned by the client

It is possible that there are some problems in this initial restart phase (maybe within a week things will improve).

If you could check whether the client log has more information about the upload failure, that would help.
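
To put the 24h counts in proportion, here is a minimal Python sketch that turns them into percentages. The labels are shortened paraphrases of the list above and the numbers are the ones quoted there; nothing else about the server is assumed.

```python
# Task outcome counts for the last 24h, copied from the list above.
# The dictionary keys are shortened labels chosen for this sketch.
outcomes = {
    "valid result": 119543,
    "could not be sent": 0,
    "calculation error": 16097,
    "no reply": 0,
    "unnecessary": 26,
    "validation error": 5,
    "abandoned by client": 414,
}

total = sum(outcomes.values())


def share(label: str) -> float:
    """Return the percentage of all reported tasks with the given outcome."""
    return 100.0 * outcomes[label] / total


for label, count in outcomes.items():
    print(f"{label:>20}: {count:>7} ({share(label):5.2f}%)")
print(f"{'total':>20}: {total:>7}")
```

With these figures, roughly 88% of the reported tasks validated and about 12% ended in a calculation error.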
Skivelitis2
Joined: 17 Nov 17
Posts: 1
Credit: 4,067,215
RAC: 0
Message 3710 - Posted: 15 Sep 2023, 9:42:13 UTC

Has stats export been enabled?
Greger

Joined: 14 Jan 18
Posts: 6
Credit: 16,600,886
RAC: 4,786
Message 3711 - Posted: 15 Sep 2023, 13:47:52 UTC
Last modified: 15 Sep 2023, 13:49:06 UTC

Stats are not updated (last modified 2023-09-11)

https://boinc.multi-pool.info/latinsquares/stats/
ice00
Project administrator
Project developer

Joined: 28 Oct 17
Posts: 220
Credit: 59,125
RAC: 12
Message 3712 - Posted: 15 Sep 2023, 16:57:50 UTC - in response to Message 3710.  

> Has stats export been enabled?


It should be, as before, but thanks for pointing out that it is not updating; tomorrow I will look for the problem and fix it.
ice00
Project administrator
Project developer

Joined: 28 Oct 17
Posts: 220
Credit: 59,125
RAC: 12
Message 3713 - Posted: 15 Sep 2023, 17:33:24 UTC

The stats problem is already resolved.
Greger

Joined: 14 Jan 18
Posts: 6
Credit: 16,600,886
RAC: 4,786
Message 3714 - Posted: 16 Sep 2023, 5:41:32 UTC

Thanks
Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 3083
Credit: 0
RAC: 0
Message 3719 - Posted: 17 Sep 2023, 3:39:47 UTC - in response to Message 3707.  

> 16097 tasks had calculation errors

What is the reason for the calculation error?

©2024 Progger & Stefano Tognon (ice00) & Reese