1 task takes Very Very long time.

k
Joined: 28 Jan 18
Posts: 3
Credit: 60,935
RAC: 0
Message 3647 - Posted: 15 Jan 2023, 3:57:03 UTC

https://boinc.multi-pool.info/latinsquares/result.php?resultid=437216883
name: odlkmax_26000_1672832179.366460_2
Env: Linux(Fedora 37)

It has already been running for over 4 hours.
Has it found something special, or is it just a bug?
The other tasks work fine.

Hurricane
Joined: 6 Sep 21
Posts: 1
Credit: 11,319,950
RAC: 7,979
Message 3648 - Posted: 15 Jan 2023, 5:04:09 UTC - in response to Message 3647.  

I get those long buggy tasks all the time. Definitely a bug. 🐛 And it will finish as an error too...
Better to kill ✋ the task.
My screen is usually off, so I never see them in time to kill the tasks.

Server state Over
Outcome Computation error
Client state Compute error
Exit status 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED
Computer ID 110328
Run time 7 hours 56 min 13 sec
CPU time 7 hours 51 min 33 sec
Validate state Invalid
Credit 0.00
Device peak FLOPS 4.27 GFLOPS
Application version odlk3@home v1.00
windows_x86_64

k
Joined: 28 Jan 18
Posts: 3
Credit: 60,935
RAC: 0
Message 3649 - Posted: 15 Jan 2023, 6:06:46 UTC - in response to Message 3648.  

Thank you.

Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3650 - Posted: 15 Jan 2023, 11:54:44 UTC

This is a real bug in the task.
Unfortunately there are more than 50k such tasks.
And the most annoying thing is that if we just kill this task on the client, then it is simply transferred to another client.
I have studied such errors.
Mostly the reason is that the squares of the beginning and end of the problem are swapped.
As a result, it would take several weeks to get the result, and that is not possible in the BOINC client (this is how its code is written, and this is normal).

Such tasks will finally be rejected by the server only after 99999 iterations.
If, for example, one iteration takes one day, then 99999 × 1 day = 99999 days, and 99999 / 365 ≈ 273.97 years.

That is, it will take almost 274 years for one such task to be cancelled.
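
As a rough check of that estimate, here is a small Python sketch. The one-day-per-iteration figure is the hypothetical value from the post, not a measurement, and the function name is made up for illustration.

```python
# Rough check of the estimate above.
MAX_ITERATIONS = 99_999   # iteration limit quoted in the post
DAYS_PER_ITERATION = 1    # hypothetical value from the post, not measured
DAYS_PER_YEAR = 365       # leap years ignored, as in the post


def years_until_rejected(iterations=MAX_ITERATIONS,
                         days_per_iteration=DAYS_PER_ITERATION):
    """Return how many years pass before the iteration limit is reached."""
    total_days = iterations * days_per_iteration
    return total_days / DAYS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_until_rejected():.2f} years")
```

The result is just under 274 years, which becomes 273 if the fraction is dropped.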

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 3083
Credit: 0
RAC: 0
Message 3651 - Posted: 17 Jan 2023, 8:58:12 UTC - in response to Message 3650.  

I have studied such errors.
Mostly the reason is that the squares of the beginning and end of the problem are swapped.

Demis
Please provide an example of such a task.
So that there are two squares - the beginning and the end, which are mixed up in places.

Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3652 - Posted: 23 Jan 2023, 7:34:59 UTC - in response to Message 3651.  

input_odlkmax_27_0076651_00105.txt
input_odlkmax_27_0076651_00105-to-check.txt

0 1000000
0 9 8 4 7 6 3 2 5 1
2 1 4 6 5 3 8 9 0 7
6 0 2 9 8 4 1 3 7 5
1 7 6 3 9 8 4 5 2 0
9 2 3 0 4 7 5 8 1 6
8 3 0 1 2 5 7 6 9 4
7 4 5 8 0 9 6 1 3 2
3 5 9 2 6 1 0 7 4 8
4 6 7 5 1 2 9 0 8 3
5 8 1 7 3 0 2 4 6 9

0 9 8 4 7 6 3 2 5 1
8 1 7 5 3 2 9 4 0 6
4 5 2 9 6 0 1 3 7 8
9 7 6 3 1 8 4 0 2 5
6 2 0 1 4 7 5 8 9 3
3 8 1 6 2 5 0 9 4 7
7 4 5 8 0 9 6 1 3 2
2 3 9 0 5 1 8 7 6 4
1 6 3 7 9 4 2 5 8 0
5 0 4 2 8 3 7 6 1 9

./main_mosch_kf_2 input.txt output.txt 99999
PowerMeter2 intervals LS10

Start time:2023-01-18.15:39:06

File 'output.txt' created.
Line N27 start:

0 9 8 4 7 6 3 2 5 1
2 1 4 6 5 3 8 9 0 7
6 4 2 5 9 8 1 3 7 0
1 7 5 3 6 0 4 8 9 2
8 0 1 9 4 7 5 6 2 3
7 3 0 1 2 5 9 4 6 8
4 2 7 8 1 9 6 0 3 5
3 5 9 0 8 4 2 7 1 6
9 6 3 7 . . . . 8 .
5 . . . . . . . . 9

Range end. Desired square not reached.23-01-23.02:48:44
File 'out-not-finded-last-dirty-ls-99999.txt' created.
The dirty square of the end point of the search is recorded.
0 9 8 4 7 6 3 2 5 1
2 1 4 6 8 3 9 5 0 7
1 4 2 5 0 8 7 3 9 6
8 7 6 3 1 9 4 0 2 5
9 2 5 0 4 7 8 6 1 3
3 8 7 9 2 5 0 1 6 4
4 5 1 8 3 0 6 9 7 2
6 0 9 2 5 4 1 7 3 8
7 6 3 1 9 2 5 4 8 0
5 3 0 7 6 1 2 8 4 9

See file: out-not-finded-last-dirty-ls-99999.txt

Time work: 181095 sec

Finish time:2023-01-23.02:48:44

Program is end . . .


last dls (dirty latin squares, file out-not-finded-last-dirty-ls-99999.txt)
0 9 8 4 7 6 3 2 5 1
2 1 4 6 8 3 9 5 0 7
1 4 2 5 0 8 7 3 9 6
8 7 6 3 1 9 4 0 2 5
9 2 5 0 4 7 8 6 1 3
3 8 7 9 2 5 0 1 6 4
4 5 1 8 3 0 6 9 7 2
6 0 9 2 5 4 1 7 3 8
7 6 3 1 9 2 5 4 8 0
5 3 0 7 6 1 2 8 4 9
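
For anyone who wants to sanity-check a square like the one above, here is a minimal Python sketch. It only tests the standard diagonal Latin square property (every row, every column and both main diagonals contain each of 0..9 exactly once). It says nothing about what the program means by "dirty", which is not defined in this thread, and the function names are made up for illustration.

```python
def parse_square(text):
    """Turn whitespace-separated rows of digits into a list of lists of ints."""
    return [[int(tok) for tok in line.split()]
            for line in text.strip().splitlines()]


def is_diagonal_latin_square(sq):
    """Check rows, columns and both main diagonals for the symbols 0..n-1."""
    n = len(sq)
    symbols = set(range(n))
    if any(len(row) != n for row in sq):
        return False
    rows_ok = all(set(row) == symbols for row in sq)
    cols_ok = all({sq[r][c] for r in range(n)} == symbols for c in range(n))
    main_ok = {sq[i][i] for i in range(n)} == symbols
    anti_ok = {sq[i][n - 1 - i] for i in range(n)} == symbols
    return rows_ok and cols_ok and main_ok and anti_ok


# The square recorded in out-not-finded-last-dirty-ls-99999.txt.
LAST_SQUARE = parse_square("""
0 9 8 4 7 6 3 2 5 1
2 1 4 6 8 3 9 5 0 7
1 4 2 5 0 8 7 3 9 6
8 7 6 3 1 9 4 0 2 5
9 2 5 0 4 7 8 6 1 3
3 8 7 9 2 5 0 1 6 4
4 5 1 8 3 0 6 9 7 2
6 0 9 2 5 4 1 7 3 8
7 6 3 1 9 2 5 4 8 0
5 3 0 7 6 1 2 8 4 9
""")

if __name__ == "__main__":
    print(is_diagonal_latin_square(LAST_SQUARE))
```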

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 3083
Credit: 0
RAC: 0
Message 3653 - Posted: 23 Jan 2023, 9:54:28 UTC - in response to Message 3652.  
Last modified: 23 Jan 2023, 9:55:00 UTC

input_odlkmax_27_0076651_00105.txt
input_odlkmax_27_0076651_00105-to-check.txt

0 1000000
0 9 8 4 7 6 3 2 5 1
2 1 4 6 5 3 8 9 0 7
6 0 2 9 8 4 1 3 7 5
1 7 6 3 9 8 4 5 2 0
9 2 3 0 4 7 5 8 1 6
8 3 0 1 2 5 7 6 9 4
7 4 5 8 0 9 6 1 3 2
3 5 9 2 6 1 0 7 4 8
4 6 7 5 1 2 9 0 8 3
5 8 1 7 3 0 2 4 6 9

0 9 8 4 7 6 3 2 5 1
8 1 7 5 3 2 9 4 0 6
4 5 2 9 6 0 1 3 7 8
9 7 6 3 1 8 4 0 2 5
6 2 0 1 4 7 5 8 9 3
3 8 1 6 2 5 0 9 4 7
7 4 5 8 0 9 6 1 3 2
2 3 9 0 5 1 8 7 6 4
1 6 3 7 9 4 2 5 8 0
5 0 4 2 8 3 7 6 1 9

What makes you think that these squares are swapped?
They are not swapped: the second square is "larger" than the first, as it should be.

The problem is that there are almost no CF SN DLS between these squares.
Start the CF generation from the first (starting) square and see how the generation goes.
It is unlikely that a million CF SN DLS will accumulate in this interval.
That is why the task hangs.
But it is difficult or even impossible to detect this in advance.
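
One way to double-check the ordering of the two squares quoted above is the following Python sketch. It assumes that "larger" means ordinary row-by-row lexicographic comparison of the cells. The thread does not say which ordering the generator actually uses, so treat this as an illustration only; the helper names are hypothetical.

```python
def flatten(text):
    """Read a square written as rows of digits into one flat list of ints."""
    return [int(tok) for tok in text.split()]


FIRST = flatten("""
0 9 8 4 7 6 3 2 5 1
2 1 4 6 5 3 8 9 0 7
6 0 2 9 8 4 1 3 7 5
1 7 6 3 9 8 4 5 2 0
9 2 3 0 4 7 5 8 1 6
8 3 0 1 2 5 7 6 9 4
7 4 5 8 0 9 6 1 3 2
3 5 9 2 6 1 0 7 4 8
4 6 7 5 1 2 9 0 8 3
5 8 1 7 3 0 2 4 6 9
""")

SECOND = flatten("""
0 9 8 4 7 6 3 2 5 1
8 1 7 5 3 2 9 4 0 6
4 5 2 9 6 0 1 3 7 8
9 7 6 3 1 8 4 0 2 5
6 2 0 1 4 7 5 8 9 3
3 8 1 6 2 5 0 9 4 7
7 4 5 8 0 9 6 1 3 2
2 3 9 0 5 1 8 7 6 4
1 6 3 7 9 4 2 5 8 0
5 0 4 2 8 3 7 6 1 9
""")


def first_difference(a, b, n=10):
    """Return (row, col, a_value, b_value) for the first differing cell, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return divmod(i, n) + (x, y)
    return None


if __name__ == "__main__":
    # Python compares lists lexicographically, cell by cell.
    print(FIRST < SECOND)
    print(first_difference(FIRST, SECOND))
```

Under that assumption the squares first differ in the first cell of the second row (2 in the first square, 8 in the second), so the first square does come before the second.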

Generation goes like this

. . . . . . . . . 
SN DLS: 56000000 CF: 14508 time: 320 sec
SN DLS: 56500000 CF: 14508 time: 323 sec
SN DLS: 57000000 CF: 14508 time: 326 sec
SN DLS: 57500000 CF: 14508 time: 329 sec
SN DLS: 58000000 CF: 14508 time: 332 sec
SN DLS: 58500000 CF: 14508 time: 334 sec
SN DLS: 59000000 CF: 14508 time: 337 sec
SN DLS: 59500000 CF: 14508 time: 341 sec
SN DLS: 60000000 CF: 14508 time: 345 sec
SN DLS: 60500000 CF: 14508 time: 347 sec
SN DLS: 61000000 CF: 14508 time: 350 sec
SN DLS: 61500000 CF: 14508 time: 353 sec
SN DLS: 62000000 CF: 14508 time: 356 sec
SN DLS: 62500000 CF: 14508 time: 358 sec
SN DLS: 63000000 CF: 14508 time: 361 sec
SN DLS: 63500000 CF: 14508 time: 363 sec
SN DLS: 64000000 CF: 14508 time: 366 sec
SN DLS: 64500000 CF: 14508 time: 369 sec
SN DLS: 65000000 CF: 14508 time: 373 sec
SN DLS: 65500000 CF: 14508 time: 378 sec
SN DLS: 66000000 CF: 14508 time: 381 sec
SN DLS: 66500000 CF: 14508 time: 384 sec
SN DLS: 67000000 CF: 14508 time: 387 sec
SN DLS: 67500000 CF: 14508 time: 391 sec
SN DLS: 68000000 CF: 14508 time: 396 sec
SN DLS: 68500000 CF: 14508 time: 399 sec
SN DLS: 69000000 CF: 14508 time: 403 sec
SN DLS: 69500000 CF: 14508 time: 407 sec
SN DLS: 70000000 CF: 14508 time: 410 sec
SN DLS: 70500000 CF: 14508 time: 414 sec
SN DLS: 71000000 CF: 14508 time: 418 sec
. . . . . . . . . . . . . . 
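
The stall is easy to see by comparing the first and last lines of that log. Below is a small Python sketch. The parser reflects an assumption about the line format (three integers per line: SN DLS count, CF count, elapsed seconds) and ignores the labels, so it does not matter in which language they are printed. The function names are made up for illustration.

```python
import re


def parse_progress(line):
    """Extract (sn_dls_count, cf_count, seconds) from one progress line, or None."""
    numbers = re.findall(r"\d+", line)
    if len(numbers) != 3:
        return None  # e.g. the ". . . ." filler lines
    return tuple(int(n) for n in numbers)


def summarize(lines):
    """Compare the first and last parsable lines of a progress log."""
    records = [r for r in map(parse_progress, lines) if r is not None]
    (s0, c0, t0), (s1, c1, t1) = records[0], records[-1]
    return {
        "squares_scanned": s1 - s0,
        "new_cf": c1 - c0,
        "elapsed_sec": t1 - t0,
    }


# First, one middle, and last line of the log, labels rendered in English.
SAMPLE = [
    ". . . . . . . . .",
    "SN DLS: 56000000 CF: 14508 time: 320 sec",
    "SN DLS: 63500000 CF: 14508 time: 363 sec",
    "SN DLS: 71000000 CF: 14508 time: 418 sec",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On these lines, 15,000,000 SN DLS are scanned in 98 seconds without a single new CF being found.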

P.S. I don't understand anything in the rest of your post.
Perhaps ice00 understands it.

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 3083
Credit: 0
RAC: 0
Message 3654 - Posted: 23 Jan 2023, 10:03:00 UTC
Last modified: 23 Jan 2023, 10:03:38 UTC

By the way, if this task is always “killed”, the CF SN DLS in this interval will never be checked.

Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3655 - Posted: 23 Jan 2023, 12:39:00 UTC - in response to Message 3653.  

I took an example from this year that was on my client on 01/05/2023.

In general, the problematic tasks (those that run for more than 4 hours) that arrive on a client's computer are of several types:
1. The first and second squares are from different rulers.
2. The first and second squares are from the same ruler, but the distance between them is excessively large (it does not fit into the client's standard working time).
3. The first and second squares are from the same ruler, but the distance between them is so large that it does not fit into the 99,999-iteration limit after which the server ends the task.
4. The first and second squares are from the same ruler, but are swapped (very often the real distance between them == (the size of the entire ruler) - 1,000,000).
5. There were a few more variants; they are (mostly) mixtures of the first four.

The specific example shown is option #3.
In this example, it can be seen that after 99,999 iterations of 1,000,000 LS each (99,999 * 1,000,000 = 99,999,000,000), the task is still not completed.
I showed above how to estimate the time of one iteration.

It took me about 5 days to compute this.
Ask ice00 in a week or two what stage the input_odlkmax_27_0076651_00105.txt task is at.
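
The arithmetic for this example can be sketched in Python as follows. The 181095 seconds is the "Time work" value from the program output posted earlier in this thread. Spreading it evenly over all 99,999 iterations is an assumption made here purely for illustration; the thread does not state how the time is distributed.

```python
ITERATIONS = 99_999                # iteration limit passed to the program
SQUARES_PER_ITERATION = 1_000_000  # LS per iteration, as stated in the post
TIME_WORK_SEC = 181_095            # "Time work" from the posted program output

# Total number of LS covered before the iteration limit is hit.
total_squares = ITERATIONS * SQUARES_PER_ITERATION

# Assumption: every iteration took the same amount of time.
sec_per_iteration = TIME_WORK_SEC / ITERATIONS

if __name__ == "__main__":
    print(f"total LS: {total_squares:,}")
    print(f"approx. seconds per iteration: {sec_per_iteration:.2f}")
```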

Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3656 - Posted: 23 Jan 2023, 12:52:33 UTC - in response to Message 3654.  


Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3657 - Posted: 30 Jan 2023, 22:56:53 UTC


Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3660 - Posted: 12 Feb 2023, 17:56:24 UTC
Last modified: 12 Feb 2023, 17:56:45 UTC


Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3661 - Posted: 15 Feb 2023, 22:16:14 UTC


Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3669 - Posted: 3 Mar 2023, 22:00:25 UTC


Demis
Joined: 7 Nov 17
Posts: 24
Credit: 13,567,470
RAC: 11,606
Message 3672 - Posted: 13 Mar 2023, 23:49:37 UTC - in response to Message 3651.  

I have studied such errors.
Mostly the reason is that the squares of the beginning and end of the problem are swapped.

Demis
Please provide an example of such a task.
So that there are two squares - the beginning and the end, which are mixed up in places.

https://boinc.multi-pool.info/latinsquares/result.php?resultid=448332584

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 3083
Credit: 0
RAC: 0
Message 3673 - Posted: 14 Mar 2023, 3:44:46 UTC - in response to Message 3672.  
Last modified: 14 Mar 2023, 3:46:59 UTC

I have studied such errors.
Mostly the reason is that the squares of the beginning and end of the problem are swapped.

Demis
Please provide an example of such a task.
So that there are two squares - the beginning and the end, which are mixed up in places.

https://boinc.multi-pool.info/latinsquares/result.php?resultid=448332584

I don't see any squares on the given link.
I don't have access to the server.

Workunit 386752768
name odlkmax_10114_1678261751.903925
application odlkmax@home
created 8 Mar 2023, 7:49:11 UTC
Tasks are hidden pending completion
©2024 Progger & Stefano Tognon (ice00)