Error while Downloading

HK-Steve
Joined: 18 Jan 18
Posts: 5
Credit: 7,810,533
RAC: 0
Message 829 - Posted: 5 Sep 2019, 13:42:06 UTC

I just started getting this error in the last hour: "Error while downloading".

This is happening across all my rigs. Thanks.

HK-Steve
Joined: 18 Jan 18
Posts: 5
Credit: 7,810,533
RAC: 0
Message 832 - Posted: 6 Sep 2019, 9:59:44 UTC - in response to Message 829.  

Am I the only one getting errors?

ice00
Project administrator
Project developer
Joined: 28 Oct 17
Posts: 134
Credit: 49,844
RAC: 4
Message 833 - Posted: 6 Sep 2019, 17:05:55 UTC

Do you still have the problem?

Two days ago the internet speed was low at random times, but yesterday I asked my provider to check the line and he did not find any problem.

HK-Steve
Joined: 18 Jan 18
Posts: 5
Credit: 7,810,533
RAC: 0
Message 834 - Posted: 6 Sep 2019, 19:46:45 UTC - in response to Message 833.  
Last modified: 6 Sep 2019, 19:47:35 UTC

Do you still have the problem?

Two days ago the internet speed was low at random times, but yesterday I asked my provider to check the line and he did not find any problem.


Yes, I am still having issues.
https://boinc.multi-pool.info/latinsquares/results.php?userid=1107&offset=0&show_names=0&state=6&appid=

ice00
Project administrator
Project developer
Joined: 28 Oct 17
Posts: 134
Credit: 49,844
RAC: 4
Message 836 - Posted: 7 Sep 2019, 9:40:08 UTC - in response to Message 834.  


Yes, I am still having issues.
https://boinc.multi-pool.info/latinsquares/results.php?userid=1107&offset=0&show_names=0&state=6&appid=


I cannot access your page.

Can you send the event log entries related to this project (Tools > Event Log in your BOINC application)? Use a PM if they contain potentially sensitive data.

Thanks

Nick Name
Joined: 7 May 18
Posts: 1
Credit: 1,005,652
RAC: 0
Message 837 - Posted: 8 Sep 2019, 7:02:07 UTC - in response to Message 832.  
Last modified: 8 Sep 2019, 7:08:37 UTC

Am I the only one getting errors?

No, I have also been seeing this in the last few days and currently have around 200 tasks showing an "Error while downloading". Today my client had 33 tasks with this error, and my host was in a backoff period with 9 hours remaining. Work started flowing again after a manual update.

I don't have time to look through the entire list but the few I looked at all had the same error message.

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>input_odlk3_66_0253047_00043.txt</file_name>
  <error_code>-224 (permanent HTTP error)</error_code>
  <error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>
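
For anyone sifting through many failed tasks, a small script can pull the file name and error code out of stderr text like the above. This is only a sketch: the function name `parse_xfer_errors` is made up for illustration, and it uses a regular expression rather than an XML parser because the pasted text is not a well-formed XML document on its own.

```python
import re

# Matches one <file_xfer_error> block and captures the file name, the numeric
# error code, and the optional description in parentheses.
_XFER_RE = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)(?:\s*\((?P<desc>[^)]*)\))?</error_code>"
)


def parse_xfer_errors(stderr_text):
    """Return a list of (file_name, error_code, description) tuples."""
    errors = []
    for match in _XFER_RE.finditer(stderr_text):
        errors.append(
            (
                match.group("name").strip(),
                int(match.group("code")),
                match.group("desc") or "",
            )
        )
    return errors


# The sample is the stderr text quoted in this post.
sample = """<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>input_odlk3_66_0253047_00043.txt</file_name>
  <error_code>-224 (permanent HTTP error)</error_code>
  <error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>"""

parsed = parse_xfer_errors(sample)
print(parsed)
```

Counting the parsed tuples per file name or per error code would show quickly whether all failures share one cause.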


I haven't had any network problems that I know of and this host picked up work from a second project without any problems.
Team USA forum

ice00
Project administrator
Project developer
Joined: 28 Oct 17
Posts: 134
Credit: 49,844
RAC: 4
Message 838 - Posted: 8 Sep 2019, 7:45:52 UTC - in response to Message 837.  

Indeed, input_odlk3_66_0253047_00043.txt is currently not present on the server.

I have doubled the maximum number of workunits that one can download in a day, so that should help.

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 1092
Credit: 0
RAC: 0
Message 839 - Posted: 8 Sep 2019, 8:46:54 UTC - in response to Message 838.  
Last modified: 8 Sep 2019, 8:51:46 UTC

Sorry to interfere.

Here
https://boinc.multi-pool.info/latinsquares/workunit.php?wuid=157127913

I see

Task | Computer | Sent | Time reported | Status | Run time | CPU time | Credit | Application
189667883 | 84389 | 7 Sep 2019, 5:44:45 UTC | 7 Sep 2019, 5:50:53 UTC | Error while downloading | 0.00 | 0.00 | --- | odlk3@home v1.00 windows_intelx86
190510650 | 85910 | 7 Sep 2019, 5:48:29 UTC | 7 Sep 2019, 6:11:59 UTC | Error while downloading | 0.00 | 0.00 | --- | odlk3@home v1.00 windows_x86_64
190511168 | 46251 | 7 Sep 2019, 5:57:21 UTC | 7 Sep 2019, 9:30:02 UTC | Error while downloading | 0.00 | 0.00 | --- | odlk3@home v1.00 windows_x86_64
190512601 | 72064 | 7 Sep 2019, 6:30:30 UTC | 7 Sep 2019, 6:30:55 UTC | Error while downloading | 0.00 | 0.00 | --- | odlk3@home v1.00 x86_64-pc-linux-gnu
190513769 | 83758 | 7 Sep 2019, 6:34:47 UTC | 7 Sep 2019, 23:01:56 UTC | Error while downloading | 0.00 | 0.00 | --- | odlk3@home v1.00 windows_x86_64

The errors are obvious. What is the reason?

I have doubled the maximum number of workunits that one can download in a day, so that should help.

That is, several WUs will simply be lost?

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 1092
Credit: 0
RAC: 0
Message 840 - Posted: 8 Sep 2019, 8:56:19 UTC - in response to Message 838.  

Indeed, input_odlk3_66_0253047_00043.txt is currently not present on the server.

The WU existed. It was not processed because there was a download error.
Is this WU no longer on the server?
So the WU has disappeared and will not be processed?

ice00
Project administrator
Project developer
Joined: 28 Oct 17
Posts: 134
Credit: 49,844
RAC: 4
Message 842 - Posted: 8 Sep 2019, 12:34:25 UTC - in response to Message 840.  

I think it is not present because the task generator created a DB entry for the workunit but did not generate the workunit's TXT file (so the file is missing and you get a download error).

But with 295,000 tasks being managed right now, it is difficult to track down the problem.
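
One way to narrow this down at scale would be to compare the input file names the DB expects against what actually exists on disk. The sketch below is only an illustration under assumptions: the list of expected names would come from a DB query that is not shown, the helper names are invented, and because the layout of the server's download directory is not described here, it simply walks the directory tree recursively.

```python
import os
import tempfile


def files_on_disk(download_root):
    """Collect the base names of all files under download_root (recursively)."""
    names = set()
    for _dirpath, _dirnames, filenames in os.walk(download_root):
        names.update(filenames)
    return names


def find_missing_inputs(expected_names, download_root):
    """Return the expected input file names that have no file on disk, sorted."""
    present = files_on_disk(download_root)
    return sorted(name for name in expected_names if name not in present)


# Self-contained demo: a temporary directory stands in for the download
# directory. "input_ok_00001.txt" and the "a1" subdirectory are invented.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a1"))
    with open(os.path.join(root, "a1", "input_ok_00001.txt"), "w") as fh:
        fh.write("placeholder\n")
    expected = ["input_ok_00001.txt", "input_odlk3_66_0253047_00043.txt"]
    missing = find_missing_inputs(expected, root)
    print(missing)
```

Building the set of names once and testing membership keeps the check cheap even with several hundred thousand expected files.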

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 1092
Credit: 0
RAC: 0
Message 843 - Posted: 8 Sep 2019, 13:44:32 UTC - in response to Message 842.  

But with 295,000 tasks being managed right now, it is difficult to track down the problem.

What is the significance of the tasks that are being processed?

The errors were in other tasks!
Can you see the errors, for example, here?
https://boinc.multi-pool.info/latinsquares/workunit.php?wuid=157127913

Can you establish the cause of these errors?

ice00
Project administrator
Project developer
Joined: 28 Oct 17
Posts: 134
Credit: 49,844
RAC: 4
Message 845 - Posted: 8 Sep 2019, 15:20:31 UTC - in response to Message 843.  
Last modified: 8 Sep 2019, 15:20:54 UTC


What is the significance of the tasks that are being processed?


Tasks ready to send	435163
Tasks being processed	289010



From the log it seems to be what I supposed: the generator creates a workunit entry in the DB but not the associated TXT file, so when someone gets that workunit there is no TXT file to download.

I'm investigating why.

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 1092
Credit: 0
RAC: 0
Message 846 - Posted: 8 Sep 2019, 16:23:12 UTC - in response to Message 845.  
Last modified: 8 Sep 2019, 16:24:34 UTC


What is the significance of the tasks that are being processed?


Tasks ready to send	435163
Tasks being processed	289010

I see it myself.
What does this have to do with errors?

From the log it seems to be what I supposed: the generator creates a workunit entry in the DB but not the associated TXT file, so when someone gets that workunit there is no TXT file to download.

A WU consists of two DLS, as far as I know.
Do you assume that the erroneous WUs do not have these two squares?
That is, there is a WU name, but there is no TXT file with the squares?

So, the problem is with WU generation?
And again the same question: who did the WU generation?

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 1092
Credit: 0
RAC: 0
Message 847 - Posted: 8 Sep 2019, 16:36:41 UTC
Last modified: 8 Sep 2019, 16:46:49 UTC

Task 189667883
Name odlk3_1630_1567652305.872677_0
Workunit 157127913
Created 5 Sep 2019, 2:58:29 UTC
Sent 7 Sep 2019, 5:44:45 UTC
Report deadline 14 Sep 2019, 5:44:45 UTC
Received 7 Sep 2019, 5:50:53 UTC
Server state Over
Outcome Computation error
Client state Downloading
Exit status -186 (0xFFFFFF46) ERR_RESULT_DOWNLOAD


Stderr output
...
app_version download error: couldn't get input files:
<file_xfer_error>
  <file_name>odlk3_1.0_windows_intelx86.exe</file_name>
  <error_code>-200</error_code>
</file_xfer_error>

</message>

See
https://boinc.multi-pool.info/latinsquares/result.php?resultid=189667883

What does error_code -200 mean?
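
As a quick reference for the codes quoted in this thread, here is a small lookup sketch. The entries for -186 and -224 are taken from the messages above. The entry for -200 is an assumption (believed to be ERR_WRONG_SIZE in BOINC's `lib/error_numbers.h`) and should be verified against the BOINC source; the names `KNOWN_CODES` and `describe` are invented for illustration.

```python
# Error codes quoted in this thread.
# -186 and -224 are as reported in the task pages above.
# The -200 entry is an ASSUMPTION (believed to be ERR_WRONG_SIZE in BOINC's
# lib/error_numbers.h); verify it before relying on it.
KNOWN_CODES = {
    -186: "ERR_RESULT_DOWNLOAD",
    -224: "permanent HTTP error",
    -200: "ERR_WRONG_SIZE (assumed: downloaded file has the wrong size)",
}


def describe(code):
    """Return a short description for a code, or a pointer to the source."""
    return KNOWN_CODES.get(code, "unknown code, see BOINC lib/error_numbers.h")


for code in (-186, -200, -224, -1):
    print(code, "->", describe(code))
```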

ice00
Project administrator
Project developer
Joined: 28 Oct 17
Posts: 134
Credit: 49,844
RAC: 4
Message 849 - Posted: 9 Sep 2019, 9:34:05 UTC - in response to Message 846.  


Tasks ready to send	435163
Tasks being processed	289010

I see it myself.
What does this have to do with errors?

It has to do with how the error can be found.
If there are 100 tasks to send and 50 tasks being processed, it is simple to query the DB and follow all the logs to catch a problem.
With 435,000 WUs it is very hard.


From the log it seems to be what I supposed: the generator creates a workunit entry in the DB but not the associated TXT file, so when someone gets that workunit there is no TXT file to download.
A WU consists of two DLS, as far as I know.
Do you assume that the erroneous WUs do not have these two squares?
That is, there is a WU name, but there is no TXT file with the squares?


Yes.
A WU in BOINC is not only the two DLS. The two DLS are just the input for the client application.

On the server side, a WU is generated from XML templates that contain many parameters, plus a TXT file to send to the client (the WU name is generated so that it is unique). The WU is then stored in the DB with all its information except the TXT file (for ODLK1 the input is only about 456 bytes, two DLS, but for projects like SETI or Einstein the client downloads megabytes of data).

There can be several WUs with the same TXT file.
For example, a project like PrimeGrid tests the same TXT file more than once (2 or 3 times) to be sure that a CPU bug did not produce a false result (e.g. reporting that a number is prime when it is not).
Also, if a client stops computing a WU before it is really finished (for example, the user abandons the project), BOINC generates another WU with the same TXT file to process it again.
The same happens if a WU is scheduled to be finished within a week and the user does not complete it in time:
a new WU with the same TXT file is generated and given to another user to be tested.
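
The reissue behaviour described above can be illustrated with a toy model. This is a sketch under assumptions, not BOINC's actual server logic: the class names `Task` and `WorkItem`, the field `needed_successes`, and the state strings are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: int
    input_file: str
    # One of: "in_progress", "success", "error", "timed_out"
    state: str = "in_progress"


@dataclass
class WorkItem:
    input_file: str
    needed_successes: int = 2  # e.g. 2 or 3 when results are cross-checked
    tasks: list = field(default_factory=list)

    def issue(self, task_id):
        """Create one more task that refers to the same input file."""
        task = Task(task_id, self.input_file)
        self.tasks.append(task)
        return task

    def count(self, state):
        return sum(1 for t in self.tasks if t.state == state)

    def tasks_to_issue(self):
        """Tasks still needed, given successes and tasks already running."""
        outstanding = self.count("success") + self.count("in_progress")
        return max(0, self.needed_successes - outstanding)


item = WorkItem("input_odlk3_66_0253047_00043.txt")
first = item.issue(1)
second = item.issue(2)

first.state = "success"      # one host finished normally
second.state = "timed_out"   # the other missed the deadline

reissue_count = item.tasks_to_issue()
for offset in range(reissue_count):
    item.issue(3 + offset)   # replacement task with the same input file

print(reissue_count, [t.input_file for t in item.tasks])
```

Every task created this way points at the same input file, which is why one missing TXT file can produce download errors on several hosts.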


So, the problem is with WU generation?
And again the same question: who did the WU generation?

It is automatic, since it is a daemon that is always running.

I suspect that, with the queue having dropped to 0 last week, the daemon started creating WUs at a very high rate in a short time to refill the queue, so the process that creates a WU and associates its TXT file hit a fault that does not occur during normal activity.
Let's say that the generator has a latent bug in it.

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 1092
Credit: 0
RAC: 0
Message 850 - Posted: 9 Sep 2019, 10:14:36 UTC - in response to Message 849.  

It is automatic, since it is a daemon that is always running.

There is no automatic task generation in the project or it works very poorly.
Otherwise, why was there such a Sprint?
https://boinc.multi-pool.info/latinsquares/forum_thread.php?id=39

Why does automatic generation not maintain a constant number of available tasks?
Why does the number of tasks keep dropping to zero?
An automatic generator must not allow this!

Task generation is done by Progger.
I know that at the very beginning Progger did it.
I suggested he pass it on to you, but he did not agree, citing the fact that there are very complex things that you will have to explain.

Consequently, Progger began to do task generation too quickly and made a mistake somewhere.
So, are there still erroneous WUs in the database? A lot of them?

What does Progger say?
Are you in contact with him?
By the way, it would not hurt him to clarify the situation here!

Natalia Makarova
Project scientist
Joined: 22 Oct 17
Posts: 1092
Credit: 0
RAC: 0
Message 851 - Posted: 9 Sep 2019, 10:21:11 UTC
Last modified: 9 Sep 2019, 10:35:11 UTC

Tasks decrease very quickly!

Tasks ready to send 314738
Tasks are being processed 288611

There were over 600,000 tasks "ready to send".
I suspect that many tasks were erroneous.

If there is an automatic task generator, it should turn on when a certain minimal number of ready-made tasks is reached, and this number should never be reduced to zero.
In addition, the automatic generator will not rush and make mistakes :)
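
The threshold idea above could look roughly like this. It is only a sketch with invented names and numbers (`tasks_to_generate`, `low_water`, `high_water`, the drain rate of 300 per tick), not the project's actual generator.

```python
def tasks_to_generate(ready_count, low_water, high_water):
    """Return how many tasks to create now.

    Generation starts only when the queue has fallen to low_water or below,
    and then refills the queue up to high_water.
    """
    if low_water < 0 or high_water < low_water:
        raise ValueError("need 0 <= low_water <= high_water")
    if ready_count > low_water:
        return 0
    return high_water - ready_count


# Simulation: hosts take up to 300 tasks per tick, then the generator runs.
ready = 1000
history = []
for _tick in range(6):
    ready -= min(300, ready)
    ready += tasks_to_generate(ready, low_water=200, high_water=1000)
    history.append(ready)

print(history)
```

As long as `low_water` is larger than what hosts can take between two generator runs, the queue in this model never reaches zero.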


©2021 Progger & Stefano Tognon (ice00)