"Watchdog" crash
Good evening, I have a recurring "WATCHDOG" crash on my miner!
The log says:
=== Last 50 lines of /var/log/miner/claymore/lastrun_reboot.log ===
16:07:04:627 f2ca9700 buf: {"jsonrpc":"2.0","id":0,"result":["0xfb20efb0aef42731a0b9f60763ac05c0c6bdb43e5eb3d8a1a4292a1e29875568","0x4e2977c9152afafb8ea63ea5434ada0692b481b8ad37d05391c3a7738d63eb5d","0x000000006df37f675ef6eadf5ab9a2072d44268d97df837e6748956e5c6c2116"]}
16:07:04:627 f2ca9700 parse packet: 242
16:07:04:627 f2ca9700 ETH: job is the same
16:07:04:627 f2ca9700 new buf size: 0
16:07:07:369 fe721700 GPU 0 temp = 71, old fan speed = 22, new fan speed = 26
16:07:07:370 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
16:07:10:370 fe721700 GPU 0 temp = 71, old fan speed = 23, new fan speed = 27
16:07:10:370 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
16:07:12:875 f2ca9700 got 243 bytes
16:07:12:875 f2ca9700 buf: {"jsonrpc":"2.0","id":0,"result":["0x88342ad3d8e70776b3a0d50bdc90ecf56a11c1fd3ddb930823aaa8767a52292b","0x4e2977c9152afafb8ea63ea5434ada0692b481b8ad37d05391c3a7738d63eb5d","0x000000006df37f675ef6eadf5ab9a2072d44268d97df837e6748956e5c6c2116"]}
16:07:12:875 f2ca9700 parse packet: 242
16:07:12:875 f2ca9700 ETH: job changed
16:07:12:875 f2ca9700 new buf size: 0
16:07:12:875 f2ca9700 ETH: 03/04/18-16:07:12 - New job from eth-eu2.nanopool.org:9999
16:07:12:875 f2ca9700 target: 0x000000006df37f67 (diff: 10000MH), epoch 173(2.35GB)
16:07:12:875 f2ca9700 ETH - Total Speed: 11.706 Mh/s, Total Shares: 300, Rejected: 0, Time: 21:31
16:07:12:875 f2ca9700 ETH: GPU0 0.000 Mh/s, GPU1 11.706 Mh/s
16:07:13:371 fe721700 GPU 0 temp = 72, old fan speed = 24, new fan speed = 29
16:07:13:371 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
16:07:13:371 fe721700 GPU0 t=72C fan=26%, GPU1 t=47C fan=33%
16:07:14:568 f2ca9700 ETH: checking pool connection…
16:07:14:568 f2ca9700 send: {"worker": "", "jsonrpc": "2.0", "params": [], "id": 3, "method": "eth_getWork"}
16:07:14:630 f2ca9700 got 243 bytes
16:07:14:630 f2ca9700 buf: {"jsonrpc":"2.0","id":0,"result":["0x88342ad3d8e70776b3a0d50bdc90ecf56a11c1fd3ddb930823aaa8767a52292b","0x4e2977c9152afafb8ea63ea5434ada0692b481b8ad37d05391c3a7738d63eb5d","0x000000006df37f675ef6eadf5ab9a2072d44268d97df837e6748956e5c6c2116"]}
16:07:14:630 f2ca9700 parse packet: 242
16:07:14:630 f2ca9700 ETH: job is the same
16:07:14:630 f2ca9700 new buf size: 0
16:07:16:371 fe721700 GPU 0 temp = 72, old fan speed = 26, new fan speed = 31
16:07:16:371 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
16:07:17:710 f0ca5700 srv_thr cnt: 1, IP: 127.0.0.1
16:07:17:710 f0ca5700 recv: 51
16:07:17:710 f0ca5700 srv pck: 50
16:07:17:710 f0ca5700 srv bs: 0
16:07:17:710 f0ca5700 sent: 159
16:07:19:327 f2ca9700 send: {"id":6,"jsonrpc":"2.0","method":"eth_submitHashrate","params":["0xb2a3c8", "0x0000000000000000000000000000000000000000000000000000000024932c48"]}
16:07:19:372 fe721700 GPU 0 temp = 72, old fan speed = 28, new fan speed = 33
16:07:19:372 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
16:07:19:835 fef22700 em hbt: 1, fm hbt: 48,
16:07:19:836 fef22700 watchdog - thread 0 (gpu0), hb time 69374
16:07:19:836 fef22700 WATCHDOG: GPU 0 hangs in OpenCL call, exit
16:07:19:836 fef22700 watchdog - thread 1 (gpu0), hb time 69196
16:07:19:836 fef22700 WATCHDOG: GPU 0 hangs in OpenCL call, exit
16:07:19:836 fef22700 watchdog - thread 2 (gpu1), hb time 44
16:07:19:836 fef22700 watchdog - thread 3 (gpu1), hb time 201
16:07:19:836 fef22700 Rebooting
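In case it helps anyone comparing several of these reboot logs, here is a minimal Python sketch that pulls the watchdog heartbeat lines out of a log like the one above and lists the threads that look stalled. The line format is inferred only from this excerpt, the function name `stalled_threads` is made up for the example, and the `threshold` value is an arbitrary assumption: I do not know the limit the miner actually uses, nor the unit of "hb time".

```python
import re

# Matches lines such as:
#   "16:07:19:836 fef22700 watchdog - thread 0 (gpu0), hb time 69374"
# The format is inferred from the log excerpt, not from miner documentation.
WATCHDOG_RE = re.compile(
    r"watchdog - thread (?P<thread>\d+) \(gpu(?P<gpu>\d+)\), hb time (?P<hb>\d+)"
)


def stalled_threads(log_text, threshold=10000):
    """Return (thread, gpu, hb_time) tuples whose heartbeat time exceeds threshold.

    threshold is a hypothetical cut-off chosen for illustration only.
    """
    hits = []
    for match in WATCHDOG_RE.finditer(log_text):
        hb = int(match.group("hb"))
        if hb > threshold:
            hits.append((int(match.group("thread")), int(match.group("gpu")), hb))
    return hits


# Watchdog lines copied from the log above.
sample = """\
16:07:19:836 fef22700 watchdog - thread 0 (gpu0), hb time 69374
16:07:19:836 fef22700 WATCHDOG: GPU 0 hangs in OpenCL call, exit
16:07:19:836 fef22700 watchdog - thread 1 (gpu0), hb time 69196
16:07:19:836 fef22700 WATCHDOG: GPU 0 hangs in OpenCL call, exit
16:07:19:836 fef22700 watchdog - thread 2 (gpu1), hb time 44
16:07:19:836 fef22700 watchdog - thread 3 (gpu1), hb time 201
"""

for thread, gpu, hb in stalled_threads(sample):
    print(f"thread {thread} on GPU {gpu}: heartbeat {hb}")
```

On these lines it flags only the two GPU 0 threads, which matches the "GPU 0 hangs in OpenCL call" messages; the two GPU 1 threads stay well under the assumed threshold.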
The crash happens about twice every 24 hours, repeating 4 to 5 times in a row.
The miner then runs again for many hours without any trouble. The GPU in question has its original BIOS (I believe) and runs at stock clocks.
The amdcovc command returns:
Memory Clocks: 300 1750
Adapter 1: Hawaii PRO [Radeon R9 290/390]
Core: 1040 MHz, Mem: 1500 MHz, CoreOD: 0, MemOD: 0, Load: 100%, Temp: 57 C, Fan: 33.7255%
Core clocks: 300 500 698 858 899 935 969 1040
Memory Clocks: 150 1500
The GPU never goes above 71°C and I have not set any mining limit.
Since the rig only has 2 GB of RAM, I checked memory usage while mining.
The "free -m" command returns:
              total        used        free      shared  buff/cache   available
Mem:           1933         534        1111           7         288        1237
Swap:             0           0           0
534 MB used, so RAM does not seem to be the cause.
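To make the memory check repeatable, here is a small Python sketch that reads `free -m` output and works out the used share. The helper name `parse_free_m` is hypothetical, and it assumes the usual layout with a header row followed by a `Mem:` row, as in the figures above.

```python
def parse_free_m(output):
    """Parse the 'Mem:' row of `free -m` output into a dict keyed by column name."""
    rows = [line.split() for line in output.strip().splitlines()]
    headers = rows[0]  # total, used, free, shared, buff/cache, available
    for row in rows[1:]:
        if row and row[0] == "Mem:":
            return dict(zip(headers, map(int, row[1:])))
    raise ValueError("no 'Mem:' row found")


# Figures taken from the `free -m` output in this post.
sample = """\
              total        used        free      shared  buff/cache   available
Mem:           1933         534        1111           7         288        1237
Swap:             0           0           0
"""

mem = parse_free_m(sample)
used_pct = 100 * mem["used"] / mem["total"]
print(f"used: {used_pct:.1f}% of {mem['total']} MB, available: {mem['available']} MB")
```

With these numbers, a bit over a quarter of the RAM is in use and more than 1.2 GB is still available, although there is no swap configured.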
I plugged in the 4-pin Molex connector located on the motherboard next to the RAM, and the system crashes at every startup.
I don't think disabling the watchdog would solve the problem; it would just keep the miner running with the GPU in question off.
If anyone has an idea, thanks in advance.
Help :upside-down_face: :upside-down_face: