In-Depth Analysis of Browser Crashes After Long-Running Mars3D Sessions, with Solutions
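When a browser opens a 3D GIS page built with Mars3D or Cesium, especially an application that uses a lot of video memory, the page may report a "Fragment shader failed to compile" error after running for a while. For long-running applications, once the page crashes with this error after 5 to 6 days of continuous operation, even restarting the browser no longer helps; only rebooting the computer resolves it.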
Symptoms
1. After running for a long time, a Mars3D or Cesium application with heavy video memory usage reports: RuntimeError: Fragment shader failed to compile. Compile log: null.

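2. When the error first appears, refreshing the browser restores the page. After 5 to 6 days of continuous operation, however, even restarting the browser does not clear it: the error appears as soon as the 3D scene is opened, and a different error may be shown instead: Visit http://get.webgl.org to verify that your web browser and hardware support WebGL. Even when the 3D scene occasionally renders, CPU usage is very high, GPU usage is low, almost no video memory is used, and the page is very sluggish.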


3. The browser console may print errors similar to the following.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glBindFramebuffer: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glBindFramebuffer: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glDisable: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glDisable: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glBindFramebuffer: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glDrawBuffers: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glDisable: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glStencilMask: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glViewport: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glUseProgram: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glActiveTexture: Context has been lost.
bigscreen/#/:1 [.WebGL-0x1c8c06a6dc00] GL_CONTEXT_LOST_KHR: glBindTexture: Context has been lost.
bigscreen/#/:1 WebGL: CONTEXT_LOST_WEBGL: loseContext: context lost
Cesium.js:899 [Cesium WebGL] Fragment shader compile log: null
KBe @ Cesium.js:899
Cesium.js:899 [Cesium WebGL] Fragment shader source:
#version 300 es
#ifdef GL_FRAGMENT_PRECISION_HIGH
precision highp float;
precision highp int;
...
...
#ifdef FRAGMENT_DEPTH_CHECK
in vec4 v_textureCoordinateBounds;
in vec4 v_originTextureCoordinateAndTranslate;
in vec4 v_compressed;
in mat2 v_rotationMatrix;
KBe @ Cesium.js:899
Cesium.js:15827 WebGL渲染运行出错 (页面已停止,请刷新页面)
RuntimeError: Fragment shader failed to compile. Compile log: null
Error
at new l1 (Cesium.js:74:6279)
at KBe (Cesium.js:900:122)
at ooe (Cesium.js:901:1411)
at GD (Cesium.js:901:1361)
at tA._bind (Cesium.js:901:2213)
at Sut (Cesium.js:9842:44218)
at Cu.draw (Cesium.js:9842:45041)
at uL.execute (Cesium.js:898:24035)
at bf (Cesium.js:15449:18769)
at zEt (Cesium.js:15449:19670)
Yo.showErrorPanel @ Cesium.js:15827
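The console output above shows the WebGL context being lost before the shader error is raised. A minimal sketch of detecting that condition directly, using the standard `webglcontextlost` and `webglcontextrestored` canvas events, might look like the following. The helper name `watchContextLoss` and its option names are hypothetical and are not part of Mars3D or Cesium.

```javascript
// Minimal sketch: watch a canvas for WebGL context loss.
// `watchContextLoss`, `onLost` and `onRestored` are hypothetical names.
function watchContextLoss(canvas, { onLost, onRestored } = {}) {
  const state = { lost: false, lostCount: 0 };

  const handleLost = (event) => {
    // preventDefault() signals that the page is willing to have the context restored.
    event.preventDefault();
    state.lost = true;
    state.lostCount += 1;
    if (onLost) onLost(state);
  };

  const handleRestored = () => {
    state.lost = false;
    if (onRestored) onRestored(state);
  };

  canvas.addEventListener("webglcontextlost", handleLost);
  canvas.addEventListener("webglcontextrestored", handleRestored);

  // Return the state plus a cleanup function so the listeners themselves do not leak.
  return {
    state,
    stop() {
      canvas.removeEventListener("webglcontextlost", handleLost);
      canvas.removeEventListener("webglcontextrestored", handleRestored);
    },
  };
}
```

In a Cesium page this could be attached as `watchContextLoss(viewer.scene.canvas, { onLost: () => location.reload() })`, assuming `viewer` is the application's `Cesium.Viewer`. Reloading is only one possible reaction; logging `lostCount` to a monitoring backend may be more useful for diagnosing how often the context is lost.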
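Cause Analysis
1. On 3D GIS pages built with Mars3D or Cesium, when many 3D Tiles models are loaded and the page is used together with Vue, resources can easily go unreleased, which leaks system memory or video memory. Once video memory usage reaches a certain level, the WebGL engine crashes and reports RuntimeError: Fragment shader failed to compile. This error is a result of the WebGL crash rather than its cause: no coding error in the reported shader is crashing WebGL.
2. After 5 to 6 days of continuous high-load operation, once the WebGL engine has crashed it cannot recover even when the browser is restarted. The observations are as follows:
(1) Swapping in a different graphics card model does not fix the problem, which rules out the graphics card.
(2) Upgrading to different graphics driver versions does not fix the problem, which rules out the graphics driver.
(3) Switching browsers (Chrome, Edge, or Firefox) does not fix the problem, which rules out the browser.
(4) With different 3D frameworks, such as Cesium.js, Mars3D, or Three.js, the error appears immediately once video memory usage reaches a certain level, and restarting the browser does not help, which rules out Cesium and Mars3D.
(5) After a browser restart the 3D scene occasionally renders, but it is very sluggish, with high CPU usage, low GPU usage, and almost no video memory in use. This is because the WebGL engine has not recovered from the crash and the browser is rendering the 3D scene on the CPU.
3. Once restarting the browser no longer restores the 3D scene, running dxdiag from the command line to open the DirectX Diagnostic Tool shows that the graphics driver is working normally.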

4. Enabling the browser's WebGL logging reveals the following errors:
System Commit Limit (Gb) 33
D3D11 Feature Level 12_1
Has Discrete GPU yes
Software Rendering No
Log Messages
[44812:46452:0918/085512.367:VERBOSE1:components\viz\service\main\viz_main_impl.cc:85] : VizNullHypothesis is disabled (not a warning)
GpuProcessHost: The info collection GPU process exited normally. Everything is okay.
[44812:46452:0918/085555.205:WARNING:ui\gl\angle_platform_impl.cc:54] : HLSLCompiler::compileToBinary: C:\fakepath(421,1): warning X4000: use of potentially uninitialized variable (f_czm_getMaterialczm_materialInput)
[44812:46452:0918/085606.375:ERROR:ui\gl\angle_platform_impl.cc:49] : Renderer11.cpp:2209 (virtual rx::Renderer11::testDeviceLost): The D3D11 device was removed, HRESULT: 0x887A0007
[44812:46452:0918/085606.375:ERROR:gpu\command_buffer\service\shared_context_state.cc:1363] : SharedContextState context lost via ARB/EXT_robustness. Reset status = GL_UNKNOWN_CONTEXT_RESET_KHR
GpuProcessHost: The GPU process crashed!
[48716:41668:0918/085606.808:VERBOSE1:components\viz\service\main\viz_main_impl.cc:85] : VizNullHypothesis is disabled (not a warning)
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
[41416:44368:0918/085530.877:ERROR:chrome\browser\ui\webui\ntp\new_tab_ui.cc:55] Requested load of chrome://newtab/ for incorrect profile type.
[44812:46452:0918/085606.375:ERROR:ui\gl\angle_platform_impl.cc:49] Renderer11.cpp:2209 (virtual rx::Renderer11::testDeviceLost): The D3D11 device was removed. HRESULT: 0x887A0007 'The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.'
ERR: Renderer11.cpp:2209 (virtual rx::Renderer11::testDeviceLost): The D3D11 device was removed. HRESULT: 0x887A0007
[44812:46452:0918/085606.375:ERROR:gpu\command_buffer\service\shared_context_state.cc:1363] SharedContextState Context lost via ARB_robustness. Reset status = GL_UNKNOWN_CONTEXT_RESET_ARB
[44812:46452:0918/085606.375:ERROR:components\viz\service\gl\exit_code.cc:13] Restarting GPU process due to unrecoverable error. Context was lost.
[41416:44368:0918/085606.609:ERROR:content\browser\gpu\gpu_process_host.cc:955] GPU process exited unexpectedly: exit_code=34
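Observation (5) above, where the browser falls back to rendering on the CPU, can in principle be detected from inside the page by reading the WebGL renderer name. The sketch below assumes the `WEBGL_debug_renderer_info` extension is available (some browsers restrict it) and falls back to `gl.RENDERER` otherwise. The function names and the list of substrings are illustrative assumptions, not an exhaustive or official list.

```javascript
// Substrings that commonly appear in the renderer names of CPU-based (software)
// WebGL implementations. This list is an assumption and is not exhaustive.
const SOFTWARE_RENDERER_HINTS = ["swiftshader", "llvmpipe", "software", "microsoft basic render"];

// Return true when the renderer name suggests CPU rendering.
function looksLikeSoftwareRenderer(rendererName) {
  const name = String(rendererName || "").toLowerCase();
  return SOFTWARE_RENDERER_HINTS.some((hint) => name.includes(hint));
}

// Read the renderer name from a WebGL context. When WEBGL_debug_renderer_info
// is unavailable, gl.RENDERER is used as a fallback.
function getRendererName(gl) {
  const ext = gl.getExtension("WEBGL_debug_renderer_info");
  return ext ? gl.getParameter(ext.UNMASKED_RENDERER_WEBGL) : gl.getParameter(gl.RENDERER);
}
```

In a browser, `gl` could come from `document.createElement("canvas").getContext("webgl2")`. A page that detects a software renderer could show a notice asking the operator to reboot the machine instead of continuing to run at very low frame rates.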
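Solutions
1. Crashes after the 3D scene has run for a long time are a code problem and must be fixed in code.
2. When the 3D scene has run under high load for 5 to 6 days and restarting the browser no longer helps after a crash, the cause is the graphics card being overloaded.
3. For the case of high CPU usage, low GPU usage, and almost no video memory in use, see the article "Solutions for High CPU Usage and Low GPU Usage in Mars3D".
4. A complete solution is still to be added.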
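As a starting point for item 1, one common pattern for avoiding unreleased resources in a component-based page is to register a cleanup function for every resource when it is created and run all of them when the component is torn down. The following is a minimal sketch of that idea only. `CleanupRegistry` is a hypothetical helper, not a Mars3D, Cesium, or Vue API, and the usage shown in the trailing comment assumes a Vue 3 component that holds a `Cesium.Viewer` named `viewer`.

```javascript
// Minimal sketch of a cleanup registry (hypothetical helper).
class CleanupRegistry {
  constructor() {
    this.disposers = [];
  }

  // Register a function that releases one resource; returns the same function.
  add(disposer) {
    this.disposers.push(disposer);
    return disposer;
  }

  // Run all disposers in reverse order of registration. A failing disposer does
  // not stop the others; errors are collected and returned to the caller.
  disposeAll() {
    const errors = [];
    while (this.disposers.length > 0) {
      const disposer = this.disposers.pop();
      try {
        disposer();
      } catch (error) {
        errors.push(error);
      }
    }
    return errors;
  }
}

// Possible usage inside a Vue 3 component (assumed setup, shown as comments only):
//   const registry = new CleanupRegistry();
//   viewer.scene.primitives.add(tileset);
//   registry.add(() => viewer.scene.primitives.remove(tileset));
//   onBeforeUnmount(() => {
//     registry.disposeAll();
//     viewer.destroy();
//   });
```

Running the disposers in reverse order means resources that depend on earlier ones are released first. Whether this is enough to stop the leak depends on the application, so video memory usage should still be monitored after such a change.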