.net运行中慢函数分析

程序在开发机上借助 IDE 的工具进行性能分析是非常简单的,但在程序在正式环境运行时进行慢函数分析就比较困难。这里介绍2个工具,CVCollectionCmd PerfView,借助它们的力量(性能计数器和其他系统探针)可以对 windows 中运行的 .net 程序进行运行中分析。

CVCollectionCmd 实际是 Visual Studio 的非自带官方扩展——并发可视化工具的独立命令行工具,可以独立于 Visual Studio 运行。除了可以直接 Open 启动一个进程外,还可以 Attach 现有进程。

# 启动
"C:\Program Files (x86)\Microsoft Concurrency Visualizer Collection Tools\CVCollectionCmd.exe" /Attach /Process 79272 /outdir D:\testproj\cvclog /config "D:\testproj\CVCollectionConfig.xml"

# 停止
"C:\Program Files (x86)\Microsoft Concurrency Visualizer Collection Tools\CVCollectionCmd.exe" /detach

<?xml version="1.0"?>
<LocalConfig xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" MajorVersion="1" MinorVersion="0">

  <IncludeEnvSymbolPath>true</IncludeEnvSymbolPath>

  <DeleteEtlsAfterAnalysis>false</DeleteEtlsAfterAnalysis>

  <TraceLocation>D:\testproj\cvclog</TraceLocation>

  <SymbolPath>srv*d:\symbols*https://msdl.microsoft.com/download/symbols;D:\testproj\</SymbolPath>

  <Markers>
    <MarkerProvider Name="Default" Guid="8d4925ab-505a-483b-a7e0-6f824a07a6f0" Level="Low" />
    <MarkerProvider Name="TPL" Guid="2e5dba47-a3d2-4d16-8ee0-6671ffdcd7b5" Level="Normal" />
    <MarkerProvider Name="TPL Dataflow" Guid="16f53577-e41d-43d4-b47e-c17025bf4025" Level="Normal" />
    <MarkerProvider Name="TPL Synchronization" Guid="ec631d38-466b-4290-9306-834971ba0217" Level="Normal" />
    <MarkerProvider Name="PLINQ" Guid="159eeeec-4a14-4418-a8fe-faabcd987887" Level="Normal" />
    <MarkerProvider Name="Concurrency Runtime" Guid="f7b697a3-4db5-4d3b-be71-c4d284e6592f" Level="Normal" />
    <MarkerProvider Name="Scenario Markers" Guid="fb9244c9-f23a-4966-8a9c-97a51f8c355b" Level="Low" />

  </Markers>

  <FilterConfig>
    <CollectClrEvents>true</CollectClrEvents>
    <ClrCollectionOptions>CollectForNative DisableNGenRundown</ClrCollectionOptions>
    <CollectSampleEvents>true</CollectSampleEvents>
    <CollectGpuEvents>false</CollectGpuEvents>
    <CollectFileIO>true</CollectFileIO>
  </FilterConfig>

  <UserBufferSettings>
    <BufferFlushTimer>0</BufferFlushTimer>
    <BufferSize>256</BufferSize>
    <MinimumBuffers>512</MinimumBuffers>
    <MaximumBuffers>1024</MaximumBuffers>
  </UserBufferSettings>

  <KernelBufferSettings>
    <BufferFlushTimer>0</BufferFlushTimer>
    <BufferSize>256</BufferSize>
    <MinimumBuffers>512</MinimumBuffers>
    <MaximumBuffers>1024</MaximumBuffers>
  </KernelBufferSettings>

  <!-- List of MyCodeDirectory directories -->
  <JustMyCode>
    <MyCodeDirectory>D:\testproj\</MyCodeDirectory>
  </JustMyCode>
</LocalConfig>

工具会生成3个文件(XXX.CvTrace,XXX.kernel.etl,XXX.user.etl),使用 Visual Studio 打开 XXX.CvTrace,经过分析后会展示图形化数据。

如图,线程界面的表格中展示了函数的 CPU 用时排序。这里作者为了展示更多和更详细的数据把降噪调整为0%,默认值为2%。可以看到 unknown!0x7fe97f49757 和 unknown!7fe97f526ca 分别用时 14.55% 和 10.18%。但因为某些原因(可能因为程序使用 Assembly.LoadFrom加载dll)未成功加载 pdb 文件,导致 module 和函数名未被展示,这里只能使用 windbg 分析 dump 文件。

0:000> !ip2md 7fe97f49757
MethodDesc:   000007fe98107198
Method Name:  ******************************.Query(System.String, System.String, System.Data.DataSet ByRef)
Class:        000007fe981230a0
MethodTable:  000007fe98107258
mdToken:      000000000600003f
Module:       000007fe98106a20
IsJitted:     yes
CodeAddr:     000007fe97f49640
Transparency: Critical

0:000> !ip2md 7fe97f526ca
MethodDesc:   000007fe98165e78
Method Name:  ******************************.GetString(System.Data.DataRow, System.String, System.String)
Class:        000007fe9815d5b8
MethodTable:  000007fe98165f40
mdToken:      0000000006000025
Module:       000007fe98106a20
IsJitted:     yes
CodeAddr:     000007fe97f52640
Transparency: Critical

0:000> !ip2md 7fe97f526af
MethodDesc:   000007fe98165e78
Method Name:  ******************************.GetString(System.Data.DataRow, System.String, System.String)
Class:        000007fe9815d5b8
MethodTable:  000007fe98165f40
mdToken:      0000000006000025
Module:       000007fe98106a20
IsJitted:     yes
CodeAddr:     000007fe97f52640
Transparency: Critical

0:000> !ip2md 7fe97f5271b
MethodDesc:   000007fe98165e78
Method Name:  ******************************.GetString(System.Data.DataRow, System.String, System.String)
Class:        000007fe9815d5b8
MethodTable:  000007fe98165f40
mdToken:      0000000006000025
Module:       000007fe98106a20
IsJitted:     yes
CodeAddr:     000007fe97f52640
Transparency: Critical

0:000> !u 000007fe98165e78
Normal JIT generated code
******************************.GetString(System.Data.DataRow, System.String, System.String)
Begin 000007fe97f52640, size 140
000007fe`97f52640 55              push    rbp
000007fe`97f52641 57              push    rdi
000007fe`97f52642 56              push    rsi
000007fe`97f52643 4883ec70        sub     rsp,70h
000007fe`97f52647 488dac2480000000 lea     rbp,[rsp+80h]
000007fe`97f5264f 488bf1          mov     rsi,rcx
000007fe`97f52652 488d7da8        lea     rdi,[rbp-58h]
000007fe`97f52656 b912000000      mov     ecx,12h
000007fe`97f5265b 33c0            xor     eax,eax
000007fe`97f5265d f3ab            rep stos dword ptr [rdi]
000007fe`97f5265f 488bce          mov     rcx,rsi
000007fe`97f52662 488965a0        mov     qword ptr [rbp-60h],rsp
000007fe`97f52666 48894d10        mov     qword ptr [rbp+10h],rcx
000007fe`97f5266a 48895518        mov     qword ptr [rbp+18h],rdx
000007fe`97f5266e 4c894520        mov     qword ptr [rbp+20h],r8
000007fe`97f52672 833d27481b0000  cmp     dword ptr [000007fe`98106ea0],0
000007fe`97f52679 7405            je      000007fe`97f52680
000007fe`97f5267b e840d5705f      call    clr!JIT_DbgIsJustMyCode (000007fe`f765fbc0)
000007fe`97f52680 90              nop
000007fe`97f52681 90              nop
000007fe`97f52682 488b4d10        mov     rcx,qword ptr [rbp+10h]
000007fe`97f52686 3909            cmp     dword ptr [rcx],ecx
*** WARNING: Unable to verify checksum for System.Data.ni.dll
000007fe`97f52688 e8e351e756      call    System_Data_ni!_bidW103 <PERF> (System_Data_ni+0x557870) (000007fe`eedc7870)
000007fe`97f5268d 488945d0        mov     qword ptr [rbp-30h],rax
000007fe`97f52691 488b4dd0        mov     rcx,qword ptr [rbp-30h]
000007fe`97f52695 3909            cmp     dword ptr [rcx],ecx
000007fe`97f52697 e8fc55a856      call    System_Data_ni!`string'+0xc8 (000007fe`ee9d7c98) (System.Data.DataTable.get_Columns(), mdToken: 0000000006000c6a)
000007fe`97f5269c 488945c8        mov     qword ptr [rbp-38h],rax
000007fe`97f526a0 488b4dc8        mov     rcx,qword ptr [rbp-38h]
000007fe`97f526a4 488b5518        mov     rdx,qword ptr [rbp+18h]
000007fe`97f526a8 3909            cmp     dword ptr [rcx],ecx
000007fe`97f526aa e8b1aae856      call    System_Data_ni!_bidW103 <PERF> (System_Data_ni+0x56d160) (000007fe`eeddd160)
000007fe`97f526af 0fb6c8          movzx   ecx,al
000007fe`97f526b2 894de8          mov     dword ptr [rbp-18h],ecx
000007fe`97f526b5 837de800        cmp     dword ptr [rbp-18h],0
000007fe`97f526b9 742f            je      000007fe`97f526ea
000007fe`97f526bb 488b4d10        mov     rcx,qword ptr [rbp+10h]
000007fe`97f526bf 488b5518        mov     rdx,qword ptr [rbp+18h]
000007fe`97f526c3 3909            cmp     dword ptr [rcx],ecx
000007fe`97f526c5 e84653e756      call    System_Data_ni!_bidW103 <PERF> (System_Data_ni+0x557a10) (000007fe`eedc7a10)
000007fe`97f526ca 488945b0        mov     qword ptr [rbp-50h],rax
000007fe`97f526ce 488b4db0        mov     rcx,qword ptr [rbp-50h]
000007fe`97f526d2 48ba3812ff7f06000000 mov rdx,67FFF1238h
000007fe`97f526dc 483b0a          cmp     rcx,qword ptr [rdx]
000007fe`97f526df 0f95c1          setne   cl
000007fe`97f526e2 0fb6c9          movzx   ecx,cl
000007fe`97f526e5 894de4          mov     dword ptr [rbp-1Ch],ecx
000007fe`97f526e8 eb05            jmp     000007fe`97f526ef
000007fe`97f526ea 33c9            xor     ecx,ecx
000007fe`97f526ec 894de4          mov     dword ptr [rbp-1Ch],ecx
000007fe`97f526ef 90              nop
000007fe`97f526f0 8b4de4          mov     ecx,dword ptr [rbp-1Ch]
000007fe`97f526f3 0fb6c9          movzx   ecx,cl
000007fe`97f526f6 894dec          mov     dword ptr [rbp-14h],ecx
000007fe`97f526f9 837dec00        cmp     dword ptr [rbp-14h],0
000007fe`97f526fd 750c            jne     000007fe`97f5270b
000007fe`97f526ff 90              nop
000007fe`97f52700 488b4d20        mov     rcx,qword ptr [rbp+20h]
000007fe`97f52704 48894dd8        mov     qword ptr [rbp-28h],rcx
000007fe`97f52708 90              nop
000007fe`97f52709 eb34            jmp     000007fe`97f5273f
000007fe`97f5270b 90              nop
000007fe`97f5270c 488b4d10        mov     rcx,qword ptr [rbp+10h]
000007fe`97f52710 488b5518        mov     rdx,qword ptr [rbp+18h]
000007fe`97f52714 3909            cmp     dword ptr [rcx],ecx
000007fe`97f52716 e8f552e756      call    System_Data_ni!_bidW103 <PERF> (System_Data_ni+0x557a10) (000007fe`eedc7a10)
000007fe`97f5271b 488945c0        mov     qword ptr [rbp-40h],rax
000007fe`97f5271f 488b4dc0        mov     rcx,qword ptr [rbp-40h]
000007fe`97f52723 488b45c0        mov     rax,qword ptr [rbp-40h]
000007fe`97f52727 488b00          mov     rax,qword ptr [rax]
000007fe`97f5272a 488b4040        mov     rax,qword ptr [rax+40h]
000007fe`97f5272e ff10            call    qword ptr [rax]
000007fe`97f52730 488945b8        mov     qword ptr [rbp-48h],rax
000007fe`97f52734 488b45b8        mov     rax,qword ptr [rbp-48h]
000007fe`97f52738 488945d8        mov     qword ptr [rbp-28h],rax
000007fe`97f5273c 90              nop
000007fe`97f5273d eb00            jmp     000007fe`97f5273f
000007fe`97f5273f 90              nop
000007fe`97f52740 488b45d8        mov     rax,qword ptr [rbp-28h]
000007fe`97f52744 488d65f0        lea     rsp,[rbp-10h]
000007fe`97f52748 5e              pop     rsi
000007fe`97f52749 5f              pop     rdi
000007fe`97f5274a 5d              pop     rbp
000007fe`97f5274b c3              ret
000007fe`97f5274c 55              push    rbp
000007fe`97f5274d 57              push    rdi
000007fe`97f5274e 56              push    rsi
000007fe`97f5274f 4883ec30        sub     rsp,30h
000007fe`97f52753 488b6920        mov     rbp,qword ptr [rcx+20h]
000007fe`97f52757 48896c2420      mov     qword ptr [rsp+20h],rbp
000007fe`97f5275c 488dad80000000  lea     rbp,[rbp+80h]
000007fe`97f52763 488955a8        mov     qword ptr [rbp-58h],rdx
000007fe`97f52767 90              nop
000007fe`97f52768 488b4520        mov     rax,qword ptr [rbp+20h]
000007fe`97f5276c 488945d8        mov     qword ptr [rbp-28h],rax
000007fe`97f52770 90              nop
000007fe`97f52771 488d05c7ffffff  lea     rax,[000007fe`97f5273f]
000007fe`97f52778 4883c430        add     rsp,30h
000007fe`97f5277c 5e              pop     rsi
000007fe`97f5277d 5f              pop     rdi
000007fe`97f5277e 5d              pop     rbp
000007fe`97f5277f c3              ret

0:000> !ip2md System_Data_ni+0x56d160
MethodDesc:   000007feee8ac518
Method Name:  System.Data.DataColumnCollection.Contains(System.String)
Class:        000007feee8a3400
MethodTable:  000007feeea690b8
mdToken:      00000000060008f7
Module:       000007feee871000
IsJitted:     yes
CodeAddr:     000007feeeddd160
Transparency: Safe critical

0:000> !ip2md System_Data_ni+0x557a10
MethodDesc:   000007feee8abb10
Method Name:  System.Data.DataRow.get_Item(System.String)
Class:        000007feee8a31f8
MethodTable:  000007feeea68ae8
mdToken:      0000000006000af6
Module:       000007feee871000
IsJitted:     yes
CodeAddr:     000007feeedc7a10
Transparency: Safe critical

0:000> !ip2md System_Data_ni+0x557a21
MethodDesc:   000007feee8abb10
Method Name:  System.Data.DataRow.get_Item(System.String)
Class:        000007feee8a31f8
MethodTable:  000007feeea68ae8
mdToken:      0000000006000af6
Module:       000007feee871000
IsJitted:     yes
CodeAddr:     000007feeedc7a10
Transparency: Safe critical

通过 IP2MD 找到实际对应的函数为 Query 和 GetString。对于数据库查询函数 Query,几乎没有优化空间(其实还是有一点)。那么另一个函数 GetString 则是对 DataRow 进行处理,它的的下级调用 System.Data.ni.dll!0x557a21 用时是 9.92%,具体对应 System.Data.DataRow.get_Item(System.String)。然后逐级展开,最终定位到 System.Data.InternalDataCollectionBase.NamesEqual。

通过 ILSpy 逆向 System.Data.dll,可以了解到 DataRow.get_Item 是需要根据 name 在 DataColumnCollection 中搜索对应的 DataColumn,搜索算法是先根据字典进行查找,未查到时再顺序进行忽略大小写的字符串比较,未命中字典时性能损耗较大。同时 GetString 函数内部进行了 3 次这样的查找,大大影响了性能,代码优化后改为仅查找一次。


PerfView 工具则更为强大,拥有各种各样的强大功能(之前做过 GC 用时分析),这里简单参考了一篇官方教程。在 Linux 上也可以借助其他工具进行数据收集。

点击 CPU Stacks 在新弹出的窗口中可以看到一个表格,在这里可以看到与CVCollectionCmd 工具类似的结果。

发表回复

您的电子邮箱地址不会被公开。 必填项已用*标注