程序在开发机上借助 IDE 的工具进行性能分析是非常简单的,但在程序在正式环境运行时进行慢函数分析就比较困难。这里介绍2个工具,CVCollectionCmd 和 PerfView,借助它们的力量(性能计数器和其他系统探针)可以对 windows 中运行的 .net 程序进行运行中分析。
CVCollectionCmd 实际是 Visual Studio 的非自带官方扩展——并发可视化工具的独立命令行工具,可以独立于 Visual Studio 运行。除了可以直接 Open 启动一个进程外,还可以 Attach 现有进程。
# 启动 "C:\Program Files (x86)\Microsoft Concurrency Visualizer Collection Tools\CVCollectionCmd.exe" /Attach /Process 79272 /outdir D:\testproj\cvclog /config "D:\testproj\CVCollectionConfig.xml" # 停止 "C:\Program Files (x86)\Microsoft Concurrency Visualizer Collection Tools\CVCollectionCmd.exe" /detach
<?xml version="1.0"?> <LocalConfig xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" MajorVersion="1" MinorVersion="0"> <IncludeEnvSymbolPath>true</IncludeEnvSymbolPath> <DeleteEtlsAfterAnalysis>false</DeleteEtlsAfterAnalysis> <TraceLocation>D:\testproj\cvclog</TraceLocation> <SymbolPath>srv*d:\symbols*https://msdl.microsoft.com/download/symbols;D:\testproj\</SymbolPath> <Markers> <MarkerProvider Name="Default" Guid="8d4925ab-505a-483b-a7e0-6f824a07a6f0" Level="Low" /> <MarkerProvider Name="TPL" Guid="2e5dba47-a3d2-4d16-8ee0-6671ffdcd7b5" Level="Normal" /> <MarkerProvider Name="TPL Dataflow" Guid="16f53577-e41d-43d4-b47e-c17025bf4025" Level="Normal" /> <MarkerProvider Name="TPL Synchronization" Guid="ec631d38-466b-4290-9306-834971ba0217" Level="Normal" /> <MarkerProvider Name="PLINQ" Guid="159eeeec-4a14-4418-a8fe-faabcd987887" Level="Normal" /> <MarkerProvider Name="Concurrency Runtime" Guid="f7b697a3-4db5-4d3b-be71-c4d284e6592f" Level="Normal" /> <MarkerProvider Name="Scenario Markers" Guid="fb9244c9-f23a-4966-8a9c-97a51f8c355b" Level="Low" /> </Markers> <FilterConfig> <CollectClrEvents>true</CollectClrEvents> <ClrCollectionOptions>CollectForNative DisableNGenRundown</ClrCollectionOptions> <CollectSampleEvents>true</CollectSampleEvents> <CollectGpuEvents>false</CollectGpuEvents> <CollectFileIO>true</CollectFileIO> </FilterConfig> <UserBufferSettings> <BufferFlushTimer>0</BufferFlushTimer> <BufferSize>256</BufferSize> <MinimumBuffers>512</MinimumBuffers> <MaximumBuffers>1024</MaximumBuffers> </UserBufferSettings> <KernelBufferSettings> <BufferFlushTimer>0</BufferFlushTimer> <BufferSize>256</BufferSize> <MinimumBuffers>512</MinimumBuffers> <MaximumBuffers>1024</MaximumBuffers> </KernelBufferSettings> <!-- List of MyCodeDirectory directories --> <JustMyCode> <MyCodeDirectory>D:\testproj\</MyCodeDirectory> </JustMyCode> </LocalConfig>
工具会生成3个文件(XXX.CvTrace,XXX.kernel.etl,XXX.user.etl),使用 Visual Studio 打开 XXX.CvTrace,经过分析后会展示图形化数据。
如图,线程界面的表格中展示了函数的 CPU 用时排序。这里作者为了展示更多和更详细的数据把降噪调整为0%,默认值为2%。可以看到 unknown!0x7fe97f49757 和 unknown!7fe97f526ca 分别用时 14.55% 和 10.18%。但因为某些原因(可能因为程序使用 Assembly.LoadFrom加载dll)未成功加载 pdb 文件,导致 module 和函数名未被展示,这里只能使用 windbg 分析 dump 文件。
0:000> !ip2md 7fe97f49757 MethodDesc: 000007fe98107198 Method Name: ******************************.Query(System.String, System.String, System.Data.DataSet ByRef) Class: 000007fe981230a0 MethodTable: 000007fe98107258 mdToken: 000000000600003f Module: 000007fe98106a20 IsJitted: yes CodeAddr: 000007fe97f49640 Transparency: Critical 0:000> !ip2md 7fe97f526ca MethodDesc: 000007fe98165e78 Method Name: ******************************.GetString(System.Data.DataRow, System.String, System.String) Class: 000007fe9815d5b8 MethodTable: 000007fe98165f40 mdToken: 0000000006000025 Module: 000007fe98106a20 IsJitted: yes CodeAddr: 000007fe97f52640 Transparency: Critical 0:000> !ip2md 7fe97f526af MethodDesc: 000007fe98165e78 Method Name: ******************************.GetString(System.Data.DataRow, System.String, System.String) Class: 000007fe9815d5b8 MethodTable: 000007fe98165f40 mdToken: 0000000006000025 Module: 000007fe98106a20 IsJitted: yes CodeAddr: 000007fe97f52640 Transparency: Critical 0:000> !ip2md 7fe97f5271b MethodDesc: 000007fe98165e78 Method Name: ******************************.GetString(System.Data.DataRow, System.String, System.String) Class: 000007fe9815d5b8 MethodTable: 000007fe98165f40 mdToken: 0000000006000025 Module: 000007fe98106a20 IsJitted: yes CodeAddr: 000007fe97f52640 Transparency: Critical 0:000> !u 000007fe98165e78 Normal JIT generated code ******************************.GetString(System.Data.DataRow, System.String, System.String) Begin 000007fe97f52640, size 140 000007fe`97f52640 55 push rbp 000007fe`97f52641 57 push rdi 000007fe`97f52642 56 push rsi 000007fe`97f52643 4883ec70 sub rsp,70h 000007fe`97f52647 488dac2480000000 lea rbp,[rsp+80h] 000007fe`97f5264f 488bf1 mov rsi,rcx 000007fe`97f52652 488d7da8 lea rdi,[rbp-58h] 000007fe`97f52656 b912000000 mov ecx,12h 000007fe`97f5265b 33c0 xor eax,eax 000007fe`97f5265d f3ab rep stos dword ptr [rdi] 000007fe`97f5265f 488bce mov rcx,rsi 000007fe`97f52662 488965a0 mov qword ptr [rbp-60h],rsp 000007fe`97f52666 48894d10 mov qword ptr [rbp+10h],rcx 000007fe`97f5266a 48895518 mov qword ptr [rbp+18h],rdx 000007fe`97f5266e 4c894520 mov qword ptr [rbp+20h],r8 000007fe`97f52672 833d27481b0000 cmp dword ptr [000007fe`98106ea0],0 000007fe`97f52679 7405 je 000007fe`97f52680 000007fe`97f5267b e840d5705f call clr!JIT_DbgIsJustMyCode (000007fe`f765fbc0) 000007fe`97f52680 90 nop 000007fe`97f52681 90 nop 000007fe`97f52682 488b4d10 mov rcx,qword ptr [rbp+10h] 000007fe`97f52686 3909 cmp dword ptr [rcx],ecx *** WARNING: Unable to verify checksum for System.Data.ni.dll 000007fe`97f52688 e8e351e756 call System_Data_ni!_bidW103 <PERF> (System_Data_ni+0x557870) (000007fe`eedc7870) 000007fe`97f5268d 488945d0 mov qword ptr [rbp-30h],rax 000007fe`97f52691 488b4dd0 mov rcx,qword ptr [rbp-30h] 000007fe`97f52695 3909 cmp dword ptr [rcx],ecx 000007fe`97f52697 e8fc55a856 call System_Data_ni!`string'+0xc8 (000007fe`ee9d7c98) (System.Data.DataTable.get_Columns(), mdToken: 0000000006000c6a) 000007fe`97f5269c 488945c8 mov qword ptr [rbp-38h],rax 000007fe`97f526a0 488b4dc8 mov rcx,qword ptr [rbp-38h] 000007fe`97f526a4 488b5518 mov rdx,qword ptr [rbp+18h] 000007fe`97f526a8 3909 cmp dword ptr [rcx],ecx 000007fe`97f526aa e8b1aae856 call System_Data_ni!_bidW103 <PERF> (System_Data_ni+0x56d160) (000007fe`eeddd160) 000007fe`97f526af 0fb6c8 movzx ecx,al 000007fe`97f526b2 894de8 mov dword ptr [rbp-18h],ecx 000007fe`97f526b5 837de800 cmp dword ptr [rbp-18h],0 000007fe`97f526b9 742f je 000007fe`97f526ea 000007fe`97f526bb 488b4d10 mov rcx,qword ptr [rbp+10h] 000007fe`97f526bf 488b5518 mov rdx,qword ptr [rbp+18h] 000007fe`97f526c3 3909 cmp dword ptr [rcx],ecx 000007fe`97f526c5 e84653e756 call System_Data_ni!_bidW103 <PERF> (System_Data_ni+0x557a10) (000007fe`eedc7a10) 000007fe`97f526ca 488945b0 mov qword ptr [rbp-50h],rax 000007fe`97f526ce 488b4db0 mov rcx,qword ptr [rbp-50h] 000007fe`97f526d2 48ba3812ff7f06000000 mov rdx,67FFF1238h 000007fe`97f526dc 483b0a cmp rcx,qword ptr [rdx] 000007fe`97f526df 0f95c1 setne cl 000007fe`97f526e2 0fb6c9 movzx ecx,cl 000007fe`97f526e5 894de4 mov dword ptr [rbp-1Ch],ecx 000007fe`97f526e8 eb05 jmp 000007fe`97f526ef 000007fe`97f526ea 33c9 xor ecx,ecx 000007fe`97f526ec 894de4 mov dword ptr [rbp-1Ch],ecx 000007fe`97f526ef 90 nop 000007fe`97f526f0 8b4de4 mov ecx,dword ptr [rbp-1Ch] 000007fe`97f526f3 0fb6c9 movzx ecx,cl 000007fe`97f526f6 894dec mov dword ptr [rbp-14h],ecx 000007fe`97f526f9 837dec00 cmp dword ptr [rbp-14h],0 000007fe`97f526fd 750c jne 000007fe`97f5270b 000007fe`97f526ff 90 nop 000007fe`97f52700 488b4d20 mov rcx,qword ptr [rbp+20h] 000007fe`97f52704 48894dd8 mov qword ptr [rbp-28h],rcx 000007fe`97f52708 90 nop 000007fe`97f52709 eb34 jmp 000007fe`97f5273f 000007fe`97f5270b 90 nop 000007fe`97f5270c 488b4d10 mov rcx,qword ptr [rbp+10h] 000007fe`97f52710 488b5518 mov rdx,qword ptr [rbp+18h] 000007fe`97f52714 3909 cmp dword ptr [rcx],ecx 000007fe`97f52716 e8f552e756 call System_Data_ni!_bidW103 <PERF> (System_Data_ni+0x557a10) (000007fe`eedc7a10) 000007fe`97f5271b 488945c0 mov qword ptr [rbp-40h],rax 000007fe`97f5271f 488b4dc0 mov rcx,qword ptr [rbp-40h] 000007fe`97f52723 488b45c0 mov rax,qword ptr [rbp-40h] 000007fe`97f52727 488b00 mov rax,qword ptr [rax] 000007fe`97f5272a 488b4040 mov rax,qword ptr [rax+40h] 000007fe`97f5272e ff10 call qword ptr [rax] 000007fe`97f52730 488945b8 mov qword ptr [rbp-48h],rax 000007fe`97f52734 488b45b8 mov rax,qword ptr [rbp-48h] 000007fe`97f52738 488945d8 mov qword ptr [rbp-28h],rax 000007fe`97f5273c 90 nop 000007fe`97f5273d eb00 jmp 000007fe`97f5273f 000007fe`97f5273f 90 nop 000007fe`97f52740 488b45d8 mov rax,qword ptr [rbp-28h] 000007fe`97f52744 488d65f0 lea rsp,[rbp-10h] 000007fe`97f52748 5e pop rsi 000007fe`97f52749 5f pop rdi 000007fe`97f5274a 5d pop rbp 000007fe`97f5274b c3 ret 000007fe`97f5274c 55 push rbp 000007fe`97f5274d 57 push rdi 000007fe`97f5274e 56 push rsi 000007fe`97f5274f 4883ec30 sub rsp,30h 000007fe`97f52753 488b6920 mov rbp,qword ptr [rcx+20h] 000007fe`97f52757 48896c2420 mov qword ptr [rsp+20h],rbp 000007fe`97f5275c 488dad80000000 lea rbp,[rbp+80h] 000007fe`97f52763 488955a8 mov qword ptr [rbp-58h],rdx 000007fe`97f52767 90 nop 000007fe`97f52768 488b4520 mov rax,qword ptr [rbp+20h] 000007fe`97f5276c 488945d8 mov qword ptr [rbp-28h],rax 000007fe`97f52770 90 nop 000007fe`97f52771 488d05c7ffffff lea rax,[000007fe`97f5273f] 000007fe`97f52778 4883c430 add rsp,30h 000007fe`97f5277c 5e pop rsi 000007fe`97f5277d 5f pop rdi 000007fe`97f5277e 5d pop rbp 000007fe`97f5277f c3 ret 0:000> !ip2md System_Data_ni+0x56d160 MethodDesc: 000007feee8ac518 Method Name: System.Data.DataColumnCollection.Contains(System.String) Class: 000007feee8a3400 MethodTable: 000007feeea690b8 mdToken: 00000000060008f7 Module: 000007feee871000 IsJitted: yes CodeAddr: 000007feeeddd160 Transparency: Safe critical 0:000> !ip2md System_Data_ni+0x557a10 MethodDesc: 000007feee8abb10 Method Name: System.Data.DataRow.get_Item(System.String) Class: 000007feee8a31f8 MethodTable: 000007feeea68ae8 mdToken: 0000000006000af6 Module: 000007feee871000 IsJitted: yes CodeAddr: 000007feeedc7a10 Transparency: Safe critical 0:000> !ip2md System_Data_ni+0x557a21 MethodDesc: 000007feee8abb10 Method Name: System.Data.DataRow.get_Item(System.String) Class: 000007feee8a31f8 MethodTable: 000007feeea68ae8 mdToken: 0000000006000af6 Module: 000007feee871000 IsJitted: yes CodeAddr: 000007feeedc7a10 Transparency: Safe critical
通过 IP2MD 找到实际对应的函数为 Query 和 GetString。对于数据库查询函数 Query,几乎没有优化空间(其实还是有一点)。那么另一个函数 GetString 则是对 DataRow 进行处理,它的的下级调用 System.Data.ni.dll!0x557a21 用时是 9.92%,具体对应 System.Data.DataRow.get_Item(System.String)。然后逐级展开,最终定位到 System.Data.InternalDataCollectionBase.NamesEqual。
通过 ILSpy 逆向 System.Data.dll,可以了解到 DataRow.get_Item 是需要根据 name 在 DataColumnCollection 中搜索对应的 DataColumn,搜索算法是先根据字典进行查找,未查到时再顺序进行忽略大小写的字符串比较,未命中字典时性能损耗较大。同时 GetString 函数内部进行了 3 次这样的查找,大大影响了性能,代码优化后改为仅查找一次。
PerfView 工具则更为强大,拥有各种各样的强大功能(之前做过 GC 用时分析),这里简单参考了一篇官方教程。在 Linux 上也可以借助其他工具进行数据收集。
点击 CPU Stacks 在新弹出的窗口中可以看到一个表格,在这里可以看到与CVCollectionCmd 工具类似的结果。