HP Solution Center Performance Tuning Methodology.ppt
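Performance Tuning Methodology: Technology & Best Practice
He Jianbo (何建波), Solution Development Department, China Hewlett-Packard Co., Ltd.
Oct 2004
Most of this material reflects personal opinions, experience, and lessons learned; it is for reference only.
2023/11/7

Concept: What is performance tuning?
- It is the experts' responsibility.
- It means working day and night.
- Much effort, little gain.
- Performance tuning is passionate work and an act of love: sincerely helping others, using what we know, what we feel, or what we have endured ourselves to lighten someone else's burden.

Concept: Why performance tuning/testing?
- Improve the "customer experience": deliver a predictable system.
- Reduce IT cost: deliver a cost-effective system.
  - Number of CPUs
  - Storage capacity (ZheJiang Public Security: 23% savings)
- It tests not only the vendor's products but also the people.
- Root cause: modern computers and applications are not fast or intelligent enough; tuning is the way to deliver an intelligent system.

Concept: When to tune performance?
- Newly developed systems and production systems already in operation.
- Performance tuning? En, it is the vendor's responsibility.
- It begins at the system architecture stage and continues through outline design, coding, modification, and implementation, across the whole application life cycle.
- It includes application architecture, EJB design, and table/index design.
- But how do we tune system performance?

Methodology: How
0. Build trust with the customer.
1. Ask questions first.
2. Understand the application architecture (build a system that matches the application's characteristics).
3. Use recording tools to establish a baseline.
4. Inspect the system.
5. Isolate the problem / simplify the problem.
6. Change one parameter at a time (usually impossible).
7. Know when to stop tuning.

1. Ask questions first
- Sharpening the axe does not delay the woodcutting.
- Understand the system performance problem:
  - Learn the symptoms.
  - Clarify the goal.
  - Learn the history: if you don't know where you came from, you don't know where you are going.
- Design your questions (questioning technique): find out what changed, under what circumstances, and what symptoms resulted.
  - SMIC: file count grew to 3.5 million; CPU sys time was too high.
  - E-BUPT SMS testing: JVM version.

2. Understand application architecture
- System architecture (two-tier or multi-tier? middleware?)
- Drill down layer by layer to understand the whole application architecture; trace key problems down to the function-call level.
- Business/data processing flow
- How many (concurrent) users (kept connected vs. active)?
- Number of processes and threads, system-wide and per user; stack requirements
- Virtual and physical memory requirements
- IPC (shared memory, semaphores, message queues)
- Opened and mapped file requirements
- How much I/O? Access pattern (sequential or random? steady or bursty?)
- What kind of networking resources are needed (how many sockets, ...)?

2. Understand application architecture
[Diagram; labels: process 1, process 2, process 3, flat files, directory (x2), memory-resident database, Oracle shadow, database, pipe, signal (x2), shared memory/semaphore sets, sigtimedwait() (x2), semop() (x4)]

3. Use monitoring tools
- Record system performance data with whichever tool suits you or you prefer: measureware, sar, glance, OV performance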
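  mgt
- Having the system's historical performance data helps you:
  - Solve problems and tune the system
  - Make comparisons
    - Normal operation vs. when the problem occurs
    - Between tuning steps
    - Easy roll-back
  - Do capacity planning, by examining how resource utilization changes as business volume rises and falls

4. What to look at?
- First impression: system resource utilization and queue lengths
- Observe carefully: global wait states and global system calls
- Study closely: process system calls and process waits
  - Is the process running?
  - If not, what is it waiting on?
  - If it is, where is it spending its time?
  - Whatever it is doing, is it supposed to be doing it this way?

5. Where is the problem? (Determine the problem domain)
- Hardware? Operating system? Database? Middleware? Application?
- Isolate the problem
  - Shanghai Pudong Development Bank
  - GuoXin Lucent: telecom network management software
  - ICBC credit ledger system: backup took 24 hours
  - Harbin Commercial Bank: account balancing took 3 hours
- Simplify the problem

6. Cure it (prescribe the right remedy for the symptoms)
- Methods
  - Direct pinpointing
  - Elimination
- Change one thing at a time
- Observe the results
- Schedule the time
- Repeat

7. When to stop
- When business needs are met, not when a resource-utilization figure is reached
  - Throughput
  - Response time
- When the most expensive resources in the system are fully used
  - CPU
  - MEM
  - I/O
- Understand the 80/20 rule

Performance tuning: technology
- Four layers: Application, Middleware, Database, O.S., H.W.
- Resources: CPU, MEM, Storage, Network

Tools
- To do a good job, one must first sharpen one's tools.

Tools for Application profiling (HP World 2003 Solutions and Technology Conference & Expo)
- Application layer (too idle, too busy)
- Too idle: processes are blocked and system resource utilization is low (especially CPU). Where is the process blocked? Who holds the resource it needs? Why is that resource not released in time?
  - Answer: Where is that process spending its time? What kernel code is it executing? What user-space code is it executing? What is it waiting for? Who had the resource I needed?
  - Case: ICBC credit system
- Too busy: resource utilization is too high. Where is the process's time going?
  - Where is the thread/process spending its time? Kernel code: active or sleeping? User space: active or sleeping?
  - Case: Neusoft Insurance application
  - Whatever it is doing, is it supposed to be doing it this way? Between the application developers, the customer, and HP, somebody had better know.
  - Can I use a different resource instead?
- Tools
  - Trace in the application source code, not at the H.W., O.S., middleware, or DB level
  - For C: tusc, prospect, caliper
  - For Java: HPJconfig, HPJmeter, HPJtune
  - They do not work for an application bug!
- Solutions
  - The application itself, chatr, psrset, mpctl, wlm, prm
  - Cases: ICBC ATM/SMIC; Neusoft Insurance Application (sorting)

Prospect
- The key to performance is "how does my application spend its time?"
- Prospect provides:
  - System-wide summary (yawn), but also
  - Per-process and per-thread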
    data:
    - Execution/blocked: where a process blocks and for how long
    - Userspace (application and library) and kernel profile
    - Even instruction-level profiling!
  - Per-CPU information: User, Idle, System, Interrupt, Trap, context switches, etc.
- Drawbacks:
  - No use for I/O bottlenecks
  - Intimidating output

Caliper
- General-purpose performance analysis tool for Itanium-based systems
- Works directly on the binary as-is
- Features
  - Measurement of various metrics, e.g. fprof, dtlb_miss, itlb_miss, dcache_miss, icache_miss, branch_prediction, func_cover, etc.
  - Correlation of performance data with source
  - Easy inclusion/exclusion of specific load modules
  - Per-thread/aggregated thread report
  - Multi-process selection capability
  - Text/HTML output

Caliper: Metrics
Measure               Configuration file
hot spots             fprof, cgprof
call graph            cgprof
cache/TLB behavior    dcache_miss, icache_miss, dtlb_miss, itlb_miss
branch prediction     branch_prediction
call counts           func_count, cgprof
function coverage     func_cover
CPU events            total_cpu
compiler feedback     pbo

Tusc
- Traces Unix system calls for a specified process
- Run a command under tusc, or attach to a PID
- Shows
  - System calls, their input parameters, and their return values
  - System call duration
  - Thread id of the thread making each system call
  - Signal information, etc.
- Usage: tusc -T -f -e -l -o
- Useful options:
  -c       Summary of syscall counts, errors, and CPU time
  -ccc     CPU time for every syscall
  -T %T    Print a timestamp before every trace
  -r all   Display read buffer for all reads
  -w all   Display write buffer for all writes
  -f       Follow fork()s

GlancePlus Options (backup slide)
GlancePlus Commands Menu
  h - Online Help
  q - exit (or e)
  A - Application List
  g - Process List
  d - Disk Report
  P - PRM Group List
  a - CPU By Processor
  i - IO By File System
  Y - Global System Calls
  c - CPU Report
  u - IO By Disk
  F - Process Open Files
  m - Memory Report
  v - IO By Logical Volume
  M - Process Memory Regions
  t - System Tables
  N - NFS Global Activity
  R - Process Resources
  w - Swap Space
  n - NFS By System
  W - Process Wait States
  B - Global Waits
  l - Network By Interface
  L - Process System Calls
  Z - Global Threads
  T - Trans Tracker
  y - Renice Process
  G - Process Threads
  H - Alarm History
  s - Select Process
  I - Thread Resource
  J - Thread Wait
  S - Select Disk/NFS/Appl/Trans/Thread

Performance tuning: Middleware
- Transaction processing monitor
  - tuxedo
  - MSSQ (separate reply and send queue)
  - MSMQ
- Application server
  - Core server
  - Web applications
  - JDBC
  - EJB

Core Server
- JVM: heap size, 32-bit/64-bit, gc
- Thread count
  - Best tuned empirically; tune for high CPU utilization and throughput
  - Do not overdo it!
  - The effect may be constrained by pool sizes (SLSB, MDB, connection pool)
- Application/module-specific execute queues
  - Useful in preventing deadlocks
  - Useful in handling request bursts/surges

Core Server
- Chunk tuning
  - Required only for large requests/responses
  - Reduces the number of socket reads/writes accordingly
  - Flags: -Dweblogic.ChunkSize and -[...]
  - Main factors: MTU and request size
  - Set on both client and server
  - Higher memory requirements
- Native I/O: Java I/O is not necessarily slower than native I/O

Web Container
- Uncommon but crippling factor: synchronization in application code
- HTTP session persistence
  - If only a small percentage of the session data changes, isolate that data into a separate object.

Web Container
- Disable servlet and JSP page checks:
  - jsp-param pageCheckSeconds=-1
  - servlet-reload-check-secs=-1 in container-descriptor
  - ServletReloadCheckSecs=-1 in the WebAppComponent element in config.xml
  - For JSPs, both of these have to be set.
  - Gains of 5% to 8% have been observed.
- JSP caching tag
  - Supports both output and input caching
  - WLS-proprietary

JDBC
- JDBC driver
  - Choosing the right JDBC driver is very important.
  - The Oracle thin 10g driver is significantly faster than the 9i thin driver.
- Connection pooling
  - Do not open database connections dynamically: set min-capacity = max-capacity.
  - Connection pool size = number of threads performing database operations concurrently.
  - Big overhead: ConnectionLeakProfiling, CheckConnectionOnReserve/Release
- Prepared-statement cache
  - Connection-specific cache of compiled SQL statements
  - Requires use of [...]
  - Default is 10; should be set to the total number of prepared statements in the application.
  - For EJB applications, a rough calculation is: no. of prepared
    statements = (no.-of-finders + no.-of-field-groups + 3) * (no.-of-CMP beans)
  - The actual number could be higher.

EJB
- General
  - Use the proven session-façade pattern
  - Avoid the "RequiresNew" transaction attribute
  - Use local interfaces or call-by-reference
- Stateless session beans and message-driven beans
  - Max-beans-in-free-pool and Initial-beans-in-free-pool
- Stateful session beans
  - Max-beans-in-cache large enough to handle concurrent users
  - Avoid SFSB replication

EJB
- Entity beans
  - Caching is the biggest performance booster.
  - Caching only helps for pk-finders and cached references.
  - Choose the concurrency strategy, caching, and locking behavior for the best performance.
  - For example, for REPEATABLE_READ you could use:
    - Database-concurrency + use-select-for-update*
    - Optimistic-concurrency + cache-between-transactions + read-verification
  - The second combination would be more performant, because of:
    - cache-between-transactions
    - exclusive DB locks held for a very short time during commit
  - Read-only beans
  * Not supported for all databases

EJB
- CMP entity beans
  - Enable batch inserts and updates: enable-batch-operations
  - Turn off include-updates on finders and turn off existence-checks: both are on by default; turn them off if not needed
  - Relationship-caching: uses multi-level joins to load multiple beans with a single SQL statement
  - Field-groups: allow grouping of commonly used CMP fields

Performance tuning: database
- Database principles
  - Design carefully (redo log, tablespace, schema, table, index, stored procedure)
  - S.A.M.E. / EVA / Oracle10g ASM
  - Connect the top (application) with the bottom (IT infrastructure) and plan accordingly
  - Build a system that suits the application's characteristics
  - Don't embroider on sackcloth: the foundation has to be good

Tools for Database/Oracle
- STATSPACK, ESTAT/BSTAT
  - Live in the RDBMS directory
  - Snapshot of database resource usage for a workload
  - Not real-time
- Oracle Enterprise Manager tools (9i & 10g)
  - Diagnostics Pack
  - Tuning Pack
  - Real-time
- TKPROF
  - Query analysis
- 3rd party
  - Quest Software: Spotlight, SQL*lab, Toad

Database/Oracle performance tips
- I/O intensive: apply the universally valid principle, S.A.M.E.
- SGA
  sizing (going too far is as bad as falling short)
  - db_cache_size
  - log_buffer
  - Shared_pool
- PGA sizing
  - Number of concurrent users
  - Memory used by each process
  - 8i, 9i & 10g
  - Sort area

Today's RAID Sets
[Diagram; labels: SCSI Bus 1 to SCSI Bus 6, RAID 0 Volume, RAID 5 Volume, RAID 1 Volume]
- Eliminates the bottleneck at its root and raises throughput
- Read/write I/O load is spread evenly across all disks of each disk group
- Avoids a complex analysis and tuning process for applications and databases
- Add disks; leveling is automatic
- HP EVA: a hardware implementation of S.A.M.E.

Performance tuning: O.S.
- O.S. principles
  - Deconfigure unnecessary modules: Apache, NIS, NFS, security, ...
  - Configure kernel parameters correctly: vm, proc, ipc, inet, fs, io
  - Configure the system as the application installation guide directs

Tools for O.S.
- O.S. tools
  - Glance (the Swiss Army knife), top, sar, iostat, vmstat
  - Kprofile, ktrace, cyclemeter, kgmon, KItrace, spinwatch, ipcrm, ipcs, timex, ndd, rtprio, rtsched, tunefs/vxtunefs, fsadm, scsictl, intctl, gprof, fuser, lsof, kctune
- Storage I/O: ramdisk, diskbench
- Networking: netperf, netstat, lanscan, lanshow

Performance tuning: H.W.
- nPar, vPar, PRM/WLM, psrset, mpctl
- Can the H.W. satisfy my needs? How do you know?
  - Is the application NUMA-aware?
  - Does TPC-C match all applications?
- CPU count? Memory sizing? I/O adaptor, controller, disks
- Scalability issues?
  - Scale-in: depends on the application spec
  - Scale-out: middleware/RAC
- Product-specific features: SD/NUMA, EVA, XP
- Network I/O

Performance tuning: case study
CPU tips
- CPU not busy, system slow
  - Jinan Railway Bureau ops
  - The bottleneck is somewhere else
- CPU
  sys time too high, system slow
  - Shenzhen Huawei, SMIC
- CPU 100% busy, system slow
  - Chengdu electric power billing system (Sybase)
  - ICBC head office credit ledger system
  - Insurance: multi-table join
  - FTS: no index

Performance tuning: case study
Memory and swap
- swapmem_on: allows physical memory size to exceed the available swap space
- System thrashing
  - Ningbo State Taxation Bureau centralized tax collection system
  - Beijing (Ministry of Public Security) second-generation ID card system

Performance tuning: case study
Network
- China Union Pay
- Web-based applications, such as the Ministry of Public Security population system
- FuJian 163 DNS server benchmark testing

Performance tuning: case study
I/O principles
- Minimize I/O operations (application design, large memory)
- As many HBAs and controllers as possible
- S.A.M.E.
- Async and direct I/O
Case study
- E-BUPT intelligent network prepaid testing: separate the redo log
- Simple information mining
  - The public security population database
  - SMS, phone, email, credit cards, etc.

Summary and some advice
- Every person is different, and so is every application.
- Don't believe the boasts of "experts", me included.
- Work on one problem at a time.
- Work on memory first, but don't assume memory is a cure-all.
- Swapping is A Really Bad Thing.
- CPU bottlenecks can be caused by problems in other areas.
- You can't pour 11 liters of vinegar into a 10-liter bottle.
- A fast system with one slow part is a slow system.
- Write-back cache is A Really Good Thing.
- Know when to stop. Law of 80/20.
- Find strong backing. Learning & buck-passing.

Partner + hp = everything is possible
- Providing lasting power for enterprise IT operations

Old engineering proverb
- You don't become a hero by resolving problems that you haven't yet allowed to become a crisis.

What is a good benchmark?
A good benchmark is:
- Run for the right reasons
- Representative of your environment and workload
- Meticulously planned, with contingencies
- Very hard work
- A fabulous learning opportunity
- An opportunity to evaluate both people and products
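
Backup slide: worked example for the prepared-statement cache estimate

The JDBC slide gives a rough calculation for EJB applications: (no.-of-finders + no.-of-field-groups + 3) * (no.-of-CMP beans). The Python sketch below only works through that arithmetic. The function name and the sample counts are hypothetical, and it assumes the finder and field-group counts are per-bean averages, which the slide does not state.

```python
def estimate_statement_cache_size(finders, field_groups, cmp_beans):
    """Rough prepared-statement count from the JDBC slide:
    (finders + field_groups + 3) * cmp_beans.

    Assumption: finders and field_groups are average counts per CMP bean.
    """
    if min(finders, field_groups, cmp_beans) < 0:
        raise ValueError("counts must be non-negative")
    return (finders + field_groups + 3) * cmp_beans


# Hypothetical application: 12 CMP beans, each with about
# 4 finders and 2 field groups.
estimate = estimate_statement_cache_size(finders=4, field_groups=2, cmp_beans=12)
print(estimate)  # (4 + 2 + 3) * 12 = 108
```

With these made-up counts the estimate is far above the default of 10 mentioned on the slide, and the slide itself warns that the actual number could be higher still, so the result is best treated as a starting point to verify by measurement.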