Applied Multivariate Statistical Analysis (应用多元统计分析.ppt)

Resource ID: 5723439 | Size: 1.87 MB | Pages: 38 | Format: PPT
Preface to the 1st Edition

Most of the observable phenomena in the empirical sciences are of a multivariate nature. In financial studies, assets in stock markets are observed simultaneously and their joint development is analyzed to better understand general tendencies and to track indices. The underlying theoretical structure of these and many other quantitative studies of applied sciences is multivariate. This book on Applied Multivariate Statistical Analysis presents the tools and concepts of multivariate data analysis with a strong focus on applications.

The aim of the book is to present multivariate data analysis in a way that is understandable for non-mathematicians and practitioners who are confronted by statistical data analysis. This is achieved by focusing on the practical relevance and through the e-book character of this text. All practical examples may be recalculated and modified by the reader using a standard web browser and without reference to or application of any specific software.

The book is divided into three main parts. The first part is devoted to graphical techniques describing the distributions of the variables involved. The second part deals with multivariate random variables and presents, from a theoretical point of view, distributions, estimators and tests for various practical situations. The last part is on multivariate techniques and introduces the reader to the wide selection of tools available for multivariate data analysis. All data sets are given in the appendix and are downloadable from md-stat. The text contains a wide variety of exercises, the
solutions of which are given in a separate textbook. In addition, a full set of transparencies on md-stat is provided, making it easier for an instructor to present the materials in this book. All transparencies contain hyperlinks to the statistical web service so that students and instructors alike may recompute all examples via a standard web browser.

Weeks 1-2

UNIT I: Descriptive Techniques

1 Comparison of Batches
  • 1.1 Boxplots
  • 1.2 Histograms
  • 1.3 Scatterplots
  • 1.4 Data Set: Boston Housing

1 Comparison of Batches

Multivariate statistical analysis is concerned with analyzing and understanding data in high dimensions. We suppose that we are given a set $\{x_i\}_{i=1}^n$ of $n$ observations of a variable vector $X$ in $\mathbb{R}^p$. That is, we suppose that each observation $x_i$ has $p$ dimensions: $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$, and that it is an observed value of a variable vector $X \in \mathbb{R}^p$. Therefore, $X$ is composed of $p$ random variables: $X = (X_1, X_2, \ldots, X_p)$, where $X_j$, for $j = 1, \ldots, p$, is a one-dimensional random variable.

How do we begin to analyze this kind of data? Before we investigate questions on what inferences we can reach from the data, we should think about how to look at the data. This involves descriptive techniques. Questions that we could answer by descriptive techniques are:
  • Are there components of $X$ that are more spread out than others?
  • Are there some elements of $X$ that indicate subgroups of the data?
  • Are there outliers in the components of $X$?
  • How "normal" is the distribution of the data?

1.1 Boxplots

[Figures: boxplots for the variables X6 and X1]

Summary

The median and mean bars are measures of location. The relative location of the median (and the mean) in the box is a measure of skewness. The length of the box and whiskers are a measure of spread. The length of the whiskers indicates the tail length of the distribution. The outlying points are marked with one of two different symbols, depending on whether they lie outside of $F_{UL} \pm 1.5\,d_F$ or
$F_{UL} \pm 3\,d_F$, respectively. The boxplots do not indicate multimodality or clusters. If we compare the relative size and location of the boxes, we are comparing distributions.

1.2 Histograms

[Figure: histogram of the diagonal, h = 0.4]

Histograms are density estimates. A density estimate gives a good impression of the distribution of the data. In contrast to boxplots, density estimates show possible multimodality of the data. The idea is to locally represent the data density by counting the number of observations in a sequence of consecutive intervals (bins) with origin $x_0$. Let $B_j(x_0, h)$ denote the bin of length $h$ which is the element of a bin grid starting at $x_0$:

$$B_j(x_0, h) = [x_0 + (j-1)h,\; x_0 + jh), \quad j \in \mathbb{Z},$$

where $[\cdot, \cdot)$ denotes a left-closed and right-open interval.

If $\{x_i\}_{i=1}^n$ is an i.i.d. sample with density $f$, the histogram is defined as follows:

$$\hat{f}_h(x) = n^{-1} h^{-1} \sum_{j \in \mathbb{Z}} \sum_{i=1}^{n} I\{x_i \in B_j(x_0, h)\}\, I\{x \in B_j(x_0, h)\}. \qquad (1.7)$$

In sum (1.7) the first indicator function $I\{x_i \in B_j(x_0, h)\}$ counts the number of observations falling into bin $B_j(x_0, h)$. The second indicator function $I\{x \in B_j(x_0, h)\}$ is responsible for "localizing" the counts around $x$. The parameter $h$ is a smoothing or localizing parameter and controls the width of the histogram bins. An $h$ that is too large leads to very big blocks and thus to a very unstructured histogram. On the other hand, an $h$ that is too small gives a very variable estimate with many unimportant peaks.

[Figure 1.6: histograms of the diagonal for h = 0.1, h = 0.2, h = 0.3 and h = 0.4]

The effect of $h$ is given in detail in Figure 1.6. It contains the histogram (upper left) for the diagonal of the counterfeit bank notes for $x_0 = 137.8$ (the minimum of these observations) and $h = 0.1$. Increasing $h$ to $h = 0.2$ and using the same origin, $x_0 = 137.8$, results in the histogram shown in the lower left of the figure. This density histogram is somewhat smoother due to the larger $h$. The binwidth is next set to $h = 0.3$ (upper right). From this histogram, one has the impression that the distribution of the diagonal is bimodal with peaks at
about 138.5 and 139.9. The detection of modes requires a fine tuning of the binwidth. Using methods from smoothing methodology one can find an "optimal" binwidth $h$ for $n$ observations:

$$h_{\mathrm{opt}} = \left( \frac{24\sqrt{\pi}}{n} \right)^{1/3}.$$

In Figure 1.7, we show histograms with $x_0 = 137.65$ (upper left), $x_0 = 137.75$ (lower left), $x_0 = 137.85$ (upper right), and $x_0 = 137.95$ (lower right). All the graphs have been scaled equally on the y-axis to allow comparison. One sees that, despite the fixed binwidth $h$, the interpretation is not facilitated. The shift of the origin $x_0$ (to 4 different locations) created 4 different histograms. This property of histograms strongly contradicts the goal of presenting data features.

Summary

Modes of the density are detected with a histogram. Modes correspond to strong peaks in the histogram. Histograms with the same $h$ need not be identical. They also depend on the origin $x_0$ of the grid. The influence of the origin $x_0$ is drastic. Changing $x_0$ creates different looking histograms. The consequence of an $h$ that is too large is an unstructured histogram that is too flat. A binwidth $h$ that is too small results in an unstable histogram. There is an "optimal" $h = (24\sqrt{\pi}/n)^{1/3}$. It is recommended to use averaged histograms. They are kernel densities.

1.4 Scatterplots

Scatterplots are bivariate or trivariate plots of variables against each other. They help us understand relationships among the variables of a data set. A downward-sloping scatter indicates that as we increase the variable on the horizontal axis, the variable on the vertical axis decreases. An analogous statement can be made for upward-sloping scatters.

Figure 1.12 plots the 5th column (upper inner frame) of the bank data against the 6th column (diagonal). The scatter is downward-sloping. As we already know from the previous section on marginal comparison, a good separation between genuine and counterfeit bank notes is visible for the diagonal variable. The sub-cloud in the upper half (circles) of
Figure 1.12 corresponds to the true bank notes. As noted before, this separation is not distinct, since the two groups overlap somewhat.

Summary

Scatterplots in two and three dimensions help in identifying separated points, outliers or sub-clusters. Scatterplots help us in judging positive or negative dependencies. Draftman scatterplot matrices help detect structures conditioned on values of other variables. As the brush of a scatterplot matrix moves through a point cloud, we can study conditional dependence.

1.8 Data Set: Boston Housing
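
The quantities that the boxplot summary in Section 1.1 refers to can be computed directly. The following is a minimal sketch, not the book's own code. It assumes that F_U and F_L are the upper and lower quartiles, that d_F = F_U - F_L, and that "outside" means below F_L minus a multiple of d_F or above F_U plus that multiple. The function name `boxplot_summary` and the sample values are hypothetical, and `numpy.percentile` interpolates linearly, which may differ slightly from a depth-based definition of the fourths.

```python
import numpy as np

def boxplot_summary(x):
    """Return the location, spread and outlier quantities shown in a boxplot."""
    x = np.asarray(x, dtype=float)
    # Lower fourth, median and upper fourth (assumed here to be the quartiles).
    f_l, median, f_u = np.percentile(x, [25, 50, 75])
    d_f = f_u - f_l  # F-spread: the length of the box
    # Inner limits at 1.5 * d_F and outer limits at 3 * d_F beyond the box.
    beyond_inner = (x < f_l - 1.5 * d_f) | (x > f_u + 1.5 * d_f)
    beyond_outer = (x < f_l - 3.0 * d_f) | (x > f_u + 3.0 * d_f)
    return {
        "median": median,
        "mean": x.mean(),
        "F_L": f_l,
        "F_U": f_u,
        "d_F": d_f,
        "outside": x[beyond_inner & ~beyond_outer],  # between the two limits
        "far_outside": x[beyond_outer],              # beyond the 3 * d_F limit
    }

# Hypothetical sample with two large values in the right tail.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 20.0, 40.0]
summary = boxplot_summary(sample)
for name, value in summary.items():
    print(name, value)
```

In this sample the mean lies above the median, which is the kind of asymmetry the summary describes as skewness, and the two largest values fall into the two different outlier classes.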

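The histogram of Section 1.2 counts the observations that share a bin with the evaluation point x and divides by n times h. A small sketch of that idea follows, written to show how the origin x0 changes the estimate even when h is fixed. The function name `histogram_density` and the toy data are hypothetical and are not the bank note data from the text.

```python
import numpy as np

def histogram_density(x, data, x0, h):
    """Histogram density estimate at the points x, for origin x0 and binwidth h."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    # Index j of the bin [x0 + (j-1)h, x0 + jh) that contains each value.
    j_x = np.floor((x - x0) / h).astype(int) + 1
    j_data = np.floor((data - x0) / h).astype(int) + 1
    # For each evaluation point, count the observations that share its bin.
    counts = (j_data[None, :] == j_x[:, None]).sum(axis=1)
    return counts / (data.size * h)

# Hypothetical toy sample, chosen so that no value sits on a bin edge.
data = [0.1, 0.2, 0.7, 1.2, 1.3, 1.4]

# Same binwidth, two different origins.
at_origin_0 = histogram_density([0.25, 0.75, 1.25], data, x0=0.0, h=0.5)
same_h_shifted = histogram_density(1.0, data, x0=-0.25, h=0.5)
same_h_unshifted = histogram_density(1.0, data, x0=0.0, h=0.5)

print(at_origin_0)       # bin heights for the grid starting at 0
print(same_h_unshifted)  # estimate at x = 1.0 with origin 0
print(same_h_shifted)    # estimate at x = 1.0 with origin -0.25
```

With the origin at 0 the point x = 1.0 shares its bin with three observations, while with the origin at -0.25 it shares its bin with only one, so the two estimates at the same point differ by a factor of three although h is unchanged.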
Note: this document (应用多元统计分析.ppt) was uploaded by site member 牧羊曲112. 三一办公 only provides storage space for uploaded files and does not modify or edit the uploaded content.
