    Artificial Intelligence and Data Mining Courseware: lect113.ppt

    • Resource ID: 5108057       Size: 345 KB        Pages: 39
    • Format: PPT

    Part I: Data Mining Fundamentals
    Chapter 1: Data Mining: A First View
    Artificial Intelligence and Data Mining courseware, lect-1-13, 6/5/2023, BUPT AI&DM

    Content
    • 1.1 What is Data Mining? Definition
    • 1.2 What Can Computers Learn?
    • 1.3 Is Data Mining Appropriate for My Problem?
    • 1.4 Expert Systems or Data Mining?
    • 1.6 Why Not Simple Search?

    1.1 What is Data Mining: Motivation
    • Data explosion problem
      • Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories.
      • Such amounts of data are beyond human understanding.
      • We are drowning in data, but starving for knowledge!
    • Solution: data warehousing and data mining
      • Data warehousing: for data storage
      • Data mining: for extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases

    1.1 Data Mining is a Result of the Natural Evolution of Information Technology
    • 1960s: data collection and database creation
    • 1970s to early 1980s: database management systems
    • Mid-1980s to present:
      • Data warehouses
      • Data analysis and understanding (data mining)

    Data Analysis: New Trend
    This is a time when one must speak with data.
    • The future belongs to the number crunchers (Super Crunchers, Ian Ayres, 2009): everyday decisions will become more and more automated, and the role of human judgment will be limited to supplying data for the computation.
    • Predicting the taste and aroma of wine: Orley Ashenfelter, an economist at Princeton University, knows nothing about winemaking, yet he can predict the price of Bordeaux wines from the weather (hot, dry years produce very good wine), and more accurately than wine experts.
    • The book was originally going to be called "The End of Theory". The title was later changed by testing on Google rather than by discussion with the publisher's editors, because the new title drew a 63% higher click rate.
    • Loan officers used to be well paid and to carry the greatest responsibility. Now they are little more than call-center operators who repeat the questions the computer prompts, for low pay.

    Data Analysis: New Trend (cont.)
    This is a time when one must speak with data.
    • Gene sequencing and new species: Craig Venter used high-speed computers capable of analyzing data. Having started by sequencing the genes of individual organisms, he began sequencing the ocean in 2003 and the air in 2005. In the process, thousands of previously unknown bacteria and other life forms were discovered. He has advanced biology more than any of his contemporaries.

    • In the past, warranty problem analysis at Shanghai GM relied mainly on simple, purely manual calculation, and each round produced only a handful of problem reports. Although vehicle production was far lower than it is today, this slow and laborious analysis cycle was a root cause of persistently high warranty costs. In the non-automated environment, it took 6 to 12 months on average from the appearance of a warranty claim to the identification of its cause. Support from GM worldwide was often needed during that time, and the whole problem-solving process rested mainly on experience-based analysis. In addition, inaccurate data made it hard for Shanghai GM to predict warranty costs accurately and to prepare a reasonable warranty budget for the next period, which tied up large amounts of operating capital and reduced cash flow.
    • After adopting the SAS warranty analysis solution, Shanghai GM shortened its warranty analysis cycle by 70% within the first 6 months, which effectively lowered warranty costs and met the expected goals for the system. These improvements also helped Shanghai GM recover all hardware and software investment in the warranty analysis system within just half a year, saving the company a total of RMB 18 million.
    • Police geographic information system

    Data Mining Definitions
    • (1) The process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data. (in this textbook)
    • (2) Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases. (generally accepted)

    Induction-based Learning
    • Data mining methods use induction-based learning.
    • Induction-based learning: the process of forming general concept definitions by observing specific examples of the concepts to be learned.

    What Is Data Mining?
    • Alternative names:
      • Data mining or knowledge mining?
      • Gold mining: a poor analogy
      • Knowledge discovery in databases (KDD), business intelligence

    Why Data Mining? Potential Applications (or p4)
    • Database analysis and decision support
      • Market analysis and management: target marketing, cross selling, market segmentation
      • Risk analysis and management: forecasting, customer retention, quality control
      • Fraud detection and management
    • Other applications
      • Text mining (news groups, email, documents) and Web analysis

    Content
    • 1.1 What is Data Mining? Definition
    • 1.2 What Can Computers Learn?
      • Four Levels of Learning (omitted)
      • Three Concept Views (omitted)
      • Supervised Learning
      • Unsupervised Learning
    • 1.3 Is Data Mining Appropriate for My Problem?
    • 1.4 Expert Systems or Data Mining?
    • 1.6 Why Not Simple Search?

    1.2.1 Supervised Learning
    • Build a learner model using data instances of known origin.
    • Use the model to determine the outcome of new instances of unknown origin.

    • Attributes: input attributes, output attributes
    • Process: training data, test data
    • Learning outcome: tree, production rules

    • Decision tree: a tree structure where non-terminal nodes represent tests on one or more attributes and terminal nodes (leaf nodes) reflect decision outcomes.
    • [Figure label: root node]

    Production Rules
    • IF Swollen Glands = Yes THEN Diagnosis = Strep Throat
    • IF Swollen Glands = No & Fever = Yes THEN Diagnosis = Cold
    • IF Swollen Glands = No & Fever = No THEN Diagnosis = Allergy
    • Antecedent conditions: the preconditions (the IF part)
    • Consequent conditions: the conclusion (the THEN part)

    1.2.2 Unsupervised Clustering
    • A data mining method that builds models from data without predefined classes.

    The Acme Investors Dataset
    [Dataset table not preserved in the extracted text.]

    The Acme Investors Dataset & Supervised Learning
    • Can I develop a general profile of an online investor?
    • Can I determine if a new customer is likely to open a margin account?
    • Can I build a model to accurately predict the average number of trades per month for a new investor?
    • What characteristics differentiate female and male investors?

    The Acme Investors Dataset & Unsupervised Clustering
    • What attribute similarities group customers of Acme Investors together?
    • What differences in attribute values segment the customer database?

    (see example clusters on p13)
    • IF Margin Account = Yes & Age = 20-29 & Annual Income = 40-59k
      THEN Cluster = 1
      accuracy = 0.80, coverage = 0.50
    • IF Account Type = Custodial & Favorite Recreation = Skiing & Annual Income = 80-90k
      THEN Cluster = 2
      accuracy = 0.95, coverage = 0.35
    • IF Account Type = Joint & Trades/Month 5 & Transaction Method = Online
      THEN Cluster = 3
      accuracy = 0.82, coverage = 0.65

    Content
    • 1.1 What is Data Mining? Definition
    • 1.2 What Can Computers Learn?
    • 1.3 Is Data Mining Appropriate for My Problem? (Data Mining vs Data Query)
    • 1.4 Expert Systems or Data Mining?
    • 1.6 Why Not Simple Search?

    Data Mining or Data Query?
    • Shallow knowledge: shallow knowledge is factual. It can be easily stored and manipulated in a database.
    • Multidimensional knowledge: multidimensional knowledge is also factual. On-line analytical processing (OLAP) tools are used to manipulate multidimensional knowledge.
    • Hidden knowledge: hidden knowledge represents patterns or regularities in data that cannot be easily found using database query. However, data mining algorithms can find such patterns with ease (example p15).
    • Deep knowledge: deep knowledge is knowledge stored in a database that can only be found if we are given some direction about what we are looking for.

    Data Mining vs. Data Query: An Example (p16)
    • Use data query if you already almost know what you are looking for.
    • Use data mining to find regularities in data that are not obvious.

    Content
    • 1.1 What is Data Mining? Definition
    • 1.2 What Can Computers Learn?
    • 1.3 Is Data Mining Appropriate for My Problem? (Data Mining vs Data Query)
    • 1.4 Expert Systems or Data Mining? (Data Mining vs ES)
    • 1.6 Why Not Simple Search?

    1.4 Expert Systems or Data Mining?
    • Expert system (ES): a computer program that emulates the problem-solving skills of one or more human experts.
    • Used when no data (or no data of sufficient quality) is available, or in fields where humans already have good knowledge.
    • Experts learn their skills through education and experience.
    • Human experts often use rules to describe what they know.
    • ES and DM can work together.

    Content
    • 1.1 What is Data Mining? Definition
    • 1.2 What Can Computers Learn?
    • 1.3 Is Data Mining Appropriate for My Problem? (Data Mining vs Data Query)
    • 1.4 Expert Systems or Data Mining? (Data Mining vs ES)
    • 1.6 Why Not Simple Search? (Data Mining vs Nearest Neighbor Approach)

    1.6 Why Not Simple Search?
    • Store the instances themselves, or a generalized model of the data?
    • Nearest neighbor classifier
      • Classification is performed by searching the training data for the instance closest in distance to the unknown instance.
      • Advantage: suitable for areas where humans have limited knowledge.
      • Problems:
        • Slow when the number of cases is large.
        • Attributes need preprocessing.
      • Example: p22. The most similar instance is instance 4, but it gives a wrong conclusion.
    • K-nearest neighbor classifier
      • Helps ward off the effect of a single atypical training instance.
      • Still requires the relevant attributes to be determined.

    Discussion
    • When to use ES or DM?

    1.5 A Simple Data Mining Process Model
    [Figure 1.3: A simple data mining process model]

    Assembling the Data
    • The Data Warehouse
    • Relational Databases and Flat Files

    The Data Warehouse
    • The data warehouse is a historical database designed for decision support.

    Mining the Data

    Interpreting the Results

    Result Application

    A KDD Process (by Han)
    • Data mining: the core of the knowledge discovery process.
    • [Figure: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation → Knowledge]
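
    The three production rules on the Production Rules slide can be sketched in code as follows. This is only an illustration: the rule text comes from the slide, but the names `RULES` and `diagnose`, and the choice to return the first matching rule, are assumptions made here.

```python
from typing import Optional

# Each rule pairs antecedent conditions (attribute -> required value)
# with a consequent (the diagnosis). Rule contents are taken from the slide;
# the data layout itself is an assumption for illustration.
RULES = [
    ({"Swollen Glands": "Yes"}, "Strep Throat"),
    ({"Swollen Glands": "No", "Fever": "Yes"}, "Cold"),
    ({"Swollen Glands": "No", "Fever": "No"}, "Allergy"),
]


def diagnose(instance: dict) -> Optional[str]:
    """Return the consequent of the first rule whose antecedent fully matches."""
    for antecedent, consequent in RULES:
        if all(instance.get(attr) == value for attr, value in antecedent.items()):
            return consequent
    return None  # no rule fires for this instance


print(diagnose({"Swollen Glands": "No", "Fever": "Yes"}))
```

    Keeping the antecedent and the consequent as separate parts of each rule mirrors the IF/THEN split named on the slide, and makes it easy to add rules without touching the matching logic.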

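    The nearest neighbor and k-nearest neighbor classifiers from section 1.6 can be sketched as below. The slides only say "closest in distance", so Euclidean distance is an assumption here, as are the function name `knn_classify` and the small made-up dataset (it is not the textbook example from p22). The data is arranged so that one atypical instance misleads the 1-nearest-neighbor vote but not the 3-nearest-neighbor vote.

```python
import math
from collections import Counter


def knn_classify(training, query, k=1):
    """Classify `query` by majority vote among its k closest training instances.

    `training` is a list of (feature_vector, label) pairs.
    Distance is Euclidean (an assumption; the slides do not name a metric).
    """
    # Every query scans and sorts the whole training set, which is why the
    # approach becomes slow when the number of stored cases is large.
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical data, already scaled to [0, 1] (the "preprocessing" step).
training = [
    ((0.10, 0.20), "A"),
    ((0.20, 0.10), "A"),
    ((0.30, 0.30), "A"),
    ((0.22, 0.22), "B"),  # atypical instance sitting among the "A" cases
    ((0.90, 0.80), "B"),
    ((0.80, 0.90), "B"),
]
query = (0.23, 0.23)

print(knn_classify(training, query, k=1))  # decided by the single closest instance
print(knn_classify(training, query, k=3))  # decided by a vote of the three closest
```

    With k=1 the atypical instance alone decides the class; with k=3 its two "A" neighbors outvote it. Neither variant decides which attributes are relevant, which is the remaining problem the slide points out.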
    Note: this document (人工智能与数据挖掘教学课件lect113.ppt) was uploaded to 三一办公 by site member sccc.
