Computer Science English Exam Essay Topics.ppt
Data Mining
Yanci Zhang

What is Data Mining?
- Extraction of implicit, previously unknown, and potentially useful information from data
- Exploration and analysis of large quantities of data
  - by automatic or semi-automatic means
  - to discover meaningful patterns

Process of Knowledge Discovery

Example: NBA 1/2
- Play-by-play information
  - Who is on the court
  - Who shoots
- Coaches want to know
  - Who works best?
  - Which combination of strategies works best?

Example: NBA 2/2
- Advanced Scout is a data mining tool to answer these questions
  - Data collection
  - Data preprocessing: cleaning, transformations, enrichment
  - Data mining
  - Interpretation and knowledge discovery

What is (not) Data Mining?
- What is not data mining
  - Looking up a phone number in a phone directory
  - Querying a web search engine for information about “Amazon”
- What is data mining
  - Discovering that certain names are more prevalent in certain US locations (O'Brien, O'Rourke, O'Reilly in the Boston area)
  - Grouping together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, A)

Why Data Mining?
- Data rich but information poor
- We are drowning in data, but starving for knowledge

Tasks
- Prediction methods
  - Use some variables to predict unknown or future values of other variables
- Description methods
  - Find human-interpretable patterns that describe the data

Applications
- Data analysis and decision support
  - Market analysis and management
    - Beer and diapers
  - Risk analysis and management
    - Credit card risk analysis and control
  - Fraud detection and detection of unusual patterns

Applications (continued)
- Text mining and Web mining
- Stream data mining
- DNA and bio-data analysis
  - Similarity search and comparison among DNA sequences
  - Association analysis: identification of co-occurring gene sequences
  - Path analysis: linking genes to different disease development stages
  - Visualization tools and genetic data analysis

Challenges
- Scalability
- Dimensionality
- Complex and heterogeneous data
- Data quality
- Data ownership and distribution
- Privacy preservation
- Streaming data

Assignments
- Group
  - Group16: PC and MAC 10
  - Group17: PC and MAC 11
  - Group18: What is augmented reality?
  - Group38: What is a Graphics Processing Unit (GPU)?
- Individual
  - Write an English article: Applications of Data Mining (300 words)
  - Deadline: 2011-11-10
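
Supplementary sketch for the "Beer and diapers" item under Applications: that example refers to association analysis, which looks for items that tend to appear together in the same transaction. The Python sketch below is a minimal illustration, not the method used by any tool named in the slides. The basket data is invented, and the names `BASKETS`, `pair_counts`, and `rule_stats` are hypothetical helpers introduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets, invented purely for illustration.
BASKETS = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "diapers"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives every pair a single canonical (a, b) ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = fraction of all baskets containing both items
    confidence = fraction of baskets with the antecedent that also
                 contain the consequent
    """
    total = len(baskets)
    both = sum(1 for b in baskets if antecedent in b and consequent in b)
    with_antecedent = sum(1 for b in baskets if antecedent in b)
    support = both / total if total else 0.0
    confidence = both / with_antecedent if with_antecedent else 0.0
    return support, confidence


if __name__ == "__main__":
    print(pair_counts(BASKETS).most_common(3))
    print(rule_stats(BASKETS, "diapers", "beer"))
```

On this toy data the rule "diapers -> beer" has support 0.4 (2 of 5 baskets contain both) and confidence of about 0.67 (2 of the 3 baskets with diapers also contain beer). Real association mining would use an algorithm such as Apriori to avoid enumerating every pair on large data sets.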