Google - Social Network Platforms and Technologies in the Cloud Computing Era (1).ppt
Slide deck shared by a member of 三一办公 for online reading; 70 pages.
Ed Chang, 2/4/2023

Slide 1: Social Network Platforms and Technologies in the Cloud Computing Era
  Ed Chang (张智威)
  Deputy Director, Research Institute, Google China
  Professor, Department of Electrical Engineering, University of California

Slide 2: China Opportunity: China & U.S. in 2006-07
                           China                U.S.
  Internet Population      180 million (25%)    208 million (3%)
  Broadband Users          60 million (90%)     60 million (29%)
  Mobile Phones            500 million          180 million
  Engineering Graduates    600 k                72 k

Slide 3: Google China
  Size (700)
    200 engineers
    400 other employees
    Almost 100 interns
  Locations
    Beijing (2005)
    Taipei (2006)
    Shanghai (2007)

Slide 4: Organizing the World's Information, Socially
  Social Platform
  Cloud Computing
  Concluding Remarks

Slide 5: Web 1.0
  [diagram of linked files: .htm, .jpg, .doc, .msg]

Slide 6: Web with People (2.0)
  [diagram of files: .htm, .jpg, .doc, .xls, .msg]

Slide 7: + Social Platforms
  [diagram of files: .htm, .jpg, .doc, .xls, .msg; App (Gadget), App (Gadget)]

Slides 8-11: [no text]

Slide 12: Open Social Platform

Slides 13-16: [no text]

Slide 17: Open Social Platform, Social Platform

Slides 18-19: [no text]

Slide 20: Open Social Platform, Social Platform

Slide 21: [no text]

Slide 22: Social Graph

Slide 23: [no text]

Slide 24: What Users Want?
  People care about other people
    care about people they know
    connect to people they do not know
  Discover interesting information
    based on other people
    about who other people are
    about what other people are doing

Slide 25: Information Overflow Challenge
  Too many people, too many choices of forums and apps
  "I soon need to hire a full-time to manage my online social networks"
  Desiring a Social Network Recommendation System

Slide 26: Recommendation System
  Friend Recommendation
  Community/Forum Recommendation
  Application Suggestion
  Ads Matching

Slide 27: Organizing the World's Information, Socially
  Social Platform
  Cloud Computing
  Concluding Remarks

Slide 28: Industry Trend: The Arrival of the Cloud Computing Era
  (1) Data in the cloud: no fear of loss, no need to back up
  (2) Software in the cloud: no need to download, upgraded automatically
  (3) Ubiquitous cloud computing: any device is yours once you log in
  (4) Infinitely powerful cloud computing: unlimited storage, unlimited speed
  Picture source: http://www.sis.pitt.edu

Slide 29: Internet Search: An Example of Cloud Computing
  1. The user enters query keywords
  2. Data is preprocessed in a distributed fashion to serve searches (Cloud Computing)
     Google Infrastructure (thousands of commodity servers around the world)
     MapReduce for mass data processing
     Google File System
  3. Search results are returned

Slide 30: Collaborative Filtering
  Given a matrix that "encodes" data
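
Slide 29 names MapReduce as the tool for preprocessing search data. The following is a minimal single-process sketch of that map / shuffle / reduce pattern, building a tiny inverted index. It is an illustration only, not Google's implementation: the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_index`) and the three sample documents are assumptions made for this example.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Map: emit one (word, doc_id) pair per word occurrence."""
    for word in text.lower().split():
        yield word, doc_id

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(word, doc_ids):
    """Reduce: collapse one word's postings into a sorted, de-duplicated list."""
    return word, sorted(set(doc_ids))

def build_index(docs):
    """Run map, shuffle and reduce in sequence over a dict of documents."""
    pairs = [kv for doc_id, text in docs.items() for kv in map_phase(doc_id, text)]
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())

# Hypothetical toy corpus.
docs = {
    "d1": "cloud computing platform",
    "d2": "social platform in the cloud",
    "d3": "social graph",
}
index = build_index(docs)
print(index["cloud"])  # ['d1', 'd2']
```

In a real cluster the map and reduce calls would run on many machines and the shuffle would move data between them; here all three steps run in one process so the data flow is easy to follow.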
Slide 31: Given a matrix that "encodes" data
  Many applications (collaborative filtering):
    User-Community, User-User, Ads-User, Ads-Community, etc.
  [matrix axes: Users, Communities]

Slide 32: Collaborative Filtering (CF) [Breese, Heckerman and Kadie 1998]
  Memory-based
    Given user u, find "similar" users (k nearest neighbors)
      Bought similar items, saw similar movies, similar profiles, etc.
      Different similarity measures yield different techniques
    Make predictions based on the preferences of these "similar" users
  Model-based
    Build a model of the relationship between subject matters
    Make predictions based on the constructed model

Slide 33: Memory-Based Model [Goldberg et al. 1992; Resnick et al. 1994; Konstan et al. 1997]
  Pros
    Simplicity: avoids the model-building stage
  Cons
    Memory and time consuming: uses the entire database every time to make a prediction
    Cannot make a prediction if the user has no items in common with other users

Slide 34: Model-Based Model [Breese et al. 1998; Hofmann 1999; Blei et al. 2004]
  Pros
    Scalability: the model is much smaller than the actual dataset
    Faster prediction: query the model instead of the entire dataset
  Cons
    Model-building takes time

Slide 35: Algorithm Selection Criteria
  Near-real-time Recommendation
  Scalable Training
  Incremental Training is Desirable
  Can deal with data scarcity
  Cloud Computing!

Slide 36: Model-based Prior Work
  Latent Semantic Analysis (LSA)
  Probabilistic LSA (PLSA)
  Latent Dirichlet Allocation (LDA)

Slide 37: Latent Semantic Analysis (LSA) [Deerwester et al. 1990]
  Map high-dimensional count vectors to a lower-dimensional representation called latent semantic space
  By SVD decomposition: A = U Σ V^T
    A = word-document co-occurrence matrix
    U_ij = how likely word i belongs to topic j
    Σ_jj = how significant topic j is
    V^T_ij = how likely topic i belongs to doc j
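
The memory-based approach of slides 32-33 (find the k nearest neighbors of user u, then predict from their preferences) can be sketched as below. This is one possible instance under stated assumptions: cosine similarity is chosen as the similarity measure, the prediction is a similarity-weighted average, and the `ratings` data and the names `cosine` and `predict` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse {item: rating} dicts."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b)

def predict(ratings, user, item, k=2):
    """Predict `user`'s rating of `item` from the k most similar users who rated it."""
    neighbors = sorted(
        ((cosine(ratings[user], other), other[item])
         for name, other in ratings.items()
         if name != user and item in other),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None  # no overlap with any other user: the con listed on slide 33
    return sum(sim * rating for sim, rating in neighbors) / total

# Hypothetical ratings; every call scans the whole table (the memory/time con).
ratings = {
    "ann": {"a": 5, "b": 4},
    "bob": {"a": 5, "b": 4, "c": 2},
    "cat": {"a": 1, "c": 5},
    "dan": {"d": 3},
}
print(predict(ratings, "ann", "c", k=1))
print(predict(ratings, "dan", "a"))
```

With k=1 the prediction for "ann" comes entirely from "bob", her closest neighbor; "dan" shares no items with anyone, so no prediction is possible.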
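
The decomposition on slide 37 can be tried on a toy word-document matrix. The sketch below keeps the k largest singular values, as LSA does, but computes them with plain power iteration and deflation rather than a library SVD routine, so that it runs with the standard library alone. The matrix `A`, the helper names, and the iteration count are assumptions for illustration; the method assumes the top k singular values are distinct and nonzero.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def top_k_svd(A, k, iters=500, seed=0):
    """Approximate the k largest singular triplets (u, sigma, v) of A
    by power iteration on A^T A, deflating vectors already found."""
    rng = random.Random(seed)
    At = [list(col) for col in zip(*A)]
    triplets = []
    for _ in range(k):
        v = [rng.random() for _ in range(len(A[0]))]
        for _ in range(iters):
            # Project out the right singular vectors found so far.
            for _, _, prev in triplets:
                dot = sum(a * b for a, b in zip(v, prev))
                v = [a - dot * b for a, b in zip(v, prev)]
            w = matvec(At, matvec(A, v))  # (A^T A) v
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        Av = matvec(A, v)
        sigma = math.sqrt(sum(x * x for x in Av))
        triplets.append(([x / sigma for x in Av], sigma, v))
    return triplets

def word_similarity(triplets, i, j):
    """Entry (i, j) of U_k Sigma_k^2 U_k^T, the rank-k word-word similarity."""
    return sum(s * s * u[i] * u[j] for u, s, _ in triplets)

# Hypothetical counts: words 0-1 occur in docs 0-1, words 2-3 in docs 2-3.
A = [
    [2, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 1, 1],
]
triplets = top_k_svd(A, k=2)
print([round(s, 3) for _, s, _ in triplets])
print(round(word_similarity(triplets, 0, 1), 3), round(word_similarity(triplets, 0, 2), 3))
```

Words that share documents end up with a positive similarity, and words from different document groups end up near zero, which is the "topic" structure the slide attributes to the columns of U.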
Slide 38: Latent Semantic Analysis (cont.)
  LSA keeps the k largest singular values
    Low-rank approximation Â to the original matrix
    Saves space, de-noises, and reduces sparsity
  Make recommendations using
    Word-word similarity: Â Â^T
    Doc-doc similarity: Â^T Â
    Word-doc relationship: Â

Slide 39: Probabilistic Latent Semantic Analysis (PLSA) [Hofmann 1999; Hofmann 2004]
  Document is viewed as a bag of words
  A latent semantic layer is constructed in between documents and words
    P(w,d) = P(d) P(w|d) = P(d) Σ_z P(w|z) P(z|d)
  Probability delivers explicit meaning
    P(w|w'), P(d|d'), P(d,w)
  Model learning via EM algorithm

Slide 40: PLSA extensions
  PHITS [Cohn & Chang 2000]
    Model document-citation co-occurrence
  A linear combination of PLSA and PHITS [Cohn & Hofmann 2001]
    Model contents (words) and inter-connectivity of documents
  LDA [Blei et al. 2003]
    Provide a complete generative model with Dirichlet prior
  AT [Griffiths & Steyvers 2004]
    Include authorship information
    Document is categorized by authors and topics
  ART [McCallum 2004]
    Include email recipient as additional information
    Email is categorized by author, recipients and topics

Slide 41: Combinational Collaborative Filtering (CCF)
  Fuse multiple information
    Alleviate the information sparsity problem
  Hybrid training scheme
    Gibbs sampling as initialization for the EM algorithm
  Parallelization
    Achieve linear speedup with the number of machines

Slide 42: Notations
  Given a collection of co-occurrence data
    Community: C = {c_1, c_2, ..., c_N}
    User: U = {u_1, u_2, ..., u_M}
    Description: D = {d_1, d_2, ..., d_V}
    Latent aspect: Z = {z_1, z_2, ..., z_K}
  Models
    Baseline models
      Community-User (C-U) model
      Community-Description (C-D) model
    CCF: Combinational Collaborative Filtering
      Combines both baseline models

Slide 43: Baseline Models
  Community-User (C-U) model
    Community is viewed as a bag of users
    c and u are rendered conditionally independent by introducing z
    Generative process, for each user u:
      1. A community c is chosen uniformly
      2. A topic z is selected from P(z|c)
      3. A user u is generated from P(u|z)
  Community-Description (C-D) model
    Community is viewed as a bag of words
    c and d are rendered conditionally independent by introducing z
    Generative process, for each word d:
      1. A community c is chosen uniformly
      2. A topic z is selected from P(z|c)
      3. A word d is generated from P(d|z)

Slide 44: Baseline Models (cont.)
  Community-User (C-U) model
    Pros
      1. Personalized community suggestion
    Cons
      1. C-U matrix is sparse, may suffer from the information sparsity problem
      2. Cannot take advantage of content similarity between communities
  Community-Description (C-D) model
    Pros
      1. Clusters communities based on community content (description words)
    Cons
      1. No personalized recommendation
      2. Does not consider the overlapped users between communities

Slide 45: CCF Model
  Combinational Collaborative Filtering (CCF) model
    CCF combines both baseline models
    A community is viewed as a bag of users AND a bag of words
    By adding C-U, CCF can perform personalized recommendation, which C-D alone cannot
    By adding C-D, CCF can perform better personalized recommendation than C-U alone, which may suffer from sparsity
    Things CCF can do that C-U and C-D cannot
      P(d|u), relate user to word
      Useful for user targeting ads

Slide 46: Algorithm Requirements
  Near-real-time Recommendation
  Scalable Training
  Incremental Training is Desirable

Slide 47: Parallelizing CCF
  Details omitted
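
The C-U baseline of slide 43 (c chosen uniformly, z drawn from P(z|c), u drawn from P(u|z)) has the same form as PLSA on slide 39, so its parameters can be fitted with EM. The sketch below is a minimal single-machine version under stated assumptions: it uses a random initialization instead of the Gibbs-sampling initialization mentioned on slide 41, it covers only the C-U model and not the full CCF model, and the `counts` matrix and all function names are hypothetical.

```python
import math
import random

def normalize(row):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]

def train_cu(counts, K, iters=50, seed=0):
    """EM for the C-U model. counts[c][u] = co-occurrences of community c and user u.
    Returns P(z|c), P(u|z) and the log-likelihood recorded at each iteration."""
    rng = random.Random(seed)
    N, M = len(counts), len(counts[0])
    p_z_c = [normalize([rng.random() + 0.1 for _ in range(K)]) for _ in range(N)]
    p_u_z = [normalize([rng.random() + 0.1 for _ in range(M)]) for _ in range(K)]
    history = []
    for _ in range(iters):
        new_zc = [[0.0] * K for _ in range(N)]
        new_uz = [[0.0] * M for _ in range(K)]
        loglik = 0.0
        for c in range(N):
            for u in range(M):
                n = counts[c][u]
                if n == 0:
                    continue
                joint = [p_z_c[c][z] * p_u_z[z][u] for z in range(K)]
                total = sum(joint)  # P(u|c) under the current parameters
                loglik += n * math.log(total)
                for z in range(K):
                    r = n * joint[z] / total  # E-step: count-weighted P(z|c,u)
                    new_zc[c][z] += r
                    new_uz[z][u] += r
        history.append(loglik)
        # M-step: renormalize the expected counts.
        p_z_c = [normalize(row) for row in new_zc]
        p_u_z = [normalize(row) for row in new_uz]
    return p_z_c, p_u_z, history

def community_scores(p_z_c, p_u_z, u):
    """Score each community for user u by P(u|c) = sum_z P(z|c) P(u|z)."""
    return [sum(pz * p_u_z[z][u] for z, pz in enumerate(row)) for row in p_z_c]

# Hypothetical data: 4 communities x 4 users.
counts = [
    [3, 2, 0, 0],
    [2, 3, 1, 0],
    [0, 1, 4, 1],
    [0, 0, 1, 4],
]
p_z_c, p_u_z, history = train_cu(counts, K=2)
print(community_scores(p_z_c, p_u_z, u=0))
```

Because c is chosen uniformly, ranking communities by P(u|c) is the same as ranking them by P(c|u), which is the personalized community suggestion listed as the C-U model's advantage on slide 44. The E-step sums over (c, u) pairs are independent of each other, which is what makes this kind of training a candidate for the parallelization slide 41 mentions.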
Link: https://www.31ppt.com/p-2235406.html