Social Network Platforms and Technologies in the Cloud Computing Era (Google China)
Source file: 云计算时代的社交网络平台和技术_谷歌中国.ppt (70 slides)
Slide 1 (2/25/2023, Ed Chang): Social Network Platforms and Technologies in the Cloud Computing Era
Ed Chang (张智威), Deputy Director, Research Institute, Google China; Professor, Department of Electrical Engineering, University of California
Slide 2: China Opportunity, China & U.S. in 2006-07
                         China                U.S.
Internet Population      180 million (25%)    208 million (3%)
Broadband Users          60 million (90%)     60 million (29%)
Mobile Phones            500 million          180 million
Engineering Graduates    600 k                72 k
Slide 3: Google China
- Size (700): 200 engineers, 400 other employees, almost 100 interns
- Locations: Beijing (2005), Taipei (2006), Shanghai (2007)
Slide 4: Organizing the World's Information, Socially
- Social Platform
- Cloud Computing
- Concluding Remarks
Slide 5: Web 1.0 (diagram of files: .htm, .jpg, .doc, .msg)
Slide 6: Web with People (2.0) (diagram of files: .htm, .jpg, .doc, .xls, .msg)
Slide 7: + Social Platforms (diagram of files: .htm, .jpg, .doc, .xls, .msg, plus App (Gadget))
Slide 12: Open Social Platform
Slide 17: Open Social Platform, Social Platform
Slide 20: Open Social Platform, Social Platform
Slide 22: Social Graph
Slide 24: What Users Want?
- People care about other people
  - care about people they know
  - connect to people they do not know
- Discover interesting information
  - based on other people
  - about who other people are
  - about what other people are doing
Slide 25: Information Overflow Challenge
- Too many people, too many choices of forums and apps
- "I soon need to hire a full-time to manage my online social networks"
- Desiring a Social Network Recommendation System
Slide 26: Recommendation System
- Friend Recommendation
- Community/Forum Recommendation
- Application Suggestion
- Ads Matching
Slide 27: Organizing the World's Information, Socially
- Social Platform
- Cloud Computing
- Concluding Remarks
Slide 28: Industry Trend: The Arrival of the Cloud Computing Era
(1) Data in the cloud: no fear of loss, no need for backups
(2) Software in the cloud: no downloads, automatic upgrades
(3) Ubiquitous cloud computing: any device is yours once you log in
(4) Infinitely powerful cloud computing: unlimited space, unlimited speed
Picture source: http://www.sis.pitt.edu
Slide 29: Internet Search: An Example of Cloud Computing
1. The user enters query keywords
2. Data is preprocessed in a distributed fashion to serve searches (labeled "Cloud Computing" on the slide):
   - Google Infrastructure (thousands of commodity servers around the world)
   - MapReduce for mass data processing
   - Google File System
3. Search results are returned
Slide 30: Collaborative Filtering
- Given a matrix that "encodes" data
Slide 31: Given a matrix that "encodes" data
- Many applications (collaborative filtering): User-Community, User-User, Ads-User, Ads-Community, etc.
- Matrix axes in the example: Users, Communities
Slide 32: Collaborative Filtering (CF) [Breese, Heckerman and Kadie 1998]
- Memory-based
  - Given user u, find "similar" users (k nearest neighbors)
  - Bought similar items, saw similar movies, similar profiles, etc.
  - Different similarity measures yield different techniques
  - Make predictions based on the preferences of these "similar" users
- Model-based
  - Build a model of the relationship between subject matters
  - Make predictions based on the constructed model
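The memory-based approach described on the Collaborative Filtering (CF) slide can be sketched as follows. This is a minimal illustration, not the method used in the talk: the choice of cosine similarity, the function names, and the toy user-community matrix are all assumptions made for the example.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two membership vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def recommend(ratings, user, k=2):
    """Rank communities the user has not joined, using the k most similar users.

    ratings[u, c] is 1.0 if user u belongs to community c, else 0.0 (assumed encoding).
    """
    sims = np.array([cosine_similarity(ratings[user], row) for row in ratings])
    sims[user] = -1.0                            # never pick the user as their own neighbor
    neighbors = np.argsort(-sims)[:k]            # indices of the k nearest neighbors
    scores = sims[neighbors] @ ratings[neighbors]  # similarity-weighted votes per community
    scores[ratings[user] > 0] = -np.inf          # drop communities already joined
    ranked = np.argsort(-scores)
    return [int(c) for c in ranked if np.isfinite(scores[c]) and scores[c] > 0]

# Hypothetical toy data: 4 users x 4 communities.
ratings = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
], dtype=float)

print(recommend(ratings, user=0, k=2))
```

Every prediction scans the whole matrix, which matches the "memory and time consuming" drawback attributed to memory-based methods in this deck.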
Slide 33: Memory-Based Model [Goldberg et al. 1992; Resnick et al. 1994; Konstan et al. 1997]
- Pros
  - Simplicity: avoids the model-building stage
- Cons
  - Memory and time consuming: uses the entire database every time to make a prediction
  - Cannot make a prediction if the user has no items in common with other users
Slide 34: Model-Based Model [Breese et al. 1998; Hofmann 1999; Blei et al. 2004]
- Pros
  - Scalability: the model is much smaller than the actual dataset
  - Faster prediction: query the model instead of the entire dataset
- Cons
  - Model-building takes time
Slide 35: Algorithm Selection Criteria
- Near-real-time Recommendation
- Scalable Training
- Incremental Training is Desirable
- Can deal with data scarcity
- Cloud Computing!
Slide 36: Model-based Prior Work
- Latent Semantic Analysis (LSA)
- Probabilistic LSA (PLSA)
- Latent Dirichlet Allocation (LDA)
Slide 37: Latent Semantic Analysis (LSA) [Deerwester et al. 1990]
- Map high-dimensional count vectors to a lower-dimensional representation called the latent semantic space
- By SVD: $A = U \Sigma V^T$
  - $A$ = word-document co-occurrence matrix
  - $U_{ij}$ = how likely word $i$ belongs to topic $j$
  - $\Sigma_{jj}$ = how significant topic $j$ is
  - $V^T_{ij}$ = how likely topic $i$ belongs to doc $j$
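The SVD step of LSA can be sketched with NumPy as below. This is an illustrative example under stated assumptions: the word-document matrix is made-up toy data, and the helper name `truncated_svd` is hypothetical rather than anything from the talk.

```python
import numpy as np

def truncated_svd(A, k):
    """Return the top-k factors (U_k, s_k, Vt_k) of A = U diag(s) V^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # singular values come sorted, largest first
    return U[:, :k], s[:k], Vt[:k, :]

# Hypothetical word-document count matrix (rows: words, columns: documents).
A = np.array([
    [2, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 1, 1],
], dtype=float)

U_k, s_k, Vt_k = truncated_svd(A, k=2)
A_hat = U_k @ np.diag(s_k) @ Vt_k   # rank-2 reconstruction of A

print(np.round(s_k, 3))
print(np.allclose(A_hat, A))
```

The toy matrix has rank 2 by construction, so two latent topics reproduce it exactly; on real data the truncation discards the smaller singular values and only approximates A.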
Slide 38: Latent Semantic Analysis (cont.)
- LSA keeps the k largest singular values
  - Low-rank approximation to the original matrix: $\hat{A} = U_k \Sigma_k V_k^T$
  - Saves space, de-noises, and reduces sparsity
- Make recommendations using
  - Word-word similarity: $\hat{A} \hat{A}^T$
  - Doc-doc similarity: $\hat{A}^T \hat{A}$
  - Word-doc relationship: $\hat{A}$
Slide 39: Probabilistic Latent Semantic Analysis (PLSA) [Hofmann 1999; Hofmann 2004]
- Document is viewed as a bag of words
- A latent semantic layer is constructed in between documents and words
- $P(w, d) = P(d) P(w|d) = P(d) \sum_z P(w|z) P(z|d)$
- Probability delivers explicit meaning
  - $P(w|w)$, $P(d|d)$, $P(d, w)$
- Model learning via EM algorithm
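The EM learning mentioned on the PLSA slide can be sketched as follows. This is a minimal dense-array illustration, assuming the standard PLSA updates (E-step: P(z|d,w) proportional to P(w|z)P(z|d); M-step: re-estimate P(w|z) and P(z|d) from count-weighted posteriors). The function names, the random initialization, and the toy counts are assumptions for the example; P(d) is left out because it does not affect the updates.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM. counts[d, w] = n(d, w).

    Returns P(w|z) with shape (K, W) and P(z|d) with shape (D, K).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior[d, z, w] = P(z | d, w)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        posterior = joint / np.maximum(joint.sum(axis=1, keepdims=True), 1e-12)
        # M-step: count-weighted posteriors, then renormalize
        weighted = counts[:, None, :] * posterior          # shape (D, K, W)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= np.maximum(p_w_z.sum(axis=1, keepdims=True), 1e-12)
        p_z_d = weighted.sum(axis=2)
        p_z_d /= np.maximum(p_z_d.sum(axis=1, keepdims=True), 1e-12)
    return p_w_z, p_z_d

def log_likelihood(counts, p_w_z, p_z_d):
    """Sum over (d, w) of n(d, w) * log P(w|d), with P(w|d) = sum_z P(w|z) P(z|d)."""
    p_w_d = p_z_d @ p_w_z
    return float((counts * np.log(np.maximum(p_w_d, 1e-12))).sum())

# Hypothetical document-word counts with two visible word groups.
counts = np.array([
    [4, 3, 0, 0],
    [5, 2, 0, 1],
    [0, 0, 3, 4],
    [0, 1, 2, 5],
], dtype=float)

p_w_z, p_z_d = plsa(counts, n_topics=2)
print(np.round(p_z_d, 2))
```

The dense (D, K, W) posterior array is only practical for small data; a version meant for large co-occurrence matrices would iterate over the nonzero counts instead.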
Slide 40: PLSA extensions
- PHITS [Cohn & Chang 2000]
  - Model document-citation co-occurrence
- A linear combination of PLSA and PHITS [Cohn & Hofmann 2001]
  - Model contents (words) and inter-connectivity of documents
- LDA [Blei et al. 2003]
  - Provide a complete generative model with Dirichlet prior
- AT [Griffiths & Steyvers 2004]
  - Include authorship information
  - Document is categorized by authors and topics
- ART [McCallum 2004]
  - Include email recipient as additional information
  - Email is categorized by author, recipients and topics
Slide 41: Combinational Collaborative Filtering (CCF)
- Fuse multiple sources of information
  - Alleviate the information sparsity problem
- Hybrid training scheme
  - Gibbs sampling as initialization for the EM algorithm
- Parallelization
  - Achieve linear speedup with the number of machines
Slide 42: Notations
- Given a collection of co-occurrence data
  - Community: $C = \{c_1, c_2, \ldots, c_N\}$
  - User: $U = \{u_1, u_2, \ldots, u_M\}$
  - Description: $D = \{d_1, d_2, \ldots, d_V\}$
  - Latent aspect: $Z = \{z_1, z_2, \ldots, z_K\}$
- Models
  - Baseline models
    - Community-User (C-U) model
    - Community-Description (C-D) model
  - CCF: Combinational Collaborative Filtering
    - Combines both baseline models
Slide 43: Baseline Models
Community-User (C-U) model
- Community is viewed as a bag of users
- c and u are rendered conditionally independent by introducing z
- Generative process, for each user u:
  1. A community c is chosen uniformly
  2. A topic z is selected from P(z|c)
  3. A user u is generated from P(u|z)
Community-Description (C-D) model
- Community is viewed as a bag of words
- c and d are rendered conditionally independent by introducing z
- Generative process, for each word d:
  1. A community c is chosen uniformly
  2. A topic z is selected from P(z|c)
  3. A word d is generated from P(d|z)
Slide 44: Baseline Models (cont.)
Community-User (C-U) model
- Pros
  1. Personalized community suggestion
- Cons
  1. C-U matrix is sparse, so it may suffer from the information sparsity problem
  2. Cannot take advantage of content similarity between communities
Community-Description (C-D) model
- Pros
  1. Clusters communities based on community content (description words)
- Cons
  1. No personalized recommendation
  2. Does not consider the overlapping users between communities
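The three-step generative process listed for the Community-User (C-U) baseline model can be sketched as a sampler. This is an illustration only: the function name, the array layout, and the toy probability tables are assumptions, and the C-D model would follow the same shape with P(d|z) in place of P(u|z).

```python
import numpy as np

def generate_users(p_z_given_c, p_u_given_z, n_draws, seed=0):
    """Sample (community, user) pairs from the C-U generative process.

    p_z_given_c[c, z] = P(z|c) and p_u_given_z[z, u] = P(u|z); each row must sum to 1.
    """
    rng = np.random.default_rng(seed)
    n_communities, n_topics = p_z_given_c.shape
    n_users = p_u_given_z.shape[1]
    pairs = []
    for _ in range(n_draws):
        c = rng.integers(n_communities)               # 1. community chosen uniformly
        z = rng.choice(n_topics, p=p_z_given_c[c])    # 2. topic selected from P(z|c)
        u = rng.choice(n_users, p=p_u_given_z[z])     # 3. user generated from P(u|z)
        pairs.append((int(c), int(u)))
    return pairs

# Hypothetical parameters: 2 communities, 2 topics, 3 users.
p_z_given_c = np.array([[0.9, 0.1],
                        [0.2, 0.8]])
p_u_given_z = np.array([[0.7, 0.3, 0.0],
                        [0.0, 0.2, 0.8]])

print(generate_users(p_z_given_c, p_u_given_z, n_draws=5))
```

The user depends on the community only through the topic z, which is the conditional independence the slide describes.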
Slide 45: CCF Model
Combinational Collaborative Filtering (CCF) model
- CCF combines both baseline models
- A community is viewed as a bag of users AND a bag of words
- By adding C-U, CCF can perform personalized recommendation, which C-D alone cannot
- By adding C-D, CCF can perform better personalized recommendation than C-U alone, which may suffer from sparsity
- Things CCF can do that C-U and C-D cannot
  - P(d|u), relate user to word
  - Useful for user targeting ads
Slide 46: Algorithm Requirements
- Near-real-time Recommendation
- Scalable Training
- Incremental Training is Desirable
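The P(d|u) quantity named on the CCF Model slide can be illustrated with a small calculation. The slides do not say how CCF computes it, so this sketch rests on two loudly stated assumptions: users and words are conditionally independent given the latent aspect z, giving P(d|u) = sum over z of P(d|z) P(z|u), and P(z|u) is obtained from P(u|z) and a topic prior P(z) by Bayes' rule. All names and numbers are hypothetical.

```python
import numpy as np

def word_given_user(p_u_given_z, p_d_given_z, p_z):
    """Compute P(d|u) = sum_z P(d|z) P(z|u), assuming u and d are independent given z.

    p_u_given_z has shape (K, M), p_d_given_z has shape (K, V), p_z has shape (K,).
    Assumes every user has nonzero probability under at least one topic.
    """
    joint = p_u_given_z * p_z[:, None]                  # P(u, z), shape (K, M)
    p_z_given_u = (joint / joint.sum(axis=0, keepdims=True)).T  # Bayes' rule, shape (M, K)
    return p_z_given_u @ p_d_given_z                    # shape (M, V)

# Hypothetical parameters: 2 topics, 2 users, 2 description words.
p_u_given_z = np.array([[1.0, 0.0],
                        [0.0, 1.0]])
p_d_given_z = np.array([[0.9, 0.1],
                        [0.2, 0.8]])
p_z = np.array([0.5, 0.5])

p_d_given_u = word_given_user(p_u_given_z, p_d_given_z, p_z)
print(p_d_given_u)
```

Each row of the result is a distribution over description words for one user, which is the kind of user-to-word relation the slide suggests for ad targeting.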
Link: https://www.31ppt.com/p-2814397.html