Selected highlights of the Coursera Social Networking course, taught by Prof. Lada Adamic of the Univ. of Michigan. Presented at the annual RTP Analytics Unconference, May 4, 2013

Published on: **Mar 4, 2016**

- 1. Analyzing Networks: An Overview and Discussion of Network Analysis (NA) and Social Network Analysis (SNA). Prepared for 2013 AnalyticsCamp: An Annual Unconference, held in the Research Triangle Park, NC area, on May 4, 2013. By Bruce Conner, Consolidated Behaviors and Attitudes
- 2. Full Disclosure
  - I just finished the Social Networking course on Coursera, taught by Lada Adamic, Assoc. Prof. of Information at the Univ. of Michigan
    - All of the content of this deck is derived from that course (not original)
    - For purposes of this unconference, I will not be further citing or footnoting this content
- 3. My Interest in Social Networking Analysis (SNA)
  - Interest in marketing analytics and quantitative market research
    - Rise of social media and social marketing
    - Big data and marketing analytics
    - The strengths and weaknesses of behavioral data (Web, mobile, CRM, transactional, scanner, telemetry, etc.) in marketing applications
  - A long-term interest in clustering and segmentation as tools for identifying and targeting products, services, and messages: can social relationships and social communities enhance this?
  - Marketing issues such as:
    - The role of opinion leaders in influencing brand preferences and purchases of goods and services
    - Diffusion of products, services, innovations, brands, preferences, etc.
    - Formation of preferences for products/services/brands
    - Targeted marketing to communities and individuals in those communities
- 4. Agenda
  - Brief introduction to the applications and issues that Social Networking Analysis (SNA) and, more broadly, Network Analysis (NA) try to deal with
  - Brief overview of some methods, approaches, and statistics involved
  - Possible Discussion Topics:
    - Who is currently using SNA (or NA), and what are your applications?
    - How (else) might SNA (or NA) be used in your work?
    - Specifically, how might SNA (or NA) be used in marketing, product development, or other business applications (or other applications)?
    - Other topics/questions/thoughts?
- 5. Quick Overview of SNA Applications
- 6. Quick Overview of Applications of SNA: Anti-Terrorism and National Security
- 7. A Quick Overview of Applications of SNA (2)
  - Anti-terrorism
  - Criminal justice
    - Conspiracy (e.g., Enron)
    - Insider trading
    - Fraud
- 8. A Quick Overview of Applications of SNA (3)
  - Anti-terrorism
  - Criminal justice
  - Social media
- 9. A Quick Overview of Applications of SNA (4)
  - Anti-terrorism
  - Criminal justice
  - Social media
  - Gaming
    - Game (Social) Experience
    - Recruitment/virality/engagement/retention/conversion
- 10. And Some More Applications of SNA (5)
  - Organizational analysis/communities of practice
  - Marketing based on affiliation with “communities”
  - Inputs to clustering/segmentation/profiling
  - Biological networks (health care, genomics, etc.)
  - Predictive analytics (e.g., predicting improvements in recipes based on ingredient networks)
  - Sociology/Economics/Political Science/etc.
  - Computer networks
- 11. Kinds of Questions SNA Addresses
- 12. Kinds of Questions that SNA/NA Address
  - How do networks form and grow?
    - Compare real-world networks (e.g., the Internet, Facebook, biological networks) with various theoretical models
      - Do the theoretical models help explain the behavior and growth dynamics of the real network?
      - Example: Randomly-formed network vs. “preferential attachment”
- 13. Kinds of Questions that SNA/NA Address (2)
  - How does network structure (topology) affect the way that information disseminates, or that infections spread?
- 14. Kinds of Questions that SNA/NA Address (3)
  - Based on the number, strength, directionality, and/or characteristics/attributes of “links,” and on the characteristics of individuals/nodes, how do we identify (and characterize) communities?
- 15. Quick Look at SNA/NA Data
- 16. What are networks?
  - Networks are sets of nodes connected by edges. “Network” ≡ “Graph”

  | Nodes | Edges | Field |
  | --- | --- | --- |
  | points | lines | |
  | vertices | edges, arcs | math |
  | nodes | links | computer science |
  | sites | bonds | physics |
  | actors | ties, relations | sociology |

  - [Figure labels: node, edge]
- 17. Network elements: edges
  - Directed (also called arcs, links)
    - A -> B
      - A likes B, A gave a gift to B, A is B’s child
  - Undirected
    - A <-> B or A – B
      - A and B like each other
      - A and B are siblings
      - A and B are co-authors
- 18. Directed networks
  - [Figure: directed graph with nodes labeled Ada, Cora, Louise, Jean, Helen, Martha, Alice, Robin, Marion, Maxine, Lena, Hazel, Hilda, Frances, Eva, Ruth, Edna, Adele, Jane, Anna, Mary, Betty, Ella, Ellen, Laura, Irene]
  - Girls’ school dormitory dining-table partners, 1st and 2nd choices (Moreno, The Sociometry Reader, 1960)
- 19. Example Adjacency Matrix (nodes labeled 1 through 5)

  $$A = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \end{pmatrix}$$
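
The adjacency matrix on slide 19 can be turned into an edge list, which is one of the two tables mentioned on the next slide. This is a minimal sketch; it assumes (the slide does not say) that row i, column j equal to 1 means a directed edge from node i to node j, and `matrix_to_edges` is a hypothetical helper name.

```python
# Adjacency matrix from slide 19; rows and columns correspond to nodes 1..5.
# Assumption: A[i][j] == 1 means a directed edge from node i+1 to node j+1.
A = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
]


def matrix_to_edges(matrix):
    """Return directed edges as (source, target) pairs with 1-based labels."""
    return [
        (i + 1, j + 1)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value
    ]


edges = matrix_to_edges(A)
print(edges)
```

Under that reading, node 1 has no outgoing edges (its row is all zeros) but receives one edge from node 5.
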
- 20. Graph Data: 2 Tables (Nodes and Edges)
- 21. 2 Ways that NA is Different From Conventional (Frequentist) Statistics
  - Non-independence of “edge rows”:
    - Example: if I am “linked” to two individuals, it often increases the probability that they are linked to each other
    - Implication: one cannot necessarily use statistical tests based on statistical independence, normal distribution, etc., to understand statistical significance
  - Exploration of real-world “graphs” by comparing them to various hypothetical (strawman) models
    - A Monte Carlo approach:
      - Generate large numbers of graphs based on hypothetical models
      - Compare the characteristics of the real-world graph to the distribution of the same characteristics across the hypothetical graphs, to test the null hypothesis that the real graph is no different from the hypothetical graphs
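
The Monte Carlo approach on slide 21 can be sketched as follows. Everything specific here is an assumption for illustration: the "real" graph is a made-up six-node example, the characteristic compared is the triangle count, and the hypothetical model is a random graph with the same number of nodes and edges.

```python
import random
from itertools import combinations


def count_triangles(n, edges):
    """Count triangles in an undirected graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def random_graph(n, m, rng):
    """Pick m distinct node pairs uniformly at random (the strawman model)."""
    return rng.sample(list(combinations(range(n), 2)), m)


def monte_carlo_fraction(n, edges, trials=1000, seed=0):
    """Fraction of random graphs with at least as many triangles as observed."""
    rng = random.Random(seed)
    observed = count_triangles(n, edges)
    hits = sum(
        count_triangles(n, random_graph(n, len(edges), rng)) >= observed
        for _ in range(trials)
    )
    return observed, hits / trials


# Hypothetical "real" graph: two triangles joined by a single bridge edge.
real_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
observed, fraction = monte_carlo_fraction(6, real_edges)
print(observed, fraction)
```

A small fraction would suggest the observed graph has more triangles than the random model typically produces, which is consistent with the non-independence point on the same slide.
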
- 22. A Brief Look at Two Topologies
- 23. Erdős-Rényi Random Graph: Simplest Network Model
  - Assumptions
    - Nodes connect at random
    - Network is undirected
  - Key parameters
    - Number of nodes N
    - Either “p” or “M”
      - p = probability that any two nodes share an edge
      - M = total number of edges in the graph
- 24. What ER Random Networks Look Like
  - [Figure: random networks after spring layout]
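
Both parameterizations on slide 23 can be sketched in a few lines of standard-library Python. The function names `er_graph_p` and `er_graph_m` are hypothetical, and nodes are labeled 0 to N-1 for convenience.

```python
import random
from itertools import combinations


def er_graph_p(n, p, seed=None):
    """G(N, p): include each possible undirected edge independently with probability p."""
    rng = random.Random(seed)
    return [pair for pair in combinations(range(n), 2) if rng.random() < p]


def er_graph_m(n, m, seed=None):
    """G(N, M): choose exactly m distinct undirected edges uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(list(combinations(range(n), 2)), m)


sample_p = er_graph_p(10, 0.3, seed=42)
sample_m = er_graph_m(10, 15, seed=42)
print(len(sample_p), len(sample_m))
```

With the "p" form the number of edges varies from run to run; with the "M" form it is fixed.
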
- 25. Preferential Attachment Networks
  - Preferential attachment of growing networks:
    - New nodes prefer to attach to well-connected nodes over less-well connected nodes
  - Process also known as
    - Cumulative advantage
    - Rich-get-richer
    - Matthew effect
- 26. Preferential Growth
- 27. A Sample of Network Statistics
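
A rough sketch of the growth process on slide 25 follows. It makes simplifying assumptions that are not in the deck: the graph starts from a single edge, and each new node adds exactly one edge, choosing its target with probability proportional to the target's current degree.

```python
import random
from collections import Counter


def preferential_attachment(n, seed=None):
    """Grow an undirected graph to n >= 2 nodes, one new edge per new node."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in this list once per edge it touches, so a uniform
    # draw from it selects a node with probability proportional to its degree.
    endpoints = [0, 1]
    for new_node in range(2, n):
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend([new_node, target])
    return edges


pa_edges = preferential_attachment(1000, seed=0)
pa_degree = Counter(node for edge in pa_edges for node in edge)
print(pa_degree.most_common(5))
```

Printing the most-connected nodes shows the rich-get-richer pattern: early nodes tend to accumulate far more edges than late arrivals.
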
- 28. Node Statistics
  - Node network properties
    - From immediate connections
      - indegree: how many directed edges (arcs) are incident on a node
      - outdegree: how many directed edges (arcs) originate at a node
      - degree (in or out): number of edges incident on a node
    - From the entire graph
      - Centrality (betweenness, closeness)
  - [Figure labels: outdegree = 2, indegree = 3, degree = 5]
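
The degree statistics on slide 28 can be computed directly from a directed edge list. The edges below are hypothetical, chosen so that node "x" reproduces the figure labels on the slide (outdegree 2, indegree 3, degree 5); `degree_stats` is an illustrative name.

```python
from collections import Counter


def degree_stats(edges):
    """Return indegree, outdegree, and total degree for every node in a directed edge list."""
    outdegree = Counter(source for source, _ in edges)
    indegree = Counter(target for _, target in edges)
    nodes = set(outdegree) | set(indegree)
    return {
        node: {
            "indegree": indegree[node],
            "outdegree": outdegree[node],
            "degree": indegree[node] + outdegree[node],
        }
        for node in nodes
    }


# Hypothetical edges: x points at two nodes and is pointed at by three.
example_edges = [("x", "a"), ("x", "b"), ("c", "x"), ("d", "x"), ("e", "x")]
stats = degree_stats(example_edges)
print(stats["x"])
```
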
- 29. Giant Component
  - If the largest component encompasses a significant fraction of the graph, it is called the giant component
- 30. “Percolation Threshold”
  - [Figure: size of giant component plotted against average degree, with example graphs at av deg = 0.99, av deg = 1.18, and av deg = 3.96]
  - Percolation threshold: how many edges need to be added before the giant component appears?
  - As the average degree increases to z = 1, a giant component suddenly appears
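
The threshold on slide 30 can be explored with a small simulation. This sketch assumes a random graph built by drawing endpoint pairs uniformly (so occasional self-loops and duplicate edges are possible, a simplification), and tracks components with a union-find structure. The node count and degree values are arbitrary choices.

```python
import random


def largest_component_fraction(n, avg_degree, seed=None):
    """Fraction of nodes in the largest component of a random graph with the given average degree."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # An average degree z corresponds to roughly z * n / 2 edges.
    for _ in range(int(avg_degree * n / 2)):
        root_u = find(rng.randrange(n))
        root_v = find(rng.randrange(n))
        if root_u != root_v:
            if size[root_u] < size[root_v]:
                root_u, root_v = root_v, root_u
            parent[root_v] = root_u
            size[root_u] += size[root_v]
    return max(size[find(v)] for v in range(n)) / n


for z in (0.5, 0.99, 1.18, 3.96):
    print(z, largest_component_fraction(5000, z, seed=0))
```

Running it for average degrees below and above 1 shows the largest component going from a tiny fraction of the nodes to most of the graph.
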
- 31. Shortest Path and Average Shortest Path
  - How many hops between two nodes?
  - On average, how many hops between each pair of nodes?
- 32. Centrality
- 33. Betweenness: Example
  - Nodes are sized by degree, and colored by betweenness.
- 34. Closeness Example
  - [Figure: example networks with nodes labeled X and Y]
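
Hop counts on slide 31 can be computed with breadth-first search on an unweighted graph. The sketch below uses a hypothetical four-node path graph, and the average is taken over reachable pairs only, which is one possible convention.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return the hop count from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_shortest_path(adj):
    """Average hop count over all ordered pairs of distinct, mutually reachable nodes."""
    total = 0
    pairs = 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs


# Hypothetical undirected path graph: a - b - c - d
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
hops_a_to_d = bfs_distances(path_graph, "a")["d"]
avg_hops = average_shortest_path(path_graph)
print(hops_a_to_d, avg_hops)
```
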
- 35. Example of Eigenvector Centrality (a Recursive Measure) in Directed Networks
  - PageRank brings order to the Web:
    - it’s not just the pages that point to you, but how many pages point to those pages, etc.
    - more difficult to artificially inflate centrality with a recursive definition
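
The recursive idea on slide 35 can be illustrated with a basic power-iteration version of PageRank. This is a simplified sketch, not the production algorithm: the damping value 0.85 is a commonly quoted default, the four-page web is hypothetical, and pages with no out-links spread their rank evenly.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Iteratively compute PageRank; every link target must also be a key of out_links."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                # A page splits its damped rank equally among the pages it links to.
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no out-links spreads its damped rank over all pages.
                for t in nodes:
                    new_rank[t] += damping * rank[v] / n
        rank = new_rank
    return rank


# Hypothetical web: a, b, and d all point to c; c points back to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
print(sorted(ranks.items(), key=lambda item: -item[1]))
```

Page "a" has only one in-link, but because that link comes from the highly ranked page "c", "a" ends up well ahead of "b" and "d".
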
- 36. Degree Distributions: An Example With a Log-Log Distribution
  - Sexual networks: great variation in contact numbers
- 37. Small World Networks
- 38. Small world phenomenon: Milgram’s experiment
  - [Figure: map with labels NE and MA]
- 39. Ties and Geography
  - “The geographic movement of the [message] from Nebraska to Massachusetts is striking. There is a progressive closing in on the target area as each new person is added to the chain” (S. Milgram, ‘The small world problem’, Psychology Today, 1967)
  - [Figure: map with labels NE and MA]
- 40. Kleinberg’s geographical small world model
  - Nodes are placed on a lattice and connect to nearest neighbors
  - Additional links are placed with: $p(\text{link between } u \text{ and } v) = \big(\text{distance}(u, v)\big)^{-r}$
  - If you set r = 2, you get optimum ability to get between nodes with minimal jumps!
- 41. Communities
- 42. Why Care About Communities?
  - Opinion formation and uniformity
    - If each node adopts the opinion of the majority of its neighbors, it is possible to have different opinions in different cohesive subgroups
- 43. Political Blogs
- 44. Community Finding
  - Social and other networks have a natural community structure
  - We want to discover this structure rather than impose a certain size of community or fix the number of communities
  - Without “looking”, can we discover community structure in an automated way?
- 45. Hierarchical clustering
  - Process:
    - after calculating the “distances” for all pairs of vertices
    - start with all n vertices disconnected
    - add edges between pairs one by one in order of decreasing weight
    - result: nested components, where one can take a ‘slice’ at any level of the tree
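
The process on slide 45 can be sketched as follows: sort the pair weights from strongest to weakest, add the edges one at a time, and record the components each time two of them merge. The weights below are made up for illustration, and `hierarchical_components` is a hypothetical name.

```python
def hierarchical_components(nodes, weights):
    """Add edges in order of decreasing weight and record the components after each merge."""
    parent = {v: v for v in nodes}

    def find(x):
        # Follow parent pointers to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    levels = []
    for (u, v), weight in sorted(weights.items(), key=lambda item: -item[1]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_v] = root_u
            groups = {}
            for x in nodes:
                groups.setdefault(find(x), []).append(x)
            levels.append((weight, sorted(sorted(g) for g in groups.values())))
    return levels


# Hypothetical pairwise weights (higher means more closely tied).
pair_weights = {("a", "b"): 0.9, ("c", "d"): 0.8, ("b", "c"): 0.3, ("a", "c"): 0.2}
levels = hierarchical_components(["a", "b", "c", "d"], pair_weights)
for weight, components in levels:
    print(weight, components)
```

Each entry in `levels` is one possible "slice" of the tree: cutting at a high weight gives many small components, and cutting at a low weight gives a few large ones.
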
- 46. Permuted Adjacency Matrix
- 47. Betweenness Clustering
  - Successively removing edges of highest betweenness (the bridges, or local bridges) breaks up the network into separate components
- 48. Modularity
  - Algorithm
    - Start with all vertices as isolates
    - Follow a greedy strategy:
      - successively join clusters with the greatest increase ΔQ in modularity
      - stop when the maximum possible ΔQ <= 0 from joining any two
    - Successfully used to find community structure in a graph with > 400,000 nodes and > 2 million edges
      - Amazon’s people who bought this also bought that…
    - Alternatives to achieving optimum ΔQ:
      - simulated annealing rather than greedy search
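
To make the ΔQ stopping rule on slide 48 concrete, the sketch below computes modularity Q for a given partition, using the standard definition: for each community, the fraction of edges inside it minus the squared fraction of edge endpoints attached to it. The example graph (two triangles joined by a bridge) is hypothetical, and this shows only the scoring step, not the full greedy search.

```python
def modularity(edges, community):
    """Modularity Q of an undirected graph for a node -> community mapping."""
    m = len(edges)
    inside = {}      # number of edges with both endpoints in the community
    endpoints = {}   # total degree of the nodes in the community
    for u, v in edges:
        cu, cv = community[u], community[v]
        endpoints[cu] = endpoints.get(cu, 0) + 1
        endpoints[cv] = endpoints.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(
        inside.get(c, 0) / m - (total / (2 * m)) ** 2
        for c, total in endpoints.items()
    )


# Hypothetical graph: triangle {0, 1, 2}, triangle {3, 4, 5}, bridge 2-3.
graph_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
split = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}
merged = {node: "all" for node in range(6)}

q_split = modularity(graph_edges, split)
q_merged = modularity(graph_edges, merged)
print(q_split, q_merged, q_merged - q_split)
```

Joining the two triangles into one community lowers Q, so ΔQ is negative for that merge, which is the point at which the greedy strategy on the slide would stop.
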
- 49. Some Interesting Applications of NA
- 52. Ingredient Networks