Social Networks And Group Formation

This is the first in a three-part series on academic research that illuminates social networks, one of the most important trends in design today.

Humans suffer from information overload; there’s much more information on any given subject than a person is able to access. As a result, people are forced to depend upon each other for knowledge. Know-who information, rather than know-what, know-how or know-why information, has become most crucial. It involves knowing who has the needed information and being able to reach that person (Johnson et al., 2002).

In this context, understanding the formation, evolution and utilization of online social networks becomes important. A social network is “a set of people (or organizations or other social entities) connected by a set of social relationships, such as friendship, co-working or information exchange.” (Garton et al., 1997) While the Internet contributes to the information overload, it also provides useful tools to effectively manage one’s social networks and through them gain access to the right pieces of information.

This field is of particular interest to researchers working at the intersection of information systems, sociology and mathematics. These researchers study the uses of social networks and the ways in which they are mediated in society and in the workplace through information communication technologies (ICTs) such as (but not limited to) the Internet. This literature review explores how social networks that take advantage of information communication technologies (specifically, web-based technologies) begin, evolve and are utilized.

The online social network field is broad, and any literature review can only focus on a selection of articles. The present article highlights recent research in the field and focuses on centrality, linkage strength, identity, trust, activity and benefits. By no means is this review comprehensive, but it should give practitioners some useful concepts to consider as they design web applications built around social networks.

The Strength of Weak Ties

Social networks were first researched in the late 1940s. With the advent of the Internet, online communities and social networking websites, their significance has only increased. Any review hoping to be meaningful must begin with the normative contributions of the sociologist Mark Granovetter and the mathematician Linton C. Freeman, both of whom wrote influential articles well before the Internet was popularized.

Granovetter (1973) argued that within a social network, weak ties are more powerful than strong ties. He explained that this was because information was far more likely to be “diffused” through weaker ties. He concluded that weak ties are “indispensable to individuals’ opportunities and to their incorporation into communities while strong ties breed local cohesion.”

Granovetter’s doctoral thesis demonstrated that most people landed jobs thanks to their weak ties and not their strong ones. It was the people that they did not know well, the ones with whom they did not have shared histories and did not see on a regular basis who were of most help. This is because people with strong ties generally share the same pieces of information and resources. Therefore they are of less help to one another.

Similarly, Granovetter identified absent ties (also called nodding ties) – those ties that lack the emotional intensity, time, intimacy and reciprocity to even qualify as weak ties. Someone living on the same street whom you nod to every day is an absent tie. An absent tie is someone who exists in your life but with whom you have no connection whatsoever. That person is not helpful in the way that a weak tie can be.

Depending upon the type of application you are building, you may want to design it so that people are encouraged to form weak ties with people that they do not know very well. They are more likely to benefit from those weak ties than from strong ones. But it is important to recognize the difference between a weak tie and an absent one. On social network sites like MySpace and Facebook, where self-worth is garnered through the number of ties, the difference becomes important. Yet the fact that you can search for and connect to all kinds of ties on these networks has influenced their growth.

According to Granovetter’s theory, there would be value in the visual depiction of weak ties. LinkedIn tells you how many ties you have at each degree of separation, but other than that you are not given much information about those ties. Are they strong, weak, or absent ties? LinkedIn has another problem too: It makes it difficult for you to connect with your weak ties. You often have to ask a common friend for permission to establish that connection. No wonder LinkedIn is being eclipsed by other social network services!
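
One rough way to surface candidate weak ties in an application is to measure how many other contacts two connected people share. The sketch below uses neighborhood overlap as a structural proxy for tie strength. That proxy is an assumption made here for illustration, not Granovetter's definition (which rests on time, emotional intensity, intimacy and reciprocity), and it cannot tell a weak tie from an absent one. The names and the `FRIENDS` data are hypothetical.

```python
# Hypothetical friendship graph: each person maps to the set of people they know.
FRIENDS = {
    "ana": {"ben", "cat", "dev"},
    "ben": {"ana", "cat"},
    "cat": {"ana", "ben"},
    "dev": {"ana", "eve"},
    "eve": {"dev"},
}


def neighborhood_overlap(graph, a, b):
    """Share of the pair's other contacts that both people have in common."""
    contacts_a = graph[a] - {b}
    contacts_b = graph[b] - {a}
    everyone = contacts_a | contacts_b
    if not everyone:
        return 0.0
    return len(contacts_a & contacts_b) / len(everyone)


def rank_ties(graph, person):
    """List a person's ties from lowest overlap (most bridge-like) to highest."""
    scored = [
        (neighborhood_overlap(graph, person, other), other)
        for other in graph[person]
    ]
    return sorted(scored)


if __name__ == "__main__":
    for score, other in rank_ties(FRIENDS, "ana"):
        print(f"ana - {other}: overlap {score:.2f}")
```

In this toy data, the tie with the lowest overlap is the one that reaches people outside the tightly knit group, which is the kind of tie the weak-tie argument says carries new information.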

Centralization in a Network

An understanding of social networks needs also to include accounts of centrality and of one node’s relationship to other nodes in a network. This is why Linton C. Freeman’s article on centrality in social networks is important (Freeman, 1979). Freeman explored how “graph centralization” was based on differences in point centralities. He also outlined three competing theories regarding the definition of centrality based on degree of a point, control and independence.

Degree of a point refers to the number of nodes connected to a given node. In simple terms, this means counting the number of friends you have in a social network. Under this definition, the more friends you have, the more important you are.

Control refers to the extent to which nodes depend on one specific node to communicate with other nodes. For example, if hundreds of friends are connected to each other only when you serve as the bridge connecting them, then your centrality is high. You are the node that controls the communication flows.

Finally, independence means that a node is closely related to all the nodes considered, so that it is minimally dependent on any single node and is not subject to control. This means you can reach the maximum number of people through the fewest links, without being dependent on a particular few nodes.

Figure 1: A depiction of centrality.
* Degree of a point: C and K have the most nodes connected to them.
* Control: D serves as the bridge between the most nodes and controls the flow of information.
* Independence: K is most closely connected to the other nodes by multiple nodes (I and Q).
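
The three definitions can be made concrete with a small sketch. It assumes the usual reading of Freeman's terms: degree of a point as a simple neighbor count, control as betweenness (how many shortest paths between other pairs pass through a node) and independence as closeness (how short a node's paths to everyone else are). The graph below is a hypothetical toy network, not the one in Figure 1, and the function names are illustrative.

```python
from collections import deque

# Hypothetical network: two triangles (a-b-c and e-f-g) joined through d.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("e", "g"), ("f", "g")]


def build_graph(edges):
    """Build an undirected adjacency map from a list of pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph):
    """Degree of a point: the number of direct neighbors."""
    return {node: len(nbrs) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Independence: reachable nodes divided by total distance to them."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def betweenness_centrality(graph):
    """Control: shortest paths between other pairs that pass through a node
    (Brandes' algorithm for unweighted, undirected graphs)."""
    scores = dict.fromkeys(graph, 0.0)
    for source in graph:
        order, dist = [], {source: 0}
        preds = {n: [] for n in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from source
        sigma[source] = 1
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
                if dist[nbr] == dist[node] + 1:
                    sigma[nbr] += sigma[node]
                    preds[nbr].append(node)
        delta = dict.fromkeys(graph, 0.0)
        for node in reversed(order):
            for p in preds[node]:
                delta[p] += sigma[p] / sigma[node] * (1 + delta[node])
            if node != source:
                scores[node] += delta[node]
    # Each unordered pair was counted from both endpoints.
    return {n: s / 2 for n, s in scores.items()}


GRAPH = build_graph(EDGES)

if __name__ == "__main__":
    print("degree:", degree_centrality(GRAPH))
    print("control:", betweenness_centrality(GRAPH))
    print("independence:", closeness_centrality(GRAPH))
```

In this toy network the bridging node has fewer friends than its neighbors yet scores highest on control and independence, which is why counting friends alone can mislead.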

Because social networks are fundamentally social tools in which people are constantly monitoring and growing their social network, most social network media depict growth using the degree of a point definition. However, control and independence can be more useful definitions. For example, a person who controls information flows can be more important than one who simply has more friends in the network. Centrality can also indicate which members are the most useful or well connected and therefore the best information resources.

Learning from Flickr & Yahoo

The principles of node structures, tie strength and centrality have been applied to understand nodes in modern-day online social networks. A good example of this is the explanatory research conducted by Kumar, Novak and Tomkins (2006). They compared two online social networks, Flickr and Yahoo! 360, which together had more than five million users at the time. These researchers noticed that the social networks follow a standard pattern of growth: rapid early growth, followed by a period of decline and then slow but steady growth.

Kumar, Novak and Tomkins also saw that network activity is of three types:
* “Singletons,” who have no connections and are least central
* The “giant component,” which is the largest group of nodes tightly connected to the central nodes and to each other
* The “middle region,” which represents isolated groups which interact amongst themselves but not with the rest of the network, forming isolated stars. These groups grow one user at a time. Over time they merge with the giant component.
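
The three regions can be approximated from a friendship list by splitting the network into connected components. This is a minimal sketch that assumes the largest component stands in for the giant component, one-member components are singletons and everything else is the middle region. The user names and the `classify_regions` helper are hypothetical, and this is not the authors' own procedure.

```python
from collections import deque


def connected_components(users, friendships):
    """Group users into sets of people who can reach each other."""
    graph = {user: set() for user in users}
    for u, v in friendships:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    component.add(nbr)
                    queue.append(nbr)
        components.append(component)
    return components


def classify_regions(users, friendships):
    """Split components into giant component, middle region and singletons."""
    components = connected_components(users, friendships)
    giant = max(components, key=len)
    others = [c for c in components if c is not giant]
    return {
        "giant": giant,
        "middle": [c for c in others if len(c) > 1],
        "singletons": [c for c in others if len(c) == 1],
    }


# Hypothetical data: ten users, two of whom have no connections at all.
USERS = [f"u{i}" for i in range(1, 11)]
FRIENDSHIPS = [("u1", "u2"), ("u2", "u3"), ("u3", "u1"), ("u3", "u4"),
               ("u4", "u5"), ("u6", "u7"), ("u6", "u8")]

REGIONS = classify_regions(USERS, FRIENDSHIPS)

if __name__ == "__main__":
    print("giant component size:", len(REGIONS["giant"]))
    print("middle-region groups:", REGIONS["middle"])
    print("singleton count:", len(REGIONS["singletons"]))
```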


Figure 2a: The red section represents the giant component. The blue is the middle region, comprising isolated networks, while the gray nodes are singletons.

The node analysis of these networks showed that more than half of a social network is outside the giant component where the greatest centrality lies. They used the “control” definition of centrality to determine this. The research also highlighted a prevalence of “stars” in the middle region which are mini social networks, typically driven by one dynamic member who serves as the point of centrality with others serving as satellite nodes – connected to the dynamic member but not to each other. In Kumar, Novak and Tomkins’ analysis the middle region represented one-third of users on Flickr and about ten percent of users on Yahoo! 360.
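
The "star" shape described above is easy to test for in a single isolated group: one member is connected to everyone else, and nobody else is connected to anyone but that member. The sketch below is one possible check, assuming each friendship appears once as an unordered pair with no self-links. The `find_star_center` name and the sample groups are hypothetical.

```python
def find_star_center(members, friendships):
    """Return the hub of a star-shaped group, or None if it is not a star.

    A group of n members is treated as a star when it has exactly n - 1
    internal friendships and one member takes part in all of them.
    Assumes each friendship is listed once and nobody is linked to themselves.
    """
    members = set(members)
    degree = dict.fromkeys(members, 0)
    internal = 0
    for u, v in friendships:
        if u in members and v in members:
            degree[u] += 1
            degree[v] += 1
            internal += 1
    size = len(members)
    # Groups of one or two people have no meaningful hub.
    if size < 3 or internal != size - 1:
        return None
    for member, count in degree.items():
        if count == size - 1:
            return member
    return None


# Hypothetical groups: one star around "mia", one closed triangle.
STAR_LINKS = [("mia", "noah"), ("mia", "omar"), ("mia", "pia")]
TRIANGLE_LINKS = [("ray", "sam"), ("sam", "tia"), ("tia", "ray")]

if __name__ == "__main__":
    print(find_star_center({"mia", "noah", "omar", "pia"}, STAR_LINKS))
    print(find_star_center({"ray", "sam", "tia"}, TRIANGLE_LINKS))
```

A check like this could help identify the dynamic members whose departure would disconnect their whole group.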

Also keep in mind that the most growth happens in the middle region where dynamic members influence others to join their network. These sub-networks can gradually join the giant component over time. Once they do, the importance of the dynamic member diminishes. Even if that dynamic member were to leave the network, the others would stay in the network.


Figure 2b: A connection is made between one of the isolated networks in the middle region and the giant component.


Figure 2c: The formerly isolated network becomes part of the giant component.

What are the implications of this? When designing your social network, be aware that most of the network will be outside the giant component. In a sense, social networks themselves are thousands of sub-networks. The more mechanisms that you provide for those sub-networks to flourish, the greater the overall network growth. Social networks are fundamentally virtual ghettos. Networks like MySpace and Facebook that encourage ghettos grow the most. Ning, which lets you create your own network and join others too, cleverly understands this concept and leverages it.

LiveJournal, DBLP & Adoption Behavior

Most online social networks grow based on the initiative of early adopters who transfer their offline networks online and serve as “stars.” But it is also important to look at the evolution of social networks based on intentional activity within a network. Backstrom et al. (2006) analyzed group formation in large social networks. They used data from LiveJournal's ten million users and from DBLP, a database of co-authorship in conference publications, to study how communities grew based on the underlying social networks. They showed that a person was more likely to join a community if that person's friends were already closely linked together in it. Having several friends closely connected in an online social network builds trust. For those of us who are active members of social networks, this makes obvious sense.
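
The finding above suggests two simple signals a designer could compute for a prospective member: how many of their friends are already in a community, and how closely those friends are linked to each other. The sketch below computes both. It is an illustration under stated assumptions, not the model used in the paper, and the names, the `friend_features` helper and the sample data are all hypothetical.

```python
from itertools import combinations

# Hypothetical friendship graph: each person maps to the set of their friends.
FRIENDS = {
    "dana": {"eli", "fay", "gus"},
    "eli": {"dana", "fay", "gus"},
    "fay": {"dana", "eli", "gus"},
    "gus": {"dana", "eli", "fay"},
    "hal": {"ivy", "jon", "kim"},
    "ivy": {"hal"},
    "jon": {"hal"},
    "kim": {"hal"},
}

# Hypothetical community that dana and hal have not joined yet.
COMMUNITY = {"eli", "fay", "gus", "ivy", "jon", "kim"}


def friend_features(user, community, graph):
    """Return (friends already inside, share of those friend pairs who are linked)."""
    inside = sorted(graph.get(user, set()) & community)
    pairs = list(combinations(inside, 2))
    if not pairs:
        return len(inside), 0.0
    linked = sum(1 for a, b in pairs if b in graph.get(a, set()))
    return len(inside), linked / len(pairs)


if __name__ == "__main__":
    for person in ("dana", "hal"):
        count, density = friend_features(person, COMMUNITY, FRIENDS)
        print(f"{person}: {count} friends inside, pair density {density:.2f}")
```

Both sample users have three friends in the community, but only one of them has friends who all know each other. Under the finding described above, that user would be the likelier one to join.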

The article conclusively showed that the most growth happened in the giant component (without using the term explicitly) where the nodes were most central. In highlighting the importance of the giant component, Backstrom et al. validated the Kumar et al. (2006) theory. Their article raises a critical question: Once a node becomes aware of its neighbors’ behavior, under what conditions and based on what network relationships will the node adopt that behavior itself?

Another group of researchers who studied the DBLP database were Cai et al. (2005). They pointed out that each node belongs to several different social networks, with the other networks affecting the group formation patterns, evolution and information sharing on the social network. As a result, they felt that a network can’t be analyzed independently but needs to be studied in the context of other networks. The activity of nodes on its other networks may also influence whether a node leaves a network. This raises an important question for practitioners: Do you know how much of the activity on your social network is influenced by activity on other social networks?

This is of particular interest when examined in the context of the new Google lab efforts around Social Stream, which hopes to be a meta-social-network aggregating different networks together. Developed in partnership with Carnegie Mellon University, Social Stream is currently in private beta. The question that social network designers worry about is this: once you can understand network activity on different networks via a single, consolidated interface, how will that affect your own network preferences?

It is clear that online social networks are always evolving because of both outside influences and activity within them. Butler (2001) emphasized this by showing that network size has a complex influence on a network: gaining more members also leads to losing more members. Butler argued that it is necessary to balance the positives and negatives of size and communication activity. A final question to consider: which type of membership activity, and where (in the giant component, in the middle region or among singletons), most affects an online network?


Researchers studying group formation have incorporated the normative social network theories discussed by Granovetter and Freeman. They recognize that these are socio-technical systems that must account for human agency, meaning that the ability of human beings to make unique choices heavily influences a network’s evolution. As a result, one can apply social networking theory to a web product, but one must remember that because these are human systems it is difficult to gauge the potential success of a given network.

The next part of this series will explore information-sharing patterns on social networks. The third part will cover some workplace scenarios.

Author’s Note: By no means is this review comprehensive; rather, it should serve as just a starting point for gaining familiarity with some of the academic contributions.


Backstrom, L., Huttenlocher, D., Kleinberg, J. and Lan, X. (2006). Group formation in large social networks: membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press: Philadelphia, PA, USA.

Butler, B. (2001). Membership size, communication activity, and sustainability: a resource-based model of online social structures. Information Systems Research, 12 (4), p. 26.

Cai, D., Shao, Z., He, X., Yan, X. and Han, J. (2005). Mining hidden community in heterogeneous social networks. In Proceedings of the 3rd International Workshop on Link Discovery. ACM Press: Chicago, Illinois.

Freeman, L. C. (1979). Centrality in social networks: conceptual clarification. Social Networks, 1, pp. 215-239.

Garton, L., Haythornthwaite, C. and Wellman, B. (1997). Studying online social networks. Journal of Computer Mediated Communication, 3 (1).

Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 78 (6), pp. 1360-1380.

Johnson, B., Lorenz, E. and Lundvall, B. (2002). Why all this fuss about codified and tacit knowledge? Industrial and Corporate Change, 11 (2), pp. 245-262.

Kumar, R., Novak, J. and Tomkins, A. (2006). Structure and evolution of online social networks. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 611-617. ACM Press: Philadelphia, PA, USA.


  1. Nice summary of research on what is definitely an important topic for designers today.

    The discussion of LinkedIn seems to be a bit off the mark though. The fact that LinkedIn does not distinguish between strong and weak ties is the rule, not the exception, as far as popular social networking applications go.

    The claim that LinkedIn makes it difficult for you to connect with your weak ties is not factually correct, either. LinkedIn has provided the ability to connect to users with whom you don’t share connections for quite a while now (there’s even an option to indicate that you don’t know the person you’re inviting to connect!)

    Nevertheless, a good read – I look forward to the rest of the series.

  2. Dmitry,

    Thank you for your feedback. I’m not sure I agree with you completely about LinkedIn letting you connect with weak ties. Only if you sign up for one of the paid accounts do you really have easy access to potential weak ties. Otherwise, the way you reach others is restricted, and it was even more difficult to connect with them earlier.

    In terms of whether LinkedIn is an exception versus the norm in being able to differentiate between strong and weak ties, Facebook presents a different story. By virtue of letting you categorize your Friends, you have a place in which to “keep” your weak ties. The Limited Profile view also helps with this. However, with Facebook I’d really like the ability to create my own categories into which I can put people. I’m told that Plaxo does this with the new version of their software.

    Nevertheless, thanks for the thoughts and I don’t mean to simply slam LinkedIn. It does some things really well and I’m an active user myself.

  3. Thanks for sharing; you’ll probably find the next two parts interesting as well. In those I talk about group formation and workplace scenarios. I noticed that your article highlights different research. It just goes to show how much is out there in terms of quality research.

  4. Great Job Shiv, I’ve been following this topic lately and this is a nice review, very clear, congrats.

  5. _The more mechanisms that you provide for those sub-networks to flourish, greater the overall network growth._

    That’s a good point: the IA of a social network does not need to emphasise a merging of everyone into the “giant component”. The disconnected sub-networks are important too.

    On the other hand, I’m very sad for all those singletons. A singleton in a social network is an oxymoron. And yet that’s how we all start out: a newbie with no friends. Identifying singletons and assisting them to form connections is one of the most important processes for building the network. If a new user can’t do that easily, they don’t come back, and they typically leave an unused account behind. Can the theory help IAs to improve that process?


  6. Something else to remember about encouraging the flourishing of disconnected sub-networks is that innovation and evolution are more likely to occur at the edges, in the isolated enclaves, away from the big groups and their pressure to conform. Once a group moves into the “giant component” then subtle forces come into play which eventually result in stagnation and decay … and then the only way the giant component can ever stay alive is by constantly devouring the fresh blood of disconnected sub-networks.

  7. Thanks for that excellent article.

    Makes me wonder about the idea of “deriving value” from a network. You mention that weak ties are better at providing value to the individual than strong ties. Yet you cite Ning’s model, which would seem to encourage the creation of small, affinity-based, strong-tie communities. Will Ning users look to derive value in a different way?

    Maybe there’s a distinction to be made between more utilitarian communities like LinkedIn [primarily for professional networking] and other, more purely social communities.

  8. Interesting thoughts. Ning does encourage the creation of small, strong-tie networks. But it also encourages you to use your account to join multiple networks on the Ning platform. And in that sense, it’s helping you find ways to connect with weak ties. So in a sense there are two levels of affinity – first to Ning and then to your specific social networks within Ning.

    Good question about the utilitarian versus social distinction. One can argue that all networks are social but that may not necessarily be the case. Those categories could work.

  9. Eric–

    Your comments seemed on-base at first. Though in re-reading them, there seems to be a degree of proposition to your claims. Is there research that bolsters the ideas in your reminder?

    I can understand that innovation and evolution happen when isolated from the large forces. But the very nature of the technology needs to be considered: one’s association with the larger group is not an absolute state of existence; one is not locked in (Tron!). We can step away from our computers. Or we can belong to entirely different networks (indeed I may be “at the edges” in one and part of the “giant component” in another). What I do in my free time away from the network may inform my presence when I log back in; it may still contribute to innovation and evolution within that big group, no? My opinion is that we can have fresh blood without any devouring taking place.

    Or have I misunderstood the ideas in your comment?

  10. Thanks Shiv,
    Good article.
    I remember seeing a LinkedIn visualization tool that displayed a user’s connections in a graph or illustrative format. And, as we all know, there are also other data visualization tools for social networks out there. For instance, Mashable includes a few in this list:

    What are your thoughts on these types of tools? Are they accurate? Which ones do you find the most interesting or helpful?

  11. The power of LinkedIn is that it is not MySpace or Facebook.

    It is missing the point to critique LinkedIn with regard to weak and strong ties.

    It takes effort to connect, and thank goodness for that: it means that the actual value of your network is so much higher per connection.

  12. Melissa and Thomas, Thank you for your thoughts. I haven’t seen too much research on the visualization tools so it is hard for me to comment on them. On the surface and through my own experience, I can say this – they can be valuable if they truly provide information at different depths. Many visualization tools appear gimmicky because they provide either too little information or too abstract a view. Part of the problem is that a large screen is really required to take advantage of them.

    Regarding the point about LinkedIn, what’s interesting is that different people have different perceptions of the roles of these networks. They also use them very differently. So while for one person the comparisons may be natural, for another they may be unfair. Still, your point is noted.

  13. Would like to recommend 2 websites for people who are looking for free social network hosting

  14. Stefan,

    Thank you for the link to the article. It was very interesting reading. The other two parts are on their way.

