Mixing patterns

From HandWiki - Reading time: 8 min

Mixing patterns refer to systematic tendencies of one type of nodes in a network to connect to another type. For instance, nodes might tend to link to others that are very similar or very different. This feature is common in many social networks, although it also appears sometimes in non-social networks. Mixing patterns are closely related to assortativity; however, for the purposes of this article, the term is used to refer to assortative or disassortative mixing based on real-world factors, either topological or sociological.

Types of Mixing Patterns

Mixing patterns are a characteristic of an entire network, referring to the extent for nodes to connect to other similar or different nodes. Mixing, therefore, can be classified broadly as assortative or disassortative. Assortative mixing is the tendency for nodes to connect to like nodes, while disassortative mixing captures the opposite case in which very different nodes are connected.

Obviously, the particular node characteristics involved in the process of creating a link between a pair will shape a network's mixing patterns. For instance, in a sexual relationship network, one is likely to find a preponderance of male-female links, while in a friendship network male-male and female-female networks might prevail. Examining different sets of node characteristics thus may reveal interesting communities or other structural properties of the network. In principle there are two kinds of methods used to exploit these properties. One is based on analytical calculations by using generating function techniques. The other is numerical, and is based on Monte Carlo simulations for the graph generation.[1]

In a study on mixing patterns in networks, M.E.J. Newman starts by classifying the node characteristics into two categories. While the number of real-world node characteristics is virtually unlimited, they tend to fall under two headings: discrete and scalar/topological. The following sections define the differences between the categories and provide examples of each. For each category, the models of assortatively mixed networks introduced by Newman are discussed in brief.

Mixing Based on Discrete Characteristics

Discrete characteristics of a node are categorical, nominal, or enumerative, and often qualitative. For instance, race, gender, and sexual orientation are commonly examined discrete characteristics.

To measure the mixing of a network on discrete characteristics, Newman[1] defines a quantity [math]\displaystyle{ e_{ij} }[/math] to be the fraction of edges in a network that connect nodes of type i to type j (see Fig. 1). On an undirected network this quantity is symmetric in its indices [math]\displaystyle{ e_{ij} = e_{ji} }[/math], while on directed ones it may be asymmetric. It satisfies the sum rules

[math]\displaystyle{ \sum_{ij}{e_{ij} = 1},\quad\sum_{j}{e_{ij} = a_{i}},\quad\sum_{i}{e_{ij} = b_{j}} }[/math],

where [math]\displaystyle{ a_{i} }[/math] and [math]\displaystyle{ b_{i} }[/math] are the fractions of each type of an edge's end that is attached to nodes of type [math]\displaystyle{ i }[/math]. On undirected graphs, where there is no physical distinction between the ends of a link, i.e. the ends of edges are all of the same type, [math]\displaystyle{ a_{i} = b_{i} }[/math].

Then, an assortativity coefficient, a measure of the similarity's or dissimilarity's strength between two nodes on a set of discrete characteristics may be defined as:

[math]\displaystyle{ r = \frac{\sum_i{e_{ii}} - \sum_i{a_i b_i}}{1 - \sum_i{a_i b_i}} }[/math]

with

[math]\displaystyle{ r_{min} = -\frac{\sum_i{a_i b_i}}{1 - \sum_i{a_i b_i}}. }[/math]

This formula yields [math]\displaystyle{ r = 0 }[/math] when there's no assortative mixing, since [math]\displaystyle{ e_{ij} = a_{i}b_{j} }[/math] in that case, and [math]\displaystyle{ r = 1 }[/math] when the network is perfectly assortative. If the network is perfectly disassortative, i.e. every link connects two nodes of different types, then [math]\displaystyle{ r = r_{min} }[/math], which lies in general in the range [math]\displaystyle{ -1\leq r \lt 0 }[/math]. This range for [math]\displaystyle{ r_{min} }[/math] implies that a perfectly disassortative network is normally closer to a randomly mixed network than a perfectly assortative one is. When there are several different types of nodes, then random mixing will most often pair unlike nodes, so that the network appears to be mostly disassortative. Therefore, it is appropriate that the value [math]\displaystyle{ r = 0 }[/math] for a random network should be closer to that for the perfectly disassortative network than for the perfectly assortative one.

Generating function

The method of generating functions is based on the idea of figuring out the proper generating function for the distributions of our interest every time, and extract data related to the networks structure by differentiating them. Assuming that the degree distribution [math]\displaystyle{ p_{k}^{(i)} }[/math] for nodes of type [math]\displaystyle{ i }[/math] and the value of the matrix [math]\displaystyle{ e_{ij} }[/math] (and hence, the values of [math]\displaystyle{ a_{i} }[/math] and [math]\displaystyle{ b_{i} }[/math]) are known, then we may consider the ensemble of all graphs with the specified [math]\displaystyle{ p_{k}^{(i)} }[/math] and [math]\displaystyle{ e_{ij} }[/math] to yield collective (macroscopic) network characteristics. In principle, the generating function for [math]\displaystyle{ p_{k}^{(i)} }[/math] and its first moment are given by:

[math]\displaystyle{ G_{0}^{(i)}(x_{1},...,x_{n}) = \sum_{k} p_{k}^{(i)}x^{k}, }[/math] and [math]\displaystyle{ G_{1}^{(i)} = \frac{1}{z_{i}}\frac{dG_{0}^{(i)}}{dx}\Bigg|_{x = 1}, }[/math]

where [math]\displaystyle{ x_{i} = \frac{\sum_{j}e_{ij}x_{j}}{\sum_{j}e_{ij}}, }[/math] the node of type [math]\displaystyle{ i }[/math] ([math]\displaystyle{ r_{i} }[/math] in the number) and [math]\displaystyle{ z_{i} }[/math] the mean degree for nodes of this type. Now we focus on the distributions that we're interested for.

The distribution of the total number of nodes reachable by following an edge that arrives at a node of type [math]\displaystyle{ i }[/math] has a generating function [math]\displaystyle{ H_{1}^{(i)}(x) = xG_{1}^{(i)}[H_{1}^{(1)}(x),...,H_{1}^{(n)}(x)] }[/math]. Similarly, the distribution of the number of nodes reachable from a randomly chosen node of type [math]\displaystyle{ i }[/math] is generated by [math]\displaystyle{ H_{0}^{(i)}(x) = xG_{0}^{(i)}[H_{1}^{(1)}(x),...,H_{1}^{(n)}(x)] }[/math]. Now we are in position to yield some of the network's properties. The mean number [math]\displaystyle{ s_{i} }[/math] of nodes reachable from a node of type [math]\displaystyle{ i }[/math] is

[math]\displaystyle{ s_{i} = \frac{dH_{0}^{(i)}}{dx}\Bigg|_{x = 1} = 1 + G_{0}^{(i)'}(1)\frac{\sum_{j}e_{ij}H_{1}^{(i)'}(1)}{\sum_{j}e_{ij}} }[/math]

Furthermore, if [math]\displaystyle{ u_{i} }[/math] is the probability for a node of type [math]\displaystyle{ i }[/math] (reached by following a randomly chosen link in the graph) not to belong to the giant cluster, then the overall fraction [math]\displaystyle{ S }[/math] of nodes that compose this cluster is given by

[math]\displaystyle{ S = 1 - \sum_{i}\frac{a_{i}}{z_{i}}G_{0}^{(i)}(u_{1},...,u_{n}) }[/math]

The numerical simulations based on Monte Carlo techniques seem to agree with the analytical results yielded by the formulas described above.

Mixing by Scalar or Topological Characteristics

Scalar characteristics of a node are those that are quantitative. They may be continuous or discrete ordinal variables like counts. Age is perhaps the simplest example, though intelligence and raw income are other obvious possibilities. Some topological features of the network may also be used for examining mixing by scalar properties. Specifically, the degree of a node is often a highly important feature in the mixing patterns of networks.[2] Topological scalar features are very useful, because unlike other measures, they are always available. They are sometimes used as a proxy for real-world "sociability".[1]

For measuring the assortativity of scalar variables, similar to the discrete case (see above) an assortativity coefficient can be defined. One can measure it using the standard Pearson Correlation, as Newman demonstrates.[1] In Fig. 2, for instance, a calculation of the Pearson Correlation Coefficient yields r = 0.574. This indicates a fairly strong association between the age of husbands and wives at the time of marriage.

An alternative coefficient can be computed for measuring the mixing by the degree of the nodes. Newman [1] derives the expression, which is found to be

[math]\displaystyle{ r = \frac{\sum_{jk}jk(e_{jk} - q_{j} q_{k})}{\sigma^2_q} }[/math] for an undirected network. In this formula, if [math]\displaystyle{ p_k }[/math] refers to the graph's degree distribution (i.e., the probability that a node has degree k) then [math]\displaystyle{ q_k = \frac{(k + 1) p_{k + 1}}{z} }[/math]. This refers to the excess degree of a node, or the number of other edges aside from the currently examined one. The z refers to the average degree in the network, and [math]\displaystyle{ \sigma_q }[/math] is the standard deviation of the distribution [math]\displaystyle{ q_k }[/math]. For a directed network the equivalent expression is [math]\displaystyle{ r = \frac{\sum_{jk}jk(e_{jk} - q_{j}^{in} q_{k}^{out})}{\sigma_{in}\sigma_{out}} }[/math].

This correlation is positive when nodes are assortative by degree, and negative when the network is disassortative. Thus, the measure captures an overall sense of the mixing patterns of a network. For a more in-depth analysis of this topic, see the article on assortativity.

The method of generating functions is still applicable for this case too, but the functions to be calculated are rarely calculable in closed form. Thus, numerical simulations seem to be the only way to yield results of some interest. The technique used is once again the Monte Carlo one. For the case of networks with a power-law degree-distribution [math]\displaystyle{ p_{k}\sim k^{-\tau} }[/math], [math]\displaystyle{ q_{k} }[/math] has a divergent mean, unless [math]\displaystyle{ \tau \gt 3 }[/math], which rarely happens so.[3] Instead, the exponentially truncated power-law distribution [math]\displaystyle{ p_{k} = \frac{k^{-\tau}\mathrm{e}^{-k/\kappa}}{\mathrm{Li}_{\tau}(\mathrm{e}^{-1/\kappa})}\ \mathrm{for}\ k\geq 1 }[/math] yields a distribution for the excess degree of the type [math]\displaystyle{ q_{k}\sim (k+1)^{1-\tau}\mathrm{e}^{-(k+1)/\kappa} }[/math]. The results for this case are summarized below.

1) The position of the phase transition at which a giant cluster appears moves to higher values of [math]\displaystyle{ \kappa }[/math] as the value of [math]\displaystyle{ r }[/math] decreases. That is, the more assortative a network is, the lower the edge density threshold for the giant cluster's appearance will be.

2) The size of the giant cluster in the limit of large [math]\displaystyle{ \kappa }[/math] is smaller for the assortatively mixed graph, than for the neutral and disassortative ones.

3) Assortative mixing in the network affects the network's robustness under node removal. For assortative networks, it is required to remove about ten times more than usual (usual means a neutral network[clarification needed]) high-degree nodes to destroy the giant cluster, while the opposite is true for disassortative networks, i.e. they are more susceptible than neutral ones under removal of the high-degree nodes.

The fascinating result on the dependence of the network's robustness to its node mixing may be explained as follows. According to their definition, high-degree nodes in assortative networks tend to form a core group among them. Such a core group provides robustness to the network by concentrating all the obvious target nodes together in one portion of the graph. Removing these high-degree nodes is still one of the most effective ways to destroy network connectivity, but it is less effective (compared to neutral networks) because by removing them all from the same portion of the graph we fail to attack other portions. If these other portions are themselves percolating, then a giant cluster will persist even as the highest degree nodes vanish. On the other hand, the disassortatively mixed network is particularly susceptible to removal of high-degree nodes because these nodes are strewn far apart across the network, so that attacking them is like attacking all parts of the network at once.

Examples and Applications

A common application of mixing patterns is the study of disease transmission. For instance, many studies have used mixing to study the spread of HIV/AIDS and other contagious diseases.[4][5][6] These articles find a strong connection between Mixing patterns and the rate of disease spread. The findings can also be used to model real-world network growth, as in,[7] or find communities within networks.

References

  1. 1.0 1.1 1.2 1.3 1.4 Newman, M. E. J. (2003-02-27). "Mixing patterns in networks". Physical Review E 67 (2): 026126. doi:10.1103/physreve.67.026126. ISSN 1063-651X. PMID 12636767. Bibcode2003PhRvE..67b6126N. 
  2. Newman, M. E. J. (2002-10-28). "Assortative Mixing in Networks". Physical Review Letters 89 (20): 208701. doi:10.1103/physrevlett.89.208701. ISSN 0031-9007. PMID 12443515. Bibcode2002PhRvL..89t8701N. 
  3. Albert, Réka; Barabási, Albert-László (2002-01-30). "Statistical mechanics of complex networks". Reviews of Modern Physics 74 (1): 47–97. doi:10.1103/revmodphys.74.47. ISSN 0034-6861. Bibcode2002RvMP...74...47A. 
  4. Aral, S O; Hughes, J P; Stoner, B; Whittington, W; Handsfield, H H; Anderson, R M; Holmes, K K (1999). "Sexual mixing patterns in the spread of gonococcal and chlamydial infections.". American Journal of Public Health (American Public Health Association) 89 (6): 825–833. doi:10.2105/ajph.89.6.825. ISSN 0090-0036. PMID 10358670. 
  5. Garnett, Geoffrey P.; HUGHES, James P.; Anderson, Roy M.; Stoner, Bradley P.; Aral, Sevgi O. et al. (1996). "Sexual Mixing Patterns of Patients Attending Sexually Transmitted Diseases Clinics". Sexually Transmitted Diseases (Ovid Technologies (Wolters Kluwer Health)) 23 (3): 248–257. doi:10.1097/00007435-199605000-00015. ISSN 0148-5717. PMID 8724517. 
  6. Ford, Kathleen; Sohn, Woosung; Lepkowski, James (2002). "American Adolescents: Sexual Mixing Patterns, Bridge Partners, and Concurrency". Sexually Transmitted Diseases (Ovid Technologies (Wolters Kluwer Health)) 29 (1): 13–19. doi:10.1097/00007435-200201000-00003. ISSN 0148-5717. PMID 11773873. 
  7. Catanzaro, Michele; Caldarelli, Guido; Pietronero, Luciano (2004). "Social network growth with assortative mixing". Physica A: Statistical Mechanics and Its Applications (Elsevier BV) 338 (1–2): 119–124. doi:10.1016/j.physa.2004.02.033. ISSN 0378-4371. Bibcode2004PhyA..338..119C. 




Licensed under CC BY-SA 3.0 | Source: https://handwiki.org/wiki/Mixing_patterns
24 views | Status: cached on July 23 2024 01:11:34
↧ Download this article as ZWI file
Encyclosphere.org EncycloReader is supported by the EncyclosphereKSF