Basic Model
The Forest Fire model was firstly proposed in [1] and [2]. It shows its applications in [3] and [4]. The basic setting of the model is as follows. The model has two parameters, a forward burning probability and a backward burning ratio . Consider a node joining the network at time , and let be the graph constructed thus far. will consist of just a single node. Node forms outlinks to nodes in according to the following process:
(a) first chooses an ambassador node uniformly at random and forms a link to .
(b) Generate two random numbers, and , that are geometrically distributed with means and , respectively. Node selects outlinks and in-links of incident to nodes that were not yet visited. Let denotes the other ends of these selected links. If not enough in or outlinks are available, selects as many as it can.
(c) forms outlinks to and then applies step (b) recursively to each of . As the process continues, nodes cannot be visited a second time, preventing the construction from cycling.
Thus, the burning of links in the Forest Fire model begins at , spreads to and proceeds recursively until it dies out. The original authors claim that this model can preserve the following three properties of a graph as growing:
- Heavy-tailed in-degree and out-degree distribution
- Densification power law
- Shrinking diameter
Application to Graph Sampling
In [3], the authors described how to apply Forest Fire model to graph sampling as follows. We first choose node uniformly at random. We then generate a random number that is geometrically distributed with mean . Node selects out-links incident to nodes that were not yet visited. Let denote the other ends of these selected links. We then apply this step recursively to each of until enough nodes have been burned. As the process continues, nodes cannot be visited a second time, preventing the construction from cycling. If the fire dies, then we restart it, i.e. select new node uniformly at random. We call the parameter the forward burning probability. In [4], the authors start from a set of “seed” users, which are chosen in three ways: 1) totally random 2) geo-location random 3) randomly according to user activities.
Remarks
Forest Fire model is indeed an ad-hoc model. As stated in [2], it lacks of rigorous analysis of the model itself.
Reference
[1] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, KDD ’05, pages 177–187, New York, NY, USA, 2005. ACM.
[2] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD), 1, March 2007.
[3] J. Leskovec and C. Faloutsos. Sampling from large graphs. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’06, pages 631–636, New York, NY, USA, 2006. ACM.
[4] M. De Choudhury, Y.-R. Lin, H. Sundaram, K. S. Candan, L. Xie, and A. Kelliher. How Does the Data Sampling Strategy Impact the Discovery of Information Diffusion in Social Media? In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, May 2010.
Does anyone knows where can I find the R implementation of FF?