Abstract
Let P be a point set in ℝd, and let M be a function that maps any subset of P to a positive real. We examine the problem of computing the mean and variance of M when a subset in P is selected according to a random distribution. We consider two distributions; in the first distribution (the Bernoulli distribution), each point p in P is included in the random subset independently, with probability π(p). In the second distribution (the fixed-size distribution), exactly s points are selected uniformly at random among all possible subsets of s points in P.
We present efficient algorithms for computing the mean and variance of several geometric measures when point sets are selected under one of the described random distributions. We also implemented four of those algorithms: an algorithm that computes the mean 2D bounding box volume in the Bernoulli distribution, an algorithm for the mean 2D convex hull area in the fixed-size distribution, an algorithm that computes the exact mean and variance of the mean pairwise distance (MPD) for d-dimensional point sets in the fixed-size distribution, and an (1 − ε)-approximation algorithm for the same measure. We conducted experiments where we compared the performance of our implementations with a standard heuristic approach, and we show that our implementations are very efficient. We also compared the implementation of our exact MPD algorithm and the (1 − ε)-approximation algorithm; the approximation method performs faster on real-world datasets for point sets of up to 13 dimensions, and provides high-precision approximations.
- P. K. Agarwal, S. Har-Peled, S. Suri, H. Yıldız, and W. Zhang. 2014. Convex hulls under uncertainty. In European Symposium on Algorithms. Springer, 37--48.Google Scholar
- Kfir Barhum, Oded Goldreich, and Adi Shraibman. 2007. On Approximating the Average Distance Between Points. Springer, Berlin, 296--310.Google Scholar
- A. K. Brunbjerg, J. Cavender-Bares, W. L. Eiserhardt, R. Ejrnæs, L. W. Aarssen, H. L. Buckley, E. Forey, F. Jansen, J. Kattge, C. Lane, R. A. Lubke, A. T. Moles, A. L. Monserrat, R. K. Peet, J. Roncal, L. Wootton, and J.-C. Svenning. 2014. Multi-scale phylogenetic structure in coastal dune plant communities across the globe. Journal of Plant Ecology (2014). arXiv:http://jpe.oxfordjournals.org/content/early/2014/01/28/jpe.rtt069.full.pdf+htmlGoogle Scholar
- Benno Büeler, Andreas Enge, and Komei Fukuda. 2000. Exact Volume Computation for Polytopes: A Practical Study. Birkhäuser, Basel, 131--154.Google Scholar
- P. B. Callahan and S. R. Kosaraju. 1995. A decomposition of multidimensional point sets with applications to K-nearest-neighbors and N-body potential fields. J. ACM 42, 1 (Jan. 1995), 67--90. Google ScholarDigital Library
- W. K. Cornwell, D. W. Schwilk, and D. D. Ackerly. 2006. A trait-based test for habitat filtering: Convex hull volume. Ecology 87, 6 (2006), 1465--1471.Google ScholarCross Ref
- M. Fink, J. Hershberger, N. Kumar, and S. Suri. 2016. Hyperplane separability and convexity of probabilistic point sets. In Proceedings of the 32nd International Symposium on Computational Geometry (SoCG’16). 38:1--38:16.Google Scholar
- Nicholas J. Gotelli. 2000. Null model analysis of species co-occurrence patterns. Ecology 81, 9 (2000), 2606--2621.Google ScholarCross Ref
- J. C. Gower. 1971. A general coefficient of similarity and some of its properties. Biometrics (1971), 857--871.Google Scholar
- Ronald L. Graham. 1972. An efficient algorithm for determining the convex hull of a finite planar set. Information Processing Letters 1, 4 (1972), 132--133.Google ScholarCross Ref
- W. Hoeffding. 1963. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58, 301 (1963), 13--30.Google ScholarCross Ref
- Lingxiao Huang and Jian Li. 2015. Approximating the Expected Values for Combinatorial Optimization Problems over Stochastic Points. Springer, Berlin, 910--921.Google Scholar
- A. Jørgensen, M. Löffler, and J. M. Phillips. 2011. Geometric computations on indecisive points. In Workshop on Algorithms and Data Structures. Springer, 536--547. Google ScholarDigital Library
- J. B. Lasserre. 1983. An analytical expression and an algorithm for the volume of a convex polyhedron in R<sup>n</sup>. Journal of Optimization Theory and Applications 39, 3 (1983), 363--377.Google ScholarCross Ref
- Chao Li, Chenglin Fan, Jun Luo, Farong Zhong, and Binhai Zhu. 2015. Expected computations on color spanning sets. Journal of Combinatorial Optimization 29, 3 (2015), 589--604. Google ScholarDigital Library
- L. Huang, J. Li, J. M. Phillips, and H. Wang. 2016. Epsilon-kernel coresets for stochastic points. Retrieved July 7, 2018 from https://arxiv.org/abs/1411.0194.Google Scholar
- Maarten Löffler and Jeff M. Phillips. 2009. Shape fitting on point sets with probability distributions. In Proceedings of the 17th Annual European Symposium on Algorithms (ESA’09). Springer, Berlin. 313--324.Google Scholar
- Maarten Löffler and Marc van Kreveld. 2010. Largest and smallest convex hulls for imprecise points. Algorithmica 56, 2 (Feb. 2010), 235--269.Google ScholarCross Ref
- Norman W. H. Mason, Pascal Irz, Cédric Lanoiselée, David Mouillot, and Christine Argillier. 2008. Evidence that niche specialization explains species--Energy relationships in lake fish communities. Journal of Animal Ecology 77, 2 (2008), 285--296.Google ScholarCross Ref
- M. A Mouchet, S. Villéger, N. W. H. Mason, and D. Mouillot. 2010. Functional diversity measures: An overview of their redundancy and their ability to discriminate community assembly rules. Functional Ecology 24, 4 (2010), 867--876.Google ScholarCross Ref
- S. Suri, K. Verbeek, and H. Yıldız. 2013. On the most likely convex hull of uncertain points. In European Symposium on Algorithms. Springer, 791--802.Google Scholar
- N. G. Swenson and M. D. Weiser. 2014. On the packing and filling of functional space in eastern North American tree assemblages. Ecography 37, 11 (2014), 1056--1062.Google Scholar
- C. Tsirogiannis. 2016. GitHub Repository of Geometric-Measure Statistics Code. Retrieved July 7, 2018 from https://github.com/constantinosTsirogiannis/GeometryStats.Google Scholar
- S. Villéger, N. W. H. Mason, and D. Mouillot. 2008. New multidimensional functional diversity indices for a multifaceted framework in functional ecology. Ecology 89, 8 (2008), 2290--2301.Google ScholarCross Ref
- B. Walker, A. Kinzig, and J. Langridge. 1999. Plant attribute diversity, resilience, and ecosystem function: The nature and significance of dominant and minor species. Ecosystems 2 (1999), 95--113.Google ScholarCross Ref
- Jie Yang, Nathan G. Swenson, Guocheng Zhang, Xiuqin Ci, Min Cao, Liqing Sha, Jie Li, J. W. Ferry Slik, and Luxiang Lin. 2015. Local-scale partitioning of functional and phylogenetic beta diversity in a tropical tree assemblage. Scientific Reports 5 (2015), 12731.Google ScholarCross Ref
Index Terms
- Computing the Expected Value and Variance of Geometric Measures
Recommendations
Computing generalized ham-sandwich cuts
Barany et al. (2008) [1] proved that, for any @b@?[0,1]^d and any d well-separated convex bodies S"1,S"2,...,S"d in R^d, there exists a hyperplane (a generalized ham-sandwich cut) splitting the volume of S"i as (@b"i,1-@b"i) for all i. Steiger and Zhao (...
Dynamic 3-sided planar range queries with expected doubly-logarithmic time
The Priority Search Tree is the classic solution for the problem of dynamic 2-dimensional searching for the orthogonal query range of the form [a,b]x(-~,c] (3-sided rectangle). It supports all operations in logarithmic worst case complexity in both main ...
Tail variance of portfolio under generalized Laplace distribution
Many popular downside risk measures such as tail conditional expectation and conditional value-at-risk only characterize the tail expectation, but pay no attention to the tail variance beyond the value-at-risk. This is a severe deficiency of risk ...
Comments