Technical Report CSI-0019

Concise Specification and Algorithms for Generating Families of Test Networks and Graphs for Benchmarking against Real World Big Data

K. A. Hawick

Archived: 2017

Abstract

Network science finds applications in many areas of science, engineering, the social sciences and other fields where data, and particularly big data, can be analysed using a graph or network based model. Network science has progressed in recent years through improvements in the implementation and performance of graph analysis algorithms and software. It is useful to be able to parameterise families of test graphs or networks with appropriate size and complexity properties so that the scalability and the practical performance or memory limitations of particular procedures and analysis algorithms can be verified numerically. We review common families of network graphs, including random, scale-free, preferential attachment and hybrid models, that can be generated according to size and other parameters to provide statistical benchmark sets for studying various analysis algorithms. We also review various graph and network data storage formats and consider the practicalities of managing such data in the context of computational experiments. We review concise specifications of various families of graphs, including scale-free, preferential attachment, and particular named graph and network models, giving a consistent notation together with algorithms and code fragments for generating test data sets of these families. We discuss the value of having test data of varying size that comes from parameterised formulae in this manner for benchmarking and for comparing properties with real world big data sets of unknown categorisation and characteristics.

Keywords: Network science; big data; graphs; benchmark; scale-free network; preferential attachment.

Full Document Text: Not yet available.

Citation Information: BibTeX database for CSI Notes.

BibTeX reference:

@TechReport{CSI-0019,
        Title = {Concise Specification and Algorithms for Generating Families of Test Networks and Graphs for Benchmarking against Real World Big Data},
        Author = {K. A. Hawick},
        Institution = {Computer Science, University of Hull},
        Year = {2017},
        Address = {Cottingham Road, Hull, HU6 7RX},
        Month = {August},
        Number = {CSI-0019},
        Type = {Computational Science},
        Keywords = {Network science; big data; graphs; benchmark; scale-free network; preferential attachment},
        Owner = {kahawick},
        Timestamp = {2017.07.16}
}