1. Introduction
A blog is “a web page where a blogger ‘logs’ all the other web pages he finds interesting” [1]. In this vein, blogs by definition link to other sources of information, usually to other blogs [2], establishing social networks [3]. Sigala [4] mentioned that “blogs create and maintain strong online communities through their social ties tools, such as blogrolls, permalinks, comments and trackbacks”. Nowadays, due to their connectivity, blogs receive extensive attention [5].
In the conversational blogosphere [6], some blogs focus on specific subjects, such as health blogs, travel blogs, gardening blogs, fashion blogs, education blogs, niche blogs, quizzing blogs, legal blogs etc. [7]. Tech blogs are “blogs that focus on information technology innovation and the high-tech industry” ([8], p. 40). They may be personal journals, where bloggers express personal experiences and reflections, provide commentary and opinions, and articulate ideas [9,10]; they may belong to experts who use them as a means to broadcast their expertise to a large audience [11]; or they may belong to organizations and enterprises [12] that are rushing into them in order to interact meaningfully with their potential customers [13,14], to increase the visibility of their products in an inexpensive manner [15] and, finally, to integrate them into their marketing strategies [16].
Technorati.com, the most popular real-time search engine dedicated to the blogosphere, tracked 42,692 blogs involving technology (info tech and gadgets) as of 21-06-2013. This category of blogs ranks second among the top topics on blogging. However, little research effort has been devoted to investigating tech blogs [8,17].
Social network analysis is a theory which can help to explore the nature of interconnected unities [18]. The paper performs a topological analysis of “tech” blogs in Greece. Finding blogs’ communication patterns is important, since “the discovery of information networks among web sites or among site producers through the analysis of link counts and patterns, and exploration into motivations or contexts for linking, has been a key issue in this social science literature” ([19], p. 62). In Section 2, a definition of “tech” blogs and their role in the decision-making process is given. In Section 3, a literature review on Social Network Analysis (SNA) is provided. Section 4 describes the methodology used and, finally, in Section 5 results and discussion about the network are presented. The originality of the paper lies in the study of “tech” blog interconnections and the provision of insights on their network using SNA for the first time, while the limitations of the study are associated with country-specific issues.
2. “Tech” Blogs and the Decision Making Process
On the societal level, the use of collaborative technologies such as blogs leads to the online creation of communities based on similar interests, in which people communicate rapidly and conveniently, share information and keep in touch with each other [20]. This community creation can affect the entire communications process [17,21] and strongly influences the way people relate, act and make decisions [21]. Blogs provide a new medium where bloggers and readers come together to offer and seek impartial information. Consumers perceive information from blogs as more credible and trustworthy than information from traditional forms of media and, if they interact in blogs over a long period of time, they trust the opinions of other users and take them into consideration when making purchase decisions [22-25]. In this vein, blogs have created an additional channel for product and service recommendations and endorsements between people who have never met [26,27]. Moreover, marketers believe that blogs may be an important tool for getting out their message or advertising their products; thus they have started creating blogs or using blogs as a way of creating a buzz for their products [28].
“Tech” blogs are emerging as an influential actor in technology innovation discourse [8] and some of them “those that do aren’t part of some proletarian information revolution” have become the tech world’s new elite [29]. “Tech” bloggers range from chief technologists at high-tech companies to independent developers and commentators [8], and from marketers to individuals who potentially create and diffuse knowledge about technology innovations, firms (e.g., Apple, Microsoft, Google, and Yahoo), and products to a broad community of other bloggers and their readers.
Blogs can be viewed as a new word-of-mouth (WOM) channel [17] and, along with interpersonal influence, are ranked as the most important information source when a consumer is making a purchase decision [30]. “Tech” bloggers observe and comment on tech companies’ strategies, on competitive moves taking place among them and on product launches [8]. Elite “tech” bloggers have the power to position products, services, or ideas quickly and effectively in the blog’s community or even for an international audience [31] and could be used by marketers as an additional channel for advertisement. On the other hand, a single complaining “tech” blogger can bring together other dissatisfied consumers who could use the blog as a vehicle to organize new product boycotts or to reveal insider information [17].
3. Social Network Analysis
Social networks can be defined as “a collectivity of individuals among whom exchanges take place that are supported only by shared norms of trustworthy behaviour” [32]. People’s tendency to come together, to connect through various social relationships or exchanges and to form networks is inherent in the structure of society [33,34]. Nowadays, as technology is increasingly incorporated into people’s day-to-day social relationships [35], new online social networks emerge linking people, organizations, and knowledge, and new ties are developed among people sharing interests [36]. Social networks are now vehicles wherein various types of knowledge are brought together and new knowledge is created, thus expanding or sustaining the network and its output over time [37].
Social Network Analysis is a powerful sociological methodology for “explaining variances in resources, social behavior and socio-economic outcomes” ([38], p. 4) through a structural interpretation of human interaction [18]. In its application, it identifies patterns of interaction of individuals and knowledge flows within a social network. SNA makes the invisible work visible [39].
SNA is rooted in matrix algebra and graph theory, in the concepts of nodes and connections [40]. Nodes are the social actors, tech blogs in this case, and “connections” refer to channels of communication [41]. SNA studies patterns of relations, meaning that “while relations are measured as existing between pairs of nodes, understanding the effect and meaning of a tie between two nodes requires taking into account the broader patterns of ties within the network” ([41], p. 15). When used to mine a network, SNA can help to identify boundary spanners, gatekeepers, knowledge bottlenecks, under- and over-utilized nodes [37], and central nodes that can act as hubs, as leaders, or as bridges between different communities [42]. Discovering inherent community structures can help understand networks deeply and reveal interesting properties shared by the actors [43].
4. Methodology
This research uses the Google Blog Search Engine to record blogs from Greece tagged with the word “technology”. The blogs were visited and, if they were “tech” blogs, links from their blogrolls to other blogs were recorded by snowball sampling in November 2012. Snowball sampling is one of the preferred methods of sampling in social network studies [3]. Snowball sampling ended when no new tech blogs were located. Finally, 141 blogs were recorded. Blogrolls were used as they present formal interconnections between blogs. These interconnections take the form of suggestions to potential users [44]. The “blogroll” occupies a permanent position on the blog’s home page and consists of blogs that the blogger frequently reads, especially admires or shares interests with, and thus offers links to these blogs [45-47].
For the construction of the network, the adjacency matrix A was created. In this case, A is a 141 × 141 nonsymmetric binary data matrix, where 1 is placed in cell Aij if blog i is linked by an arc to blog j through the blogroll; otherwise 0 is placed in the cell. The next step involves the construction of the “tech” blog interconnection network. It is a directed graph where blogs are denoted as nodes and incoming links as directed arrows (arcs).
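As an illustration of this construction, the following minimal sketch builds a binary, nonsymmetric adjacency matrix from blogroll lists. The blog names and the `blogrolls` dictionary are hypothetical toy data, not the recorded Greek blogs, and `build_adjacency` is a name introduced here for illustration only.

```python
# Hypothetical toy data: blog name -> blogs listed in its blogroll.
blogrolls = {
    "blog_a": ["blog_b", "blog_c"],
    "blog_b": ["blog_c"],
    "blog_c": [],
    "blog_d": ["blog_a"],
}


def build_adjacency(blogrolls):
    """Return (names, A) where A[i][j] = 1 if blog i links to blog j."""
    names = sorted(blogrolls)
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    A = [[0] * n for _ in range(n)]
    for source, targets in blogrolls.items():
        for target in targets:
            # Assumption: self-links and links to unrecorded blogs are ignored.
            if target in index and target != source:
                A[index[source]][index[target]] = 1
    return names, A


names, A = build_adjacency(blogrolls)
for name, row in zip(names, A):
    print(name, row)
```

Row i lists the out-links of blog i and column j collects the in-links of blog j, so the matrix is in general not equal to its transpose.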
First, connectivity issues were investigated, locating the components comprising the blog network. A directed network is strongly connected when there exists a directed path between any ordered pair of nodes, and weakly connected when a path connects every pair of nodes once the direction of the arcs is ignored. If neither holds, an effort is made to locate maximal subnetworks bearing the above properties. A disconnected blog network implies either fragmentation in views and opinions or immaturity in the whole blogging process [48]. The most straightforward approach is to find the largest component of the network, that is, the maximal connected sub-network containing the largest number of nodes, provided that such a sub-network exists.
Traditionally, nodes are investigated in a network regarding their overall position, with respect to all other nodes. Thus an effort was made to find which (if any) nodes are more important than others. A common technique is to measure the centrality index of nodes and compare all nodes according to this index. The paper uses three different measurements of centrality, namely degree, closeness and betweenness centrality, and discusses the results. A PageRank [49] calculation also provides a ranking of the most prominent nodes.
However, in recent literature, there is a shift in the perspective from which a network is examined, moving away from individual nodes towards more general, topological issues that hold over the whole network. Newman [50] has assembled a set of metrics regarding the topology of a simple, undirected network. This approach, which has been reported as the most important and concise, is applied in this paper to the derived undirected blog network. More specifically, link density, degree, distance, diameter, eccentricity, clustering coefficient, assortativity coefficient and algebraic connectivity are calculated.
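The search for weakly connected components can be sketched as a breadth-first search that ignores arc direction. This is a minimal illustration on a hypothetical six-node matrix, not the procedure or software actually used in the study; `weak_components` is an illustrative name.

```python
from collections import deque


def weak_components(A):
    """Weakly connected components of a directed 0/1 matrix, largest first."""
    n = len(A)
    # Neighbours when arc direction is ignored.
    nbrs = [[j for j in range(n) if A[i][j] or A[j][i]] for i in range(n)]
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        component = []
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in nbrs[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical toy network: arcs 0->1, 1->2, 3->4; node 5 is isolated.
A = [[0] * 6 for _ in range(6)]
for i, j in [(0, 1), (1, 2), (3, 4)]:
    A[i][j] = 1

components = weak_components(A)
largest = components[0]
isolates = [c[0] for c in components if len(c) == 1]
print(largest, isolates)
```

The first element of the returned list is the largest component, and single-node components correspond to isolated blogs.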
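Two of these rankings can be illustrated with a short sketch: in-degree read directly from the adjacency matrix, and PageRank by power iteration. The damping factor of 0.85, the fixed iteration count and the uniform treatment of nodes without out-links are assumptions made here for illustration; the paper does not state the settings used. Closeness and betweenness are omitted for brevity.

```python
def in_degrees(A):
    """Number of arcs pointing to each node (column sums of A)."""
    n = len(A)
    return [sum(A[i][j] for i in range(n)) for j in range(n)]


def pagerank(A, damping=0.85, iterations=100):
    """PageRank by power iteration; damping and iterations are assumed values."""
    n = len(A)
    out_deg = [sum(row) for row in A]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[i] for i in range(n) if out_deg[i] == 0)
        new = [(1 - damping) / n + damping * dangling / n for _ in range(n)]
        for i in range(n):
            if out_deg[i]:
                share = damping * rank[i] / out_deg[i]
                for j in range(n):
                    if A[i][j]:
                        new[j] += share
        rank = new
    return rank


# Hypothetical toy network: nodes 0, 1 and 3 link to node 2; node 2 links to 0.
A = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
indeg = in_degrees(A)
ranks = pagerank(A)
print(indeg, [round(r, 3) for r in ranks])
```

In this toy case node 2 has the highest in-degree and the highest PageRank, while node 0 ranks above nodes 1 and 3 despite a single in-link, because that link comes from the most prominent node. This illustrates how the two measurements can produce different orderings.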
Link Density, S, is the ratio of the actual number of links, L, to the maximum possible number of links that could exist in a network. In an undirected network with N nodes, the maximum possible number of links is exactly
$$L_{max} = \frac{N(N-1)}{2}$$
which is the case of a complete graph where each node is connected to all other nodes of the network. Thus, link density is calculated as:
$$S = \frac{2L}{N(N-1)}$$
and can take values in [0, 1]. In a directed network the maximum possible number of links doubles, so
$$S = \frac{L}{N(N-1)}$$
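As a quick check of the definition, the sketch below computes link density as the number of links divided by the maximum possible number, N(N − 1) for a directed network and N(N − 1)/2 for an undirected one. The function name `link_density` is illustrative; the values N = 80 and L = 149 are those reported in Section 5 for the largest directed component.

```python
def link_density(n_nodes, n_links, directed=False):
    """Ratio of existing links to the maximum possible number of links."""
    max_links = n_nodes * (n_nodes - 1)
    if not directed:
        max_links //= 2  # each unordered pair can hold only one link
    return n_links / max_links


# Largest component as a directed network: N = 80 nodes, L = 149 arcs.
density_directed = link_density(80, 149, directed=True)
print(f"{density_directed:.4f}")
```

The result, 149/6320, is roughly 2.4%, in line with the density of about 2.3% discussed in Section 5.2.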
The Degree, di, of node vi is the number of links emanating from vi. In directed networks we have to deal with in-degree and out-degree (links going to a node and links leaving a node, respectively). Since every link in an undirected network contributes to the degree of two nodes, the average degree of the network can be easily calculated as:
$$\bar{d} = \frac{1}{N}\sum_{i=1}^{N} d_i = \frac{2L}{N}$$
The Distance between two nodes vi and vj is the length of the shortest path that connects vi to vj. The average distance of a network is the average of all distances in this network.
The Diameter, D, of a network is the longest distance over all pairs of nodes.
The Eccentricity of a node is the largest distance from this node to any other node in the network. All node eccentricities can be averaged yielding the average eccentricity of the network.
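The distance-based definitions above can be sketched with breadth-first search on an undirected adjacency list. The sketch assumes a connected network (as is the case for a single component); the four-node path used as input is a hypothetical example, and the function names are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_metrics(adj):
    """Average distance, diameter and eccentricities of a connected network."""
    eccentricity = {}
    total, pairs = 0, 0
    for u in adj:
        dist = bfs_distances(adj, u)
        eccentricity[u] = max(dist.values())
        total += sum(dist.values())
        pairs += len(dist) - 1  # ordered pairs (u, v) with v != u
    return {
        "average_distance": total / pairs,
        "diameter": max(eccentricity.values()),
        "average_eccentricity": sum(eccentricity.values()) / len(eccentricity),
        "eccentricity": eccentricity,
    }


# Hypothetical toy network: the path 0 - 1 - 2 - 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
metrics = distance_metrics(adj)
print(metrics)
```

For this path the end nodes have eccentricity 3 and the inner nodes eccentricity 2, so the diameter is 3 and the average eccentricity is 2.5.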
The Clustering Coefficient, CCi, of node vi, is the ratio of the actual number of links between vi’s neighbours to the maximum possible number of links in this neighbourhood. If a node has a large clustering coefficient, then its neighbours tend to form highly interconnected clusters. If vi has exactly K neighbours which interconnect with M links between them, then CCi is calculated as:
$$CC_i = \frac{2M}{K(K-1)}$$
The average on all CC’s for all the nodes of a network is the average clustering coefficient of the network.
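A minimal sketch of this calculation for an undirected network follows, using the usual form CCi = 2M / (K(K − 1)). Assigning a coefficient of zero to nodes with fewer than two neighbours is a convention assumed here; the toy network (a triangle with one pendant node) is hypothetical.

```python
def clustering_coefficient(adj, node):
    """Share of possible links among the neighbours of node that exist."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention for nodes with fewer than 2 neighbours
    # Count each link between two neighbours once (a < b).
    m = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return 2 * m / (k * (k - 1))


def average_clustering(adj):
    """Mean of the node clustering coefficients over all nodes."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)


# Hypothetical toy network: triangle 0-1-2 plus a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
cc = {v: clustering_coefficient(adj, v) for v in adj}
avg_cc = average_clustering(adj)
print(cc, avg_cc)
```

Node 0 has K = 3 neighbours with M = 1 link among them, giving 1/3, while nodes 1 and 2 each have both neighbours linked, giving 1.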
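The Assortativity Coefficient, R, of an undirected network, takes values in [−1, 1] and denotes the degree-similarities between neighbouring nodes. When R is less than zero, nodes tend to connect with other nodes of dissimilar degrees (disassortative networks), while values close to zero indicate connections between nodes of arbitrary degrees. When R is greater than zero and approaching one, nodes tend to connect with other nodes with similar degrees (assortative networks). Calculation of R is as follows:
$$R = \frac{L^{-1}\sum_{i} j_i k_i - \left[L^{-1}\sum_{i} \tfrac{1}{2}(j_i + k_i)\right]^2}{L^{-1}\sum_{i} \tfrac{1}{2}(j_i^2 + k_i^2) - \left[L^{-1}\sum_{i} \tfrac{1}{2}(j_i + k_i)\right]^2}$$
where ji and ki are the degrees of the nodes at the ends of the i-th link, and i = 1, ..., L.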
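The calculation can be sketched from an edge list, assuming the standard degree-correlation form of the coefficient for undirected networks (the correlation between the degrees ji and ki found at the two ends of each link). The star network used as input is a hypothetical example, and the coefficient is undefined when all nodes have the same degree.

```python
def assortativity(edges):
    """Degree assortativity of an undirected network given as an edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    n_links = len(edges)
    # Averages over links of j*k, (j + k)/2 and (j^2 + k^2)/2.
    mean_prod = sum(degree[u] * degree[v] for u, v in edges) / n_links
    mean_half_sum = sum(0.5 * (degree[u] + degree[v]) for u, v in edges) / n_links
    mean_half_sq = sum(0.5 * (degree[u] ** 2 + degree[v] ** 2) for u, v in edges) / n_links
    denominator = mean_half_sq - mean_half_sum ** 2
    if denominator == 0:
        raise ValueError("undefined: all link ends have the same degree")
    return (mean_prod - mean_half_sum ** 2) / denominator


# Hypothetical toy network: a star with centre 0 and three leaves.
star = [(0, 1), (0, 2), (0, 3)]
r_star = assortativity(star)
print(r_star)
```

In a star every link joins the highest-degree node to a node of degree one, so the coefficient reaches its minimum of −1, the extreme case of high-degree nodes linking to low-degree nodes.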
The Algebraic Connectivity of a graph, studied by Chung [51], is the second smallest eigenvalue of its Laplacian Matrix. The Laplacian Matrix of a graph G with N nodes is the N × N matrix Q = Δ − A, where A is the adjacency matrix of G and Δ = diag(di). The larger the algebraic connectivity, the more difficult it is to cut the graph into separate components.
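A small worked example may make the definition concrete. It uses two hypothetical three-node graphs, a path and a triangle, which are not part of the studied network.

```latex
% Path v_1 - v_2 - v_3: degrees (1, 2, 1)
Q_{path} = \Delta - A =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} -
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} =
\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}

\det(Q_{path} - \lambda I) = -\lambda(\lambda - 1)(\lambda - 3)
\quad\Rightarrow\quad \lambda \in \{0, 1, 3\}

% Triangle (the path plus the link v_1 - v_3): degrees (2, 2, 2)
Q_{triangle} =
\begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix}
\quad\Rightarrow\quad \lambda \in \{0, 3, 3\}
```

The second smallest eigenvalue is 1 for the path and 3 for the triangle. Removing a single link disconnects the path but not the triangle, which is consistent with the larger value indicating a network that is harder to cut.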
5. Results and Discussion
A drawing of the blog network, produced with Gephi using the spring-embedding algorithm of Force-Atlas, is provided in Figure 1. The network is drawn as undirected in this case.
From Figure 1, it is obvious that the network is disconnected. Nodes are divided into 4 components (on the upper side), where the component in the middle is the largest one, formed by about 57% of the total number of nodes. It is very interesting to note the large number of isolated nodes (about 34% of the total number of nodes), grouped together in the lower side of Figure 1. It is likely that these blog-nodes are just start-ups, probably managed by single, wannabe bloggers. The large proportion of isolates shows immaturity in this specific blogosphere. Similar ideas hold also for nodes not belonging to the largest connected component. In the following, attention will be restricted to the largest component, which is drawn in Figure 2 (as a directed network) using the same drawing software.
The network in Figure 2 is a weakly-connected network with N = 80 nodes and L = 149 arcs.
5.1. Centrality Measurements
In-degree, in-closeness and betweenness centrality are calculated and the PageRank algorithm is applied on the network of Figure 2. In Table 1 the relevant results are presented for the 15 most prominent nodes, together with their respective node identifiers (id). All values, apart from in-degrees, are normalized in [0, 1] and were computed with [52].
The actual labels of nodes are not so important, at least to non-Greeks (although they are available upon request by the authors). However, it is interesting to note that quite different rankings are produced by different measurements. Although in-degree, as a measurement of prestige, is a quite important metric in directed networks, PageRank seems to give more consistent results, at least to anyone who can identify the actual blogs. Such different rankings with respect to centralities show that this network is very far from being similar to any random network with similar characteristics.
More particularly, the low maximum in-degree (6) again indicates a kind of immaturity in this social network. It is generally expected that, through time, some nodes should emerge as “mostly respected” or “highly appreciated” by others who point to them. These nodes should be considered as “tech gurus” in this case and collect many “positive” in-arcs.
In-closeness centrality, betweenness centrality and PageRank actually need node labels in order to be properly interpreted. However, some insights will be given in the next sub-section, where topological measurements for the blog network are calculated.
5.2. Topology Measurements
In Table 2, topological calculations are presented on the largest component of the tech-blogs social network. BND is the original, directed network and BNU is the derived, undirected network.
Table 1. Most prominent nodes by centrality.
The blog network seems to have a normal, tending to sparse, density for a real-life social network (2.3% of the maximum possible number of links exist; this doubles for the undirected case). Denser networks emerge in more congested settings, especially when they are created by automatic procedures. The average degree (close to 4) is rather low, a result that can be interpreted again by the immaturity presented in this network.
More importantly, the diameter of 9 (7), together with the average shortest path (close to 4), is in compliance with the famous six-degrees-of-separation principle. The concept of six degrees of separation was advocated by Milgram [53] and shows that people have a narrow circle of acquaintances. After a series of social experiments, Travers & Milgram [54] suggested that all people in the USA are connected through about 6 intermediate acquaintances, and Korte & Milgram [55] claimed that two randomly chosen human beings can be connected by only a short chain of intermediate acquaintances. This principle applies to many networks other than networks of friends [56]. On average, anyone who browses in this network can reach another node within 3-4 clicks. So it seems that this network tends to form like many other real-life social networks, regarding its searchability.
The clustering coefficient, along with the mean shortest path, can indicate a “small world” phenomenon. It indicates how nodes are embedded in their neighborhood. The average gives an overall indication of the clustering of the network. For the clustering coefficient to be meaningful, it should be significantly higher than in a version of the network where all of the edges have been shuffled. By creating a random Erdős-Rényi network with the same number of nodes and edges, the average clustering coefficient is calculated, which in this case is 0.029, significantly smaller than that of the blog network, directed or undirected. Hence it can be deduced that the blog network under investigation is definitely a small world.
Negative assortativity results generally mean that the notion of “homophily” does not exist in a network. A value of −0.22 is a rather large negative value in this case, meaning that bloggers do not out-link to other “important” bloggers. Instead, they prefer to link to nodes with smaller degrees. Again, this can be seen as a quality measurement that shows that in this network there seem to be negative feelings among important bloggers and a tendency to link to nodes with smaller prestige. There seems to be a growing competition between bloggers in this context.
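The comparison with a random network can be sketched as follows. The sketch draws a random graph with a fixed number of nodes and links (the G(n, m) variant of the Erdős-Rényi model) and computes its average clustering coefficient. Using 80 nodes and 149 links is an assumption based on the directed counts reported for the largest component; the link count of the derived undirected network, the random seed and the zero convention for low-degree nodes are likewise assumptions, so the output is not expected to reproduce the value of 0.029 reported above.

```python
import random


def random_graph(n, m, seed=0):
    """Undirected random graph with n nodes and m distinct links, G(n, m)."""
    rng = random.Random(seed)
    all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    adj = {i: set() for i in range(n)}
    for u, v in rng.sample(all_pairs, m):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def average_clustering(adj):
    """Mean clustering coefficient; nodes with < 2 neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k >= 2:
            m = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
            total += 2 * m / (k * (k - 1))
    return total / len(adj)


# Assumed sizes: 80 nodes and 149 links, as in the largest component.
baseline = random_graph(80, 149, seed=0)
baseline_cc = average_clustering(baseline)
print(f"random baseline clustering: {baseline_cc:.3f}")
```

In practice the baseline would be averaged over many random draws, and the observed clustering coefficient would be judged against that average rather than against a single realization.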
Finally, a score of 0.193 for algebraic connectivity shows that this network tends to become stable, since positive values in this metric show that it becomes relatively difficult to disconnect. This metric is expected to become larger, as time passes by.
6. Conclusions
The paper makes a topological analysis of Greek “tech” blogs. “Tech” blogs were recorded and their incoming links were reported through their blogrolls. The resulting network is analyzed using a number of well-established metrics from social network analysis theory, like centrality, the average shortest path length, the clustering coefficient, the assortativity coefficient and the algebraic connectivity. Results indicate that the network is a small world.
Usually hyperlinks are important to bloggers in order to maintain some communication with each other and to determine their digital territory [57]. However, in the network of “tech” blogs, a large number of isolated nodes exist, showing immaturity in this specific blogosphere. Regarding the largest connected component of the “tech” blogosphere, the low maximum in-degree and the sparse density for a real-life social network again indicate a kind of immaturity. It is expected that, through time, elite blogs will emerge. Elite blogs may have higher impact on the provision of information, have better access to high-tech firms and special knowledge of upcoming events, happenings and products, and gain more advertisements and privileges, such as free trials and samples distributed by technology firms, much like journalists from well-established “tech” magazines. The notion of “homophily” does not exist in the network of “tech” blogs and bloggers prefer to link to nodes with smaller degrees. Perhaps “tech” bloggers are afraid of the growing competition and thus have reservations.
The limitations of the study associated with the specific methodology include the sampling procedure and the use of incoming links through blogrolls. Blogrolls are indicative of the blogger’s networks; however, in a broader view, incoming links through blog posts should also be taken into consideration.
Country-specific limitations also exist and are associated with Internet infrastructure and cultural issues. In Greece, the ratio of internet users is relatively small and blogging started to expand late, during 2002-2003, which may pose an important limitation. Thus it would be very interesting to compare the network with other “tech” blog networks in order to investigate whether bloggers’ tendency for isolation depends on fragmentation in views and opinions, immaturity or competitive attitude.