In recent decades, society has faced a growing number of terrorist and criminal attacks. Advances in technology and the growing use of the Internet have improved people's quality of life and brought many comforts, but they have also had harmful consequences in the form of cybercrime. Indeed, the Internet gives terrorists and criminals a convenient and effective means of establishing and advertising illegal activities. Social media in particular facilitate new international collaborations and make it easier to spread information all over the world with a reduced risk of exposure. Furthermore, the scarcity of evidence about the extent, role and nature of organized crime groups in cyberspace obstructs the development of effective countermeasures against these groups' activities.
Twitter, for example, is one of the most popular social networks and is used to advertise illegal activities and to radicalize people. On Twitter, users, organizations, public figures and news channels publish new tweets every day. Unfortunately, because of the huge amount of data generated and the heterogeneous nature of the data (incompleteness, fuzzy boundaries and dynamics), discovering such cyber-criminals is a hard task. In this context, social network analysis is an effective method for identifying central players, key actors and sub-groups by exploring the connections among them. Social network analysis can therefore be considered a starting point for analyzing groups of suspicious users and identifying emergent behaviors, which can in turn guide the definition of ad hoc counteractions against these activities.
Consequently, this task aims to support the analysis of similarities among suspicious online users by analyzing the relationships among them. The ongoing analysis includes discovering similarities among these users' profiles as well as possible behavioral patterns, by exploiting and extending existing approaches and combining them with well-known clustering techniques.
The activity deals with social media analysis in the context of organized crime (OC) and terrorist networks (TNs). It aims to define a methodological process that supports the discovery of similarities among suspicious online users' profiles as well as possible behavioral patterns.
More specifically, it focuses on three objectives:
- Meta-model definition related to OC/TN: specific meta-data considered relevant for identifying relations with OC/TN are identified and analyzed; in addition, the normalization of missing meta-data values is addressed.
- Identification of similarities among suspicious users: the similarity between pairs of users is evaluated by combining multiple similarity criteria based on the comparison of the users' meta-data.
- Behavioral patterns of suspicious users: groups are discovered and emerging patterns of suspicious users are identified by using clustering techniques combined with association rules.
In particular, specific meta-data related to OC/TN are considered, and a penalizing mechanism for dealing with missing values has been defined. The TF-IDF technique is then combined with the Jaccard and cosine measures as similarity criteria between suspicious users, and a specific extension is defined. Furthermore, clustering and association-rule methods are integrated to support the identification of groups of similar users with particular relationships to OC or TNs. Finally, a first software tool is going to be implemented to evaluate the proposed approach, and it will be tested on Twitter.
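The combination of TF-IDF with the Jaccard and cosine measures can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the smoothed IDF formula, the equal weighting of the two criteria, and the `missing_score` penalty (a missing field simply contributes a fixed low score instead of being skipped) are all assumptions made for illustration, since the text does not specify these details.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.
    Assumed weighting: relative term frequency times (log(N / df) + 1)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * (math.log(n / df[t]) + 1.0)
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def jaccard(a, b):
    """Jaccard similarity between two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def combined_similarity(tags_a, tags_b, vec_a, vec_b, missing_score=0.0):
    """Average of a Jaccard criterion (e.g. hashtags) and a cosine criterion
    (TF-IDF text vectors). A value of None marks a missing field; the
    hypothetical penalty replaces that criterion with missing_score."""
    tag_sim = (missing_score if tags_a is None or tags_b is None
               else jaccard(tags_a, tags_b))
    text_sim = (missing_score if vec_a is None or vec_b is None
                else cosine(vec_a, vec_b))
    return (tag_sim + text_sim) / 2

# Toy data: three users' tokenized texts (purely illustrative).
docs = [["attack", "plan", "city"],
        ["attack", "plan", "money"],
        ["football", "match"]]
vecs = tfidf_vectors(docs)
sim_01 = combined_similarity({"#a", "#b"}, {"#b", "#c"}, vecs[0], vecs[1])
sim_02 = combined_similarity({"#a", "#b"}, None, vecs[0], vecs[2])
```

In this toy setup the first two users share vocabulary and one hashtag, so `sim_01` is positive, while the third user has no hashtags and no shared terms, so `sim_02` falls to the penalty value.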
Online social networks are becoming a huge information source, as they contain a lot of data about users' profiles. These data can be used to characterize single users in terms of interests and moods, but also the connections among users and their collective behaviors. They can also be used to identify particular behaviors in the context of OC and TNs.
In this perspective, the activity consists of the steps represented in Figure 1. Starting from an input set of users already classified as “suspicious”, the “Meta-model definition” block uses an approach based on a set of reference keywords related to criminality to support the selection and extraction of data linked with organized crime and terrorist networks; furthermore, four typologies of data and a basic penalizing mechanism for dealing with missing values are proposed. The “User Profiles Similarity Identification” block addresses the similarity between pairs of users on the basis of a combination of similarity criteria already available in the literature. Then, the “Patterns Discovering in OC and TNs” block deals with the clustering of suspicious users and the extraction of patterns on the basis of a Betweenness Centrality model. The methodology produces a set of diagrams and analyses of groups of users potentially related to OC and TNs.
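To make the notion of betweenness centrality concrete, the following Python sketch computes it for a small undirected, unweighted user graph using Brandes' algorithm. How the project actually builds the graph and uses the scores is not stated here; the edge list, the user names, and the idea of linking users whose profile similarity exceeds some threshold are assumptions made only for this example.

```python
from collections import deque

def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph given as
    {node: set_of_neighbors}. Returns unnormalized scores."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # dependency of s on each node
        while stack:                     # accumulate from the farthest nodes back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}

# Hypothetical edges, e.g. pairs of users judged sufficiently similar.
edges = [("u1", "u2"), ("u2", "u3"), ("u1", "u3"), ("u3", "u4"), ("u4", "u5")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

scores = betweenness_centrality(adj)
bridges = sorted(scores, key=scores.get, reverse=True)[:2]
```

In this toy graph, `u3` and `u4` lie on every shortest path between the triangle `u1`-`u2`-`u3` and `u5`, so they receive the highest scores; users of this kind are candidates for the "key actor" role discussed above.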
Andrea Tundis, Technische Universität Darmstadt