Nishanth Sastry

Professor Nishanth Sastry


About

Areas of specialism

Internet Data Science; Social Networks; Social Computing; Computational Social Science; Computer Networks; Edge Computing; Data Analysis; Online Harms; Web Science; Privacy on the Web

Publications

Haris Bin Zia, Aravindh Raman, Ignacio Castro, Ishaku Hassan Anaobi, Emiliano De Cristofaro, Nishanth Sastry, Gareth Tyson (2022)Toxicity in the Decentralized Web and the Potential for Model Sharing, In: Proceedings of the ACM on measurement and analysis of computing systems6(2)35pp. 1-25 ACM

The "Decentralised Web" (DW) is an evolving concept, which encompasses technologies aimed at providing greater transparency and openness on the web. The DW relies on independent servers (aka instances) that mesh together in a peer-to-peer fashion to deliver a range of services (e.g. micro-blogs, image sharing, video streaming). However, toxic content moderation in this decentralised context is challenging. This is because there is no central entity that can define toxicity, nor a large central pool of data that can be used to build universal classifiers. It is therefore unsurprising that there have been several high-profile cases of the DW being misused to coordinate and disseminate harmful material. Using a dataset of 9.9M posts from 117K users on Pleroma (a popular DW microblogging service), we quantify the presence of toxic content. We find that toxic content is prevalent and spreads rapidly between instances. We show that automating per-instance content moderation is challenging due to the lack of sufficient training data available and the effort required in labelling. We therefore propose and evaluate ModPair, a model sharing system that effectively detects toxic content, achieving an average per-instance macro-F1 score of 0.89.

Suparna De, Shalini Jangra, Vibhor Agarwal, Jon Johnson, Nishanth Sastry (2023)Biases and Ethical Considerations for Machine Learning Pipelines in the Computational Social Sciences, In: Ethics in Artificial Intelligence: Bias, Fairness and Beyondpp. 99-113 Springer Nature Singapore

Computational analyses driven by Artificial Intelligence (AI)/Machine Learning (ML) methods to generate patterns and inferences from big datasets in computational social science (CSS) studies can suffer from biases during the data construction, collection and analysis phases as well as encounter challenges of generalizability and ethics. Given the interdisciplinary nature of CSS, many factors such as the need for a comprehensive understanding of different facets such as the policy and rights landscape, the fast-evolving AI/ML paradigms and dataset-specific pitfalls influence the possibility of biases being introduced. This chapter identifies challenges faced by researchers in the CSS field and presents a taxonomy of biases that may arise in AI/ML approaches. The taxonomy mirrors the various stages of common AI/ML pipelines: dataset construction and collection, data analysis and evaluation. By detecting and mitigating bias in AI, an active area of research, this chapter seeks to highlight practices for incorporating responsible research and innovation into CSS practices.

Online conversation understanding is an important yet challenging NLP problem which has many useful applications (e.g., hate speech detection). However, online conversations typically unfold over a series of posts and replies to those posts, forming a tree structure within which individual posts may refer to semantic context from higher up the tree. Such semantic cross-referencing makes it difficult to understand a single post by itself; yet considering the entire conversation tree is not only difficult to scale but can also be misleading as a single conversation may have several distinct threads or points, not all of which are relevant to the post being considered. In this paper, we propose a Graph-based Attentive Semantic COntext Modeling (GASCOM) framework for online conversation understanding. Specifically, we design two novel algorithms that utilise both the graph structure of the online conversation and the semantic information from individual posts to retrieve relevant context nodes from the whole conversation. We further design a token-level multi-head graph attention mechanism that attends differently to individual tokens from the selected context utterances for fine-grained conversation context modeling. Using this semantic conversational context, we re-examine two well-studied problems: polarity prediction and hate speech detection. Our proposed framework significantly outperforms state-of-the-art methods on both tasks, improving macro-F1 scores by 4.5% for polarity prediction and by 5% for hate speech detection. The GASCOM context weights also enhance interpretability.

Emeka Obiodu, Abdullahi Abubakar, Aravindh Raman, Pushkal Agrawal, Tooba Faisal, Nishanth Sastry, Hamid Aghvami (2022)How Special is New Year Eve Traffic? Insights from Four Years 3G/4G/5G User Measurements, In: 2022 45th International Conference on Telecommunications and Signal Processing (TSP)pp. 349-354 IEEE

The specialness of New Year eve traffic is a telecoms industry fable. But how true is it, and what's the impact on user experience? We investigate this on the four UK cellular networks, in London, on New Year eve in 2016/17, 2017/18, 2018/19 and 2019/20 (COVID cancelled 2020/21 & 2021/22). Overall, we captured 544,560 readings across 14 categories using 3G/4G/5G devices. This paper summarises our longitudinal readings into 10 observations on the nature of network performance, from a user's perspective, on special days such as New Year eve. Based on these, we confirm that mature 3G/4G networks are unable to deliver a consistent user experience, especially on atypical days. For example, on 4G, a user had a 60% chance of seeing a latency below 50 ms and a 90% chance of seeing a latency below 500 ms. If this pattern repeats in mature 5G networks, they would be inadequate to support safety-critical 5G use cases.

Wenjie Yin, Vibhor Agarwal, Aiqi Jiang, Arkaitz Zubiaga, Nishanth Sastry (2023)AnnoBERT: Effectively Representing Multiple Annotators’ Label Choices to Improve Hate Speech Detection, In: Proceedings of the International AAAI Conference on Web and Social Media17pp. 902-913

Supervised machine learning approaches often rely on a "ground truth" label. However, obtaining one label through majority voting ignores the important subjectivity information in tasks such as hate speech detection. Existing neural network models principally regard labels as categorical variables, while ignoring the semantic information in diverse label texts. In this paper, we propose AnnoBERT, a first-of-its-kind architecture integrating annotator characteristics and label text with a transformer-based model to detect hate speech. It builds unique representations of each annotator's characteristics via Collaborative Topic Regression (CTR) and integrates label text to enrich textual representations. During training, the model associates annotators with their label choices given a piece of text; during evaluation, when label information is not available, the model predicts the aggregated label given by the participating annotators by utilising the learnt association. The proposed approach displayed an advantage in detecting hate speech, especially in the minority class and edge cases with annotator disagreement. Improvement in the overall performance is the largest when the dataset is more label-imbalanced, suggesting its practical value in identifying real-world hate speech, as the volume of hate speech in the wild is extremely small on social media when compared with normal (non-hate) speech. Through ablation studies, we show the relative contributions of annotator embeddings and label text to the model performance, and test a range of alternative annotator embeddings and label text combinations.

Vibhor Agarwal, Anthony P. Young, Sagar Joglekar, Nishanth Sastry (2023)A Graph-Based Context-Aware Model to Understand Online Conversations, In: ACM transactions on the web

Online forums that allow for participatory engagement between users have been transformative for the public discussion of many important issues. However, such conversations can sometimes escalate into full-blown exchanges of hate and misinformation. Existing approaches in natural language processing (NLP), such as deep learning models for classification tasks, use as inputs only a single comment or a pair of comments depending upon whether the task concerns the inference of properties of the individual comments or the replies between pairs of comments, respectively. But in online conversations, comments and replies may be based on external context beyond the immediately relevant information that is input to the model. Therefore, being aware of the conversations’ surrounding contexts should improve the model’s performance for the inference task at hand. We propose GraphNLI, a novel graph-based deep learning architecture that uses graph walks to incorporate the wider context of a conversation in a principled manner. Specifically, a graph walk starts from a given comment and samples “nearby” comments in the same or parallel conversation threads, which results in additional embeddings that are aggregated together with the initial comment’s embedding. We then use these enriched embeddings for downstream NLP prediction tasks that are important for online conversations. We evaluate GraphNLI on two such tasks - polarity prediction and misogynistic hate speech detection - and find that our model consistently outperforms all relevant baselines for both tasks. Specifically, GraphNLI with a biased root-seeking random walk performs with a macro-F1 score of 3 and 6 percentage points better than the best-performing BERT-based baselines for the polarity prediction and hate speech detection tasks, respectively. We also perform extensive ablative experiments and hyperparameter searches to understand the efficacy of GraphNLI. This demonstrates the potential of context-aware models to capture the global context along with the local context of online conversations for these two tasks.
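
The graph walk described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the `parent` and `children` maps, the `root_bias` probability, the walk length and the mean aggregation are all assumptions chosen for illustration, and the real model may sample and aggregate differently.

```python
import random


def root_seeking_walk(parent, children, start, walk_len=4, root_bias=0.75, rng=None):
    """Collect context comments by walking the reply tree from `start`.

    At each step the walk moves to the parent with probability `root_bias`
    (hypothetical parameter) and otherwise to a random child. Moving up and
    then down again is what reaches parallel threads.
    """
    rng = rng or random.Random()
    visited = [start]
    current = start
    for _ in range(walk_len):
        up = parent.get(current)
        downs = children.get(current, [])
        if up is not None and (not downs or rng.random() < root_bias):
            current = up
        elif downs:
            current = rng.choice(downs)
        else:
            break  # isolated comment: nowhere to go
        if current not in visited:
            visited.append(current)
    return visited


def aggregate(embeddings, nodes):
    """Average the embeddings of the sampled comments (assumed aggregation)."""
    dim = len(embeddings[nodes[0]])
    return [sum(embeddings[n][i] for n in nodes) / len(nodes) for i in range(dim)]


# Toy thread: a is the root post, b replies to a, c replies to b.
parent = {"a": None, "b": "a", "c": "b"}
children = {"a": ["b"], "b": ["c"]}
context = root_seeking_walk(parent, children, "c", walk_len=2, root_bias=1.0)
```

With `root_bias=1.0` the walk climbs straight to the root; lower values let it wander into sibling replies, which is one way to trade off ancestor context against parallel-thread context.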

Ranjan Pal, Ziyuan Huang, Xinlong Yin, Sergey Lototsky, Swades De, Sasu Tarkoma, Mingyan Liu, Jon Crowcroft, Nishanth Sastry (2021)Aggregate Cyber-Risk Management in the IoT Age Cautionary Statistics for (Re)Insurers and Likes, In: IEEE internet of things journal8(9)pp. 7360-7371 IEEE

IoT-driven smart societies are modern service-networked ecosystems, whose proper functioning is hugely based on the success of supply chain relationships. Robust security is still a big challenge in such ecosystems, catalyzed primarily by naive cyber-security practices (e.g., setting default IoT device passwords) on behalf of the ecosystem managers, i.e., users and organizations. This has recently led to some catastrophic malware-driven DDoS and ransomware attacks (e.g., the Mirai and WannaCry attacks). Consequently, markets for commercial third-party cyber-risk management (CRM) services (e.g., cyber-insurance) are steadily but sluggishly gaining traction with the rapid increase of IoT deployment in society, and provide a channel for ecosystem managers to transfer residual cyber-risk post attack events. Current empirical studies have shown that such residual cyber-risks affecting smart societies are often heavy-tailed in nature and exhibit tail dependencies. This is both a major concern for a profit-minded CRM firm that might normally need to cover multiple such dependent cyber-risks from different sectors (e.g., manufacturing and energy) in a service-networked ecosystem, and a good intuition behind the sluggish market growth of CRM products. In this article, we provide: 1) a rigorous general theory to elicit conditions on (tail-dependent) heavy-tailed cyber-risk distributions under which a risk management firm might find it (non)sustainable to provide aggregate cyber-risk coverage services for smart societies and 2) a real-data-driven numerical study to validate claims made in theory assuming boundedly rational cyber-risk managers, alongside providing ideas to boost markets that aggregate dependent cyber-risks with heavy-tails. To the best of our knowledge, this is the only complete general theory to date on the feasibility of aggregate CRM.

Emeka Obiodu, Nishanth Sastry, Aravindh Raman (2020)Is it time for a 999-like (or 112/911) system for critical information services?, In: NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposiumpp. 1-6 IEEE

The nature of information gathering and dissemination has changed dramatically over the past 20 years as traditional media sources are increasingly being replaced by a cacophony of social media channels. Despite this, society still expects to disseminate its critical information via traditional news sources. Public Warning Systems (PWS) exist, but concerns about spamming users with irrelevant warnings mean that mostly only life-threatening emergency warnings are delivered via PWS. We argue that it is time for society to upgrade its infrastructure for critical information services (CIS) and that a smartphone app system can provide a standardised, less-intrusive user interface to deliver CIS, especially if the traffic for the app is prioritised during congestion periods. Accordingly, we make three contributions in this paper. Firstly, using network parameters from our longitudinal measurements of network performance in Central London (an area of high user traffic), we show, with simulations, that reserving some bandwidth exclusively for CIS could assure QoS for CIS without significant degradation for other services. Secondly, we provide a conceptual design of a 999 CIS app, which can mimic the current 999 voice system and can be built using 3GPP defined systems. Thirdly, we identify the stakeholder relationships with industry partners and policymakers that can help to deliver a CIS system that is fit for purpose for an increasingly smartphone-based society.

Dmytro Karamshuk, Frances Shaw, Julie Brownlie, Nishanth Sastry (2017)Bridging big data and qualitative methods in the social sciences: A case study of Twitter responses to high profile deaths by suicide, In: Online social networks and media1pp. 33-43 Elsevier B.V

With the rise of social media, a vast amount of new primary research material has become available to social scientists, but the sheer volume and variety of this make it difficult to access through the traditional approaches: close reading and nuanced interpretations of manual qualitative coding and analysis. This paper sets out to bridge the gap by developing semi-automated replacements for manual coding through a mixture of crowdsourcing and machine learning, seeded by the development of a careful manual coding scheme from a small sample of data. To show the promise of this approach, we attempt to create a nuanced categorisation of responses on Twitter to several recent high profile deaths by suicide. Through these, we show that it is possible to code automatically across a large dataset to a high degree of accuracy (71%), and discuss the broader possibilities and pitfalls of using Big Data methods for Social Science.

Gareth Tyson, Yehia Elkhatib, Nishanth Sastry, Steve Uhlig (2013)Demystifying porn 2.0, In: Proceedings of the 2013 conference on internet measurement conferencepp. 417-426 ACM

The Internet has evolved into a huge video delivery infrastructure, with websites such as YouTube and Netflix appearing at the top of most traffic measurement studies. However, most traffic studies have largely kept silent about an area of the Internet that (even today) is poorly understood: adult media distribution. Whereas ten years ago, such services were provided primarily via peer-to-peer file sharing and bespoke websites, recently these have converged towards what is known as "Porn 2.0". These popular web portals allow users to upload, view, rate and comment on videos for free. Despite this, we still lack even a basic understanding of how users interact with these services. This paper seeks to address this gap by performing the first large-scale measurement study of one of the most popular Porn 2.0 websites: YouPorn. We have repeatedly crawled the website to collect statistics about 183k videos, witnessing over 60 billion views. Through this, we offer the first characterisation of this type of corpus, highlighting the nature of YouPorn's repository. We also inspect the popularity of objects and how they relate to other features such as the categories to which they belong. We find evidence for a high level of flexibility in the interests of its user base, manifested in the extremely rapid decay of content popularity over time, as well as high susceptibility to browsing order. Using a small-scale user study, we validate some of our findings and explore the infrastructure design and management implications of our observations.

Pan Hui, N. Sastry (2009)Real World Routing Using Virtual World Information, In: 2009 International Conference on Computational Science and Engineering4pp. 1103-1108 IEEE

In this paper, we propose to leverage social graphs from Online Social Networks (OSN) to improve the forwarding efficiency of mobile networks, more particularly Delay Tolerant Networks (DTN). We extract community structures from three popular OSNs, Flickr, LiveJournal, and YouTube, and quantify the clustering features of each network at different levels of hierarchical resolution. We then show how community information can be used for forwarding using hints small enough to store on a mobile device. We also provide a first comparison study of the topological community structures for different types of OSNs with millions of users.

Changtao Zhong, Nishanth Sastry (2014)"Copy content, copy friends: studies of content curation and social bootstrapping on Pinterest" by Changtao Zhong and Nishanth Sastry with Ching-man Au Yeung as coordinator, In: SIGWEB newsletter : the newsletter of ACM's Special Interest Group on Hypertext and Hypermedia2014(Summer)pp. 1-6

Copying, sharing and linking have always been important for the functioning and the growth of the World Wide Web. Two recent copying trends which have emerged are social content curation and social logins. Social curation involves the copying, categorization and sharing of links and images from third party websites on the social curation website. Social logins enable the copying of user identities and their friends from an established social network such as Facebook or Twitter, onto third party websites. In this article, we chronicle our ongoing work on Pinterest, a popular image sharing website and social network. The highly active user community on Pinterest has been instrumental in making social curation a mainstream phenomenon. Interestingly, a large fraction (nearly 60%) of the users have also linked their Pinterest accounts with Facebook and have copied their Facebook friends over onto the new website. Thus, using a large dataset crawled from Pinterest, we uncover both the practices used for sharing content, as well as how the copying of friends has helped the content sharing. We find that social curation tends to copy and share hard-to-find niche interest content from websites with a low Alexa Rank or Google Page Rank, and curators with consistent updates and a diversity of interests are popular and attract more followers. On the other hand, Pinterest users can also copy friends from Facebook or Twitter. We find that this copying of friends creates a community with higher levels of social interaction; thus social logins serve as a social bootstrapping tool. But beyond bootstrapping, we also find a weaning process, where active and influential users tend to form more links natively on Pinterest and interact with native friends rather than copied friends.

Anubhab Banerjee, Nishanth Sastry, Carmen Mas Machuca (2019)Sharing Content at the Edge of the Network Using Game Theoretic Centrality, In: 2019 21st International Conference on Transparent Optical Networks (ICTON)2019-pp. 1-4 IEEE

Content Delivery Networks aim at delivering the desired content to each user at minimum delay and cost. To tackle this problem, the content placement problem considering available cache locations has been widely studied. However, this paper addresses this problem by taking advantage of existing but still underused Wi-Fi links. Our study considers caching content in user homes and sharing it among neighbours via Wi-Fi links. To maximize energy savings and reduce delays, content should be intelligently placed at the caches distributed in different users' homes. We propose using a 'game theoretic centrality' metric, which models the sharing of content among neighbours as a co-operative coalition game. We apply this metric to study the energy savings and evaluate how close the contents are placed to the interested user(s).
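
One common way to turn a co-operative coalition game into a centrality score is the Shapley value. The sketch below is an illustration under assumptions, not the paper's metric: the abstract does not state the characteristic function, so the `coverage` game used here (a coalition of caching homes is worth the number of homes that hold the content or neighbour a home that does) and all names are hypothetical. The brute-force enumeration over orderings is only practical for a handful of homes.

```python
from itertools import permutations
from math import factorial


def coverage(coalition, neighbours):
    """Assumed coalition value: homes that cache the content or are one Wi-Fi hop away."""
    covered = set(coalition)
    for home in coalition:
        covered |= neighbours[home]
    return len(covered)


def shapley_centrality(neighbours):
    """Exact Shapley value of each home, averaging marginal gains over all join orders."""
    homes = sorted(neighbours)
    score = {h: 0.0 for h in homes}
    for order in permutations(homes):
        coalition = []
        previous = 0
        for home in order:
            coalition.append(home)
            value = coverage(coalition, neighbours)
            score[home] += value - previous  # marginal contribution of this home
            previous = value
    n_orders = factorial(len(homes))
    return {h: s / n_orders for h, s in score.items()}


# Toy neighbourhood: home A is within Wi-Fi range of B and C; B and C only reach A.
wifi = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
centrality = shapley_centrality(wifi)
```

In this toy example the well-connected home A receives the highest score, which matches the intuition that content cached there serves the most neighbours.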

E. V. Lakshmi, N. N. Sastry, B. P. Rao, Nishanth Ramakrishna Sastry (2011)Optimum active decoy deployment for effective deception of missile radars, In: Proceedings of 2011 IEEE CIE International Conference on Radar1pp. 234-237 IEEE

In a battle engagement scenario, while missile interception and hard kill options can be exercised, soft kill options are less expensive and more elegant. In this paper, optimum positioning of an active decoy which is fired in the form of a cartridge from the platform of the target is reported. Various radar and jammer parameters for effective luring away of the missile are studied. Computer simulations are carried out and it is shown that miss distances of the order of half a kilometre or more can be obtained for typical monopulse radars.

Walid Magdy, Yehia Elkhatib, Gareth Tyson, Sagar Joglekar, Nishanth Sastry (2017)Fake it till you make it: Fishing for Catfishes, In: 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)pp. 497-504 ACM

Many adult content websites incorporate social networking features. Although these are popular, they raise significant challenges, including the potential for users to "catfish", i.e., to create fake profiles to deceive other users. This paper takes an initial step towards automated catfish detection. We explore the characteristics of the different age and gender groups, identifying a number of distinctions. Through this, we train models based on user profiles and comments, via the ground truth of specially verified profiles. When applying our models for age and gender estimation to unverified profiles, 38% of profiles are classified as lying about their age, and 25% are predicted to be lying about their gender. The results suggest that women have a greater propensity to catfish than men. Our preliminary work has notable implications on operators of such online social networks, as well as users who may worry about interacting with catfishes.

Gareth Tyson, Nishanth Sastry, Ivica Rimac, Ruben Cuevas, Andreas Mauthe (2012)A survey of mobility in information-centric networks, In: Proceedings of the 1st ACM workshop on emerging name-oriented mobile networking design - architecture, algorithms, and applicationspp. 1-6 ACM

In essence, an information-centric network (ICN) is one which supports a content request/reply model. One proposed benefit of this is improved mobility. This can refer to provider, consumer or content mobility. Despite this, little specific research has looked into the effectiveness of ICN in this regard. This paper presents a survey of some of the key ICN technologies, alongside their individual approaches to mobility. Through this, we highlight some of the promising benefits of ICN, before discussing important future research questions that must be answered.

Arkaitz Zubiaga, Bertie Vidgen, Miriam Fernandez, Nishanth Sastry (2022)Editorial for Special Issue on Detecting, Understanding and Countering Online Harms, In: Online social networks and media27100186 Elsevier B.V

This editorial article introduces the OSNEM special issue on Detecting, Understanding and Countering Online Harms. Whilst online social networks and media have revolutionised society, leading to unprecedented connectivity across the globe, they have also enabled the spread of hazardous and dangerous behaviours. Such ‘online harms’ are now a pressing concern for policymakers, regulators and big tech companies. Building deep knowledge about the scope, nature, prevalence, origins and dynamics of online harms is crucial for ensuring we can clean up online spaces. This, in turn, requires innovation and advances in methods, data, theory and research design – and developing multi-domain and multi-disciplinary approaches. In particular, there is a real need for methodological research that develops high-quality methods for detecting online harms in a robust, fair and explainable way. With this motivation in mind, the present special issue attracted 20 submissions, of which 8 were ultimately accepted for publication in the journal. These submissions predominantly revolve around online misinformation and abusive language, with an even distribution between the two topics. In what follows, we introduce and briefly discuss the contributions of these accepted submissions.

Sagar Joglekar, Nishanth Sastry, Miriam Redi (2017)Like at First Sight: Understanding User Engagement with the World of Microvideos, In: Social Informaticspp. 237-256 Springer International Publishing

Several content-driven platforms have adopted the ‘micro video’ format, a new form of short video that is constrained in duration, typically at most 5–10 s long. Micro videos are typically viewed through mobile apps, and are presented to viewers as a long list of videos that can be scrolled through. How should micro video creators capture viewers’ attention in the short attention span? Does quality of content matter? Or do social effects predominate, giving content from users with large numbers of followers a greater chance of becoming popular? To the extent that quality matters, what aspect of the video – aesthetics or affect – is critical to ensuring user engagement? We examine these questions using a snapshot of nearly all (>120,000) videos uploaded to globally accessible channels on the micro video platform Vine over an 8 week period. We find that although social factors do affect engagement, content quality becomes equally important at the top end of the engagement scale. Furthermore, using the temporal aspects of video, we verify that decisions are made quickly, and that first impressions matter more, with the first seconds of the video typically being of higher quality and having a large effect on overall user engagement. We verify these data-driven insights with a user study from 115 respondents, confirming that users tend to engage with micro videos based on “first sight”, and that users see this format as a more immediate and less professional medium than traditional user-generated video (e.g., YouTube) or user-generated images (e.g., Flickr).

Katarzyna Musial, Nishanth Sastry (2012)Social media, In: Proceedings of the Fourth Annual Workshop on simplifying complex networks for practitionerspp. 1-6 ACM

On many social media and user-generated content sites, users can not only upload content but also create links with other users to follow their activities. It is interesting to ask whether the resulting user-user Followers' Network is based more on social ties, or shared interests in similar content. This paper reports our preliminary progress in answering this question using around five years of data from the social video-sharing site Vimeo. Many links in the Followers' Network are between users who do not have any videos in common, which would imply the network is not interest-based, but rather has a social character. However, the Followers' Network also exhibits properties unlike other social networks: for instance, the clustering coefficient is low, links are frequently not reciprocated, and users form links across vast geographical distances. In addition, analysis of the relationship strength, calculated as the number of commonly liked videos, shows that people who follow each other and share some "likes" have more video likes in common than the general population. We conclude by speculating on the reasons for these differences and proposals for further work.

Pushkal Agarwal, Miriam Redi, Nishanth Sastry, Edward Wood, Andrew Blick (2020)Wikipedia and Westminster, In: Proceedings of the 31st ACM Conference on hypertext and social mediapp. 161-166 ACM

Wikipedia is a major source of information providing a large variety of content online, trusted by readers from around the world. Readers go to Wikipedia to get reliable information about different subjects, one of the most popular being living people, and especially politicians. While a lot is known about the general usage and information consumption on Wikipedia, less is known about the life-cycle and quality of Wikipedia articles in the context of politics. The aim of this study is to quantify and qualify content production and consumption for articles about politicians, with a specific focus on UK Members of Parliament (MPs). First, we analyze spatio-temporal patterns of readers' and editors' engagement with MPs' Wikipedia pages, finding huge peaks of attention during election times, related to signs of engagement on other social media (e.g. Twitter). Second, we quantify editors' polarisation and find that most editors specialize in a specific party and choose specific news outlets as references. Finally we observe that the average citation quality is pretty high, with statements on 'Early life and career' missing citations most often (18%).

Yash Vekaria, Vibhor Agarwal, Pushkal Agarwal, Sangeeta Mahapatra, Sakthi Balan Muthiah, Nishanth Sastry, Nicolas Kourtellis (2021)Differential Tracking Across Topical Webpages of Indian News Media, In: ACM International Conference Proceeding Seriespp. 299-308

Nishanth Ramakrishna Sastry (2022)Blockchain Theory and Applications - Welcome and Committees, In: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Conference Proceedingspp. i-ii The Institute of Electrical and Electronics Engineers, Inc. (IEEE)

Conference Title: 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops). Conference Start Date: 2022, March 21. Conference End Date: 2022, March 25. Conference Location: Pisa, Italy. Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.

Dmytro Karamshuk, Tetyana Lokot, Oleksandr Pryymak, Nishanth Sastry (2016)Identifying Partisan Slant in News Articles and Twitter During Political Crises, In: Social Informaticspp. 257-272 Springer International Publishing

In this paper, we are interested in understanding the interrelationships between mainstream and social media in forming public opinion during mass crises, specifically with regard to how events are framed in the mainstream news and on social networks and to how the language used in those frames may allow political slant and partisanship to be inferred. We study the linguistic choices for political agenda setting in mainstream and social media by analyzing a dataset of more than 40M tweets and more than 4M news articles from the mass protests in Ukraine during 2013–2014, known as “Euromaidan”, and the post-Euromaidan conflict between Russian, pro-Russian and Ukrainian forces in eastern Ukraine and Crimea. We design a natural language processing algorithm to analyze at scale the linguistic markers which point to a particular political leaning in online media and show that political slant in news articles and Twitter posts can be inferred with a high level of accuracy. These findings allow us to better understand the dynamics of partisan opinion formation during mass crises and the interplay between mainstream and social media in such circumstances.

Xuehui Hu, Nishanth Sastry (2020)What a Tangled Web We Weave: Understanding the Interconnectedness of the Third Party Cookie Ecosystem, In: 12th ACM Conference on Web Sciencepp. 76-85 ACM

When users browse to a so-called "First Party" website, other third parties are able to place cookies on the users' browsers. Although this practice can enable some important use cases, in practice, these third party cookies also allow trackers to identify that a user has visited two or more first parties that share the same third party. This simple feature has been used to bootstrap an extensive tracking ecosystem that can severely compromise user privacy. In this paper, we develop a metric called "tangle factor" that measures how a set of first party websites may be interconnected or tangled with each other based on the common third parties used. Our insight is that the interconnectedness can be calculated as the chromatic number of a graph where the first party sites are the nodes, and edges are induced based on shared third parties. We use this technique to measure the interconnectedness of the browsing patterns of over 100 users in 25 different countries, through a Chrome browser plugin which we have deployed. The users of our plugin consist of a small carefully selected set of 15 test users in UK and China, and 1000+ in-the-wild users, of whom 124 have shared data with us. We show that different countries have different levels of interconnectedness, for example China has a lower tangle factor than the UK. We also show that when visiting the same sets of websites from China, the tangle factor is smaller, due to blocking of major operators like Google and Facebook. We show that selectively removing the largest trackers is a very effective way of decreasing the interconnectedness of third party websites. We then consider blocking practices employed by privacy-conscious users (such as ad blockers) as well as those enabled by default by Chrome and Firefox, and compare their effectiveness using the tangle factor metric we have defined. Our results help quantify for the first time the extent to which one ad blocker is more effective than others, and how Firefox defaults also greatly help decrease third party tracking compared to Chrome.
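The graph construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the site and tracker names are made up, and because computing an exact chromatic number is expensive in general, the sketch swaps in a greedy colouring, which only gives an upper bound on the tangle factor.

```python
from itertools import combinations

def build_tangle_graph(third_parties_by_site):
    """Nodes are first-party sites; an edge joins two sites sharing a third party."""
    graph = {site: set() for site in third_parties_by_site}
    for a, b in combinations(third_parties_by_site, 2):
        if third_parties_by_site[a] & third_parties_by_site[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def greedy_colour_count(graph):
    """Greedy colouring (highest degree first): an upper bound on the chromatic number."""
    colours = {}
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {colours[nb] for nb in graph[node] if nb in colours}
        colour = 0
        while colour in used:
            colour += 1
        colours[node] = colour
    return max(colours.values()) + 1 if colours else 0

# Hypothetical browsing data: first-party site -> third parties seen on it.
sites = {
    "news.example": {"tracker-a.example", "ads-b.example"},
    "shop.example": {"tracker-a.example"},
    "blog.example": {"ads-b.example", "tracker-a.example"},
    "wiki.example": set(),
}
graph = build_tangle_graph(sites)
tangle_upper_bound = greedy_colour_count(graph)

# Blocking one tracker and recomputing shows the effect of removing a shared third party.
blocked = {site: tps - {"tracker-a.example"} for site, tps in sites.items()}
tangle_after_blocking = greedy_colour_count(build_tangle_graph(blocked))
```

In this toy data the three sites that share trackers form a triangle, so three colours are needed; once the shared tracker is removed, only one edge remains and two colours suffice.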

Nishanth Sastry (2014)Crowdsourcing and Social Networks, In: Encyclopedia of Social Network Analysis and Miningpp. 316-318 Springer New York

Nishanth Sastry, Eiko Yoneki, Jon Crowcroft (2009)Buzztraq, In: Proceedings of the Second ACM EuroSys Workshop on social network systemspp. 39-45 ACM

Web 2.0 sites have made networked sharing of user generated content increasingly popular. Serving rich-media content with strict delivery constraints requires a distribution infrastructure. Traditional caching and distribution algorithms are optimised for globally popular content and will not be efficient for user generated content, which often shows a heavy-tailed popularity distribution. New algorithms are needed. This paper shows that information encoded in social network structure can be used to predict access patterns which may be partly driven by viral information dissemination, termed a social cascade. Specifically, knowledge about the number and location of friends of previous users is used to generate hints that enable placing replicas of the content closer to future accesses.
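The hint-generation idea in this abstract can be sketched in a few lines of Python. This is an illustrative assumption rather than the published Buzztraq algorithm: the function name, the data layout and the "top regions by friend count" rule are all hypothetical.

```python
from collections import Counter

def replica_hints(previous_users, friends, location, num_replicas):
    """Rank regions by how many friends of previous users live there (hypothetical rule)."""
    counts = Counter()
    for user in previous_users:
        for friend in friends.get(user, ()):
            if friend in location:
                counts[location[friend]] += 1
    # Sort by descending friend count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [region for region, _ in ranked[:num_replicas]]

# Made-up social graph and user locations.
friends = {"alice": ["bob", "carol", "dave"], "erin": ["carol", "frank"]}
location = {"bob": "london", "carol": "london", "dave": "paris", "frank": "tokyo"}

hints = replica_hints(["alice", "erin"], friends, location, num_replicas=2)
```

Here a friend shared by two previous users is counted twice, on the assumption that such a friend is more likely to be reached by the cascade; a real system would need to validate that choice against access traces.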

N. Sastry, A. Hylick, J. Crowcroft (2010)SpinThrift: Saving energy in viral workloads, In: 2010 Second International Conference on COMmunication Systems and NETworks (COMSNETS 2010)pp. 1-6 IEEE

This paper looks at optimising the energy costs for data storage when the workload is highly skewed by a large number of accesses to a few popular articles, but whose popularity varies dynamically. A typical example of such a workload is news article access, where the most popular article is highly accessed, but which article is most popular keeps changing. The properties of dynamically changing popular content are investigated using a trace drawn from a social news web site. It is shown that a) popular content has a much larger window of interest than non-popular articles, i.e. popular articles typically attract sustained interest rather than a brief surge, and b) popular content is accessed by multiple unrelated users. In contrast, articles whose accesses spread only virally, i.e. from friend to friend, are shown to have a tendency not to be popular. Using this data, we improve upon Popular Data Concentration (PDC), a technique which is used to save energy by spinning down disks that do not contain popular data. PDC requires keeping the data ordered by popularity, which involves a significant amount of data migration when the most popular articles keep changing. In contrast, our technique, SpinThrift, detects popular data by the proportion of non-viral accesses made, and results in less data migration, whilst using a similar amount of energy as PDC.
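A minimal Python sketch of classifying articles by their proportion of non-viral accesses follows. It rests on assumptions not stated in the abstract: an access is treated as viral when a friend of the accessing user has already accessed the article, and the 0.5 threshold is a hypothetical value chosen only for illustration.

```python
def non_viral_fraction(accesses, friends):
    """Fraction of accesses not preceded by an access from one of the user's friends."""
    seen = set()
    non_viral = 0
    for user in accesses:  # accesses are user ids in time order
        if not (friends.get(user, set()) & seen):
            non_viral += 1
        seen.add(user)
    return non_viral / len(accesses) if accesses else 0.0

def split_popular(access_log, friends, threshold=0.5):
    """Binary split: articles at or above the (assumed) threshold count as popular."""
    popular, unpopular = [], []
    for article, accesses in access_log.items():
        if non_viral_fraction(accesses, friends) >= threshold:
            popular.append(article)
        else:
            unpopular.append(article)
    return popular, unpopular

# Made-up friendship graph and per-article access traces.
friends = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
access_log = {
    "story1": ["a", "b", "c"],       # spreads friend to friend
    "story2": ["a", "x", "y", "c"],  # accessed by unrelated users
}
popular, unpopular = split_popular(access_log, friends)
```

Articles in the popular list would be placed on the few always-on disks, leaving the remaining disks free to spin down.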

Harikrishna Paik, N. N. Sastry, I. SantiPrabha, Nishanth Ramakrishna Sastry (2015)Effectiveness of FM CW jammer parameter on break-lock condition of phase locked loop in monopulse receiver, In: 2015 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT)pp. 1-5 IEEE

In this paper, the break-lock phenomenon of phase locked loop (PLL) in missile borne monopulse radar receiver is presented. The continuous wave (CW) frequency modulated (FM) signal is used as jamming signal which is injected into the PLL along with the desired radar echo signal. The effects of key parameters in the FM CW jammer platform such as frequency sensitivity (k_f), modulating signal amplitude (v_m) and modulation frequency (f_m) on break-lock are reported. The value of k_f at which the PLL loses the frequency lock to the radar echo signal as a function of modulating signal amplitude and modulation frequency is presented. It is shown that break-lock is achieved at 3.511×10^9 Hz/V for a typical modulating signal amplitude of 5 mV and modulation frequency of 200 kHz, when the radar echo amplitude at the PLL input is 1 volt. The break-lock is also studied by injecting radar echo signal with different amplitude at the PLL input and the value of k_f required for break-lock is reported. From these results, the frequency deviation and modulation index required for break-lock are computed and conclusions are demonstrated. The PLL with a third order passive loop filter is designed by exact method and simulation is carried out using visual system simulator (VSS) AWR software for performance evaluation.
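As a worked illustration of how frequency deviation and modulation index follow from the quoted figures, the standard FM relations can be applied. The abstract does not report these computed values, so the numbers below are a sketch that assumes the 5 mV figure is the peak modulating amplitude.

```latex
\Delta f = k_f \, v_m
         = \left(3.511 \times 10^{9}\ \mathrm{Hz/V}\right)\left(5 \times 10^{-3}\ \mathrm{V}\right)
         = 1.7555 \times 10^{7}\ \mathrm{Hz} \approx 17.56\ \mathrm{MHz}

\beta = \frac{\Delta f}{f_m}
      = \frac{1.7555 \times 10^{7}\ \mathrm{Hz}}{2 \times 10^{5}\ \mathrm{Hz}}
      \approx 87.8
```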

Stan Wong, Nishanth Sastry, Oliver Holland, Vasilis Friderikos, Mischa Dohler, Hamid Aghvami (2017)Virtualized authentication, authorization and accounting (V-AAA) in 5G networks, In: 2017 IEEE Conference on Standards for Communications and Networking (CSCN)pp. 175-180 IEEE

Virtualization, containerization and softwarization technologies enable telecommunication systems to realize multitenancy, multi-network slicing and multi-level services. However, the use of these technologies to such ends requires a redesign of the telecommunications network architecture that goes beyond the current long term evolution-advanced (LTE-A). This paper proposes a novel hierarchical and distributed Virtualized Authentication, Authorization and Accounting (V-AAA) architecture for fifth-generation (5G) telecommunications systems, conceived to handle multi-tenancy, multi-network slicing and multi-level services. It also contemplates a new hierarchical and distributed database architecture to inter-work with our 5G V-AAA, able to cope with the network flexibility, elasticity and traffic fluctuation implied in 5G. The sum achievement is the design of a new approach that can provide fast billing and multiple network services for authentication and authorization at the edge cloud.

Nishanth Sastry, Jon Crowcroft (2010)SpinThrift, In: Proceedings of the first ACM SIGCOMM workshop on green networkingpp. 69-76 ACM

This paper looks at optimising the energy costs for storing user-generated content when accesses are highly skewed towards a few "popular" items, but the popularity ranks vary dynamically. Using traces from a video-sharing website and a social news website, it is shown that the non-popular content, which constitutes the majority by numbers, tends to have accesses which spread locally in the social network, in a viral fashion. Based on the proportion of viral accesses, popular data is separated onto a few disks on storage. The popular disks receive the majority of accesses, allowing other disks to be spun down when there are no requests, saving energy. Our technique, SpinThrift, improves upon Popular Data Concentration (PDC), which, in contrast with our binary separation between popular and unpopular items, directs the majority of accesses to a few disks by arranging data according to popularity rank. Disregarding the energy required for data reorganisation, SpinThrift and PDC display similar energy savings. However, because of the dynamically changing popularity ranks, SpinThrift requires less than half the number of data reorderings compared to PDC.

Tooba Faisal, Damiano Di Francesco Maesa, Nishanth Sastry, Simone Mangiante (2020)AJIT, In: Proceedings of the ACM MobiArch 2020 The 15th Workshop on mobility in the evolving internet architecturepp. 48-53 ACM

New applications such as remote surgery and connected cars, which are being touted as use cases for 5G and beyond, are mission-critical. As such, communications infrastructure needs to support and enforce stringent and guaranteed levels of service before such applications can take off. However, from an operator's perspective, it can be difficult to provide uniformly high levels of service over long durations or large regions. As network conditions change over time, or when a mobile end point goes to regions with poor coverage, it may be difficult for the operator to support previously agreed upon service agreements that are too stringent. Second, from a consumer's perspective, purchasing a stringent service level agreement with an operator can also be expensive. Finally, failures in mission critical applications can lead to disasters, so infrastructure should support assignment of liabilities when a guaranteed service level is reneged upon - this is a difficult problem because both the operator and the customer have an incentive to lay the blame on each other to avoid liabilities of poor service. To address the above problems, we propose AJIT, an architecture that allows creating fine-grained short-term contracts between operator and consumer. AJIT uses smart contracts to allow dynamically changing service levels so that more expensive and stringent levels of service need only be requested by a customer for short durations when the application needs it, and operator agrees to the SLA only when the infrastructure is able to support the demand. Second, AJIT uses trusted enclaves to do the accounting of packet deliveries such that neither the customer requesting guaranteed service levels for mission-critical applications, nor the operator providing the infrastructure support, can cheat.

Nishanth Sastry (2014)KeyNS, In: IEEE Conferencespp. 1-1

On-demand video streaming dominates today's Internet traffic mix. For instance, Netflix constitutes a third of the peak time traffic in the USA. Nearly half of UK online households have accessed BBC's shows through its on-demand streaming interface, BBC iPlayer. Using UK-wide traces from BBC iPlayer as a case study, this talk will characterise users' content consumption at scale and discuss techniques that can be deployed at the edge by users to substantially decrease the load on the Internet. We will survey both well-known techniques such as peer-assisted VoD, studying whether it works at scale, as well as new edge-caching mechanisms that can potentially be deployed today. We will conclude by exploring new directions for content-centric network architectures, to address the roots of the pain points observed in our user workload, in a "clean" fashion.

Harikrishna Paik, N. N. Sastry, I. SantiPrabha, Nishanth Ramakrishna Sastry (2014)Effectiveness of noise jamming with White Gaussian Noise and phase noise in amplitude comparison monopulse radar receivers, In: 2014 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT)pp. 1-5 IEEE

In the missile borne monopulse radar system, the effectiveness of jamming the receiver in the presence of internal and external noise is highly significant. In this paper, jamming of such radar receiver in frequency domain is studied when White Gaussian Noise (WGN) and Phase Noise (PN) signals are injected into the receiver in two separate cases. The missile radar receiver operates on unmodulated continuous wave sinusoidal echo signal and the jammer is assumed to be a WGN source which generates Gaussian noise samples with zero mean. The Gaussian noise signal is injected into the receiver along with the radar echo signal and the noise power required for breaking the frequency lock in the receiver is reported. Initially, it is assumed that receiver is locked onto the desired radar echo signal frequency as the noise power is too low to break the frequency lock of the receiver. It is verified that Gaussian noise power required for jamming the receiver depends upon how the power is interpreted. For our simulation, the noise power is interpreted in symbol rate bandwidth, sampling frequency bandwidth, and in single-sided and double-sided power spectral density. The break-lock in the radar receiver is presented. In the case of phase noise, the noise is added to phase of the radar echo signal and the phase noise mask required for break-lock in the receiver is studied. The phase noise is specified through a phase noise mask consisting of frequency and dBc/Hz values. It is verified that phase noise mask required for jamming the receiver is less when frequency offset from echo signal is large. The effects of windowing techniques when implemented in the phase noise measurement are presented. It is shown that the windowing technique reduces the phase noise required for breaking the frequency lock in the receiver. The effectiveness of noise jamming is carried out through computer simulation using AWR (Visual System Simulator) software. The receiver response is observed online in the frequency spectrum of the signal.

Oliver Holland, Nishanth Sastry, Shuyu Ping, Pravir Chawdhry, Jean-Marc Chareau, James Bishop, Michele Bavaro, Emanuele Anguili, Raymond Knopp, Florian Kaltenberger, Dominique Nussbaum, Yue Gao, Juhani Hallio, Mikko Jakobsson, Jani Auranen, Reijo Ekman, Jarkko Paavola, Arto Kivinen, Rogerio Dionisio, Paulo Marques, Ha-Nguyen Tran, Kentaro Ishizu, Hiroshi Harada, Heikki Kokkinen, Olli Luukkonen (2014)A series of trials in the UK as part of the Ofcom TV white spaces pilot, In: 2014 1st International Workshop on Cognitive Cellular Systems (CCS)pp. 1-5 IEEE

TV White Spaces technology is a means of allowing wireless devices to opportunistically use locally-available TV channels (TV White Spaces), enabled by a geolocation database. The geolocation database informs the device of which channels can be used at a given location, and in the UK/EU case, which transmission powers (EIRPs) can be used on each channel based on the technical characteristics of the device, given an assumed interference limit and protection margin at the edge of the primary service coverage area(s). The UK regulator, Ofcom, has initiated a large-scale Pilot of TV White Spaces technology and devices. The ICT-ACROPOLIS Network of Excellence, teaming up with the ICT-SOLDER project and others, is running an extensive series of trials under this effort. The purpose of these trials is to test a number of aspects of white space technology, including the white space device and geolocation database interactions, the validity of the channel availability/powers calculations by the database and associated interference effects on primary services, and the performance of the white space devices, among others. An additional key purpose is to undertake a number of research investigations such as into aggregation of TV White Space resources with conventional (licensed/unlicensed) resources, secondary coexistence issues and means to mitigate such issues, and primary coexistence issues under challenging deployment geometries, among others. This paper describes our trials, their intentions and characteristics, objectives, and some early observations.

Harikrishna Paik, N. N. Sastry, I. SantiPrabha, Nishanth Ramakrishna Sastry (2016)Quantitative analysis of break-lock in monopulse receiver phase-locked loop using noise jamming signal, In: 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI)pp. 704-710 IEEE

It is evident that noise jamming is one of the several active jamming techniques employed against tracking radars and missile seekers. The noise jamming mainly aims at completely masking the desired radar signal by the externally injected noise signal. Of several parameters to be considered for the analysis of the noise jamming problem, the noise jammer power is one of the most critical. In this paper, emphasis is given to estimation and quantitative analysis of the effectiveness of break-lock in a missile borne phase locked loop (PLL) based monopulse radar receiver using an external noise signal. The analyses involve estimating the jamming signal power required to break-lock as a function of radar echo signal power through computer simulation and experimental measurements. The simulation plots representing the receiver PLL output are presented for selected echo signal powers from -14 dBm to -2 dBm. The simulation results are compared and verified with experimental results and it is established that these results agree to within 2 dB. It is noted that the measured values of jamming signal power at break-lock using HMC702LP6CE, HMC703LP4E and HMC830LP6GE PLL synthesizers are -19.5 dBm, -18.1 dBm and -17.6 dBm, respectively, while the simulated value is -18.8 dBm for a typical radar echo signal power of -10 dBm. The fairly good and consistent agreement between these results validates the simulation data.

Dmytro Karamshuk, Mladen Pupavac, Frances Shaw, Julie Brownlie, Vanessa Pupavac, Nishanth Sastry (2017)Towards Transdisciplinary Collaboration between Computer and Social Scientists: Initial Experiences and Reflections, In: Social Network Analysispp. 21-40 CRC Press

This chapter explores a collaboration between computer scientists, who take a primarily quantitative approach, and qualitative researchers in sociology and international relations. It aims to investigate how online platforms support or hinder the sharing of empathy and trust among people in extreme and vulnerable circumstances. The chapter introduces the benefits of interdisciplinary work within the emerging field of computational social science. It explores how computer and social scientists can work together to investigate these themes in relation to two different spheres: emotional distress and humanitarian and disaster-linked crises. The computer scientists in the team have devised a simple tool that finds replies to an initial data set achieved through key word searching. Identifying the potential and limits of social media research is important for international researchers and policymakers in the context of situations where access on the ground and traditional field analysis may be difficult.

Nishanth Sastry (2007)Folksonomy-based reasoning in opportunistic networks, In: Proceedings of the 2007 ACM CoNEXT conferencepp. 1-2 ACM

Disparate algorithms are being designed to decide certain basic questions in opportunistic networks. This position paper describes a nascent idea that aims to provide a single framework to answer such questions. Inspired by the concept of a generic knowledge plane, we propose to study whether the information embodied in folksonomies can be used to make network decisions in opportunistic networks.

A I Hassan, A Raman, I Castro, H B Zia, E De Cristofaro, N Sastry, G Tyson (2021)Exploring content moderation in the decentralised web: The pleroma case

Decentralising the Web is a desirable but challenging goal. One particular challenge is achieving decentralised content moderation in the face of various adversaries (e.g. trolls). To overcome this challenge, many Decentralised Web (DW) implementations rely on federation policies. Administrators use these policies to create rules that ban or modify content that matches specific rules. This, however, can have unintended consequences for many users. In this paper, we present the first study of federation policies on the DW, their in-the-wild usage, and their impact on users. We identify how these policies may negatively impact "innocent" users and outline possible solutions to avoid this problem in the future.

Pushkal Agarwal, Sagar Joglekar, Panagiotis Papadopoulos, Nishanth Sastry, Nicolas Kourtellis (2020)Stop tracking me Bro! Differential Tracking of User Demographics on Hyper-Partisan Websites, In: WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020)pp. 1479-1490 Assoc Computing Machinery

Websites with hyper-partisan, left or right-leaning focus offer content that is typically biased towards the expectations of their target audience. Such content often polarizes users, who are repeatedly primed to specific (extreme) content, usually reflecting hard party lines on political and socio-economic topics. Though this polarization has been extensively studied with respect to content, it is still unknown how it associates with the online tracking experienced by browsing users, especially when they exhibit certain demographic characteristics. For example, it is unclear how such websites enable the ad-ecosystem to track users based on their gender or age. In this paper, we take a first step to shed light and measure such potential differences in tracking imposed on users when visiting specific party-line's websites. For this, we design and deploy a methodology to systematically probe such websites and measure differences in user tracking. This methodology allows us to create user personas with specific attributes like gender and age and automate their browsing behavior in a consistent and repeatable manner. Thus, we systematically study how personas are being tracked by these websites and their third parties, especially if they exhibit particular demographic properties. Overall, we test 9 personas on 556 hyper-partisan websites and find that right-leaning websites tend to track users more intensely than left-leaning, depending on user demographics, using both cookies and cookie synchronization methods and leading to more costly delivered ads.

Pushkal Agarwal, Aravindh Raman, Damiola Ibosiola, Nishanth Sastry, Gareth Tyson, Kiran Garimella (2022)Jettisoning Junk Messaging in the Era of End-to-End Encryption: A Case Study of WhatsApp, In: PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22)pp. 2582-2591 Assoc Computing Machinery

WhatsApp is a popular messaging app used by over a billion users around the globe. Due to this popularity, understanding misbehavior on WhatsApp is an important issue. The sending of unwanted junk messages by unknown contacts via WhatsApp remains understudied by researchers, in part because of the end-to-end encryption offered by the platform. We address this gap by studying junk messaging on a multilingual dataset of 2.6M messages sent to 5K public WhatsApp groups in India. We characterise both junk content and senders. We find that nearly 1 in 10 messages is unwanted content sent by junk senders, and a number of unique strategies are employed to reflect challenges faced on WhatsApp, e.g., the need to change phone numbers regularly. We finally experiment with on-device classification to automate the detection of junk, whilst respecting end-to-end encryption.

Pushkal Agarwal, Oliver Hawkins, Margarita Amaxopoulou, Noel Dempsey, Nishanth Sastry, Edward Wood (2021)Hate Speech in Political Discourse: A Case Study of UK MPs on Twitter, In: PROCEEDINGS OF THE 32ND ACM CONFERENCE ON HYPERTEXT AND SOCIAL MEDIA (HT '21)pp. 5-16 Assoc Computing Machinery

Online presence is becoming unavoidable for politicians worldwide. In countries such as the UK, Twitter has become the platform of choice, with over 85% (553 of 650) of the Members of Parliament (MPs) having an active online presence. Whereas this has allowed ordinary citizens unprecedented and immediate access to their elected representatives, it has also led to serious concerns about online hate towards MPs. This work attempts to shed light on the problem using a dataset of conversations between MPs and non-MPs over a two month period. Deviating from other approaches in the literature, our data captures entire threads of conversations between Twitter handles of MPs and citizens in order to provide a full context for content that may be flagged as 'hate'. By combining widely-used hate speech detection tools trained on several widely available datasets, we analyse 2.5 million tweets to identify hate speech against MPs and we characterise hate across multiple dimensions of time, topics and MPs' demographics. We find that MPs are subject to intense 'pile on' hate by citizens whereby they get more hate when they are already busy with a high volume of mentions regarding some event or situation. We also show that hate is more dense with regard to certain topics and that MPs who have an ethnic minority background and those holding positions in Government receive more hate than other MPs. We find evidence of citizens expressing negative sentiments while engaging in cross-party conversations, with supporters of one party (e.g. Labour) directing hate against MPs of another party (e.g. Conservative).

Ranjan Pal, Ziyuan Huang, Sergey Lototsky, Xinlong Yin, Mingyan Liu, Jon Crowcroft, Nishanth Sastry, Swades De, Bodhibrata Nag (2021)Will Catastrophic Cyber-Risk Aggregation Thrive in the IoT Age? A Cautionary Economics Tale for (Re-)Insurers and Likes, In: ACM transactions on management information systems12(2)17pp. 1-36 Assoc Computing Machinery

Service liability interconnections among networked IT and IoT-driven service organizations create potential channels for cascading service disruptions due to modern cybercrimes such as DDoS, APT, and ransomware attacks. These attacks are known to inflict cascading catastrophic service disruptions worth billions of dollars across organizations and critical infrastructure around the globe. Cyber-insurance is a risk management mechanism that is gaining increasing industry popularity to cover client (organization) risks after a cyberattack. However, there is a certain likelihood that the nature of a successful attack is of such magnitude that an organizational client's insurance provider is not able to cover the multi-party aggregate losses incurred upon itself by its clients and their descendants in the supply chain, thereby needing to re-insure itself via other cyber-insurance firms. To this end, one question worth investigating in the first place is whether an ecosystem comprising a set of profit-minded cyber-insurance companies, each capable of providing re-insurance services for a service-networked IT environment, is economically feasible to cover the aggregate cyber-losses arising due to a cyber-attack. Our study focuses on an empirically interesting case of extreme heavy tailed cyber-risk distributions that might be presenting themselves to cyber-insurance firms in the modern Internet age in the form of catastrophic service disruptions, and could be a possible standard risk distribution to deal with in the near IoT age. Surprisingly, as a negative result for society in the event of such catastrophes, we prove via a game-theoretic analysis that it may not be economically incentive compatible, even under i.i.d. statistical conditions on catastrophic cyber-risk distributions, for limited liability-taking risk-averse cyber-insurance companies to offer cyber re-insurance solutions despite the existence of large enough market capacity to achieve full cyber-risk sharing. However, our analysis theoretically endorses the popular opinion that spreading i.i.d. cyber-risks that are not catastrophic is an effective practice for aggregate cyber-risk managers, a result established theoretically and empirically in the past. A failure to achieve a working re-insurance market in critically demanding situations after catastrophic cyber-risk events strongly calls for centralized government regulatory action/intervention to promote risk sharing through re-insurance activities for the benefit of service-networked societies in the IoT age.

Pushkal Agarwal, Kiran Garimella, Sagar Joglekar, Nishanth Sastry, Gareth Tyson (2020)Characterising User Content on a Multi-lingual Social Network

Social media has been on the vanguard of political information diffusion in the 21st century. Most studies that look into disinformation, political influence and fake-news focus on mainstream social media platforms. This has inevitably made English an important factor in our current understanding of political activity on social media. As a result, there has only been a limited number of studies into a large portion of the world, including the largest, multilingual and multi-cultural democracy: India. In this paper we present our characterisation of a multilingual social network in India called ShareChat. We collect an exhaustive dataset across 72 weeks before and during the Indian general elections of 2019, across 14 languages. We investigate the cross lingual dynamics by clustering visually similar images together, and exploring how they move across language barriers. We find that Telugu, Malayalam, Tamil and Kannada languages tend to be dominant in soliciting political images (often referred to as memes), and posts from Hindi have the largest cross-lingual diffusion across ShareChat (as well as images containing text in English). In the case of images containing text that cross language barriers, we see that language translation is used to widen the accessibility. That said, we find cases where the same image is associated with very different text (and therefore meanings). This initial characterisation paves the way for more advanced pipelines to understand the dynamics of fake and political content in a multi-lingual and non-textual setting. Comment: Accepted at ICWSM 2020, please cite the ICWSM version

Emeka Obiodu, Abdullahi Abubakar, Aravindh Raman, Nishanth Sastry, Simone Mangiante (2021)To share or not to share: reliability assurance via redundant cellular connectivity in Connected Cars, In: 2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS)pp. 1-6 IEEE

As adoption of connected cars (CCs) grows, the expectation is that 5G will better support safety-critical vehicle-to-everything (V2X) use cases. Operationally, most relationships between cellular network providers and car manufacturers or users are exclusive, providing a single network connectivity, with at best an occasional option of a back-up plan if the single network is unavailable. We question if this setup can provide QoS assurance for V2X use cases. Accordingly, in this paper, we investigate the role of redundancy in providing QoS assurance for cellular connectivity for CCs. Using our bespoke Android measurement app, we did a drive-through test on 380 kilometers of major and minor roads in South East England. We measured round trip times, jitter, page load times, packet loss, network type, uplink speed and downlink speeds on the four UK networks for 14 UK-centric websites every five minutes. In addition, we did the same measurement using a much more expensive universal SIM card provider that promises to fall back on any of the four UK networks to assure reliability. By comparing actual performance on the best performing network versus the universal SIM, and then projected performance of a two/three/four multi-operator setup, we make three major contributions. First, the use of redundant multi-connectivity, especially if managed by the demand-side, can deliver superior performance (up to 28 percentage points in some cases). Second, despite costing 95x more per GB of data, the universal SIM performed worse than the best performing network except for uplink speed, highlighting how the choice of parameter to monitor can influence operational decisions. Third, any assessment of CC connectivity reliability based on availability is sub-optimal as it can hide significant under-performance.
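The idea of projecting multi-operator performance from per-network measurements can be illustrated with a short Python sketch. The abstract does not describe how the projection was computed, so this assumes a simple rule: at each measurement point a multi-connected car gets the best (lowest) round trip time among its subscribed operators. The operator names and values are made up.

```python
from itertools import combinations
from statistics import mean

def best_combo(rtt_by_operator, k):
    """Return the k-operator set with the lowest mean of per-sample best RTTs."""
    best = None
    for combo in combinations(sorted(rtt_by_operator), k):
        # For each measurement point, take the lowest RTT across the chosen operators.
        per_sample = [min(vals) for vals in zip(*(rtt_by_operator[op] for op in combo))]
        score = mean(per_sample)
        if best is None or score < best[1]:
            best = (combo, score)
    return best

# Hypothetical RTTs in milliseconds at three measurement points.
rtts = {
    "A": [40, 90, 50],
    "B": [80, 45, 60],
    "C": [70, 70, 70],
}
single = best_combo(rtts, 1)
dual = best_combo(rtts, 2)
```

In this toy data the best single network averages 60 ms, while pairing it with a network that is strong where the first is weak brings the projected average down to 45 ms.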

Anthony P. Young, Sagar Joglekar, Gioia Boschi, Nishanth Sastry (2021)Ranking comment sorting policies in online debates, In: Argument & computation12(2)pp. 265-285 Ios Press

Online debates typically possess a large number of argumentative comments. Most readers who would like to see which comments are winning arguments often only read a part of the debate. Many platforms that host such debates allow for the comments to be sorted, say from the earliest to latest. How can argumentation theory be used to evaluate the effectiveness of such policies of sorting comments, in terms of the actually winning arguments displayed to a reader who may not have read the whole debate? We devise a pipeline that captures an online debate tree as a bipolar argumentation framework (BAF), which is sorted depending on the policy, giving a sequence of induced sub-BAFs representing how and how much of the debate has been read. Each sub-BAF has its own set of winning arguments, which can be quantitatively compared to the set of winning arguments of the whole BAF. We apply this pipeline to evaluate policies on Kialo debates, where it is shown that reading comments from most to least liked, on average, displays more winners than reading comments earliest first. Therefore, in Kialo, reading comments from most to least liked is on average more effective than reading from the earliest to the most recent.

Damiano Di Francesco Maesa, Laura Ricci, Nishanth Sastry (2022)Blockchain: Protocols, applications, and transactions analysis, In: BLOCKCHAIN-RESEARCH AND APPLICATIONS3(1)100071 Elsevier
Tooba Faisal, Damiano Di Francesco Maesa, Nishanth Sastry, Simone Mangiante (2021)How to Request Network Resources Just-in-Time using Smart Contracts, In: 2021 IEEE International Conference on Blockchain and Cryptocurrency (ICBC)pp. 1-5 IEEE

5G promises unprecedented levels of network connectivity to handle diverse applications, including life-critical applications such as remote surgery. However, to enable the adoption of such applications, it is important that customers trust the service quality provided. This can only be achieved through transparent Service Level Agreements (SLAs). Current resource provisioning systems are too general to handle such variety in applications. Moreover, service agreements are often opaque to customers, which can be an obstacle for 5G adoption for mission-critical services. In this work, we advocate short-term and specialised rather than long-term general service contracts and propose an end-to-end Permissioned Distributed Ledger (PDL) focused architecture, which allows operators to advertise their service contracts on a public portal backed by a PDL. These service contracts with clear Service Level Agreement (SLA) offers are deployed as smart contracts to enable transparent, automatic and immutable SLAs. To justify our choice of using a permissioned ledger instead of permissionless, we evaluated and compared contract execution times on both permissioned (i.e. Quorum and Hyperledger Fabric) and permissionless (i.e. Ropsten testnet) ledgers.

Vibhor Agarwal, Sagar Joglekar, Anthony P. Young, Nishanth Sastry (2022)GraphNLI: A Graph-based Natural Language Inference Model for Polarity Prediction in Online Debates, In: PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22)pp. 2729-2737 Assoc Computing Machinery

Online forums that allow participatory engagement between users have been transformative for public discussion of important issues. However, debates on such forums can sometimes escalate into full-blown exchanges of hate or misinformation. An important tool in understanding and tackling such problems is to be able to infer the argumentative relation of whether a reply is supporting or attacking the post it is replying to. This so-called polarity prediction task is difficult because replies may be based on external context beyond a post and the reply whose polarity is being predicted. We propose GraphNLI, a novel graph-based deep learning architecture that uses graph walk techniques to capture the wider context of a discussion thread in a principled fashion. Specifically, we propose methods to perform root-seeking graph walks that start from a post and capture its surrounding context to generate additional embeddings for the post. We then use these embeddings to predict the polarity relation between a reply and the post it is replying to. We evaluate the performance of our models on a curated debate dataset from Kialo, an online debating platform. Our model outperforms relevant baselines, including S-BERT, with an overall accuracy of 83%.

Emeka Obiodu, Abdullahi K Abubakar, Nishanth Sastry (2021)Is it 5G or not? Investigating doubts about the 5G icon and network performance, In: IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)9484519pp. 1-6 IEEE

Following the rollout of the first 5G networks in 2018, press reports in the US began to emerge that the 5G icon on smartphones was not depicting 5G connectivity. Such reports about a 'fake' 5G icon reverberated across the industry, exposing a mismatch between the icon on the phone and the actual experience of users. Between 2018 and early 2020, 3GPP and the GSMA sought to provide industry guidance on what and when the 5G icon should be used and how 5G performance can differ from 4G. In this paper, we introduce an intuitive four-stage investigation framework to explore the technical considerations that ultimately confirm the veracity of the 5G connectivity. Then, following the launch of 5G in the UK in late 2019, we set out to explore if there was similar confusion on 5G notification and performance in the country. We conducted field measurements at the five busiest train stations in the UK, during rush hour, using a Samsung 5G S10 and a Samsung S6 Edge+ 4G device to compare 5G notifications and perceived network performance on 4G and 5G networks. We observe confusing messages to the user - device icon says 5G but Android's TelephonyManager API says 4G; worst cases for latency and uplink/downlink speeds were minimised but best case performance was the same on 4G and 5G devices. Based on our observations, and while we expect any lingering concerns to be ironed out as 5G deployment and adoption matures, we draw lessons that should guide the industry to avoid doubts about the icon and connectivity in 6G.

Vibhor Agarwal, Nishanth Sastry (2022)"Way back then": A Data-driven View of 25+ years of Web Evolution, In: PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22)pp. 3471-3479 Assoc Computing Machinery

Since the inception of the first web page three decades back, the Web has evolved considerably, from static HTML pages in the beginning to the dynamic web pages of today, from mainly the text-based pages of the 1990s to today's multimedia rich pages, etc. Although much of this is known anecdotally, to our knowledge, there is no quantitative documentation of the extent and timing of these changes. This paper attempts to address this gap in the literature by looking at the top 100 Alexa websites for over 25 years from the Internet Archive or the "Wayback Machine", archive.org. We study the changes in popularity, from Geocities and Yahoo! in the mid-to-late 1990s to the likes of Google, Facebook, and Tiktok of today. We also look at different categories of websites and their popularity over the years and find evidence for the decline in popularity of news and education-related websites, which have been replaced by streaming media and social networking sites. We explore the emergence and relative prevalence of different MIME-types (text vs. image vs. video vs. javascript and json) and study whether the use of text on the Internet is declining.

Ella Guest, Bertie Vidgen, Alexandros Mittos, Nishanth Sastry, Gareth Tyson, Helen Margetts (2021)An Expert Annotated Dataset for the Detection of Online Misogyny, In: 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021)pp. 1336-1350 Assoc Computational Linguistics-Acl

Online misogyny is a pernicious social problem that risks making online platforms toxic and unwelcoming to women. We present a new hierarchical taxonomy for online misogyny, as well as an expert labelled dataset to enable automatic classification of misogynistic content. The dataset consists of 6,567 labels for Reddit posts and comments. As previous research has found untrained crowdsourced annotators struggle with identifying misogyny, we hired and trained annotators and provided them with robust annotation guidelines. We report baseline classification performance on the binary classification task, achieving accuracy of 0.93 and F1 of 0.43. The codebook and datasets are made freely available for future researchers.
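A large gap between accuracy and F1, as reported above, is typical when the positive class is rare. The following minimal sketch shows the arithmetic using hypothetical confusion-matrix counts; the counts are invented to illustrate the effect and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, precision, recall, f1

# Hypothetical test set: 70 positive items out of 1,000.
accuracy, precision, recall, f1 = binary_metrics(tp=30, fp=40, fn=40, tn=890)
print(f"accuracy={accuracy:.2f} f1={f1:.2f}")  # accuracy=0.92 f1=0.43
```

Because the many true negatives dominate the accuracy figure, a classifier can look strong on accuracy while still missing more than half of the rare positive items, which is why F1 is reported alongside it.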

Gioia Boschi, Anthony P. Young, Sagar Joglekar, Chiara Cammarota, Nishanth Sastry (2021)Who Has the Last Word? Understanding How to Sample Online Discussions, In: ACM transactions on the web15(3)pp. 1-25

In online debates, as in offline ones, individual utterances or arguments support or attack each other, leading to some subset of arguments (potentially from different sides of the debate) being considered more relevant than others. However, online conversations are much larger in scale than offline ones, with often hundreds of thousands of users weighing in, collaboratively forming large trees of comments by starting from an original post and replying to each other. In large discussions, readers are often forced to sample a subset of the arguments being put forth. Since such sampling is rarely done in a principled manner, users may not read all the relevant arguments to get a full picture of the debate from a sample. This article is interested in answering the question of how users should sample online conversations to selectively favour the currently justified or accepted positions in the debate. We apply techniques from argumentation theory and complex networks to build a model that predicts the probabilities of the normatively justified arguments given their location in idealised online discussions of comments and replies, which we represent as trees. Our model shows that the proportion of replies that are supportive, the distribution of the number of replies that comments receive, and the locations of comments that do not receive replies (i.e., the “leaves” of the reply tree) all determine the probability that a comment is a justified argument given its location. We show that when the distribution of the number of replies is homogeneous along the tree length, for acrimonious discussions (with more attacking comments than supportive ones), the distribution of justified arguments depends on the parity of the tree level, which is the distance from the root expressed as number of edges. In supportive discussions, which have more supportive comments than attacks, the probability of having justified comments increases as one moves away from the root. 
For discussion trees that have a non-homogeneous in-degree distribution, for supportive discussions we observe the same behaviour as before, while for acrimonious discussions we cannot observe the same parity-based distribution. This is verified with data obtained from the online debating platform Kialo. By predicting the locations of the justified arguments in reply trees, we can therefore suggest which arguments readers should sample, to grasp the currently accepted opinions in such discussions. Our models have important implications for the design of future online debating platforms.
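The parity effect for acrimonious discussions can be illustrated with a deliberately simplified sketch. It assumes that every reply attacks its parent (no supportive replies) and applies a standard rule from abstract argumentation on trees: a comment is justified if and only if none of its replies is justified, so unanswered comments are justified. The tree and the node names are hypothetical, and the paper's own model is probabilistic and also handles supportive replies, which this sketch does not.

```python
def justified(tree, node):
    """True if no reply to `node` is itself justified (all replies are attacks)."""
    return not any(justified(tree, reply) for reply in tree.get(node, []))

def justified_by_level(tree, root):
    """Fraction of justified comments at each distance (in edges) from the root."""
    levels = {}
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        levels.setdefault(depth, []).append(justified(tree, node))
        stack.extend((reply, depth + 1) for reply in tree.get(node, []))
    return {depth: sum(flags) / len(flags) for depth, flags in sorted(levels.items())}

# Hypothetical debate: each key is a comment, each value lists its replies.
debate = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
    "a1": ["a1x"],
    "a2": ["a2x"],
    "b1": ["b1x"],
    "b2": ["b2x"],
}

print(justified_by_level(debate, "root"))
```

With all leaves at the same depth, the justified fraction alternates between 0 and 1 from one level to the next, a toy version of the parity dependence described in the abstract.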

G Tyson, N Sastry, R Cuevas, I Rimac, A Mauthe (2013)Where is in a Name? A Survey of Mobility in Information-Centric Networks
Oliver Holland, Shuyu Ping, Nishanth Sastry, Hong Xing, Suleyman Taskafa, Adnan Aijaz, Pravir Chawdhry, Jean-Marc Chareau, James Bishop, Michele Bavaro, Philippe Viaud, Tiziano Pinato, Emanuele Anguili, Mohammad Reza Akhavan, Julie McCann, Yue Gao, Zhijin Qin, Qianyun Zhang, Raymond Knopp, Florian Kaltenberger, Dominique Nussbaum, Rogerio Dionisio, Jose Ribeiro, Paulo Marques, Juhani Hallio, Mikko Jakobsson, Jani Auranen, Reijo Ekman, Jarkko Paavola, Arto Kivinen, Heikki Kokkinen, Tomaz Solc, Mihael Mohorcic, Ha-Nguyen Tran, Kentaro Ishizu, Takeshi Matsumura, Kazuo Ibuka, Hiroshi Harada, Keiichi Mizutani, Hiroshi Harada (2015)Some Initial Results and Observations from a Series of Trials within the Ofcom TV White Spaces Pilot, In: 2015 IEEE 81ST VEHICULAR TECHNOLOGY CONFERENCE (VTC SPRING)2015pp. 1-7 IEEE

TV White Spaces (TVWS) technology allows wireless devices to opportunistically use locally-available TV channels enabled by a geolocation database. The UK regulator Ofcom has initiated a pilot of TVWS technology in the UK. This paper concerns a large-scale series of trials under that pilot. The purposes are to test aspects of white space technology, including the white space device and geolocation database interactions, the validity of the channel availability/powers calculations by the database and associated interference effects on primary services, and the performances of the white space devices, among others. An additional key purpose is to perform research investigations such as on aggregation of TVWS resources with conventional resources and also aggregation solely within TVWS, secondary coexistence issues and means to mitigate such issues, and primary coexistence issues under challenging deployment geometries, among others. This paper provides an update on the trials, giving an overview of their objectives and characteristics, some aspects that have been covered, and some early results and observations.

Dmytro Karamshuk, Nishanth Sastry, Andrew Secker, Jigna Chandaria (2015)On factors affecting the usage and adoption of a nation-wide TV streaming service, In: 2015 IEEE Conference on Computer Communications (INFOCOM)26pp. 837-845 IEEE

Using nine months of access logs comprising 1.9 Billion sessions to BBC iPlayer, we survey the UK ISP ecosystem to understand the factors affecting adoption and usage of a high bandwidth TV streaming application across different providers. We find evidence that connection speeds are important and that external events can have a huge impact for live TV usage. Then, through a temporal analysis of the access logs, we demonstrate that data usage caps imposed by mobile ISPs significantly affect usage patterns, and look for solutions. We show that product bundle discounts with a related fixed-line ISP, a strategy already employed by some mobile providers, can better support user needs and capture a bigger share of accesses. We observe that users regularly split their sessions between mobile and fixed-line connections, suggesting a straightforward strategy for offloading by speculatively pre-fetching content from a fixed-line ISP before access on mobile devices.

Ihsan Zulkipli, Joanna Clark, Madeleine Hart, Roshan L. Shrestha, Parveen Gul, David Dang, Tami Kasichiwin, Izabela Kujawiak, Nishanth Sastry, Viji M. Draviam (2018)Spindle rotation in human cells is reliant on a MARK2-mediated equatorial spindle-centering mechanism, In: The Journal of cell biology217(9)pp. 3057-3070 Rockefeller Univ Press

The plane of cell division is defined by the final position of the mitotic spindle. The spindle is pulled and rotated to the correct position by cortical dynein. However, it is unclear how the spindle's rotational center is maintained and what the consequences of an equatorially off centered spindle are in human cells. We analyzed spindle movements in 100s of cells exposed to protein depletions or drug treatments and uncovered a novel role for MARK2 in maintaining the spindle at the cell's geometric center. Following MARK2 depletion, spindles glide along the cell cortex, leading to a failure in identifying the correct division plane. Surprisingly, spindle off centering in MARK2-depleted cells is not caused by excessive pull by dynein. We show that MARK2 modulates mitotic microtubule growth and length and that codepleting mitotic centromere-associated protein (MCAK), a microtubule destabilizer, rescues spindle off centering in MARK2-depleted cells. Thus, we provide the first insight into a spindle-centering mechanism needed for proper spindle rotation and, in turn, the correct division plane in human cells.

Emeka Obiodu, Nishanth Sastry, Aravindh Raman (2018)Towards a taxonomy of differentiated service classes in the 5G era, In: 2018 IEEE 5G WORLD FORUM (5GWF)pp. 129-134 IEEE

The physics and economics of cellular networks often mean that there is a need to treat some services differently. This reality has spawned several technical mechanisms in the industry (e.g. DiffServ, QCI) and lured policymakers to promulgate, sometimes, unclear service classes (e.g. FCC's non-BIAS in the US). Yet, in the face of Net Neutrality expectations, this mixture of technical and policy toolkits has had little commercial impact, with no clear roadmap on how cellular operators should differentiate between services. Worse, the lack of clarity has disincentivised innovations that would increase the utilisation of the network or improve its operational efficiency. It has also discouraged the introduction of more customer choice on how to manage the priority of their own services. As policymakers begin the process of crafting the rules that will guide the 5G era, our contribution in this position paper is to bring better clarity on the nature and treatment of differentiated services in the industry. We introduce a clarifying framework of seven differentiated service classes (statutory, critical, best effort, commercially-preferred, discounted, delayed and blocked). Our framework is designed to shape discussions, provide guidance to stakeholders and inform policymaking on how to define, design, implement and enforce differentiated service classes in the 5G era.

Gareth Tyson, Yehia Elkhatib, Nishanth Sastry, Steve Uhlig (2016)Measurements and Analysis of a Major Adult Video Portal, In: ACM transactions on multimedia computing communications and applications12(2)pp. 1-25 Assoc Computing Machinery

Today, the Internet is a large multimedia delivery infrastructure, with websites such as YouTube appearing at the top of most measurement studies. However, most traffic studies have ignored an important domain: adult multimedia distribution. Whereas, traditionally, such services were provided primarily via bespoke websites, recently these have converged towards what is known as "Porn 2.0". These services allow users to upload, view, rate, and comment on videos for free (much like YouTube). Despite their scale, we still lack even a basic understanding of their operation. This article addresses this gap by performing a large-scale study of one of the most popular Porn 2.0 websites: YouPorn. Our measurements reveal a global delivery infrastructure that we have repeatedly crawled to collect statistics (on 183k videos). We use this data to characterise the corpus, as well as to inspect popularity trends and how they relate to other features, for example, categories and ratings. To explore our discoveries further, we use a small-scale user study, highlighting key system implications.

Ranjan Pal, Junhui Li, Jon Crowcroft, Yong Li, Mingyan Liu, Nishanth Sastry (2021)Privacy Risk is a Function of Information Type: Learnings for the Surveillance Capitalism Age, In: IEEE eTransactions on network and service management18(3)pp. 3280-3296 IEEE

In-app advertising is a multi-billion dollar industry that is an essential part of the current digital ecosystem, and is amenable to sensitive consumer information often being sold downstream without the knowledge of consumers, and in many cases to their annoyance. While this practice, in cases, may result in long-term benefits for the consumers, it can result in serious information privacy (IP) breaches of very significant impact (e.g., breach of genetic data) in the short term. The question we raise through this article is: does the type of information being traded downstream play a role in the degree of IP risks generated? We investigate two general (one-many) information trading market structures between a single data aggregating seller (e.g., enterprise app) and multiple competing buyers (e.g., ad-networks, retailers), distinguished by mutually exclusive and privacy sanitized aggregated consumer data (information) types: (i) data entailing strategically complementary actions among buyers and (ii) data entailing strategically substituting actions among buyers. Our primary question of interest here is: trading which type of data might pose less information privacy risks for society? To this end, we show that at market equilibrium IP trading markets exhibiting strategic substitutes between buying firms pose lesser risks for IP in society, primarily because the 'substitutes' setting, in contrast to the 'complements' setting, economically incentivizes appropriate consumer data distortion by the seller in addition to restricting the proportion of buyers to which it sells. Moreover, we also show that irrespective of the data type traded by the seller, the likelihood of improved IP in society is higher if there is purposeful or free-riding based transfer/leakage of data between buying firms. 
This is because the seller finds itself economically incentivized to restrict the release of sanitized consumer data with respect to the span of its buyer space, as well as in improved data quality.

Dmytro Karamshuk, Nishanth Sastry, Mustafa Al-Bassam, Andrew Secker, Jigna Chandaria (2016)Take-Away TV: Recharging Work Commutes With Predictive Preloading of Catch-Up TV Content, In: IEEE journal on selected areas in communications34(8)pp. 2091-2101 IEEE

Mobile data offloading can greatly decrease the load on and usage of current and future cellular data networks by exploiting opportunistic and frequent access to Wi-Fi connectivity. Unfortunately, Wi-Fi access from mobile devices can be difficult during typical work commutes, e.g., via trains or cars on highways. In this paper, we propose a new approach: to preload the mobile device with content that a user might be interested in, thereby avoiding the need for cellular data access. We demonstrate the feasibility of this approach by developing a supervised machine learning model that learns from user preferences for different types of content, and propensity to be guided by the user interface of the player, and predictively preloads entire TV shows. Testing on a data set of nearly 3.9 million sessions from all over the U.K. to BBC TV shows, we find that predictive preloading can save over 71% of the mobile data for an average user.

Xuehui Hu, Guillermo Suarez de Tangil, Nishanth Sastry (2020)Multi-country Study of Third Party Trackers from Real Browser Histories, In: 2020 IEEE European Symposium on Security and Privacy (EuroS&P)pp. 70-86 IEEE

This paper aims to understand how third-party ecosystems have developed in four different countries: UK, China, AU, US. We are interested in how wide a view a given third-party player may have, of an individual user's browsing history over a period of time, and of the collective browsing histories of a cohort of users in each of these countries. We study this by utilizing two complementary approaches: the first uses lists of the most popular websites per country, as determined by Alexa.com. The second approach is based on the real browsing histories of a cohort of users in these countries. Our larger continuous user data collection spans over a year. Some universal patterns are seen, such as more third parties on more popular websites, and a specialization among trackers, with some trackers present in some categories of websites but not others. However, our study reveals several unexpected country-specific patterns: China has a home-grown ecosystem of third-party operators in contrast with the UK, whose trackers are dominated by players hosted in the US. UK trackers are more location sensitive than Chinese trackers. One important consequence of these is that users in China are tracked less than users in the UK. Our unique access to the browsing patterns of a panel of users provides a realistic insight into third-party exposure, and suggests that studies which rely solely on Alexa top ranked websites may be overestimating the power of third parties, since real users also access several niche interest sites with fewer third parties of many kinds, especially advertisers.

Shaik Shakeel Ahamad, V. N. Sastry, Siba K. Udgata, Nishanth Ramakrishna Sastry (2012)Enhanced Mobile SET Protocol with Formal Verification, In: 2012 THIRD INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION TECHNOLOGY (ICCCT)pp. 288-293 IEEE

In this paper we propose an Enhanced Mobile SET (EMSET) protocol with formal verification using Mobile Agent technology and Digital Signature with Message Recovery based on the ECDSA mechanism. Mobile Agent technology and Digital Signature with Message Recovery (DSMR) based on the ECDSA mechanism underpin the proposed EMSET protocol for mobile networks. Mobile Agent technology has many benefits such as bandwidth conservation, reduction of latency, reduction of completion time, and asynchronous (disconnected) communications. Digital Signature with Message Recovery based on ECDSA eliminates the need to adopt PKI cryptosystems. Our proposed protocol EMSET ensures Authentication, Integrity, Confidentiality and Non Repudiation, achieves Identity protection from merchant and Eavesdropper, achieves Transaction privacy from Eavesdropper and Payment Gateway, achieves Payment Secrecy, Order Secrecy, forward secrecy, and prevents Double Spending, Overspending and Money laundering. In addition to these, our proposed protocol withstands Replay, Man in the Middle and Impersonation attacks. The security properties of the proposed protocol have been verified using the Scyther Tool and presented with results.

Sufian Hameed, Xiaoming Fu, Pan Hui, Nishanth Sastry (2011)LENS: Leveraging Social Networking and Trust to Prevent Spam Transmission, In: 2011 19TH IEEE INTERNATIONAL CONFERENCE ON NETWORK PROTOCOLS (ICNP)pp. 13-18 IEEE

In this paper we introduce LENS, a novel spam protection system based on the recipient's social network, which allows correspondence within the social circle to directly pass to the mailbox and further mitigates spam beyond social circles. The key idea in LENS is to select legitimate and authentic users, called Gatekeepers (GKs), from outside the recipient's social circle and within pre-defined social distances. Unless a GK vouches for the emails of potential senders from outside the social circle of a particular recipient, those e-mails are prevented from transmission. In this way LENS drastically reduces the consumption of Internet bandwidth by spam. Using extensive evaluations, we show that LENS provides each recipient reliable email delivery from a large fraction of the social network. We also evaluate the computational complexity of email processing with LENS deployed on two Mail Servers (MSs) and compare it with the most popular content-based filter, i.e. SpamAssassin. LENS proved to be fast in processing emails (around 2-3 orders of magnitude better than SpamAssassin) and scales efficiently with increasing community size and GKs.

Pietro Panzarasa, Christopher J Griffiths, Nishanth Sastry, Anna De Simoni (2020)Social Medical Capital: How Patients and Caregivers Can Benefit From Online Social Interactions, In: Journal of medical Internet research22(7)pp. e16337-e16337

The rapid growth of online health communities and the increasing availability of relational data from social media provide invaluable opportunities for using network science and big data analytics to better understand how patients and caregivers can benefit from online conversations. Here, we outline a new network-based theory of social medical capital that will open up new avenues for conducting large-scale network studies of online health communities and devising effective policy interventions aimed at improving patients' self-care and health.

Dmytro Karamshuk, Nishanth Sastry, Andrew Secker, Jigna Chandaria (2015)ISP-friendly peer-assisted on-demand streaming of long duration content in BBC iPlayer, In: 2015 IEEE Conference on Computer Communications (INFOCOM)26pp. 289-297 IEEE

In search of scalable solutions, CDNs are exploring P2P support. However, the benefits of peer assistance can be limited by various obstacle factors such as ISP friendliness - requiring peers to be within the same ISP, bitrate stratification - the need to match peers with others needing similar bitrate, and partial participation - some peers choosing not to redistribute content. This work relates potential gains from peer assistance to the average number of users in a swarm, its capacity, and empirically studies the effects of these obstacle factors at scale, using a month-long trace of over 2 million users in London accessing BBC shows online. Results indicate that even when P2P swarms are localised within ISPs, up to 88% of traffic can be saved. Surprisingly, bitrate stratification results in 2 large sub-swarms and does not significantly affect savings. However, partial participation, and the need for a minimum swarm size do affect gains. We investigate improvements to gain from increasing content availability through two well-studied techniques: content bundling - combining multiple items to increase availability, and historical caching of previously watched items. Bundling proves ineffective as increased server traffic from larger bundles outweighs benefits of availability, but simple caching can considerably boost traffic gains from peer assistance.

Shweta Bhatt, Sagar Joglekar, Shehar Bano, Nishanth Sastry (2018)Illuminating an Ecosystem of Partisan Websites, In: COMPANION PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2018 (WWW 2018)pp. 545-554 Assoc Computing Machinery

This paper aims to shed light on alternative news media ecosystems that are believed to have influenced opinions and beliefs by false and/or biased news reporting during the 2016 US Presidential Elections. We examine a large, professionally curated list of 668 hyper-partisan websites and their corresponding Facebook pages, and identify key characteristics that mediate the traffic flow within this ecosystem. We uncover a pattern of new websites being established in the run up to the elections, and abandoned after. Such websites form an ecosystem, creating links from one website to another, and by 'liking' each others' Facebook pages. These practices are highly effective in directing user traffic internally within the ecosystem in a highly partisan manner, with right-leaning sites linking to and liking other right-leaning sites and similarly left-leaning sites linking to other sites on the left, thus forming a filter bubble amongst news producers similar to the filter bubble which has been widely observed among consumers of partisan news. Whereas there is activity along both left- and right-leaning sites, right-leaning sites are more evolved, accounting for a disproportionate number of abandoned websites and partisan internal links. We also examine demographic characteristics of consumers of hyper-partisan news and find that some of the more populous demographic groups in the US tend to be consumers of more right-leaning sites.

Nishanth Sastry, Karen Sollins, Jon Crowcroft (2009)Delivery Properties of Human Social Networks, In: IEEE INFOCOM 2009 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS, VOLS 1-5pp. 2586-2590 IEEE

The recently proposed Pocket Switched Network paradigm takes advantage of human social contacts to opportunistically create data paths over time. Our goal is to examine the effect of the human contact process on data delivery. We find that the contact occurrence distribution is highly uneven: contacts between a few node-pairs occur too frequently, leading to inadequate mixing in the network, while the majority of contacts are rare, and essential for connectivity. This distribution of contacts leads to a significant variation in performance over short time windows. We discover that the formation of a large clique core during the window is correlated with the fraction of data delivered, as well as the speed of delivery. We then show that the clustering coefficient of the contact graph over a time window is a good predictor of performance during the window. Taken together, our findings suggest new directions for designing forwarding algorithms in ad-hoc or delay-tolerant networking schemes using humans as data mules.
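As a minimal sketch of the kind of predictor mentioned above, the clustering coefficient of a contact graph over a time window can be computed as follows. The sketch assumes contacts are given as hypothetical `(time, node_u, node_v)` tuples and uses the average local clustering coefficient; whether the paper uses this variant or a global one is not stated in the abstract, so treat that choice as an assumption.

```python
from itertools import combinations

def contact_graph(contacts, start, end):
    """Undirected graph from (time, u, v) contacts with start <= time < end."""
    adjacency = {}
    for time, u, v in contacts:
        if start <= time < end and u != v:
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)
    return adjacency

def average_clustering(adjacency):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbours count as 0."""
    if not adjacency:
        return 0.0
    total = 0.0
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count pairs of neighbours that are themselves in contact.
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adjacency)

# Hypothetical contact trace: (time, node, node).
contacts = [
    (0, "a", "b"),
    (1, "b", "c"),
    (2, "a", "c"),
    (3, "c", "d"),
    (12, "d", "e"),  # falls outside the window used below
]

window = contact_graph(contacts, start=0, end=10)
print(round(average_clustering(window), 3))  # 0.583
```

Here nodes a and b each score 1, node c scores 1/3 and node d scores 0, giving an average of 7/12 for the window.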

Harikrishna Paik, N. N. Sastry, I. SantiPrabha (2015)Effectiveness of Repeat Jamming using Linear FM Interference Signal in Monopulse receivers, In: A K Soni, D K Lobiyal (eds.), 3RD INTERNATIONAL CONFERENCE ON RECENT TRENDS IN COMPUTING 2015 (ICRTC-2015)57pp. 296-304 Elsevier

Monopulse technique is widely used in modern tracking radars and missile seekers for precise angle (frequency) tracking. In this paper, the break-lock behavior of phase locked loop (PLL) in monopulse radar receiver in presence of linear frequency modulated (LFM) repeater jamming signal is presented. The radar echo and LFM signals are injected into the PLL simultaneously with an assumption that initially, the PLL locks onto the echo signal frequency. The frequency deviation required for breaking the frequency lock as a function of jamming signal power and modulation rate is reported. The results show that break-lock is achieved at frequency deviation of 0.35 MHz for a typical jammer power of -14 dBm and 200 kHz modulation rate when the radar echo power at the PLL input is -14 dBm. The break-lock is also studied for different modulation rates (200, 300, 400 kHz and so on) and echo signal powers (-14, -10 dBm) at the input of the PLL. For the computer simulation, the radar echo and centre frequency of LFM signals are assumed at an intermediate frequency (IF) of 50 MHz such that the LFM signal closely replicates the actual radar echo signal. The PLL containing charge pump phase detector and passive loop filter is designed with a typical bandwidth of 200 kHz. The simulation is carried out using visual system simulator AWR software and potential conclusions are demonstrated. (C) 2015 Published by Elsevier B.V.

Emeka Obiodu, Nishanth Sastry, Aravindh Raman (2019)CLASP: a 999-style Priority Lanes Framework for 5G-era Critical Data Services, In: 2019 International Symposium ELMAR2019-pp. 101-104 IEEE

Society is increasingly reliant on digital services for its proper functioning. Yet, going into the 5G era, the prevailing paradigm for data treats all traffic as equal regardless of how critical they are to the proper functioning of society. We argue that this is a suboptimal scenario and that services such as driverless cars and road/rail traffic updates are too important for society to be treated the same way as entertainment services. Our contribution in this paper is to propose the CLASP (Critical, Localized, Authorized, Specific, Perishable) framework to guide regulators and policymakers in deciding and managing 999-style priority lanes for critical data services during atypical scenarios in the 5G era. Our evaluation shows that reserving a 100kbps 'lane' for CLASP-prioritised traffic for all users does not lead to an overall statistically significant deterioration in atypical scenarios.

Ming-Chun Lee, Andreas F. Molisch, Nishanth Sastry, Aravindh Raman (2017)Individual Preference Probability Modeling for Video Content in Wireless Caching Networks, In: GLOBECOM 2017 - 2017 IEEE Global Communications Conference2018-pp. 1-7 IEEE

Caching of video files at the wireless edge, i.e., at the base stations or on user devices, is a key method for improving wireless video delivery. While global popularity distributions of video content have been investigated in the past, and used in a variety of caching algorithms, this paper investigates the statistical modeling of the individual user preferences. With individual preferences being represented by probabilities, we identify their critical features and parameters and propose a novel modeling framework as well as a parameterization of the framework based on an extensive real-world data set. Besides, an implementation recipe for generating practical individual preference probabilities is proposed. By comparing with the underlying real data, we show that the proposed models and generation approach can effectively characterize individual preferences of users for video content.

N.R. Sastry, S.S. Lam (2002)A theory of window-based unicast congestion control, In: 10th IEEE International Conference on Network Protocols, 2002. Proceedingspp. 144-154 IEEE

We present a comprehensive theoretical framework for window-based congestion control protocols that are designed to converge to fairness and efficiency. We first derive a sufficient condition for convergence to fairness. Using this, we show how fair window increase/decrease policies can be constructed from suitable pairs of monotonically non-decreasing functions. We show that well-studied protocols such as TCP, GAIMD (general additive-increase multiplicative-decrease) and binomial congestion control can be constructed using this method. Thus we provide a common framework for the analysis of such window-based protocols. To validate our approach, we present experimental results for a new TCP-friendly protocol, LOG, designed using this framework with the objective of reconciling the smoothness requirement of streaming media-like applications with the need for a fast dynamic response to congestion.
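
To make the notion of convergence to fairness concrete, here is a minimal sketch of an additive-increase multiplicative-decrease window policy shared by two flows. It is not the paper's framework or its LOG protocol: the synchronous congestion signal, the link capacity and the parameter names `alpha` and `beta` are assumptions chosen for illustration.

```python
def gaimd_step(window, congested, alpha=1.0, beta=0.5):
    """One window update: add alpha when uncongested, scale by beta on congestion."""
    if congested:
        return max(1.0, beta * window)  # never shrink below one packet
    return window + alpha


def simulate(w1, w2, capacity=100.0, rounds=500, alpha=1.0, beta=0.5):
    """Two flows share one link; both see congestion when total demand exceeds capacity."""
    for _ in range(rounds):
        congested = (w1 + w2) > capacity
        w1 = gaimd_step(w1, congested, alpha, beta)
        w2 = gaimd_step(w2, congested, alpha, beta)
    return w1, w2


# Hypothetical starting windows that are far from a fair split.
w1, w2 = simulate(5.0, 80.0)
gap = abs(w1 - w2)
```

Under these assumptions, additive increase leaves the gap between the two windows unchanged while every multiplicative decrease scales it by `beta`, so the flows drift towards an equal share of the link.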

Emeka Obiodu, Aravindh Raman, Abdullahi Kutiriko Abubakar, Simone Mangiante, Nishanth Sastry, A. Hamid Aghvami (2022)DSM-MoC as Baseline: Reliability Assurance via Redundant Cellular Connectivity in Connected Cars, In: IEEE eTransactions on network and service management19(3)pp. 2178-2194

Aravindh Raman, Sagar Joglekar, Emiliano De Cristofaro, Nishanth Sastry, Gareth Tyson (2019)Challenges in the Decentralised Web: The Mastodon Case, In: IMC'19: PROCEEDINGS OF THE 2019 ACM INTERNET MEASUREMENT CONFERENCEpp. 217-229 Assoc Computing Machinery

The Decentralised Web (DW) has recently seen a renewed momentum, with a number of DW platforms like Mastodon, PeerTube, and Hubzilla gaining increasing traction. These offer alternatives to traditional social networks like Twitter, YouTube, and Facebook, by enabling the operation of web infrastructure and services without centralised ownership or control. Although their services differ greatly, modern DW platforms mostly rely on two key innovations: first, their open source software allows anybody to setup independent servers ("instances") that people can sign-up to and use within a local community; and second, they build on top of federation protocols so that instances can mesh together, in a peer-to-peer fashion, to offer a globally integrated platform. In this paper, we present a measurement-driven exploration of these two innovations, using a popular DW microblogging platform (Mastodon) as a case study. We focus on identifying key challenges that might disrupt continuing efforts to decentralise the web, and empirically highlight a number of properties that are creating natural pressures towards re-centralisation. Finally, our measurements shed light on the behaviour of both administrators (i.e., people setting up instances) and regular users who sign-up to the platforms, also discussing a few techniques that may address some of the issues observed.

Ming-Chun Lee, Andreas F. Molisch, Nishanth Sastry, Aravindh Raman (2019)Individual Preference Probability Modeling and Parameterization for Video Content in Wireless Caching Networks, In: IEEE/ACM transactions on networking27(2)pp. 676-690 IEEE

Caching of video files at the wireless edge, i.e., at the base stations or on user devices, is a key method for improving wireless video delivery. While global popularity distributions of video content have been investigated in the past and used in a variety of caching algorithms, this paper investigates the statistical modeling of the individual user preferences. With individual preferences being represented by probabilities, we identify their critical features and parameters and propose a novel modeling framework by using a genre-based hierarchical structure as well as a parameterization of the framework based on an extensive real-world data set. Besides, the correlation analysis between parameters and critical statistics of the framework is conducted. With the framework, an implementation recipe for generating practical individual preference probabilities is proposed. By comparing with the underlying real data, we show that the proposed models and generation approach can effectively characterize the individual preferences of users for video content.

Aravindh Raman, Nishanth Sastry, Arjuna Sathiaseelan, Jigna Chandaria, Andrew Secker (2017)Wi-Stitch: Content Delivery in Converged Edge Networks, In: PROCEEDINGS OF THE 2017 WORKSHOP ON MOBILE EDGE COMMUNICATIONS (MECOMM '17)pp. 13-18 Assoc Computing Machinery

Wi-Fi, the most commonly used access technology at the very edge, supports download speeds that are orders of magnitude faster than the average home broadband or cellular data connection. Furthermore, it is extremely common for users to be within reach of their neighbours' Wi-Fi access points. Given the skewed nature of interest in content items, it is likely that some of these neighbours are interested in the same items as the users. We sketch the design of Wi-Stitch, an architecture that exploits these observations to construct a highly efficient content sharing infrastructure at the very edge and show through analysis of a real workload that it can deliver substantial (up to 70%) savings in network traffic. The Wi-Stitch approach can be used both by clients of fixed-line broadband, as well as mobile devices obtaining indoors access in converged networks.

Tooba Faisal, Damiano Di Francesco Maesa, Nishanth Sastry, Simone Mangiante (2021)Automated Quality of Service Monitoring for 5G and Beyond Using Distributed Ledgers, In: 2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS)pp. 1-6 IEEE

The viability of new mission-critical networked applications such as connected cars or remote surgery is heavily dependent on the availability of truly customized network services at a Quality of Service (QoS) level that both the network operator and the customer can agree on. This is difficult to achieve in today's mainly "best effort" Internet. Even if a level of service were to be agreed upon between a consumer and an operator, it is important for both parties to be able to scalably and impartially monitor the quality of service delivered in order to enforce the service level agreement (SLA). Building upon a recently proposed architecture for automated negotiation of SLAs using smart contracts, we develop a low overhead solution for monitoring these SLAs and arranging automated payments based on the smart contracts. Our solution uses cryptographically secure bloom filters to create succinct summaries of the data exchanged over fine-grained epochs. We then use a state channel-based design for both parties to quickly and scalably agree and sign off on the data that was delivered in each epoch, making it possible to monitor and enforce at run time the agreed upon QoS levels.
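
The idea of both parties summarising an epoch's traffic in a Bloom filter and comparing the result can be sketched as follows. This is only an illustration under stated assumptions: the HMAC-SHA256 keyed hashing, the class name `KeyedBloomFilter`, the filter size and the shared key are choices made for the example and are not the construction described in the paper.

```python
import hashlib
import hmac


class KeyedBloomFilter:
    """Bloom filter whose bit positions come from HMAC-SHA256 under a shared key."""

    def __init__(self, key, num_bits=1024, num_hashes=4):
        self.key = key
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive one position per hash function by prefixing a counter byte.
        for i in range(self.num_hashes):
            digest = hmac.new(self.key, bytes([i]) + item, hashlib.sha256).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

    def digest(self):
        """Short summary of the whole filter that both parties can compare and sign."""
        return hashlib.sha256(bytes(self.bits)).hexdigest()


# Hypothetical epoch: the operator records what it sent, the customer what it received.
key = b"shared-epoch-key"
packets = [b"pkt-1", b"pkt-2", b"pkt-3"]
operator = KeyedBloomFilter(key)
customer = KeyedBloomFilter(key)
for packet in packets:
    operator.add(packet)
    customer.add(packet)
epoch_agreed = operator.digest() == customer.digest()
```

If the customer had missed a packet, its filter would almost certainly differ in at least one bit, so the digests would not match and the epoch could be flagged for dispute.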

Oliver Holland, Aravindh Raman, Nishanth Sastry, Stan Wong, Jane Mack, Lisa Lam (2016)Assessment of a Platform for Non-Contiguous Aggregation of IEEE 802.11 Waveforms in TV White Space, In: 2016 IEEE 83rd Vehicular Technology Conference (VTC Spring)2016-pp. 1-5 IEEE

TV White Spaces (TVWS) and associated spectrum sharing mechanisms represent key means of realizing necessary prime-frequency spectrum for future wireless communication systems. We have been leading a major trial of TVWS technology within the Ofcom TV White Spaces Pilot. As one aspect of the work of our trial, we have investigated solutions for aggregation in TVWS and as part of that the performance of InterDigital White Space Devices (WSDs), capable of aggregating an IEEE 802.11 enabled technology for operation in up to 4 TVWS channels, non-contiguously as well as contiguously. This paper reports on some of our assessment of aggregation in TVWS, as well as our assessment of the InterDigital WSDs. It reports on the white space channel availabilities that can be achieved through aggregation, based on a real implementation of a WSD exhaustively testing a large area of England with a high resolution. The considerable benefit that is achieved through allowing non-contiguous aggregation as compared with contiguous-only aggregation is shown. Further, this paper assesses the TCP and UDP throughput performances of the InterDigital WSDs against the number of channels aggregated and received signal powers, in highly controlled scenarios. Statistics on performance of the WSDs for the studied large area of England are derived based on this. These results are compared with theoretical similar WSDs with one major difference that they can only achieve contiguous channel aggregation. Results show almost a doubling of capacity through non-contiguous aggregation with the InterDigital WSDs; this performance benefit would be increased significantly if more than 4 channels were supported for aggregation.

Alejandro Cartas, Martin Kocour, Aravindh Raman, Ilias Leontiadis, Jordi Luque, Nishanth Sastry, Jose Nunez-Martinez, Diego Perino, Carlos Segura (2019)A Reality Check on Inference at Mobile Networks Edge, In: PROCEEDINGS OF THE 2ND ACM INTERNATIONAL WORKSHOP ON EDGE SYSTEMS, ANALYTICS AND NETWORKING (EDGESYS '19)pp. 54-59 Assoc Computing Machinery

Edge computing is considered a key enabler to deploy Artificial Intelligence platforms to provide real-time applications such as AR/VR or cognitive assistance. Previous works show computing capabilities deployed very close to the user can actually reduce the end-to-end latency of such interactive applications. Nonetheless, the main performance bottleneck remains in the machine learning inference operation. In this paper, we question some assumptions of these works, such as the network location where edge computing is deployed, and the software architectures considered, within the framework of a couple of popular machine learning tasks. Our experimental evaluation shows that after performance tuning that leverages recent advances in deep learning algorithms and hardware, network latency is now the main bottleneck on end-to-end application performance. We also report that deploying computing capabilities at the first network node still provides latency reduction but, overall, it is not required by all applications. Based on our findings, we overview the requirements and sketch the design of an adaptive architecture for general machine learning inference across edge locations.

C Sharp, S Schaffert, A Woo, N Sastry, C Karlof, S Sastry, D Culler (2005)Design and implementation of a sensor network system for vehicle tracking and autonomous interception, In: E Cayirci, S Baydere, P Havinga (eds.), PROCEEDINGS OF THE SECOND EUROPEAN WORKSHOP ON WIRELESS SENSOR NETWORKS2005pp. 93-107 IEEE

We describe the design and implementation of PEG, a networked system of distributed sensor nodes that detects an uncooperative agent called the evader and assists an autonomous robot called the pursuer in capturing the evader. PEG requires embedded network services such as leader election, routing, network aggregation, and closed loop control. Instead of using general purpose distributed system solutions for these services, we employ whole-system analysis and rely on spatial and physical properties to create simple and efficient mechanisms. We believe this approach advances sensor network design, yielding pragmatic solutions that leverage physical properties to simplify design of embedded distributed systems. We deployed PEG on a 400 square meter field using 100 sensor nodes, and successfully intercepted the evader in all runs. We confronted practical issues such as node breakage, packaging decisions, in situ debugging, network reprogramming, and system reconfiguration. We discuss the approaches we took to cope with these issues and share our experiences in deploying a realistic outdoor sensor network system.

Aravindh Raman, Dmytro Karamshuk, Nishanth Sastry, Andrew Secker, Jigna Chandaria (2018)Consume Local: Towards Carbon Free Content Delivery, In: 2018 IEEE 38TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS)2018-pp. 994-1003 IEEE

P2P sharing amongst consumers has been proposed as a way to decrease load on Content Delivery Networks. This paper develops an analytical model that shows an additional benefit of sharing content locally: Selecting close by peers to share content from leads to shorter paths compared to traditional CDNs, decreasing the overall carbon footprint of the system. Using data from a month-long trace of over 3 million monthly users in London accessing TV shows online, we show that local sharing can result in a decrease of 24-48% in the system-wide carbon footprint of online video streaming, despite various obstacle factors that can restrict swarm sizes. We confirm the robustness of the savings by using realistic energy parameters drawn from two widely used settings. We also show that if the energy savings of the CDN servers are transferred as carbon credits to the end users, over 70% of users can become carbon positive, i.e., are able to support their content consumption without incurring any carbon footprint, and are able to offset their other carbon consumption. We suggest carbon credit transfers from CDNs to end users as a novel way to incentivise participation in peer-assisted content delivery.

David Dang, Christoforos Efstathiou, Dijue Sun, Haoran Yue, Nishanth R Sastry, Viji M Draviam (2023)Deep learning techniques and mathematical modeling allow 3D analysis of mitotic spindle dynamics, In: The Journal of cell biology222(5)

Time-lapse microscopy movies have transformed the study of subcellular dynamics. However, manual analysis of movies can introduce bias and variability, obscuring important insights. While automation can overcome such limitations, spatial and temporal discontinuities in time-lapse movies render methods such as 3D object segmentation and tracking difficult. Here, we present SpinX, a framework for reconstructing gaps between successive image frames by combining deep learning and mathematical object modeling. By incorporating expert feedback through selective annotations, SpinX identifies subcellular structures, despite confounding neighbor-cell information, non-uniform illumination, and variable fluorophore marker intensities. The automation and continuity introduced here allows the precise 3D tracking and analysis of spindle movements with respect to the cell cortex for the first time. We demonstrate the utility of SpinX using distinct spindle markers, cell lines, microscopes, and drug treatments. In summary, SpinX provides an exciting opportunity to study spindle dynamics in a sophisticated way, creating a framework for step changes in studies using time-lapse microscopy.

Changtao Zhong, Dmytro Karamshuk, Nishanth Sastry (2015)Predicting Pinterest: Automating a Distributed Human Computation, In: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW 2015)pp. 1417-1426 Assoc Computing Machinery

Every day, millions of users save content items for future use on sites like Pinterest, by "pinning" them onto carefully categorised personal pinboards, thereby creating personal taxonomies of the Web. This paper seeks to understand Pinterest as a distributed human computation that categorises images from around the Web. We show that despite being categorised onto personal pinboards by individual actions, there is generally a global agreement in implicitly assigning images into a coarse-grained global taxonomy of 32 categories, and furthermore, users tend to specialise in a handful of categories. By exploiting these characteristics, and augmenting with image-related features drawn from a state-of-the-art deep convolutional neural network, we develop a cascade of predictors that together automate a large fraction of Pinterest actions. Our end-to-end model is able to both predict whether a user will repin an image onto her own pinboard, and also which pinboard she might choose, with an accuracy of 0.69 (Accuracy@5 of 0.75).

Nasreen Anjum, Dmytro Karamshuk, Mohammad Shikh-Bahaei, Nishanth Sastry (2017)Survey on peer-assisted content delivery networks, In: Computer networks (Amsterdam, Netherlands : 1999)116pp. 79-95 Elsevier

Peer-assisted content delivery networks have recently emerged as an economically viable alternative to traditional content delivery approaches: the feasibility studies conducted for several large content providers suggested a remarkable potential of peer-assisted content delivery networks to reduce the burden of user requests on content delivery servers and several commercial peer-assisted deployments have been recently introduced. Yet there are many technical and commercial challenges which question the future of peer-assisted solutions in industrial settings. These include, among others, the unreliability of peer-to-peer networks, the lack of incentives for peers' participation, and copyright issues. In this paper, we carefully review and systematize this ongoing debate around the future of peer-assisted networks and propose a novel taxonomy to characterize the research and industrial efforts in the area. To this end, we conduct a comprehensive survey of the last decade in the peer-assisted content delivery research and devise a novel taxonomy to characterize the identified challenges and the respective proposed solutions in the literature. Our survey includes a thorough review of the three very large scale feasibility studies conducted for BBC iPlayer, MSN Video and Conviva, five large commercial peer-assisted CDNs - Kankan, LiveSky, Akamai NetSession, Spotify, Tudou - and a vast scope of technical papers. We focus both on technical challenges in deploying peer-assisted solutions and also on non-technical challenges caused due to heterogeneity in user access patterns and distribution of resources among users as well as commercial feasibility related challenges attributed to the necessity of accounting for the interests and incentives of Internet Service Providers, End-Users and Content Providers.
The results of our study suggest that many of the technical challenges for implementing peer-assisted content delivery networks on an industrial scale have been already addressed in the literature, whereas a problem of finding economically viable solutions to incentivize participation in peer-assisted schemes remains an open issue to a large extent. Furthermore, the emerging Internet of Things (IoT) is expected to enable expansion of conventional CDNs to a broader network of connected devices through machine to machine communication. (C) 2017 The Authors. Published by Elsevier B.V.

Douglas F. S. Nunes, Edson S. Moreira, Bruno Y. L. Kimura, Nishanth Sastry, Toktam Mahmoodi (2017)Attraction-Area Based Geo-Clustering for LTE Vehicular CrowdSensing Data Offloading, In: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MODELLING, ANALYSIS AND SIMULATION OF WIRELESS AND MOBILE SYSTEMS (MSWIM'17)2017-pp. 323-327 Assoc Computing Machinery

Vehicular CrowdSensing (VCS) is an emerging solution designed to remotely collect data from smart vehicles. It enables a dynamic and large-scale phenomena monitoring just by exploring the variety of technologies which have been embedded in modern cars. However, VCS applications might generate a huge amount of data traffic between vehicles and the remote monitoring center, which tends to overload the LTE networks. In this paper, we describe and analyze a gEo-clUstering approaCh for Lte vehIcular crowDsEnsing dAta offloadiNg (EUCLIDEAN). It takes advantage of opportunistic vehicle-to-vehicle (V2V) communications to support the VCS data upload process, preserving, as much as possible, the cellular network resources. In general, it is shown from the presented results that our proposal is a feasible and an effective scheme to reduce up to 92.98% of the global demand for LTE transmissions while performing vehicle-based sensing tasks in urban areas. The most encouraging results were perceived mainly under high-density conditions (i.e., above 125 vehicles/km²), where our solution provides the best benefits in terms of cellular network data offloading.

Yilei Liang, Dan O'Keeffe, Nishanth Sastry (2020)PAIGE: Towards a Hybrid-Edge Design for Privacy-Preserving Intelligent Personal Assistants, In: PROCEEDINGS OF THE THIRD ACM INTERNATIONAL WORKSHOP ON EDGE SYSTEMS, ANALYTICS AND NETWORKING (EDGESYS'20)pp. 55-60 Assoc Computing Machinery

Intelligent Personal Assistants (IPAs) such as Apple's Siri, Google Now, and Amazon Alexa are becoming an increasingly important class of web-service application. In contrast to keyword-oriented web search, IPAs provide a rich query interface that enables user interaction through images, audio, and natural language queries. However, supporting this interface involves compute-intensive machine-learning inference. To achieve acceptable performance, ML-driven IPAs increasingly depend on specialized hardware accelerators (e.g. GPUs, FPGAs or TPUs), increasing costs for IPA service providers. For end-users, IPAs also present considerable privacy risks given the sensitive nature of the data they capture. In this paper, we present Privacy Preserving Intelligent Personal Assistant at the EdGE (PAIGE), a hybrid edge-cloud architecture for privacy-preserving Intelligent Personal Assistants. PAIGE's design is founded on the assumption that recent advances in low-cost hardware for machine-learning inference offer an opportunity to offload compute-intensive IPA ML tasks to the network edge. To allow privacy-preserving access to large IPA databases for less compute-intensive pre-processed queries, PAIGE leverages trusted execution environments at the server side. PAIGE's hybrid design allows privacy-preserving hardware acceleration of compute-intensive tasks, while avoiding the need to move potentially large IPA question-answering databases to the edge. As a step towards realising PAIGE, we present a first systematic performance evaluation of existing edge accelerator hardware platforms for a subset of IPA workloads, and show they offer a competitive alternative to existing datacenter alternatives.

Kiev Gama, Victoria Rautenbach, Cameron Green, Breno Alencar Goncalves, Serena Coetzee, Nicolene Fourie, Nishanth Sastry (2019)Mapathons and Hackathons to Crowdsource the Generation and Usage of Geographic Data, In: INTERNATIONAL CONFERENCE ON GAME JAMS, HACKATHONS AND GAME CREATION EVENTS (ICGJ 2019)pp. 1-5 Assoc Computing Machinery

Mapathons and hackathons are short-lived events with different purposes. A mapathon is a collaborative effort for collecting geographic data in unmapped areas, while hackathons are focused on application development. Mapathon outputs need to be high quality to be reusable, but often, when applications are later built on top of map data, there is a mismatch between data collected and application requirements. We conducted an international collaboration project aiming to address this situation by creating a circular process where geographic information is collected in a mapathon and later used in a hackathon. Based on user feedback this cycle can be repeated so the collected data and developed applications can be improved. In this event report, we describe the two mapathon-hackathon cycles that were part of our pilot to validate that process. We present their outcomes and some lessons learned. We focused on the so-called "blue economy" (i.e. the sustainable use of marine and ocean resources for economic growth and improved livelihoods in coastal areas) as the target domain for this pilot. Data for carefully selected areas of the South African coast was collected in mapathons and later used in hackathons. The mapathons were held in South Africa, and the hackathons took place in Brazil.

Changtao Zhong, Nishanth Sastry (2017)Systems Applications of Social Networks, In: ACM computing surveys50(5)pp. 1-42 Assoc Computing Machinery

The aim of this article is to provide an understanding of social networks as a useful addition to the standard toolbox of techniques used by system designers. To this end, we give examples of how data about social links have been collected and used in different application contexts. We develop a broad taxonomy-based overview of common properties of social networks, review how they might be used in different applications, and point out potential pitfalls where appropriate. We propose a framework, distinguishing between two main types of social network-based user selection-personalised user selection, which identifies target users who may be relevant for a given source node, using the social network around the source as a context, and generic user selection or group delimitation, which filters for a set of users who satisfy a set of application requirements based on their social properties. Using this framework, we survey applications of social networks in three typical kinds of application scenarios: recommender systems, content-sharing systems (e.g., P2P or video streaming), and systems that defend against users who abuse the system (e.g., spam or sybil attacks). In each case, we discuss potential directions for future research that involve using social network properties.

Nishanth Sastry, Pan Hui (2012)Path Formation in Human Contact Networks, In: M T Thai, P M Pardalos (eds.), Handbook of Optimization in Complex Networkspp. 349-385 Springer Nature

The Pocket Switched Network (PSN) is a radical proposal to take advantage of short-range connectivity afforded by human face-to-face contacts, and create longer paths by having intermediate nodes ferry data on behalf of the sender. The Pocket Switched Network creates paths over time using transient social contacts. This chapter explores the achievable connectivity properties of this dynamically changing milieu, and gives a community-based heuristic to find efficient routes. We first employ empirical traces to examine the effect of the human contact process on data delivery. Contacts between a few node pairs are found to occur too frequently, leading to inadequate mixing of data, while the majority of contacts occur rarely, but are essential for global connectivity. We then examine all successful paths found by flooding and show that though delivery times vary widely, randomly sampling a small number of paths between each source and destination is sufficient to yield a delivery time distribution close to that of flooding over all paths. Thus, despite the apparent fragility implied by the reliance on rare edges, the rate at which the network can deliver data is remarkably robust to path failures. We then give a natural heuristic that finds routes by exploiting the latent social structure. Previous methods relied on building and updating routing tables to cope with dynamic network conditions. This has been shown to be cost ineffective due to the partial capture of transient network behavior. A more promising approach would be to capture the intrinsic characteristics of such networks and utilize them for routing decisions. We design and evaluate BUBBLE, a novel social-based forwarding algorithm, that utilizes the centrality and community metrics to enhance delivery performance. We empirically show that BUBBLE can efficiently identify good paths using several real mobility datasets.

Sagar Joglekar, Daniele Quercia, Miriam Redi, Luca Maria Aiello, Tobias Kauer, Nishanth Sastry (2020)FaceLift: a transparent deep learning framework to beautify urban scenes, In: Royal Society open science7(1)pp. 190987-190987 Royal Soc London

In the area of computer vision, deep learning techniques have recently been used to predict whether urban scenes are likely to be considered beautiful: it turns out that these techniques are able to make accurate predictions. Yet they fall short when it comes to generating actionable insights for urban design. To support urban interventions, one needs to go beyond predicting beauty, and tackle the challenge of recreating beauty. Unfortunately, deep learning techniques have not been designed with that challenge in mind. Given their 'black-box nature', these models cannot be directly used to explain why a particular urban scene is deemed to be beautiful. To partly fix that, we propose a deep learning framework (which we name FaceLift) that is able to both beautify existing urban scenes (Google Street Views) and explain which urban elements make those transformed scenes beautiful. To quantitatively evaluate our framework, we cannot resort to any existing metric (as the research problem at hand has never been tackled before) and need to formulate new ones. These new metrics should ideally capture the presence (or absence) of elements that make urban spaces great. Upon a review of the urban planning literature, we identify five main metrics: walkability, green spaces, openness, landmarks and visual complexity. We find that, across all the five metrics, the beautified scenes meet the expectations set by the literature on what great spaces tend to be made of. This result is further confirmed by a 20-participant expert survey in which FaceLift has been found to be effective in promoting citizen participation. All this suggests that, in the future, as our framework's components are further researched and become better and more sophisticated, it is not hard to imagine technologies that will be able to accurately and efficiently support architects and planners in the design of the spaces we intuitively love.

Xuehui Hu, Nishanth Sastry (2019)Characterising Third Party Cookie Usage in the EU after GDPR, In: PROCEEDINGS OF THE 11TH ACM CONFERENCE ON WEB SCIENCE (WEBSCI'19)pp. 137-141 Assoc Computing Machinery

The recently introduced General Data Protection Regulation (GDPR) requires that consent be obtained from individuals when collecting information online that could be used to identify them. Among other things, this affects many common forms of cookies, and users in the EU have been presented with notices asking for their approval of data collection. This paper examines the prevalence of third party cookies before and after GDPR by using two datasets: accesses to the top 500 websites according to Alexa.com, and weekly data of cookies placed in users' browsers by websites accessed by 16 UK and China users across one year. We find that on average the number of third parties dropped by more than 10% after GDPR, but when we examine real users' browsing histories over a year, we find that there is no material reduction in long-term numbers of third party cookies, suggesting that users are not making use of the choices offered by GDPR for increased privacy. Also, among websites which offer users a choice in whether and how they are tracked, accepting the default choices typically ends up storing more cookies on average than on websites which provide a notice of cookies stored but without giving users a choice of which cookies, or those that do not provide a cookie notice at all. We also find that top non-EU websites have fewer cookie notices, suggesting higher levels of tracking when visiting international sites. Our findings have deep implications both for understanding compliance with GDPR and for understanding the evolution of tracking on the web.

Jiaqiang Liu, Huan Yan, Yong Li, Dmytro Karamshuk, Nishanth Sastry, Di Wu, Depeng Jin (2021)Discovering and Understanding Geographical Video Viewing Patterns in Urban Neighborhoods, In: IEEE transactions on big data7(5)pp. 873-884 IEEE

Video accounts for a large proportion of traffic on the Internet. Understanding its geographical viewing patterns is extremely valuable for the design of Internet ecosystems for content delivery, recommendation and ads. While previous works have addressed this problem at coarse-grained scales (e.g., national), the urban-scale geographical patterns of video access have never been revealed. To this end, this article investigates whether distinct viewing patterns exist among the neighborhoods of a large-scale city. To achieve this, we need to address several challenges, including unknown pattern profiles, complicated urban neighborhoods, and comprehensive viewing features. The contributions of this article are twofold. First, we design a framework to automatically identify geographical video viewing patterns in urban neighborhoods. Second, using a dataset of two months of real video requests in Shanghai collected from one major ISP in China, we make a rigorous analysis of video viewing patterns in Shanghai. Our study reveals the following important observations. First, there exist four prevalent and distinct patterns of video access behavior in urban neighborhoods, which correspond to four different geographical contexts: downtown residential, office, suburb residential and hybrid regions. Second, there exist significant features that distinguish different patterns, e.g., the probabilities of viewing TV plays at midnight and of viewing cartoons at weekends can distinguish the two viewing patterns corresponding to downtown and suburb regions.

Sagar Joglekar, Nishanth Sastry, Neil S Coulson, Stephanie JC Taylor, Anita Patel, Robbie Duschinsky, Amrutha Anand, Matt Jameson Evans, Chris J Griffiths, Aziz Sheikh, Pietro Panzarasa, Anna De Simoni (2018)How Online Communities of People With Long-Term Conditions Function and Evolve: Network Analysis of the Structure and Dynamics of the Asthma UK and British Lung Foundation Online Communities (Preprint)
Rafael Cappelletti, Nishanth Sastry (2012)IARank: Ranking Users on Twitter in Near Real-time, Based on their Information Amplification Potential, In: PROCEEDINGS OF THE 2012 ASE INTERNATIONAL CONFERENCE ON SOCIAL INFORMATICS (SOCIALINFORMATICS 2012)pp. 70-77 IEEE

This work introduces IARank, a novel, simple and accurate model to continuously rank influential Twitter users in real-time. Our model is based on the information amplification potential of a user, that is, the capacity of the user to increase the audience of a tweet or another username that they find interesting, by retweets or mentions. We incorporate information amplification using two factors, the first of which indicates the tendency of a user to be retweeted or mentioned, and the second of which is proportional to the size of the audience of the retweets or mentions. We distinguish between cumulative influence acquired by a user over time, and an important tweet made by an otherwise not-important user, which deserves attention instantaneously, and devise our ranking scheme based on both notions of influence. We show that our methods produce rankings similar to PageRank, which is the basis for several other successful rankings of Twitter users. However, as opposed to PageRank-like algorithms, which take non-trivial time to converge, our method produces rankings in near-real time. We validate our results with a user study, which shows that our method ranks top users similarly to a manual ranking produced by the users themselves. Further, our ranking marginally outperformed PageRank, with 80% of the Top 5 most important users being classified as relevant to the event, whereas PageRank had 60% of the Top 5 users marked as relevant. However, PageRank produces slightly better rankings, which correlate better with the user-produced rankings, when considering users beyond the top 5.

Gianfranco Nencioni, Nishanth Sastry, Gareth Tyson, Vijay Badrinarayanan, Dmytro Karamshuk, Jigna Chandaria, Jon Crowcroft (2016)SCORE: Exploiting Global Broadcasts to Create Offline Personal Channels for On-Demand Access, In: IEEE/ACM transactions on networking24(4)pp. 2429-2442 IEEE

The last 5 years have seen a dramatic shift in media distribution. For decades, TV and radio were solely provisioned using push-based broadcast technologies, forcing people to adhere to fixed schedules. The introduction of catch-up services, however, has now augmented such delivery with online pull-based alternatives. Typically, these allow users to fetch content for a limited period after initial broadcast, allowing users flexibility in accessing content. Whereas previous work has investigated both of these technologies, this paper explores and contrasts them, focusing on the network consequences of moving towards this multifaceted delivery model. Using traces from nearly 6 million users of BBC iPlayer, one of the largest catch-up TV services, we study this shift from push- to pull-based access. We propose a novel technique for unifying both push- and pull-based delivery: the Speculative Content Offloading and Recording Engine (SCORE). SCORE operates as a set-top box, which interacts with both broadcast push and online pull services. Whenever users wish to access media, it automatically switches between these distribution mechanisms in an attempt to optimize energy efficiency and network resource utilization. SCORE also can predict user viewing patterns, automatically recording certain shows from the broadcast interface. Evaluations using our BBC iPlayer traces show that, based on parameter settings, an oracle with complete knowledge of user consumption can save nearly 77% of the energy, and over 90% of the peak bandwidth, of pure IP streaming. Optimizing for energy consumption, SCORE can recover nearly half of both traffic and energy savings.

P Rost, C Mannweiler, D Michalopoulos, C Sartori, V Sciancalepore, N Sastry, O Holland, S Tayade, B Han, D Bega, D Aziz, H Bakker (2017)Network Slicing to Enable Scalability and Flexibility in 5G Mobile Networks, In: arXiv.org Cornell University Library, arXiv.org

We argue for network slicing as an efficient solution that addresses the diverse requirements of 5G mobile networks, thus providing the necessary flexibility and scalability associated with future network implementations. We elaborate on the challenges that emerge when we design 5G networks based on network slicing. We focus on the architectural aspects associated with the coexistence of dedicated as well as shared slices in the network. In particular, we analyze the realization options of a flexible radio access network with a focus on network slicing and their impact on the design of 5G mobile networks. In addition to the technical study, this paper provides an investigation of the revenue potential of network slicing, where the applications that originate from such a concept and the profit capabilities from the network operator's perspective are put forward.

Sagar Joglekar, Nishanth Sastry, Neil S Coulson, Stephanie JC Taylor, Anita Patel, Robbie Duschinsky, Amrutha Anand, Matt Jameson Evans, Chris J Griffiths, Aziz Sheikh, Pietro Panzarasa, Anna De Simoni (2018)Addendum to the Acknowledgements: How Online Communities of People With Long-Term Conditions Function and Evolve: Network Analysis of the Structure and Dynamics of the Asthma UK and British Lung Foundation Online Communities (Preprint) JMIR Publications
Nishanth Sastry, Simon Lam (2005)CYRF, In: IEEE/ACM Transactions on Networking (TON)13(2)pp. 330-342 IEEE Press

This work presents a comprehensive theoretical framework for memoryless window-based congestion control protocols that are designed to converge to fairness and efficiency. We first derive a necessary and sufficient condition for stepwise convergence to fairness. Using this, we show how fair window increase/decrease policies can be constructed from suitable pairs of monotonically nondecreasing functions. We generalize this to smooth protocols that converge over each congestion epoch. The framework also includes a simple method for incorporating TCP-friendliness. Well-studied congestion control protocols such as TCP, GAIMD, and Binomial congestion control can be constructed using this method. Thus, we provide a common framework for the analysis of such window-based protocols. We also present two new congestion control protocols for streaming media-like applications as examples of protocol design in this framework: The first protocol, LOG, has the objective of reconciling the smoothness requirement of an application with the need for a fast dynamic response to congestion. The second protocol, SIGMOID, guarantees a minimum bandwidth for an application but behaves exactly like TCP for large windows.
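
As a rough illustration of the window-based family mentioned above, the sketch below implements the binomial increase/decrease rule as it is usually stated in the congestion control literature: increase by alpha / w^k when there is no loss, decrease by beta * w^l on loss, with k = 0 and l = 1 giving AIMD. It is not the paper's own construction. The function names, the parameter defaults, the floor of one packet and the two-flow link model with synchronized loss are all assumptions made for illustration.

```python
def binomial_step(window, loss, k, l, alpha=1.0, beta=0.5):
    """Apply one memoryless window update of the binomial family.

    No loss: w <- w + alpha / w**k
    Loss:    w <- w - beta * w**l
    k = 0, l = 1 reduces to AIMD (TCP-like with alpha=1, beta=0.5).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if loss:
        new_window = window - beta * window ** l
    else:
        new_window = window + alpha / window ** k
    # Assumed floor: never shrink below one packet.
    return max(new_window, 1.0)


def simulate_two_flows(w1, w2, capacity, steps, k=0, l=1):
    """Two flows share one link (hypothetical model).

    Both flows observe a loss in any step where their combined
    window exceeds the link capacity, and update independently.
    """
    for _ in range(steps):
        loss = (w1 + w2) > capacity
        w1 = binomial_step(w1, loss, k, l)
        w2 = binomial_step(w2, loss, k, l)
    return w1, w2


if __name__ == "__main__":
    # Start far from the fair share and watch the windows converge.
    final_w1, final_w2 = simulate_two_flows(1.0, 40.0, capacity=50.0, steps=2000)
    print(f"flow 1: {final_w1:.3f}, flow 2: {final_w2:.3f}")
```

With the AIMD setting, each synchronized loss halves the gap between the two windows while additive increase leaves it unchanged, which is the intuition behind stepwise convergence to fairness.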

Seyed Ehsan Ghoreishi, Vasilis Friderikos, Dmytro Karamshuk, Nishanth Sastry, A. Hamid Aghvami (2016)Provisioning Cost-Effective Mobile Video Caching, In: 2016 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC)pp. 1-7 IEEE

The exploding volumes of mobile video traffic call for deploying content caches inside mobile operators' networks. With in-network caching, users' requests for popular content can be served from a content cache deployed at mobile gateways in the vicinity of the end user, thereby considerably reducing the load on the content servers and the backbone of the operator's network. In practice, content caches can be installed at multiple levels inside an operator's network (e.g., serving gateway, packet data network gateway, RAN, etc.), leading to the idea of hierarchical in-network video caching. In order to evaluate the pros and cons of hierarchical caching, in this paper we formulate a cache provisioning problem which aims to find the best trade-off between the cost of cache storage and bandwidth savings from hierarchical caching. More specifically, we aim to find the optimal size of video caches at different layers of a hierarchical in-network caching architecture which minimizes the ratio of transmission bandwidth cost to storage cost. We overcome the complexity of our problem, which is formulated as a binary-integer programming (BIP) problem, by using canonical duality theory (CDT). Numerical results obtained using the invasive weed optimization (IWO) show that important gains can be achieved, with benefit-cost ratio and cost efficiency improvements of more than 43% and 38%, respectively.

Ming-Chun Lee, Mingyue Ji, Andreas F. Molisch, Nishanth Sastry (2019)Performance of Caching-Based D2D Video Distribution with Measured Popularity Distributions, In: 2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM)pp. 1-6 IEEE

On-demand video accounts for the majority of wireless data traffic. Video distribution schemes based on caching combined with device-to-device (D2D) communications promise order-of-magnitude greater spectral efficiency for video delivery, but hinge on the principle of "concentrated demand distributions." This paper presents, for the first time, the analysis and evaluations of the throughput-outage tradeoff of such schemes based on measured cellular demand distributions. In particular, we use a dataset with more than 100 million requests from the BBC iPlayer, a popular video streaming service in the U.K., as the foundation of the analysis and evaluations. We present an achievable scaling law based on the practical popularity distribution, and show that this scaling law is identical to those reported in the literature. We also find that, in numerical evaluations based on a realistic setup, order-of-magnitude improvements can be achieved. Our results indicate that the benefits promised for caching-based D2D in the literature could be retained for cellular networks in practice.

Giridhari Venkatadri, Oana Goga, Changtao Zhong, Bimal Viswanath, Krishna P. Gummadi, Nishanth Sastry (2016)Strengthening Weak Identities Through Inter-Domain Trust Transfer, In: PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'16)pp. 1249-1259 Assoc Computing Machinery

On most current websites untrustworthy or spammy identities are easily created. Existing proposals to detect untrustworthy identities rely on reputation signals obtained by observing the activities of identities over time within a single site or domain; thus, there is a time lag before which websites cannot easily distinguish attackers and legitimate users. In this paper, we investigate the feasibility of leveraging information about identities that is aggregated across multiple domains to reason about their trustworthiness. Our key insight is that while honest users naturally maintain identities across multiple domains (where they have proven their trustworthiness and have acquired reputation over time), attackers are discouraged by the additional effort and costs to do the same. We propose a flexible framework to transfer trust between domains that can be implemented in today's systems without significant loss of privacy or significant implementation overheads. We demonstrate the potential for inter-domain trust assessment using extensive data collected from Pinterest, Facebook, and Twitter. Our results show that newer domains such as Pinterest can benefit by transferring trust from more established domains such as Facebook and Twitter by being able to declare more users as likely to be trustworthy much earlier on (approx. one year earlier).

Ranjan Pal, Konstantinos Psounis, Jon Crowcroft, Frank Kelly, Pan Hui, Sasu Tarkoma, Abhishek Kumar, John Kelly, Aritra Chatterjee, Leana Golubchik, Nishanth Sastry, Bodhibrata Nag (2020)When Are Cyber Blackouts in Modern Service Networks Likely?: A Network Oblivious Theory on Cyber (Re)Insurance Feasibility, In: ACM transactions on management information systems11(2)pp. 1-38 Assoc Computing Machinery

Service liability interconnections among globally networked IT- and IoT-driven service organizations create potential channels for cascading service disruptions worth billions of dollars, due to modern cyber-crimes such as DDoS, APT, and ransomware attacks. A natural question that arises in this context is: What is the likelihood of a cyber-blackout?, where the latter term is defined as the probability that all (or a major subset of) organizations in a service chain become dysfunctional in a certain manner due to a cyber-attack at some or all points in the chain. The answer to this question has major implications for risk management businesses such as cyber-insurance when it comes to designing policies by risk-averse insurers for providing coverage to clients in the aftermath of such catastrophic network events. In this article, we investigate this question in general as a function of service chain networks and different cyber-loss distribution types. We show somewhat surprisingly (and discuss the potential practical implications) that, following a cyber-attack, the effect of (a) a network interconnection topology and (b) a wide range of loss distributions on the probability of a cyber-blackout and the increase in total service-related monetary losses across all organizations are mostly very small. The primary rationale behind these results is attributed to degrees of heterogeneity in the revenue base among organizations and the Increasing Failure Rate property of popular (i.i.d/non-i.i.d) loss distributions, i.e., log-concave cyber-loss distributions. The result will enable risk-averse cyber-risk managers to safely infer the impact of cyber-attacks in a worst-case network- and distribution-oblivious setting.

Sagar Joglekar, Nishanth Sastry, Neil S. Coulson, Stephanie J. C. Taylor, Anita Patel, Robbie Duschinsky, Amrutha Anand, Matt Jameson Evans, Chris J. Griffiths, Aziz Sheikh, Pietro Panzarasa, Anna De Simoni (2018)How Online Communities of People With Long-Term Conditions Function and Evolve: Network Analysis of the Structure and Dynamics of the Asthma UK and British Lung Foundation Online Communities (vol 20, e238, 2018), In: Journal of medical Internet research20(9) Jmir Publications, Inc
Ming-Chun Lee, Mingyue Ji, Andreas F. Molisch, Nishanth Sastry (2019)Throughput-Outage Analysis and Evaluation of Cache-Aided D2D Networks With Measured Popularity Distributions, In: IEEE transactions on wireless communications18(11)pp. 5316-5332 IEEE

Caching of video files on user devices, combined with file exchange through device-to-device (D2D) communications, is a promising method for increasing the throughput of wireless networks. Previous theoretical investigations showed that throughput can be increased by orders of magnitude, but assumed a Zipf distribution for modeling the popularity distribution, which was based on observations in wired networks. Thus, the question of whether cache-aided D2D video distribution can provide in practice the benefits promised by existing theoretical literature remains open. To answer this question, we provide new results specifically for popularity distributions of video requests of mobile users. Based on an extensive real-world dataset, we adopt a generalized distribution, known as the Mandelbrot-Zipf (MZipf) distribution. We first show that this popularity distribution can fit the practical data well. Using this distribution, we analyze the throughput-outage tradeoff of the cache-aided D2D network and show that the scaling law is identical to the case of Zipf popularity distribution when the MZipf distribution is sufficiently skewed, implying that the benefits previously promised in the literature could indeed be realized in practice. To support the theory, practical evaluations using numerical experiments are provided, and show that cache-aided D2D can outperform conventional unicasting from base stations.
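
The Mandelbrot-Zipf form is commonly written as p(i) proportional to (i + q)^(-gamma) for file rank i, where q is a plateau parameter and q = 0 recovers the plain Zipf distribution. The following is a minimal sketch under that assumption. The library size, the parameter values and the helper name top_k_hit_ratio are hypothetical and are not taken from the paper or its dataset.

```python
def mzipf_pmf(num_files, gamma, q):
    """Return the MZipf probabilities for ranks 1..num_files.

    p(i) = (i + q)**(-gamma) / sum_j (j + q)**(-gamma)
    gamma controls the skew; q >= 0 flattens the head (q = 0 is Zipf).
    """
    if num_files < 1:
        raise ValueError("num_files must be at least 1")
    if q < 0:
        raise ValueError("q must be non-negative")
    weights = [(rank + q) ** -gamma for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


def top_k_hit_ratio(pmf, k):
    """Share of requests served if the k most popular files are cached."""
    return sum(pmf[:k])


if __name__ == "__main__":
    # Illustrative values only.
    zipf = mzipf_pmf(1000, gamma=1.0, q=0.0)
    mzipf = mzipf_pmf(1000, gamma=1.0, q=50.0)
    print(f"Zipf  top-10 hit ratio: {top_k_hit_ratio(zipf, 10):.3f}")
    print(f"MZipf top-10 hit ratio: {top_k_hit_ratio(mzipf, 10):.3f}")
```

A larger q spreads probability mass away from the most popular files, so a small cache captures fewer requests than under plain Zipf with the same gamma.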

Changtao Zhong, Mostafa Salehi, Sunil Shah, Marius Cobzarenco, Nishanth Sastry, Meeyoung Cha (2014)Social Bootstrapping: How Pinterest and Last.fm Social Communities Benefit by Borrowing Links from Facebook, In: WWW'14: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON WORLD WIDE WEBpp. 305-314 Assoc Computing Machinery

How does one develop a new online community that is highly engaging to each user and promotes social interaction? A number of websites offer friend-finding features that help users bootstrap social networks on the website by copying links from an established network like Facebook or Twitter. This paper quantifies the extent to which such social bootstrapping is effective in enhancing the social experience of the website. First, we develop a stylised analytical model that suggests that copying tends to produce a giant connected component (i.e., a connected community) quickly and preserves properties such as reciprocity and clustering, up to a linear multiplicative factor. Second, we use data from two websites, Pinterest and Last.fm, to empirically compare the subgraph of links copied from Facebook to links created natively. We find that the copied subgraph has a giant component, higher reciprocity and clustering, and confirm that the copied connections see higher social interactions. However, the need for copying diminishes as users become more active and influential. Such users tend to create links natively on the website, to users who are more similar to them than their Facebook friends. Our findings give new insights into understanding how bootstrapping from established social networks can help engage new users by enhancing social interactivity.
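
Two of the graph properties compared above can be computed with a few lines of code. The sketch below uses common textbook definitions, which may differ from the exact measures used in the paper: reciprocity as the fraction of directed links whose reverse link also exists, and the giant component as the largest weakly connected component. The toy edge lists are invented for illustration.

```python
from collections import defaultdict, deque


def reciprocity(edges):
    """Fraction of directed links (u, v) for which (v, u) also exists."""
    edge_set = {(u, v) for u, v in edges if u != v}
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


def largest_component_share(edges):
    """Share of nodes in the largest weakly connected component."""
    neighbours = defaultdict(set)
    for u, v in edges:
        # Ignore direction when testing connectivity.
        neighbours[u].add(v)
        neighbours[v].add(u)
    if not neighbours:
        return 0.0
    seen = set()
    best = 0
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search over one component.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, size)
    return best / len(neighbours)


if __name__ == "__main__":
    # Hypothetical toy data: links copied from another network vs. native links.
    copied = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]
    native = [("a", "b"), ("c", "d")]
    for name, links in (("copied", copied), ("native", native)):
        print(name, reciprocity(links), largest_component_share(links))
```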

A De Simoni, S Joglekar, SJC Taylor, A Patel, R Duschinsky, N Coulson, C Griffiths, P Panzarasa, N Sastry, A Anand, M J Evans (2017)Structure and dynamics of online patients' communities: the case of Asthma UK and BLF online fora