A wide range of approaches has been applied to manage the spread of global pandemics such as COVID-19, with varying degrees of success. Given the large-scale social and economic impact, coupled with the increasing time span of the pandemic, it is important not only to manage the spread of the disease but also to put extra effort into measures that expedite the resumption of social and economic life. It is therefore important to identify situations that carry high risk, and to act early whenever such situations are identified. While a large number of mobile applications have been developed, they are aimed at obtaining information that can be used for contact tracing, not at estimating the risk of social situations. In this paper, we introduce an infection risk score that provides an estimate of the infection risk arising from human contacts. Using a real-world human contact dataset, we show that the proposed risk score can provide a realistic estimate of the level of risk in the population. We also describe how the proposed infection risk score can be implemented on smartphones. Finally, we identify representative use cases that can leverage the risk score to minimize infection propagation.
There is little information from independent sources in the public domain about mobile malware infection rates. The only previous independent estimate (0.0009%) [12] was based on indirect measurements obtained from domain name resolution traces. In this paper, we present the first independent study of malware infection rates and associated risk factors using data collected directly from over 55,000 Android devices. We find that the malware infection rates in Android devices estimated using two malware datasets (0.28% and 0.26%), though small, are significantly higher than the previous independent estimate. Using our datasets, we investigate how indicators extracted inexpensively from the devices correlate with malware infection. Based on the hypothesis that some application stores have a greater density of malicious applications, and that advertising within applications and cross-promotional deals may act as infection vectors, we investigate whether the set of applications used on a device can serve as an indicator of infection of that device. Our analysis indicates that this alone is not an accurate indicator for pinpointing infection. However, it is a very inexpensive but surprisingly useful way of significantly narrowing down the pool of devices on which expensive monitoring and analysis mechanisms must be deployed. Using our two malware datasets, we show that this indicator performs 4.8 and 4.6 times better (respectively) at identifying infected devices than the baseline of random checks. Such indicators can be used, for example, in the search for new or previously undetected malware. It is therefore a technique that can complement standard malware scanning by anti-malware tools. Our analysis also demonstrates a marginally significant difference in battery use between infected and clean devices.
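One plausible reading of "times better than the baseline of random checks" is a lift: the infection rate among the devices an indicator flags, divided by the infection rate in the whole population. The sketch below illustrates that reading only; the function name `lift_over_random` and the toy device IDs are hypothetical, and the abstract does not state that this is the exact metric used.

```python
def lift_over_random(flagged, infected, population_size):
    """Return how many times more often a flagged device is infected
    than a device picked uniformly at random (assumed metric)."""
    flagged = set(flagged)
    infected = set(infected)
    if not flagged or not infected or population_size <= 0:
        raise ValueError("need non-empty flagged and infected sets")
    # Hit rate of random checks: overall infection rate.
    base_rate = len(infected) / population_size
    # Hit rate when only flagged devices are checked.
    flagged_rate = len(flagged & infected) / len(flagged)
    return flagged_rate / base_rate


# Toy data (hypothetical): 1000 devices, 10 infected, 100 flagged,
# and 5 of the flagged devices are actually infected.
infected_ids = range(0, 10)
flagged_ids = list(range(5, 10)) + list(range(500, 595))
toy_lift = lift_over_random(flagged_ids, infected_ids, 1000)
```

In this toy setting the flagged pool is ten times smaller than the population yet still contains half of the infected devices, which is the kind of narrowing the abstract describes.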
Digital contact tracing apps for COVID, such as the one developed by Google and Apple, need to estimate the risk that a user was infected during a particular exposure, in order to decide whether to notify the user to take precautions, such as entering into quarantine, or requesting a test. Such risk score models contain numerous parameters that must be set by the public health authority. In this paper, we show how to automatically learn these parameters from data. Our method needs access to exposure and outcome data. Although this data is already being collected (in an aggregated, privacy-preserving way) by several health authorities, in this paper we limit ourselves to simulated data, so that we can systematically study the different factors that affect the feasibility of the approach. In particular, we show that the parameters become harder to estimate when there is more missing data (e.g., due to infections which were not recorded by the app), and when there is model misspecification. Nevertheless, the learning approach outperforms a strong manually designed baseline. Furthermore, the learning approach can adapt even when the risk factors of the disease change, e.g., due to the evolution of new variants, or the adoption of vaccines.
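The idea of learning risk score parameters from simulated exposure and outcome data can be illustrated with a deliberately reduced sketch. It assumes a single-parameter model in which the infection probability for an exposure of `minutes` duration is 1 - exp(-rate * minutes); this is an assumption made for illustration, not the multi-parameter model the abstract refers to, and all function names are hypothetical. The rate is recovered by maximum likelihood, using bisection on the derivative of the log-likelihood.

```python
import math
import random


def simulate_exposures(true_rate, n, seed=0):
    """Simulate (minutes, infected) pairs under the assumed model
    P(infected) = 1 - exp(-true_rate * minutes)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        minutes = rng.uniform(1.0, 60.0)
        p_infect = -math.expm1(-true_rate * minutes)
        data.append((minutes, rng.random() < p_infect))
    return data


def log_likelihood_slope(rate, data):
    """Derivative of the log-likelihood with respect to the rate."""
    slope = 0.0
    for minutes, infected in data:
        if infected:
            survive = math.exp(-rate * minutes)
            slope += minutes * survive / -math.expm1(-rate * minutes)
        else:
            slope -= minutes
    return slope


def fit_rate(data, lo=1e-6, hi=1.0, iterations=60):
    """Maximum-likelihood rate via bisection; the slope is decreasing
    in the rate because the log-likelihood is concave."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if log_likelihood_slope(mid, data) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


exposures = simulate_exposures(true_rate=0.02, n=5000, seed=1)
estimated_rate = fit_rate(exposures)
```

Because the data are simulated, the estimate can be compared directly with the rate used to generate them, which is the advantage of simulation that the abstract points to. Dropping a fraction of the infected outcomes before fitting would be one simple way to probe the missing-data effect it mentions.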
Identifying the infection sources in a network, including the index cases that introduce a contagious disease into a population network, the servers that inject a computer virus into a computer network, or the individuals who started a rumor in a social network, plays a critical role in limiting the damage caused by the infection through timely quarantine of the sources. We consider the problem of estimating the infection sources and the infection regions (subsets of nodes infected by each source) in a network, based only on knowledge of which nodes are infected and their connections, when the number of sources is unknown a priori. We derive estimators for the infection sources and their infection regions based on approximations of the infection sequences count. We prove that if there are at most two infection sources in a geometric tree, our estimator identifies the true source or sources with probability going to one as the number of infected nodes increases. When there are more than two infection sources, and the maximum possible number of infection sources is known, we propose an algorithm with quadratic complexity to estimate the actual number and identities of the infection sources. Simulations on various kinds of networks, including tree networks, small-world networks, and real-world power grid networks, and tests on two real data sets are provided to verify the performance of our estimators.
Agent-Based Models (ABMs) are a powerful class of computational models widely used to simulate complex phenomena in many different application areas. However, one of their most critical aspects, poorly investigated in the literature, is an important step of model credibility assessment: solution verification. This study addresses this gap by proposing a general verification framework for Agent-Based Models that aims at evaluating the numerical errors associated with the model. A step-by-step procedure, which consists of two main verification studies (deterministic and stochastic model verification), is described in detail and applied to a specific mission-critical scenario: the quantification of the numerical approximation error for UISS-TB, an ABM of the human immune system developed to predict the progression of pulmonary tuberculosis. The results indicate that the proposed model verification workflow can be used to systematically identify and quantify numerical approximation errors associated with UISS-TB and, in general, with any other ABM.
We consider real-time timely tracking of the infection status (e.g., COVID-19) of individuals in a population. In this work, a health care provider wants to detect infected people, as well as people who have recovered from the disease, as quickly as possible. To measure the timeliness of the tracking process, we use the long-term average difference between the actual infection status of the people and their real-time estimate by the health care provider based on the most recent test results. We first find an analytical expression for this average difference for given test rates and given infection and recovery rates of people. Next, we propose an alternating-minimization-based algorithm to minimize this average difference. We observe that if the total test rate is limited, instead of testing all members of the population equally, only a portion of the population is tested based on their infection and recovery rates. We also observe that increasing the total test rate helps track the infection status better. In addition, an increased population size increases the diversity of people with different infection and recovery rates, which may be exploited to spend testing capacity more efficiently, thereby improving the system performance. Finally, depending on the health care provider's preferences, test rate allocation can be altered to detect either the infected people or the recovered people more quickly.
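The timeliness metric described above can be illustrated for a single person with a minimal simulation. The sketch assumes a simplified model that is not taken from the paper: the person alternates between healthy and infected with exponential holding times, tests arrive as a Poisson process, tests are error-free, and the provider's estimate is the latest test result. Under exactly those assumptions, the stationary fraction of time the estimate is wrong works out to 2λμ / ((λ + μ)(s + λ + μ)), where λ, μ and s are the infection, recovery and test rates; the code uses it only as a cross-check, and it should not be read as the paper's analytical expression. All names are hypothetical.

```python
import random


def simulate_mismatch_fraction(infect_rate, recover_rate, test_rate,
                               horizon, seed=0):
    """Fraction of time the latest test result differs from the true
    status, for one person under the assumed two-state model."""
    rng = random.Random(seed)
    now = 0.0
    status = 0      # true status: 0 = healthy, 1 = infected
    estimate = 0    # provider's estimate: last test result
    mismatch_time = 0.0
    while True:
        change_rate = infect_rate if status == 0 else recover_rate
        total_rate = change_rate + test_rate
        wait = rng.expovariate(total_rate)
        if status != estimate:
            mismatch_time += min(wait, horizon - now)
        if now + wait >= horizon:
            break
        now += wait
        if rng.random() < test_rate / total_rate:
            estimate = status        # a test reveals the true status
        else:
            status = 1 - status      # infection or recovery occurs
    return mismatch_time / horizon


def closed_form_mismatch(infect_rate, recover_rate, test_rate):
    """Stationary mismatch probability under the same assumptions."""
    rate_sum = infect_rate + recover_rate
    return (2.0 * infect_rate * recover_rate
            / (rate_sum * (test_rate + rate_sum)))


simulated = simulate_mismatch_fraction(0.2, 0.5, 1.0, horizon=50000.0, seed=1)
predicted = closed_form_mismatch(0.2, 0.5, 1.0)
```

In this simplified setting the mismatch decreases as the test rate grows, consistent with the observation that a higher total test rate improves tracking. Allocating a limited total test rate across people with different rates would be a separate optimization step that this sketch does not attempt.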