Crowdsourcing is a popular paradigm for soliciting forecasts on future events. As people may have different forecasts, how to aggregate solicited forecasts into a single accurate prediction remains an important challenge, especially when no historical accuracy information is available for identifying experts. In this paper, we borrow ideas from the peer prediction literature and assess the prediction accuracy of participants using solely the collected forecasts. This approach leverages the correlations among peer reports to cross-validate each participant's forecasts and allows us to assign a peer assessment score (PAS) to each agent as a proxy for the agent's prediction accuracy. We identify several empirically effective methods to generate PAS and propose an aggregation framework that uses PAS to identify experts and to boost existing aggregators' prediction accuracy. We evaluate our methods over 14 real-world datasets and show that (i) PAS generated from peer prediction methods can approximately reflect the prediction accuracy of agents, and (ii) our aggregation framework demonstrates consistent and significant improvement in prediction accuracy over existing aggregators for both binary and multi-choice questions under three popular accuracy measures: Brier score (mean square error), log score (cross-entropy loss), and AUC-ROC.
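For readers unfamiliar with two of the accuracy measures named above, the following minimal sketch computes the Brier score and the log score for binary questions. It is not taken from the paper: the function names, the clipping constant `eps`, and the 0/1 outcome encoding are assumptions made for illustration only.

```python
import math

def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)

def log_score(forecasts, outcomes, eps=1e-12):
    """Mean cross-entropy loss; forecasts are clipped to avoid log(0).

    The clipping constant eps is an illustrative assumption.
    """
    total = 0.0
    for p, y in zip(forecasts, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(forecasts)
```

Both measures are losses in this form, so lower values indicate more accurate forecasts.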
We consider the problem of purchasing data for machine learning or statistical estimation. The data analyst has a budget to purchase datasets from multiple data providers. She does not have any test data that can be used to evaluate the collected data and can assign payments to data providers solely based on the collected datasets. We consider the problem in the standard Bayesian paradigm and in two settings: (1) data are only collected once; (2) data are collected repeatedly and each day's data are drawn independently from the same distribution. For both settings, our mechanisms guarantee that truthfully reporting one's dataset is always an equilibrium by adopting techniques from peer prediction: pay each provider the mutual information between his reported data and other providers' reported data. Depending on the data distribution, the mechanisms can also discourage misreports that would lead to inaccurate predictions. Our mechanisms also guarantee individual rationality and budget feasibility for certain underlying distributions in the first setting and for all distributions in the second setting.
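The payment idea above, rewarding a provider by the mutual information between his report and the others' reports, can be illustrated with a simple plug-in estimate over discrete symbols. This is only a sketch under strong assumptions: the abstract describes a Bayesian mechanism whose exact payment rule is not given here, and the function name and the use of empirical frequencies are hypothetical choices for illustration.

```python
import math
from collections import Counter

def empirical_mutual_information(xs, ys):
    """Plug-in estimate (in nats) of the mutual information between two
    equally long sequences of discrete reports.

    Illustrative only: this uses empirical frequencies rather than the
    Bayesian posterior computation a real mechanism would rely on.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi
```

Under this estimate, a report that is statistically unrelated to the others' reports earns roughly zero, while a report that tracks them closely earns more.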
Some important indoor localization applications, such as localizing a lost kid in a shopping mall, call for a new peer-to-peer localization technique that can localize an individual's smartphone or wearables by directly using another's on-body devices in unknown indoor environments. However, current localization solutions require either pre-deployed infrastructure or multiple antennas in both transceivers, impeding their wide-scale application. In this paper, we present P2PLocate, a peer-to-peer localization system that enables a single-antenna device co-located with a batteryless backscatter tag to localize another single-antenna device with decimeter-level accuracy. P2PLocate leverages the multipath variations intentionally created by an on-body backscatter tag, coupled with spatial information offered by user movements, to accomplish this objective without relying on any pre-deployed infrastructure or pre-training. P2PLocate incorporates novel algorithms to address two major challenges: (i) interference from the strong direct-path signal while extracting multipath variations, and (ii) lack of direction information when using single-antenna transceivers. We implement P2PLocate on commercial off-the-shelf Google Nexus 6p, Intel 5300 WiFi card, and Raspberry Pi B4. Real-world experiments reveal that P2PLocate can localize both static and mobile targets with a median accuracy of 0.88 m.
To mitigate attacks by malicious peers and to motivate peers to share resources in peer-to-peer networks, several reputation systems have been proposed in the past. In most of them, the peers evaluate other peers based on their past interactions and then aggregate this information across the whole network. However, such an aggregation process requires approximations in order to converge to a global consensus, so the result may not be a true reflection of the peers' past behavior. Moreover, this type of aggregation gives only a relative ranking of peers without any absolute evaluation of their past. This matters most when all the peers responding to a query are malicious. In such a situation, we can only know which of them is better, without knowing their rank in the whole network. In this paper, we propose a new algorithm that accounts for the past behavior of the peers and estimates the absolute value of their trust. Consequently, we can suitably identify them as good peers or malicious peers. With suitably chosen parameters, our algorithm converges to a global consensus much faster. Because of its absolute nature, it loads all the peers in the network equally. It also reduces inauthentic downloads in the network, which was not possible with existing algorithms.
Metro origin-destination prediction is a crucial yet challenging time-series analysis task in intelligent transportation systems, which aims to accurately forecast two specific types of cross-station ridership, i.e., Origin-Destination (OD) ridership and Destination-Origin (DO) ridership. However, complete OD matrices of previous time intervals cannot be obtained immediately in online metro systems, and conventional methods use only limited information to forecast the future OD and DO ridership separately. In this work, we propose a novel neural network module termed Heterogeneous Information Aggregation Machine (HIAM), which fully exploits heterogeneous information of historical data (e.g., incomplete OD matrices, unfinished order vectors, and DO matrices) to jointly learn the evolutionary patterns of OD and DO ridership. Specifically, an OD modeling branch explicitly estimates the potential destinations of unfinished orders to complement the information of incomplete OD matrices, while a DO modeling branch takes DO matrices as input to capture the spatial-temporal distribution of DO ridership. Moreover, a Dual Information Transformer is introduced to propagate the mutual information among OD features and DO features for modeling the OD-DO causality and correlation. Based on the proposed HIAM, we develop a unified Seq2Seq network to forecast the future OD and DO ridership simultaneously. Extensive experiments conducted on two large-scale benchmarks demonstrate the effectiveness of our method for online metro origin-destination prediction.
In the setting where we ask participants multiple similar, possibly subjective multi-choice questions (e.g., Do you like Bulbasaur? Y/N; do you like Squirtle? Y/N), peer prediction aims to design mechanisms that encourage honest feedback without verification. A series of works have successfully designed multi-task peer prediction mechanisms where reporting truthfully is better than any other strategy (dominantly truthful), but they require an infinite number of tasks. A recent work proposes the first multi-task peer prediction mechanism, Determinant Mutual Information (DMI)-Mechanism, which is not only dominantly truthful but also works for a finite number of tasks (practical). However, few works consider how to optimize multi-task peer prediction mechanisms. In addition to the definition of the optimization goal, the biggest challenge is that we do not have space for optimization, since there is only a single practical and dominantly truthful mechanism. This work addresses this problem by proposing a tractable effort incentive optimization goal and generalizing DMI-Mechanism to a new family of practical, dominantly truthful mechanisms, Volume Mutual Information (VMI)-Mechanisms. We show that DMI-Mechanism may not be optimal, but that we can construct a sequence of VMI-Mechanisms that are approximately optimal. The main technical tool is a novel family of mutual information measures, Volume Mutual Information, which generalizes Determinant Mutual Information. We construct VMI by a simple geometric idea: we measure how informative a distribution is by measuring the volume of distributions that are less informative than it (inappropriately, it's similar to measuring how clever a person is by counting the number of people that are less clever than he/she).
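As background for the DMI-Mechanism mentioned above, the sketch below shows a determinant-based payment for two agents answering binary questions. It follows the commonly described recipe (split the tasks into two disjoint batches, build a joint answer-count matrix for each batch, and multiply the two determinants) but omits any normalization constant. The abstract itself does not spell out this rule, so the function names, the 0/1 answer encoding, and the even split are assumptions, and this is not the VMI construction.

```python
def joint_count_matrix(answers_a, answers_b):
    """2x2 matrix whose (i, j) entry counts tasks where A said i and B said j."""
    m = [[0, 0], [0, 0]]
    for a, b in zip(answers_a, answers_b):
        m[a][b] += 1
    return m

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def dmi_style_payment(answers_a, answers_b):
    """Unnormalized determinant-based payment for binary answers (0 or 1).

    Assumes at least 4 shared tasks, split into a first and second half.
    """
    if len(answers_a) != len(answers_b) or len(answers_a) < 4:
        raise ValueError("need two equally long answer lists with >= 4 tasks")
    half = len(answers_a) // 2
    m1 = joint_count_matrix(answers_a[:half], answers_b[:half])
    m2 = joint_count_matrix(answers_a[half:], answers_b[half:])
    return det2(m1) * det2(m2)
```

In this sketch, an agent who gives the same answer to every task produces a count matrix with an all-zero row, so the determinant, and hence the payment, is zero.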