
User Profiling Using Hinge-loss Markov Random Fields

Published by Golnoosh Farnadi
Publication date: 2020
Research field: Informatics Engineering
Language: English





A variety of approaches have been proposed to automatically infer the profiles of users from their digital footprint in social media. Most of the proposed approaches focus on mining a single type of information, while ignoring other sources of available user-generated content (UGC). In this paper, we propose a mechanism to infer a variety of user characteristics, such as age, gender, and personality traits, which can then be compiled into a user profile. To this end, we model social media users by incorporating and reasoning over multiple sources of UGC as well as social relations. Our model is based on a statistical relational learning framework using Hinge-loss Markov Random Fields (HL-MRFs), a class of probabilistic graphical models that can be defined using a set of first-order logical rules. We validate our approach on data from Facebook with more than 5k users and almost 725k relations. We show how HL-MRFs can be used to develop a generic and extensible user profiling framework by leveraging textual, visual, and relational content in the form of status updates, profile pictures, and Facebook page likes. Our experimental results demonstrate that our proposed model successfully incorporates multiple sources of information and outperforms competing methods that use only one source of information, as well as an ensemble method across the different sources, for modeling users in social media.
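To make the phrase "defined using a set of first-order logical rules" more concrete, the following is a minimal sketch of how a single weighted rule is usually turned into a hinge-loss potential under the Lukasiewicz relaxation. The rule, the predicate names (Friend, Female), the weight, and the truth values are hypothetical illustrations; the abstract does not list the rules actually used in the paper.

```python
# Sketch of one hinge-loss potential, assuming the standard Lukasiewicz
# relaxation of a weighted rule. Hypothetical rule (not from the paper):
#   w : Friend(A, B) AND Female(A) -> Female(B)
# Truth values are soft, i.e. real numbers in [0, 1].

def lukasiewicz_and(*truths):
    """Lukasiewicz t-norm: soft conjunction of values in [0, 1]."""
    return max(0.0, sum(truths) - (len(truths) - 1))


def distance_to_satisfaction(body, head):
    """How far the rule body -> head is from being satisfied (0 = satisfied)."""
    return max(0.0, lukasiewicz_and(*body) - head)


def hinge_potential(weight, body, head, squared=False):
    """Weighted hinge-loss potential; the squared variant penalizes large gaps more."""
    d = distance_to_satisfaction(body, head)
    return weight * (d ** 2 if squared else d)


# Hypothetical soft truth values: Friend(A, B) = 0.9, Female(A) = 0.8,
# and a current guess Female(B) = 0.3.
potential = hinge_potential(2.0, body=[0.9, 0.8], head=0.3)
print(potential)
```

An HL-MRF sums such potentials over all groundings of all rules, and inference searches for the soft truth values that minimize the total. Because each potential is a maximum of linear functions of the unknowns, that objective is convex.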




Read also

The dynamic monitoring of commuting flows is crucial for improving transit systems in fast-developing cities around the world. However, existing methods for inferring commuting origins and destinations rely either on large-scale survey data, which is inherently expensive to collect, or on Call Detail Records combined with ad-hoc heuristic assignment rules based on the frequency of appearance at given locations. In this paper, we propose a novel method to accurately infer the origins and destinations of commuting flows based on individuals' spatial-temporal patterns inferred from Call Detail Records. Our method significantly improves accuracy over the heuristic assignment rules popularly adopted in the literature. Starting with the historical data of geo-temporal travel patterns for a panel of individuals, we create, for each person-location, a probability distribution vector capturing the likelihood that the person will appear in that location at a given time of day. Stacked in this way, the matrix of historical geo-temporal data enables us to apply eigendecomposition and use unsupervised machine learning techniques to extract commonalities across locations for the different groups of travelers, which ultimately allows us to make inferences and create labels, such as home and work, for specific locations. Testing the methodology on real-world data with known location labels shows that our method identifies home and workplaces with significant accuracy, improving upon the most commonly used methods in the literature by 79% and 34%, respectively. Most importantly, our methodology does not impose any significant computational burden and is easily scalable and extensible to other real-world data with historical tracking.
The equations of a physical constitutive model for material stress within tantalum grains were solved numerically using a tetrahedrally meshed volume. The resulting output included a scalar von Mises stress for each of the more than 94,000 tetrahedra within the finite element discretization. In this paper, we define an intricate statistical model for the spatial field of von Mises stress which uses the given grain geometry in a fundamental way. Our model relates the three-dimensional field to integrals of latent stochastic processes defined on the vertices of the one- and two-dimensional grain boundaries. An intuitive neighborhood structure of these boundary nodes suggested the use of a latent Gaussian Markov random field (GMRF). However, despite the potential for computational gains afforded by GMRFs, the integral nature of our model and the sheer number of data points pose substantial challenges for a full Bayesian analysis. To overcome these problems and encourage efficient exploration of the posterior distribution, we combine a number of techniques: parallel computing, sparse matrix methods, and a modification of a block update strategy within the sampling routine. In addition, we use an auxiliary variables approach to accommodate the presence of outliers in the data.
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs. By utilizing fast matrix block-approximation techniques, we propose an approximate framework for such non-trivial ERGs that results in dyadic independence (i.e., edge-independent) distributions, while being able to meaningfully model both local information of the graph (e.g., degrees) and global information (e.g., clustering coefficient, assortativity) if desired. This allows one to efficiently generate random networks with properties similar to those of an observed network, and the models can be used for several downstream tasks such as link prediction. Our methods are scalable to sparse graphs consisting of millions of nodes. Empirical evaluation demonstrates competitiveness in terms of both speed and accuracy with state-of-the-art methods for link prediction, which are typically based on embedding the graph into some low-dimensional space, showcasing the potential of a more direct and interpretable probabilistic model for this task.
We consider general discrete Markov Random Fields (MRFs) with additional bottleneck potentials, which penalize the maximum (instead of the sum) over the local potential values taken by the MRF assignment. Bottleneck potentials or analogous constructions have been considered in (i) combinatorial optimization (e.g., the bottleneck shortest path problem, the minimum bottleneck spanning tree problem, bottleneck function minimization in greedoids), (ii) inverse problems with $L_{\infty}$-norm regularization, and (iii) valued constraint satisfaction on the $(\min,\max)$-pre-semirings. Bottleneck potentials for general discrete MRFs are a natural generalization of the above lines of modeling work to Maximum-A-Posteriori (MAP) inference in MRFs. To this end, we propose MRFs whose objective consists of two parts: terms that factorize according to (i) $(\min,+)$, i.e., potentials as in plain MRFs, and (ii) $(\min,\max)$, i.e., bottleneck potentials. To solve the ensuing inference problem, we propose high-quality relaxations and efficient algorithms for solving them. We empirically show the efficacy of our approach on large-scale seismic horizon tracking problems.
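The two-part objective described in this abstract can be illustrated on a toy problem. The sketch below builds a three-variable chain with binary labels, adds a sum over ordinary potentials and a maximum over bottleneck potentials, and finds the MAP assignment by exhaustive search. All potential values are made up for illustration, and brute-force enumeration stands in for the relaxations and algorithms the abstract mentions, which are not reproduced here.

```python
from itertools import product

# Toy chain MRF with 3 binary variables. All numbers are hypothetical.
# unary[i][label]: ordinary potentials, combined by summation.
unary = [[0.0, 2.0], [1.0, 0.0], [0.5, 0.5]]
# bottleneck[i][label]: bottleneck potentials, combined by taking the maximum.
bottleneck = [[3.0, 0.5], [0.2, 4.0], [1.0, 1.5]]


def pairwise(a, b):
    """Potts-style pairwise potential between chain neighbors."""
    return 0.0 if a == b else 1.0


def energy(x):
    """Sum part (as in a plain MRF) plus max part (the bottleneck term)."""
    n = len(x)
    sum_part = sum(unary[i][x[i]] for i in range(n))
    sum_part += sum(pairwise(x[i], x[i + 1]) for i in range(n - 1))
    max_part = max(bottleneck[i][x[i]] for i in range(n))
    return sum_part + max_part


# Exhaustive MAP search: feasible only because the example has 2**3 assignments.
best = min(product(range(2), repeat=3), key=energy)
print(best, energy(best))
```

Exhaustive search scales exponentially in the number of variables, which is why dedicated relaxations are needed for problems of realistic size.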
We derive two sufficient conditions for a function of a Markov random field (MRF) on a given graph to be an MRF on the same graph. The first condition is information-theoretic and parallels a recent information-theoretic characterization of lumpability of Markov chains. The second condition, which is easier to check, is based on the potential functions of the corresponding Gibbs field. We illustrate our sufficient conditions with several examples and discuss implications for practical applications of MRFs. As a side result, we give a partial characterization of functions of MRFs that are information-preserving.
