ترغب بنشر مسار تعليمي؟ اضغط هنا

Expectation-Maximization Binary Clustering for Behavioural Annotation

99   0   0.0 ( 0 )
 نشر من قبل Frederic Bartumeus
 تاريخ النشر 2015
  مجال البحث علم الأحياء
والبحث باللغة English




اسأل ChatGPT حول البحث

We present a variant of the well sounded Expectation-Maximization Clustering algorithm that is constrained to generate partitions of the input space into high and low values. The motivation of splitting input variables into high and low values is to favour the semantic interpretation of the final clustering. The Expectation-Maximization binary Clustering is specially useful when a bimodal conditional distribution of the variables is expected or at least when a binary discretization of the input space is deemed meaningful. Furthermore, the algorithm deals with the reliability of the input data such that the larger their uncertainty the less their role in the final clustering. We show here its suitability for behavioural annotation of movement trajectories. However, it can be considered as a general purpose algorithm for the clustering or segmentation of multivariate data or temporal series.

قيم البحث

اقرأ أيضاً

Many real world tasks such as reasoning and physical interaction require identification and manipulation of conceptual entities. A first step towards solving these tasks is the automated discovery of distributed symbol-like representations. In this p aper, we explicitly formalize this problem as inference in a spatial mixture model where each component is parametrized by a neural network. Based on the Expectation Maximization framework we then derive a differentiable clustering method that simultaneously learns how to group and represent individual entities. We evaluate our method on the (sequential) perceptual grouping task and find that it is able to accurately recover the constituent objects. We demonstrate that the learned representations are useful for next-step prediction.
Any clustering algorithm must synchronously learn to model the clusters and allocate data to those clusters in the absence of labels. Mixture model-based methods model clusters with pre-defined statistical distributions and allocate data to those clu sters based on the cluster likelihoods. They iteratively refine those distribution parameters and member assignments following the Expectation-Maximization (EM) algorithm. However, the cluster representability of such hand-designed distributions that employ a limited amount of parameters is not adequate for most real-world clustering tasks. In this paper, we realize mixture model-based clustering with a neural network where the final layer neurons, with the aid of an additional transformation, approximate cluster distribution outputs. The network parameters pose as the parameters of those distributions. The result is an elegant, much-generalized representation of clusters than a restricted mixture of hand-designed distributions. We train the network end-to-end via batch-wise EM iterations where the forward pass acts as the E-step and the backward pass acts as the M-step. In image clustering, the mixture-based EM objective can be used as the clustering objective along with existing representation learning methods. In particular, we show that when mixture-EM optimization is fused with consistency optimization, it improves the sole consistency optimization performance in clustering. Our trained networks outperform single-stage deep clustering methods that still depend on k-means, with unsupervised classification accuracy of 63.8% in STL10, 58% in CIFAR10, 25.9% in CIFAR100, and 98.9% in MNIST.
69 - Qiuyun Zou , Haochuan Zhang , 2021
The reconstruction of sparse signal is an active area of research. Different from a typical i.i.d. assumption, this paper considers a non-independent prior of group structure. For this more practical setup, we propose EM-aided HyGEC, a new algorithm to address the stability issue and the hyper-parameter issue facing the other algorithms. The instability problem results from the ill condition of the transform matrix, while the unavailability of the hyper-parameters is a ground truth that their values are not known beforehand. The proposed algorithm is built on the paradigm of HyGAMP (proposed by Rangan et al.) but we replace its inner engine, the GAMP, by a matrix-insensitive alternative, the GEC, so that the first issue is solved. For the second issue, we take expectation-maximization as an outer loop, and together with the inner engine HyGEC, we learn the value of the hyper-parameters. Effectiveness of the proposed algorithm is also verified by means of numerical simulations.
We aimed to explore the utility of the recently developed open-source mobile health platform RADAR-base as a toolbox to rapidly test the effect and response to NPIs aimed at limiting the spread of COVID-19. We analysed data extracted from smartphone and wearable devices and managed by the RADAR-base from 1062 participants recruited in Italy, Spain, Denmark, the UK, and the Netherlands. We derived nine features on a daily basis including time spent at home, maximum distance travelled from home, maximum number of Bluetooth-enabled nearby devices (as a proxy for physical distancing), step count, average heart rate, sleep duration, bedtime, phone unlock duration, and social app use duration. We performed Kruskal-Wallis tests followed by post-hoc Dunns tests to assess differences in these features among baseline, pre-, and during-lockdown periods. We also studied behavioural differences by age, gender, body mass index (BMI), and educational background. We were able to quantify expected changes in time spent at home, distance travelled, and the number of nearby Bluetooth-enabled devices between pre- and during-lockdown periods. We saw reduced sociality as measured through mobility features, and increased virtual sociality through phone usage. People were more active on their phones, spending more time using social media apps, particularly around major news events. Furthermore, participants had lower heart rate, went to bed later, and slept more. We also found that young people had longer homestay than older people during lockdown and fewer daily steps. Although there was no significant difference between the high and low BMI groups in time spent at home, the low BMI group walked more. RADAR-base can be used to rapidly quantify and provide a holistic view of behavioural changes in response to public health interventions as a result of infectious outbreaks such as COVID-19.
Untargeted metabolomic studies are revealing large numbers of naturally occurring metabolites that cannot be characterized because their chemical structures and MS/MS spectra are not available in databases. Here we present iMet, a computational tool based on experimental tandem mass spectrometry that could potentially allow the annotation of metabolites not discovered previously. iMet uses MS/MS spectra to identify metabolites structurally similar to an unknown metabolite, and gives a net atomic addition or removal that converts the known metabolite into the unknown one. We validate the algorithm with 148 metabolites, and show that for 89% of them at least one of the top four matches identified by iMet enables the proper annotation of the unknown metabolite. iMet is freely available at http://imet.seeslab.net.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا