ﻻ يوجد ملخص باللغة العربية
We predict restaurant ratings from Yelp reviews based on Yelp Open Dataset. Data distribution is presented, and one balanced training dataset is built. Two vectorizers are experimented for feature engineering. Four machine learning models including Naive Bayes, Logistic Regression, Random Forest, and Linear Support Vector Machine are implemented. Four transformer-based models containing BERT, DistilBERT, RoBERTa, and XLNet are also applied. Accuracy, weighted F1 score, and confusion matrix are used for model evaluation. XLNet achieves 70% accuracy for 5-star classification compared with Logistic Regression with 64% accuracy.
We use over 350,000 Yelp reviews on 5,000 restaurants to perform an ablation study on text preprocessing techniques. We also compare the effectiveness of several machine learning and deep learning models on predicting user sentiment (negative, neutra
The Bangla language is the seventh most spoken language, with 265 million native and non-native speakers worldwide. However, English is the predominant language for online resources and technical knowledge, journals, and documentation. Consequently,
The proliferation of fake news and its propagation on social media has become a major concern due to its ability to create devastating impacts. Different machine learning approaches have been suggested to detect fake news. However, most of those focu
Authorship identification is a process in which the author of a text is identified. Most known literary texts can easily be attributed to a certain author because they are, for example, signed. Yet sometimes we find unfinished pieces of work or a who
Showing items that do not match search query intent degrades customer experience in e-commerce. These mismatches result from counterfactual biases of the ranking algorithms toward noisy behavioral signals such as clicks and purchases in the search lo