This paper presents an AI commentator that provides real-time game commentary for a fighting game. The commentary takes highlight cues, obtained by analyzing scenes during gameplay, as input to adjust the pitch and loudness of the commentary spoken by a Text-to-Speech (TTS) engine. We investigate different designs for pitch and loudness adjustment. The proposed AI consists of two parts: a dynamic adjuster that controls the pitch and loudness of the TTS output, and a real-time game commentary generator. We conduct a pilot study on a fighting game, and our results show that substantially adjusting the loudness according to the level of game highlight enhances the entertainment of the gameplay.
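As an illustration of the dynamic adjuster, here is a minimal Python sketch that maps a highlight level in [0, 1] to TTS prosody parameters; the linear mapping, the parameter ranges, and the adjust_prosody interface are hypothetical assumptions, not the paper's actual design.

# Minimal sketch: map a highlight level in [0, 1] to TTS pitch/loudness.
# The ranges and the linear mapping are illustrative assumptions only.

def adjust_prosody(highlight_level: float,
                   base_pitch: float = 1.0,
                   base_volume: float = 0.6) -> dict:
    """Scale pitch and loudness with the current highlight level."""
    level = max(0.0, min(1.0, highlight_level))  # clamp to [0, 1]
    return {
        "pitch": base_pitch * (1.0 + 0.3 * level),      # up to +30% pitch
        "volume": min(1.0, base_volume + 0.4 * level),  # up to +0.4 gain
    }

# Example: a near-KO moment detected with highlight level 0.9
print(adjust_prosody(0.9))  # approx. {'pitch': 1.27, 'volume': 0.96}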
This paper proposes a method for generating bullet comments for live-streamed games based on highlights (i.e., the exciting parts of video clips) extracted from the game content, and evaluates its effect on mental health promotion. Game live streaming is becoming a popular theme for academic research. Compared to traditional online video sharing platforms, such as YouTube and Vimeo, live streaming platforms let viewers communicate with one another in real time. In sports broadcasting, the commentator plays an essential role as a mood maker, making matches more exciting. The enjoyment that emerges while watching game live streams also benefits the audience's mental health. However, many e-sports live streaming channels do not have a commentator to entertain viewers. Therefore, this paper presents the design of an AI commentator that can be embedded in live-streamed games. To generate bullet comments for real-time game live streaming, the system employs highlight evaluation to detect highlights and then generates the corresponding bullet comments. An experiment is conducted to evaluate the effectiveness of the generated bullet comments in a live-streaming fighting game channel.
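A minimal sketch of the detect-then-comment loop described above, assuming a toy highlight score and hypothetical comment templates; the paper's actual highlight evaluator and comment generator are not specified here.

import random

# Hypothetical comment templates keyed by highlight intensity.
TEMPLATES = {
    "high": ["What a combo!!!", "Unbelievable comeback!"],
    "low": ["Nice poke.", "Solid spacing."],
}

def evaluate_highlight(frame_events: dict) -> float:
    """Toy highlight score from in-game events (illustrative weights)."""
    return min(1.0, 0.5 * frame_events.get("damage_dealt", 0) / 100
               + 0.5 * frame_events.get("combo_hits", 0) / 10)

def generate_bullet_comment(frame_events: dict, threshold: float = 0.6):
    score = evaluate_highlight(frame_events)
    if score >= threshold:
        return random.choice(TEMPLATES["high"])
    if score > 0.2:
        return random.choice(TEMPLATES["low"])
    return None  # stay silent on uneventful frames

print(generate_bullet_comment({"damage_dealt": 95, "combo_hits": 8}))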
Recently, there has been considerable research on synthesizing and editing the motion of a single avatar in a virtual environment. However, there has been much less work on simulating continuous interactions between multiple avatars, such as fighting. In this paper, we propose a new method to generate realistic fighting scenes from motion capture data. We introduce a new algorithm called the temporal expansion approach, which maps a continuous-time action plan to a discrete causality space so that turn-based evaluation methods can be used. As a result, many mature algorithms from strategy games, such as the Minimax algorithm and $\alpha$-$\beta$ pruning, become applicable. We also propose a method to generate and use an offense/defense table, which captures the spatio-temporal relationship between attacks and dodges, to incorporate defensive tactical maneuvers into the scene. Using our method, avatars plan their strategies while taking the opponent's reactions into account. Fighting scenes with multiple avatars are generated to demonstrate the effectiveness of our algorithm. The proposed method can also be applied to other kinds of continuous activities that require strategic planning, such as sports.
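Since the temporal expansion reduces the continuous interaction to turn-based evaluation, standard game-tree search applies directly. Below is a generic minimax with alpha-beta pruning sketch in Python; the state interface (actions/apply/evaluate/is_terminal) is an assumed abstraction, and the paper's evaluation would instead draw on its offense/defense table.

import math

# Generic minimax with alpha-beta pruning over a discretized action space.

def alphabeta(state, depth, alpha=-math.inf, beta=math.inf, maximizing=True):
    if depth == 0 or state.is_terminal():
        return state.evaluate()  # e.g. damage dealt minus damage taken
    if maximizing:
        value = -math.inf
        for action in state.actions():
            value = max(value, alphabeta(state.apply(action), depth - 1,
                                         alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: the opponent will avoid this branch
        return value
    value = math.inf
    for action in state.actions():
        value = min(value, alphabeta(state.apply(action), depth - 1,
                                     alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break  # alpha cutoff
    return value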
While many games have been designed for steganography and robust watermarking, few have focused on reversible watermarking. We present a two-encoder game related to the rate-distortion optimization of content-adaptive reversible watermarking. In the game, Alice first hides a payload in a cover. Then, Bob hides another payload in the modified cover. Alice's embedding strategy affects Bob's embedding capacity, while Bob's embedding strategy may cause data-extraction errors for Alice. Both want to embed as many pure secret bits as possible, subject to an upper bound on distortion. We investigate both the non-cooperative game and the cooperative game between Alice and Bob. When they cooperate, one may treat them as a whole, i.e., a single encoder that uses a cover for data embedding twice. When they do not cooperate, the game corresponds to a separable system, i.e., each independently hides a payload in the cover, but recovering the cover may require cooperation. We find equilibrium strategies for both players under the given constraints.
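To make the non-cooperative setting concrete, here is a brute-force search for pure-strategy Nash equilibria in a small bimatrix game; the payoff numbers are invented for illustration and are not taken from the paper.

# Brute-force pure Nash equilibria of a small bimatrix game.
# Rows = Alice's embedding strategies, columns = Bob's; entries are
# (Alice payoff, Bob payoff), e.g. pure embedded bits net of penalties.
PAYOFFS = [
    [(3, 2), (1, 1)],
    [(2, 3), (0, 0)],
]

def pure_nash(payoffs):
    rows, cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            a, b = payoffs[i][j]
            # Alice cannot gain by deviating to another row...
            row_best = all(payoffs[k][j][0] <= a for k in range(rows))
            # ...and Bob cannot gain by deviating to another column.
            col_best = all(payoffs[i][l][1] <= b for l in range(cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

print(pure_nash(PAYOFFS))  # [(0, 0)]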
Automatic emotion recognition (AER) based on enriched multimodal inputs, including text, speech, and visual cues, is crucial to the development of emotionally intelligent machines. Although complex modality relationships have been proven effective for AER, they remain largely underexplored because previous works predominantly relied on fusion mechanisms that simply concatenate features to learn multimodal representations for emotion classification. This paper proposes a novel hierarchical fusion graph convolutional network (HFGCN) that learns more informative multimodal representations by considering modality dependencies during feature fusion. Specifically, the proposed model fuses the multimodal inputs using a two-stage graph construction approach and encodes the modality dependencies into the conversation representation. We verified the interpretability of the proposed method by projecting the emotional states onto a 2D valence-arousal (VA) subspace. Extensive experiments showed the effectiveness of the proposed model for more accurate AER, yielding state-of-the-art results on two public datasets, IEMOCAP and MELD.
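A minimal NumPy sketch of a single symmetric-normalized graph convolution over fused modality nodes; the three-node graph, feature sizes, and random weights are simplifying assumptions and do not reproduce the HFGCN's two-stage construction.

import numpy as np

# One graph convolution over modality nodes. Nodes 0-2 hold text,
# speech, and visual features of an utterance (assumed 4-dim here);
# edges encode modality dependencies.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)    # fully connected modalities
A_hat = A + np.eye(3)                     # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # D^-1/2 (A+I) D^-1/2

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))               # per-modality features
W = rng.normal(size=(4, 4))               # weights (random, untrained)

H = np.maximum(A_norm @ X @ W, 0)         # ReLU(A_norm X W)
fused = H.mean(axis=0)                    # pooled conversation-level vector
print(fused.shape)                        # (4,)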
With astonishing speed, bandwidth, and scale, Mobile Edge Computing (MEC) plays an increasingly important role in the next generation of connectivity and service delivery. Yet, along with the massive deployment of MEC servers, the ensuing energy consumption has become an increasingly urgent issue. In this context, large-scale deployment of renewable-energy-supplied MEC servers is perhaps the most promising solution. Nonetheless, owing to the intermittent nature of their power sources, these specially designed MEC servers must manage their energy usage more cautiously in order to maintain both service sustainability and service quality. Targeting a single-server MEC scenario, we propose NAFA, an adaptive processor frequency adjustment solution that enables effective planning of the server's energy usage. By learning from historical data on request arrivals and energy-harvesting patterns, the deep reinforcement learning-based solution intelligently schedules the server's processor frequency so as to strike a good balance between service sustainability and service quality. The superior performance of NAFA is substantiated by experiments on real data, in which NAFA achieves up to a 20% increase in average request acceptance ratio and up to a 50% reduction in average request processing time.
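A tabular Q-learning sketch of the frequency-scheduling idea; NAFA itself is deep-RL-based, and the discretized battery state, toy reward, and transition dynamics below are illustrative assumptions only.

import random

FREQS = [0.5, 1.0, 1.5, 2.0]        # candidate processor frequencies (GHz)
BATTERY_LEVELS = 5                  # discretized harvested-energy states
Q = [[0.0] * len(FREQS) for _ in range(BATTERY_LEVELS)]

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount, exploration

def step(battery, f_idx):
    """Toy dynamics: a faster CPU serves requests better but drains more."""
    freq = FREQS[f_idx]
    reward = freq - 0.5 * freq ** 2 / (battery + 1)  # service minus drain
    nxt = max(0, min(BATTERY_LEVELS - 1,
                     battery + random.choice([-1, 0, 1])))
    return reward, nxt

battery = BATTERY_LEVELS - 1
for _ in range(10_000):
    if random.random() < EPS:
        f_idx = random.randrange(len(FREQS))                         # explore
    else:
        f_idx = max(range(len(FREQS)), key=lambda a: Q[battery][a])  # exploit
    reward, nxt = step(battery, f_idx)
    Q[battery][f_idx] += ALPHA * (reward + GAMMA * max(Q[nxt])
                                  - Q[battery][f_idx])
    battery = nxt

# Greedy frequency choice per battery state after training: lower
# frequencies when energy is scarce, higher when it is plentiful.
print([FREQS[max(range(len(FREQS)), key=lambda a: Q[s][a])]
       for s in range(BATTERY_LEVELS)])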