A SURVEY STUDY ON INFORMATION EXTRACTION FROM TEXT

Arabic title: دراسة استقصائية لطرق استخلاص المعلومات من نص (A survey of methods for extracting information from text)

Added by Al-Baath University, research paper

Publication date 2017

Research language: Arabic

Authors: مها وهبي (researcher), حبيب علي (researcher), حسن أبو النور (researcher)

Abstract

Information extraction is the task of finding structured information from unstructured or semi-structured text. It is an important task in text mining and has been extensively studied in various research communities including natural language processing, information retrieval and Web mining. It has a wide range of applications in domains such as biomedical literature mining and business intelligence. Two fundamental tasks of information extraction are named entity recognition and relation extraction. The former refers to finding names of entities such as people, organizations and locations. The latter refers to finding the semantic relations between entities.
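To make the two tasks concrete, here is a minimal sketch of named entity recognition followed by relation extraction. It is not the method of the surveyed paper: the gazetteer, the trigger phrases, the relation names, and the example sentence are all hypothetical, and real systems learn these decisions from data rather than looking them up.

```python
import re

# Hypothetical toy gazetteer: surface form -> entity type.
GAZETTEER = {
    "Marie Curie": "PERSON",
    "Sorbonne": "ORGANIZATION",
    "Paris": "LOCATION",
}

# Hypothetical trigger phrases: (left type, text between, right type) -> relation.
PATTERNS = {
    ("PERSON", "taught at the", "ORGANIZATION"): "employed_by",
    ("ORGANIZATION", "in", "LOCATION"): "located_in",
}


def recognize_entities(text):
    """Named entity recognition: return (start, end, surface, type) in text order."""
    found = []
    for name, etype in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append((match.start(), match.end(), name, etype))
    return sorted(found)


def extract_relations(text):
    """Relation extraction: match the text between adjacent entities to a pattern."""
    entities = recognize_entities(text)
    relations = []
    for left, right in zip(entities, entities[1:]):
        between = text[left[1]:right[0]].strip()
        relation = PATTERNS.get((left[3], between, right[3]))
        if relation is not None:
            relations.append((left[2], relation, right[2]))
    return relations


text = "Marie Curie taught at the Sorbonne in Paris."
print(extract_relations(text))
# [('Marie Curie', 'employed_by', 'Sorbonne'), ('Sorbonne', 'located_in', 'Paris')]
```

The first function produces the entities (the "former" task in the abstract) and the second links adjacent entities (the "latter" task), which is why relation extraction is usually run on the output of entity recognition.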

Research summary

This survey covers methods for extracting information from unstructured or semi-structured text, a core task in text mining and natural language processing. It focuses on two main tasks: recognizing named entities and extracting the semantic relations between those entities. Several techniques are used for these tasks, including hidden Markov models and conditional random fields. The survey also reviews applications of information extraction in domains such as biomedicine and business intelligence. Its methodology relies on follow-up studies to track the latest techniques and algorithms in the field. It also discusses the challenges of unsupervised information extraction and of open information extraction from large corpora such as the Web.
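As an illustration of how a hidden Markov model can be applied to named entity recognition, the sketch below decodes the most likely tag sequence with the Viterbi algorithm. The tag set, the vocabulary, and every probability are hand-set, hypothetical values chosen for the example; in practice they would be estimated from annotated text, and the function assumes a non-empty token list.

```python
import math

STATES = ["O", "PER", "LOC"]  # hypothetical tag set: outside, person, location

# Hypothetical hand-set parameters; a real model estimates these from data.
START = {"O": 0.6, "PER": 0.3, "LOC": 0.1}
TRANS = {
    "O": {"O": 0.7, "PER": 0.2, "LOC": 0.1},
    "PER": {"O": 0.6, "PER": 0.3, "LOC": 0.1},
    "LOC": {"O": 0.8, "PER": 0.1, "LOC": 0.1},
}
EMIT = {
    "O": {"visited": 0.5, "in": 0.4},
    "PER": {"alice": 0.9},
    "LOC": {"cairo": 0.9},
}
UNSEEN = 1e-6  # small probability for words a state has never emitted


def viterbi(tokens):
    """Return the most probable tag sequence for a non-empty token list."""

    def emit(state, token):
        return math.log(EMIT[state].get(token.lower(), UNSEEN))

    # Each column maps a state to (best log-probability, previous state).
    columns = [{s: (math.log(START[s]) + emit(s, tokens[0]), None) for s in STATES}]
    for token in tokens[1:]:
        prev = columns[-1]
        column = {}
        for s in STATES:
            score, back = max((prev[p][0] + math.log(TRANS[p][s]), p) for p in STATES)
            column[s] = (score + emit(s, token), back)
        columns.append(column)

    # Follow the back-pointers from the best final state.
    state = max(STATES, key=lambda s: columns[-1][s][0])
    path = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


print(viterbi(["Alice", "visited", "Cairo"]))  # ['PER', 'O', 'LOC']
```

Working in log space avoids numerical underflow on long sentences. A conditional random field uses the same kind of dynamic-programming decoder but replaces the per-state probabilities with learned feature weights.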

Critical review

This survey is comprehensive and detailed in its treatment of information extraction from text, but it may be somewhat complex for the non-specialist reader. Practical examples and real-world applications would help illustrate the benefits of these techniques. The survey could also be improved by comparing the different algorithms and techniques and setting out the advantages and drawbacks of each. Attention to practical applications in fields other than biomedicine and business intelligence would add further value.

Questions related to the research

What are the two main tasks in information extraction from text?

The two main tasks are named entity recognition and the extraction of semantic relations between those entities.

What techniques are used in information extraction from text?

The techniques include hidden Markov models and conditional random fields.

What practical applications of information extraction does the study mention?

The applications include biomedical literature mining and business intelligence.

What challenges are associated with unsupervised information extraction?

The challenges include defining the structures of the information to be extracted and annotating documents according to the defined structures, both of which require human expertise and are time-consuming.

Keywords

information extraction, entity recognition, semantic relation extraction, natural language processing, hidden Markov models, conditional random fields

References used

Douglas E. Appelt, Jerry R. Hobbs, John Bear, David Israel, and Mabry Tyson. FASTUS: A finite-state processor for information extraction from real-world text. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, 1993.

Mary Elaine Califf and Raymond J. Mooney. Relational learning of pattern-match rules for information extraction. In Proceedings of the 16th National Conference on Artificial Intelligence and the 11th Innovative Applications of Artificial Intelligence Conference, pages 328–334, 1999.

Tao Cheng, Xifeng Yan, and Kevin Chen-Chuan Chang. Supporting entity search: a large-scale prototype search engine. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 1144–1146, 2007.

Improvement learning rules for Relations Extraction from text

1705 - Al-Baath University 2018, research paper

Relation extraction systems have made extensive use of features generated by linguistic analysis modules. Errors in these features lead to errors in relation detection and classification. In this work, we depart from these traditional approaches with complicated feature engineering by introducing a convolutional neural network for relation extraction that automatically learns features from sentences and minimizes the dependence on external toolkits and resources. Our model takes advantage of multiple window sizes for filters and pre-trained word embeddings as an initializer on a non-static architecture to improve performance.

Keywords: relation extraction, feature engineering, word embeddings, convolutional neural network

Zero-Shot Information Extraction as a Unified Text-to-Triple Translation

1431 - Association for Computational Linguistics 2021, article

We cast a suite of information extraction tasks into a text-to-triple translation framework. Instead of solving each task relying on task-specific datasets and models, we formalize the task as a translation between task-specific input text and output triples. By taking the task-specific input, we enable a task-agnostic translation by leveraging the latent knowledge that a pre-trained language model has about the task. We further demonstrate that a simple pre-training task of predicting which relational information corresponds to which input text is an effective way to produce task-specific outputs. This enables the zero-shot transfer of our framework to downstream tasks. We study the zero-shot performance of this framework on open information extraction (OIE2016, NYT, WEB, PENN), relation classification (FewRel and TACRED), and factual probe (Google-RE and T-REx). The model transfers non-trivially to most tasks and is often competitive with a fully supervised method without the need for any task-specific training. For instance, we significantly outperform the F1 score of the supervised open information extraction without needing to use its training set.

Keywords: conditional graph modification, open information extraction, unified

Challenges for Information Extraction from Dialogue in Criminal Law

677 - Association for Computational Linguistics 2021, article

Information extraction and question answering have the potential to introduce a new paradigm for how machine learning is applied to criminal law. Existing approaches generally use tabular data for predictive metrics. An alternative approach is needed for matters of equitable justice, where individuals are judged on a case-by-case basis, in a process involving verbal or written discussion and interpretation of case factors. Such discussions are individualized, but they nonetheless rely on underlying facts. Information extraction can play an important role in surfacing these facts, which are still important to understand. We analyze unsupervised, weakly supervised, and pre-trained models' ability to extract such factual information from the free-form dialogue of California parole hearings. With a few exceptions, most F1 scores are below 0.85. We use this opportunity to highlight some opportunities for further research for information extraction and question answering. We encourage new developments in NLP to enable analysis and review of legal cases to be done in a post-hoc, not predictive, manner.

Keywords: criminal law

Learning from Noisy Labels for Entity-Centric Information Extraction

853 - Association for Computational Linguistics 2021, article

Recent information extraction approaches have relied on training deep neural models. However, such models can easily overfit noisy labels and suffer from performance degradation. While it is very costly to filter noisy labels in large learning resources, recent studies show that such labels take more training steps to be memorized and are more frequently forgotten than clean labels, and are therefore identifiable in training. Motivated by such properties, we propose a simple co-regularization framework for entity-centric information extraction, which consists of several neural models with identical structures but different parameter initialization. These models are jointly optimized with the task-specific losses and are regularized to generate similar predictions based on an agreement loss, which prevents overfitting on noisy labels. Extensive experiments on two widely used but noisy benchmarks for information extraction, TACRED and CoNLL03, demonstrate the effectiveness of our framework. We release our code to the community for future research.

Keywords: event temporal logic, entity-centric information extraction, noisy labels
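The co-regularization idea in the abstract above can be sketched as a joint loss over the predictions of several models. The abstract does not give the exact form of the agreement loss, so this sketch assumes one plausible choice: the average KL divergence from the mean prediction to each model's prediction. The function names and the weight `alpha` are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0)


def mean_distribution(predictions):
    """Average the class distributions predicted by all models for one example."""
    n_models = len(predictions)
    n_classes = len(predictions[0])
    return [sum(p[c] for p in predictions) / n_models for c in range(n_classes)]


def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions over the same classes."""
    return sum(qc * math.log((qc + EPS) / (pc + EPS)) for qc, pc in zip(q, p) if qc > 0)


def agreement_loss(predictions):
    """Assumed form: mean KL from the average prediction to each model's prediction."""
    q = mean_distribution(predictions)
    return sum(kl_divergence(q, p) for p in predictions) / len(predictions)


def joint_loss(predictions, gold_class, alpha=1.0):
    """Average task loss (cross-entropy) plus the weighted agreement term."""
    task = sum(-math.log(p[gold_class] + EPS) for p in predictions) / len(predictions)
    return task + alpha * agreement_loss(predictions)


agreeing = [[0.9, 0.1], [0.9, 0.1]]
disagreeing = [[0.9, 0.1], [0.1, 0.9]]
print(agreement_loss(agreeing), agreement_loss(disagreeing))
```

When the models agree the extra term vanishes, and when they disagree it grows, so an example whose label only some models have memorized is penalized more heavily than one all models predict alike.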

A Selective Survey on Key Distribution in Sensor Networks

1740 - Damascus University 2011, research paper

Key management in Wireless Sensor Networks (WSNs) is an important issue due to the absence of trusted infrastructures, on one hand, and the limited resources of sensor nodes, on the other hand. This paper surveys some recent key management approaches in WSNs. It first identifies some of the problems that confront key management. Then, it defines some criteria for viable solutions to key management problems. Next, it explores some of the proposed key management approaches and analyzes them according to the presented criteria. Finally, some open research issues are discussed.

Keywords: sensor networks, key management, security of wireless sensor networks
