
Oreo: Detection of Clones in the Twilight Zone

Published by: Vaibhav Saini
Publication date: 2018
Research field: Informatics Engineering
Language: English





Source code clones are categorized into four types of increasing difficulty of detection, ranging from purely textual (Type-1) to purely semantic (Type-4). Most clone detectors reported in the literature work well up to Type-3, which accounts for syntactic differences. In between Type-3 and Type-4, however, there lies a spectrum of clones that, although still exhibiting some syntactic similarities, are extremely hard to detect -- the Twilight Zone. Most clone detectors reported in the literature fail to operate in this zone. We present Oreo, a novel approach to source code clone detection that not only detects Type-1 to Type-3 clones accurately, but is also capable of detecting harder-to-detect clones in the Twilight Zone. Oreo is built using a combination of machine learning, information retrieval, and software metrics. We evaluate the recall of Oreo on BigCloneBench, and perform manual evaluation for precision. Oreo has both high recall and precision. More importantly, it pushes the boundary in detection of clones with moderate to weak syntactic similarity in a scalable manner.
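
To make the clone taxonomy above concrete, here is a minimal Python sketch of how the first three types are commonly told apart with tokens. It is not Oreo's method (the abstract only says Oreo combines machine learning, information retrieval, and software metrics). The function names, the tiny keyword set, and the 0.7 threshold are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Assumed, deliberately tiny keyword set for Java-like snippets.
KEYWORDS = {"int", "for", "if", "else", "while", "return"}


def tokenize(code):
    """Drop // line comments, then split into identifiers, numbers, symbols."""
    code = re.sub(r"//[^\n]*", "", code)
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def normalize(tokens):
    """Replace identifiers with ID and numeric literals with NUM."""
    out = []
    for tok in tokens:
        if tok in KEYWORDS:
            out.append(tok)
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            out.append("ID")
        elif tok.isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return out


def overlap(a, b):
    """Shared tokens (as a bag) divided by the length of the longer list."""
    shared = sum((Counter(a) & Counter(b)).values())
    return shared / max(len(a), len(b))


def classify(code_a, code_b, threshold=0.7):
    """Label a pair as Type-1, Type-2, Type-3, or 'not detected'."""
    ta, tb = tokenize(code_a), tokenize(code_b)
    if ta == tb:                      # same apart from layout and comments
        return "Type-1"
    na, nb = normalize(ta), normalize(tb)
    if na == nb:                      # same apart from names and literals
        return "Type-2"
    if overlap(na, nb) >= threshold:  # statements added, removed, or edited
        return "Type-3"
    return "not detected"
```

Pairs that implement the same behavior with little shared syntax fall below any such token threshold. That is the region the abstract calls the Twilight Zone, and a purely token-based check like this one would report them as "not detected".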


Read also

K. Alatalo, L. Lanz (2016)
We investigate the optical and Wide-field Infrared Survey Explorer (WISE) colors of E+A identified post-starburst galaxies, including a deep analysis of 190 post-starbursts detected in the 2 μm All Sky Survey Extended Source Catalog. The post-starburst galaxies appear in both the optical green valley and the WISE Infrared Transition Zone (IRTZ). Furthermore, we find that post-starbursts occupy a distinct region in [3.4]-[4.6] vs. [4.6]-[12] WISE colors, enabling the identification of this class of transitioning galaxies through the use of broad-band photometric criteria alone. We have investigated possible causes for the WISE colors of post-starbursts by constructing a composite spectral energy distribution (SED), finding that mid-infrared (4-12 μm) properties of post-starbursts are consistent with either 11.3 μm polycyclic aromatic hydrocarbon emission, or Thermally Pulsating Asymptotic Giant Branch (TP-AGB) and post-AGB stars. The composite SED of extended post-starburst galaxies with 22 μm emission detected with signal to noise >3 requires a hot dust component to produce their observed rising mid-infrared SED between 12 and 22 μm. The composite SED of WISE 22 μm non-detections (S/N<3), created by stacking 22 μm images, is also flat, requiring a hot dust component. The most likely source of this mid-infrared emission of these E+A galaxies is a buried active galactic nucleus. The inferred upper limits to the Eddington ratios of post-starbursts are 1e-2 to 1e-4, with an average of 1e-3. This suggests that AGNs are not radiatively dominant in these systems. This could mean that including selections able to identify active galactic nuclei as part of a search for transitioning and post-starburst galaxies would create a more complete census of the transition pathways taken as a galaxy quenches its star formation.
Given the availability of large source-code repositories, there has been a large number of applications for large-scale clone detection. Unfortunately, despite a decade of active research, there is a marked lack of clone detectors that scale to big software systems or large repositories, specifically for detecting near-miss (Type-3) clones where significant editing activities may take place in the cloned code. This paper demonstrates: (i) SourcererCC, a token-based clone detector that targets the first three clone types, and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. It uses an optimized inverted index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, as well as the number of token comparisons needed to judge a potential clone, and (ii) SourcererCC-I, an Eclipse plug-in that uses SourcererCC's core engine to identify and navigate clones (both inter- and intra-project) in real time during software development. In our experiments, comparing SourcererCC with the state-of-the-art tools, we found that it is the only clone detection tool to successfully scale to 250 MLOC on a standard workstation with 12 GB RAM and efficiently detect the first three types of clones (precision 86% and recall 86-100%). Link to the demo: https://youtu.be/l7F_9Qp-ks4
The B0.2 V magnetic star tau Sco stands out from the larger population of massive magnetic OB stars due to its high X-ray activity and remarkable wind, apparently related to its peculiar magnetic field - a field which is far more complex than the mostly-dipolar fields usually observed in magnetic OB stars. tau Sco is therefore a puzzling outlier in the larger picture of stellar magnetism - a star that still defies interpretation in terms of a physically coherent model. Recently, two early B-type stars were discovered as tau Sco analogues, identified by the striking similarity of their UV spectra to that of tau Sco, which was - until now - unique among OB stars. We present the recent detection of their magnetic fields by the MiMeS collaboration, reinforcing the connection between the presence of a magnetic field and wind anomalies (Petit et al. 2010). We will also present ongoing observational efforts undertaken to establish the precise magnetic topology, in order to provide additional constraints for existing models attempting to reproduce the unique wind structure of tau Sco-like stars.
The B0.2 V magnetic star tau Sco stands out from the larger population of massive magnetic OB stars due to its remarkable, superionized wind, apparently related to its peculiar magnetic field - a field which is far more complex than the mostly-dipolar fields usually observed in magnetic OB stars. tau Sco is therefore a puzzling outlier in the larger picture of stellar magnetism - a star that still defies interpretation in terms of a physically coherent model. Recently, two early B-type stars were discovered as tau Sco analogues, identified by the striking similarity of their UV spectra to that of tau Sco, which was - until now - unique among OB stars. We present the recent detection of their magnetic fields by the MiMeS collaboration, reinforcing the connection between the presence of a magnetic field and a superionized wind. We will also present ongoing observational efforts undertaken to establish the precise magnetic topology, in order to provide additional constraints for existing models attempting to reproduce the unique wind structure of tau Sco-like stars.
Classic clone detection approaches are hardly capable of finding redundant code that has been developed independently, i.e., is not the result of copy & paste. To automatically detect such functionally similar code of independent origin, we experimented with a dynamic detection approach that applies random testing to selected chunks of code, similar to Jiang & Su's approach. We found that such an approach faces several limitations in its application to diverse Java systems. This paper details our insights regarding these challenges of dynamic detection of functionally similar code fragments. Our findings support a substantiated discussion on detection approaches and serve as a starting point for future research.
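
The SourcererCC abstract in this list mentions an inverted index plus filtering based on token ordering. As a rough illustration of that general idea, the Python sketch below uses textbook prefix filtering: tokens are ordered rarest first, only a short prefix of each block is indexed, and only blocks sharing a prefix token are compared in full. It is a simplified stand-in under stated assumptions, not SourcererCC's actual implementation. The names `find_clones`, `required`, and `prefix`, and the 70% threshold, are hypothetical.

```python
from collections import Counter, defaultdict


def required(n, min_pct):
    """Smallest integer overlap that reaches min_pct percent of n tokens."""
    return -(-min_pct * n // 100)  # integer ceiling, avoids float rounding


def prefix(tokens, freq, min_pct):
    """Rarest-first prefix; any sufficiently similar block shares a token in it."""
    ordered = sorted(tokens, key=lambda t: (freq[t], t))
    keep = len(ordered) - required(len(ordered), min_pct) + 1
    return ordered[:keep]


def find_clones(blocks, min_pct=70):
    """Return sorted id pairs whose bag overlap is at least min_pct percent
    of the longer block. `blocks` maps a block id to a non-empty token list."""
    freq = Counter(t for toks in blocks.values() for t in toks)
    index = defaultdict(set)  # token -> ids of blocks with it in their prefix
    pairs = set()
    for bid, toks in blocks.items():
        pre = set(prefix(toks, freq, min_pct))
        # Only blocks that share a prefix token are candidates.
        candidates = set()
        for tok in pre:
            candidates |= index[tok]
        # Verify each candidate with the full token comparison.
        for other in candidates:
            shared = sum((Counter(toks) & Counter(blocks[other])).values())
            longest = max(len(toks), len(blocks[other]))
            if shared * 100 >= min_pct * longest:
                pairs.add(tuple(sorted((other, bid))))
        for tok in pre:
            index[tok].add(bid)
    return sorted(pairs)
```

Indexing only the prefix keeps the index small and skips most full comparisons. This works because two blocks that meet the threshold must share their globally rarest common token inside both prefixes.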