
Simulated Annealing for JPEG Quantization

Published by: Sebastian Wagner-Carena
Publication date: 2017
Research field: Informatics Engineering
Paper language: English





JPEG is one of the most widely used image formats, but in some ways remains surprisingly unoptimized, perhaps because some natural optimizations would go outside the standard that defines JPEG. We show how to improve JPEG compression in a standard-compliant, backward-compatible manner, by finding improved default quantization tables. We describe a simulated annealing technique that has allowed us to find several quantization tables that perform better than the industry standard, in terms of both compressed size and image fidelity. Specifically, we derive tables that reduce the FSIM error by over 10% while improving compression by over 20% at quality level 95 in our tests; we also provide similar results for other quality levels. While we acknowledge our approach can in some images lead to visible artifacts under large magnification, we believe use of these quantization tables, or additional tables that could be found using our methodology, would significantly reduce JPEG file sizes with improved overall image quality.
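As an illustration of the search procedure described above, the following is a minimal sketch (not the authors' code) of simulated annealing over an 8x8 luminance quantization table. The evaluate function is a hypothetical placeholder: a real run would re-encode a training corpus with the candidate table and combine compressed size with an FSIM-style fidelity error, as the abstract describes.

import math
import random

# Standard JPEG luminance quantization table (Annex K of the JPEG standard),
# used here as the starting point of the search.
JPEG_LUMA = [
    16, 11, 10, 16,  24,  40,  51,  61,
    12, 12, 14, 19,  26,  58,  60,  55,
    14, 13, 16, 24,  40,  57,  69,  56,
    14, 17, 22, 29,  51,  87,  80,  62,
    18, 22, 37, 56,  68, 109, 103,  77,
    24, 35, 55, 64,  81, 104, 113,  92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103,  99,
]

def evaluate(qtable):
    # Hypothetical surrogate cost (smaller is better): trades a compression
    # proxy (larger quantizers shrink files) against a fidelity proxy
    # (distance from the reference table). A real objective would JPEG-encode
    # a corpus with `qtable` and score compressed size plus FSIM error.
    compression = -sum(qtable) / len(qtable)
    fidelity = math.sqrt(sum((q - r) ** 2 for q, r in zip(qtable, JPEG_LUMA)))
    return 0.05 * compression + 0.01 * fidelity

def anneal(start, steps=20000, t0=1.0, cooling=0.9995):
    current, current_cost = list(start), evaluate(start)
    best, best_cost = list(current), current_cost
    temperature = t0
    for _ in range(steps):
        candidate = list(current)
        i = random.randrange(64)  # perturb one table entry at a time
        candidate[i] = min(255, max(1, candidate[i] + random.choice((-2, -1, 1, 2))))
        cost = evaluate(candidate)
        # Always accept improvements; accept worse tables with Boltzmann probability.
        if cost <= current_cost or random.random() < math.exp((current_cost - cost) / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
        temperature *= cooling
    return best, best_cost

best_table, best_cost = anneal(JPEG_LUMA)
print("best surrogate cost:", best_cost)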


Read also

Deep learning for computer vision depends on lossy image compression: it reduces the storage required for training and test data and lowers transfer costs in deployment. Mainstream datasets and imaging pipelines all rely on standard JPEG compression. In JPEG, the degree of quantization of frequency coefficients controls the lossiness: an 8 by 8 quantization table (Q-table) decides both the quality of the encoded image and the compression ratio. While a long history of work has sought better Q-tables, existing work either seeks to minimize image distortion or to optimize for models of the human visual system. This work asks whether JPEG Q-tables exist that are better for specific vision networks and can offer better quality-size trade-offs than ones designed for human perception or minimal distortion. We reconstruct an ImageNet test set at higher resolution to explore the effect of JPEG compression under novel Q-tables. We attempt several approaches to tune a Q-table for a vision task. We find that a simple sorted random sampling method can exceed the performance of the standard JPEG Q-table. We also use hyper-parameter tuning techniques including bounded random search, Bayesian optimization, and composite heuristic optimization methods. The new Q-tables we obtained can improve the compression rate by 10% to 200% at fixed accuracy, or improve accuracy by up to 2% at the same compression rate.
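A minimal sketch of the bounded random search mentioned above, under stated assumptions: task_score is a hypothetical evaluator that would compress a validation set with the candidate Q-table and return the vision model's accuracy together with the average file size; the bounds and sampling scheme here are illustrative, not the paper's.

import random

def bounded_random_search(task_score, trials=100, low=1, high=128, seed=0):
    # Draw candidate 8x8 Q-tables uniformly within [low, high] and keep the
    # best trade-off found. task_score(qtable) is assumed to return
    # (accuracy, mean_compressed_bytes) for a fixed validation set.
    rng = random.Random(seed)
    best_qtable, best_key = None, None
    for _ in range(trials):
        qtable = [rng.randint(low, high) for _ in range(64)]
        accuracy, size = task_score(qtable)
        key = (accuracy, -size)  # prefer higher accuracy, then smaller files
        if best_key is None or key > best_key:
            best_qtable, best_key = qtable, key
    return best_qtable, best_key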
Video-quality measurement plays a critical role in the development of video-processing applications. In this paper, we show how video preprocessing can artificially increase the popular quality metric VMAF and its tuning-resistant version, VMAF NEG. We propose a pipeline that tunes processing-algorithm parameters to increase VMAF by up to 218.8%. A subjective comparison revealed that for most preprocessing methods, a video's visual quality drops or stays unchanged. We also show that some preprocessing methods can increase VMAF NEG scores by up to 23.6%.
We propose a new stochastic algorithm (generalized simulated annealing) for computationally finding the global minimum of a given (not necessarily convex) energy/cost function defined in a continuous D-dimensional space. This algorithm recovers, as particular cases, the so-called classical (Boltzmann machine) and fast (Cauchy machine) simulated annealings, and can be quicker than both. Keywords: simulated annealing; nonconvex optimization; gradient descent; generalized statistical mechanics.
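For context, here is a sketch of the acceptance rule commonly associated with generalized (Tsallis-style) simulated annealing, which recovers the classical Boltzmann rule in the limit q_a -> 1. This is an assumption-laden illustration rather than the authors' algorithm; the paper's full method also specifies a generalized visiting distribution and cooling schedule that are not reproduced here.

import math

def acceptance_probability(delta_e, temperature, q_a=1.0):
    # Probability of accepting an uphill move of size delta_e > 0.
    # q_a = 1 recovers the classical Boltzmann rule exp(-delta_e / T);
    # q_a > 1 gives a heavier-tailed, more permissive rule.
    if delta_e <= 0:
        return 1.0
    if abs(q_a - 1.0) < 1e-12:
        return math.exp(-delta_e / temperature)
    base = 1.0 + (q_a - 1.0) * delta_e / temperature
    return base ** (-1.0 / (q_a - 1.0)) if base > 0 else 0.0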
Although significant progress in automatic learning of steganographic cost has been achieved recently, existing methods designed for spatial images are not well suited to JPEG images, which are the more common media in daily life. The difficulty of migration mostly lies in the unique and complicated JPEG characteristics caused by the 8x8 DCT mode structure. To address this issue, in this paper we extend an existing automatic cost learning scheme to JPEG; the proposed scheme, called JEC-RL (JPEG Embedding Cost with Reinforcement Learning), is explicitly tailored to the JPEG DCT structure. It works with an embedding-action sampling mechanism under reinforcement learning, where a policy network learns the optimal embedding policies by maximizing the rewards provided by an environment network. The policy network follows a domain-transition design paradigm comprising three modules: pixel-level texture complexity evaluation, DCT feature extraction, and mode-wise rearrangement. These modules operate in series, gradually extracting useful features from a decompressed JPEG image and converting them into embedding policies for DCT elements, while simultaneously accounting for JPEG characteristics including inter-block and intra-block correlations. The environment network is designed in a gradient-oriented way to provide stable reward values, using a wide architecture equipped with a fixed preprocessing layer of 8x8 DCT basis filters. Extensive experiments and ablation studies demonstrate that the proposed method achieves good security performance for JPEG images against both advanced feature-based and modern CNN-based steganalyzers.
Detection of inconsistencies of double JPEG artefacts across different image regions is often used to detect local image manipulations, like image splicing, and to localize them. In this paper, we move one step further, proposing an end-to-end system that, in addition to detecting and localizing spliced regions, can also distinguish regions coming from different donor images. We assume that both the spliced regions and the background image have undergone a double JPEG compression, and use a local estimate of the primary quantization matrix to distinguish between spliced regions taken from different sources. To do so, we cluster the image blocks according to the estimated primary quantization matrix and refine the result by means of morphological reconstruction. The proposed method can work in a wide variety of settings including aligned and non-aligned double JPEG compression, and regardless of whether the second compression is stronger or weaker than the first one. We validated the proposed approach by means of extensive experiments showing its superior performance with respect to baseline methods working in similar conditions.

