Cross-site scripting (XSS) is a major concern in network security, and many XSS detection models have been studied. However, adversarial XSS samples reduce the detection accuracy of these models. This paper therefore proposes a reinforcement learning-based XSS adversarial attack model that generates adversarial XSS samples and consists of a detection module and an adversarial attack module. In the detection stage, the detection module cleans the original XSS script data, vectorizes it with Word2Vec, and feeds the processed data to its XSS detection model, an ensemble that combines LSTM, MLP, and SVM to achieve higher accuracy. The detection module then outputs the classification result for the original XSS data in preparation for the escape stage. In the escape stage, the adversarial attack module, built on the reinforcement learning algorithm TD3, uses an adversarial generation module to produce legitimate adversarial samples that can bypass the XSS detection model. Experimental results show that adversarial XSS samples generated with TD3 achieve an evasion rate nearly 6% higher than those generated with Soft-Q learning. The model offers a new approach to improving the accuracy of XSS detection models and supplies more valuable XSS attack samples.
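The escape stage and the evasion-rate metric can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_detector` is a naive signature match standing in for the LSTM/MLP/SVM ensemble, the two mutation actions are hypothetical examples of validity-preserving rewrites, and a random policy replaces the TD3 agent, so the numbers it produces say nothing about the paper's results.

```python
import random
import re


def toy_detector(payload):
    # Stand-in only (assumption): a case-sensitive signature match,
    # not the ensemble detector described in the abstract.
    return "<script" in payload or "onerror=" in payload


def alternate(word):
    # "script" -> "sCrIpT"
    return "".join(c.upper() if i % 2 else c.lower() for i, c in enumerate(word))


def mix_tag_case(payload):
    # Hypothetical action: HTML tag names are case-insensitive,
    # so changing their case keeps the payload functional.
    return re.sub(r"(</?)([A-Za-z]+)", lambda m: m.group(1) + alternate(m.group(2)), payload)


def mix_handler_case(payload):
    # Hypothetical action: event-handler attribute names are case-insensitive too.
    return re.sub(r"\bon[A-Za-z]+(?=\s*=)", lambda m: alternate(m.group(0)), payload)


ACTIONS = [mix_tag_case, mix_handler_case]


def evade(payload, detector, rng, max_steps=10):
    # Random policy (assumption): a trained TD3 agent would choose the action instead.
    for _ in range(max_steps):
        if not detector(payload):
            return payload
        payload = rng.choice(ACTIONS)(payload)
    return None if detector(payload) else payload


def evasion_rate(payloads, detector, rng):
    # Fraction of payloads for which some mutated variant escapes detection.
    evaded = sum(evade(p, detector, rng) is not None for p in payloads)
    return evaded / len(payloads)


samples = ["<script>alert(1)</script>", "<img src=x onerror=alert(1)>"]
rate = evasion_rate(samples, toy_detector, random.Random(0))
```

Swapping the random choice for a learned policy, and the toy detector for a trained classifier, would leave the loop and the metric unchanged.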
KEYWORDS: Education and training, Deep learning, Feature fusion, Feature extraction, Data modeling, Performance modeling, Associative arrays, Machine learning, Detection and tracking algorithms, Classification systems
Attack detection involves closely monitoring for and identifying malicious attacks. Identifying and locating such attacks precisely requires thorough analysis and classification of malicious payloads. To this end, this paper proposes a deep learning-based method that classifies payloads efficiently for attack detection. The method segments payloads with regular expressions, which preserves their syntactic structure. An improved TF-IDF algorithm is also introduced to construct a streamlined vocabulary, alleviating the slow training caused by a large vocabulary. Fusing the feature vectors extracted by CNN and BiLSTM-Attention represents the payload content effectively, yielding more accurate recognition and mitigating the low detection accuracy of traditional methods. Experimental results show that the proposed method achieves an accuracy of 99.21% on the CSIC 2010 dataset, significantly higher than that of the traditional method, and trains faster. This suggests that the proposed method can detect stealthier attacks and support a more effective Web attack detection system.
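The segmentation and vocabulary-pruning steps can be sketched as follows. This is a minimal illustration under stated assumptions: the token pattern `TOKEN_RE` and the functions `segment` and `build_vocabulary` are hypothetical, and the scoring is plain smoothed TF-IDF, since the abstract does not describe what the paper's improved variant changes.

```python
import math
import re
from collections import Counter

# Hypothetical token pattern (assumption): identifiers, digit runs,
# then every other non-space character as its own token, so that
# quotes, brackets, and operators keep their place in the sequence.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[^\sA-Za-z0-9_]")


def segment(payload):
    # Lowercase, then split into syntax-preserving tokens.
    return TOKEN_RE.findall(payload.lower())


def build_vocabulary(payloads, size):
    # Keep the `size` tokens with the highest TF-IDF score in any payload.
    # Plain TF-IDF stands in for the paper's improved algorithm.
    docs = [segment(p) for p in payloads]
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    best = Counter()
    for doc in docs:
        if not doc:
            continue
        for tok, count in Counter(doc).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1
            best[tok] = max(best[tok], count / len(doc) * idf)
    # Sort by score, breaking ties alphabetically for a stable result.
    return sorted(best, key=lambda t: (-best[t], t))[:size]


tokens = segment("id=1' OR 1=1--")
vocab = build_vocabulary(
    ["id=1' OR 1=1--", "q=hello", "q=<script>alert(1)</script>"], 5
)
```

Tokens outside the retained vocabulary would typically be mapped to a single unknown-token index before the sequence is passed to the CNN and BiLSTM-Attention branches.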