This thesis is copyright material protected under the Berne Convention, the Copyright Act 1999 and other international and national enactments on intellectual property. It may not be reproduced by any means, in full or in part, except for short extracts in fair dealing for research or private study, critical scholarly review or discourse with acknowledgement, without the written permission of the Dean, School of Graduate Studies, on behalf of both the author and XXX XXX University.
With the fast-growing internet, the risk of intrusion has also increased, and as a result intrusion detection systems (IDS) have become a key research field. An IDS is used to identify any suspicious activity or pattern in a network or machine that undermines its security features or compromises the machine. Most IDS use all the features of the data. However, not all features are equally relevant for the detection of attacks, and not every feature contributes significantly to system performance. The main aim of this work is to develop an efficient denial of service (DoS) network intrusion classification model. The specific objectives were: to analyse existing literature on intrusion detection systems, covering the techniques used to model IDS, the types of network attacks, the performance of various machine learning tools and how network intrusion detection systems are assessed; to identify the top network traffic attributes that can be used to model denial of service intrusion detection; and to develop a machine learning model for the detection of denial of service network intrusion. Methods: The research design was experimental, and data was collected by simulation using the NSL-KDD dataset. A Correlation Feature Selection (CFS) mechanism was implemented with three search algorithms, and the smallest feature set was formed from the features that were selected most frequently. Findings: The smallest subset of features chosen is the most compact among all the feature subsets found. Further, the performance of Artificial Neural Network (ANN), decision tree (DT), Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) classifiers is compared for the 7 subsets found by the filter model and for the full 41 attributes. Results: The outcome indicates a remarkable improvement in the performance metrics used for comparison of the classifiers.
The results show that using 17/18 selected features improves DoS type classification accuracy compared with using all 41 features in the NSL-KDD dataset. It was further observed that an ensemble of three classifiers with decision fusion performs better than a single classifier for DoS type classification. Among the machine learning tools tested, ANN achieved the best classification accuracy, followed by SVM and DT. KNN registered the lowest classification accuracy. Application: With its improved detection rate, shorter classification time and the merits of the minimal feature subset found, the proposed work will play a vital role in helping network administrators choose an efficient IDS.
This chapter presents the background information to the study, the research problem to be solved, the objectives to be achieved and the research questions. Additionally, the motivation behind the work is outlined in the problem justification, and finally the scope to be covered is presented.
Network behaviour anomaly detection (NBAD) provides one approach to network security threat detection; it is a technology complementary to systems that detect security threats based on packet signatures.
An NBAD approach thus comprises two stages, namely the training and testing phases. In the training phase the classifier is trained on the available labelled data, whereas in the testing phase instances are classified as anomalous or normal by the classification algorithm. The central premise of anomaly detection is that intrusive activity is a subset of anomalous activity. An intruder who gains access to a host computer or system without knowing the authenticated user’s activity patterns will most probably behave in a way that is detected as anomalous. In the ideal case, the set of anomalous activities is the same as the set of intrusive activities, and flagging all anomalous activities as intrusive results in no false positives and no false negatives. However, intrusive activity does not always coincide with anomalous activity, which may dupe the anomaly detection algorithm.
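The relationship between the two sets of activities can be illustrated with a minimal sketch. The event identifiers below are hypothetical and only serve to show how false positives and false negatives arise when the flagged set and the intrusive set do not coincide.

```python
def detection_errors(anomalous, intrusive):
    """Compare what an anomaly detector flags with what is actually intrusive."""
    anomalous, intrusive = set(anomalous), set(intrusive)
    false_positives = anomalous - intrusive  # flagged as anomalous, but not intrusive
    false_negatives = intrusive - anomalous  # intrusive, but not flagged
    return false_positives, false_negatives


# Hypothetical event identifiers, for illustration only.
flagged = {"e1", "e2", "e3"}
attacks = {"e2", "e3", "e4"}
fp, fn = detection_errors(flagged, attacks)
print(sorted(fp), sorted(fn))  # ['e1'] ['e4']
```

When the two sets are identical, both returned sets are empty, which is the ideal case described above.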
Network staff are routinely challenged with a wide range of unusual events, some of which may be malicious. Network operators are required to detect and classify these anomalies as they occur and to choose the right response. A general anomaly diagnosis system should therefore be able to detect a range of anomalies with diverse structure, distinguish between different types of anomalies and group similar anomalies. This goal is ambitious, but it is coming into focus because network operators are finding it practical to harvest network-wide views of traffic in the form of sampled flow data. An important challenge, therefore, is to determine how best to extract understanding about the presence and nature of traffic anomalies from the potentially overwhelming mass of network-wide traffic data. In the case of malicious anomalies, it is difficult to define a given set of network anomalies permanently and clearly. New network anomalies will continue to arise over time, and an anomaly detection system should therefore avoid being restricted to any predefined set of anomalies.
Network traffic analysis is the process of recording, reviewing and analyzing network traffic for the purpose of general network management, operations, security and performance. It uses automated and manual techniques to review granular-level details and statistics within network traffic.
Because network operation is highly technical, traffic classification is pertinent to understanding the network. The main objective of classification is to help identify the types of applications used by end users and the share of the total traffic mix generated by each application. Moreover, the communication between IP network nodes can be organized into flows, and traffic classification can assign a specific application to each individual flow. A flow is defined as a collection of IP packets sent from a given port at one IP address to a given port at another IP address using a specific protocol. A flow is thus identified by its five-tuple flow identifier: source IP address, source port, destination IP address, destination port and protocol identifier.
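As a minimal sketch of the flow definition above, packets can be grouped by their five-tuple. The packet dictionaries, field names and addresses are assumptions made for illustration; they do not come from the dataset used in this study.

```python
from collections import defaultdict, namedtuple

# Five-tuple flow identifier as described in the text.
FlowKey = namedtuple("FlowKey", "src_ip src_port dst_ip dst_port protocol")


def group_into_flows(packets):
    """Group packet records (dicts with hypothetical field names) into flows."""
    flows = defaultdict(list)
    for pkt in packets:
        key = FlowKey(pkt["src_ip"], pkt["src_port"],
                      pkt["dst_ip"], pkt["dst_port"], pkt["protocol"])
        flows[key].append(pkt)
    return dict(flows)


# Illustrative packets using documentation address ranges.
packets = [
    {"src_ip": "198.51.100.7", "src_port": 40000, "dst_ip": "192.0.2.10",
     "dst_port": 80, "protocol": "tcp", "size": 60},
    {"src_ip": "198.51.100.7", "src_port": 40000, "dst_ip": "192.0.2.10",
     "dst_port": 80, "protocol": "tcp", "size": 1500},
    {"src_ip": "198.51.100.7", "src_port": 40001, "dst_ip": "192.0.2.10",
     "dst_port": 53, "protocol": "udp", "size": 80},
]
flows = group_into_flows(packets)
print(len(flows))  # 2
```

The first two packets share all five fields and therefore belong to the same flow; the third differs in source port, destination port and protocol.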
As the name suggests, small-world networks are characterized by a relatively small average path length between any pair of nodes in the network. Watts and Strogatz define them as “highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs”. Formally, they are defined as networks in which the distance L between any two randomly selected nodes is directly proportional to the logarithm of the number of nodes N in the network, that is, L ∝ log N.
Most IDS use machine learning (ML) tools to achieve a high detection capability through automation, relieving the user of the task of constructing signatures for known and novel attacks. Theoretically, an ML algorithm can achieve the best performance, that is, maximize detection accuracy and minimize the false alarm rate. Decision trees are among the most commonly used supervised learning algorithms in IDS because of their accuracy, simplicity, fast adaptation and high detection rate. The Artificial Neural Network (ANN) is another method that models both linear and non-linear patterns and performs well. The resulting model can generate a probability estimate of whether given data matches the characteristics it has been trained to recognize. ANN-based IDS usually perform well on difficult detection tasks.
Support Vector Machines (SVMs) are capable of dealing with high-dimensional data and of providing real-time detection. SVMs map the training vectors into a high-dimensional feature space through a nonlinear mapping, labelling each vector by its class. The data is then classified by determining a set of support vectors, which are members of the set of training inputs that outline a hyperplane in the feature space. SVMs are relatively insensitive to the number of data points, which makes them scalable.
Artificial intelligence (AI) based IDS techniques involve establishing an explicit or implicit model that allows patterns to be categorized. Ponce has listed many advantages of AI based techniques over conventional approaches. The major advantages include flexibility (vs. the threshold definitions of conventional techniques); adaptability (vs. the specific rules of conventional techniques); pattern recognition (and detection of new patterns); fast computing (faster than humans); and learning abilities.
The multilayer perceptron (MLP) has made ANN-based IDS tools more efficient and accurate in detecting normal and abnormal communication. Compared with traditional mechanisms, MLP-ANN shows better detection outcomes and overcomes the limitation in detecting low-frequency attacks. MLP-ANN can easily identify the type of an attack and classify it. This feature allows the system to predefine actions against similar future attacks.
The SVM is a classifier based on finding a separating hyperplane in the feature space between two classes in such a way that the distance between the hyperplane and the closest data points of each class is maximized. The approach is based on a minimized classification risk rather than on optimal classification. SVMs are well known for their generalization ability and are particularly useful when the number of features, m, is high and the number of data points, n, is low (m >> n).
When the two classes are not separable, slack variables are added and a cost parameter is assigned to the overlapping data points. The maximum margin and the position of the hyperplane are determined by a quadratic optimization with a practical runtime of O(n^2), placing the SVM among the fast algorithms even when the number of attributes is high. Various types of dividing classification surfaces can be realized by applying a kernel, such as linear, polynomial, Gaussian Radial Basis Function (RBF) or hyperbolic tangent. SVMs are binary classifiers, and multi-class classification is realized by developing an SVM for each pair of classes.
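The pairwise (one-vs-one) scheme for multi-class classification can be sketched as follows. This is only an illustration of the voting step: the binary decision function used here is a nearest-centroid stand-in, not a trained SVM, and the class names and centroid values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(classes, pairwise_decide, x):
    """Let every pair of classes vote on x and return the most voted class."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    return votes.most_common(1)[0][0]


# Stand-in for a trained binary SVM: choose the class with the closer centroid.
# The one-dimensional centroids are made-up values for illustration.
centroids = {"normal": 0.0, "neptune": 5.0, "smurf": 10.0}


def nearest_centroid_decide(a, b, x):
    return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b


classes = list(centroids)
n_pairs = len(list(combinations(classes, 2)))
print(n_pairs)  # 3 binary classifiers for 3 classes
print(one_vs_one_predict(classes, nearest_centroid_decide, 9.0))  # smurf
```

With k classes this scheme needs k(k-1)/2 binary classifiers, one per pair of classes.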
J48 is an extension of the ID3 algorithm and one of the most common classifiers used in supervised learning to predict the class of new unlabelled data. J48 creates univariate decision trees. It selects attributes using entropy and the information gain of each attribute. J48 has been used in various fields of study, including pattern recognition, machine learning, information extraction and data mining. It can handle nominal, numerical and textual input data. It can build small trees and follows a depth-first strategy and a divide-and-conquer approach.
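The entropy and information gain calculation mentioned above can be sketched as below. This shows plain information gain on nominal attributes only; it is not a reproduction of J48 itself, and the attribute names and records are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting on one nominal attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


# Hypothetical connection records, for illustration only.
rows = [
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "udp", "flag": "SF"},
    {"protocol": "icmp", "flag": "SF"},
]
labels = ["dos", "normal", "normal", "normal"]
best = max(["protocol", "flag"], key=lambda a: information_gain(rows, labels, a))
print(best)  # flag
```

In this toy data, splitting on `flag` separates the classes completely, so it has the higher gain and would be chosen as the split attribute.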
Decision trees are popular and powerful tools for prediction and classification. A decision tree has three main components: nodes, arcs and leaves. Each node is labelled with a feature attribute, each arc out of a node is labelled with a value of that node’s feature, and each leaf is labelled with a category or class. J48 classifies a data point by starting at the root of the tree and following the arcs until a leaf node is reached; the leaf node then gives the classification of the data point.
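The root-to-leaf classification procedure can be illustrated with a small hand-built tree. The nested-dictionary representation, the feature names and the class labels are assumptions chosen for this sketch, not the tree learned in this study.

```python
def classify(tree, instance):
    """Walk from the root to a leaf, following the arc that matches each feature value."""
    node = tree
    while isinstance(node, dict):          # internal node: labelled with a feature
        value = instance[node["feature"]]  # arc: labelled with a feature value
        node = node["branches"][value]
    return node                            # leaf: labelled with a class


# Hypothetical tree, for illustration only.
tree = {
    "feature": "flag",
    "branches": {
        "S0": "dos",
        "SF": {
            "feature": "protocol",
            "branches": {"tcp": "normal", "udp": "normal", "icmp": "dos"},
        },
    },
}
print(classify(tree, {"flag": "SF", "protocol": "icmp"}))  # dos
```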
K-Nearest Neighbours (KNN) is an algorithm used in machine learning for regression and classification problems. It uses the stored data and classifies new data points based on a similarity measure, for example a distance function. Classification is done by a majority vote of a point’s neighbours. It is one of the simplest and most traditional nonparametric techniques for classifying samples.
It computes the approximate distances between different points on the input vectors, and then assigns the unlabelled point to the class of its k nearest neighbours. In creating a k-NN classifier, k is an important parameter, and different values of k lead to different performance. If k is very large, the neighbours used for prediction increase the classification time and affect the accuracy of prediction. k-NN is called instance-based learning, and it differs from the inductive learning approach: it has no model training stage, but only searches the examples of input vectors and classifies new instances. k-NN therefore trains on the examples “on-line” and finds the k nearest neighbours of the new instance.
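A minimal sketch of the procedure described above follows, assuming Euclidean distance as the similarity measure and a simple majority vote. The training points and labels are made up for illustration.

```python
from collections import Counter
from math import dist  # Euclidean distance, available from Python 3.8


def knn_classify(training, query, k=3):
    """Return the majority class among the k training points closest to query."""
    neighbours = sorted(training, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors, for illustration only.
training = [
    ((0.0, 0.0), "normal"), ((0.0, 1.0), "normal"), ((1.0, 0.0), "normal"),
    ((5.0, 5.0), "dos"), ((5.0, 6.0), "dos"), ((6.0, 5.0), "dos"),
]
print(knn_classify(training, (4.5, 5.0), k=3))  # dos
```

There is no training step: every call scans the stored examples, which is why classification time grows with the size of the data and with k.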
DoS attacks are activities in which attackers, which are also computers connected to the network, try to exhaust the computing resources or network bandwidth of a targeted victim system in order to prevent it from providing services to legitimate users. The effect of a DoS attack depends mainly on its impact on the target and its similarity to legitimate traffic.
The attributes of a DoS attack include increased traffic flow, where several unauthorized users send a huge number of service requests, which are themselves legitimate, to the targets per unit of time in order to exhaust their capacity. Effective DoS attacks usually lead to traffic flows with many different source IP addresses but only very few different destination IP addresses. For this reason, they are called Distributed DoS (DDoS) attacks. In addition, as with scanning, the other properties of the flows are usually the same or very similar.
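The many-sources, few-destinations pattern can be checked with a simple count of distinct source addresses per destination. The sketch below is illustrative only: the flow records are hypothetical, and the `min_sources` threshold is an assumed parameter that would need tuning on real traffic.

```python
from collections import defaultdict


def sources_per_destination(flows):
    """Count distinct source IPs seen for each destination IP."""
    sources = defaultdict(set)
    for src_ip, dst_ip in flows:
        sources[dst_ip].add(src_ip)
    return {dst: len(srcs) for dst, srcs in sources.items()}


def suspicious_destinations(flows, min_sources):
    """Destinations contacted by at least min_sources distinct sources."""
    counts = sources_per_destination(flows)
    return sorted(dst for dst, n in counts.items() if n >= min_sources)


# Hypothetical (source IP, destination IP) pairs using documentation ranges.
flows = [
    ("198.51.100.1", "192.0.2.10"),
    ("198.51.100.2", "192.0.2.10"),
    ("198.51.100.3", "192.0.2.10"),
    ("203.0.113.5", "192.0.2.20"),
]
print(suspicious_destinations(flows, min_sources=3))  # ['192.0.2.10']
```

A count like this does not by itself separate an attack from a legitimate surge in demand; it only highlights destinations that merit closer inspection.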
A flash crowd is very similar to a DDoS attack. It is characterized by an unusually high amount of traffic from a set of IPs. Contrary to DDoS, however, a flash crowd is not a result of malicious activity. With the rise of social network sites and websites where users can share interesting hyperlinks with each other, situations arise in which people from all over the world access a particular site during the same time interval. This effect can cause websites to load very slowly or even go down entirely. Flash crowds differ from DDoS in the number of different IPs observed in the anomaly. While a botnet used for DDoS attacks can be very large, it is normally eclipsed by the sheer number of IPs present in a flash crowd. A flash crowd is distinguished by an increase in a particular kind of traffic flow (e.g. FTP flows). Several approaches have been proposed for anomaly detection. These include machine learning based techniques, statistical anomaly detection and data-mining based methods.
A decision table is an organizational or programming tool for the representation of discrete functions. It can be viewed as a matrix where the upper rows specify sets of conditions and the lower ones sets of actions to be taken when the corresponding conditions are satisfied; thus each column, called a rule, describes a procedure of the type: if conditions, then actions.
Given an unlabelled instance, the decision table classifier searches for exact matches in the decision table using only the features in the schema (note that there may be many matching instances in the table). If no instances are found, the majority class of the decision table is returned; otherwise, the majority class of all matching instances is returned.
If the training dataset size is D and the test dataset size is d, with N attributes, the complexity of predicting one instance is O(D*N). A universal hash table is therefore used as the underlying data structure to bring down the complexity. The time to compute the hash function is O(n’), where n’ is the number of features used as the schema in the decision table. The complexity thus becomes that of a lookup operation over n’ attributes combined with l, the number of classes, that is, O(n’ + l).
To build a decision table, the induction algorithm must decide which features to include in the schema and which instances to store in the body. More details can be found in. We use the CFS algorithm as the induction algorithm for our experiment.
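The lookup described above can be sketched with a hash-based table keyed on the schema features. A Python dictionary is used here as a stand-in for the universal hash table; the class name, schema and records are hypothetical.

```python
from collections import Counter, defaultdict


class DecisionTable:
    """Majority-class decision table keyed on the schema features only."""

    def __init__(self, schema):
        self.schema = tuple(schema)
        self.table = defaultdict(Counter)  # schema values -> class counts
        self.class_counts = Counter()      # class counts over the whole table

    def _key(self, instance):
        return tuple(instance[f] for f in self.schema)

    def fit(self, instances, labels):
        for inst, label in zip(instances, labels):
            self.table[self._key(inst)][label] += 1
            self.class_counts[label] += 1
        return self

    def predict(self, instance):
        matches = self.table.get(self._key(instance))
        # No exact match: fall back to the majority class of the whole table.
        counts = matches if matches else self.class_counts
        return counts.most_common(1)[0][0]


# Hypothetical training records, for illustration only.
instances = [
    {"protocol": "tcp", "flag": "S0"}, {"protocol": "tcp", "flag": "S0"},
    {"protocol": "tcp", "flag": "SF"}, {"protocol": "udp", "flag": "SF"},
    {"protocol": "icmp", "flag": "SF"},
]
labels = ["dos", "dos", "normal", "normal", "normal"]
dt = DecisionTable(schema=("protocol", "flag")).fit(instances, labels)
print(dt.predict({"protocol": "tcp", "flag": "S0"}))    # dos
print(dt.predict({"protocol": "icmp", "flag": "REJ"}))  # normal
```

Building the key touches only the schema features, and the hash lookup avoids scanning all stored instances for each prediction.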
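CFS is usually described as scoring a candidate feature subset by a merit heuristic that rewards high feature-class correlation and penalizes high feature-feature correlation. The sketch below assumes that standard heuristic, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), with k features, mean feature-class correlation r_cf and mean feature-feature correlation r_ff; the correlation values are made up, and how they are estimated is left out.

```python
from math import sqrt


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset under the standard CFS heuristic (assumed here).

    feature_class_corr: one correlation per feature with the class.
    feature_feature_corr: one correlation per pair of features in the subset.
    """
    k = len(feature_class_corr)
    mean_rcf = sum(feature_class_corr) / k
    mean_rff = (sum(feature_feature_corr) / len(feature_feature_corr)
                if feature_feature_corr else 0.0)
    return k * mean_rcf / sqrt(k + k * (k - 1) * mean_rff)


# Illustrative correlation values only.
redundant = cfs_merit([0.8, 0.6], [0.5])    # two features that overlap
independent = cfs_merit([0.8, 0.6], [0.0])  # same relevance, no overlap
print(redundant < independent)  # True
```

A search algorithm then explores candidate subsets and keeps the one with the highest merit.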
DoS attacks are classified into two categories, namely high-volume and low-volume variations. High-volume attacks are characterized by a high volume of application-layer requests transmitted to a victim. Low-volume DoS attacks are characterized by small amounts of attack traffic transmitted to a victim strategically. Since one-shot attacks generally exploit a specific weakness or vulnerability in an application-level protocol or service, this study focuses on the more universal type of application DoS, slow-rate attacks, which are often seen in two variations: slow send and slow read. The lack of data containing application-layer DoS attacks prompted us to create an evaluation dataset. The setup was a testbed environment with a victim web server running Apache Linux v.2.2.22, PHP5 and Drupal v.7 as a content management system. The attacks were selected to represent the most common types of application-layer DoS.