NEURAL NETWORKS and their applicability in the security field
by nebunu

Disclaimer
==========

This is not an AI tutorial; it is written only to make the reader aware of
the applicability of neural networks in the security field.

Greetings to all my friends, fucks go to r0sec as usual (dead and buried
group), and to RDS for their lack of respect towards its clients (home.ro
sucks as usual).

Introduction
============

Neural networks are widely used for prediction, pattern recognition and
classification. Problems such as voice or handwriting recognition are very
hard to solve with standard programs and algorithms. I won't go into the
details of the backpropagation algorithm; for the full algorithm read
http://www.speech.sri.com/people/anand/771/html/node37.html
A similar learning principle is believed to be at work in our brain. In the
following lines I will focus on the applicability of neural networks in
security applications.

Brief details
=============

A neural network can learn a logical function without being given the rule
first: it simply learns by correcting its mistakes, which are expressed
mathematically as weights. Let's take the following function:

f(x,y)=x+y

It takes two numbers as input and outputs their sum. By feeding a neural
network enough examples and letting it learn, it will pick up the adding
rule and apply it to any pair of numbers it has never "seen" before.

In the following I will build a neural network that can learn any logical
function that takes two inputs and generates one output. Since I have two
inputs, my network will have the following architecture:

- two input neurons
- one hidden layer with two neurons
- one output neuron
- I have chosen the sigmoid function as the activation function

Explanation
===========

Obtaining results with neural nets involves the following steps:

- choose the right network structure
- feed the network enough sample data to learn from
- train the network
- give the network inputs it has never "seen" before and observe its
  response
- if the response is not accurate enough, increase the training data,
  adjust the learning rate and try again

For the f(x,y)=x+y function the training data will be

0.1  0.2  0.3
0.11 0.22 0.33
0.23 0.11 0.34
0.2  0.3  0.5
0.1  0.3  0.4
0.2  0.5  0.7
0.1  0.7  0.8

The first column represents x, the second one y, and the third is the
desired output (the result). Note that you can use any rule you like, as
long as the numbers stay in the [0,1] interval, because the sigmoid
function maps all values into that interval. If you want results > 1,
choose another output function and see how it behaves.

Code for solving f(x,y)=x+y is appended to this article.

Applicability
=============

- An intelligent network scanner that can recognize potential
  vulnerabilities by operating system. For example:

  1 = linux
  2 = freebsd
  3 = windows

  0.1 = ssh vulnerabilities
  0.2 = sendmail vulnerabilities
  0.3 = netbios vulnerabilities

  The scanner would scan the network, then organise the results in a
  table like

  1 0.1
  2 0.2
  3 0.3

  then feed the neural network those data and choose the desired output.

- DNS id prediction for weak implementations of BIND
- SYN/ACK prediction for Windows and other vulnerable systems
- scan large address ranges and classify the IPs from those ranges
  (0 = secure, 1 = easy to hack, 3 = whatever), then feed those data to
  a neural network
- most importantly, recognizing virus patterns, so that a signature
  database won't be necessary anymore
- build a database of spam classes and feed them to a neural net, so that
  every time a similar IP tries to connect to your mail server it will be
  blacklisted

Code
====

C code for the problem presented above. The training data should be
changed to match any function with two inputs and one output.
My data file back.txt contains

0.1 0.2 0.3
0.11 0.22 0.33
0.23 0.11 0.34
0.2 0.3 0.5
0.1 0.3 0.4
0.2 0.5 0.7
0.1 0.7 0.8

================================= cut here =================================

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <time.h>

#define RATA 0.5  /* learning rate */

int i,j;
float w[3][3];    /* weights: w[0][*] bias, w[1][*] and w[2][*] inputs */
float deltas[3];  /* error terms: two hidden neurons, one output neuron */
float v1,v2;

/* generate random initial weights in {-1,0,1} */
void genereaza()
{
    srand((unsigned)(time(NULL)));
    for(i=0;i<3;i++) {
        for(j=0;j<3;j++) {
            w[i][j]=(float)(rand()%3-1);
        }
    }
}

/* sigmoid function */
float sigma(float num)
{
    return (float)(1/(1+exp(-num)));
}

/* train the network on one example: forward pass + backpropagation */
float train(float inp1, float inp2, float output)
{
    float net1,net2,inp3,inp4,outp;

    /* forward pass: two hidden neurons, then the output neuron */
    net1=1*w[0][0]+inp1*w[1][0]+inp2*w[2][0];
    net2=1*w[0][1]+inp1*w[1][1]+inp2*w[2][1];
    inp3=sigma(net1);
    inp4=sigma(net2);
    net1=1*w[0][2]+inp3*w[1][2]+inp4*w[2][2];
    outp=sigma(net1);

    /* backpropagate the error */
    deltas[2]=outp*(1-outp)*(output-outp);
    deltas[1]=inp4*(1-inp4)*(w[2][2]*deltas[2]);
    deltas[0]=inp3*(1-inp3)*(w[1][2]*deltas[2]);

    /* update the weights, switching to the hidden outputs
       when updating the output neuron (i==2) */
    v1=inp1;
    v2=inp2;
    for(i=0;i<3;i++) {
        if(i==2) {
            v1=inp3;
            v2=inp4;
        }
        w[0][i]+=RATA*deltas[i];
        w[1][i]+=RATA*v1*deltas[i];
        w[2][i]+=RATA*v2*deltas[i];
    }
    return outp;
}

/* run the trained network on new inputs */
float run(float inp1, float inp2)
{
    float net1,net2,inp3,inp4;
    net1=1*w[0][0]+inp1*w[1][0]+inp2*w[2][0];
    net2=1*w[0][1]+inp1*w[1][1]+inp2*w[2][1];
    inp3=sigma(net1);
    inp4=sigma(net2);
    net1=1*w[0][2]+inp3*w[1][2]+inp4*w[2][2];
    return sigma(net1);
}

int main(void)
{
    int t;
    float a,b,i1,i2,o1;
    FILE *f;
    char *pch;
    char sir[1024];

    genereaza();  /* start from random weights */

    printf("\nLearning 20,000 epochs, with a 0.5 learning rate\n");
    for(t=0;t<20000;t++) {
        if((f=fopen("back.txt","r"))==NULL) {
            printf("Datafile back.txt does not exist.\r\n");
            exit(1);
        }
        while(fgets(sir,sizeof(sir),f)) {
            /* each line holds "x y desired-output" */
            pch=strtok(sir," ");
            i1=atof(pch);
            pch=strtok(NULL," ");
            i2=atof(pch);
            pch=strtok(NULL," ");
            o1=atof(pch);
            train(i1,i2,o1);
        }
        fclose(f);
    }

    printf("First input: ");
    scanf("%f",&a);
    printf("Second input: ");
    scanf("%f",&b);
    printf("Output = %f\n",run(a,b));
    return 0;
}

================================= cut here =================================
root@mail:/tmp# cat back.txt
0.1 0.2 0.3
0.11 0.22 0.33
0.23 0.11 0.34
0.2 0.3 0.5
0.1 0.3 0.4
0.2 0.5 0.7
0.1 0.7 0.8
root@mail:/tmp# ./test

Learning 20,000 epochs, with a 0.5 learning rate
First input: 0.1
Second input: 0.6
Output = 0.713706 -> ~0.7
root@mail:/tmp#

As you can see, the network learned the adding rule, and it can learn any
logical function of this shape if the training data are changed
accordingly.

Live long and prosper,
nebunu