Once all of the above indicators are tested individually, the idea is to put the whole directory through all 4 indicators and record their observations. A sample dataset containing all 4 indicators and their respective outputs was then recorded. A convention of +1 if the indicator is triggered and -1 if it is not triggered was used, and the dataset was passed to a popular machine learning technique called decision trees.
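As a minimal sketch of the +1/-1 convention, the snippet below turns the observations for one sample into a feature vector. The indicator names `indicator_1` to `indicator_4` and the function `encode_observation` are hypothetical placeholders introduced only for illustration; the actual indicators are the ones described earlier.

```python
# Hypothetical placeholder names; substitute the four indicators used in this work.
INDICATORS = ("indicator_1", "indicator_2", "indicator_3", "indicator_4")


def encode_observation(triggered):
    """Map each indicator's observation to +1 (triggered) or -1 (not triggered)."""
    return [1 if triggered[name] else -1 for name in INDICATORS]


# Example: three of the four indicators fired for this sample.
sample = {
    "indicator_1": True,
    "indicator_2": False,
    "indicator_3": True,
    "indicator_4": True,
}
features = encode_observation(sample)
```

Keeping the indicator order fixed ensures that every row of the dataset lists its attributes in the same positions.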
The basic version of the ID3 classification algorithm was deployed, which works on the principle of generating a decision tree from a fixed set of training instances. For the initial training examples, a sample was classified as Yes if more than 2 indicators were triggered and No if 2 or fewer indicators were triggered. The resulting tree is used to classify future samples. Each example has several attributes and belongs to a class (such as Yes or No).
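The training setup can be illustrated with a small, self-contained sketch. It is not the implementation used in this work: it assumes the training set is simply all 16 combinations of four +1/-1 indicator values, labels each with the "more than 2 triggered" rule, and builds a basic ID3 tree that picks, at every node, the attribute with the highest information gain. All function names are illustrative.

```python
from collections import Counter
from itertools import product
from math import log2


def label(row):
    """Yes if more than 2 indicators are triggered (+1), otherwise No."""
    return "Yes" if sum(1 for v in row if v == 1) > 2 else "No"


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the rows on attribute index attr."""
    total = len(rows)
    remainder = 0.0
    for value in set(r[attr] for r in rows):
        subset = [lab for r, lab in zip(rows, labels) if r[attr] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def id3(rows, labels, attrs):
    """Return a class name (leaf) or a dict describing a decision node."""
    if len(set(labels)) == 1:          # pure subset: make a leaf
        return labels[0]
    if not attrs:                      # no attributes left: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    node = {"attr": best, "branches": {}}
    remaining = [a for a in attrs if a != best]
    for value in set(r[best] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        node["branches"][value] = id3(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return node


def classify(tree, row):
    """Follow attribute tests from the root until a leaf (class name) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][row[tree["attr"]]]
    return tree


# Assumed training set: every combination of four +1/-1 indicator values.
rows = list(product((-1, 1), repeat=4))
labels = [label(r) for r in rows]
tree = id3(rows, labels, attrs=[0, 1, 2, 3])
```

Calling `classify(tree, features)` on a new +1/-1 vector then returns "Yes" or "No". In practice the tree would be trained on recorded observations rather than on an enumerated table.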
The leaf nodes of the decision tree contain the class name, whereas a non-leaf node is a decision node. A decision node is an attribute test, with each branch (leading to another decision tree) being a possible value of the attribute. ID3 uses information gain to help it decide which attribute goes into a decision node. Each split produces child nodes that hold different subsets of the data. Therefore, based on this algorithm, future incoming files could be easily classified as harmful or harmless.

IV. Scope of Improvement

It is important to realize here that safeguarding and securing information from any type of malware, particularly ransomware, means putting in continual effort and updating the mechanisms as and when any vulnerability is found in the existing techniques. There is always a possibility of evasion of these indicators, which would result in most of the malicious activities being marked safe, thereby letting them slip through our traps.
On carefully analyzing our work, we expect the following things to be embedded in our future versions:
• To include mechanisms that protect data privacy before malicious content even enters the system, i.e., analyzing network data and deploying robust search tools such as Elasticsearch over the network.
• To be able to work on more unstructured data, as most forms of malware that make their way into a computer system come with different forms of text and media.
• To improve the dynamic aspect of this mechanism, which would access, detect and delete the harmful content.
And of course, to make it work even faster and with more accurate results, which means fewer false positives.