searching any string and sorting in DNS queries (question updated)

Problem statement:
I have an n-by-2 matrix (n rows, 2 columns). One of the columns holds hostnames, e.g. www.google.com, www.facebook.com, etc.
The matrix was derived from the DNS queries in a network, so it contains a huge number of queried hostnames.
Leaving all of that aside for a moment: I now get a list of hostnames that are malware-affected, blacklisted, or infected. What I need to do is find out whether any of the blacklisted domains (which I get from a different source, say an antivirus company) appears in the DNS queries of my network log.
say for example : - ww w .ma thworks . com (space intentionally left)
is an infected site, and I want to check my DNS traffic for any queried hostname that matches mathworks.com. That would give me insight into whether my network is part of a botnet, or a victim of a trojan or something like that.
So my plan is to pick out keywords such as mathworks in the case above and match those strings against the network traffic log or DNS log. For that I need to get all the keywords between two dots in a hostname; say
we have to get (google and co) from www.google.co.in
If we get the keywords between the dots, we can match them against the original DNS log file to find out whether my network is infected or not.
I seriously think I am pretty bad at explaining things :(

Accepted Answer

Ken Atwell on 10 Apr 2012
I'll take a stab at what I think you are describing:
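    % Column 1 holds the queried hostname; column 2 is a placeholder for the other log field.
    logs = ...
        { 'www.mathworks.com', '0.0.0.0'; ...
          'www.malware.org',   '0.0.0.0'; ...
          'www.google.com',    '0.0.0.0' };
    blacklist = { 'www.virus.org', 'www.malware.org' };

    % One logical flag per log row, initially all false.
    inBlacklist = false(size(logs,1), 1);
    for i = 1:numel(blacklist)
        % Flag the rows whose hostname exactly equals this blacklist entry.
        inBlacklist = inBlacklist | strcmp(logs(:,1), blacklist{i});
    end
    inBlacklist   % no semicolon, so the result is displayed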
The for loop can be replaced with a cellfun function call, but I stuck with a for loop for readability. In your real code, you may need to replace the simple 'strcmp' with 'strfind' or even 'regexp'.
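As a rough sketch of the loop-free and regexp variants mentioned above (not necessarily what the answer had in mind), the exact match can also be done with ismember instead of cellfun, the labels between the dots can be pulled out with regexp's 'split' option, and a suffix match can flag subdomains of a blacklisted domain. The extra log rows, the variable names, and the choice to strip a leading 'www.' from each blacklist entry are assumptions made for illustration only; strjoin requires R2013a or later.

```matlab
% Hypothetical sample data, extending the example from the answer.
logs = { 'www.mathworks.com', '0.0.0.0'; ...
         'www.malware.org',   '0.0.0.0'; ...
         'cdn.malware.org',   '0.0.0.0'; ...
         'www.google.co.in',  '0.0.0.0' };
blacklist = { 'www.virus.org', 'www.malware.org' };

% Hostnames are case-insensitive, so compare in lower case.
hosts = lower(logs(:,1));

% 1) Exact match without a loop: ismember works on cell arrays of strings.
exactHit = ismember(hosts, lower(blacklist(:)));

% 2) Labels between the dots of one hostname.
labels = regexp('www.google.co.in', '\.', 'split');

% 3) Suffix match: flag a host that equals a blacklisted domain or is a
%    subdomain of it. Stripping a leading 'www.' is an assumption about
%    how the blacklist entries should be interpreted.
domains = regexprep(lower(blacklist), '^www\.', '');
escaped = cellfun(@(d) regexptranslate('escape', d), domains, ...
                  'UniformOutput', false);
pattern = ['(^|\.)(' strjoin(escaped, '|') ')$'];
suffixHit = ~cellfun('isempty', regexp(hosts, pattern, 'once'));
```

With this sample data, exactHit flags only the 'www.malware.org' row, while suffixHit also flags 'cdn.malware.org'. Matching on single labels such as 'google' alone would be much looser and is likely to produce false positives, so anchoring the pattern at the end of the hostname seems the safer starting point.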
  2 Comments
Karan on 10 Apr 2012
Okay sir, I will try the code with regexp. Overall I got the basic idea of what you wanted to convey.
To keep other ideas coming, I am leaving the question open for some time :)


More Answers (1)

Karan on 10 Apr 2012
andrei bobrov, hope you missed the question ... :)
  3 Comments
Jan on 10 Apr 2012
[EDITED COMMENT - web links removed]
Karan wrote:
Sir, please refer to the last question that has been linked in the question above.
What I need to do is filter out malware domain names from a traffic capture log. So I need to extract the keywords or domain names from a list of known blacklisted malware domain names. Say we have a list of blacklisted domains:
www.******.cc
www.******.za
www.******.cn
So what I need to do is extract keywords like blacklist, infected, trojanhorse from the hostnames, and then match them against the institute's log or the demo log to find out whether any of the infected domains appears in our computer or firewall traffic log.
Hope I didn't mess it up this time.
Jan on 10 Apr 2012
I've read your question 4 times now, but I do not have the faintest idea of what the actual problem is. I do not want to read any other threads to understand this thread, therefore I ask you to include all information required to solve your problem here, by editing the original question text, not by adding comments. Thanks.
