Database Parsing
#11
(12-28-2015, 09:41 PM)MuddyBucket Wrote:
(12-28-2015, 06:05 PM)zeroday Wrote: Sometimes I see dumps like:

Code:
username  
    email
  password
ip



   username  
 email
  password
ip

 username  
 email
  password
   ip

I don't know why you would dump it like that, but that is annoying to parse.


I don't get it. As long as the data is in some sort of pattern, it is piss fucking easy to write a script to reorganise the data...

As you can see in the example I gave, it has a random amount of spaces, new lines, etc.
So I am not gonna write a script for every file.
Reply
#12
(12-28-2015, 09:45 PM)zeroday Wrote: As you can see in the example I gave, it has a random amount of spaces, new lines, etc.
So I am not gonna write a script for every file.

You must not be very experienced. Removing blank spaces, blank lines, etc. is child's play...
Reply
#13
Are people really that lazy (or dumb) that they won't load a SQL file into a database and use queries? Or, if it isn't a proper db dump, that they don't just use grep and regex? Either way, good luck with this service.
Reply
#14
(12-29-2015, 08:14 PM)NO-OP Wrote: Are people really that lazy (or dumb) that they won't load a SQL file into a database and use queries?
Yes, they are... lol. Many want the SQL shit stripped out. Notice how most of the db leaks on pastebin, etc. only have user:hash:email?

(12-29-2015, 08:14 PM)NO-OP Wrote: Or, if it isn't a proper db dump, that they don't just use grep and regex?

Personally I prefer SQL, so I am assuming some need help converting the above crap that doesn't have SQL into SQL statements.

Again, all pretty easy to do imo, but there are a lot of kiddies out there that probably need someone else to do it for them.
Reply
#15
(12-30-2015, 07:16 PM)123 Wrote:
(12-29-2015, 08:14 PM)NO-OP Wrote: Are people really that lazy (or dumb) that they won't load a SQL file into a database and use queries?

I have over 700 databases and sifting through them is much faster since a lot of them are parsed. If they weren't parsed, it would probably take over 15 minutes to sift through everything, as opposed to the ~2 minutes it takes now.

I don't think you get it...

Databases are designed for working with data. You're basically taking that data and making it more difficult to work with...

Something like, for example: SELECT hash, COUNT(*) FROM records GROUP BY hash HAVING COUNT(*) > 1

That will return every hash that appears in more than one record. So if it's unsalted, you can see if anyone is using the same passwords. Since we know passwords like 123456 and God are among the most common, the most frequently occurring hashes are likely to be the most used passwords. Shit like that anyway.

What exactly are you sifting through 700 databases in ~2 minutes for anyway?
Reply
#16
I can parse anything zeroday can't, just in case there's any issues or whatever.
Regex is your friend.
Reply
#17
(12-28-2015, 09:41 PM)MuddyBucket Wrote:
(12-28-2015, 06:05 PM)zeroday Wrote: Sometimes I see dumps like:

Code:
username  
    email
  password
ip



   username  
 email
  password
ip

 username  
 email
  password
   ip

I don't know why you would dump it like that, but that is annoying to parse.


I don't get it. As long as the data is in some sort of pattern, it is piss fucking easy to write a script to reorganise the data...

ok, write a script to reorganise this:
Code:
Monkey.Sei      John Korzeniewski       anzelmogame@gmail.com   7282fc85febf5a5828d8d9acda041f2b3307cf54
Spartaz Nicholas Simpson        nicknater@hotmail.com   79987ffbc951e0c889ebfcc154b37257dc5c9948
sironuma        siro numa       sironuma@gmail.com      c283a9fa080cc45dfae20598412074d759d2fff3
kurosakitaro    iwasakitaro     doragonandbunnylove@yahoo.co.jp ef432d89c89c1698d6f638d5f847593f12f18b1d
Baddi63 benjamin        ben_the_sniper@hotmail.fr       34afa71744574ed55cb92155e4a9d6bcdbb09e98
kuromi  kuroda huun     je_te_veux_kuro70@docomo.ne.jp  62fe73a8de05e5e7f42c642b2af42bdd12483bbd
Shaunalex       Shaun m8m8      shaunalexyt@gmail.com   a8999e4acb7fe826b532ff0fd3bdef962fe6023c
ag55ful Alexander Dular alexg.2001@hotmail.com  5aa9750512cac30e3b8d0a7cb1b03d368fc2e999
mynameisyogi    Kevin Braun     kev.braun@gmail.com     f31f79795ad4c5dfb335bd2a5cce9d4baf3bc0f0
daybreak-still  KT      urobolos47@yahoo.co.jp  08a64d41b24d4d2fe40bd16fe74cca207e7b63ad

kthnx, shouldn't take you long since it's piss fucking easy
Reply
#18
(12-31-2015, 09:07 AM)Senpai Wrote: kthnx, shouldn't take you long since it's piss fucking easy
Right, you mean like this 8 line piece of code that took fuck all time like i said...

Code [No Highlight]:

with open('dump.txt') as file:
    for record in file:
        field = record.split()  # split on any run of whitespace
        if len(field) < 4:  # skip blank or short lines
            continue
        # a two-word name pushes the email to field[3], so keep five fields
        width = 5 if '@' not in field[2] else 4
        print(":".join(field[:width]))

piss off with your fucking sarcasm ya cunt. just cause you can't fucking code doesn't mean it aint fucking easy.
Reply
#19
(12-31-2015, 09:07 AM)Senpai Wrote: ok, write a script to reorganise this:
kthnx, shouldn't take you long since it's piss fucking easy

With just a quick sit-down with grep, you can parse most of the data:

Code:
grep -Eio "\s+?[a-z0-9@\.\-\_]+\s?[a-z0-9\.\-\_]+\s?+"

Outside of a few outliers it gets most of them; now you just need to place them into columns, etc. The better question is why that data is so fucked. Why are there random blocks of white space in some sections and not others? People should just stop being dumb and stick to SQL. It isn't hard, and it makes staying organized SO easy.
Reply
#20
(12-31-2015, 04:20 PM)MuddyBucket Wrote:
(12-31-2015, 09:07 AM)Senpai Wrote: kthnx, shouldn't take you long since it's piss fucking easy
Right, you mean like this 8 line piece of code that took fuck all time like i said...

Code [No Highlight]:

with open('dump.txt') as file:
    for record in file:
        field = record.split()  # split on any run of whitespace
        if len(field) < 4:  # skip blank or short lines
            continue
        # a two-word name pushes the email to field[3], so keep five fields
        width = 5 if '@' not in field[2] else 4
        print(":".join(field[:width]))

piss off with your fucking sarcasm ya cunt. just cause you can't fucking code doesn't mean it aint fucking easy.

thanks, using sarcasm to get helper monkeys to do my work is fun.
Reply

