Sunday, June 25, 2017

32hex is not MD5? What are Youku talking about?

 



During April 2017, various online sources alleged that Youku, a Chinese video hosting service, was hacked and that roughly 100 million user accounts were compromised. These sources stated that Youku usernames, along with passwords hashed with the MD5 and SHA1 algorithms, were leaked. We decided to take a closer look in early June and present our findings in this post.

Of the 99,075,692 lines of data present in the leak provided to us, we were able to extract 99,028,838 usable hash strings. The extracted hashes varied in length from 30 to 32 ASCII-hex characters, which suggested to us that they were MD5-like rather than SHA1 (a SHA1 digest is 40 hex characters). After de-duplicating the hashes we were left with 57,205,528, which suggests there was password reuse in this data.
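The extraction and de-duplication step can be sketched in a few lines of Python. This is only an illustration, not the tooling we used: the `email:hash` line layout in the usage example and the helper names `extract_hashes` and `summarize` are assumptions made for the sketch.

```python
import re
from collections import Counter

# Assumption: a usable hash is a run of 30 to 32 hex characters that is not
# part of a longer hex run (so a 40-character SHA1 digest would not match).
HASH_RE = re.compile(r"(?<![0-9A-Fa-f])[0-9A-Fa-f]{30,32}(?![0-9A-Fa-f])")


def extract_hashes(lines):
    """Yield the first hash-like token on each line; skip lines without one."""
    for line in lines:
        match = HASH_RE.search(line)
        if match:
            yield match.group(0)


def summarize(lines):
    """Return (usable_hashes, unique_hashes) for an iterable of dump lines."""
    counts = Counter(extract_hashes(lines))
    return sum(counts.values()), len(counts)


# Hypothetical sample lines; the real dump layout may differ.
sample = [
    "a@example.com:5f4dcc3b5aa765d61d8327deb882cf99",
    "b@example.com:5f4dcc3b5aa765d61d8327deb882cf99",
    "c@example.com:5f4dcc3b5aa765d61d8327deb882cf",
    "line without a hash",
]
usable, unique = summarize(sample)
```

A large gap between the usable and unique counts, as in this leak, is what points to shared passwords.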

A common practice, especially on Chinese websites, is for developers to employ a form of security through obscurity in their password storage schemes. We suspect this is most likely done to deter the hashes from being loaded into off-the-shelf password crackers. Another explanation would be that mistakes were made in processing the data.

As we started to work on this data set, it quickly became apparent that there were more than just plain MD5 hashes in this file. We were able to identify both iterated MD5 hashes and more complex substring iterated hashes. Each of these also appeared as a chopped value (last digits removed). The majority of hashes were MD5($pass), but we found a sizeable number of MD5(MD5($pass)) and MD5(MD5(MD5($pass))). The substring hashes were of the form MD5(substr(MD5($pass),8,16)).
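To make the variants concrete, here is a minimal Python sketch of the four schemes. It assumes that each inner digest is fed to the next round as lowercase hex text, and that substr(x,8,16) means "16 characters starting at zero-based offset 8" (PHP-style, i.e. characters 8 to 24). The function names are ours, chosen for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 of a string, returned as 32 lowercase hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def md5_iterated(password: str, rounds: int) -> str:
    """MD5 applied `rounds` times, re-hashing the hex text each round."""
    digest = md5_hex(password)
    for _ in range(rounds - 1):
        digest = md5_hex(digest)
    return digest


def md5_substr(password: str, start: int = 8, length: int = 16) -> str:
    """MD5(substr(MD5($pass), start, length)) under the assumed semantics."""
    inner = md5_hex(password)
    return md5_hex(inner[start:start + length])


def candidates(password: str) -> dict:
    """All four digests observed in the leak for one candidate password."""
    return {
        "MD5($pass)": md5_iterated(password, 1),
        "MD5(MD5($pass))": md5_iterated(password, 2),
        "MD5(MD5(MD5($pass)))": md5_iterated(password, 3),
        "MD5(substr(MD5($pass),8,16))": md5_substr(password),
    }
```

For the password "password", the plain MD5 is 5f4dcc3b5aa765d61d8327deb882cf99, so the substring variant hashes the 16 characters 5aa765d61d8327de.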

The number of different MD5 variations used in hashing the passwords could be attributed to a number of factors that we cannot know for certain and can only speculate about. The simplest explanation is that the developers changed the hashing method over successive updates to their website. Another is that Youku merged with another service and imported its user accounts along with their hashes. Alternatively, different account types, such as operators and users, may have used different hashing schemes.

Dealing with the chopped hashes was not a problem for our tools. MDXfind natively supports partial matching of hashes, and we modified hashcat to support these as well, which required small changes in both the input parser and the kernel code. See below for an example patch based on hashcat 3.6.0. A “clean” version including MD5sub8-24MD5 may be released at a later point. We then ran the cracked passwords as a dictionary with MDXfind to mark the hashes correctly.
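The idea behind partial matching can be shown with a small Python sketch. This is not how MDXfind or the patched hashcat implement it; it is a plain dictionary lookup for illustration, and it assumes the shortest chopped hash keeps 30 characters (the minimum length we saw). The names `build_index` and `crack` are hypothetical.

```python
import hashlib


def build_index(leaked_hashes, prefix_len=30):
    """Group leaked values by their first `prefix_len` characters."""
    index = {}
    for leaked in leaked_hashes:
        index.setdefault(leaked[:prefix_len], []).append(leaked)
    return index


def crack(leaked_hashes, wordlist, prefix_len=30):
    """Map each leaked value (full or chopped) to a matching candidate word."""
    index = build_index(leaked_hashes, prefix_len)
    found = {}
    for word in wordlist:
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        for leaked in index.get(digest[:prefix_len], []):
            # A chopped hash matches when it is a prefix of the full digest.
            if digest.startswith(leaked):
                found[leaked] = word
    return found


leaked = [
    "5f4dcc3b5aa765d61d8327deb882cf99",  # full MD5 of "password"
    "5f4dcc3b5aa765d61d8327deb882cf",    # same digest, last 2 digits removed
    "5f4dcc3b5aa765d61d8327deb882cf00",  # same prefix, different ending
]
result = crack(leaked, ["letmein", "password"])
```

Only the first two entries are solved here; the third shares the 30-character prefix but fails the final prefix check.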

 

Of the 99 million hashes we parsed, we were able to recover 94.836 million, a success rate of roughly 95.7%. Interestingly, we noticed about 1.5 million MD5-like hashes in uppercase ASCII-hex form, as opposed to lowercase like the rest. We were not able to recover any of these hashes, and it is possible they are either salted or use a more exotic algorithm.

We found 48 million unique passwords, which solved the 94.8 million hashes. The top 25 passwords for this list are typical for this type of website. It is interesting to note that the fourth most common password, ‘xuanchuan’, is the romanized form of 宣傳, which translates to English as “propaganda”.


Perhaps the most interesting thing about this leak was the number of “created” or “generated” accounts we found.  Many, perhaps even the majority, of the accounts use what we consider to be generated email addresses and certainly machine-generated passwords.  While the exact number is difficult to calculate with certainty, we suspect tens of millions of these accounts are generated.

For example, there are 222 accounts we believe were created on October 10, 2011, at 14:25:03, all with 11-character random usernames @qq.com. Why do we believe this? Because they share exactly the same password: “2011-10-10 14:25:03”. These accounts are part of a larger group of 606,733 accounts all created that day, presumably between 14:25 and 15:33. There were an additional 22,741 similar accounts created, we believe, on October 14, 2011, again with a similar style of @qq.com address (but using 9-character usernames). We do not believe that any of these qq.com accounts exist.
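Spotting this kind of cluster is straightforward once the passwords are recovered. The following Python sketch counts accounts whose plaintext password parses as a "YYYY-MM-DD HH:MM:SS" timestamp. The `(email, password)` record layout, the sample addresses, and the function names are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime


def parse_timestamp(password):
    """Return a datetime if the password looks like a timestamp, else None."""
    try:
        return datetime.strptime(password, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None


def timestamp_clusters(records, min_size=2):
    """Count accounts sharing the same timestamp-style password.

    `records` is an iterable of (email, plaintext_password) pairs.
    """
    counts = Counter(
        password for _, password in records
        if parse_timestamp(password) is not None
    )
    return {pw: n for pw, n in counts.items() if n >= min_size}


# Hypothetical records; real usernames in the leak were random strings.
records = [
    ("aaaaaaaaaaa@qq.com", "2011-10-10 14:25:03"),
    ("bbbbbbbbbbb@qq.com", "2011-10-10 14:25:03"),
    ("ccccccccccc@qq.com", "123456"),
    ("ddddddddddd@qq.com", "2011-10-10 14:25:04"),
]
clusters = timestamp_clusters(records)
```

Sorting the resulting keys would give the time window over which a batch of accounts was presumably generated.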

Another example is the uppercase ASCII-hex hashes. 1,563,853 (all but 1,538) of these have email addresses like this: 037d6909-04a9-4b45-a309-157ef846c573@qzone.com. Having a UUID as the email address is strange enough, but we also looked into qzone.com. The records of DNSTrails show that an MX record for this domain only existed between October 2008 and August 2009. Also, the Wayback Machine at archive.org has no captures from that period. These facts lead us to believe that these are generated accounts.
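A check like the one described above can be sketched in Python as follows. The regular expression covers the standard 8-4-4-4-12 UUID layout followed by @qzone.com. The `(email, hash)` record layout and the helper names are assumptions for illustration.

```python
import re

# UUID local part (8-4-4-4-12 hex groups) at the qzone.com domain.
UUID_EMAIL_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}@qzone\.com$",
    re.IGNORECASE,
)


def is_uuid_qzone(email: str) -> bool:
    """True when the address is a UUID followed by @qzone.com."""
    return UUID_EMAIL_RE.match(email) is not None


def is_upper_hex(value: str) -> bool:
    """True for hex strings that contain letters and only uppercase ones."""
    return (
        re.fullmatch(r"[0-9A-F]+", value) is not None
        and any(ch in "ABCDEF" for ch in value)
    )


def count_uuid_uppercase(records):
    """Return (uppercase_hashes, of_which_uuid_qzone) for (email, hash) pairs."""
    upper = [(email, h) for email, h in records if is_upper_hex(h)]
    uuid_hits = sum(1 for email, _ in upper if is_uuid_qzone(email))
    return len(upper), uuid_hits
```

The difference between the two counts corresponds to the "all but" figure quoted above.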

One thing to take from this is that security through obscurity doesn't really help. It is also interesting to see how many different plays on MD5 are used in this leak. It is always a good idea not to assume that a single hash algorithm is in use, even when the hashes come from a single data set. Hopefully we have provided an interesting read, and we would love to find out why there are 1.5 million hashes that seem slightly different from the rest. If you know something, contact us.


