Your Coding Style Is Like A Digital Fingerprint

January 30, 2015 at 8:15 am

If you think that good code is a plain, expressionless and elegant string of characters that is, at its best, utterly anonymous, think again. New research suggests that programmers have distinctive ways of writing code that can be used as digital fingerprints.

Whether it’s how they space out code using spaces and tabs, their naming conventions with capitals and underscores, or their quirks in commenting, a team from Drexel University, the University of Maryland, the University of Goettingen and Princeton can spot who wrote a piece of code with alarming accuracy. Using natural language processing and machine learning to attribute anonymous source code on the basis of coding style alone, the team can identify the person behind the script with 95 per cent accuracy.
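To make those surface habits concrete, here is a minimal Python sketch of how tabs versus spaces, naming conventions and commenting could be turned into numbers. It is an illustration only, not the researchers' software: the function name `layout_lexical_features`, the feature names and the simple regular expressions are all assumptions made for this example.

```python
import re


def layout_lexical_features(source: str) -> dict:
    """Compute a few illustrative surface-style features (hypothetical feature set)."""
    non_empty = [ln for ln in source.splitlines() if ln.strip()]
    total = max(len(non_empty), 1)  # avoid dividing by zero on empty input

    # Indentation habit: how many non-empty lines start with a tab or a space.
    tab_indented = sum(1 for ln in non_empty if ln.startswith("\t"))
    space_indented = sum(1 for ln in non_empty if ln.startswith(" "))

    # Naming habit: word-like tokens with underscores vs camelCase tokens.
    # This is a rough approximation that also counts words inside comments.
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    snake = sum(1 for name in tokens if "_" in name.strip("_"))
    camel = sum(1 for name in tokens if re.search(r"[a-z][A-Z]", name))

    # Commenting habit: non-empty lines that contain a '#' or '//' marker.
    commented = sum(1 for ln in non_empty if "#" in ln or "//" in ln)

    return {
        "tab_indent_ratio": tab_indented / total,
        "space_indent_ratio": space_indented / total,
        "snake_case_count": snake,
        "camel_case_count": camel,
        "comment_ratio": commented / total,
    }


if __name__ == "__main__":
    sample = (
        "def get_total(xs):\n"
        "\ttotal_sum = 0  # running total\n"
        "\tfor itemValue in xs:\n"
        "\t\ttotal_sum += itemValue\n"
        "\treturn total_sum\n"
    )
    print(layout_lexical_features(sample))
```

A vector like this, computed for many files by the same author, is the kind of input a machine learning classifier could be trained on.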

The work uses indicators such as layout and lexical attributes to work out who wrote a piece of code. But it also uses something called “abstract syntax trees”, which “capture properties of coding style that are completely independent from writing style.” In other words, it looks beyond naming, comments and spaces, to find hidden clues in the structure of code. Testing their machine learning software on publicly available data from Google’s Code Jam, the team showed that analysing 630 lines of code per author gives the software enough information to identify the coder from a fresh piece of script with 95 per cent accuracy. Increase the line count to 1900, and the identification accuracy reaches 97 per cent.
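The idea that structure survives when names, comments and whitespace change can be sketched with Python's standard `ast` module. This is a stand-in chosen for illustration, not the team's tooling, and the helper name `ast_node_frequencies` is hypothetical. It counts how often each kind of syntax-tree node appears in a piece of code.

```python
import ast
from collections import Counter


def ast_node_frequencies(source: str) -> dict:
    """Return the relative frequency of each AST node type in the source."""
    tree = ast.parse(source)
    # ast.walk visits every node in the tree; we only record its type name.
    counts = Counter(type(node).__name__ for node in ast.walk(tree))
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Same structure, different names, comments and indentation characters.
    loop_a = "def f(xs):\n    total = 0\n    for x in xs:\n        total += x\n    return total\n"
    loop_b = (
        "def sum_all(values):\n"
        "\t# add them up\n"
        "\tresult = 0\n"
        "\tfor v in values:\n"
        "\t\tresult += v\n"
        "\treturn result\n"
    )
    # Same task, different structural choice (generator expression, no loop).
    comp = "def f(xs):\n    return sum(x for x in xs)\n"

    print(ast_node_frequencies(loop_a) == ast_node_frequencies(loop_b))
    print(sorted(ast_node_frequencies(comp)))
```

The two loop versions produce identical frequencies despite looking different on the page, while the generator-expression version produces a different profile. That is the sense in which tree-based features look past naming, comments and spacing.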

As well as being a neat trick, there are clear applications for software of this kind. Being able to accurately identify who wrote an anonymous piece of code could help authorities track down hackers more easily, for instance, or identify those committing online fraud. Now, it’s time to do with code what you used to do with handwriting as a kid: learn to fake someone else’s. [Drexel via IT World]

Picture: Olly/Shutterstock
