This is a post by guest author Sebastian Neumann about privacy in bitcoin transactions and methods of de-anonymizing addresses by determining a transaction’s change address.
The question of privacy has been one of the main concerns ever since the creation of cryptocurrencies and blockchain transactions. Can one follow the funds on the blockchain and reveal the sender's identity even if bitcoin mixers are used?
In this article we will explore the following problem: is it possible to recognize which of the addresses in a BTC transaction can be attributed to another person, and which still belongs to the sender because it is a change address?
Which transaction receiving address is the change address of the sender?
A Bitcoin transaction consists of sending (input) and receiving (output) addresses. However, you cannot spend the funds in part: the bitcoin ‘balance’ of the sending addresses is completely spent and credited to the receiving addresses in the course of a transaction. It is similar to paying $0.20 with a one-dollar bill. You give the whole dollar and then you get $0.80 change. In the Bitcoin world, the sending address balance goes from e.g. 1 bitcoin to 0 bitcoin, and there are two receiving addresses credited with 0.20 bitcoin and 0.80 bitcoin.
Here, we want to know if 0.2 BTC was paid to a recipient and 0.8 BTC was change returned to the sender, or vice versa. This is not clear at first glance, as the blockchain does not explicitly record such information. However, there are a few methods to find this out (at least with a high probability):
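The dollar-bill analogy can be written down as a minimal Python sketch. The function name `spend` and the `fee` parameter are assumptions made for illustration; the example in the text ignores the miner fee, so it defaults to zero here.

```python
from decimal import Decimal


def spend(input_amount, payment, fee=Decimal("0")):
    """Spend one whole input and return the two output amounts (payment, change)."""
    change = input_amount - payment - fee
    if change < 0:
        raise ValueError("input does not cover payment plus fee")
    return payment, change


# The sending address holds 1 BTC and is emptied completely;
# 0.20 BTC goes to the recipient, the rest comes back as change.
payment, change = spend(Decimal("1"), Decimal("0.20"))
print(payment, change)
```

On the blockchain both outputs look alike, which is why the change output has to be inferred rather than read off directly.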
- Re-use of address: If one of the two recipient addresses is new to the blockchain (never had any transactions before) and the other is not, then the new one is probably the change address which belongs to the sender. This rule is applicable here because the 14uQ… address was already used before and 1MmA… is a new one.
- Address type: If the sending addresses are of one type (either all starting with 1 or with 3 or with bc1) and only one of the receiving addresses is of the same type then this is likely a change address belonging to the sender. Rule not applicable here, as all addresses start with a “1”.
- Change amount: The change amount in transactions with at least two sending addresses is expected to be lower than each of the sent amounts. Not the case here because we have only one sending address.
- Decimal places: The change amount usually has more decimal places than the target amount, since payments tend to be round amounts and the change is whatever remains. Also not the case here, as both amounts have the same number of decimal places.
The outcome of our analysis is that with a high likelihood the address 1MmA… is controlled by the same entity or person as the sending address 1K35… because it is a new address, hence it can be attributed as the change address.
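The four heuristics above can be sketched as a small voting function. This is an illustration under stated assumptions, not the tooling used for the analysis: the `Output` record, the `seen_before` flag (which a real tool would look up on-chain) and the amounts are hypothetical placeholders, and the truncated addresses are used only as labels.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Output:
    address: str
    amount: Decimal
    seen_before: bool  # assumption: True if the address already appeared on-chain


def address_type(address):
    """Classify an address by its prefix, as described in the 'address type' rule."""
    if address.startswith("bc1"):
        return "bc1"
    if address.startswith("3"):
        return "3"
    if address.startswith("1"):
        return "1"
    return "unknown"


def decimal_places(amount):
    """Number of significant decimal places (0.20 counts as one place)."""
    return max(0, -amount.normalize().as_tuple().exponent)


def change_votes(input_addresses, input_amounts, outputs):
    """Map each applicable heuristic to the output address it marks as change."""
    votes = {}
    # 1. Re-use of address: exactly one output address is new.
    fresh = [o for o in outputs if not o.seen_before]
    if len(fresh) == 1:
        votes["address_reuse"] = fresh[0].address
    # 2. Address type: all inputs share one type and exactly one output matches it.
    in_types = {address_type(a) for a in input_addresses}
    if len(in_types) == 1:
        same = [o for o in outputs if address_type(o.address) in in_types]
        if len(same) == 1:
            votes["address_type"] = same[0].address
    # 3. Change amount: with two or more inputs, one output is below every input.
    if len(input_amounts) >= 2:
        small = [o for o in outputs if o.amount < min(input_amounts)]
        if len(small) == 1:
            votes["change_amount"] = small[0].address
    # 4. Decimal places: one output has strictly more decimal places than the rest.
    places = [decimal_places(o.amount) for o in outputs]
    if places.count(max(places)) == 1:
        votes["decimal_places"] = outputs[places.index(max(places))].address
    return votes


# Placeholder amounts; only the address labels and the re-use facts come from the text.
votes = change_votes(
    ["1K35…"],
    [Decimal("1")],
    [Output("14uQ…", Decimal("0.20"), seen_before=True),
     Output("1MmA…", Decimal("0.80"), seen_before=False)],
)
print(votes)
```

Returning one entry per applicable heuristic, rather than a single verdict, keeps it visible how many independent rules agree. That matters because the rules differ a lot in strength.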
Let’s have a look at the following bitcoin (BTC) transaction: d445ae28e32a4ae1e582975223518ef7a9320f2c82a600d9eb5d009e3983c886 and see if we can use the four heuristics to identify which receiving address is the change address controlled by the same person as the four sending addresses.
- Re-use of address: No success, both receiving addresses 16TH… and 1HUG… were new when this transaction happened.
- Address type: Not working, all addresses are starting with a “1”.
- Change amount: Unfortunately not. Neither of the two receiving amounts is lower than each sending amount. But this transaction is suspicious because more input than necessary is used: 272.3 BTC would have been enough to pay 210.86 or 79.84 BTC. This is comparable to a scenario where you want to pay $18 in a supermarket and you give the cashier a 20 and a 10 dollar bill. This behaviour is used either to obfuscate the analysis or to clean up the wallet (for whatever reason).
- Decimal places: Theoretically yes, as one received amount has more decimal places than the other. But this rule is the weakest of the four, and I would not rely on it alone when all other heuristics say no, especially since the unnecessary sending addresses already raise the suspicion that the sender is trying to obfuscate the analysis.
So we still don’t know which of the two receiving addresses belongs to another entity and which traces back to the sender. Knowing that addresses in the same wallet are often spent together (common spending), we can run a brute-force search through the blockchain transaction tree starting from both receiving addresses, in the hope of finding at least one common spend that connects either of the two receiving addresses with any of the sending addresses. Additionally, we use a clustering technique to further improve the chance of getting results.
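The "more input than necessary" observation from the change amount check can be tested mechanically. The sketch below uses the supermarket example from the text rather than the real transaction amounts; the function name is hypothetical and fees are ignored by default.

```python
def has_unnecessary_input(input_amounts, payment, fee=0):
    """True if payment plus fee is still covered after dropping the smallest input."""
    return sum(input_amounts) - min(input_amounts) >= payment + fee


inputs = [20, 10]   # the 20 and the 10 dollar bill
outputs = [18, 12]  # $18 for the purchase, $12 handed back

# We do not know which output is the payment, so test both candidates.
flags = {out: has_unnecessary_input(inputs, out) for out in outputs}
print(flags)
```

If the check is true for every output, as it is here, then no reading of the transaction makes all inputs necessary. That is the pattern the text describes as suspicious.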
For address 16TH… we are already successful after 70 address analyses: transaction d09f4927a12a35c5b6a79112e0893ef446a3455e89f93295385c80020f008ef8 shows that address 13Pzj… and address 1LjR… are co-spent and therefore very likely belong to the same wallet. And transaction 81565452696641fb14348e6e69753be6072933d5c07512cc2221c7207c1ac5dc links 1LjR… to 16TH… so it’s very likely that addresses 13Pzj… and 16TH… are controlled by the same person. So most probably 16TH… contains the funds returned to the sender as change. We could have stopped here but we can continue the same search for other addresses.
1HUG… requires digging much deeper through the blockchain. Only after analyzing around 365,000 addresses do we find that 1HUG… most likely belongs to the sender as well.
The bottom line for this transaction is that the sender probably tried to obfuscate potential analysis, or they were just playing around, moving funds between their own addresses. In the end, no funds have likely changed owner.
This kind of brute force clustering is a very powerful instrument. It works in many cases, no matter whether someone uses Payjoin, Coinjoin, mixers, or a private coin exchange from bitcoin into e.g. Monero and back. It can also be used to confirm or contradict analysis results obtained from other tools.
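Clustering by common spending is often implemented with a union-find (disjoint-set) structure. The sketch below is one possible way to do it, not necessarily how the tools used for this article work. The class and method names are hypothetical, and the truncated addresses from the 16TH… example serve only as labels.

```python
class Clusters:
    """Union-find over addresses; each set approximates one wallet."""

    def __init__(self):
        self.parent = {}

    def find(self, addr):
        self.parent.setdefault(addr, addr)
        while self.parent[addr] != addr:
            # Path halving keeps later lookups short.
            self.parent[addr] = self.parent[self.parent[addr]]
            addr = self.parent[addr]
        return addr

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

    def add_cospend(self, input_addresses):
        """Merge all input addresses of one transaction into one cluster."""
        first, *rest = input_addresses
        self.find(first)
        for other in rest:
            self.union(first, other)

    def same_wallet(self, a, b):
        return self.find(a) == self.find(b)


clusters = Clusters()
# Transaction d09f4927…: 13Pzj… and 1LjR… are co-spent.
clusters.add_cospend(["13Pzj…", "1LjR…"])
# Transaction 81565452…: the text only says it links 1LjR… to 16TH…,
# so the link is recorded directly instead of assuming a co-spend.
clusters.union("1LjR…", "16TH…")

print(clusters.same_wallet("13Pzj…", "16TH…"))
```

Because merges are transitive, every new link can join two large clusters at once. This is how a search over hundreds of thousands of addresses can end in a single wallet-level match.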
Can we investigate a coinjoin mixer transaction?
So far, we analyzed a transaction where standard methods for change address detection fail, so it required deep investigation of different address relations. Now we shall try the same approach to link receiving addresses of a Wasabi Coinjoin transaction to the sending addresses.
Transaction f39d831aef2e49e21f542e2e3a2b0d577dae40132b7da49506b45e0b042794a7 has 81 sending and 119 receiving addresses. This kind of transaction is usually a team effort: many people join together to mix coins in order to protect their privacy. To make tracing even more difficult, many addresses here have gone through multiple mixes.
Can we hack the Wasabi mixing algorithm? Of course not directly, but what we can do is use a clustering technique to determine the wallets of all the sending addresses, and then examine the wallets of all the addresses that are within a certain distance of the receiving addresses in the Bitcoin blockchain tree.
In some sense it’s like a masquerade ball. One group wears green masks at the ball, and we take all available fingerprints from that group. Even if the same people wear different masks at the next masquerade balls, some will leave fingerprints sooner or later. As soon as we find a match, we know it is the same person despite the fact they are wearing a new mask.
- Level 0 investigation: If we look directly at the receiving addresses, we see that one of them, bc1qutrq7rfhv56gdqn4m0nm8agygepxahd7cz3j8u, was already used as an input address.
- Level 1 investigation: If we follow the transactions after the receipt, we find matches for 1.7% (2 of the 118 receiving addresses).
- Level 2: For 5.9% of the addresses the mixing operation was useless, which we detected by digging deeper into the transaction graph.
- Level 3: Already 17.6% of the receiving addresses can be linked to their senders if we analyze one level further.
- Level 4: Now altogether 21.8% of the addresses can be traced back.
These participants didn’t succeed because they unintentionally mixed an address from their old wallet with one which is connected to the new wallet. This did not necessarily happen close in time to the transaction, and if we did the same analysis one year from now, or dug even deeper, the percentage would likely increase. To get these results we had to create around 720,000 clusters.
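The level-by-level investigation can be pictured as a breadth-first search with a depth limit. The sketch below assumes that a "level" is one hop in the transaction graph away from the receiving address and that wallet clusters have already been computed. The names `neighbours`, `wallet_of`, `sender_wallets` and the toy data are hypothetical.

```python
from collections import deque


def link_level(start, neighbours, wallet_of, sender_wallets, max_level):
    """Return the first level at which an address shares a wallet with a sender, else None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        addr, level = queue.popleft()
        if wallet_of.get(addr) in sender_wallets:
            return level
        if level < max_level:
            for nxt in neighbours.get(addr, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, level + 1))
    return None


# Toy graph: funds move recv_a -> hop_1 -> hop_2, and hop_2 sits in a sender's wallet.
neighbours = {"recv_a": ["hop_1"], "hop_1": ["hop_2"]}
wallet_of = {"hop_2": "wallet_X"}
sender_wallets = {"wallet_X"}

print(link_level("recv_a", neighbours, wallet_of, sender_wallets, max_level=4))
print(link_level("recv_a", neighbours, wallet_of, sender_wallets, max_level=1))
```

Raising `max_level` can only add matches, never remove them, which fits the rising percentages from level to level. The number of addresses to inspect also grows quickly with each level.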
Here is one simple example of a link between receiving address bc1qh7n70j34f0806ht8ch6lsm88zgh9pm4dzk2t43 and sending address bc1qgmwg687snkag73nv4lg8zmnsz43dxyk53h82du. Most of the correlations are multi-step and therefore much more complex; without suitable tools it is not possible to find them. We can trace that:
- (1) bc1qgmwg687snkag73nv4lg8zmnsz43dxyk53h82du is in the same wallet as address bc1qqlk9fa242d99h0ww3syfefdce2z52kv2pryztg;
- (2) bc1qqlk9fa242d99h0ww3syfefdce2z52kv2pryztg is in the same wallet as bc1qh7n70j34f0806ht8ch6lsm88zgh9pm4dzk2t43, which confirms our hypothesis.
For (1), transaction 6f8ffdbb4f0052105e3599b96c359573ea4f4e67c988796c6350417290c6233b shows, considering the ‘address type’ rule mentioned in the first part, that bc1qqlk9fa242d99h0ww3syfefdce2z52kv2pryztg is most likely the change address of the transaction and is therefore controlled by the same person as bc1qgmwg687snkag73nv4lg8zmnsz43dxyk53h82du.
For (2), transaction 58b8e31e329702a093c34ad58806f78f54abe02010164650ab3e817ae34c2dd0 suggests with high probability that the commonly spent addresses bc1qqlk9fa242d99h0ww3syfefdce2z52kv2pryztg and bc1qh7n70j34f0806ht8ch6lsm88zgh9pm4dzk2t43 are in the same wallet. Furthermore, bc1qgmwg687snkag73nv4lg8zmnsz43dxyk53h82du also appears on the sending side of this transaction, which further confirms our assumption.
Is anonymity good or bad in the Bitcoin world? It depends. If a criminal can act anonymously then this is bad, but if a good person is financially oppressed in a dictatorship then it is good. What is important, however, is that using certain analysis techniques we can determine the change address of a transaction and trace hidden connections between addresses.
Thanks to btctester.com for providing blockchain tools used for the forensic analyses conducted in this article.