Every user who has set up any Web3 wallet must have experienced the difficulty of storing or remembering 12 random words (mnemonics) given during wallet setup. This difficulty comes as a requirement with blockchain technology for users to have complete "ownership" of their funds. Despite having several advantages in terms of security, it is clear that the user experience in blockchains cannot be scaled with 12 words. As a matter of fact, Vitalik Buterin, one of the founders of Ethereum, expressed previously that they did not actually want to design wallets like this in the original Ethereum design; however, they were forced to release Ethereum "in its current state" due to the pressure from the community to see a minimum viable product. So, how are we going to scale the user experience of Ethereum, how are we going to enable ease of use, which is the most fundamental element in our journey to reach one billion users? Then, let us prepare the wallets for the next one billion users: Account Abstraction.

Ownership on Blockchain and Working Principles of EOA Wallets

Bitcoin is one of the leading examples of application-specific chains. In fact, the main or almost the only use case of Bitcoin is being a payment system. Bitcoin actually solved many problems by enabling peer-to-peer money transfer - and it still has a considerable number of users. The application-specific structure of Bitcoin has some features that restrict developers who want to build on top of it. For this reason, developers wishing to build applications on top of Bitcoin beyond payment systems have turned to Turing Complete blockchain platforms, primarily Ethereum. Currently, Ethereum has the highest TVL and is one of the most widely used platforms among them.

Since its inception, Ethereum's goal has been to improve and simplify user experience. Although they have made some improvements to wallets for this, it is obvious that wallets are pretty difficult applications to use for an everyday user. The main reason for this is the difficulty of securely managing ownership on Ethereum.

So, how is ownership of EOAs (Externally Owned Accounts) proven on Ethereum? What are the requirements for transactions from these wallets to be valid?

EOA wallets on Ethereum rely on mathematics and cryptography to prove ownership. To create an EOA wallet, we perform a series of mathematical operations that start from randomness. From this entropy we obtain two values: a private key and a public key derived from the private key. An analogy helps here: imagine a hill that is easy to walk down but practically impossible to climb. Going down the hill is like deriving the public key from the private key, while climbing up the hill is like trying to recover the private key from the public key. The private key is the smallest, atomic component of ownership on Ethereum. It cannot feasibly be predicted or computed from the public key. The public key, on the other hand, is accessible to everyone and is generated by one-way mathematical calculations over the private key. The number one rule for the validity of transactions on Ethereum is that an ECDSA digital signature must prove that a transaction claiming to come from an EOA wallet has indeed been initiated from that wallet.
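The one-way relationship between the two keys can be sketched in a few lines of Python. This is only an illustration, assuming nothing beyond the public secp256k1 curve parameters; the function names (point_add, scalar_mult, public_key_from_private, generate_keypair) are made up for this example, and the code is not hardened against side channels, so it should not be used for real wallets. It stops at the public key: deriving the Ethereum address additionally needs Keccak-256, which is not part of Python's standard library.

```python
import secrets

# secp256k1 domain parameters (the curve behind Ethereum key pairs).
P = 2**256 - 2**32 - 977  # prime of the underlying field
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with double-and-add (the easy, downhill direction)."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def public_key_from_private(private_key):
    """Derive the public key point from a private key integer."""
    if not 1 <= private_key < N:
        raise ValueError("private key out of range")
    return scalar_mult(private_key, G)


def generate_keypair():
    """Draw a random private key and derive its public key."""
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, public_key_from_private(private_key)
```

Going from private_key to the public point takes a few hundred point additions. No comparably cheap function is known for the reverse direction, which is the uphill climb in the analogy.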

Creating an EOA wallet is free. An EOA wallet consists of a balance and a nonce (with a blank CodeHash and StorageRoot). The CodeHash is only present in smart contract wallets (which will be detailed in the next section) and is not present in EOA wallets. StorageRoot is the cryptographic commitment to the data stored by the account and, as is the case with CodeHash, is not present in EOA wallets. Each transaction initiated from a wallet has various properties that prove the ownership of the wallet, and each of these properties determines the "validity rules" of that transaction. Since these validity rules make up the framework of account abstraction, a basic knowledge of them is necessary in order to understand account abstraction.

An Ethereum transaction consists of:

  1. Nonce: A per-account transaction counter. It is Ethereum's answer to the ordering ("concurrency") problem commonly discussed in computer science. It prevents replay attacks (https://ethereum.stackexchange.com/questions/84207/how-does-nonce-prevent-a-replay-attack-in-case-of-knowing-nonce-value) and shows which transaction number the relevant transaction is for the wallet.

  2. Gas Price: Gas is what makes it safe for Ethereum to be Turing complete. It is the unit in which we pay, in advance, for the computation a transaction will use. The gas price specifies the fee per unit of gas that the sender is willing to pay for the computation of that transaction. There are two parameters here: the base fee and the priority fee (for further reading: https://ethereum.org/en/developers/docs/gas/).

  3. Gas Limit: The maximum amount of gas the sender allows the transaction to consume.

  4. To: The address of the message call's recipient or, for a contract creation transaction, left empty.

  5. Value: The amount of Ether to be sent to the recipient.

  6. Data: A variable-length binary payload. A transaction can carry both value and data, only value, only data, or neither. All four combinations are regarded as valid and processed on Ethereum.

  7. Finally, the ECDSA signature structure consists of three parts: v, r, and s. Proof of ownership of a wallet on Ethereum is done through this signature structure. Since all this is based on mathematical proof, we can perform our transactions on Ethereum without the need for third parties.

The validity rules of transactions initiated by an EOA on Ethereum are determined based on the parameters above. For a transaction to be valid and executable in the EVM, the nonce value should be valid, the gas amount to be paid should be available in the wallet, the asset to be spent should be in the wallet and the ECDSA signature should be valid. Account abstraction aims to render these validity rules changeable and programmable. The aim of account abstraction is to move towards more advanced "smart contract" based wallets instead of EOA wallets. So, how do smart contract wallets work?
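The four checks above can be written down as a small function. This is a minimal sketch, not client code: the Account and Transaction classes and the validity_errors function are hypothetical names for this example, the fee is simplified to gas price times gas limit, and the ECDSA check is replaced by a boolean stand-in.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Account:
    nonce: int    # number of transactions this account has already sent
    balance: int  # in wei


@dataclass
class Transaction:
    nonce: int
    gas_price: int         # wei the sender offers per unit of gas
    gas_limit: int         # maximum units of gas the transaction may consume
    to: str
    value: int             # wei sent to the recipient
    data: bytes
    signature_valid: bool  # stand-in for verifying the v, r, s signature


def validity_errors(account: Account, tx: Transaction) -> List[str]:
    """Return the names of the validity rules the transaction violates."""
    errors = []
    if tx.nonce != account.nonce:
        errors.append("nonce")
    max_gas_fee = tx.gas_price * tx.gas_limit
    if account.balance < max_gas_fee:
        errors.append("gas")
    elif account.balance < max_gas_fee + tx.value:
        errors.append("value")
    if not tx.signature_valid:
        errors.append("signature")
    return errors
```

For an EOA these checks are fixed. Account abstraction, in this picture, means letting the account itself supply the body of such a function.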

Programmable, Smart Contract Based Wallets

The biggest obstacle to Bitcoin's programmability was the “DoS (Denial of Service) Attack Vector” created by functions that loop (running until a result is achieved). If a malicious actor called a function that loops forever, all miners would have to process this transaction indefinitely, legitimate users would be unable to get their transactions processed, and the system would become unusable. Ethereum solved this with the “gas” system. By paying "gas" in proportion to the computing power needed by the functions being called, we get a Turing complete state machine (the EVM) without risking a DoS attack. We can compare this basic programmability difference between Ethereum and Bitcoin with the difference between EOA wallets and smart contract wallets. While EOA wallets have, by design and for security reasons, limited programmability that cannot be overcome, smart contract-based wallets are entirely programmable. Before jumping to smart contract-based wallets, we should first understand smart contracts.
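How metering stops a runaway loop can be shown with a toy interpreter. This is a sketch under simplifying assumptions: every step costs the same amount of gas, a "program" is just a Python generator that yields once per step, and the names run_with_gas, OutOfGas, infinite_loop and count_to are invented for this example.

```python
class OutOfGas(Exception):
    """Raised when a program needs more gas than the sender paid for."""


def run_with_gas(program, gas_limit, gas_per_step=1):
    """Run a program step by step, charging gas for each step.

    Returns the unused gas, or raises OutOfGas once the budget is spent.
    """
    gas_left = gas_limit
    for _ in program():
        if gas_left < gas_per_step:
            raise OutOfGas("execution halted: gas budget exhausted")
        gas_left -= gas_per_step
    return gas_left


def infinite_loop():
    """A malicious program that never finishes on its own."""
    while True:
        yield


def count_to(n):
    """Build a well-behaved program that finishes after n steps."""
    def program():
        for _ in range(n):
            yield
    return program
```

Running infinite_loop with a budget of 1000 ends with OutOfGas after 1000 paid steps, so the attacker pays for every step the network executes and the loop cannot run forever.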

One of the main concepts of Ethereum is smart contracts. Understanding the properties of smart contracts is one of the key topics to understand smart contract wallets and their problems.

On Ethereum, smart contracts are programs written with clear rules in advance to be executed in the EVM. Each smart contract has a specific "address" just like EOA wallets. However, no private key controls a smart contract's address, so nobody can sign on its behalf. A smart contract cannot initiate a transaction on Ethereum. As smart contracts are not abstracted from the EVM, a smart contract can only execute its code in the EVM through a valid transaction sent from an EOA wallet. As smart contract-based wallets are essentially smart contracts themselves, they possess the same properties. Unlike EOA wallets, they are programmable; however, (again unlike EOAs) they cannot initiate transactions on their own.

In sum:

  • While smart contracts cannot initiate a transaction, EOA wallets can;
  • Creating a smart contract wallet is not free, as a fee must be paid to deploy it on Ethereum, whereas creating an EOA wallet is free;
  • While the programmability of EOA wallets is rather limited, smart contract wallets enable a great number of different functions and programmability;
  • It is possible to call other smart contracts both with smart contract-based wallets and EOA wallets.

Since smart contract wallets cannot initiate a transaction on their own, the development process of many applications, primarily wallets and privacy apps, is getting harder. Account abstraction is the generic name for the improvements proposed to eliminate these disadvantages of smart contracts.

Why Can't Smart Contract-Based Wallets Become Widespread?

A smart contract does not have a private key, it cannot initiate a transaction, and some fee must be paid for creating the contract. So, what kind of problems does this cause? Let us try to understand smart contracts and the problem here using Tornado Cash as an example.

Tornado Cash and the privacy problem

Tornado Cash is a mixer programmed to enable privacy. The need for privacy among people is quite clear. Traceability of financial activities on chain through name services and centralized exchanges poses a serious threat for privacy seekers. Tornado is one of the most well-known privacy apps. The working principle of Tornado is quite simple. Tornado Cash has several pools where anyone can deposit Ether. You can deposit 0.1, 1, 10, and 100 Ether in these pools. In return, you receive a "proof" indicating that you deposited Ether into the pool. Because your Ether is mixed with other users' Ether when you deposit, you can ensure your privacy by creating a wallet from scratch and withdrawing your Ether from Tornado to it. However, there is a problem here. Although you have the proof that you have deposited Ether into the Tornado Cash contract, you need some Ether in order to initiate the transaction to withdraw from the pool. You will either have to transfer Ether from your own wallet or use a centralized exchange. And this completely destroys the privacy you are trying to achieve. The need for Ether to initiate the transaction is related to the inability of smart contracts to initiate transactions. In the case of Tornado Cash, intermediaries called Relayers can solve this problem by sponsoring the transaction, initiating it from their own EOA wallet. In return, they charge a fee. But as privacy will be compromised if relayers stop working, it is obvious that the system needs some improvement here.
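The gas problem in the withdrawal step can be made concrete with a bit of bookkeeping. This sketch models only who pays for gas and who receives what; it deliberately leaves out the deposit proof and everything else that makes a mixer private. The withdraw function, the address labels and the integer units are all hypothetical.

```python
def withdraw(ether, pool, recipient, amount, gas_cost, relayer=None, relayer_fee=0):
    """Move `amount` from the pool to the recipient and charge gas to the sender.

    `ether` maps address labels to balances. Without a relayer the recipient
    itself must send the transaction and therefore must already hold Ether.
    """
    payer = relayer if relayer is not None else recipient
    if ether.get(payer, 0) < gas_cost:
        raise ValueError(f"{payer} holds no Ether to pay for gas")
    if ether.get(pool, 0) < amount:
        raise ValueError("pool balance too low")
    fee = relayer_fee if relayer is not None else 0
    if fee > amount:
        raise ValueError("relayer fee exceeds the withdrawn amount")
    ether[payer] -= gas_cost                         # the transaction sender pays gas
    ether[pool] -= amount
    ether[recipient] = ether.get(recipient, 0) + amount - fee
    if relayer is not None:
        ether[relayer] += fee                        # the relayer is paid out of the withdrawal
```

With a fresh wallet holding zero Ether, the direct call fails at the first check, while the relayed call succeeds and the fresh wallet receives the amount minus the relayer's fee.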

Smart Contract Based Wallets

Let’s say you downloaded a smart contract-based wallet to your phone, you transferred assets to the wallet, and the contract is deployed on Ethereum and is ready to use. However, you have the same problem here once again. When you want to make a transaction, you cannot initiate a "valid" transaction by signing with the elliptic curve used by your mail system, your fingerprint, or your phone's Face ID, so you need to initiate the transaction using an EOA wallet. Once again, you are forced to use those 12 words and unlock your wallet to initiate the transaction from your EOA wallet.

As described above, smart contract-based wallets need a transaction from an EOA wallet to initiate a transaction, as they cannot do so on their own. Smart contract wallets that currently work on Ethereum have a relayer sponsor the transaction and execute it in return for a fee. Since these relayers mostly consist of centralized intermediaries, the users are at risk of censorship or freezing of their assets. This disadvantage results from the inability to initiate transactions from a smart contract, as explained in previous sections of the article.

In both of these systems, the fact that the relayers who sponsor transactions are centralized structures poses various security and censorship risks. Hence, it is clear that we need an upgrade to solve these problems. And yes, this is exactly where account abstraction comes in. In the previous chapters, we have discussed why we need account abstraction. Next up are the historical changes in account abstraction in addition to the details of EIP 4337.

Evolution of Account Abstraction: EIP-86, EIP-2938, EIP-3074, EIP-4337

The idea of account abstraction and advancing smart contracts to the next level has been a topic of discussion for a long time. Since 2017, various Ethereum improvement proposals have been presented to the community for account abstraction, and these proposals have evolved over time to reach a more advanced level. Let us look at these Ethereum improvement proposals and their evolution over time:

EIP-86

Proposed by Vitalik Buterin in 2017 to make account abstraction possible, EIP 86 includes the abstraction of transaction "origin", the nonce scheme, and the "signature" system of the transaction. It enables the initiation of valid transactions using signatures other than ECDSA (digital signature used in Ethereum) to make Ethereum's strictly coded "transaction validity rules" changeable. Accordingly, we are able to initiate transactions from contracts thanks to pre-programmed signature rules and can program contract-based wallets. With EIP 86;

  • Currently, transactions on MultiSig wallets are sent from EOA wallets, and we need to pay Ethereum gas fees using EOA wallets. As we cannot initiate the transaction from the contract, this is a requirement. With EIP 86, it would be possible to pay the gas fee using the Ether in the MultiSig.
  • Right now, the digital signature used on Ethereum is not quantum-resistant. It would be possible to create quantum-resistant Ethereum wallets with EIP 86 as it makes it possible to use different digital signatures.
  • Although not mentioned in the proposal, it would be possible to create programmable smart contracts thanks to this upgrade.
  • And it would eliminate the need for relayers in mixers deployed on Ethereum as it enables the payment of gas fees using the asset in the contract while withdrawing assets from the mixer contract on Ethereum.

As EIP 86 was attempting to make these features possible through rather significant upgrades at the protocol level, it was not adopted by the developers back in the day. Combining such important updates was considered too radical for Ethereum, which took 6 years to make The Merge possible, and all client developers would have needed to focus on this subject.

Moreover, there is a DOS attack vector that was left unresolved in EIP 86. As a similar DOS attack vector is also valid for EIP 4337, we will go into the details of this under EIP 4337 section.

EIP-2938

EIP 2938, which includes comparatively simpler updates than EIP 86, was prepared by Vitalik and 3 developers in 2020. EIP 2938 includes the addition of 2 new opcodes to enable account abstraction. Thanks to this upgrade, a new transaction type is added to Ethereum to enable various applications of account abstraction. Contracts meeting certain conditions are marked as "contract wallets", allowing for the use of the new transaction type. What is proposed in EIP 2938 is a limited account abstraction. It allows us to abstract "gas" from the validity rules that were described in the first section of this article. As nonce and ECDSA (signature systems) could not be abstracted, we wouldn't be able to develop most of the applications we could with account abstraction with this EIP. This EIP was not accepted because it would require updates at the protocol level, it does not fully support account abstraction, and it could not provide definitive solutions to the DOS vector found in EIP 86 and EIP 4337.

EIP-3074

EIP 3074 is not actually an upgrade aimed at account abstraction. Rather than a radical change like changing the validity rules of transactions, it aims to bring programmability to EOA wallets with a simpler upgrade. In EIP 3074, the origin of transactions would still be EOA wallets, but with this upgrade, a transaction from one EOA wallet could be initiated by a different EOA wallet. In this way, for instance, the requirement to hold Ether to pay for gas would be eliminated. It aimed to accomplish this by introducing two new opcodes to the protocol: AUTH and AUTHCALL. The purpose of these opcodes is to enable some "account abstraction products" such as meta-transactions and transaction bundling by giving an EOA's transaction initiation authority to a smart contract (called an invoker). Transferring transaction initiation authority from EOA wallets to a smart contract has both advantages and disadvantages. Let us take a deep dive into these pros and cons through an overview of the technical details of this upgrade:

EIP 3074, as we just described, is an upgrade based on a contract called Invoker taking ownership of the EOA wallet. The nuance that should be mentioned here is that actually, we still need 12 words and this upgrade does not really eliminate EOA wallets. EIP 3074 cannot abstract the signature system while abstracting nonce and gas in a transaction. The changes in gas and nonce structure enable various account abstraction products. Examples of such products include sponsored transactions and transaction bundling.

How Do Sponsored (Meta) Transactions Work in EIP-3074?

If you want to perform transactions such as transferring ERC20 tokens or initiating a transaction from a contract using a regular Ethereum wallet, you are required to pay fees in Ether. In other words, in order to transfer the USDC ERC20 token, you need to hold both USDC and Ether in your wallet. To eliminate this requirement, a series of upgrades were proposed, called "sponsored" transactions, which are carried out with intermediaries who sponsor your transaction in return for a fee, enabling users to perform transactions without holding Ether. We are not able to do this with the current EOA wallets because all the validity rules that we described in the first section need to be met to "initiate a valid transaction". Since gas is one of these rules, we have to hold Ether to initiate a transaction. When we abstract gas, we eliminate the need to hold Ether and enable "sponsored" transactions.

We have mentioned that two opcodes were introduced with EIP 3074. While AUTH transfers ownership to the contract, AUTHCALL allows the contract to be invoked by different wallets. The user first grants permission to the Invoker contract using the AUTH opcode. Then, they sign the transaction to initiate the ERC20 token transfer. The gas of the signed transaction is covered by the sponsor, the call is made with the AUTHCALL opcode, and the transaction is executed in the EVM. To give more details, the only difference between the CALL and AUTHCALL opcodes is the ability to change the "Caller". In this way, an ERC20 token transfer can be executed without the user needing to pay fees in Ether.
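The division of roles in a sponsored transfer can be sketched as follows. This is a toy ledger, not EVM code: the Ledger class and both functions are hypothetical, the user's signed authorization is reduced to a boolean, and the only point it illustrates is that the sponsor pays the gas while the token contract sees the user as the caller.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Ledger:
    ether: Dict[str, int] = field(default_factory=dict)
    usdc: Dict[str, int] = field(default_factory=dict)


def erc20_transfer(ledger, caller, to, amount):
    """Token transfer as the token contract sees it: funds move from `caller`."""
    if ledger.usdc.get(caller, 0) < amount:
        raise ValueError("insufficient token balance")
    ledger.usdc[caller] -= amount
    ledger.usdc[to] = ledger.usdc.get(to, 0) + amount


def sponsored_transfer(ledger, sponsor, user, user_authorized, to, amount, gas_cost):
    """Invoker-style flow: the sponsor sends the transaction, the user is the caller."""
    if not user_authorized:  # stand-in for checking the user's signed authorization
        raise PermissionError("user did not authorize the invoker")
    if ledger.ether.get(sponsor, 0) < gas_cost:
        raise ValueError("sponsor cannot pay for gas")
    ledger.ether[sponsor] -= gas_cost                        # gas comes from the sponsor
    erc20_transfer(ledger, caller=user, to=to, amount=amount)  # caller is switched to the user
```

In this model the user never needs an Ether balance; a real sponsor would recover its costs through a fee, for example in the token being transferred.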

How Does Transaction Batching Work on EIP-3074?

Let's say you want to make an ERC20 swap on Ethereum. First, you need to send an "approve" transaction, and then make another transaction with "transferFrom". As an EOA can make only one call per transaction, we have to make two transactions in order to call these two functions. With the messaging format that comes with the AUTH opcode, we can combine multiple calls into a single transaction, bound together through the "commit" value that the user signs, and execute it in the EVM. In this way, we can improve the user experience in many applications.
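The effect of batching can be sketched with a tiny token state. This is an assumption-laden model rather than invoker code: run_batch, approve and transfer_from are hypothetical helpers, and atomicity is imitated by working on a copy of the state and committing it only if every call succeeds.

```python
import copy


def approve(owner, spender, amount):
    """Build a call that lets `spender` move up to `amount` of the owner's tokens."""
    def call(state):
        state["allowances"][(owner, spender)] = amount
    return call


def transfer_from(spender, owner, to, amount):
    """Build a call in which `spender` moves the owner's tokens to `to`."""
    def call(state):
        allowed = state["allowances"].get((owner, spender), 0)
        if allowed < amount or state["balances"].get(owner, 0) < amount:
            raise ValueError("transferFrom failed")
        state["allowances"][(owner, spender)] = allowed - amount
        state["balances"][owner] -= amount
        state["balances"][to] = state["balances"].get(to, 0) + amount
    return call


def run_batch(state, calls):
    """Apply all calls as one unit: either every call succeeds or nothing changes."""
    working = copy.deepcopy(state)
    for call in calls:
        call(working)  # an exception here leaves `state` untouched
    state.clear()
    state.update(working)
```

A batch such as run_batch(state, [approve("alice", "dex", 40), transfer_from("dex", "alice", "pool", 40)]) then behaves like the single transaction described above.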

Why is EIP-3074 Still Under Review?

As EIP 3074 involved the addition of two new opcodes through a protocol-level change, it had to go through a serious security review process. EIP 3074 is still under review due to three main issues raised in these security reviews. The first issue is that it makes sandwich attacks, which are already a problem, easier to perform. (For further reading: EIP-3074 Impact Study) The second is the risk that a vulnerability in an invoker contract could harm many users, because invoker contracts have full access to the assets in the wallet and unlimited authority over those assets. (For further reading: https://ethereum-magicians.org/t/a-case-for-a-simpler-alternative-to-eip-3074/6493) The third is that it cannot abstract signatures and nonces, which prevents the implementation of various account abstraction applications with this upgrade. The EIP 5003 proposal, which is currently on the agenda and being discussed by the Ethereum community, suggests that the AUTH function in EIP 3074 can also be carried over to 4337 thanks to the 4337 DelegateCall function that will be discussed below.

EIP-4337

There was a common problem in all the previous account abstraction proposals. They all proposed critical upgrades at the protocol level. As Ethereum is a chain hosting billions of dollars of assets and enabling the transfer of billions of dollars every day, it is imperative to be meticulous before upgrades. So, is it possible to enable account abstraction without introducing protocol-level upgrades? There we have it: 4337!

Proposed by Vitalik Buterin and a group of developers in 2021, 4337 is actually based on a rather basic principle: To develop an alternative mempool to the normal mempool, and to approve all smart contract wallet transactions from there. Let us start with a basic explanation, then we will go into the details. First, we will look at how the normal mempool works, then at the alternative mempool, in order to understand the logic of ERC4337.

  1. First, a transaction is initiated from an EOA wallet: it is signed with the private key, and its gas values, etc. are determined.
  2. Afterward, these transactions are transferred to the mempool via a full node.
  3. The validator selects (usually) the most profitable (to maximize revenue) transactions from the mempool, transactions are included in the block, and the transaction is approved.
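The selection in step 3 can be sketched as a greedy choice. This is a simplification under stated assumptions: PendingTx and build_block are hypothetical names, profitability is reduced to the priority fee per unit of gas, and per-sender nonce ordering is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PendingTx:
    sender: str
    gas_limit: int     # units of gas the transaction may consume
    priority_fee: int  # tip per unit of gas offered to the block producer


def build_block(mempool: List[PendingTx], block_gas_limit: int) -> List[PendingTx]:
    """Pick transactions by descending tip until the block's gas budget is used up."""
    chosen = []
    gas_used = 0
    for tx in sorted(mempool, key=lambda t: t.priority_fee, reverse=True):
        if gas_used + tx.gas_limit <= block_gas_limit:
            chosen.append(tx)
            gas_used += tx.gas_limit
    return chosen
```

Transactions that do not fit, or that offer too small a tip, simply stay in the mempool for a later block.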

How Do MEV Boost, Flashbots, and the Clients Work After the Merge?

Above all, I would like to briefly remind you what MEV is and explain the role of the Block Builder here. Transactions sent by the users wait in the mempool to be included by block producers. During this period, an opportunity arises for block producers: changing the order of transactions to make money. There are several types of MEV, such as sandwich attacks and arbitrage. For more details, you can check the following Twitter thread:

https://twitter.com/lyteraio/status/1578808153923145728

There is no open market in Ethereum where validators can sell their transaction sequencing advantage. Since we cannot know who will produce the block in advance, users who want to do MEV cannot make an agreement with validators "at Ethereum protocol level". That's exactly where MEV Boost developed by Flashbots comes into play.

The Ethereum PoS chain runs on 2 clients: the Execution client and the Consensus client. These two clients are separate software programs and are required to produce blocks, approve blocks, and participate in the consensus of the chain. This software can be modified as desired and there are various implementations of each. MEV Boost is additional software adapted to MEV that validators can run alongside these clients. The main issue that we should understand and keep in mind here is that validators are not forced to use MEV Boost.

So How Does MEV Boost Work?

The actors called Searchers, who look for potential financial gains from changing the order of transactions waiting in the mempool, send their transactions as bundles to Block Builders - along with a bribe - to have them included in the block. The sole role of the Block Builder is to build the block from the transactions with the highest bribes. Then, the block produced is transferred to the Relayers. From there, blocks are transferred to the validators, validators approve the block, the block is broadcast to the blockchain, and the users' transactions are "executed". Validators not using MEV Boost are not affected by this flow, and receive transactions directly from the mempool. In other words, MEV extraction is currently made possible by adding an extra layer to Ethereum, driven by economic benefit. Adding an extra layer is the second thing that we should keep in mind.

Now let's get to the main issue: How are we going to create an alternative mempool in EIP 4337?

EIP 4337 proposes the creation of a separate transaction type through an update that will only be performed by a subset of block producers, similar to the MEV Boost that I have just described, instead of changing the entire protocol. Let’s go into a bit more detail:

  • The transactions here are called UserOperation. UserOperation is the name given to transactions initiated from a wallet developed in the ERC4337 standard. UserOperation goes to a mempool that is completely independent of the normal mempool.
  • The actors who listen to the transactions in this mempool and include them in blocks are called Bundlers. Bundlers consist of block producers - while all Bundlers have to be block producers, not all block producers have to be Bundlers.
  • Entrypoint contract is a global contract through which all transactions of smart contract-based wallets are executed. All wallet activities, primarily UserOperation, take place on-chain through this contract.
  • Entrypoint contract works with three main components: wallet contract, aggregator, and paymaster.
  • Wallet contract is the main contract telling the wallet what to do, defining the rules, and holding the assets.
  • Aggregator: It is a "smart contract" controlling the validity of signatures for contract wallets. We'll elaborate below.
  • Paymaster: Actors who sponsor transactions. As mentioned above, they act as sponsors for transactions in operations such as paying fees with tokens other than Ether and initiating the first transaction in privacy applications to ensure that these transactions are executed.

Entrypoint Contract and Bundler

We will examine the Entrypoint contract and its content; first, however, let us understand what exactly the Bundler does. The flow through the Entrypoint contract relies on two main functions. ValidateUserOp takes a UserOperation as input and checks whether its nonce and signature are valid and whether the fee will be paid. If any violation of the transaction rules is detected here (such as an incorrect nonce, insufficient balance for the fee, or an invalid signature), the execution of the UserOperation is stopped. The Execute function runs the calldata in the UserOperation to execute the transaction.

There is an important nuance here. Normally, the validity rules of a transaction (Gas, Nonce, Signature - the rules described in the first section) are checked off-chain. Here, they are checked inside the EVM, which requires additional gas. So, what happens if a UserOperation that does not pay enough gas is checked on-chain? As the block producer (Bundler) has to pay fees for the on-chain check of the validity rules, it first checks locally whether the transaction is valid and whether it pays enough to cover the fees. Only afterward does it initiate the execution of this transaction on-chain. (Some opcodes cannot be safely simulated locally, because the simulation is performed off-chain and their results may differ on-chain. That is why they cannot be used in validation in the first place. This is a mandatory restriction to protect the protocol from a DoS attack.)
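The "simulate locally, then submit" behaviour of the Bundler can be sketched in Python. This is a toy model, not the ERC-4337 interfaces: the class and method names (UserOperation, Wallet, EntryPoint, Bundler, handle_ops, bundle) are hypothetical, the signature check is a boolean stand-in, and the fee is a flat number prepaid from the wallet's deposit during validation.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserOperation:
    sender: str
    nonce: int
    call_data: str
    fee: int
    signature_valid: bool  # stand-in for the wallet's own signature scheme


@dataclass
class Wallet:
    nonce: int = 0
    deposit: int = 0
    executed: List[str] = field(default_factory=list)

    def validate_user_op(self, op: UserOperation) -> bool:
        """Check signature, nonce and fee; on success bump the nonce and prepay the fee."""
        if not op.signature_valid or op.nonce != self.nonce or self.deposit < op.fee:
            return False
        self.nonce += 1
        self.deposit -= op.fee
        return True

    def execute(self, op: UserOperation) -> None:
        self.executed.append(op.call_data)


class EntryPoint:
    def __init__(self, wallets: Dict[str, Wallet]):
        self.wallets = wallets

    def handle_ops(self, ops: List[UserOperation]) -> int:
        """Validate every operation first, then execute them; return the fees collected."""
        for op in ops:
            if not self.wallets[op.sender].validate_user_op(op):
                raise ValueError("invalid UserOperation from " + op.sender)
        for op in ops:
            self.wallets[op.sender].execute(op)
        return sum(op.fee for op in ops)


class Bundler:
    def __init__(self, entry_point: EntryPoint):
        self.entry_point = entry_point
        self.earned = 0

    def bundle(self, mempool: List[UserOperation]) -> List[UserOperation]:
        """Keep only operations that pass a local dry run, then submit the rest."""
        accepted: List[UserOperation] = []
        for op in mempool:
            trial = copy.deepcopy(self.entry_point)  # local simulation on a copy
            try:
                trial.handle_ops(accepted + [op])
            except (ValueError, KeyError):
                continue  # would fail on-chain, so the Bundler drops it
            accepted.append(op)
        self.earned += self.entry_point.handle_ops(accepted)
        return accepted
```

Because every submitted operation has already passed the dry run, the Bundler is not left paying for validation that fails on-chain, as long as validation cannot behave differently between the simulation and the real run.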

Here, we can say that the primary role of the Bundler is to eliminate the requirement for an EOA wallet to initiate the transaction. The transaction is initiated via the Bundler. In the diagram above, we discussed how an existing smart contract wallet not using Paymaster and Aggregator performs a simple transaction; however, the features provided by 4337 are not limited to these. Let us take a look at Paymaster and Aggregator.

Paymaster

Paymasters are contracts located in the entrypoint contract that sponsor UserOperation. Paymasters have use cases such as paying fees with tokens other than Ether and paying the necessary fees to initiate transactions in privacy applications. Paymaster, created to advance user experience, is used optionally.

So, how does Paymaster sponsor transactions?

The previous section explained how a transaction can be performed without a Paymaster, how the ValidateUserOp function works, and what its purpose is. In a transaction using a Paymaster, sponsored transactions are performed with a function called ValidatePaymasterOp. First, ValidateUserOp is called to check the validity of the UserOperation, such as its nonce and signature. Then, ValidatePaymasterOp is called. This function checks whether the Paymaster is really able to pay the gas. If the Paymaster has enough balance to cover the fee, the ExecuteOp function is called, and thus the transaction is sponsored.
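The order of these calls can be sketched as follows. This is a simplified model with hypothetical names (UserOperation, Wallet, Paymaster, handle_op); in particular, the rule that a Paymaster only sponsors an allow-list of senders is an assumption made for this example, and the signature check is a boolean stand-in.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class UserOperation:
    sender: str
    nonce: int
    call_data: str
    fee: int
    signature_valid: bool  # stand-in for the wallet's signature check


@dataclass
class Wallet:
    nonce: int = 0
    deposit: int = 0
    executed: List[str] = field(default_factory=list)


@dataclass
class Paymaster:
    deposit: int
    sponsored_senders: Set[str]  # hypothetical sponsorship policy

    def validate_paymaster_op(self, op: UserOperation) -> bool:
        """Agree to sponsor only known senders, and only if the deposit covers the fee."""
        return op.sender in self.sponsored_senders and self.deposit >= op.fee


def handle_op(op: UserOperation, wallet: Wallet, paymaster: Optional[Paymaster] = None) -> str:
    """Validate the wallet side, then decide who pays, then execute. Returns the payer."""
    if not op.signature_valid or op.nonce != wallet.nonce:
        raise ValueError("wallet validation failed")
    if paymaster is not None:
        if not paymaster.validate_paymaster_op(op):
            raise ValueError("paymaster refused to sponsor")
        paymaster.deposit -= op.fee
        payer = "paymaster"
    else:
        if wallet.deposit < op.fee:
            raise ValueError("wallet cannot pay the fee")
        wallet.deposit -= op.fee
        payer = "wallet"
    wallet.nonce += 1
    wallet.executed.append(op.call_data)
    return payer
```

A wallet with an empty deposit can therefore still transact, provided some Paymaster is willing to cover the fee.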

Here, there is an important detail about the ValidateUserOp function: as each UserOperation accesses (almost) completely different storage, an invalid UserOperation does not affect another UserOperation. However, the case is different for the Paymaster function. Since a Paymaster's storage is the same for all UserOperations using that Paymaster, if a ValidatePaymasterOp call returns an invalid output, it will cause all UserOperations using that Paymaster to become invalid. This creates a DoS attack vector. To prevent this, each Bundler has the authority to ban or restrict the Paymaster; however, this does not stop a malicious actor from setting up hundreds/thousands of malicious Paymasters. Here, in order to prevent sybil attacks, users who wish to act as Paymasters are required to stake some Ether. This does not completely eliminate the DoS attack vector but makes the attack difficult to perform.
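A Bundler-side filter combining both defenses might look like the sketch below. It is a deliberately crude model: the PaymasterReputation class, the failure counter and the thresholds are hypothetical and do not reproduce the reputation rules of the actual proposal.

```python
class PaymasterReputation:
    """Track failed validations per Paymaster and require a minimum stake."""

    def __init__(self, min_stake, max_failures):
        self.min_stake = min_stake        # hypothetical threshold
        self.max_failures = max_failures  # hypothetical threshold
        self.failures = {}

    def record_failure(self, paymaster):
        """Note that an operation using this Paymaster failed validation."""
        self.failures[paymaster] = self.failures.get(paymaster, 0) + 1

    def is_allowed(self, paymaster, stake):
        """Accept operations only from staked Paymasters that have not failed too often."""
        if stake < self.min_stake:
            return False
        return self.failures.get(paymaster, 0) < self.max_failures
```

Banning alone is cheap to evade by creating a new Paymaster; the stake requirement is what attaches a cost to each fresh identity.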

Aggregator: Normally, the validity of each UserOperation is checked with ValidateUserOp, so the signature of each one is verified separately in the EVM. Because the validity rules of transactions for EOA wallets are checked with protocol-level code outside the EVM, they do not cause extra gas fees. For this reason, transactions sent with UserOperation will cause higher gas costs compared to EOA transactions. The way to eliminate this disadvantage and to reach more scalable UserOperations is to combine the signatures and approve multiple transactions with a single signature. Aggregators do exactly that. How does the signature aggregation provided by aggregators work?

A normal digital signature has 4 components: Private Key, Message, Public Key, and Signature. When you sign a message with your private key, you obtain the "Signature". To check the validity of a message, you need the public key, message, and signature. You can check this by calling simple functions.
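The relationship between these four components can be demonstrated with a bare-bones ECDSA over secp256k1. This is an educational sketch only: it assumes SHA-256 as the message hash (Ethereum uses Keccak-256, which is not in Python's standard library), it omits the recovery value v, it is not hardened against side channels, and the function names are invented for this example.

```python
import hashlib
import secrets

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def message_hash(message: bytes) -> int:
    # Assumption: SHA-256 stands in for Ethereum's Keccak-256.
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key, message, k=None):
    """Return the (r, s) signature; pass k only to get reproducible test output."""
    z = message_hash(message)
    while True:
        nonce = k if k is not None else secrets.randbelow(N - 1) + 1
        r = scalar_mult(nonce, G)[0] % N
        s = pow(nonce, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s
        if k is not None:
            raise ValueError("the supplied k is unusable")


def verify(public_key, message, signature):
    """Check a signature using only the public key, the message and (r, s)."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = message_hash(message)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G), scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r
```

Signing needs the private key, while verify needs only public information. This per-signature check is the work that has to be repeated for every UserOperation unless signatures are aggregated.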

Each ValidateUserOp call also checks whether the UserOperation was signed with a valid key pair. As I just explained, repeating this key pair check for each of them over and over again will cause high calldata costs.

By aggregating the messages and signatures of multiple UserOperations, the Aggregator enables them to be verified in a single function call. Verifying the aggregated signature is sufficient to prove that all the individual messages are valid. In this way, we can reduce calldata cost by about 26% in the first stage. You can access Vitalik's tweet from the link below. (Keep in mind that the actual cost in rollups is the cost of data in the Ethereum main layer - Vitalik refers to this.)

https://twitter.com/VitalikButerin/status/1554983955182809088

The Aggregator's process of aggregating the signatures of UserOperations, and the verification that follows, works as follows:

  • First, the Bundler passes the UserOperations to the Aggregator.
  • The Aggregator returns the aggregated signature to the Bundler.
  • The Bundler then submits the transaction through the HandleAggregatedOps function. In this function, EntryPoint verifies the signatures with the Aggregator contract, and the transactions are then executed with the ExecuteUserOp function.
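The three steps above can be mirrored in a few lines of Python. This sketch only models the control flow: the "signature" is a plain SHA-256 tag and the "aggregation" is an XOR of those tags, which has no cryptographic security at all and is merely a stand-in for a real aggregatable scheme such as BLS. The class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from functools import reduce
from typing import List


@dataclass(frozen=True)
class UserOp:
    sender: str
    call_data: bytes
    signature: bytes  # stand-in tag, NOT a real signature


def toy_sign(sender: str, call_data: bytes) -> bytes:
    return hashlib.sha256(sender.encode() + call_data).digest()


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class ToyAggregator:
    def aggregate(self, ops: List[UserOp]) -> bytes:
        """Steps 1-2: the Bundler hands over the ops and gets one fixed-size value back."""
        return reduce(_xor, (op.signature for op in ops), bytes(32))

    def validate(self, ops: List[UserOp], aggregated: bytes) -> bool:
        expected = reduce(_xor, (toy_sign(op.sender, op.call_data) for op in ops), bytes(32))
        return expected == aggregated


def handle_aggregated_ops(ops: List[UserOp], aggregated: bytes, aggregator: ToyAggregator) -> List[str]:
    """Step 3: one check for the whole bundle, then execute each operation."""
    if not aggregator.validate(ops, aggregated):
        raise ValueError("aggregated signature check failed")
    return [f"executed op from {op.sender}" for op in ops]
```

The point to notice is that the verification happens once per bundle instead of once per UserOperation, and that the value being checked stays the same size no matter how many operations are included.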

The biggest weakness of EIP 4337, as well as of EIP 2938 and EIP 86, is its exposure to DoS attack vectors. From the concepts I have explained, you may already understand why account abstraction can be a DoS vector, but I would like to explain the most basic reason once more before finishing this section.

Normally, the validity of an Ethereum transaction is determined through a strict set of validity rules, and checking a few values is enough to determine whether the transaction is valid and whether it will pay. With the account abstraction introduced in EIP-4337, we can only determine whether a transaction is valid after fully executing it. This is vital, because it is exactly what causes the DoS vector. Since we cannot tell whether a transaction is valid or able to pay without fully executing it, an attacker can submit invalid or non-paying transactions to the chain, which can even render the system unusable for real users. That's why some opcodes are banned in the validateUserOp function, and that's why Paymasters have to stake. This is one of the problems that is still being worked on.
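For contrast, here is roughly what "checking a few values" looks like for an EOA transaction. It is a simplified sketch, not client code: the field names are my own, and the signature_ok flag is an assumption standing in for the ECDSA check a node performs.

```python
from dataclasses import dataclass


@dataclass
class Account:
    nonce: int
    balance_wei: int


@dataclass
class Transaction:
    nonce: int
    gas_limit: int
    max_fee_per_gas_wei: int
    value_wei: int
    signature_ok: bool  # assumption: result of a signature check done elsewhere


def eoa_tx_is_valid(account: Account, tx: Transaction) -> bool:
    """Fixed, cheap checks: no contract code has to run to answer the question."""
    max_cost = tx.gas_limit * tx.max_fee_per_gas_wei + tx.value_wei
    return (
        tx.signature_ok
        and tx.nonce == account.nonce
        and account.balance_wei >= max_cost
    )
```

With a smart contract account, none of these comparisons is enough on its own, because the account's own code decides what counts as valid. That is the gap the banned opcodes and the staking rules try to close.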

In 4337, the revenue model of the Bundlers is rather basic: the actor producing the block has the authority to sequence the transactions. Hence, they are the primary actors for MEV. A block producer who captures this MEV advantage earns revenue just like the block producers currently working with MEV-Boost.

Congrats! Right now, you have mastered all the components of 4337 and account abstraction. You have learned about the changes Ethereum has undergone in terms of account abstraction, the proposals that have been made, and how wallets work on chains, including Ethereum where EIP-4337 has been integrated.

Now, let us take a look at the protocol-level account abstraction integrations. We will be discussing Starknet, zkSync Era, and Fuel.

Why are Protocol-Level Solutions Important?

Ethereum is currently the most economically active blockchain, hosting over $30 billion in locked assets and facilitating billions of dollars worth of asset transfers every day. As a result, protocol-level upgrades are difficult to implement and take a long time; "The Merge", for example, could only be realized after about 5 years of coordination. Rollups do not have this problem. Since they are entirely new chains that inherit their security from Ethereum, they do not suffer from backward-compatibility problems. As such, there is no reason why protocol-level account abstraction should not be developed on rollups. Starknet uses CairoVM, zkSync Era uses zkEVM, and Fuel uses FuelVM. While Starknet and zkSync implement upgrades similar to 4337 in their own VMs, Fuel aims to enable account abstraction with a completely separate approach, using an innovation called the "predicate". Let's look at the details:

Starknet

Before delving into how Starknet integrates account abstraction, let's briefly look at the roles of the Sequencer and the Prover in ZK-Rollups. The Sequencer is responsible for sequencing transactions, sending soft confirmations to users in the rollup, and forwarding the sequenced transactions to the Prover. The Prover, in turn, is responsible for generating proofs for the transactions forwarded to it. The actor verifying these proofs is the verifier in the bridge contract on Ethereum. The only thing we need to know here is that the Sequencer is responsible for adding transactions to the Batch, checking the validity of transactions, and forwarding the Batch to the Prover. Starknet and zkSync both work on a similar logic.

For account abstraction, Starknet has implemented a modified version of EIP 4337 at protocol level. Here, protocol-level account abstraction consists of three stages:

  1. The Sequencer selects transactions from the mempool and "only" checks whether the selected transaction has a valid nonce value.
  2. If the transaction nonce is valid, the validateTX function is called on the account contract.
  3. If the transaction is valid and will pay its fee, the executeTX function is called to execute the transaction. If the transaction is invalid, it is not executed and the Sequencer does not receive any payment.
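The three stages can be sketched as a small Python model. This is an illustration under assumptions, not Starknet code: the ToyAccount class, the dictionary layout of a transaction, and the rule inside validate_tx are all made up for the example; only the order of the checks follows the list above.

```python
class ToyAccount:
    """Hypothetical account contract with its own validation rule."""

    def __init__(self, owner: str, balance: int) -> None:
        self.owner = owner
        self.balance = balance
        self.nonce = 0

    def validate_tx(self, tx: dict) -> bool:
        # The account decides what "valid" means; here: signed by the owner and able to pay.
        return tx["signer"] == self.owner and self.balance >= tx["max_fee"]

    def execute_tx(self, tx: dict) -> str:
        self.balance -= tx["max_fee"]
        self.nonce += 1
        return "executed"


def sequence(account: ToyAccount, tx: dict) -> str:
    # Stage 1: the Sequencer itself only checks the nonce.
    if tx["nonce"] != account.nonce:
        return "rejected: invalid nonce"
    # Stage 2: validation is delegated to the account contract.
    if not account.validate_tx(tx):
        return "rejected: validation failed"
    # Stage 3: only a valid, paying transaction is executed.
    return account.execute_tx(tx)
```

Note that a transaction rejected in stage 2 earns the Sequencer nothing, even though it already spent work on it. That unpaid work is the DoS concern discussed next.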

Here, unlike EIP 4337, local simulation is only carried out for the nonce value, which makes the protocol more exposed to DoS vectors. To compensate, it must be certain that a transaction accepted into the mempool will pay once it is executed. That's why there is another protection mechanism that checks transactions before they are broadcast on the Starknet network: before a transaction is added to the mempool and broadcast, its validity is checked by a node (the Sequencer) on Starknet. In this way, invalid transactions are not broadcast on the network, and malicious contract accounts are prevented from performing DoS attacks.

Since Starknet's integration for Paymaster is almost the same as EIP 4337, I will not mention it again. It should also be noted that Starknet's integration is still in Starknet Alpha and may be subject to change based on various proposals for improvement.

zkSync Era

As mentioned, zkSync has an integration similar to EIP 4337. zkSync is also a ZK-Rollup solution that uses a Sequencer to order transactions and add them to the Batch. As in EIP 4337 and Starknet, there are two stages in zkSync Era. Here, EOAs and contract accounts submit transactions to the same mempool. Sharing one mempool enables paymasters to sponsor EOA transactions too, which the original 4337 implementation does not allow. So on zkSync Era, you can create a paymaster that sponsors an EOA’s transactions. The Operator (similar to the Bundler in 4337) collects transactions, which are then executed in the EVM via a smart contract called Bootloader (similar to the EntryPoint in 4337). Unfortunately, I couldn’t find much information about how the simulation (the off-chain process) happens on zkSync, so here is an explanation of what happens on-chain:

  1. Validation stage where the transaction validity is checked: 

    1. First, the validity of the nonce value is checked. (Note: Since the nonce integration on zkSync is based on a deterministic order, a wallet can only initiate one transaction at a time. The team is planning to update this in the future.) The Operator first calls the validateTransaction function. This function checks the validity of the transaction and, if it is valid, moves on to the next stage, in which the transaction fee is paid. Three different functions are executed to check whether payment can be received from the Paymaster or directly from the user's wallet. In the final stage of validation, Bootloader checks whether it has received the fee required to execute the transaction.

  2. Transaction execution stage:

    1. This stage consists of two parts. First, the executeTransaction function is called. Then, if a Paymaster is used, the PostOp function is called. This function makes it easier for the Paymaster to pay fees using ERC20 tokens.
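The two on-chain stages above can be summarized in a short model. This is a sketch under assumptions rather than zkSync code: the classes, the process method, and the single fee-transfer step (which stands in for the three fee functions mentioned above) are simplifications of my own.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Wallet:
    balance: int
    nonce: int = 0


@dataclass
class Paymaster:
    balance: int
    post_op_calls: int = 0


@dataclass
class Bootloader:
    collected_fees: int = 0
    log: List[str] = field(default_factory=list)

    def process(self, wallet: Wallet, tx_nonce: int, fee: int,
                paymaster: Optional[Paymaster] = None) -> bool:
        # Validation stage: nonce check, then fee payment by the Paymaster or the wallet.
        if tx_nonce != wallet.nonce:
            return False
        payer = paymaster if paymaster is not None else wallet
        before = self.collected_fees
        if payer.balance >= fee:
            payer.balance -= fee
            self.collected_fees += fee
        # Final validation check: did the Bootloader actually receive the fee?
        if self.collected_fees - before < fee:
            return False
        # Execution stage: run the transaction, then the post-operation hook if sponsored.
        wallet.nonce += 1
        self.log.append("executeTransaction")
        if paymaster is not None:
            paymaster.post_op_calls += 1
            self.log.append("postOp")
        return True
```

In this model a wallet with a zero balance can still transact as long as a Paymaster covers the fee, which is the sponsorship case described above.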

Also, zkSync plans to provide full EIP 1271 (https://eips.ethereum.org/EIPS/eip-1271) support. While only ECDSA (the signature scheme used on Ethereum) can be used for now, the team is planning to support different signature types by creating a signature library in the future. They have just fully open-sourced their bootloader contract and shared the audit report.

Fuel

Fuel, unlike the other two rollups, is an optimistic rollup and uses the UTXO model as its account model. Thanks to the advantages of UTXO, it incorporates many features of account abstraction without opening any DoS vector, and it supports account abstraction applications. But how? (There is a detailed report on the Fuel Network on Lytera. You can access the report at this link: https://lytera.io/en/report/fuel-report-en/)

UTXO stands for Unspent Transaction Output. To explain it with a simple analogy, let's say you have $100 in cash. If you send $30 of that to me, the $100 is consumed and two new outputs are created: $30 that now belongs to me and $70 that comes back to you as change. Each of these unspent outputs is a UTXO. More technically: transactions on a blockchain consist of inputs and outputs. To spend a UTXO, you digitally sign a proof that you own it; the UTXO is then consumed as an input of the transaction, and the transaction produces new unspent outputs. Once a UTXO is spent, it cannot be spent again. At the end of the transaction above, two UTXOs have been created, and the system continues to progress based on unspent outputs.

The most important concept to know about UTXO is that each UTXO is spent separately and has no access to the other UTXOs in the blockchain state. The only thing we have when programming Ethereum and other account-based chains is "inputs". In Fuel's UTXO model, you can also use outputs, writing functions of the form "if you meet these conditions, this output will be generated". The "Predicates" that come with the UTXO concept in Fuel are what enable Fuel's account abstraction.

What we call a "Predicate" is actually just a piece of code. A predicate is not a contract account or an EOA wallet; it is simply code specifying that the assets it holds can be spent only if certain conditions are met. Each predicate has an address to which you can send assets and from which they can be spent. This address is obtained by taking the hash of the predicate's bytecode. Thanks to predicates, functions can be written that allow spending under various conditions. In other words, a predicate can be described as a UTXO that carries its spendability conditions within itself. The UTXO model opens up many possibilities for developers thanks to the ability to define spendability rules within predicates.
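The idea that the address is nothing more than a hash of the code fits in a few lines. This is a simplified sketch: I am assuming a plain SHA-256 over the raw bytecode, and Fuel's actual derivation may include additional steps, so treat the function as an illustration rather than as Fuel's algorithm.

```python
import hashlib


def predicate_address(bytecode: bytes) -> str:
    """Derive an address directly from the predicate's code (simplified assumption)."""
    return "0x" + hashlib.sha256(bytecode).hexdigest()
```

Because the address depends only on the code, nobody has to deploy anything or send a transaction to "create" a predicate, and changing a single byte of the spending rules yields a different address.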

Let us give an example: we can impose a condition of obtaining signatures from 2 of 3 wallets as the spendability condition of a predicate. We can do the same with a contract wallet, but there is a big difference between the two. To create a predicate, all you need to do is take the hash of the predicate's code, and you have the address of the predicate. Assets in the predicate can be spent directly once the signatures are provided, and the fee can be paid by the predicate itself. In the case of a contract account (I am referring to Safe contracts), however, we need EOA wallets both to create the wallet and to initiate each transaction (in a 2-of-3 multisig, the second person signing initiates the transaction and pays the fee), regardless of whether the contract stores billions of dollars. This is a really bad user experience. With predicates, in other words, it is possible to initiate transactions as long as the validity rules are met.
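A 2-of-3 spending condition can be expressed as a pure function. In this sketch the signature verification itself is assumed to have happened already, so the predicate only receives the set of addresses whose signatures were found valid. The function name and the string addresses are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable


def make_multisig_predicate(owners: FrozenSet[str],
                            threshold: int) -> Callable[[Iterable[str]], bool]:
    """Build a spending rule: at least `threshold` distinct owners must have signed."""
    if not 0 < threshold <= len(owners):
        raise ValueError("threshold must be between 1 and the number of owners")

    def predicate(valid_signers: Iterable[str]) -> bool:
        # Only the arguments are inspected: no storage and no chain state is read.
        return len(owners & set(valid_signers)) >= threshold

    return predicate
```

The rule returns true or false from its inputs alone. Whoever submits the transaction does not matter, only whether enough owners have signed.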

In this way, the asset itself, the UTXO becomes programmable thanks to the predicate. Moreover, we also solve two very important problems that are present in other account abstraction integrations:

  1. On Ethereum, transactions initiated from an ERC4337-based wallet open various DoS vectors, since we can only know whether a transaction will actually pay after executing it. The reason is that ERC4337-based transactions on Ethereum have access to the latest state of the blockchain. This is called stateful account abstraction. In Fuel's UTXO architecture, each UTXO only has access to its own storage, and since the spending of predicates is deterministic, account abstraction on Fuel never accesses the shared state in any way. Therefore, account abstraction on Fuel does not create any DoS vector. This is called stateless account abstraction.
  2. Under EIP 4337 on Ethereum, transactions are still initiated from an EOA. On Fuel, as long as the spending conditions within a predicate are met, anyone can initiate the transaction. This looks set to provide a better experience for both users and developers.

I should also mention that Fuel is preparing to provide signature verification support at the execution layer called ECrecovery. I believe this architecture that brings programmability to all assets in Fuel's protocol will provide a Web2-level user experience in many products.

Final Remarks and What Can Be Done With Account Abstraction

Although account abstraction is still in the development stage, what we can do with it is limited only by our imagination: fully programmable wallets, social recovery, 2-factor protection, native multisigs, transaction batching, and more...


About Clave

Clave is a smart wallet platform powered by Account Abstraction. We offer seamless Web3 UX and hardware-level security via our key management system utilizing Trusted Execution Environments.

Our approach is simple: We should make the UX as seamless as possible without compromising security, but how?

The User Experience of crypto can be summarized into onboarding, users' interactions with blockchains, and account recovery.

Onboarding: Our most adopted everyday devices such as mobile phones use isolated microchips (e.g. Secure Enclave for Apple) in order to protect sensitive user data. These chips are isolated from the main processor to provide an extra layer of security.

By utilizing the secure elements in everyday devices (like phones, tablets, or some notebooks), it’s possible to onboard users via biometric authentication. Since there's no need to secure the private key or seed phrases, the onboarding process happens in fewer steps.

Users' interactions with blockchains: We need to make these interactions safe and easy. The way users interact with blockchains will change as a result of declarative paradigms like Intents, session key modules, and security modules such as transaction simulators.

Recovery: In order to provide a secure, non-custodial, and censorship-resistant recovery mechanism, Clave utilizes the best of both worlds by providing a combination of social recovery and iCloud & Gmail login.

Clave's tech stack is going to be modular, flexible, and multichain, and will be starting with @zkSync.

Thank you for reading our article.

Special Thanks

Special thanks to Tahir, Hamza, Arseniy, and Ulaş, who answered my questions while I was writing this article.

Special thanks to Nichanan, Ishanee, Mert, Can, Austin, and Antonio, who proofread this article and took it to the next level.

Also thanks to İsmail for editing and helping with publishing.

Resources

All of the resources are compiled at this link: https://hackmd.io/BJYF5A3kSqCZaBZDoPIFYw?view


About Clave

Clave is an easy-to-use, non-custodial smart wallet powered by Account Abstraction and the hardware-level security elements (e.g., Secure Enclave, Android Trustzone, etc.) to simplify the onchain experience for the next billions. By empowering users with a user-friendly and secure bridge to seamlessly integrate their assets into everyday life, Clave delivers a comprehensive fintech solution, ensuring a holistic financial experience for all.

Connect with Clave: