Data Extraction Methodology for EVM-based Smart Contracts

The application implements a methodology for extracting data from EVM-based smart contracts, including execution-related data and state changes. To this end, the methodology first collects the transactions of the contract and then extracts the state changes produced by each of them. This is made possible by replaying transactions inside the Ethereum Virtual Machine (EVM) and using the resulting traces to reconstruct the change history of the smart contract's variables.

The Application

Configuration. This step sets the parameters that identify the contract from which to extract data. The first parameter is the network to use, such as a particular mainnet or testnet. A block range is also required to bound the interval of transactions to retrieve. Additional filters can be set on gas used, gas price, time interval, set of sender addresses, and set of executed functions. All these parameters are used in the following steps to determine and filter the transactions to extract from the specified smart contract.
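As an illustration only, the parameters listed above could be grouped in a single structure. The class name `ExtractionConfig`, its field names, and the placeholder addresses are assumptions made for this sketch and are not taken from the application.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class ExtractionConfig:
    """Hypothetical container for the parameters of the Configuration step."""
    network: str                # e.g. "mainnet" or the name of a testnet
    contract_address: str
    from_block: int
    to_block: int
    # Optional filters: None (or an empty set) means "do not filter on this".
    min_gas_used: Optional[int] = None
    max_gas_used: Optional[int] = None
    min_gas_price: Optional[int] = None
    max_gas_price: Optional[int] = None
    start_time: Optional[int] = None    # Unix timestamp, inclusive
    end_time: Optional[int] = None      # Unix timestamp, inclusive
    senders: FrozenSet[str] = frozenset()
    functions: FrozenSet[str] = frozenset()

    def __post_init__(self) -> None:
        if self.from_block > self.to_block:
            raise ValueError("from_block must not be greater than to_block")
        # Addresses are compared case-insensitively, so store them lowercased.
        self.contract_address = self.contract_address.lower()
        self.senders = frozenset(s.lower() for s in self.senders)
        self.functions = frozenset(self.functions)


# Placeholder values, not real contract or sender addresses.
config = ExtractionConfig(
    network="mainnet",
    contract_address="0x00000000000000000000000000000000000000AA",
    from_block=100,
    to_block=200,
    senders={"0x00000000000000000000000000000000000000BB"},
    functions={"transfer"},
)
```

Validating the block range at construction time means that a reversed interval is rejected before any transaction is requested.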

Get contract code. This step retrieves the source code of the target smart contract, which is later compiled. To this end, the contract address and the contract name are taken as input. To keep the procedure fully automated, the code is acquired directly when it is verified and publicly available; otherwise, the user can upload it manually.

Get contract transactions. This step collects all the transactions that refer to the specified smart contract. First, the list of transactions within the defined block interval is retrieved. Then, the transactions are filtered according to the previously defined parameters, so that only those matching all of them are selected for extraction.
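The filtering logic can be sketched as a conjunction of checks, where a transaction is kept only if it passes every configured filter. The function name `select_transactions`, the dictionary keys, and the sample records are hypothetical; how the application actually represents transactions is not stated here.

```python
def select_transactions(transactions, *, from_block, to_block,
                        max_gas_used=None, senders=None, functions=None):
    """Keep only the transactions that satisfy every configured filter."""
    senders = {s.lower() for s in senders} if senders else None
    selected = []
    for tx in transactions:
        if not from_block <= tx["block_number"] <= to_block:
            continue
        if max_gas_used is not None and tx["gas_used"] > max_gas_used:
            continue
        if senders and tx["from"].lower() not in senders:
            continue
        if functions and tx["function"] not in functions:
            continue
        selected.append(tx)
    return selected


# Made-up sample records with shortened placeholder hashes and addresses.
sample = [
    {"hash": "0x1", "block_number": 100, "from": "0xAA", "gas_used": 21000, "function": "transfer"},
    {"hash": "0x2", "block_number": 150, "from": "0xBB", "gas_used": 90000, "function": "transfer"},
    {"hash": "0x3", "block_number": 300, "from": "0xAA", "gas_used": 21000, "function": "approve"},
]
selected = select_transactions(sample, from_block=100, to_block=200,
                               max_gas_used=50000, senders={"0xaa"})
```

Filters left at `None` are skipped, so the same function covers both a plain block-range query and a fully constrained one.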

Compile contract. Once the smart contract source code is obtained, the Solidity compiler is used to produce three outputs: (i) the Application Binary Interface (ABI), (ii) the Abstract Syntax Tree (AST), and (iii) the storage layout.
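One possible way to request these three outputs is the Solidity compiler's standard JSON interface. The sketch below assumes a `solc` binary is available on the PATH and invokes it through `subprocess`. The helper names `build_compiler_input` and `compile_contract` are hypothetical, and the application may drive the compiler differently.

```python
import json
import subprocess


def build_compiler_input(file_name: str, source: str) -> dict:
    """Build a standard JSON input that asks for ABI, storage layout and AST."""
    return {
        "language": "Solidity",
        "sources": {file_name: {"content": source}},
        "settings": {
            "outputSelection": {
                "*": {
                    "*": ["abi", "storageLayout"],  # per-contract outputs
                    "": ["ast"],                    # per-file output
                }
            }
        },
    }


def compile_contract(file_name, contract_name, source, solc="solc"):
    """Run solc (assumed to be installed) and return (abi, ast, storage_layout)."""
    result = subprocess.run(
        [solc, "--standard-json"],
        input=json.dumps(build_compiler_input(file_name, source)),
        capture_output=True, text=True, check=True,
    )
    output = json.loads(result.stdout)
    errors = [e for e in output.get("errors", []) if e.get("severity") == "error"]
    if errors:
        raise RuntimeError(errors[0].get("formattedMessage", "compilation failed"))
    contract = output["contracts"][file_name][contract_name]
    return contract["abi"], output["sources"][file_name]["ast"], contract["storageLayout"]
```

The compiler version must be compatible with the `pragma` of the retrieved source, which this sketch does not handle.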

Extract contract storage and internal transactions. This step captures the contract state changes by extracting the state variables updated during each transaction. To this end, each transaction is replayed in a local environment that reproduces the state of the blockchain at the time the transaction was originally executed. This is done by cloning the block in which the transaction was included and replaying the transaction together with any transactions that precede it in that block. The replay returns the transaction trace, which contains, among other things, the list of executed operations (i.e., opcodes) and the state of the EVM (i.e., memory locations). In particular, the opcodes represent operations in memory, such as the inclusion of a new variable or the calculation of a storage index, while the EVM state contains all the storage slots with their respective keys and values. To reconstruct the state variable changes, this information is matched against the storage layout to identify precisely which state variable was updated, including dynamic ones.
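The matching between storage slots and the storage layout can be illustrated for the simplest case: variables stored in place that fit in a single 32-byte slot. The sketch assumes the layout follows the format emitted by the Solidity compiler (`storage` entries with `slot`, `offset` and `type`, and a `types` table with `encoding` and `numberOfBytes`) and that the slot values arrive as hexadecimal strings. The function name and the sample data are hypothetical. Mappings and dynamic arrays are deliberately left out, since locating their slots requires Keccak-256 hashing.

```python
def decode_static_variables(storage_layout, slots):
    """Map raw storage slots to in-place variables that fit in one slot."""
    by_slot = {int(key, 16): int(value, 16) for key, value in slots.items()}
    decoded = {}
    for var in storage_layout["storage"]:
        type_info = storage_layout["types"][var["type"]]
        size = int(type_info["numberOfBytes"])
        if type_info["encoding"] != "inplace" or size > 32:
            continue  # mappings, dynamic arrays, multi-slot values: not handled
        slot = int(var["slot"])
        if slot not in by_slot:
            continue  # this slot does not appear in the trace
        # Packed variables are lower-order aligned: shift by the byte offset,
        # then mask out the bytes that belong to this variable.
        mask = (1 << (8 * size)) - 1
        decoded[var["label"]] = (by_slot[slot] >> (8 * var["offset"])) & mask
    return decoded


# Made-up layout: `owner` and `paused` share slot 0, `total` occupies slot 1.
layout = {
    "storage": [
        {"label": "owner", "slot": "0", "offset": 0, "type": "t_address"},
        {"label": "paused", "slot": "0", "offset": 20, "type": "t_bool"},
        {"label": "total", "slot": "1", "offset": 0, "type": "t_uint256"},
    ],
    "types": {
        "t_address": {"encoding": "inplace", "label": "address", "numberOfBytes": "20"},
        "t_bool": {"encoding": "inplace", "label": "bool", "numberOfBytes": "1"},
        "t_uint256": {"encoding": "inplace", "label": "uint256", "numberOfBytes": "32"},
    },
}
slots = {"0x0": hex((1 << 160) | 0xAB), "0x1": hex(42)}
values = decode_static_variables(layout, slots)
```

The returned values are raw integers; converting them to addresses, booleans or signed numbers would be a further step driven by the type label.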

Extract blocks, transactions and events. Once the contract state changes and internal transactions are collected and decoded, the methodology continues by reading the information associated with transactions, blocks and events. For each transaction, the methodology takes the name of the executed function from the corresponding log, together with its inputs, which are decoded using the ABI. Other attributes are then read, such as hash, sender, timestamp, gas used, and more. The events emitted by the transactions are also captured using the ABI, together with the names and values of their decoded attributes.
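To show what decoding inputs with the ABI involves, the sketch below handles only static argument types (`address`, `bool` and unsigned integers), each of which occupies one 32-byte word after the 4-byte function selector. The table mapping selectors to ABI entries is assumed to be given; in practice it is derived from the ABI by hashing each function signature with Keccak-256, which the Python standard library does not provide (`hashlib.sha3_256` is a different variant). The function name `decode_call` is hypothetical. The selector `a9059cbb` is the well-known one for `transfer(address,uint256)`.

```python
def decode_call(input_hex, selectors):
    """Decode the function name and static arguments of a transaction input."""
    raw = input_hex[2:] if input_hex.startswith("0x") else input_hex
    data = bytes.fromhex(raw)
    entry = selectors[data[:4].hex()]  # the first 4 bytes select the function
    args = {}
    for i, param in enumerate(entry["inputs"]):
        word = data[4 + 32 * i: 4 + 32 * (i + 1)]
        kind = param["type"]
        if kind == "address":
            args[param["name"]] = "0x" + word[12:].hex()  # last 20 bytes
        elif kind == "bool":
            args[param["name"]] = int.from_bytes(word, "big") != 0
        elif kind.startswith("uint") and "[" not in kind:
            args[param["name"]] = int.from_bytes(word, "big")
        else:
            raise NotImplementedError(f"type {kind} is not covered by this sketch")
    return entry["name"], args


# Assumed selector table with a single ABI entry.
selectors = {
    "a9059cbb": {
        "name": "transfer",
        "inputs": [{"name": "to", "type": "address"},
                   {"name": "value", "type": "uint256"}],
    }
}
# Made-up call data: selector, padded address, then the amount 1000.
calldata = "0xa9059cbb" + "00" * 12 + "11" * 20 + (1000).to_bytes(32, "big").hex()
name, args = decode_call(calldata, selectors)
```

Dynamic types such as `string`, `bytes` and arrays use offset-based encoding and would need a full ABI decoder.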

Generate log. After all the previously mentioned data has been extracted, this step generates the output log that is provided to the user. The log can be generated in JSON and CSV formats for broader compatibility with modern analysis techniques.
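A minimal sketch of the two output formats follows, using only the standard library. The record fields are invented for illustration and do not reflect the actual log schema. Because CSV is flat, nested values such as decoded inputs are serialized as JSON strings inside their cell, which is one possible design choice among several.

```python
import csv
import io
import json


def to_json(records):
    """Serialize the extracted records as a JSON document."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize the records as CSV, one row per record."""
    fieldnames = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for record in records:
        # Nested structures are stored as JSON text inside a single cell.
        writer.writerow({
            key: json.dumps(value) if isinstance(value, (dict, list)) else value
            for key, value in record.items()
        })
    return buffer.getvalue()


# Made-up record with a shortened placeholder hash.
records = [
    {"hash": "0x1", "function": "transfer", "gas_used": 51000,
     "inputs": {"value": 1000}},
]
json_log = to_json(records)
csv_log = to_csv(records)
```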

Data querying. In addition to the log, the methodology provides a data querying step in which the user can interact with the extracted data. During the previous steps, the data is saved in a local database that the user can query. This permits faster data retrieval, without the need to replay transactions every time. Moreover, the use of a standard DBMS permits the definition of complex queries and aggregations.
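The kind of aggregation this step enables can be sketched with SQLite, chosen here only because it ships with Python; the text does not state which DBMS the application uses. The table name, columns and rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for the local store (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "hash TEXT PRIMARY KEY, sender TEXT, function TEXT, gas_used INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        ("0x1", "0xaa", "transfer", 51000),
        ("0x2", "0xbb", "transfer", 52000),
        ("0x3", "0xaa", "approve", 46000),
    ],
)
conn.commit()


def gas_by_function(connection):
    """Return (function, number of calls, total gas used) for each function."""
    return connection.execute(
        "SELECT function, COUNT(*), SUM(gas_used) "
        "FROM transactions GROUP BY function ORDER BY function"
    ).fetchall()


summary = gas_by_function(conn)
```

Since the data is already stored, such a query runs without replaying any transaction.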

Monitoring & Controlling / Quality & Analysis · June 24, 2024
