Dune SQL Style Guide
This guide provides best practices for writing clean, readable, and maintainable SQL queries on Dune. Whether you're analyzing DeFi protocols, NFT collections, or blockchain metrics, following these conventions will help you create more efficient and collaborative queries.
By adopting these practices, we can build a more maintainable query library that benefits the entire web3 analytics community.
CTEs rather than Subqueries
CTEs are preferred over subqueries because they make SQL more readable, maintainable, and often perform better. Unlike nested subqueries that can become difficult to follow, CTEs create a logical, step-by-step flow that's easier to debug. For an in-depth explanation of why CTEs are a fundamental part of our SQL style, we recommend reading this detailed discussion on dbt's discourse.
Essential CTE Structure and Organization
✅ Break down your query into smaller, manageable pieces using CTEs - Think of each CTE as a building block that does one specific task
✅ Put all your data sources (using {{ ref('...') }} or sources) at the beginning of your query to make it easy to see where your data comes from (only for dbt, non applicable for dune)
✅ Filter out unnecessary data as soon as possible in your query to make it run faster
✅ Give your CTEs clear, meaningful names that explain what they do - For example, use 'daily_transactions' instead of 'cte1'
✅ Add comments to explain any complex calculations or logic in your CTEs - This helps others (and yourself) understand the code later
✅ If you find yourself copying the same CTE multiple times, consider making it a separate model instead (this applies to dbt users)
✅ Always end your query with a 'final' or ‘finally’ CTE that shows your end result - This makes it easier to check your work by looking at the steps in between. And to debug
CTE Structure and Best Practices
✅ Start each new CTE with a comma at the beginning of the line - this makes it easier to add or remove CTEs without breaking your query
✅ Write comments to help others understand what your code does, especially for complex calculations
✅ Arrange your CTEs in a logical order - start with raw data, then transform it step by step until you get your final result
✅ Make your code easier to read by indenting the content inside each CTE - this helps visually separate different parts of your query
Example
with
nft_sales as (
select * from {{ ref('ethereum_nft_sales') }}
)
-- Filter NFT sales for a specific collection
, filtered_sales as (
select *
from nft_sales
where nft_contract_address = lower('0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d') -- BAYC
and block_time >= now() - interval '30' day
)
select * from filtered_sales;
Writing Clear and Consistent SQL Code
When analyzing on-chain data on Dune, writing clear and consistent SQL code isn't just about making your queries work – it's about making them understandable and maintainable for the entire web3 analytics community. Here's why it matters:
Clean, readable SQL code helps other analysts build upon your work, enables easier debugging of complex blockchain queries, and ensures your analyses remain valuable as protocols evolve. Whether you're tracking DeFi metrics, NFT collections, or other blockchain data, following consistent formatting practices makes your queries more accessible to both beginners and experienced analysts alike.
Let's dive into the key principles of writing clear SQL code for web3 analytics...
SQL Code Formatting and Styling Standards
✅ Put commas at the start of lines (not at the end) to make it easier to add new columns or comment out existing ones. This helps avoid syntax errors when modifying your query.
-- ❌ Wrong: Using trailing commas
select
block_number,
tx_hash,
from_address, -- Trailing comma makes it harder to comment out lines
from ethereum.transactions
-- ✅ Right: Using leading commas
select
block_number
, tx_hash
, from_address -- Easy to comment out any line without breaking syntax
from ethereum.transactions
✅ Make your code easier to read by using consistent spacing. Add four spaces at the start of each line except for WHERE conditions, which should line up with WHERE
-- ❌ Wrong: Inconsistent indentation
select
tx_hash,
block_number,
from_address
from ethereum.transactions
where block_time >= '2025-01-01'
and value > 0
-- ✅ Right: Consistent four-space indentation
select
tx_hash
, block_number
, from_address
, value / 1e18 as value_in_eth
from ethereum.transactions
where block_time >= '2025-01-01'
and value > 0
and success = true
✅ Keep your SQL code easy to read by not writing super long lines - if a line gets too long, break it into multiple lines. This makes it easier to read and understand your queries without having to scroll sideways. Benchmark: 80/120 characters.
-- ❌ Wrong: Long line that requires horizontal scrolling
select tx_hash, block_number, from_address, to_address, value, gas_used, gas_price, block_time, success, case when value > 0 then true else false end as is_value_transfer from ethereum.transactions
-- ✅ Right: Breaking long lines for better readability
select
tx_hash
, block_number
, from_address
, to_address
, value / 1e18 as value_in_eth
, gas_used
, gas_price / 1e9 as gas_price_gwei
, block_time
, success
, case
when value > 0 then true
else false
end as is_value_transfer
from ethereum.transactions
Naming Conventions
✅ When writing SQL queries in Dune, keep everything in lowercase: this includes names of fields (like 'wallet_address'), functions (like 'sum' or 'count'), and SQL commands (like 'select' or 'from')
-- ❌ Wrong: Mixed case and unclear names
SELECT
WalletAddress,
Token_Name,
BLOCK_TIME,
temp_val1
FROM EthTransactions t
WHERE t.ChainId = '1';
-- ✅ Right: Consistent lowercase and clear names
select
wallet_address
, token_name
, block_time
, eth_value
from ethereum.transactions
where chain_id = '1';
✅ Always use as
when giving new names to tables or columns. For example, instead of writing 't' for your table name, use something more descriptive like 'txs' for transactions or 'evts' for events. This makes your code easier to understand for others (and yourself later!)
-- ❌ Wrong: Unclear table alias and missing AS keyword
SELECT t.tx_hash, e.amount
FROM ethereum.transactions t
JOIN ethereum.erc20_events e ON t.tx_hash = e.tx_hash;
-- ✅ Right: Clear table names and explicit AS
-- ✅ Right: Clear joins
select
txs.tx_hash
, evt.amount
from ethereum.transactions as txs
join ethereum.erc20_events as evt
on txs.tx_hash = evt.tx_hash;
✅ Keep table names clear in your joins - using full names like 'customers' is better than short versions like 'c', unless the short version makes your query easier to read. Then using 3 letters is a good idea.
Example above
✅ Give your tables and columns meaningful names that describe what they contain - instead of names like 'temp_table' or 'data1', use descriptive names like 'daily_trades' or 'wallet_balances'
Example above
✅ When you need to uniquely identify rows in your tables, use a consistent name for this ID column - for example, you could use 'unique_row_id' for all your tables
✅ When tracking when data was last updated, always use the same format: current_timestamp as last_updated_at
🚧 Tip: Try to be consistent with your table names - either use singular (like 'trade', 'wallet') or plural (like 'trades', 'wallets') throughout your queries, but don't mix them
SQL Organization Best Practices
✅ List all your regular columns (like user_id, email, name) first, then add any calculations or totals (like sum(amount) or row_number()) after. This makes your queries easier to read and understand
-- ❌ Wrong: Mixing state fields with window functions
select
wallet_address,
row_number() over (partition by wallet_address order by block_time) as tx_number,
token_symbol,
token_name,
sum(token_amount) over (partition by wallet_address) as total_transferred
from ethereum.erc20_transfers;
-- ✅ Right: State fields grouped together, followed by window functions
select
wallet_address
, token_symbol
, token_name
, row_number() over (partition by wallet_address order by block_time) as tx_number
, sum(token_amount) over (partition by wallet_address) as total_transferred
from ethereum.erc20_transfers;
✅ Combine (or aggregate
) your data into smaller groups as early as possible, before joining it with other tables. This helps your query run faster by reducing the amount of data it needs to process. For example, if you're counting daily transactions, first count them for each day, then join that smaller dataset with other information you need.
-- ❌ Wrong: Joining before aggregating
select
w.wallet_address,
w.ens_name,
sum(t.value/1e18) as total_eth_transferred
from ethereum.wallets w
left join ethereum.transactions t
on w.wallet_address = t.from_address
group by 1, 2;
-- ✅ Right: Aggregating before joining
with transfer_totals as (
select
from_address
, sum(value/1e18) as total_eth_transferred
from ethereum.transactions
group by 1
)
select
w.wallet_address
, w.ens_name
, tt.total_eth_transferred
from ethereum.wallets as w
left join transfer_totals as tt
on w.wallet_address = tt.from_address;
✅ Always write out the full join type - use inner join
instead of just join
. This helps make it clear what kind of join you want (an inner join only keeps rows that match in both tables). It's better to be clear about what you're doing than to rely on SQL's default behavior
-- ❌ Wrong: Implicit join type
select usr.wallet_address, txn.tx_hash
from ethereum.users usr
join ethereum.transactions txn
on usr.wallet_address = txn.from_address;
-- ✅ Right: Explicit join type
select
usr.wallet_address
, txn.tx_hash
from ethereum.users as usr
inner join ethereum.transactions as txn
on usr.wallet_address = txn.from_address;
✅ When joining tables, prefer to use left join
instead of right join
.
-- ❌ Wrong: Using right join
select
w.ens_name
, t.tx_hash
from ethereum.transactions t
right join ethereum.wallets w
on w.wallet_address = t.from_address;
-- ✅ Right: Restructuring to use left join
select
w.ens_name
, t.tx_hash
from ethereum.wallets as w
left join ethereum.transactions as t
on w.wallet_address = t.from_address;
✅ When joining multiple tables, always add the table name or its shortcut (alias) before the column name. For example, write 'users.name' or 'u.name' instead of just 'name'. This helps avoid confusion about which table the column comes from. Skip this only when you're selecting from a single table.
-- ❌ Wrong: Ambiguous column references
select
name,
token_symbol,
tx_hash,
value
from ethereum.wallets w
join ethereum.transactions t on w.address = t.from_address;
-- ✅ Right: Clear table prefixes
select
wlt.name
, wlt.token_symbol
, txn.tx_hash
, txn.value / 1e18 as eth_value
from ethereum.wallets as wlt
inner join ethereum.transactions as txn
on wlt.address = txn.from_address;
Data Grouping and Sorting Conventions
✅ Use numbers instead of column names in GROUP BY
whenever possible. This makes your SQL queries easier to maintain, especially when column names change. Learn more here.
-- ❌ Wrong: Using column names in GROUP BY
select
wallet_address,
block_date,
sum(transaction_value) as total_value
from ethereum.transactions
group by wallet_address, block_date;
-- ✅ Right: Using positional references
select
wallet_address
, block_date
, sum(transaction_value) as total_value
from ethereum.transactions
group by 1, 2
-- or {{ dbt_utils.group_by(2) }};
✅ Use UNION AL
L instead of UNION
whenever possible. UNION ALL
is faster because it doesn’t remove duplicates, while UNION
does extra work to check for duplicates. Only use UNION
if you specifically need to remove duplicates. In the case that you do need to remove duplicates use UNION DISTINCT
to make that explicit
-- ❌ Wrong: Using UNION when duplicates don't matter
select wallet_address, 'active' as status
from dune.active_users
union
select wallet_address, 'inactive' as status
from dune.inactive_users;
-- ✅ Right: Using UNION ALL for better performance
select wallet_address, 'active' as status
from dune.active_users
union all
select wallet_address, 'inactive' as status
from dune.inactive_users;
Joins & Filtering
✅ Always list the main (left) table first in joins for better readability and clarity. This makes it easier to understand which table is the primary source of data in the query
select
transactions.*,
senders.reputation as sender_reputation,
receivers.reputation as receiver_reputation
from ethereum.transactions
left join dune.users as senders
on transactions.from_address = senders.wallet_address
left join dune.users as receivers
on transactions.to_address = receivers.wallet_address;
Case Clause
✅ Format CASE
statements for better readability:
• Put each WHEN
and ELSE
on a new line.
• Indent THEN
statements under their corresponding WHEN
.
• Write END
on a new line without indentation.
This makes it easier to see the logic and outcomes at a glance.
, case
when withdrawal_date is null and expiration_date is not null
then expiration_date
when withdrawal_date is null
then start_date + interval '7 days'
else withdrawal_date
end as withdrawal_date
Best Practices Dune example
with
transactions_data as (
-- list columns if possible
select * from {{ ref('ethereum_transactions') }}
where block_timestamp >= '2024-01-01'
and (
status = 'success' or
status = 'pending'
)
)
, user_data as (
select * from {{ ref('dune_users') }}
)
, aggregated_transactions as (
select
from_address
, sum(value) as total_value_sent
, max(gas_used) as max_gas_used
from transactions_data
group by 1
)
, final as (
select distinct
txns.tx_hash
, txns.from_address
, txns.to_address
-- Use line breaks to visually separate logic
, case
when txns.status = 'success' and txns.confirmations > 10
then 'finalized'
when txns.status = 'pending'
then 'awaiting confirmation'
else 'failed'
end as transaction_status
, aggt.total_value_sent
, aggt.max_gas_used
from transactions_data as txns
left join aggregated_transactions as aggt
on txns.from_address = aggt.from_address
having count(*) > 1
)
select * from final;
Final Thoughts for Dune Web3 Users
✔ Prioritize readability over brevity – extra lines of SQL don’t slow down execution, but unclear queries waste valuable brain time.
✔ DRY (Don’t Repeat Yourself) – if you find yourself repeating code, consider using an extra CTE, macro (dbt), or model (dbt) to improve maintainability.
✔ Break down complex blockchain queries – use CTEs to simplify logic when dealing with on-chain transactions, token transfers, and wallet interactions.
✔ Follow consistent formatting and naming conventions – lowercase everything, align JOIN conditions properly, and format CASE statements for clarity.
✔ Optimize for performance but keep maintainability – structuring queries efficiently ensures they remain readable while handling large datasets, such as historical Ethereum transactions.
✔ Keep queries under 100 lines when possible – long queries can slow down Dune’s query optimizer, making them harder to debug and maintain.
🚧 Prefer sentinel values over NULLs – avoid letting NULLs propagate unchecked through your query. Use defaults where applicable (e.g., 0 for transaction values, 'unknown' for missing labels) to prevent unexpected results.
🚧 Use tooling. Configurable linter/formatters like sqlfluff take the drudgery out of keeping your code formatted. You can hook them into your editor to run automatically on save or into your workflow as git pre-commit hook.
Guide provided by Randall Potter and C. Ittevrec from Absinthe, to contribute a guide like this on the Dune's site, contact alsie@dune.com.