
Stratified Sampling in Tableau Prep | New in Tableau 2023.3

Tableau Prep 2023.3 adds stratified sampling. Here's why it makes prepping large datasets faster and your samples far more representative.

Part of What's new in Tableau 2023.3
  • Stratified sampling takes your population, groups it by a chosen dimension, then randomly selects within each group so the sample is more representative
  • In Tableau Prep, sampling only affects what you see while prepping — when you hit run, all records flow through to the output regardless of sample settings
  • Without stratification, a rare category like loan defaults (14% of the data) can be under-represented in your sample, leaving you with too few rows to analyse
  • To enable it, go to the input step, open Data Sample, choose Stratified, then pick the column to group by (e.g. client records stratified by the default column)
  • Sampling keeps row counts down to avoid Tableau Prep performance and stability issues with multi-million row datasets while preserving a useful spread
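To make the under-representation point concrete, here is a minimal Python sketch of plain (non-stratified) random sampling. It is not Tableau Prep code: the dataset, the `client_id` and `default` field names, and the `simple_random_sample` helper are all hypothetical, with only the 14% default rate taken from the example above.

```python
import random


def simple_random_sample(rows, n, seed=0):
    """Draw n rows uniformly at random, ignoring any grouping."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


# Hypothetical data: 10,000 loans, 14% of which defaulted.
rows = [{"client_id": i, "default": i < 1400} for i in range(10_000)]

sample = simple_random_sample(rows, 100)
defaults_in_sample = sum(r["default"] for r in sample)

# A 100-row sample holds about 14 default rows on average, and the exact
# count changes from draw to draw, so the rare category can end up thin.
print(f"default rows in sample: {defaults_in_sample} of {len(sample)}")
```

The rarer the category and the smaller the sample, the more likely it is that too few rows of that category appear to be useful while exploring.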

When working with large datasets, you can use stratified sampling to capture a sufficient number of records from an infrequent category as you explore, clean, and shape your data. The new stratified sampling algorithm lets you group by a specified column and then sample data within each subgroup. Prep returns an equal number of rows for each value of the selected grouping column, so that every category is represented in the sample.
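The group-then-sample idea described above can be sketched in a few lines of Python. This illustrates the general technique only, not Tableau Prep's internal algorithm: the `stratified_sample` function, its `rows_per_group` parameter, and the loan data are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, rows_per_group, seed=0):
    """Group rows by `key`, then randomly pick up to rows_per_group from each group."""
    rng = random.Random(seed)

    # Step 1: split the population into subgroups by the chosen column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    # Step 2: sample the same number of rows from every subgroup
    # (or the whole subgroup if it is smaller than the request).
    picked = []
    for members in groups.values():
        k = min(rows_per_group, len(members))
        picked.extend(rng.sample(members, k))
    return picked


# Hypothetical data: 10,000 loans, 14% of which defaulted.
loans = [{"client_id": i, "default": i < 1400} for i in range(10_000)]

picked = stratified_sample(loans, key="default", rows_per_group=50)

counts = defaultdict(int)
for row in picked:
    counts[row["default"]] += 1
print(dict(counts))  # {True: 50, False: 50}
```

Here the rare "default" group contributes as many rows as the common group, which is the equal-allocation behaviour the paragraph describes. A proportional allocation would be a different design choice.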

Timestamps
0:00 Intro
0:23 What is Stratified Sampling?
1:46 Sampling in Tableau Prep
4:41 How Stratified Sampling Works in Tableau Prep

(C) 2023 TN-Media LTD. No re-use, unauthorized use, or redistribution of this video without prior permission.