Tableau Prep Builder: The Interface and Philosophy (Part 2 of 7)
Forget Excel's flat tabs - in Tableau Prep your data preparation becomes a flow with direction, and the visual dictionary tells you exactly what's happening at every step.
- Tableau Prep treats data preparation as a directional flow from input on the left to output on the right, rather than Excel's flat, order-sensitive structure.
- Each step type has its own visual design in the 'visual dictionary' - aggregates, pivots, joins, unions, scripts and outputs - with names drawn from the database world.
- Clicking on a line between objects inserts a step into the path, while clicking the end of an object spins off a branch into a new direction.
- Steps are sequential and order matters, but you can reorder cleaning actions or step back to earlier points to undo changes without affecting later steps.
- Coloured streams match the colours in the flow so you can trace exactly which files a column has come from, and the alerts pane categorises errors versus warnings.
- Recap and purpose (0:00)
- Touring the start screen (0:39)
- The flow philosophy (2:23)
- The visual dictionary of steps (4:13)
- Inserting steps versus branches (6:25)
- Sequential steps and stepping back (7:28)
- Settings, summary and data grid views (9:31)
- Zoom controls and summary window (12:18)
- Alerts pane and tracing columns (13:06)
- Staying current and next steps (15:00)
0:00Okay, so in the last video, I took you
0:04through inputting data into Tableau Prep.
0:07And at this point, now we're about to start
0:09to work with the main concept of Tableau
0:11Prep.
0:12So what I wanted to do first before diving
0:14into the product is just to do a little
0:16recap
0:17of the last step and also talk a bit about
0:19the interface.
0:20So we've covered how to connect to files as
0:23typically the first thing you want to do.
0:25Now what I want to do in this video is
0:27slightly different.
0:28I actually want to talk about Tableau Prep's
0:30philosophy, how you go about working with
0:32data in Tableau Prep.
0:33And I also want to introduce you to the
0:35visual dictionary that's used inside of
0:37Tableau Prep.
0:38So here we are in the Tableau Prep
0:40interface.
0:41On the very left-hand side, you've got what's
0:43called the connection pane.
0:45Here you'll find your connections.
0:46You saw this in the first video, and you
0:48can see here a list of the possible data
0:50sources
0:50you can connect to.
0:51As ever, these are constantly updated, so
0:54look out for those over time.
0:56Now in this main section of the canvas, you've
0:58got your recent flows.
0:59So you can actually see some of the workflows
1:01I've been working on very recently.
1:04And at the very bottom here, you can see
1:06some sample flows.
1:07So this is a sample flow that everyone
1:09should have.
1:10You can simply click on these, open them,
1:12and you should have Tableau Prep open up
1:14that
1:14flow.
1:15Now if I just open up a new instance of
1:17Tableau Prep, I'll just show you one last
1:19thing, and
1:20that's the discover pane here on the right-hand
1:22side.
1:22If I just go to full screen here, the
1:24discover pane is really useful.
1:26If you're trying to learn Tableau, it has
1:27some really good starting points.
1:29You'll see here they actually have a very
1:31similar sort of introductory video.
1:33And in many ways, these are actually a must.
1:34They're really important videos to go have a
1:36look at, and they're great resources put
1:38together
1:39by Tableau themselves.
1:40Now if you want to change this view from a
1:42grid view, you can do.
1:43You can just click on this icon, and you
1:46get a list of workflows and what's going on.
1:48And you can also see how many outputs, the
1:50location of the file, when it was last
1:52opened.
1:52So in many ways, this might be a more
1:54useful view for you if you work with a lot
1:56of files.
1:56At the very top, you've obviously got the
1:58ability to open a flow that is already
2:00existing,
2:00or you can hop right into your process and
2:03just connect to some data.
2:05The last thing you'll see here is on the
2:06very bottom right, you'll see a prompt to
2:08update
2:08to a new version.
2:09Because Tableau Prep is constantly being
2:11updated, there's always this ability to
2:13update to a
2:13new version.
2:14So it's always worth checking that out.
2:16Okay.
2:17So we'll come out of the full screen, and
2:19we'll head back to our first flow that we
2:22opened up.
2:23Now, one thing to bear in mind is that
2:25unlike Excel, where you do your steps in
2:27this sort
2:28of flat structure, in Tableau Prep, you
2:31think of your data preparation and your
2:33sort of
2:34processing of data as a flow.
2:37It has a direction from the start to the
2:39end.
2:39You have your inputs on the left, the
2:41start, and you have your outputs on the
2:42right, the
2:43end.
2:44The end is basically the final output,
2:46whatever you then take on to another
2:48analytics tool
2:49or export to a database or send as a file
2:52to a colleague.
2:53That's always the final step.
2:55Now, in any particular flow, you can have
2:57multiple outputs, and therefore you can
2:58have
2:59multiple final steps.
3:01In this particular case, I could be
3:04creating a budget.
3:05And in this particular step, I could
3:07actually be aggregating the actual figures
3:09for a particular
3:10year, maybe because my future budget is
3:12based on a certain percentage of this year's
3:14performance,
3:15and this output is actually this year's
3:18performance in real time.
3:20Now, in between your output and your input,
3:23you have all these steps.
3:25And in Excel, what you have to do is you
3:27have to make sure you do things in the
3:28right order.
3:29And if you've done something wrong, you
3:31have to go back and change it, or you
3:33change your
3:33calculations and you make sure that
3:36everything references every single cell
3:38correctly.
3:39In Tableau Prep, that mistake is slightly
3:41harder to make because everything happens
3:43in a sequential order.
3:45The visual nature of the product means that
3:47you have to actually think about the way
3:48you're
3:48going to process your data in a much more
3:51logical way.
3:52You can see the relationships between how
3:54your data comes together, and you start to
3:56get some visual cues about what's going on.
3:59And so your flow is essentially the whole
4:01journey of your data from start to finish,
4:04and that's what we're looking at now.
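If it helps to see the flow idea outside the tool, here is a minimal Python sketch of data entering on the left, passing through ordered steps and leaving on the right. Every function name and the sample rows are made up for illustration; none of this is Tableau Prep's own API.

```python
# A toy "flow": an input, a list of ordered steps, and an output.
# All names and data here are hypothetical, for illustration only.

def input_step():
    # Stand-in for connecting to a file
    return [
        {"region": "North", "sales": 100},
        {"region": "South", "sales": 250},
        {"region": "North", "sales": 50},
    ]

def clean_step(rows):
    # Example cleaning action: standardise region names to upper case
    return [{**row, "region": row["region"].upper()} for row in rows]

def aggregate_step(rows):
    # Sum sales per region
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["sales"]
    return [{"region": region, "sales": sales} for region, sales in totals.items()]

def run_flow(source, steps):
    # Data moves through the steps strictly in the order given
    data = source()
    for step in steps:
        data = step(data)
    return data

output = run_flow(input_step, [clean_step, aggregate_step])
print(output)
```

The point of the sketch is only that each step receives whatever the previous step produced, which is what the lines between steps represent on the canvas.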
4:05Again, if you're new to Prep, don't worry.
4:07I'm going to go through all of this at some
4:09point throughout this series, so just tune
4:12in.
4:13And what I want you to do is pay attention
4:15to the visual dictionary.
4:16Now Tableau call this the visual dictionary
4:18because it's basically a visual information
4:21system that allows you to understand what's
4:24going on.
4:25But what exactly do they mean by that?
4:27Well, they have this sort of mechanism
4:29whereby every single process or every
4:31single tool
4:32that you will add here in Tableau Prep has
4:35its own visual sort of design.
4:37So if you look at this, you have the main
4:40step types here, so you have branches,
4:42aggregate
4:43steps, pivots, joins, unions.
4:47You have the capability to do some
4:49scripting with Python or R, and also the
4:51ability to
4:52output a file.
4:53So if I just relate that a little bit to
4:55Excel, these are the kind of things you
4:57might do.
4:58For example, the aggregation step, you
5:00might highlight a column and then add it
5:02all up
5:02to get a particular number.
5:04You might pivot your data because it's in
5:06the wrong shape.
5:07You might do a VLOOKUP; in Tableau or in
5:09databases, that's typically a join of some
5:12sort.
5:12Or you might have, let's say, three
5:15tabs with the same columns, but in each tab
5:17you
5:17have data from different months.
5:19You might decide to union them together.
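For anyone who thinks more easily in code than in Excel, the four operations just described can be sketched in plain Python. This is a rough illustration with invented tables and column names, and it assumes the pivot runs from columns to rows; it is not how Tableau Prep implements these steps.

```python
# Hypothetical data: three monthly "tabs" that share the same columns
jan = [{"product": "A", "sales": 10}, {"product": "B", "sales": 5}]
feb = [{"product": "A", "sales": 7}]
mar = [{"product": "B", "sales": 3}]

# Union: stack rows from tables that have the same structure
unioned = jan + feb + mar

# Aggregate: add a column up per group, like summing a highlighted column
totals = {}
for row in unioned:
    totals[row["product"]] = totals.get(row["product"], 0) + row["sales"]

# Join: pull a matching value from another table, like a VLOOKUP
categories = {"A": "Hardware", "B": "Software"}
joined = [
    {"product": product, "sales": sales, "category": categories[product]}
    for product, sales in totals.items()
]

# Pivot (columns to rows): reshape a wide table into a long one
wide = [{"product": "A", "jan": 10, "feb": 7}]
long_rows = [
    {"product": row["product"], "month": month, "sales": row[month]}
    for row in wide
    for month in ("jan", "feb")
]

print(totals)
print(joined)
print(long_rows)
```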
5:21As you start to see, each of these
5:23processes has a particular name in Tableau,
5:25and these
5:26names are typically derived from the
5:28database world, so the data architecture
5:30and so on
5:31and so forth.
5:32So if they're new terms, don't worry, these
5:34are terms that exist in other tools and
5:35philosophies,
5:36so it's worth investing a bit of time
5:38understanding these.
5:40Now the second thing to understand is that
5:42Tableau Prep uses this concept of a flow,
5:45and a flow is essentially what we're
5:46looking at now.
5:47So if you go from left to right, you have
5:49your input here on the left-hand side, and
5:52on the very far right-hand side you have
5:54your output.
5:55Now you will notice that there are two
5:57outputs in this particular flow.
5:59That makes sense because in some processes
6:01you might want data captured from a
6:03particular
6:03point.
6:04So if you're doing a budget process, you
6:06might want an output of your budget, but then you
6:09also might want an output at the end of
6:10your process to show you how you're doing
6:12against
6:13the budget.
6:14And so these two outputs make a lot of
6:16sense, and those might then go into a Tableau
6:18workbook
6:19or into some other sort of data system like
6:22a database or some other analysis.
6:24Now between your outputs and inputs, you
6:27have what are called steps, and these steps
6:30are
6:30actually these individual icons you can see
6:33here, and the main thing to understand is
6:36these steps do different things.
6:38So as you saw before, each step sort of has
6:41its own grouping, but this particular step
6:44is a simple calculation step.
6:46So if I go in here and I just say, I
6:48actually have to go and make sure I do this
6:50on a line
6:51because I've already got an output.
6:53So if I click on the line, I insert a step,
6:58okay?
6:59If I go back one step and instead I click
7:01on the very end of the object, I insert a
7:04branch.
7:05So Tableau is understanding the difference
7:07between me intercepting this path and me
7:10spinning
7:11off this path.
7:12So it's about sort of taking your data
7:13processing into different directions.
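One way to picture the difference between inserting a step and adding a branch is to treat the flow as a small graph, where each step lists the steps it feeds. The sketch below is only an analogy; the step names and both helper functions are hypothetical.

```python
def insert_step(flow, upstream, downstream, new_step):
    """Clicking on the line: new_step now sits between upstream and downstream."""
    flow[upstream] = [new_step if name == downstream else name for name in flow[upstream]]
    flow[new_step] = [downstream]

def add_branch(flow, upstream, new_step):
    """Clicking the end of a step: new_step becomes an extra path out of upstream."""
    flow[upstream].append(new_step)
    flow[new_step] = []

# Each key is a step; each value lists the steps it feeds
flow = {"Input": ["Clean"], "Clean": ["Output"], "Output": []}

insert_step(flow, "Clean", "Output", "Calc")   # Clean -> Calc -> Output
add_branch(flow, "Clean", "Explore")           # Clean also feeds Explore

print(flow)
```

After both calls, Clean feeds two steps: the inserted Calc step on the original path and the new Explore branch.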
7:16If you think of Excel, this is like
7:17copying all the data in one particular tab,
7:19putting
7:20it on a new tab, and then trying out a new
7:22calculation.
7:23This is the exact same step here in Tableau
7:27Prep.
7:27If I zoom out a little bit, you'll see that
7:31each of these flows has a line, and these
7:34lines essentially dictate the direction in
7:37which the data is moving.
7:38So if I was to remove this step and then
7:41just hit run, you'll see that my data goes
7:44from
7:44left to right, okay?
7:46When it's actually running the flow, my
7:48data streams from left to right, and when it's
7:50done,
7:51I get the output over here.
7:53Now when I click on this particular line
7:55and I just bring this down, you'll see that
7:58I
7:58get this pane that pops up and I get these
8:01scrolling lines coming out across the top.
8:04What this shows me here in this particular
8:07section is a preview of the data at this
8:10particular
8:10point.
8:12So unlike Excel where I have to capture
8:14each tab and the process for each tab, and
8:16then
8:17I have to spread that across multiple
8:19sheets or I run some sort of SQL in the
8:21background
8:21that isn't visible, Tableau Prep actually
8:24shows you each of the single steps required.
8:26So here you can see that I've got two steps.
8:31One is a rename.
8:32In fact, I've got two remove steps and one
8:34is a rename, three steps in total.
8:37And if I go over here to the left-hand side
8:39pane, I actually get the detail of those
8:41and
8:41the order in which they're applied.
8:43I can move these around and it just changes
8:45the order of the process.
8:47But it's important to bear in mind that if
8:49I was, for example, to move this rename
8:51option
8:51to the top, it might fail because what
8:54happens here is that this step might need
8:57to happen
8:58before this step.
8:59So these steps are sequential.
9:00You have to bear that in mind in terms of
9:03the way you process.
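To see why the order of changes matters, here is a small Python sketch in which one change depends on a field that an earlier change creates. The field names and helper functions are invented for the example, and the failure is modelled as a plain KeyError rather than whatever Tableau Prep reports.

```python
def rename_field(old, new):
    def change(row):
        row = dict(row)            # work on a copy so the input is untouched
        row[new] = row.pop(old)    # KeyError if `old` does not exist yet
        return row
    return change

def uppercase_field(name):
    def change(row):
        row = dict(row)
        row[name] = row[name].upper()  # KeyError if `name` does not exist yet
        return row
    return change

def apply_changes(row, changes):
    # Changes run top to bottom, like the list in the changes pane
    for change in changes:
        row = change(row)
    return row

row = {"cust": "ann"}
rename = rename_field("cust", "Customer")
upper = uppercase_field("Customer")

in_order = apply_changes(row, [rename, upper])

try:
    apply_changes(row, [upper, rename])
    reordered_ok = True
except KeyError:
    # "Customer" does not exist until the rename has run
    reordered_ok = False

print(in_order, reordered_ok)
```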
9:04The other thing you can do, which is quite
9:06hard to do in Excel, is you can actually go
9:08back several steps.
9:09Let's say you do a calculation or an
9:11aggregation and you're not happy with it.
9:13By just simply clicking on the first step,
9:15I actually go back to that original step
9:17and
9:17I can undo that step without interfering
9:19with some of the future steps.
9:21So removing this top selection here doesn't
9:24actually interfere with either of these two
9:26steps.
9:27Okay?
9:28And of course, I can always go back and
9:30undo that.
9:31Now, if we look throughout the entire flow,
9:34you'll see that this visual information
9:37system
9:37remains broadly the same.
9:39For every single step, you'll always have
9:42the preference pane here on the left-hand
9:44side, a toolbar that goes across the top, a
9:47view that shows you the results summarized.
9:50And if I push this up a little bit, a view
9:53below here that shows you the table, the
9:55tabular
9:56detail.
9:58If I go to another step, let's click on the
10:01union, you'll see this step has a slightly
10:04different settings pane because it's trying
10:07to show you the way in which the data
10:09unions.
10:10I can also see any changes that have been
10:12made during the step here.
10:13Again, I still have my toolbar, I have my
10:16summary, and again, I have my table.
10:18Now, if I want to change what's in this
10:20view, I can actually go over to this
10:22toolbar here
10:24and you'll see several things.
10:26This view is the one I'm currently on.
10:28If I click on the second option here and I
10:30show the data grid, it actually hides the
10:32summary and I just get to see the data as
10:35it almost looks like in Excel.
10:37Imagine these as column headers and these
10:40are simply my rows.
10:42And then maybe I might want a view where I
10:44don't want to see my data, I actually just
10:45want to see the column headers and I want
10:47to see what changes have been applied to them,
10:49if any at all.
10:51And I can also get this sort of visual
10:53system that shows me the color of the
10:54streams where
10:55this data has come from.
10:57You see, in this particular step, I have a
11:01purple, a green, a blue, and I'm going to
11:04say a teal color.
11:06So you can see these four colors here.
11:08Those actually correlate with these colors
11:11up here.
11:12So this visual system is not only
11:14communicating the particular sort of paths
11:16that things are
11:17coming from, it matches what's actually in
11:20the flow.
11:21So I can look at this and say file paths
11:23are only coming from the green flow.
11:25If I just zoom into that, you can see that's the color green.
11:27And then if I zoom back out, you'll see
11:30that this is the only particular output
11:32that's
11:32giving me a file path.
11:34And you can see the columns right there.
11:36And then if I want to see what's actually
11:38in there, I can go back to my data grid by
11:40clicking the second icon here on the right-hand
11:42side.
11:43And then if I want to go back to my summary
11:45view as well with the data grid, I can do
11:47that as well.
11:49Now this tool pane, we'll go into this in
11:50more detail throughout this series, but it's
11:53always important to bear in mind that it
11:55changes depending on what you selected.
11:57If I click on the canvas, the tool pane
11:59actually disappears altogether.
12:01If I click on a particular step, it reappears.
12:05But the number of options that you get here
12:07in this pane changes depending on what you
12:09select.
12:10Now if I want to hide this pane, even
12:12though I've got this particular tool
12:13selected, I
12:14can just hit this down button and it
12:16disappears.
12:17Now, the last thing I want to introduce you
12:19to is this little summary window here on
12:21the
12:22bottom right-hand side.
12:23You can actually see that I can collapse it
12:26and close it just by clicking this arrow to
12:28the bottom left-hand side of this square.
12:31I can also take this icon here, and this
12:34will get the flow to fit in my view.
12:36Let's say that I zoom out too far.
12:39If I click that, it goes back to almost 100%.
12:42If I zoom in, I can again hit that button
12:45there to go back to how I like it.
12:48Another shortcut you can use is Ctrl and +
12:50to zoom in and Ctrl and - to zoom back out, and
12:53that will work.
12:54If you're on a Mac, it's Command and + and
12:57Command and -.
12:58Okay?
12:59I hit that icon again to go back to 100%,
13:03and then we'll go back out.
13:05Now, the last thing I want to show you in
13:08this sort of step is the Alerts pane.
13:11The Alerts pane is over here to the top
13:13right-hand side.
13:14This is where you'll typically see errors.
13:16If there are any errors in your flow, for
13:19example, if I was to duplicate this step
13:23and
13:23paste it, I'm going to get an error.
13:26If I click on this tool, you'll see the
13:28error is actually highlighted here.
13:30The issue here is that I've created another
13:32output that writes to the file that I'm
13:34already
13:35writing to, so I can't possibly write to
13:37one file from two different places.
13:39So Tableau has thrown me an error.
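The kind of check behind that error can be sketched in a few lines of Python: collect every output's target file and flag any file that more than one output writes to. The output names, file names and the function itself are hypothetical; this is not Tableau Prep's actual validation logic.

```python
from collections import Counter

def find_output_conflicts(outputs):
    """Return the target files that more than one output step writes to."""
    counts = Counter(path for _name, path in outputs)
    return sorted(path for path, count in counts.items() if count > 1)

# (output step name, target file): invented for illustration
outputs = [
    ("Output 1", "budget.csv"),
    ("Output 2", "actuals.csv"),
    ("Output 1 (copy)", "budget.csv"),  # the duplicated step
]

conflicts = find_output_conflicts(outputs)
print(conflicts)  # ['budget.csv']
```

Removing the duplicated output, as happens a little later in the video, would leave the list of conflicts empty.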
13:42It will try and categorize them for you as
13:44well.
13:44So there are also warnings.
13:46Warnings are typically things you should be
13:48aware of, but don't necessarily cause an
13:49issue
13:50in terms of the way the flow works.
13:51So this is also something to bear in mind.
13:54Now, the last thing I want to highlight are
13:57these lines that go in between these
13:59individual
14:00steps.
14:01If I just remove this output so we don't
14:04get an error.
14:05As I showed you before, when we click on a
14:07particular column, let's say we go to our
14:10data summary view and we click on the
14:12Discount column.
14:13They go darker to show where a particular
14:17column is coming from.
14:19So this Discount column that I've selected
14:21is making this flow go darker on these
14:23particular
14:24lines because a Discount column actually
14:27comes from all four of our files that have
14:29gone
14:29into this.
14:30So we have our input steps, we have our
14:33cleansing steps here.
14:34We then have a union step, then a join step,
14:37another cleansing step, an aggregation
14:40step,
14:41another join step, and then an output.
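The tracing that the darker lines give you can be imitated with a very small lookup: for a chosen column, list the inputs that contain it. The input names and column sets below are assumptions made up for this sketch; they only mirror the situation described, where Discount exists in all four files and the file path column in just one.

```python
# Hypothetical inputs and the columns each one carries
inputs = {
    "orders_central": {"Order ID", "Discount"},
    "orders_east": {"Order ID", "Discount", "File Paths"},
    "orders_south": {"Order ID", "Discount"},
    "orders_west": {"Order ID", "Discount"},
}

def sources_of(column, inputs):
    """Names of the inputs that contain the given column, in sorted order."""
    return sorted(name for name, columns in inputs.items() if column in columns)

discount_sources = sources_of("Discount", inputs)
file_path_sources = sources_of("File Paths", inputs)

print(discount_sources)   # all four inputs
print(file_path_sources)  # ['orders_east']
```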
14:44And that's where the discount field is sort
14:47of flowing from and flowing to, if that
14:50makes
14:50sense.
14:51Now, as we work throughout this series, I'll
14:53jump into way more detail about each of
14:55these
14:56steps and we'll be able to sort of dissect
14:58them.
14:59This visual system is new.
15:01Tableau Prep itself is a very new product
15:03at the time of this recording.
15:04It's been out for about a year.
15:06So this is going to change very, very, very
15:08quickly.
15:09In fact, I already know that there's
15:11another version at the time of this
15:12recording, which
15:13adds a few bits of functionality to Prep.
15:17So it's really, really important to stay in
15:19tune with all the updates so you know what's
15:21going on.
15:22Wherever possible, I'll try and link the
15:24updates and notification cards in the top
15:27of all the
15:27videos.
15:28So if you see those, tune in.
15:30Otherwise, I'll drop something in the
15:31description linking to where you can find
15:33the latest updates
15:33about Tableau Prep.
15:34All right.
15:35Join us in the next video where we're going
15:37to start talking about our first tool,
15:39which
15:39is the cleansing step, this line step here
15:43that you see me selecting.
15:46If you enjoy the video, hit subscribe.
15:47If not, drop a comment.
15:49Let us know what you'd like to see on the
15:51channel and I'll catch you in the next one.
In this video, I give you a tour around the Tableau Prep Builder interface, show you how Tableau Prep Builder works and highlight some of the nuances compared to working with files in Microsoft Excel.
Join my Discord server: https://discord.gg/shBuxXr (it's a little sparse at the moment, but hang in there).