Generate rows in Tableau Prep: New in 2021.3
Generating rows in Tableau Prep finally kills off the nasty self-union scaffolding trick for good.
- The new rows feature in Tableau Prep offers three modes: filling from a single numeric field as a sequence, filling between two dates, and filling between a range of two fields
- You can choose whether new rows are left as nulls or copied from the previous row, with bold rows in the preview indicating the records being generated
- Date fills let you set increments of days, weeks or months, and you can override the min/max range with custom start and end dates
- A handy trick for keeping even rows is ROW_NUMBER() % 2, which returns a boolean you can filter on, and works in Tableau Desktop too
- This feature replaces messy scaffolding workarounds like self-unions for use cases such as counting how many days a support ticket was open
- Why generate rows in Prep0:00
- Loading the mock data set0:30
- Setting up the countdown date calculation1:23
- Creating a row number and even-number filter2:36
- Filling rows from a single field5:46
- Filling rows using dates8:53
- Custom start and end dates10:43
- Filling between two date ranges12:39
- Stacking changes within a step15:01
- Real-world support ticket use case16:16
0:00Hey 21.3 is out and we are covering the new
0:02feature in Tableau Prep that allows you to
0:05add
0:05new rows. Now this sounds like a bit of a
0:07strange feature because of course there's
0:10lots of ways of
0:10adding rows to your data you can just go
0:12ahead and add them to the data set. But
0:15typically when
0:15we talk about adding rows in Data Prep what
0:17we're trying to do is trying to create a
0:19data structure
0:20in Tableau or in another analytical
0:22software that we want to use to visualize
0:24something. And so in
0:26today's video I'm going to be showing you a
0:27new feature in Prep that's going to help
0:29with that.
0:29Let's get stuck in. Okay so I'm doing
0:31something different I'm actually doing this
0:33most of this
0:34release on a Windows PC. I'm actually not
0:36on a Windows PC I'm virtualized I'm on a
0:38Mac but I'm
0:39doing it in a Windows PC for reasons I won
0:41't get into. I've got a data set here on my
0:43desktop I'll
0:44actually make this available via the link
0:46in the description or if you're looking on
0:48TableauTim.com
0:49I'll put a link in the blog post that goes
0:51with this video. I'm just going to drag it
0:53into Tableau
0:54Prep here. Now what this is is essentially
0:56some mock data that I created on a site
0:58called Mokaroo
0:59essentially allows you to create mock data
1:01you can play around with. And it's very
1:03simple it's
1:03got a customer id, a date of birth and a
1:05browser identifier that comes from
1:07essentially like
1:09browsing a website. And so if I actually
1:11click on this clean step here you'll see
1:13that you get a
1:13summary of the data set and it looks pretty
1:15good this is what we're going to be working
1:17with. Okay
1:18let me just make this fit the full screen
1:20so now we can sort of get a little bit more
1:21space to work
1:22in and in order to set this up what I'm
1:24going to need to do is create a few
1:26calculations. Now if
1:28you want to jump ahead to see the new
1:29features check out the timestamps and just
1:31jump ahead
1:32but we're going to need to do some setups
1:33to sort of make this use case a little bit
1:35more
1:35realistic. What I'd like to do is I'd like
1:37to add a few days onto the date of birth.
1:39Essentially I
1:39want to sort of create like a some sort of
1:41countdown to the date of birth column in
1:44here.
1:44Okay let's go ahead and create a calculated
1:46fill. What I want to do is do a date add
1:48here
1:48let's just go select that there and what I
1:51want to do is add days so I'll just go here
1:54and type day
1:55and what I will also do is I'll say I want
1:57to add a negative three so essentially I
2:00want to take away
2:00three days from the date of birth and I
2:02want to do that on the date of birth filter
2:04. There we go
2:05that's our additional date of birth so we
2:08'll call this count down to date of birth
2:12this will create
2:14a new date it actually creates a date time
2:16if we hit apply here you can see that it
2:18actually did
2:19create a date just there you can see it's a
2:21date time field we can change that in the
2:23data format
2:23just by hitting save here going down to
2:26this little date icon and selecting date
2:28and now you
2:29can see we've got the two dates that we
2:30want. I'm going to drag it next to the date
2:32of birth column
2:33and we're pretty much there. Now the next
2:35thing I want to do is I want to add a row
2:37number then I
2:38want to only keep the even numbers you'll
2:40see why I'm doing this when we go to fill
2:42the rows a little
2:42later on so let's go ahead and actually
2:44create that let me just click back in here
2:47and create
2:47a calculated field now the row number
2:49feature in Tableau Prep I have to say it's
2:52a it's a little
2:52bit clunky like I just want to add rows it
2:55shouldn't be sort of this hard but if you
2:57just
2:57go ahead here and just type in row number
2:59you'll see that there's a function there
3:01and it has to be
3:02used in a sort of specific way so if I was
3:04just to take this sort of part of it for
3:06example it
3:07wouldn't it wouldn't sort of it wouldn't
3:09quite work so let's just let's just type
3:11this up properly so
3:13um the auto-completion does actually does
3:16actually do this quite nicely and I have to
3:18say there is a
3:19bug with a version of prep and counting
3:21rows within a partition sometimes it doesn
3:23't count
3:24nested that counts properly so I'm hoping I
3:26'm hoping this actually works so what I want
3:28to do
3:29here is take the customer id and that's
3:31pretty much that done and then what I'll do
3:33is I'll do
3:33like a couple of spaces because there's a
3:35little bit of nesting going on here inside
3:36of the
3:37calculation then I want to do an order by
3:39function here and the thing I want to order
3:42by is the date
3:43of birth and I want to do it in ascending
3:47order hit enter and then on a on a new line
3:50what I'm
3:50going to do is I'm just going to put in put
3:53in this row number function so row number
3:55will do
3:55perfectly fine and I think I'm pretty much
3:58done I think the only thing I'm missing
4:00here is I need to
4:01close off one of these and then it should
4:03be valid now in fact I think what I've done
4:05here is I've
4:06I've created a bit of a nightmare for
4:08myself because I've got um okay I think
4:11what I've done
4:12here is I've got something wrong if I
4:13actually look over here I think it's a
4:15missing this colon
4:16and then I think I have one too many there
4:20so uh I think I think I also have one too
4:23many there
4:24okay I think I've sort of managed to fix it
4:25um something went weird there when I was
4:27typing that
4:28out um but anyway there's our row number
4:30calculation let's just go ahead and type
4:32row number
4:33into the field name hit apply and you
4:35should see here at the bottom that we do
4:38get the row number
4:39it's a little bit cramped down here at the
4:40bottom but you can see that the row number
4:42works which is
4:42fine so let's hit save now that I have the
4:45row number what I'd like to do is create
4:47another
4:47calculation when I basically want to see um
4:50every number that's divisible by two
4:52because I want to
4:53keep just the even numbers in my data set
4:55so there's a little trick here you won't
4:56find this
4:57function inside of the reference pane this
4:59actually works in Tableau Desktop too you
5:01can just take the
5:02row number okay and you can do percentage
5:05two essentially that will only return a
5:08boolean if
5:09the number is divisible by two this is
5:12actually just a you know um function that's
5:14actually pretty
5:15common in most databases so let's just call
5:17this even number hit apply and now you'll
5:20see you get
5:20a one and a zero um depending on whether
5:23the row is divisible by two zero if it is
5:25and one if it's
5:26not and so what that allows us to do is we
5:28can save that and we're going to basically
5:31take all the ones
5:32and we're going to get rid of them I'm just
5:34going to exclude them from our data set
5:36okay so we've
5:37just got rid of half of our customers
5:39essentially and that's fine for this
5:40example because I just
5:41want to sort of set this out nicely for us
5:44to start getting into the new feature okay
5:46let's hit this plus icon and the first
5:48thing you'll see straight away is that we
5:51have a new
5:51icon this is a new icon and what's
5:53interesting is that it's actually nice to
5:56see Tableau Prep adding
5:58more functionalities down this list I
6:00really like sort of the visual interface
6:02here I like that
6:03things are very clear and simple and it's
6:05not too cluttered in a toolbar a bit like
6:07all tricks but
6:08nonetheless let's go ahead and select the
6:10new rows option and when we select this
6:12drop down you'll see
6:12that we get a new set of options now this
6:15is split out into two sort of setups you
6:18can either
6:18take the values from one field or take the
6:21values from a range of basically two fields
6:24so what I'll
6:24do first is I'll take the values from one
6:26field and what you get is a drop down
6:28saying okay which
6:29field do you want to base this row fill on
6:32and so the first one I'll use is the even
6:34number
6:35essentially and so what you'll see it do is
6:38you'll see oh okay here um it's just a zero
6:41because I've
6:41chosen the column where basically it was
6:44returning one or zero so there's not sort
6:46of much for it to
6:47go up here the min and the max here are
6:48going to be the same so that's not going to
6:51do us too much
6:51good instead let's go look at row number
6:54and you'll now see the feature starting to
6:56do what
6:56it should do essentially where you want to
6:58be looking is here you kind of get a little
7:00preview
7:00of what's going to happen and you can see
7:03here that this is the new row fill that it
7:05's doing
7:06essentially and notice that because we've
7:08got this setting here where we're going to
7:10update the
7:11existing field it's actually going to be
7:13creating the same column using the same
7:16name and replacing
7:18the existing one if you don't want that if
7:20you just go to create a new field you'll
7:22see that it
7:22actually creates a new row and it calls it
7:24new rows one essentially because it doesn't
7:27know what
7:27else to call it so you can see here that it
7:29's actually filling the rows essentially
7:31looking
7:31down the number and it's figuring out okay
7:34I should see three after number two and it
7:36's creating
7:37the row to fill that what that looks like
7:39in our data set is if we just go down here
7:41to the bottom
7:42you'll see that we also get this little
7:44sort of gaps that sort of appear so you can
7:47see here that
7:48you get nulls okay and again this is
7:50something you can control if you want Table
7:52au to fill that
7:53with data from the previous row then again
7:56you've got this sort of option here at the
7:58very very
7:59bottom so let's go ahead and click null or
8:01zero you can see here there's an option to
8:03copy from
8:03the previous row if I select that then you
8:05don't get any nulls because essentially if
8:08I go back
8:08here you'll see that it this is the row
8:10that it created and it's essentially
8:11filling it with this
8:12one here the bold rows are always the rows
8:14that it's actually creating so if I go back
8:17and I just
8:17null these out you can see the null rows
8:20are the ones in bold those are the rows
8:22that it's creating
8:23the reason they come before the row that it
8:25's sort of copying from is just because of
8:27the sort that's
8:28going on here and I sort of think it's a
8:30strange behavior it should really show it
8:32after all the
8:32time but never mind and we can leave that
8:35there and so yeah here you go you can see
8:37this is working
8:38really really nicely and of course we do
8:40get our new number row here on the right
8:42hand side and we
8:42sort of need that to keep the data
8:44structure sort of working correctly so that
8:46's the the first
8:47example you can just see here that I'm just
8:49calling this sequence essentially just
8:51using a sequence
8:52to count in the right order okay the next
8:55setup is going to be using dates it's
8:56exactly the same
8:57almost like a sequence but this time we're
8:59just going to use a date so let's just go
9:01in here and
9:02say date and what we're going to do is take
9:04values from one field and in this case we
9:07're going to use
9:07a date of birth column and so what tableau
9:10is going to do it's going to look at the
9:11minimum and
9:12the max you can see that it's calling that
9:14here on the left and here on the right and
9:16it's basically
9:17going to try and fill every single date in
9:19between those two dates so you can see that
9:22you get a
9:22before and after exactly the same as before
9:25because we've got this update existing
9:28field
9:28and you can see that this is what it's
9:30going to create over here on the right hand
9:32side
9:32so the other cool thing you can do with
9:34this is you can change the increments let's
9:37say you
9:37don't want to do it in days you can do it
9:39in weeks and also in months and notice that
9:42it will just
9:43add the the beginning of the week as the
9:45date so you can sort of see this column
9:48here changing
9:49if we just count months of course it just
9:51counts up one month and then the next row
9:53is the 0 6 11
9:55so it doesn't need to do anything there
9:57because that is actually in the preceding
9:58month and so on
9:59and so forth so it guarantees that we have
10:01a row of data for every month and it's
10:03basically going
10:04to be adding a month to the most recent row
10:06that it could see essentially so it doesn't
10:08go to the
10:091st of October it goes to the 28th because
10:11essentially it's adding a month onto that
10:13row
10:14I'd love to be able to sort of have it just
10:17you know add the first of the month because
10:20sometimes
10:20you just you just want that as a sort of
10:22basic starting point you don't want some
10:24sort of weird
10:24data point on the 28th just because your
10:26previous row was on the 28th you can fix
10:28that here by just
10:29going and sort of creating a date part for
10:32the month essentially and just using the
10:34first of the
10:34month you could even do that in tableau
10:36very easy it's not a problem but
10:38nonetheless it's nice that
10:39it's here you can choose all these
10:40increments and you can sort of work through
10:42that now there is
10:43another thing to show you you don't have to
10:45go with the min and max dates you can
10:48actually go
10:48in here and type in your own defaults and
10:50the reason this is is because this little
10:52box here
10:53is ticked so if we come tick that from use
10:55minimum and max you'll see that we actually
10:57get a date
10:58picker and we can change these dates around
10:59so let's say you don't want it to start
11:01from the first
11:02let's say you want it to start let's let's
11:04try and find something that's relevant to
11:06this data
11:07data set so you've got 2809 and 1950 what I
11:11'll do is I'll try and grab let's say we
11:14start on
11:14let's do this I don't know if this is an
11:16American date format but let's let's try
11:18and figure this
11:18out I hope it knows that I'm in Europe so I
11:20don't do dates the American way so let's
11:22let's say we
11:23start data in 28 10 1950 so you can see
11:28here that if we go to the start value 28 10
11:321950 it doesn't
11:34seem to be sort of observing that it sort
11:36of seems to be starting from the first row
11:38that it can see
11:39unless of course these dates are incorrect
11:41so let me let me let me try something
11:43different
11:44let's assume well it can't be it can't be
11:46the um tenth of a month it doesn't exist so
11:48this is a
11:49this is a proper date format so it sort of
11:52pays attention to the very first row that
11:55it sees
11:55and then it carries on filling after that I
11:58assume so um it's not it's not doing
12:00exactly what I
12:01thought it would do in in sort of in my
12:03eyes I would have thought that it would
12:05completely if
12:05I say the start value is 28 10 1950 I would
12:09expect it to not start doing row fills from
12:12this date
12:13essentially ignore anything before that day
12:15but I have to say this this could just be a
12:17bug could
12:17just be like a weird sort of coincidence of
12:20what I'm using today but again that's not
12:22doing exactly
12:22what I wanted it to do and but nonetheless
12:25what I expect this to do is to when you hit
12:27the start
12:27value it will only start filling rows from
12:29that day onwards because you might have
12:32some sort of
12:32quirk where you need the data to be filled
12:34from a certain point in time so that's
12:36really really
12:37cool to see it's nice to see that it can do
12:38date now we've got one more format to look
12:40at which is
12:41filling between two ranges so let's go
12:43ahead to the new rows option here and you
12:46can see now we
12:46have a third option so I'll say between
12:49range as an option here I think I spelled
12:52that correctly
12:53so now here what I'll do is select values
12:56ranges from two fields so value ranges from
12:59two fields
12:59I can't talk today and so you can see here
13:01I've got the two dates that I've created
13:03this is why
13:04I was sort of needing two dates because
13:06what I'd like to do is do a countdown from
13:08the countdown
13:09when the countdown starts which is going to
13:11be three days prior to the date of birth to
13:13the date
13:14of birth and so what you'll see here is you
13:15have a slightly different behavior if I
13:17just sort of
13:18call it out here now it shows us the two
13:20columns that we wanted to see in fact let
13:22me do this let
13:22me increase the space here so we can see
13:24this more clearly and now you can see here
13:26we've got three
13:28columns in this generated rows area and you
13:30can see very very clearly if I just change
13:32this color
13:33so what it does is it takes these two dates
13:35and it's going to sort of pad out between
13:37those two
13:38dates and it's going to be three days
13:39because of course I did a date diff and so
13:41essentially the
13:42date of birth is the 28th and so you can
13:44see that that's what it counts to and it
13:46counts from the
13:4725th so you essentially get your four rows
13:49of data this is great let's say a marketing
13:52team comes to
13:52you and say hey we'd like you to put this
13:54data in the format where you can load into
13:56our database
13:57because this company is going to be using
13:59that database to mail out people on a
14:01certain day so
14:02you could go and do this sort of fill rows
14:04mechanism to essentially count out four
14:06days
14:07before a certain point now realistically
14:09that's only useful if you're going to do
14:11something on
14:12every single day but it just gives you
14:14options being able to really control what
14:16this fill rows
14:17function does do you want it to just fill
14:19rows for your data set or fill rows for
14:21everything else now
14:22in this particular use case this is
14:24actually an example where I will ask Table
14:27au to make sure that
14:28it fills the preceding rows with the data
14:30from that record so that when I then load
14:33this I can
14:33see that this is all data relating to the
14:36same record and so what I'll do here is I
14:38'll go to
14:39what value should your new rows have and I
14:41'll say copy from previous row and now you
14:43'll see that it
14:44does indeed fill the previous rows with the
14:47new data from the first row that it was
14:50essentially
14:51filling from so that's pretty much it sort
14:53of three really nice use cases straight out
14:55of the
14:56box for this feature and there's probably
14:57more use cases that I haven't thought of
14:59here but they're
15:01nonetheless really really nice the other
15:03thing I will say about this is that look of
15:05course
15:05if you decide to make any other change in
15:07any one of these steps you of course still
15:09can let's say
15:10I don't need my even numbers column and I
15:12go ahead and remove it those changes still
15:15work so what's
15:16really nice about Tableau prep is that you
15:18can do the change where it makes the most
15:20sense so if you
15:21do this fill date and then you want to
15:23remove something because cosmetically you
15:24created it
15:25there you can just go ahead and do that if
15:27you need to create a calculated field in
15:29this step
15:30you can just go ahead and do that let's
15:32just create a simple text field here called
15:34test
15:34and we'll call it calculation one apply and
15:37save you can see that over here you can see
15:40that these
15:41two changes are actually stored in this
15:43particular step so I really like this idea
15:45that you're not
15:46being constrained you can do the the sort
15:49of data prep step that's required in that
15:52step without
15:53having to sort of string out a bunch of
15:55tools just for you know for a single use
15:57case so that has
15:58this effect of making your workflow a
15:59little bit neater to look at and a little
16:01bit more tidy
16:02but it still keeps everything visually sort
16:03of detailed and explained and obviously
16:05when you
16:05click out like this it keeps everything
16:07nice and nice and clear so that's pretty
16:09much it that's a
16:10that's a really nice sort of quality of
16:12life improvement for Tableau prep I really
16:14like it
16:15this is going to make working with Tableau
16:17data sources a lot easier if you have these
16:19two tools
16:20Tableau prep and Tableau desktop or Tableau
16:22web authoring if you're sort of in the in
16:24the
16:24web authoring mindset then this is going to
16:26be quite cool because you can solve a lot
16:28of
16:28problems especially a lot of problems that
16:30would have typically involved scaffolding
16:33data but then
16:33at the same time with things like the data
16:35model and this capability you can really
16:37really go
16:38really far in terms of making this work out
16:40the best example of this I can think of is
16:43support ticket data where you're trying to
16:45always figure out how many days was the
16:47ticket open for
16:48okay and essentially when a ticket is open
16:51you essentially going to want to fill until
16:53the date
16:54today because the ticket is open and when
16:55the ticket is closed you're going to want
16:57to fill
16:57the date between the days that it's open
16:59and so what you can do with this which is
17:01really really
17:02nice you can feed Tableau prep with all
17:04your support today with all your support
17:07desk data
17:07you'll have a start date and an open start
17:10date and an end date in terms of when the
17:12ticket was
17:12open when it was closed you can fill in
17:14between each and every step when it was
17:17escalated and so
17:18what you can do is you can create this
17:19really cool visualization where you can
17:21sort of show
17:22the cumulative total of open tickets at
17:25different statuses and so on and so forth
17:28what I'll probably
17:28do is I'll have to create a custom data set
17:30just to show you this example and how easy
17:33Tableau prep
17:33makes that kind of question before in Table
17:35au desktop you had to do something really
17:37nasty
17:38which was to union the data on itself and
17:40snapshotting you know the dates and saying
17:43oh this is the creation date this is this
17:45tier one day and so you'd create these sort
17:47of snapshots
17:48union them on top of each other then you'd
17:49be able to do your analysis but on top of
17:51that you'd
17:52also have this complexity that you'd always
17:54have to have the full range of data in your
17:56sort of
17:56window for the running total of open
17:58tickets to work so if you had three years
18:00worth of data you
18:01had to figure out a way of keeping three
18:03years worth of data in view with this you
18:05don't need
18:06to because you essentially have all the
18:08data on any given day sort of built into
18:10the data set so
18:11you can just look on a specific day and ask
18:13yourself how many tickets are open and just
18:15get
18:15the answer really super simple so I've just
18:18talked myself into doing another video on
18:20that particular
18:21use case but nevertheless thanks for
18:23watching and I'll catch you in the next
18:25video and I love that
18:26that little steam popped up as I was doing
18:28this video it's kind of a funny side effect
18:31of doing
18:31it on windows you get all the junk you can
18:33normally sort of isolate elsewhere but
18:35nevertheless
18:36thanks for watching and I'll catch you in
18:37the next video
Generate a set of missing rows in a table with Tableau Prep based on dates, date times or integers. This allows you to fill gaps in data saving you from having to scaffold your data or carry out complex SQL queries or table calculations in Tableau to build your visualisations.Data set used in the video: https://j.mp/3jTMwGf 0:00 Intro1:36 Setup 5:46 How generate rows works6:17 Using it with numerical values. 8:56 Use a single date 12:40 Use two dates and fill between them16:11 A future use case I’ll build on this channel.