Table Extensions for Data Science and more : New in Tableau 2022.3 | Data Science in Tableau
I knew almost nothing about table extensions, so I called in Charles Laporte to hack the feature with me on camera.
- Table extensions require an analytics extension (TabPy, Rserve, etc.) but, unlike script calculations, they run at the point of data connection and return a full table or data frame rather than a single disaggregated array.
- Tableau sends data to R or Python as a dictionary, not a data frame, so in R you must convert it to a list and then a data frame before functions like lm() will work; the broom package's augment() then returns fitted values and residuals as a usable table.
- Table extension data cannot be extracted; it is always live, so if your analytics extension stops running the workbook opens blank and you must reapply the script from the data source page.
- The feature is best treated as an exploratory or proof-of-concept tool, ideal for calling APIs (YouTube, Google search, useless facts) and advanced data prep, rather than for deploying machine learning models in production.
- Table extensions can sit on the logical layer and be related to other tables in a data model, letting you blend API or model output with your existing data in real time.
- Introduction and meeting Charles 0:00
- Analytics extensions versus table extensions 1:28
- Clarifying Tableau's extension types 7:13
- Advanced analytics use cases 8:44
- Penguin linear model walkthrough 9:48
- Setting up the table extension script 13:21
- Building visualisations from the model 24:13
- Calling APIs at point of connection 29:29
- Snowpark and production trade-offs 45:12
- Evolving Tableau developer roles 51:00
- Wrap-up 54:27
0:00Hey, it's Tim here. In today's video, we
0:01are looking at table extensions. And for
0:03this video, we're doing something very,
0:05very different. The truth is, I don't know
0:07much about table extensions. And in fact, I
0:09don't even know anything about analytics
0:11extensions, apart from how to set them up.
0:13So what I did is I sounded the horn to the
0:15Tableau community, and a whole bunch of
0:17people reached out.
0:18Charles Laporte was one of them. And what
0:21he did is he walked me through how to set
0:22up table extensions, including an
0:24analytics extension and how it all works.
0:27We did a bit of a comparison to how they're
0:29different. But then the main thing we also
0:31did is we talked about some use cases. As
0:34ever, everything is timestamped below. So if
0:35you want to skip ahead, because you know
0:37something or you just want to sort of see
0:38the most interesting section of the video,
0:40then by all means use that. We're going to
0:42start off the video with Charles Laporte
0:43introducing himself. But as ever, if you've
0:45got any questions, leave them in the
0:46comments below. I'm sure me or Charles will
0:48probably reply
0:48to you. And yeah, as always, let's get
0:51stuck in.
0:51Okay, so my name is Charles Laporte. I am
0:55working for a company called Biztory. We
0:58are a Tableau partner, and I'm an analytics
1:01consultant. So my day to day job is going
1:04to Tableau and help people get the best out
1:07of their Tableau experience as well. And
1:11that goes from training, but mostly support
1:15and dashboarding.
1:16And I love to work in Tableau. And I love
1:19to work with other programming languages
1:22like Python and R. So table extensions
1:25are really something I was looking forward
1:27to.
1:27Yeah, I guess like you've used, I assume
1:30you've used analytics extensions already,
1:33it might be worth for everyone's benefit,
1:35just explaining the difference between the
1:38way analytics extensions worked before with
1:42calculations and why it's different with
1:44table extensions.
1:46Yeah. So before, when Tableau released
1:53their first analytics extension, you could
2:01already connect to Python using TabPy, or to R using
2:08Rserve, which is actually a library in R
2:12creating this R server where you can push
2:14code and then it comes back. And you had to
2:16use, like you said, a calculation. So if
2:21you're in Tableau, you might remember that
2:23you have the SCRIPT_ something
2:26calculations, which are table calculations
2:29that allow you to send some code towards
2:32an analytics extension, and then it goes
2:34back as data.
2:37The good side of it is that you could do
2:40advanced analytics. So for instance, you
2:43could deploy a very complex model where,
2:46for instance, you analyze data. And I think
2:50Tableau had a very good example with cancer
2:52data. So I think it was breast cancer data.
2:57And you could, through parameter, you could
3:00enter some data, and it would return the
3:04probability of cancer or not.
3:07Right. And that was through a deployed model on
3:10TabPy. The downside of this
3:13is that we are talking about table
3:15calculations. So if you remember correctly,
3:19table calculations need to be computed
3:22following a certain scope and a certain
3:23direction. So they need to run on disaggregated
3:27data. And that's why, for instance,
3:30if I request my model to return the p
3:35value,
3:36it would return the p value for each of my
3:36points in my view, which sometimes is
3:36not the best. Okay. Yeah,
3:36that's the start of the analytics extension.
3:36And then it evolved straight into Tableau
3:36with more calculations. I don't know if you
3:36know about the model calculations that
3:36actually also allow you to kind of
4:06get more information about your model. And
4:09then they released the analytics extension
4:13API.
4:13Right.
4:15Which actually allows you to use almost any
4:18programming language to create models and
4:21advanced analytics.
4:24I got that. So they built TabPy,
4:28and is R part of MATLAB or
4:32is MATLAB completely separate? MATLAB is
4:36completely separate. So R is the
4:37statistical programming language that was
4:39founded a couple of years ago already by a
4:44teacher in New Zealand. And now it has a
4:46pretty big community within the statistical
4:49world.
4:50I don't know if you know a company that
4:52previously was called RStudio. They
4:55recently changed their name to Posit. Well, they
4:59are pretty strong in the R community.
5:03And they actually make the main IDE for R. And
5:06now they are trying to integrate Python
5:08users as well. Because, yeah, I don't
5:11know if you are aware of this Python versus
5:13R fight going on.
5:15Yeah. Yeah, there's always a fight between
5:18them. Someone will say R, Python versus, I
5:21don't know, any other program,
5:24right?
5:24Yeah. If you go into the programmer world,
5:27people will say C sharp is better than
5:29Python. And then Python users will say
5:32Python is better than R, and then R users
5:34will say, Look, I'm using R and Python. So
5:37I don't care.
5:39I think something that's true with both
5:41R and Python is that they do rely on
5:43modules. So these are sort of packages that
5:45have been built specifically for the
5:48languages to make it easy to do certain
5:50things, just using a function, right?
5:53Exactly. So, as you go along with
5:55programming, you realize that at some point,
5:58you are repeating some tasks. And that's
6:00where packages come in handy, because
6:03people just said, Okay, instead of doing
6:06your linear regression over and over, and
6:08then doing all those steps, actually, we can
6:11create one function that just takes your
6:13data frame, and then returns you all the
6:15results.
6:18Right. And actually, there is a rule of
6:20thumb for that, that we could somehow apply
6:22to Tableau as well, like if you're doing
6:24things three times, or more than three
6:26times in your code, it's time to write a
6:28function.
6:30Right. Yeah, that's actually true, right?
6:33We, we, we try and write calculations, like
6:35templates, sometimes we try and think I've
6:38seen a few people write, you know, workbooks
6:40with pre-done calculations, right? And
6:43I think many features in the Tableau ideas
6:45page are gonna have my snippets saved
6:49somewhere, right? So that you don't have to
6:51keep writing the same, if this then that
6:54calculation, right? So yeah,
6:56I would love that. I would love that for
6:59the date calculations, because many times
7:02you get this request: oh, yeah, I want
7:04to have this year to date calculation,
7:06depending on a parameter, and then you
7:08start to mess up your mind around
7:11it.
7:13Now, at this point, it's probably worth
7:15explaining the term extension, because
7:17Tableau has recently started using the term
7:19everywhere. So let's start right from the
7:21top. Anytime you hear the term extension in
7:24Tableau, it simply means a capability that
7:26goes beyond the native capability of Tableau
7:28itself. So you might have heard of
7:30dashboard extensions, they're designed to
7:32help you build functionality that Tableau
7:34don't build themselves and add it to your
7:36dashboard. You might also have heard of
7:38analytics extensions, those are essentially
7:40capabilities that allow you to add another
7:43analytical component, in this case R,
7:45Python, MATLAB or any other analytical
7:47service. And what that does is allow Tableau
7:49to send the calculations to that service.
7:51That's essentially how analytics extensions
7:53work. Now table extensions require
7:56analytics extensions. But the difference
7:59with the table extension, and it changes
8:00the way the data model and the table works,
8:03when you bring a piece of data in it
8:04essentially happens at the point of
8:06connection. So it's an extension because it
8:09changes the way you connect to data allows
8:11you to go beyond what Tableau can already
8:12connect
8:13to, either by calling an API or by
8:15specifically calling a particular
8:16analytical function from an analytics
8:18extension. But the really powerful thing
8:21here is it does it at the point of
8:23connection and can also append it or add it
8:25to a data model for a live connection,
8:27which means it works in real time alongside
8:29your data set. So you can in real time call
8:32APIs as your data changes. And so it's a
8:34really, really powerful capability. I
8:36thought it was just worth clarifying that
8:37before we carry on. So you're absolutely
8:39clear about all the different extension
8:41types we have in Tableau. Okay, let's hop
8:43back in.
8:43Yeah, to me, I think the offering of
8:46advanced analytics in Tableau doesn't
8:48really target the people that will make
8:51machine learning in production. I think we
8:54are reaching people that are doing ad
8:57hoc analysis through Tableau. And
9:00then at some point, they are thinking, Okay,
9:02well, I want to confirm that hypothesis
9:04with a statistical test, which I cannot do
9:07directly in Tableau. So I can get help
9:10from R or Python.
9:13And the good advantage of table extension
9:16is that before, like we said, it would
9:19return from a table calculation. And right
9:23now, you can actually put it back into a
9:26table, a data frame. And you can either
9:29choose to put it back as a new field in
9:32your data or a new table. So that's a great,
9:36great addition.
9:39Good work. Okay, so, um, do you have an
9:42example we can kind of walk through or like
9:45, what are we looking at? What is this?
9:48Well, yeah, this is already a bridge too
9:51far. So I'm going to start clean and create
9:55a new data source. And actually, I'm going
9:59to pick a text file. And in this case, I
10:02have a huge folder with a lot of data sets.
10:06And I'm going to pick my penguin data set.
10:10It's actually something, if you know Biztory,
10:12we love penguins. And I don't know if you
10:17know
10:17that. Well, no, you know, and there are
10:20funny stories about it. So penguins,
10:25penguin data set.
10:26Do you have a mascot?
10:27Yeah, we have a mascot actually for for
10:29that as well. It doesn't show off in the
10:32outside world. But internally, we are very
10:35proud of it.
10:36Inside, inside the lid as a mascot. All
10:38right. Okay.
10:38Yeah. So here we have our penguin data set.
10:43So it's a famous data set. It's actually
10:48very famous in the R community. It's about
10:50Palmer penguins. So a study made about
10:52penguins in the South Pole, sorry. So you
10:58have different species from different
11:00islands. And it's actually all Adélie
11:02penguins. And they collected data about
11:08their
11:08bill and their flippers and their body mass.
11:10And what we want to do here, we want to
11:15create a linear model that could help us to
11:18understand the relationship between the
11:20flipper length and the body mass. Of course,
11:24of course, what you can do is just go into
11:26Tableau, create a scatterplot, where we
11:31have our flipper length and body mass.
11:34And yeah, I need to disaggregate. So there
11:40we go. We have our scatterplot. I can just
11:43put a trend line and that's it. And now we
11:47know, okay, we have
11:50quite a high R squared. So that tells us
11:53that the model that we have built now
11:55with this trend line accounts for 75% of
12:00the variance that we see in the data.
12:02So pretty good. And if you want to look at,
12:07if you want to speak about correlation,
12:09well, you can clearly see that there is a
12:11positive correlation here with this data.
12:14But what if we really want to check the
12:17assumption of this linear model? So I don't
12:21know if you know, but you have actually
12:23four assumptions behind linear regression.
12:28And one that is very important is we shouldn't
12:32see any trends between residuals and fitted
12:36values. So it should be a random dot cloud.
12:45The residuals, so the differences between
12:49your actual points and the fitted points on
12:53the line, that distribution should be
12:58normal. And those kinds of assumptions, you
12:59cannot directly check in Tableau. Of course,
13:00you can make hypotheses. I could easily
13:02calculate the difference between my actual
13:05value and the fitted value of the model
13:08because I have the equation here.
13:10But yeah, that's some extra steps. And that's
13:15where table extensions come in very
13:17handy. So let's go back to our data source
13:21page. And as you may have seen in the
13:23latest version of Tableau, we have table
13:26extensions.
13:28So I'm going to remove my Penguin table and
13:31I'm going to pull up table extensions. And
13:34the first thing I'm going to do is follow
13:37Tableau's instruction. I'm going to drag a
13:41table here. So I'm going to look for my
13:46Penguin file, put it here.
13:53One really important point, set your
13:55analytics connection. So if you go to the
13:59Help menu, Settings and Performance, Manage
14:02Analytics Extension. I already set it up
14:05front.
14:06So through my command line, I open R and I
14:11use this command, Rserve::Rserve().
14:16And now my Rserve is running and I can
14:22test it here. It should return a
14:25successful connection. Yes. So I can save
14:28and close this.
14:29Is that what you've got in the command line?
14:32Is it something you trigger or how did you
14:34start that up?
14:35Yeah. So Rserve is a specific library of R.
14:40So I could have started RStudio. So I have
14:44RStudio here. I could have started
14:46RStudio and loaded the library and then run
14:48the command Rserve. Usually what I do, I
14:51just trigger it through the command line.
14:53You can also, if you want, trigger it in
14:57other ways. I know that I'm not using the
15:01most convenient way, but for what I'm doing
15:05with R. Yeah, exactly. Exactly.
15:07There's no right way of using anything. I
15:09always tell people, you're one of these
15:11people that says use it this way. Like you're
15:14not using it properly.
15:16Exactly. Well, there are some areas where I
15:19can improve that's for sure. But for now,
15:22that's good for me. Okay. Start again.
15:26Table extensions, we drag the penguins
15:29table. And as you see here, we have an
15:32input and output table. So our input table
15:36is our penguin.txt file.
15:38Our output table is now empty because we
15:41need to enter a script. And the script is
15:44kind of the black box right now because we
15:47don't have much information about the
15:50script.
15:50And to be honest, when I started to use the
15:53table extension, I was like, okay, the script
15:56looks a bit like how we need to use an
15:59analytics extension in Tableau Prep.
16:02So I was starting to write a function. So
16:05if you remember correctly in prep, you need
16:07to write two functions, one to actually
16:11transform your data through Python and R
16:13and another one to get the output schema,
16:17for instance, if you want to have another
16:20table in another form.
16:21So that's what I did. And I ran into many
16:24errors. So I was like, Hmm, that's not
16:27correct. So went back to the actually the
16:29new features video of Tableau.
16:31And actually noticed something very
16:33interesting. They were using this .arg,
16:37and that's specifically for R, or
16:40_arg.
16:42And if you remember correctly in the script
16:46calculation, that's how you could call your
16:50field from your data source.
16:53So starting from there, I said, okay, well,
16:57let's just type .arg1 and see what
17:01happens. So if I click on apply, normally it
17:02should trigger something. Yes. Processing
17:05the request.
17:07And you will see that actually Tableau
17:09normally will return my whole penguin table.
17:12So that's one way to interact. So if I go
17:16to output table, now I have my data.
17:19It returns everything. Yeah, it returns
17:23everything. When you set arg one and then
17:24you ran the script, what happened is arg
17:27one is just a variable, I assume, right?
17:30Yeah. So our team president is we are
17:35telling her arg one is my table within my
17:40table extension. So I'm telling her look,
17:46print penguin dot txt. Exactly. That's what
17:52it's happening.
17:53And it's grabbing the data. Exactly.
17:58Processing, doing nothing to it. Exactly.
18:00It's just grabbing. So for instance, for
18:02now, if I'm saying okay, I want to return a
18:06specific column, in R you use the
18:10dollar sign next to your data, and I do
18:14body mass in grams, and I click apply.
18:19Normally it should return only one column.
18:26I never tried it. So.
18:29Okay. Living life on the edge. Okay.
18:33Doesn't work. Okay. And I know why. Because
18:38through the many steps I made towards a
18:40good result, I noticed that actually Tableau
18:44doesn't send the data to R or Python
18:48as a data frame, as a table. It sends
18:51the data as a dictionary. So really, I don't
18:55know if you're familiar with dictionaries
18:59in Python.
19:00Not fully. Okay. Explain it to me, maybe.
19:03Yeah. So basically a dictionary is, and I'm
19:06going to write the syntax here. I don't
19:09know if it's very clear, but.
19:12It takes the name of your column. So let's
19:15imagine we have our species, colon. And
19:19then for that specific
19:23key. Oh, like an array. Yeah. Like an array.
19:26For that specific key, then you have some
19:27values. So then you have Adelie. And then
19:31in this case, it repeats itself for many,
19:35many times.
19:37Because different ways. Yeah. So it's
19:39similar to like a JSON array or something
19:42which uses slightly different notation, but
19:45it's the same sort of structure.
19:47Yeah, it's always the same kind of
19:49structure. So you have a key and then you
19:53have values behind it.
19:55And actually that's kind of tricky because,
19:59uh, in Python, it's quite okay because
20:02usually data is interpreted as a list and
20:05dictionary and list are pretty close to
20:07each other.
20:08So it's easy to work with. In R, they
20:11are actually using a class, uh, or data
20:14type that is not used anymore, called the
20:17pairlist.
20:18So dictionaries in R are not very common
20:22because R most of the time works with
20:26matrices or with data frames.
20:29So I'm going to switch back to my actual,
20:32uh, example where I already did more code
20:36and you see here, uh, that I, so I created
20:42a linear model.
20:44And for that, I'm using the lm function,
20:47which stands for linear model. And if you
20:49look at the data, I actually convert it
20:52first.
20:53So what I do is I first convert my .arg1 to
20:56a list because, yeah, R doesn't interpret it otherwise.
20:59For the lm function,
21:01you need to use a data frame or list. So I
21:03first, uh, converted it to a list.
21:06And then I decided to convert it to a data frame
21:09because, you know, through the steps I made, I
21:11realized that that might be the most
21:14convenient.
21:15And then I, uh, saved everything under my
21:22model.
21:22And from there, uh, I cannot use the model
21:28object directly because it doesn't return a
21:34data frame. It returns, uh, an object of
21:36class model, uh, which is not interpretable
21:37directly in Tableau.
21:37And I'm actually loading another library
21:40called broom and the broom library actually
21:43allows you to return two things.
21:45So you have two main functions of the
21:47augment function, which allows you to
21:49retrieve the two fields that the fields
21:51that you are using in your calculation.
21:53So in this case, I'm trying to predict body
21:57mass, uh, from flipper length and gender or
22:02sex.
22:03And augment will return me a data frame
22:05where I have those three columns plus
22:08the fitted values, plus the residuals.
22:10So really all the information that you can
22:13see here, all the information I could get
22:16from my model, which will be very useful
22:18when I want to build a visualization based
22:22on my model.
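The walkthrough above is done in R with lm() and broom's augment(). As a rough Python analogue, and only as a sketch, the function below fits a straight line with a single predictor by ordinary least squares and returns the input columns plus fitted values and residuals as a dictionary of columns. It is not Charles's script: his model also includes sex as a predictor, and the function name, output column names, and sample values here are assumptions for illustration.

```python
def augment_simple_fit(table, x_name, y_name):
    """Fit y = intercept + slope * x by ordinary least squares and return
    the two input columns plus fitted values and residuals."""
    xs = [float(v) for v in table[x_name]]
    ys = [float(v) for v in table[y_name]]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares estimates for one predictor.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]
    resid = [y - f for y, f in zip(ys, fitted)]
    return {x_name: xs, y_name: ys, "fitted": fitted, "resid": resid}


# Made-up sample rows, shaped like the dictionary a script would receive.
sample = {
    "flipper_length_mm": [181, 186, 195, 210, 217],
    "body_mass_g": [3750, 3800, 3250, 4400, 5200],
}
output = augment_simple_fit(sample, "flipper_length_mm", "body_mass_g")
```

With the fitted values and residuals sitting in ordinary columns, the residual-versus-fitted scatter plot and the residual histogram mentioned earlier become normal Tableau charts.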
22:24Then if we go a little bit down, I added
22:28some more code to retrieve something else.
22:32I will come to that later. And in the end,
22:36I'm actually returning a data frame.
22:40Uh, and that's something I asked Tableau
22:43as well, because it's a bit cryptic to me,
22:45but somehow when you're working with
22:48Python, you need to go back to a
22:50dictionary, but in R sometimes a data
22:52frame works directly.
22:54So it's something I, um, I dropped the
22:57question at Tableau like, okay, what's
23:00going on there? Yeah.
23:02They said documentation was, was coming
23:04soon enough. So I was going to say the
23:06documentation is kind of tough in this, in
23:08this space, isn't it? Cause, uh, you have
23:11to work on assumptions and little snippets
23:14from videos, right?
23:16Yes. Well, to be honest, I think, I think
23:19if you were in the, if you're part of the
23:22data dev community, I think they hosted a
23:25couple of sessions about it, but besides
23:28that, if you read the Tableau help, it's
23:31pretty, pretty empty at the moment.
23:34And I think for people to engage in the
23:37community.
23:38Yeah, I think the best tip would be to
23:40engage with the community and, honestly, uh,
23:42that's what I've been doing the last
23:45couple of days. I've seen some people
23:47asking around like, okay, what, what do I
23:49need to do with this?
23:50Because I want to use it. And I've seen
23:52some examples from Tableau where, um, they
23:54have actually pretty cool examples.
23:57And like, I think in the feature video,
24:00they are actually using a sentiment
24:03analysis for tweets, which is super great
24:07to do.
24:08But if you don't know the syntax of the
24:10script, you're quite blocked.
24:13So if we go back to my example with the penguin
24:20model, so now I have this data frame with,
24:22uh, so if I go to the output, with all my,
24:23uh, statistics around my model.
24:25So I have my data here and then I have my
24:28residuals and I can start actually already
24:33to build some visualization.
24:36So I did this here. So I created this scatter
24:40plot using body mass and flipper length,
24:43and then I created a dual axis with the
24:45fitted values so I could return those, uh,
24:48trend lines.
24:52That's very cool. The value of this is,
24:56yeah, go on.
24:57Uh, yeah. So what I was going to say,
24:59actually at this point, what you see here,
25:02you could already do it with the previous
25:04version of the analytics extension in Tableau.
25:06So using the script calculation, I could
25:10easily return a fitted, uh, column
25:14that I could use to make a scatter
25:17plot and throw a trend line on it.
25:20So now, it's just another way. It's just
25:23instead of having one array of data, I
25:25have actually a table behind it.
25:28Is this faster because it's not having to
25:30do it every time it loads the chart? So it's
25:33pre-computed in the data set. Is that a
25:36performance benefit?
25:38That's a question I'm actually asking
25:41myself every time I'm using it because the
25:45first time I tried it, it was super, super slow,
25:49and I used it through Rserve on my local
25:52machine.
25:53I used it through TabPy on my local machine
25:55and I used it as well with a TabPy instance
25:58that I have on Heroku.
26:00And yeah, well, performance is really
26:03variable. Like sometimes with the Heroku
26:07instance, it was super fast.
26:09Sometimes with the local Rserve, it went
26:12super slow.
26:14The thing I noticed is, uh, and you didn't
26:16see it because I opened the workbook before, is
26:19if your analytics extension is not running,
26:22the workbook will open blank.
26:24And even if you trigger the analytics
26:27extension and you go through the Help menu to
26:31set it up, it will remain blank.
26:33So you really need to go back to the data
26:35source, click apply on the script, and then
26:38the data will appear.
26:40Well, I guess, yeah, that's pretty, pretty
26:43new as well for, for Tableau.
26:46Yeah, they'll probably need to work on that
26:48kind of improvement because you would
26:49expect that behavior to be affected by the
26:51pause and the play button that we have in
26:53the interface at the top, rather than
26:56having to go back to the data connection.
26:58Yes.
26:59Well, also one other important point is
27:01that the data that you retrieve from Python
27:04and R cannot be extracted.
27:06So if you look at my data source here, I
27:08actually cannot use an extract.
27:11So it's always live. So it's really
27:13Tableau sending the data, retrieving the
27:16data.
27:17And if the analytics extension breaks, well,
27:20it breaks the workbook basically.
27:23High stakes.
27:26Exactly.
27:27But yeah, again, I think this is not meant
27:32for, uh, yeah, machine learning in
27:34production.
27:35And although we have seen some great
27:37examples from Tableau in the past on, yeah,
27:39you can actually deploy your model,
27:42I don't think any companies, or I believe no
27:44companies, will ever use that in production
27:47and say, look, we put our model on an
27:51instance and that's what we are using.
27:55But for data and I was going to say for
27:57data prep, I'm wondering, I don't know how
28:00this works with Tableau prep, but for data
28:03prep where you can potentially bake some of
28:06those results in, it might be useful.
28:10But in that case, you're probably doing it
28:13outside of Tableau or in prep or in
28:15whatever other tool you're using anyway.
28:18Yeah. And actually that's one of the use
28:21cases I wanted to put forward. So we have
28:24all the statistical stuff
28:27and machine learning where you can build
28:29your model, but you could also do advanced
28:31data prep through Python and R.
28:34It's super easy through Python to pivot and
28:38you can actually have a better pivot than
28:40the built in pivot of Tableau Desktop.
28:44So things that are possible, but again,
28:47yeah, you need to balance your choice
28:50depending on, yeah, well, am I doing a one?
28:56Am I doing this once or am I going to do
28:59this and launch it into production?
29:02Because then, yeah, I wouldn't do it in
29:06Tableau.
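To make the pivot idea concrete, here is a small sketch of a wide-to-long pivot over the same dictionary-of-columns shape, written in plain Python so it has no dependencies. In practice a library such as pandas would be the usual tool. The function name, its parameters, and the sample values are assumptions for illustration, and the sketch makes no claim about how it compares with Tableau's built-in pivot.

```python
def melt(table, id_columns, value_columns, name_col="measure", value_col="value"):
    """Pivot the chosen columns from wide to long form.

    Each input row produces one output row per value column, carrying the
    id columns along unchanged.
    """
    n_rows = len(next(iter(table.values()))) if table else 0
    out = {col: [] for col in id_columns}
    out[name_col] = []
    out[value_col] = []
    for i in range(n_rows):
        for measure in value_columns:
            for col in id_columns:
                out[col].append(table[col][i])
            out[name_col].append(measure)
            out[value_col].append(table[measure][i])
    return out


# Made-up wide table: one column per measurement.
wide = {
    "species": ["Adelie", "Gentoo"],
    "bill_length_mm": [39.1, 46.1],
    "flipper_length_mm": [181, 217],
}
long_table = melt(wide, ["species"], ["bill_length_mm", "flipper_length_mm"])
```

The long table has one row per species and measurement, which is the shape that usually suits a single measure-names-style field in a view.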
29:07But let's say you want to make a Sankey
29:10chart and you need to union your data with
29:13itself and you don't want to use the data
29:15model for some X or Y reason and you want to
29:18use the analytics extension.
29:20Well, that's a good use case as well.
29:24And while we are speaking about use cases, I
29:28think the one that people are most looking
29:32forward to is actually getting API data.
29:36So if we go back to this, so our script is
29:40not bound to what's in the data model.
29:44So I can actually call something, an object
29:47that is outside of my Tableau environment,
29:51my Tableau workbook.
29:53And one way to show this is let's go back
29:58to this, to this other data source.
30:01And it's actually something I tested out.
30:05Yeah, I was a bit lost in this table
30:07extension.
30:08And at some point I was like, okay, let's
30:10just type iris. iris is one of the built-in
30:13data sets of R, and I clicked apply.
30:16I was like, yeah, you never know. Maybe it
30:18could work.
30:20And actually it works. So you can call
30:25other data sets directly.
30:27Yes. I guess this is an amazing opportunity
30:32for APIs.
30:33Yes. An example I've seen that Tableau
30:37shared with me was calling the YouTube API
30:40to go get statistics from YouTube.
30:44Another one was Google search. So you've
30:46got some terms in your data set and you
30:48want to go search the most commonly
30:50searched related terms for marketing
30:52purposes.
30:53So you can then build a dashboard on which
30:56terms to go and bid for and how much they
30:58cost.
30:59That kind of stuff would be phenomenal
31:01because at the moment there's still not an
31:04easy way to do that.
31:06Right. And this would lend itself, I guess,
31:11to would this be an ad hoc capability?
31:13Wouldn't be something you'd put in
31:14production, but potentially for someone who's
31:16more of a data scientist using a lot of
31:19tools,
31:20this could be a nice way for them to sort
31:22of build a picture to share with people.
31:24Yeah, I think what you can do is quickly
31:28build a proof of concept through this, like
31:31saying, oh, I want to retrieve data from an
31:33API and I want to link it with some data
31:37that we have in our database.
31:39I can quickly do it through Tableau and
31:41through the analytics extension without
31:44starting to create this big Jupyter
31:46notebook where it's kind of iffy to connect
31:49the database.
31:50I don't know if you've tried to connect to a
31:52database through R. I don't like it.
31:55That's why I like Tableau. It's just, oh, I
31:57need to connect to a database. Well, it's
31:59just new data source.
32:01I just pick it in, pick it, pick my DB in
32:04the list and that's it. Well, in R I have
32:07to start to type a string, a connection
32:11string and then username, password.
32:14So to me, this is the best of the two,
32:16especially like in case of you want to
32:18build a quick proof of concept, you want to
32:20prove something quickly by,
32:22oh, I want to link the database or data
32:25sets with my API data. That goes super quick.
32:29And certainly I think API is something that
32:32I hear more and more also from clients, but
32:35also from other people within the community
32:37where they are saying, well, we are missing
32:41this API connector in Tableau.
32:43This connector, yeah.
32:45And of course you can use.
32:46Everyone thinks that connector is special.
32:48Yeah, yeah, exactly.
32:50And you have, of course, the Web Data
32:52Connector 3.0, which allows you now to
32:56actually, so before you had to use a web
32:58server to host the connector; apparently,
33:01now you don't even need the web server.
33:03So I think we are getting closer to this
33:06API connector, but in the meantime, yeah,
33:10we are working.
33:11This is a nice way.
33:12And I don't have an example right now...
33:14actually, I do have one example.
33:17Sorry. So if I go to my Datadev instance, I
33:25made this useless fact dashboard.
33:28So you have a useless fact API, which is a
33:32very simple API that returns a fact, a new
33:36fact every time you refresh it.
33:41And it's very easy to interpret. So with
33:44the minimum Python code I could write, I
33:48could retrieve it, and I decided to make a
33:51dashboard.
33:52I forgot my site, of course.
33:56This happens to me every time.
33:58Yes, yes.
33:59I programmed a shortcut to remember my
34:03online instance URL.
34:06That's the thing. So when you're subscribed
34:09to the Datadev community, you get a Tableau
34:13site on Tableau Cloud.
34:15And I chose this very long name, biztory-charles-laporte-dev.
34:20And then they added this dev with a number
34:24and I cannot remember.
34:27So let's go to my test.
34:30My naming convention is really bad. I'm
34:32sorry for that.
34:34This is called test and then you have sheet.
34:36Everything is a test until it's final,
34:38final, final, final.
34:40Exactly.
34:43And in this case, what I did, I used the
34:45table extension directly on Tableau Cloud
34:47because that's where it was.
34:49It was actually deployed before the release
34:51of 2022.3.
34:53So you could already use the table
34:54extension in Tableau Cloud. And that's what
34:56I did.
34:57And normally a useless fact should appear
35:01in a couple of seconds. Oh, there we go.
35:04Hershey's Kisses are called that because
35:05the machine that makes them looks like it's
35:07kissing the conveyor belt.
35:09Useless fact.
35:11I did not know that.
35:14That's a great fact.
35:18Well, it's a great, it's a great API too.
35:21But actually, I was starting to look into
35:25APIs and some person was sharing, OK, those
35:28are the easiest API you can use.
35:30And this one was like, yeah, it's super
35:31easy.
35:32It just returns the fact as text, a timestamp,
35:35and that's it.
35:36And it's very easy to interpret.
35:38And actually, if we look into the workbook
35:43itself.
35:44So let's use a little bit of web authoring.
35:47Now, if we go to the data source, you will see
35:53that my Python code is ridiculously short.
36:01So if I go into this, you see it's all in one,
36:10two, three... seven lines. So importing the
36:18important libraries: requests to read
36:18API data, pandas to do the JSON parsing.
36:21And basically just, OK, I have this useless
36:23fact.
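The script itself is never shown in the transcript, so the following is only a minimal sketch of what a seven-line script like this might do. Everything specific in it is an assumption: the endpoint URL, the `text` field name in the JSON payload, and the convention that a TabPy table extension script finishes by returning a dictionary of equal-length columns. It also swaps the requests and pandas libraries mentioned above for the standard library's urllib and json, purely to stay self-contained.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen

# Assumed endpoint; the real URL is not shown in the video transcript.
FACT_URL = "https://uselessfacts.jsph.pl/api/v2/facts/random?language=en"


def fetch_fact(url=FACT_URL):
    """Call the API once and return the decoded JSON payload as a dict."""
    with urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def fact_to_table(payload, retrieved_at=None):
    """Shape one payload into columns: each key maps to a list of row values."""
    if retrieved_at is None:
        retrieved_at = datetime.now(timezone.utc).isoformat()
    return {
        "fact": [payload.get("text", "")],  # "text" is an assumed field name
        "retrieved_at": [retrieved_at],
    }


# Inside a table extension script, the final line would be something like:
# return fact_to_table(fetch_fact())
```

Keeping the fetch and the reshaping in separate functions means the reshaping can be checked with a hand-written payload, without the API being reachable.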
36:24And actually, I think it works in my
36:26browser as well.
36:28So if we want to have another useless fact.
36:32There we go. What does it say? Hot water is
36:35heavier than cold water.
36:38You always learn something with this.
36:41And you see this.
36:44I did not know that. Why would it be
36:46heavier?
36:48I have no idea. Now I'm getting into the
36:50science of it.
36:51This is a completely distracting thing.
36:58Yeah, but there's a thing with this API as
36:58a side note.
36:58I'm also a person that can get distracted
37:01by things I read.
37:03And those kind of things.
37:05I'm triggered to look through Google like
37:08why is hot water?
37:10Explain why. Yeah, explain to me why.
37:16Oh, God. So this is quite nice. This is a
37:19Python example.
37:21The things you're importing, do those have
37:24to exist in your TabPy instance?
37:26Or are you just calling libraries from the
37:29Python?
37:30My example here is I use something called
37:33Homebrew on my Mac.
37:34And with Homebrew, you can just call
37:36libraries from wherever and then it imports
37:38them and installs them.
37:39What's going on here?
37:42Well, so I'm actually using my Datadev
37:47TabPy instance on Heroku. So you can
37:50actually deploy it yourself.
37:52I don't know the limitations, actually.
37:56It was quite surprising that they would
37:58allow me to deploy it for free.
38:00But afterwards, I discovered that you could
38:02also have a free Postgres database on Heroku.
38:06That's changed recently.
38:09But basically, that's there. And I think
38:12those libraries were already in the TabPy
38:15instance.
38:17Yes, they're already included.
38:20I think it is possible to add libraries to
38:24it in the settings.
38:25I think there's a settings file where you
38:27can add, you can tell it what libraries to
38:29include.
38:30Well, if we go to the TabPy instance, I
38:40can quickly...
38:40Yeah, of course, I need to enter a password.
38:48I'll blur this out, don't worry.
38:52Whenever I edit videos, I'm paranoid
38:55because the amount of videos I make and the
38:59URLs and logins I share,
39:01I'm going to get doxxed by someone. Whenever
39:04I edit video, I'm religious about blurring
39:07out user names.
39:08Yeah, but you're right to do so because we
39:12live in a crazy world and people like to
39:15hack.
39:17Yeah, like a social engineer even just with
39:21me.
39:22Yeah, and also people like to do that. That's
39:24what we do now with the table extensions.
39:26That's why one of my colleagues said you
39:28are really hacking the feature.
39:30There is no documentation whatsoever, but
39:32you have spent some time just to find out
39:35how to use it.
39:36Just to find where it works. True, true.
39:39I was saying, yeah, if you look at the...
39:42That's a TabPy instance running on Heroku
39:44and you see I don't have any deployed
39:46models.
39:47That's also something you can do.
39:49If you work from a Jupyter notebook for
39:51Python or even through VS Code, what you
39:54can do,
39:55you have a specific function within the TabPy
39:58package that allows you to deploy
40:00something.
40:01Even if the library doesn't exist right
40:03away in your TabPy instance, you can
40:05actually deploy it.
40:07And that is something I never used in the
40:10past.
40:11But basically I could just deploy a
40:14function with my useless fact model to make
40:18it easier for people just here to say,
40:20"Okay, use this." And then it would
40:24retrieve a useless fact anytime I do it.
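As a rough illustration of the deployment idea described here, the sketch below wraps a fact-retrieving function and publishes it with the `Client.deploy` method from TabPy's tools package. The endpoint name `useless_fact`, the server URL you pass in, and the `fetch` callable are all hypothetical; nothing in the video shows this being done, and the speaker says he has never used the feature.

```python
def useless_fact_table(fetch):
    """Return one fact as a dict of columns.

    `fetch` is any callable returning a payload dict; the "text" key
    is an assumed field name, not something confirmed by the video.
    """
    payload = fetch()
    return {"fact": [payload.get("text", "")]}


def deploy_to_tabpy(server_url, fetch):
    """Publish the function as a named TabPy endpoint.

    Requires the tabpy package; the import is done lazily so this
    sketch can be loaded on a machine without TabPy installed.
    """
    from tabpy.tabpy_tools.client import Client

    client = Client(server_url)

    def useless_fact():
        # Zero-argument wrapper so callers only need the endpoint name.
        return useless_fact_table(fetch)

    client.deploy(
        "useless_fact",              # hypothetical endpoint name
        useless_fact,
        "Returns one useless fact as a single-row table",
        override=True,               # replace an existing endpoint of the same name
    )
```

Once deployed, a script running on that TabPy server could presumably call the endpoint by name (for example through `tabpy.query`) instead of pasting the full code into every workbook.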
40:28One important point when you are calling
40:30an API, and you may have noticed it, is that I
40:33don't have data here.
40:35So that's a bit of a weird side effect of it.
40:42And you have the same in Prep: you need
40:44to create a connection to a random file.
40:46Here I picked the Global Superstore
40:49Excel file.
40:50Just to have this, okay, Tableau has a
40:54connection and I can use my table extension.
40:58Okay. And it's probably worth talking a bit
41:03more here because in our example,
41:07you just connected directly to the table
41:10extension.
41:11But am I correct in thinking you can still
41:13use a table extension as part of a data
41:15model?
41:16So you can relate whatever you're computing
41:19in the table extension to a data model,
41:23which then makes it part of your sort of
41:25grander picture.
41:26Yeah, exactly. Exactly.
41:29That's amazing.
41:30If I go back here, I could close this.
41:33So you see my table extension lives on the
41:36logical layer and I could just take my penguin
41:39file
41:39and create a relationship between the two.
41:42Of course, here it will work because I
41:44loaded the iris table.
41:46It's going to try and create like a...
41:48Yeah. Interesting.
41:50So it's super powerful then.
41:53I think this is one of those things where
41:55maybe we're just going to continue to see
41:57people doing things with it.
41:59And every time it will be new because the
42:02effort is really on the organization or the
42:05individual to sort of build their own
42:08solution.
42:09So this is one of those feature sets where
42:11Tableau are giving, I guess, developers the
42:14tools to go and build these very bespoke
42:16solutions
42:17where it doesn't make sense for Tableau to
42:20be building a connector for penguins.txt,
42:23the connector for the Google API, all these
42:25small things.
42:26Yeah. And I think you know how Tableau has
42:31been working on leveraging AI on the
42:35platform.
42:37We have Ask Data, we have Explain Data, but
42:41still those remain data stories.
42:45And now we have Data Guide, which is not
42:47really AI, but helps us to understand the
42:51full story.
42:52And I think this analytics extension still
42:55remains a little part of it.
42:57I hope that this table extension will
43:00actually make it grow a little bit bigger
43:03because the limitations that we had before
43:06with table calculations no longer exist
43:09with this table extension.
43:12Super cool. I'm still waiting for someone
43:17to... I don't know if you're aware, but in
43:18the preview of 2022.4, there's a new data
43:24role called image role.
43:26And with this capability, I'm waiting for
43:30someone to write an API call to Midjourney
43:34to create an image based on a description
43:37in their data set,
43:38and then load that image as an image role
43:41in Tableau. That will be the demo that I
43:43think goes down.
43:45Maybe you can build it. That will be the
43:46demo that will blow people's mind because
43:49that's just one of those sort of full
43:52circle platform sort of showcases, right?
43:55Yeah, yeah, indeed. Yeah, that should be...
44:01I'm thinking about it now. That should be
44:03possible. Something I wanted to do...
44:06It just depends. Yeah, go on, sorry.
44:10One thing I would like to have here is
44:14actually the ability somehow to integrate
44:19parameters because while I was working on
44:22this model,
44:24wouldn't it be great if actually people
44:26would be able to change this formula to
44:29whatever they want based on the parameter
44:32because right now it's pretty fixed.
44:35So if I'm the data scientist working on
44:37this workbook, it's fine. But let's say
44:39someone else wants to work on the workbook.
44:41They need to go to the data source, write
44:44this, run the...
44:47If we could just be in this dashboard I
44:50made and just, okay, I want to change the
44:57formula and not flipper length and sex, but
44:59I want only flipper length and bill length,
45:03for instance.
45:04That could work as well. Yeah. And then it
45:07reruns the model and boom, you've got it.
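To make the wish concrete, here is a small sketch of how a model formula could be composed from a list of chosen predictors. It does not show a Tableau capability: according to the discussion above, parameters could not be passed into the table extension script at the time of recording. The column names (`body_mass_g`, `flipper_length_mm`, `bill_length_mm`, `sex`) are assumed from the Palmer penguins dataset, and `build_formula` is a hypothetical helper.

```python
def build_formula(response, predictors):
    """Compose an R-style model formula such as 'y ~ a + b'."""
    if not predictors:
        raise ValueError("at least one predictor is required")
    return f"{response} ~ " + " + ".join(predictors)


# The fixed formula discussed in the video, then the alternative a user might pick.
current = build_formula("body_mass_g", ["flipper_length_mm", "sex"])
alternative = build_formula("body_mass_g", ["flipper_length_mm", "bill_length_mm"])
```

If a parameter value ever could reach the script, the list of predictors would be the only thing that changes between runs; the rest of the modelling code would stay fixed.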
45:10Yeah. Interesting.
45:12And in terms of... I'm going to sort of go
45:16on a tangent here. Do you work with Snowflake
45:20a lot?
45:21I personally don't, but I did have a couple
45:26of demos of Snowflake.
45:29Yeah. So the area that I'm sort of
45:31interested in is Snowflake also talk a lot
45:34about Python and Snowpark, right?
45:36So the ability to run Python in the Snowflake
45:39platform. And so I wonder, like, is
45:44the benefit of doing it in the database
45:46that it's done outside of your workbook
45:49and then your workbook can just call the
45:51results versus this, which is, I guess,
45:54more, as you say, like the prototype.
45:56You're just trying it out. You're trying to
45:58see the outcome. And then once you have the
45:59outcome, you move it further back into the
46:02stack so that it's done in a more
46:04operational, visible way that other tools
46:07and other analytics tools can also see.
46:09Yeah. To me, and that's why I think it's
46:13also why those analytics extensions remain
46:17quite small in the Tableau world is that at
46:21some point people realize that if you want
46:25to launch a model in production, you cannot
46:28do it directly in Tableau.
46:30You need to go up front in the web. Yeah,
46:34you need to do it before in the data prep
46:39step. And I think integrating Python into
46:41Snowflake is a great addition to that
46:42because it limits the connections towards
46:47other services.
46:49And to be honest, I haven't looked into
46:52detail how they do it. But if it's
46:55integrated properly, then you really have
46:58an advantage.
47:00And I think that's what people are looking
47:03at now: how can I deploy a model and
47:06make sure it's working and have continuous
47:09improvements and continuous developments.
47:12And yeah, using table extensions, Tableau
47:16doesn't allow this iteration, at least not
47:20as easily as it would be in a previous step.
47:24So the ideal solution then would be that
47:26when you connect to data, instead of having
47:29to like, copy and paste your code in or
47:32type your code in, being able to call like
47:34a library of code, I think it's called a
47:37notebook, right? So be able to call a
47:39notebook, where someone has already written
47:43code, it's visible to the organization.
47:45And if that code changes, your code changes
47:48as well. So there's a little bit of version
47:51tracking and capability there as well. So
47:54that would be sort of a fully
47:56integrated sort of way of doing that.
47:59But, you know, it speaks to this idea, I
48:02think we've lost track of this in
48:05the community, that Tableau was always about
48:08exploration, as much as it is about
48:10building the dashboard and the final output.
48:13And this is a feature to me that I think,
48:16yeah, this is like giving you the
48:19tools to go out on an expedition and find
48:21the insight. And once you've found it, you can
48:24think about how to bring that to more
48:26people in a much easier way. But it's
48:29a primarily exploratory toolset, right?
48:32Yeah, for sure. And linked to what you say,
48:35I think Explain Data is also a major
48:38game changer as well. Because
48:41before, when you were
48:45exploring your data, you had to combine things
48:48yourself: oh, okay, I have this dimension,
48:52I want to see if there is a relationship
48:54between this dimension and that measure or
48:56another dimension, and it would take you
48:59quite some time.
49:00While now I can just take this point, click
49:04on the light bulb. And then here it says,
49:07okay, sum of body mass is high. Do you
49:10have an explanation for it? Well, actually,
49:14yeah, in this case, it doesn't really
49:17have a great explanation. But at least you
49:20have leads towards a great explanation or
49:25potential explanation.
49:26Yes, exactly. Which is for me, a big game
49:29changer when you do data exploration. And
49:33in a business context, of course.
49:37Yeah, I think it's, sorry, I think it's also
49:40important to take into account the
49:42knowledge that people have about their data.
49:44Because some people asked me, yeah, but
49:47this is a bit rubbish, because it doesn't
49:50say anything. I said, yes, but if you are
49:53someone with the business knowledge
49:56behind the data, things can become very clear
50:00very quickly.
50:02Yeah. And it's a really important
50:05step. I sort of see this as a
50:07painful step towards this feature getting
50:10better, because it's only by people using
50:13this initially and sort of iterating
50:15through it. You have to build it in public,
50:18you can't build it in private and then say
50:22that, because it doesn't get good that way.
50:23These AI machine learning tools get better
50:26as people use them, and they start
50:28collecting that sort of usage.
50:31And then they start to be able, you know,
50:33three, four years down, I mean, Google
50:35search, you know, we all now think Google's
50:38probably the best search platform, but it
50:40started off pretty average. And then over
50:42time, it's been built in public, and it's
50:44gotten better and better. And yeah, now
50:47it is what it is. The YouTube
50:49algorithm is a similar thing, right? And
50:50the TikTok algorithm, all these things, they're
50:53built in public. So it's amazing. This has
50:56been really, really useful. I think what
50:58this has taught me is that
51:00this is something I've been thinking for a
51:02while, which is like, the Tableau developer
51:05is becoming a really interesting role.
51:08Because on one side, you've got this, what
51:11I consider the exploratory sort of person,
51:14the person who's sent out to go and find
51:16insight, then you've got the now analyst
51:19who just builds maybe ad hoc reports for
51:23their own purpose, so they can talk to
51:25people and tell them what's going on. Then
51:27you've got consumers who are consuming all
51:28of this content,
51:30wherever it's built, whether the
51:31analyst has built it or the developer has
51:33built it. But I think in the past, we just
51:36used to say, you know, everyone who
51:38works in Tableau Desktop can do all of
51:40these things. And actually, this highlights
51:43to me that now there are starting to become
51:45disciplines in Tableau, you know, where if
51:48you're good at, I'm just going to call it
51:51statistical analysis, data science, and you're
51:54familiar with the tools in that space,
51:57then, you know, this is the area to really
52:00focus on, invest your time and become
52:02almost an expert in this. And don't worry
52:04about all the other stuff like dashboarding,
52:06design and stuff, right? Just become good
52:09at this. And so other people will be good
52:11at dashboards, and other people will be
52:13good at, you know, writing calculations.
52:16And the bit that's missing is the
52:18collaboration capability. Like we all can't
52:21work on the same workbook at the same time.
52:23That's, that's the difficult thing, right?
52:25And when Tableau
52:26brings that capability, I think we'll start
52:29to see that sort of happen a bit more,
52:31where someone can come in, build the
52:33connection. And once the connection is
52:35built, the person who really understands
52:37the data or the analyst can put together,
52:39you know, a couple of quick charts, then a
52:42UX designer can come in and make that
52:44interactive and really nice. And everyone
52:46can collaborate at the same time. You know,
52:50I think you're right. I also think that the
52:53way that Tableau is seen, you know, is
52:56pretty much in silos, like, okay, you have
52:59people that have a role of data creation,
53:02data source creation, so the published data
53:04source, and then from there, explorers and
53:07creators can access data source. And they
53:10will focus on either doing exploration or
53:15making dashboards that will be used day to
53:18day.
53:18And in my experience, that's what
53:20I see most of the time, right? The data
53:22exploration part is less used, because
53:25people see Tableau as the dashboard slash
53:28report tool, where before I was doing it in
53:31my pivot table in Excel, now I'm actually
53:34doing it in a dashboard on Tableau.
53:38But again, yeah, it's very siloed, like,
53:40okay, you have dashboard creation, and then
53:43you have data exploration, while the two
53:45could be linked together. And I do tend to
53:47see more and more people going in that way,
53:49like, I need to build a dashboard. But
53:52first, I want to do some exploration in my
53:53data.
53:56And like you said, yeah, roles
53:59are evolving. And I think it's getting
54:01better and better. And through
54:04Salesforce, yeah, there's
54:06more to do. Through Salesforce, it's
54:09getting even more and more to do because
54:11Tableau is really getting integrated into
54:14Salesforce. And yeah, I'm not even into the
54:18Salesforce world right now. But I'm pretty
54:20sure that you've heard of CRM Analytics,
54:24which is also a big game changer.
54:26Okay, so that's it. I've actually decided
54:28to cut the video right here, because we
54:30went on to talk about a bunch of other
54:31things that have nothing to do with table
54:32extensions, and the video was already long
54:34enough. But actually, if you're interested
54:36in watching the rest of the video, it's
54:37about 15 minutes, and it's something that I've
54:38uploaded for our members. So essentially, if
54:41you're a member of the channel, you can
54:42actually see some of the behind-the-scenes
54:44content that I create for the channel, and
54:46stuff I get up to around YouTube and
54:48making content here. So I've uploaded it
54:50for members of the channel so they can
54:52check it out. But anyway, thanks for
54:54watching this rather different video.
54:55If you've got any comments or opinion or
54:57feedback about the video, let me know below.
54:59I'm also going to make a couple more
55:00videos about table extensions. This is by
55:02far not it at all. So there's so many
55:04different options. I spoke to so many
55:06different people about these. And I've also
55:08got a couple of use cases that Tableau sent
55:10through that I'm going to get stuck into
55:12next week once we start to actually
55:13understand how to use this. And once I'm a
55:15little better versed at using the
55:17capability. Thanks for watching, and I'll
55:19catch you in the next video.
Charles Laporte from Biztory took me through how the new table extension capability works, and we spent some time discussing some of the possible use cases that come from the feature! This is longer than normal, so use the timestamps to jump ahead and/or come back and carry on from where you left off.
Join the Tableau developer program here: https://community.tableau.com/s/developers
Follow Charles on Twitter: https://twitter.com/charlaporte
Tableau documentation: https://help.tableau.com/current/online/en-us/tc_table_extensions.htm
Also, check out this video: https://www.youtube.com/watch?v=ATR8WVp9ovo&t=834s
Tableau Release notes: Table Extensions allow you to create new data tables with an analytics extensions script. You can write a custom TabPy or Rserve script and optionally add one or more input tables. Table extensions are supported by Tableau Cloud, Tableau Server, and Tableau Desktop.
Timestamps:
0:00 Intro
0:52 Introductions
1:33 Analytics Extensions and Table Extensions in Tableau
4:22 Context for programming
7:13 What are extensions in Tableau
8:44 Who might use this feature
9:41 Demo of the feature
13:29 Using a Table extension
18:07 Dictionaries vs Tables
20:37 A closer look at the code
24:37 The result and performance
28:32 Discussing other use cases
54:26 More videos to come