Can AI Replace a Data Analyst? With Rahul Trehan on ChatGPT, AutoGPT & Tableau
Can AI replace the analyst? Rahul shows me a world where analysts matter more than ever.
- ChatGPT reads crosstab data as tokens rather than performing visual or true mathematical analysis, which is why numbers can fail to add up at higher 'temperature' settings.
- The system prompt assigns the model a character or role (for example, 'be a data analyst'), shaping the kind of output you get.
- AutoGPT-style SQL agents use toolkits to list tables, scan schemas, generate SQL via the LLM, fire queries against a data lake and self-correct failed queries.
- Large language models are statistical models of language, not maths engines, so expecting reliable calculations or maths from them misunderstands what they are.
- None of these AI extensions or agents are production-ready today; governance, auditing and number verification all remain unsolved work.
- Why AI replaces nothing but empowers analysts (0:00)
- Rahul introduces himself (1:05)
- Why AI took off in recent months (1:27)
- From playing with ChatGPT to building an extension (3:57)
- Demo: the Tableau ChatGPT extension (7:56)
- Temperature settings and accuracy (9:32)
- Understanding the system prompt (11:59)
- LLMs as language not maths models (13:23)
- Discovering AutoGPT and SQL agents (14:41)
- The agent workflow against a data lake (17:42)
- Live coding demo of the SQL agent (23:15)
- Enterprise readiness and governance (33:37)
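The temperature point in the summary can be pictured with a toy example. This is a minimal sketch, not how the extension or the OpenAI API is implemented: the candidate tokens and their scores are made up, and the 0.2 to 2.0 scale is an assumption (the extension in the video shows a slider going up to 100). It only illustrates why a low temperature makes output predictable and a high one makes it more varied.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the next token after "total sales were"
candidates = ["523", "532", "500"]
logits = [2.0, 1.0, 0.5]

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, dict(zip(candidates, (round(p, 3) for p in probs))))
```

At the lowest setting almost all of the probability sits on one token, so the same number comes back every time. At higher settings the alternatives get a real share, which is one way a plausible-looking but wrong figure can appear in a summary.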
0:00Can AI replace the analyst? That is a big
0:04question and everyone in the analytic
0:06space has been learning what AI can do and
0:08this has sort of been the question,
0:10the elephant in the room as it were, since
0:13the emergence of AI tools
0:14specifically ChatGPT. We've also seen lots
0:17of examples of Tableau with ChatGPT
0:19and lots of interesting new cases and I
0:21think it's fair to say that AI is
0:23definitely going to make an appearance in
0:25the product even more than it already
0:27has in the current set of features. But in
0:29this conversation with Rahul Trehan
0:32he actually shows me a world where I think
0:35analysts are going to be even more
0:38important. I'm a big believer that you
0:40should empower analysts rather than
0:41replace what they're doing because
0:43fundamentally they understand the
0:44business and in this demo, in this video, I
0:47think he shows a perfect use case that
0:49actually speaks to that solution. One more
0:52thing to mention, Rahul works for
0:54Salesforce, he'll introduce himself shortly,
0:56but when we had a chat we were
0:58just talking to each other as two
0:59individuals. He was Rahul and I was
1:01Tableau Tim. It's as simple as that. As
1:03ever, let's get started.
1:04Hey everybody, this is Rahul Trehan here. I
1:07have been with Tableau for the last eight years.
1:09I have done a couple of roles. I've been
1:13with solution engineering, I've been in
1:14customer success myself and now I'm
1:17managing a customer success team here
1:19in ASEAN. I'm based in Singapore. Yeah, and
1:23I look forward to this conversation.
1:26This is amazing. Thank you so much for sort
1:28of agreeing to talk to me. I think about
1:32three, four months ago AI was not part of
1:34everyone's everyday conversation and
1:37then suddenly out of nowhere it's just
1:39completely exploded and it's been
1:42an interesting journey. I'd sort of love to
1:43take your perspective on that
1:45transition before we get into what we're
1:47going to talk about. So are we looking at
1:49why is AI catching everybody's attention?
1:52Yeah, yeah. Why in the last four
1:54months and not because you know GPT and all
1:57these technologies have been around
1:59for some time but why in the last four
2:00months is it really sort of just
2:02taken off? And that's a good point, right?
2:04Because you know the AI models and
2:06machine learning models they've existed,
2:07right? I mean we've had some of these
2:09models around neural networks and other
2:11models around statistical models
2:13and predictive models around for a while
2:15but I think it's the first time
2:17to me it's sort of like a tipping point,
2:19right? It's like I mean the way I think
2:21about it is and I talked to a few folks
2:23about this it's like a browser
2:25being launched back in, the Firefox browser
2:28being launched, right? Yeah. I mean
2:30it's that tipping point that it makes it so
2:32relatable to our experiences. Now
2:35we're looking at a browser like experience
2:39where we
2:40sort of relate to the technology now, right
2:42? I mean nobody related to neural
2:43networks, nobody related to a predictive
2:47Bayesian neural network. Yeah.
2:49I mean that's too far away from
2:52where we are but I
2:53think now with ChatGPT I think we for the
2:56very first time can really relate to
2:58it. Yeah I think they've potentially
3:01brought an interface to the
3:03technology, right? They've really made it
3:05accessible to a lot more people
3:07because I think the only sort of
3:09recollection of AI I have is in Gmail
3:11when you go and type the name and it
3:13guesses the email or it guesses the next
3:16word or sentence whereas you know chat GPT
3:19is going a whole bunch further
3:20you're giving it a question and it's just
3:22coming back with a whole very
3:24colorful response as well sometimes that's
3:26what sort of the delight
3:28about it isn't it? Exactly and you don't
3:29have to worry about what neural network
3:31is there, what predictive model is there.
3:34I'm a human, I have a
3:35question, go search whatever information you
3:38need to search, run
3:39through whatever information you need to
3:41transform it and it gets you something
3:42that we understand and we relate to which
3:45is why I think it has really caught
3:47everybody's attention. Good. It's
3:50fascinating and so let's sort of
3:53talk about sort of what you're gonna show
3:55in a minute but how did you
3:57go from you know what you're doing day to
3:59day to sort of building what you
4:00built like what excited you what sort of
4:03what was the spark moment in your
4:05sort of journey? So I was playing with chat
4:08GPT and of course just like
4:10anybody else I was looking at you know
4:11explain this to me explain that to me
4:13explain some code to me and I was you know
4:16doing all those those exploration
4:18with it and then I was like you know what
4:20happens if we really feed some
4:22cross tab data into it I was just curious I
4:24was like you know right now
4:26I'm feeding whatever the language I'm
4:28feeding to it and it of course goes
4:30and searches and explains stuff to me and
4:32it helped me to code and stuff like
4:34that which was great but then I was just
4:36curious to say like you know what
4:37happens if I just feed it a cross tab and I
4:39say like hey look at this cross
4:41tab and tell me what do you see summarize
4:43this for me right I mean
4:45instead of summarizing paragraph summarize
4:47this cross tab and tell me
4:49what are your insights and that was the
4:50curiosity that I had around sort of the
4:53you know the tab GPT extension and by the
4:55extension that I built I mean the
4:57early thought process was like I was just
4:59curious to see like what happens
5:00when you feed it data instead of words you
5:03've created some data and how it
5:05reacts it comes back yeah and so yeah like
5:09I guess so you had that idea what's
5:11the next step like how does someone go from
5:14I think we've all been messing
5:15around with chat GPT how do you go from
5:17that to you know heads down coding
5:19something and putting it out yeah so
5:20Initially what I was doing is I
5:22took the underlying data from the viz and
5:25I made those prompts and I was
5:27speaking with that, right, manually. Of course
5:28there was no coding, there was no
5:29extension, there were no Tableau APIs
5:32involved. I was just manually taking some
5:34scenarios. I would take the underlying data
5:36for a viz, I would see what kind of
5:38prompts I can provide. I would say like, hey,
5:40here is a crosstab, you know, be
5:43a data analyst, so I'll provide it like what
5:45I wanted to do and then I would say
5:47like summarize it for me or find some
5:49insights tell me why the sales is low
5:51and I was just seeing like you know how it
5:53behaves for that cross tab and you
5:55know what prompts are working well and and
5:57and sort of you know what
5:58temperature settings right I mean we talked
6:00about it we're giving it more
6:02colorful insights versus more boring but
6:04predictive insights so I got a hang of
6:07it from that manual playing around then I
6:08was like okay you know can I automate
6:11what I'm doing today manually which is take
6:13the data from a viz, I copy it, I
6:15click right click I say look at the
6:17underlying data then I copy it and then
6:20I paste it I formulate that prompt and I
6:23press enter button and my next question
6:24was could we actually just automate the
6:27whole thing so that a user doesn't have
6:28to do all these steps manually which is
6:31that so you kind of yeah you kind of
6:33broke it down into the constituent steps
6:35and you almost let's say bootstrapped it
6:37right you kind of you just manually tried
6:39to see if the workflow works right
6:41and and then okay what is each step in the
6:43workflow and then what does it take
6:45from an automation standpoint and the API
6:47is that exists in tableau to
6:49basically then code it all the way yeah
6:52yeah and the one one thing I want to
6:54pick up on is you keep saying you you're
6:56taking the crosstab data that's
6:58important isn't it, because you know ChatGPT
7:00can't do visual analytics like we do
7:02with pre-attentive attributes it has to
7:05actually look at the text and the numbers
7:07Right? Correct, correct. So what it does is
7:09that it looks at the crosstab and the words
7:11one token at a time, right? So each number
7:13becomes a token and then it
7:15basically looks at all those numbers
7:17individually and it also looks at them as a
7:19whole, right, gives them some importance
7:22and then runs through whatever
7:23your prompt is, then says like, what
7:26does the model mean for it, basically.
7:28Yeah, so it doesn't really look at numbers as
7:30numbers, it just looks at them as
7:31tokens. Yeah, exactly, exactly. And it does
7:35real data analysis right it's kind of
7:37it's the raw thing that we you know tableau
7:39exists for actually it's going
7:40back to the basics and processing the
7:42numbers and trying to sort of sort of
7:45pull out you know pertinent things that we
7:47should be looking at so and at this
7:50point is it useful to have a look at and
7:52maybe do a screen share and you can kind
7:54of walk us through sort of how it works
7:55just in case someone hasn't seen and
7:57then we can get more into it okay yeah so
7:59let's say you just have a simple
8:00dashboard I've got a line series here for
8:03superstore you know I'm looking at
8:04monthly sales by different years now what I
8:06wanted to do is I wanted to sort
8:08of you know bring in the extension to to
8:11talk about or interact with this
8:14along with ChatGPT. Once you
8:16download the extension from the AWS link
8:20that I've hosted it on it basically becomes
8:22your local extension right so you
8:24go left inside and you know this extension
8:28is is on your laptop downloads
8:32the code onto your laptop and it starts
8:34basically running it from your your sort
8:36of extension window itself right so yeah
8:39and JavaScript code that runs inside
8:42this article when they're gonna download it
8:45right so it as you would see it
8:47pre-populates some of the things right so
8:50it pre-populates the API URL that
8:52we're using. Yeah. What it also means is that
8:54in future, imagine if you don't want
8:57to use the, you know, public OpenAI
9:00API and you may have your own
9:01commercial private one, right? You could
9:03easily update the URL to point it to
9:07your private, you know, sort of LLM,
9:10large language model, API. Yeah. And
9:13then basically provide in this case of
9:15course since we're using open AI it's
9:17pre-populated with GPT 3.5 yeah when GPT 4
9:20comes in you could just I could just
9:22auto-update this use that yeah you can just
9:24use that model and you put in your
9:26own key right so once you use that key
9:29basically you're good to go and you see
9:32there's a there's this temperature setting
9:34which is quite interesting and
9:35I had some yeah Andy about this as well and
9:40basically saying like you know how
9:42much predictable you want the model to be
9:44what's this on the table right so yeah
9:46what it does is that if you make it very
9:48predictable the numbers become
9:50consistent but the insights are really not
9:53that rich anymore right the
9:54constraint yeah it stops adding and
9:56multiplying things and it just gives you
9:58a summary of whatever the cross-tipers
10:01where does if you make it creative like
10:02a hundred well the insights become really
10:05rich but then the numbers are not
10:07adding up sometimes and we'll talk about
10:09you know how the space is evolving and
10:11and I've just shared it with the latest
10:14space posts around the next phase of the
10:17agents and and how that would improve in
10:19terms of number accuracy but going back
10:21to it you save the config once you're done
10:23with it basically you know you're
10:26good to go you can forget it you select
10:28your sheet in the dashboard that you
10:30want to analyze and in this case what I did
10:32is I said hey take the values off
10:34you know you type whatever you want to type
10:36right so this was a popular portion
10:39of what's coming through you see like you
10:42know people wanted to understand how
10:44to make chat GPT not only you know explain
10:48the forecast model in Python but
10:51actually run it so what I did is you know
10:54you paste whatever you want to post
10:55so I say take the values of the sales from
10:57this previous years from this
10:59time run a forecast model and give me the
11:02output of predicted forecast sales
11:03values for 2023 don't have here and now you
11:07see that you know once you once you
11:08press that enter button it sends the data
11:11from the crosstab along with the
11:14prompt that you provided right right is
11:16your prompt yeah and then it says so
11:20what what what we're doing is we're saying
11:22hey chat GPT be a data analyst
11:24I'm gonna provide you a prompt right take
11:27the prompt into consideration and
11:29then I'm gonna provide you the behind the
11:31scenes the extension then gives
11:32ChatGPT the crosstab, which is, you know,
11:35basically if you look at underlying data
11:37yeah it is yeah this this stuff okay stop
11:42right so we basically just take
11:44this crosstab along with the
11:46prompt that you provided and one more,
11:48a system prompt, as they call it, a
11:50system prompt which yeah yeah which
11:53tells it's basically what role is it
11:54playing and the role we're asking it to
11:56play is to be a data analyst yeah yeah so
11:59let's just let's pause on that point
12:01because I think not too many people are
12:03familiar with the concept of the system
12:05prompt the system prompt is a is a is like
12:07a way of basically yeah yeah giving
12:10it a character right giving us a sort of
12:13behavior that is reflective of the kind
12:16of outputs you want so in this case it's a
12:18data analyst but an alternative could
12:20be you know I'm a decision maker in a
12:23business or you know there's lots of
12:25little roles and they can kind of play into
12:27that yeah correct system yes as you
12:29just you know described it perfectly right
12:32system prompt is saying like what
12:34what role do you want chat GPT to play
12:36right it can take yeah of a translator
12:39could be a data data analyst it could be a
12:41travel specialist it be a trial yeah
12:44right doesn't exactly exactly it's you're
12:46giving a character and I think
12:48that is that is super interesting because
12:50we've touched on two things you
12:51mentioned earlier that you know this is
12:53using chat GPT the configuration allows
12:56you to put another one I think just last
12:58week Databricks announced something
13:00called Dolly, right, which is like
13:02another large language model you can train
13:04internally but it's I fully see a future
13:06where companies might train their own
13:08LLM's internally in-house and they'll hire
13:11people to do that to do very
13:13specific things that could become a
13:14competitive advantage amongst
13:16businesses but also now I just want to
13:18interject in here before we go on to the
13:20next one because it gets exciting one thing
13:23I've just sort of completely
13:25forgotten about technologies like chat GPT
13:27is exactly like the naming of what
13:30they are the way they're classified they're
13:32classified as large language
13:33models they are essentially statistical
13:37models built around language
13:39specifically language what they're not is
13:42mathematical models they're not
13:43compute models all of these different
13:46capabilities do not sit squarely in the
13:49in the place of language models and so what
13:52I'm trying to say here is that I
13:53think maybe this is obvious to other people
13:56but for me you know when you try
13:58and do maths inside of ChatGPT you
14:00shouldn't expect it to get it right
14:02because fundamentally it's not about maths,
14:05it's about language and language is
14:07about speech so if you ask what is one plus
14:10one more commonly you should find
14:12the phrase equals two but in some cases and
14:16you know if you happen to train the
14:19model on a data set that was completely
14:21made up of jargon you will find
14:24instances in language where people have got
14:25the answer wrong and therefore it's
14:27also possible that one plus one does not
14:30equal two and all the other variants
14:32that come about anyway that's just a small
14:34thing I wanted to add in here
14:35before we get on to this next point because
14:37I think it's super important for
14:38setting the context let's get back into it
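The "one plus one" point can be sketched with a toy next-word counter. This is only an illustration of the statistical idea, not how ChatGPT is built: the four-line corpus is invented, and a real model learns from vastly more text with a neural network rather than simple counts.

```python
from collections import Counter

# Hypothetical training text: most lines get the sum right, one does not.
corpus = [
    "one plus one equals two",
    "one plus one equals two",
    "one plus one equals two",
    "one plus one equals three",
]

def next_word_distribution(prefix, lines):
    """Count which word follows `prefix` in the corpus and normalise to probabilities."""
    prefix_words = prefix.split()
    n = len(prefix_words)
    counts = Counter()
    for line in lines:
        words = line.split()
        if words[:n] == prefix_words and len(words) > n:
            counts[words[n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}

dist = next_word_distribution("one plus one equals", corpus)
print(dist)
```

The most likely continuation is "two" because that is what the text usually says, not because anything was added up. The wrong answer keeps a non-zero probability, which is the sense in which a language model can get arithmetic wrong.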
14:41you know chat GPT is a very
14:42generalized language model right you could
14:45actually purposely train a
14:46language model to do data analysis in very
14:49data analytical things right
14:51correct and which is where I think my
14:52latest post just went today because I
14:54was playing and I saw Andy also responded
14:57back to me saying like, yeah,
14:58this is it made him really happy because
15:01the challenge with this extension was
15:03that I mean and he did the right thing you
15:06know with the temperature setting
15:07being you know 80 or whatever you would
15:09find that the numbers do not add up
15:11sometimes there's a lot of inconsistency
15:13there yeah so what I was looking into
15:15it what kind of you know specialized agents
15:18are being built right so I yeah I
15:20uncovered this thing called AutoGPT, which
15:23is becoming, you know, a hype now.
15:25But what I discovered is that
15:27you can actually have agents which are
15:29specialized with toolkits, right? And if you
15:32look at the latest demo that
15:35I posted, basically I used a SQL
15:39agent that uses the ChatGPT 3.5 Turbo model,
15:43but what it does is that you know what we
15:46can maybe double click into it but
15:48then yeah what it does is that it it gets
15:50rid of the problem because now you
15:52have a specialized application on top of
15:54chat GPT model but yeah should we should
15:56we have a look at some of the more recent
15:59stuff you've done with with auto
16:00GPT then and I think that people will be
16:02super excited about that because it's
16:04also interesting how quickly you've evolved
16:07what you started with right and
16:08you've sort of taken feedback you Andy's I
16:11was gonna talk about Andy cockroof
16:13videos you know showcasing the fact that it
16:15couldn't do some to base him out so
16:17this is almost a perfect response to that
16:19in a way because there's a problem
16:21and you're basically showcasing how quickly
16:24the technology is adapting to
16:26solve these emerging problems right I just
16:27got a ping from Andy on LinkedIn
16:29and he's happy now looking at whatever
16:31latest demo is I think he's happy
16:33because the numbers are adding up it's hard
16:36for sequels in action and I have to
16:39say like I think Andy gave me that that you
16:42know that push to say like yeah why
16:44are the numbers not adding up so that
16:47really triggered me to see, like, what is
16:49this forefront thing, what is this AutoGPT
16:51agent thing, what is this, yeah,
16:52LangChain thing. So I started exploring
16:55from that conversation with based on
16:56what Andy provoked and then yeah that's
16:59that's and that's what this is about
17:00right because I loved his video as well I
17:02put a link to the video for everyone
17:03watching but yeah I that that really bug
17:06like really confused me that it couldn't
17:09do something so basic and yet here it was
17:11doing all this stuff in Python I've
17:13seen websites I've seen codes I've seen
17:16apps how is it doing all of that and yet
17:18it can't add up basically it doesn't add up
17:25numbers you know where it's it's
17:28failing adding of numbers quite
17:29consistently unless you drop that
17:31setting to zero and then it stops adding
17:32the numbers. So it's like, right, then it
17:34doesn't do adding and multiplication anymore
17:36and it just gives you more
17:38general stuff. Yeah, yeah. Good. Yeah, let me
17:42talk about the latest one and I think
17:44this will excite people into like just how
17:46quickly the space is evolving and
17:48and I think more around not just now
17:50looking at I think analytics output but
17:53also the workflow angle of a let me show
17:56yeah yeah yeah do some setting the scene
18:01for what this version is all about right
18:04Yeah, I mean, it's AutoGPT-style
18:07agents. It's basically specialized sort of
18:11agents that can work with the LLM yeah
18:14to do things right so we're just talking
18:16about this right today the workflow is
18:18that you've got a data lake you know every
18:20time a user asks a set of business
18:22questions and scenarios a lot of heavy
18:25lifting needs to happen in the analytics
18:27layer right your metadata model the data
18:29you add all those metadata you're gonna
18:31do that prep and a lot of heavy lifting is
18:33required there right but now I mean
18:36and I'm going to show you guys the demo as
18:38well, is that, you know, if I can attach
18:41a SQL agent to work
18:44with, let's say in this case,
18:47ChatGPT, the user's questions are translated into
18:51SQL queries, which the SQL agent
18:54can then fire against the data lake and
18:56get the user the answers back,
18:58effectively doing all that modeling and SQL
19:02joins on the fly every single time
19:04a query is fired, basically. Now what are these
19:07AI agents capable of? So they have
19:09toolkits, right? So if I talk about the
19:11SQL agent that I'm going to show
19:12you later, it's a LangChain
19:15SQL agent and it has, yeah, certain
19:17toolkits available to it, and
19:19those toolkits basically allow it to
19:21list the tables in your data lake. Yeah.
19:25It allows it to scan the schema of
19:27the relevant tables based on the question
19:29that was fired. Yeah. And then it allows
19:31it to generate the SQL, for which it uses
19:34ChatGPT to get the SQL back. Yeah,
19:37it asks ChatGPT to give it the SQL
19:39back, and the agent can then
19:40fire the query against the data lake and
19:43then give the observations back
19:45to the user, basically. Right, right. So these
19:48are the tools in the toolkit that are available
19:50to the agent. That's quite a powerful list
19:53of tools. Like, on their own
19:56they're kind of very basic things, but
19:58actually together they amplify each
19:59other in a huge way, right? Correct, correct.
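The tools described above (list the tables, scan a schema, fire a query) can be sketched with Python's built-in sqlite3 module. This is a rough illustration under stated assumptions, not the LangChain toolkit itself: the function names `list_tables`, `table_schema` and `run_query` are hypothetical, the two-table schema is invented, and the step where the LLM writes the SQL is left out.

```python
import sqlite3

def list_tables(conn):
    """Tool 1: list the tables the agent could query."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

def table_schema(conn, table):
    """Tool 2: return (column, declared type) pairs for one table."""
    if table not in list_tables(conn):
        raise ValueError(f"unknown table: {table}")
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]

def run_query(conn, sql):
    """Tool 3: fire a query and hand the rows back as the observation."""
    return conn.execute(sql).fetchall()

# Hypothetical two-table database standing in for the data lake.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
    INSERT INTO Artist VALUES (1, 'Example Artist');
    INSERT INTO Album VALUES (1, 'Example Album', 1);
""")

print(list_tables(conn))
print(table_schema(conn, "Album"))
print(run_query(conn, "SELECT Title FROM Album"))
```

Each function is trivial on its own. The agent's value comes from deciding which one to call next and passing the result of one step into the prompt for the next.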
20:02absolutely and then in scenario
20:04that you know I was playing around with and
20:06I'll show you guys the coding
20:07element of it and it's not very polished it
20:10doesn't have a UI but just to show
20:12the whole workflow end-to-end and how it's
20:13working in this case we have a
20:15music album sales database schema, right?
20:18It's running on my laptop, just using a
20:20SQLite database. It's something very
20:22lightweight but the concept remains the
20:24same you've got music album sales you've
20:26got the track tables you've got the you
20:28know artists table you've got playlists and
20:30invoices for those sales of those
20:33albums and which customers are buying it
20:35what is the invoice line items and you
20:37know very typical I would say multi table
20:41data lake in which you have
20:43primary and foreign key relationships
20:44between different tables right yeah very
20:46typical right this could translate to any
20:49of the other database schemas that
20:52exist in an enterprise whether it's sales
20:54whether it's you know HR or supply
20:56chain or whatever right now the scenario
21:00here is the user is asking a
21:02question saying like, hey, list the total
21:04sales per country, right? Yeah. And it's also
21:06saying, hey, which are the top five artists
21:08by sales? So it's a multi-part
21:10question, right? It has two parts to the
21:12question yeah in a traditional world Tim
21:16how would you answer this question this is
21:20Okay, so if I think of Tableau, like,
21:22go get the country field, drop sales in and
21:25normally sort it in descending order,
21:27and then in another sheet I go, yeah, top
21:30five artists by sales. Yeah, artists,
21:33sales, sort in descending order, keep the top
21:36five using a set or something like
21:38that but would you how would you figure out
21:40which tables to join oh good good
21:42question interesting interesting ah yeah so
21:48honestly speaking I'd have to spend some
21:50time looking at this this model to try
21:52and figure out what's the lightest touch
21:54way that I could answer the question and
21:56what sort of what are the minimum minimum
22:00viable sort of things I need
22:02because the thing that always catches me
22:04out with models is the granularity of
22:06what you're working with and anticipating
22:09what you're going to need
22:10to do in the future so for that one it was
22:13artists it was country artist and
22:15track right so at the very basic we have to
22:18go down to the country artist level
22:21and the track level that's sort of to me
22:23the lowest level of granularity and
22:24everything aggregates from from up there
22:27and so that's I've got tracks there how
22:29do you get from artists? Artists, you have to
22:32go through albums, invoice and, yeah,
22:34you start to sort of you start to weave
22:36your way through this model and then you
22:38bring everything in a complicated scheme
22:42now there's a lot of foreign key
22:44relationships and joining you'll create
22:48that model and then start pretty much
22:50it but then as you said a lot of time is
22:52spent by an analyst just like what you
22:54did right now to understand the schema and
22:56how to join it and before you answer
22:58the question Tim are you ready to to be
23:01blown away by I am I am I'm going to
23:06show you a little intimidating coding world
23:11what I what I want to show you is
23:15what I did here and let me just walk
23:17through a little bit of you know what
23:19what this is doing. So we've imported a
23:22SQL agent in this case and it has a
23:26SQL toolkit. So what I'm creating
23:28here is, I'm saying here's a SQL agent
23:31and we'll provide it a toolkit which
23:32basically has all those things we talked
23:34about, right? Ability to scan tables, look at
23:36schemas, look at columns. And you see
23:40that we provide the OpenAI API to this agent
23:45as well, right? So the agent has the
23:47SQL toolkit and it has the ability to
23:49talk to OpenAI's ChatGPT. Now here is the
23:54input right I mean this could be a fancy UI
23:56in future for this particular raw
23:58prototype I'm just providing the input
24:00saying like, list the total sales, the
24:02same query, right, and what are the top five
24:04artists by sales. And all I gotta do
24:07is go here and let's clear this so you guys
24:12can take a look and I'll say NP I
24:16mean just think of it as a button and I'll
24:18say yeah yeah run the analyst agent
24:20on this on this particular data set yeah
24:24and the agent activates it start taking
24:27your input it starts talking to scanning
24:29the data data lake it starts talking to
24:31the the chat GPT and let's wait for it for
24:34for like a couple of seconds you
24:36guys will see real time it making calls to
24:38what it's what it's doing yeah the
24:40chat GPT and scanning the database and
24:42stuff I'll give it another couple of
24:45seconds because it's going over the
24:46internet yeah I'm probably hogging your
24:49bandwidth with the recording but while we
24:53wait for right yeah here you go
24:56it comes back right now let's take a look
24:59what it did so what it did is that
25:04first of all it gave you the output the top
25:06five artists that you're looking for
25:08sales. Top sales are Iron Maiden, Metallica, Led
25:11Zeppelin and Lost. Okay, but this is
25:13not where the interesting stuff is look at
25:15all the intermediate intermediate
25:16steps the actions we talked about yeah in
25:19action. It said list the SQL tables,
25:21so it performed this listing of SQL
25:24tables on the database, the SQLite
25:28running on my laptop. The observation that
25:31it had is this data lake has all
25:33these these tables right that's the first
25:36step what is the next thing it did
25:38it took next action from the toolkit it
25:41says I should look at the schema of the
25:44relevant tables to see what
25:46columns I can use. Right, right. I mean,
25:50that sounds really scary it basically then
25:57looked at the schema of the
26:00relevant tables. In this case
26:02it identified invoice and invoice
26:04line as the relevant tables for the first
26:06part of the question, which is, you
26:08know, what are the total sales, which country, right?
26:10yeah and then what it did is
26:12generate and this is where I think the
26:14magic happens it now takes this list of
26:18tables and the column names of the schema
26:20information and it gives it to the
26:23LLM model and says can you generate the
26:25query based on the users prompt yeah and
26:29then give me that query back that is nuts
26:32Right. So look at the
26:35query generated. It multiplied the unit
26:36price by quantity, that becomes total
26:38sales. Nuts. It does the join between invoice
26:41and invoice line item, right,
26:42based on the invoice ID, and then it's
26:45sorted in descending order of total
26:47sales. And here are the results: USA total
26:50sales is 523, Canada, and so on and so on. Yeah.
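A query of the shape just described (unit price times quantity, joined on the invoice ID, sorted descending) can be reproduced in a few lines. This is a sketch under assumptions: the column names such as `BillingCountry` are guesses rather than read off the screen in the video, and the rows are made-up values chosen so the totals are easy to check by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, BillingCountry TEXT);
    CREATE TABLE InvoiceLine (
        InvoiceLineId INTEGER PRIMARY KEY,
        InvoiceId INTEGER REFERENCES Invoice(InvoiceId),
        UnitPrice REAL,
        Quantity INTEGER
    );
    -- Illustrative rows only, not figures from the demo.
    INSERT INTO Invoice VALUES (1, 'USA'), (2, 'Canada');
    INSERT INTO InvoiceLine VALUES (1, 1, 1.0, 3), (2, 1, 2.0, 1), (3, 2, 1.0, 2);
""")

# Total sales per country: price times quantity, summed per country, largest first.
query = """
    SELECT i.BillingCountry, SUM(il.UnitPrice * il.Quantity) AS TotalSales
    FROM Invoice AS i
    JOIN InvoiceLine AS il ON il.InvoiceId = i.InvoiceId
    GROUP BY i.BillingCountry
    ORDER BY TotalSales DESC
"""
rows = conn.execute(query).fetchall()
for country, total in rows:
    print(country, total)
```

Because the database does the multiplication and the sum, the totals do not depend on the language model's arithmetic. The model only has to write the query correctly.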
26:54the story doesn't end here because remember
26:58the prompt that we had a
26:59question right which is yeah the top five
27:02artist it continues the analysis so
27:04it says now I should use look at this guy
27:09right? So for the next
27:12part of the question it generates a much
27:14more complicated SQL query. So it now
27:17starts joining the artist table and the
27:20album table and starts doing all these
27:23inner joins based on the artist ID and the
27:25track ID to give you and you'll see
27:29something so it had a failure so the query
27:32somehow didn't work, right? It has
27:34a toolkit, it's called the SQL checker.
27:38okay so what it does is that for the
27:42next part of the question right it it says
27:45it failed and you would see that
27:48you know a minute you would see that it it
27:53this was the second there in a price
27:56give me one minute it's the second run so I
27:58was going through the first run on
28:00in my in my screen yeah see the query checker
28:03here yeah it said it has some
28:06problem with the column names but in the
28:08SQL toolkit with the agent there
28:10is a query checker or optimizer so it looks
28:13at the failed query that it generated, the
28:16model fixes it it regenerates the query
28:19using ChatGPT and then it fires that
28:22final query which joins your album and
28:25artists and track ID and boom it gives
28:29you the five artists: Iron Maiden, total sales;
28:33U2, total sales; Led Zeppelin, artist
28:36fourth number; and Lost, total sales. That's
28:39incredible. The incredible thing
28:42here is it corrected itself exactly not
28:45even not even that it got it right
28:49problem with the SQL that was generated
28:51the first time yeah yeah
28:53ambiguity with the column names the unit
28:55price it had an issue it said okay
28:57ChatGPT, it goes back to the LLM, says, like,
28:59there's a problem, can you fix this
29:02query? Yeah, it gets the revised query, fires it
29:05off.
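The self-correction described here can be sketched as a small retry loop: run the query, and if the database rejects it, hand the query and the error text back to the model and try the revised version. Everything below is an assumption for illustration. `fake_llm_fix` is a hard-coded stand-in for the call to the LLM, `run_with_self_correction` is a hypothetical name, and the tables are a tiny invented sample in which `UnitPrice` exists in two tables so the first draft fails with an ambiguous-column error.

```python
import sqlite3

# Invented sample data (assumption): UnitPrice appears in both tables on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, UnitPrice REAL);
CREATE TABLE InvoiceLine (InvoiceLineId INTEGER PRIMARY KEY, TrackId INTEGER,
                          UnitPrice REAL, Quantity INTEGER);
INSERT INTO Track VALUES (1, 'Song A', 1.0), (2, 'Song B', 1.0);
INSERT INTO InvoiceLine VALUES (1, 1, 1.0, 2), (2, 2, 1.0, 5);
""")

def fake_llm_fix(query, error):
    """Stand-in for the LLM: qualifies the column when the error says it is ambiguous."""
    if "ambiguous column name" in error:
        return query.replace("SUM(UnitPrice", "SUM(il.UnitPrice")
    return query

def run_with_self_correction(conn, query, fix, max_attempts=3):
    """Run the query; on a database error, ask `fix` for a revision and retry."""
    attempts = []  # (query, error or None) for every try, kept for auditing
    for _ in range(max_attempts):
        try:
            rows = conn.execute(query).fetchall()
            attempts.append((query, None))
            return rows, attempts
        except sqlite3.Error as exc:
            attempts.append((query, str(exc)))
            query = fix(query, str(exc))
    raise RuntimeError(f"query still failing after {max_attempts} attempts")

first_draft = """
SELECT t.Name, SUM(UnitPrice * Quantity) AS TotalSales
FROM InvoiceLine AS il
INNER JOIN Track AS t ON t.TrackId = il.TrackId
GROUP BY t.Name
ORDER BY TotalSales DESC
"""

rows, attempts = run_with_self_correction(conn, first_draft, fake_llm_fix)
print(rows)
print(len(attempts), "attempts")
```

Keeping the list of attempts is a deliberate choice: it is the kind of record that the later questions about auditing and verifying numbers would need.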
29:05Back to your point on that, it
29:09just speaks to yet another toolkit
29:11which is like a data dictionary builder
29:14which starts to understand it creates a
29:16dictionary of all these things right
29:17because it sees here it's got it wrong
29:19that could go into a dictionary as an
29:21alternative name for something and then
29:23next time it runs the model that's not
29:24gonna throw it off so you also you could
29:26improve the quality of itself over time
29:29really quickly and in the LinkedIn post
29:31I had a little bit of fun thing to add you
29:33know just like I'm you're chaining
29:35the toolkits here of course I didn't do it
29:37here I could chain a Spotify toolkit
29:40here or an e-commerce toolkit and get the
29:42playlists or something and then I
29:44could take the playlist and say like hey
29:47can you play this in that Spotify
29:48account? It's not just a SQL query anymore
29:51it's you're talking about a goal
29:53you had a goal in mind you you wanted to
29:55have the top artists in your playlist
29:57based on a database it's incredible like
30:01right so you can see that you can start
30:03chaining these agents together right and
30:06these agents don't have to be SQL
30:07agents it could be you know it could be
30:09browser agents it could be search agents
30:12Google search agents it could be a Spotify
30:14agent doesn't matter it could be
30:16an e-commerce agent are you able to explain
30:20how it knows which agent to use
30:23when? That's the thing, that's the part
30:26in which you have to chain the
30:28agents, you have to in the kind of in the
30:31kind of sense that you you're
30:33telling it you're kind of orchestrating the
30:37the role and actually maybe that
30:39speaks to this thing about yeah I'm putting
30:42putting the tool into a
30:43character right correct like an analyst
30:46does things typically in a certain order
30:47and that's what you sort of specified yeah
30:50correct correct yeah I think for
30:53now that's the case but you know Tim I'm
30:55looking at the way the space is evolving
30:57I mean I don't think that'll matter all you
31:01can do is just say like you know the agents
31:04at your disposal and go figure out
31:06how to
31:06orchestrate yourself I mean I think that
31:08paradigm will change as well again
31:11because it's doing this so quickly it's
31:15not hard for it to essentially
31:18learn over time which order tends to answer
31:20the questions the best way right
31:22like it's even the way these things work is
31:25that today they're as bad as they'll
31:27ever be and tomorrow they're even better
31:29right so what we're seeing now what
31:31you've built here is is literally the worst
31:34this tool is ever gonna be and
31:35that's crazy to say. Yeah look at this, the
31:39worst tool, I'm showing you some strings,
31:41I'm showing you some SQL generated. Yeah yeah
31:44I have to be honest with you though,
31:46I think the
31:48responses here are in a JSON array, is
31:50that correct? So it's basically giving you
31:52an interface that you could very easily put
31:56to a web developer, and I think
31:58over the next couple of weeks you might see
32:00that I will probably build a nice UI
32:02where the input screen is user-generated
32:05and then what I would do is I would
32:08probably format this JSON to give the
32:11output in a human-readable form. Yeah yeah
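Turning the agent's JSON into something a person can read could be as simple as the sketch below. The shape of the JSON is an assumption: the field names `action`, `input` and `observation` are hypothetical, chosen only to mirror the steps shown in the demo (list tables, read schema, run query), and `format_steps` is an illustrative helper, not part of any toolkit.

```python
import json

# Hypothetical agent output (assumption): one JSON object per step taken.
RAW_AGENT_OUTPUT = """
[
  {"action": "list_tables", "input": "",
   "observation": "Album, Artist, Invoice, InvoiceLine, Track"},
  {"action": "schema", "input": "Invoice, InvoiceLine",
   "observation": "Invoice(InvoiceId, BillingCountry); InvoiceLine(InvoiceId, UnitPrice, Quantity)"},
  {"action": "query", "input": "SELECT BillingCountry FROM Invoice LIMIT 2",
   "observation": "USA, Canada"}
]
"""

def format_steps(raw_json):
    """Render each agent step as numbered, indented plain text."""
    steps = json.loads(raw_json)
    lines = []
    for number, step in enumerate(steps, start=1):
        lines.append(f"Step {number}: {step['action']}")
        if step.get("input"):  # skip the input line when the step had none
            lines.append(f"  Input: {step['input']}")
        lines.append(f"  Result: {step['observation']}")
    return "\n".join(lines)

report = format_steps(RAW_AGENT_OUTPUT)
print(report)
```

Showing the steps and the SQL alongside the answer, rather than the answer alone, is what lets a viewer check how the result was reached.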
32:14exactly and and that is that is exactly but
32:17that's the easy part I mean that's
32:19the easiest part of the whole thing. I'd
32:20say that's the hard part actually I
32:22challenge you on that, I'd say that's the
32:23hard part because when we go back to the
32:24beginning of the conversation what I said
32:26to you is that language models have
32:27been around for a long long time what Open
32:30AI did is they gave it an
32:32interface right and actually what you're
32:34doing here is you've pieced these things
32:36together and now you're about to give it
32:38the interface the interface is the thing
32:39that opens it up to you know anyone who
32:42wants to try your plug-in. This,
32:45as you say, this is the easy part; that's
32:47the hard part, the interface, right?
32:49the interface is actually the thing that
32:53opens the door right. I would take up the
32:53silence and I would be the early prototype
32:54tester then.
32:55I know for you this is that's the easy part
33:02right because you've done the hard
33:03work of sort of doing the back-end
33:05development and getting all of the
33:06APIs and you know the whole SQL toolkit and
33:09those things were the hard part it was like
33:10putting it into an
33:11HTML page that's easy right. I was excited
33:16so I wanted to share with
33:16everybody. Trust me there are tons of
33:21people who watch the channel who will
33:24have seen what you've done there and
33:26completely understand what's going on
33:27and you know for those who don't all I'll
33:30say is like it's incredible it's
33:32it's absolutely incredible what I just saw
33:35there you know it it really
33:37makes me think about you know not not just
33:39data engineering modeling I'm
33:41thinking about the whole workflow start to
33:44end you know how do you how do
33:46companies reorganize themselves for this
33:49kind of disruption right because there
33:53has to come a point where you have to let
33:54go a little bit right you have to
33:55sort of trust it all you have to see it do
33:58something small and then enable it
34:00with more and more trust so are you able to
34:01sort of speak to like how do you how
34:03do you see that challenge sort of being
34:05solved how does a business start to take
34:07on this kind of innovation and over time
34:10slowly try and incorporate it. You know
34:13Tim I think the way I think about it is
34:15just again the example of browser right
34:16when the browser was launched the intent
34:18was good right and look a lot of good
34:20things have come out of the browser right
34:22we have yeah today we have the
34:23sort of applications in the browser and the
34:25mobile phone that nobody thought
34:26when Firefox was launched that we will be
34:29able to do all those things right
34:30yeah but at the same time let's not let's
34:33not forget there have been
34:34challenges there is stuff out there on the
34:36internet that's not good it's not
34:38flavorful and I think it's the same thing
34:41with with these LLM models and
34:43SQL agents and I'm sure not so tasteful
34:47agents as well and I guess the
34:49question would then become like you know
34:50how do you govern this thing how do you
34:52audit this thing especially from an enterprise
34:54lens and I'll be frank right I
34:56mean I've been doing the prototyping but I
34:58don't think anything is production
34:59ready today right yeah no doubt right none
35:02of these extensions none of these
35:04models none of these technologies are are
35:07ready today to go into production
35:09into any enterprise for any serious
35:11business right it is just the inception
35:13point right yeah without thinking I mean
35:16firstly the space is evolving right so it
35:18needs to settle down to a level that it
35:20becomes mature enough and then the hard
35:22part of how do you make it enterprise ready
35:25what is the governance strategy
35:26what is the audit strategy you know what is
35:29the strategy to verify the numbers
35:31we can solve these numbers right I mean all
35:33these things are work to be done
35:35I mean it's gonna take a while yeah
35:39definitely and I think you know that I
35:42think more than ever this is gonna force
35:45the pace of innovation right it's the
35:47companies that benefit from this will be
35:50the ones who have their ducks in a row
35:52they really know how to do all of that
35:54because you do come across some
35:55organizations who really haven't learned
35:58how to adopt innovation well right they
36:01they they have innovation they have
36:03innovative tools they have innovative
36:05people but they've not learned to adopt the
36:07innovation really really well and I
36:08think this kind of technology will really
36:10challenge those kinds of
36:11companies whereas companies who do have an
36:14adoption sort of strategy it might
36:17take them time but when they get there they
36:19're really maximizing the value out
36:21of something like this and they're kind of
36:23guiding where the future goes so
36:25yeah man this is totally fine I'll put the
36:42HTML page you know in a couple of
36:44days absolutely and and when you when you
36:46when you've done when you've done
36:48the HTML page I'll update the description
36:50of the video so people people
36:52can go and have a look and play and and
36:53sort of really enjoy it because that
36:55it's one of the things where man you
36:58really have to use it to believe it
37:00right you have to use it I think the thing
37:03with I mean and I think this is
37:03really thing is I mean if you look at the
37:04HTML page you might not see what
37:06it's doing behind the scene correct you
37:07start seeing what it did behind the
37:09scene all those you know those JSON
37:11things that were showing you the
37:12actions yeah which is where people who are
37:14in this analytic space and you know
37:16space they see that yeah sort of aha moment
37:19like oh yeah just even even like
37:22I don't know if you're planning to do this
37:25but like in the HTML page you can
37:27obviously just the interface it shows the
37:30people but I think the super
37:31interesting is that is you know that SQL
37:33like if you could put that in a
37:34in a formatted way that a human would write
37:36and say it like and just show
37:38those steps yeah that that's the that's the
37:40incredible thing that I like that I
37:43like that idea and I think that's yeah
37:45input and then you should have this
37:47sequence of steps and SQLs that yeah yeah
37:49yeah and then like it should be
37:51that yeah yeah exactly exactly and it's a
37:55weird one because people just want to
37:57see the answer but you don't appreciate the
38:01answer if you don't see the steps
38:03yeah exactly exactly so yeah man I can't
38:08wait to see that that's that's gonna be
38:10incredible good send me the link I'll be I
38:14'll be there myself we'll do a demo
38:18we'll do like a classic new feature demo or
38:21something like that good work okay
38:25listen thank you so much thank you so much
38:27for your time this morning I I'm
38:28I'm going to be keeping up trying to keep
38:31up even with with all the sort of
38:33improvements you're going to be making but
38:35also just the space in general I
38:37think one thing I haven't had time to do is
38:39just really sort of learn the
38:41concepts behind AI and I think in our
38:43discussion you highlighted the
38:45importance of being aware of sort of I'm
38:48gonna call them sort of cul-de-sacs
38:51within AI they're like little things that
38:53once you understand them you don't
38:54need to go back there again but they're
38:56really important and fundamental for the
38:57way the the whole sort of thing works and
39:00if you can learn those at a basic
39:02level then actually you can help explain it
39:04to other people really really well so
39:05definitely something I'll do absolutely and
39:07I'm looking forward to your findings
39:09and you know you yeah also finding some
39:11quirks and some from yeah exactly that
39:14might as a group you know then we start
39:16looking at all these dimensions and
39:18seeing like yeah it is a collective task
39:20isn't it it's it's kind of like it's not
39:23unless you put it in front of people that
39:24you start to find all these weird
39:25edge cases that people go to just you know
39:28naturally it is so nascent that it
39:31has to be collective right I mean it's such
39:33an early stage with so many moving
39:35pieces at the same time that yeah it has to
39:37be collective I think we'll all
39:38benefit from leveraging each other's
39:41findings and summaries and you know of
39:43course and also where it's breaking which
39:46is an important piece of as you
39:47just highlighted right I mean all those
39:49pieces bringing in governance and and
39:51all those things right I think and and that
39:53's where as collectively we move
39:55sort of you know we take a step forward
39:57yeah amazing amazing good stuff so we
40:00made it to the end of this video and there
40:01's quite a lot that I had to cut
40:03just to get this into a workable format for
40:05YouTube but what this whole
40:06discussion and in fact the last few weeks
40:08has shown me is that I don't know
40:09anywhere near enough about AI as I ought to
40:12so one of the things I'm gonna be
40:13doing the next week or two is starting a
40:16new YouTube channel purely for the
40:18purpose of cataloging my journey learning
40:21AI it's essentially gonna be
40:22one video every single week just learning
40:25what I'm learning about AI
40:26sharing what I'm learning about AI even if
40:28I can't get my words correct and the
40:31whole point of it is to just essentially do
40:3352 videos over 52 weeks just simply
40:36covering what I'm coming across what I'm
40:38learning the tools I'm using not just
40:40applications in analytics not just tools
40:42in analytics but also interesting
40:45applications creative applications you know
40:48I like like even even when I'm
40:50making videos for example I've started
40:52using AI to help improve things like
40:54transcription all of that kind of stuff as
40:56I go down through this journey and I
40:58build my understanding each week I'll be
40:59trying to package it in a way that's
41:01maybe useful to others and I'm not trying
41:04to sort of you know reach a large
41:06audience if it's useful for you great but
41:07that's where I'm gonna be putting
41:09all of this stuff so if I reference it in
41:11videos going forward that's the
41:12channel to go to and look out for that
41:14unfortunately YouTube punishes you when
41:17you start to make content about different
41:19topics that it thinks you
41:20don't make topics about and because this
41:22channel is about Tableau as soon as I
41:25start making content about AI and as soon
41:27as you guys start enjoying it more
41:29than my Tableau content unfortunately what
41:31YouTube will do is it will stop
41:32showing you the Tableau content it's just
41:34something that happens on the YouTube
41:36platform so this channel is Tableau
41:37Tim it has to be about Tableau nearly
41:39always and so that's what we're gonna do so
41:41it's separate channel for AI look out
41:43for that coming soon
This was easily the most interesting chat I’ve had in a long time about the future of analytics and the first time I understood a path for AI to become a pivotal part of the analytics workflow.
TabGPT is a tool Rahul built as a dashboard extension that allows you to use ChatGPT alongside a dashboard.
Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model.
Dolly by Databricks: https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm
Andy’s Video on ChatGPT https://youtu.be/7M-wevgZ4H8
AutoGPT https://github.com/Significant-Gravitas/Auto-GPT
Timestamps
0:00 Intro
1:05 Introductions - Rahul Trehan
2:05 Why is AI having a moment now?
4:00 What motivated Rahul to build these tools?
7:52 TabGPT: Tableau extension with ChatGPT
16:00 AutoGPT demo, this blew my mind.
29:00 What does this all mean?
40:00 I’m starting a new channel for learning AI.