Video |

Snowflake | Data engineering | Data prep

How to Load data into a Snowflake Database from AWS S3 | Learning Snowflake

I'm learning Snowflake myself, so I'll show you how to load S3 data and every mistake I made getting there.

Tim Ngwena •17 November 2022 •1 min read •2849 min watch •7.5K

Part ofGetting Started with Snowflake

Set up structure first: use CREATE OR REPLACE to make a database, a dedicated external staging schema, and an external stage pointing to your S3 URL (a public bucket needs no credentials, but enterprise setups require keys and encryption).
You can query files directly in S3 without loading them by aliasing the stage and selecting positional columns (t.$1, t.$2), which lets you preview data and discover the real delimiter.
Create a reusable named file format (TYPE = CSV, FIELD_DELIMITER = '|', SKIP_HEADER = 1) and reference it in later queries instead of re-specifying options each time.
File formats are schema-specific: a format created in one schema isn't visible from another, and the COPY INTO context must reference fully-qualified stage paths like database.schema.stage.
COPY INTO can load either directly from an external location or from a SELECT statement, letting you apply transformations at load time.

0:00Hey, it's Tim here. In today's video, I'm

0:01going to do something different. We're

0:03going to look at

0:03Snowflake. I'm going to show you how to

0:05load data from S3, that's Amazon's storage

0:08service,

0:09into Snowflake. And I'm going to show you

0:11everything that I learned along the way,

0:13because I'm currently learning Snowflake

0:15and I'm finding it, it's a little bit tough

0:17. It's not like

0:17what I'm used to with Tableau prep and all

0:19tricks. So I'm also going to show you a few

0:21things that

0:21I've added to the process that people

0:23typically wouldn't do to make some bits of

0:25this more

0:25familiar for people who do use those visual

0:27tools. Let's get stuck in. We're going to

0:30be doing a lot

0:30of SQL on screen, I'm going to try and go

0:32really, really slowly. So you can follow

0:34along. I'm going

0:35to be using lots of arrows and annotations.

0:36And if you have any questions or feedback

0:38about my pacing

0:39or anything like that, leave it in the

0:41comments below. Now all the warnings are in

0:43. Let's get

0:44stuck in. So what I've done is I've already

0:46written some comments in my worksheet, I'm

0:49using the

0:50Snowflake interface, this is called Snow

0:52site, the new interface that they've

0:54recently launched.

0:55And I've gone ahead and essentially

0:56commented what we're going to need to do

0:58for this whole

0:59entire process. I find this useful because

1:01it allows you to plan everything you want

1:03to do

1:03and then make sure that you have this sort

1:05of high level view of what you're trying to

1:06achieve.

1:07And it makes you think about things a

1:08little bit more. The other thing that I've

1:10done is I've

1:11already got some items in an S3 bucket, you

1:13can see them here. Customer one, customer

1:16two and

1:16customer three dot CSV, these are CSV files

1:19. CSV stands for comma separated values,

1:21essentially,

1:22it's a text file with commas separating the

1:24data. Except for in this instance, there's

1:27not actually

1:28common separated data, I've got pipes

1:29separating the data. And I'll explain why

1:31when we get to that

1:32section of the video. As ever, this video

1:34has timestamps, so you can skip ahead and

1:36watch

1:37whatever bits you're not familiar with. And

1:38I'll try and make sure that that's as

1:39detailed as

1:40possible. So you can rewatch or watch the

1:42section you're interested in as well. Let's

1:44head back into

1:45Snowflake. And let's start by writing some

1:47SQL. So the first thing we need to do is to

1:49actually

1:49create some structure. The reason we want

1:51to do this is just to make sure we're

1:52loading the data

1:53into the right place, so that you can maybe

1:56start to understand some best practice

1:58about how to go

1:59about doing this. So I'm going to assume

2:00that you haven't even got a database set up

2:02. And we're

2:03going to go ahead and create them right now

2:05. So I'll just go ahead and create a

2:07database. I'll

2:08actually use a create or replace command

2:10because what a create or replace does is it

2:12replaces the

2:13database if it already exists. So we'll go

2:15ahead and do that. And you can see the Snow

2:18site here

2:18autocomplete. So you can just go ahead and

2:20write everything and it tends to want to

2:22autocomplete

2:23for you. Sometimes it doesn't do that you

2:24want to sort of, you know, pay attention as

2:26you're typing

2:27to make sure you do things correctly. So we

2:29're just going to create or replace a

2:30database. We'll

2:31call this, let's call this what should we

2:33call this, I never know what to call it. So

2:35I'll call

2:36this staging, I just call this database

2:37staging for whatever reason. That's a bad

2:39name, because

2:40staging is the term we use in databases

2:43anyway, but I'll call it staging. And that

2:46's it. So let's

2:47go ahead and just run that query. So let's

2:50just go ahead and do that. And you'll see

2:52that that's

2:53been created successfully. If you want a

2:55visual identifier that that's worked, what

2:58you can do is

2:58you can go over here to the databases

3:00column, refresh that view. And if we go and

3:02refresh this,

3:03just make sure it's been created, you can

3:05see that staging is right there. And the

3:07other thing

3:07to notice is when I run that query, the

3:10context for Snow site changed to this

3:12public space. So

3:14public is a schema that's created by

3:16default inside of staging. So if I go to

3:19staging,

3:19if I just go over here, you can see that

3:21public is actually referring to this schema

3:23here,

3:24there is another schema, this is called the

3:25information schema, this is also created by

3:27default. This is where Snowflake stores its

3:30information. But the public schema is just

3:32created by default. Now, because we want to

3:35load data into a particular place, the

3:38public schema

3:39tends to be where we store our actual data,

3:40the customer data that we're going to be

3:42loading from

3:43the CSV, what we want to do is potentially

3:46create another place for loading staging

3:49information. So

3:50we can go ahead and actually create a

3:52schema for external staging. So let's go

3:54ahead and create

3:55or replace. And this time, we're going to

3:58type in schema. A schema is just another

4:00way of organizing

4:01a database. Think of it as a different

4:03aisle or different section of the library,

4:05the library is

4:06the database, the aisle or the section is a

4:09schema. And the particular tables are

4:12shelves or you know,

4:13I was in a particular library. So we'll go

4:15ahead and create a schema and we'll call it

4:17external.

4:19Okay. Now, the important thing to bear in

4:23mind is the context that I'm in right now,

4:26I'm still currently in my public schema. So

4:28if I go ahead and create a schema here,

4:31what it's actually

4:31probably going to do is do this in the

4:33wrong place. So let me just go and make

4:34sure that I'm

4:35actually in the right place on the staging

4:37public. Yep, that's right. Let's go ahead

4:39and call this

4:40external underscore staging. Let's go

4:43underscore staging. And we'll close that

4:46off. And we'll

4:47run this. There we go. So schema external

4:51staging. And again, you'll see that the

4:53context has changed.

4:55And now we're in staging external

4:57underscore staging. And then the final

5:00thing is we want

5:01to actually create our external stage. And

5:03this is basically where we're going to

5:05actually load

5:06our data into when we go and get the data

5:09from s3. And then we'll copy from our stage

5:12into a database

5:13table that we can actually use inside of

5:14Snowflake to create the data. So for this

5:16one, we'll go

5:17ahead and do one more, create or replace

5:20stage. And for this one, I'm just going to

5:24call it AWS

5:25stage underscore stage. And again, you want

5:27to make sure you're doing this in the right

5:30place,

5:30we're going to be creating this stage. This

5:32is the context. One thing I did mention at

5:34the start is

5:35that I'm using something called my compute

5:37warehouse. This is just an extra small Snow

5:40flake

5:40instances. warehouses in Snowflake are

5:42essentially responsible for computing power

5:45, they're not like

5:45your typical warehouse in a database setup.

5:47So don't get those two things confused. But

5:50we're

5:50going to go ahead and run this line here.

5:52And we're going to hit run. And there we

5:55are, we have

5:55our AWS stage. Now let's go ahead and

5:58refresh this left hand side list. When we

6:00do that, you'll see

6:01everything that we've created. So you'll

6:03see that we've got our staging database, we

6:05've got the

6:06external staging schema, this is the schema

6:08we created for staging information. When we

6:11expand

6:11that, you'll see that we have a bunch of

6:13tables views. And if I go down to the

6:15stages section,

6:16you can see that we have our stage. So we

6:18're pretty much good to go, we're pretty

6:20much set up,

6:22and everything is set up as we'd like to.

6:24Now for the next stage, what we want to do

6:26is to try and

6:27add some context to that stage, because at

6:29the moment, we created a stage, but we didn

6:31't tell

6:31Snowflake where to go get the data from.

6:34And if we look at the documentation for

6:36creating a stage,

6:37if I just go back up here, you see that I

6:38was looking at it right before I recorded

6:40this video.

6:41And what I always do with the Snowflake

6:43documentation, a little hack is I just go

6:45all the way to the examples, because I find

6:47that documentation is almost so detailed,

6:50that the upper parts tend to be only useful

6:52when you're really stuck. So I go to the

6:54documentation,

6:55I look at the examples first to see how it

6:57should be done. And from the examples,

6:59I tend to get a better feel for what I need

7:01to be doing. And while we're here, you'll

7:04see the sort of

7:05explanation of internal stages versus

7:08external stages. Now, an internal stage is

7:11essentially

7:12a stage that is familiar with Snowflake. So

7:14this is a stage where data is being held

7:16inside of

7:17Snowflake itself. An external stage is

7:19essentially data that comes from an

7:20external source. In this

7:22case, could be Amazon S3, Azure, or Google

7:25Cloud Platform, we're getting data from S3.

7:28And you can

7:28see here that it's actually quite simple.

7:30We just need to do this step, create my

7:32external stage,

7:33and we need to give it a URL. This is where

7:36I copied my S3 URI. And you just need to

7:38tell it

7:39where that information is. Now, if you've

7:41got any other sort of complicated security

7:43settings,

7:43you can see other examples here. For

7:45example, if you want to add credentials for

7:47AWS,

7:48you can absolutely do that. You can also

7:50add things like encryption. And as you get

7:52more

7:53and more complex, you'll need to add more

7:55and more to your setup. In essence, in real

7:57terms,

7:57if you're working in an enterprise setting,

7:59likely it is you're going to be using

8:00encryptions,

8:01credentials, and a bunch of other

8:03information. So this is absolutely what you

8:05're going to be needing

8:05to do. But because we're just doing a very

8:08simple example to show you the overall

8:10process,

8:10we're actually just going to need the URL.

8:14Why? My AWS bucket is public, so it doesn't

8:16have any

8:17security whatsoever. I'll delete it after

8:19this video. So you can't go and download

8:21the data from

8:22it and create an S3 bill for me. But in

8:24essence, my S3 bucket is public. So Snow

8:27flake can just see

8:28it right from the get go. We don't need to

8:30do any security to make it available to

8:32Snowflake. So

8:34you can see here that I have my S3 URL just

8:36right there. I got that from S3 itself. You

8:39can see that

8:39I copied it from this top right hand side

8:42section just over there. And once you copy

8:44that, you can

8:45just come in here and paste it. So we'll

8:47just go ahead and we'll actually edit the

8:48initial statement

8:49we made. So let's just go and do this. And

8:52we'll say URL equals and this is known as a

8:56string. So

8:57a string is just basically a piece of text.

8:59And you normally put speech marks on either

9:02side of

9:02it. You can see that I've got something

9:04else that I tried to do just before this

9:05video. So let's go

9:06ahead and grab this and paste that in there

9:09instead. Perfect. And now we can go ahead

9:11and

9:12close that with a semi colon. What this

9:14will do is actually go and get the

9:15information from S3. And

9:16then what we can then do is ask Snowflake

9:19to tell us what it knows about the metadata

9:22that it's got

9:23from that particular location. So because

9:25we've already created a stage, this time

9:27when we go and

9:28run this query again, what it will actually

9:30do is replace the stage with my new stage,

9:32which is also

9:33called AWS stage, this time with the right

9:35information. Let's go ahead and run that.

9:37You can see it's compiling. And now we have

9:40a stage. It's pretty much good to go. Okay,

9:44so now

9:44that we have the stage, what you might want

9:47to do is get some information about it, try

9:49and understand

9:50what it looks like what kind of metadata

9:52does Snowflake have about the stage. And

9:54this is a

9:55good way of making sure that everything has

9:57worked as it should do. So there is a

9:59command, let me just

10:00look at my notes here. Yes, that's correct.

10:02So there's a command here where we can just

10:04go ahead

10:04and write DSC. This is short for describe,

10:07but it's not the same as describe, I don't

10:10know why,

10:11but we can just go ahead and describe. And

10:13we can say the stage. And then we can ask

10:16it to literally

10:16describe the stage we've just been talking

10:18about. Now, because we are already in the

10:20right context

10:21here, we don't need to really specify where

10:24the stage is. So we can just go ahead and

10:26just call

10:27AWS underscore stage, you can see that it

10:30auto completes that. And we can go ahead

10:33and put a

10:33column on that. And if we run that, this

10:35should hopefully work. Yeah, there we go,

10:37we get the

10:38metadata for that stage. So we essentially

10:40have some information, we have the file

10:42format right

10:43at the top. If we go down, we have a field

10:46delimiter, it's assuming that it's a comma,

10:50I know for a fact that it's not. So we'll

10:51come and address this later, I deliberately

10:53created CSV

10:54files with a slightly different delimiter.

10:57And I'll show you that soon. And if you

10:59come down

11:00here, you can see a few other bits of

11:02information. For example, some copy options

11:04, some file

11:05information, if we had any AWS keys or

11:07credentials, those would also be down here.

11:10So you can see that

11:11that's worked perfectly, it's connected to

11:13our data in s3. And we're pretty much good

11:15to go.

11:16The last thing we can do is just ask if we

11:19can see the files that are in currently in

11:21that stage. So

11:22what we can do is we can go just say list.

11:24This is basically what we're going to do.

11:26And for this

11:27one, we can actually just use this notation

11:29at AWS underscore stage. And we can just go

11:32and do

11:33underscore there, put the colon, close that

11:36off. And what we should do is see the three

11:39files. So

11:40you can see that three CSV files are good.

11:42Now a mistake I made in my understanding

11:44was that

11:45assuming that size meant the number of rows

11:47sizes got nothing to do with rows, it's

11:50actually the

11:50physical size of the file. So 34 kilobytes,

11:53this is all in bytes at 34,000 bytes is 34

11:57kilobytes.

11:57We know that because if we go to s3, you

11:59can see here that it says 34.1 kilobytes.

12:01The reason those might be different is

12:04because of file file spacing, essentially,

12:07when you

12:08have different bits of information stored

12:10on different types of storage devices,

12:12you have sometimes compression, you

12:13sometimes have file sizes that change for

12:15minute distances.

12:17The best example is Windows, if you go look

12:18on Windows, you ask it how big is your hard

12:20drive,

12:20it will tell you. And then you ask it how

12:22much is available, it won't give you the

12:24same number,

12:24it kind of compresses that number and does

12:26weird things. So that's just an explanation

12:29of why those

12:29two might not be exactly the same in the

12:31two places. So now that we've created our

12:34stage,

12:34we've loaded our data, we've seen our data

12:37is there, you can see the CSV is there as

12:39well.

12:40One thing that I got really frustrated with

12:42actually in the whole process,

12:44and I get frustrated with SQL a lot, it's

12:45probably why I've avoided learning it as

12:47much as I have,

12:48is because I can't see what's going on. I'm

12:50having to, I'm having to visually sort of

12:52work with this.

12:53So if something like altrix or Tableau prep

12:55is your tool, I can totally understand why

12:57that is.

12:58But here, what I want to be able to do is

12:59recreate some of that experience, I want to

13:01be able to see

13:02what my data looks like I've already got a

13:04file here, I've got three files, in fact,

13:06I should be able to preview the data. So I

13:08know what to do next. Because if for

13:11example,

13:12I know for example, for example, the delim

13:14iter is different, it's actually a pipe, but

13:16it's been

13:16saved as a CSV. So how do I know that that

13:19's the problem? How do I identify that? Well

13:21, there's a

13:22couple of things you can do. And one of the

13:24really powerful capabilities of Snowflake

13:27is you can

13:27actually query this file in situ. So you

13:30don't have to load it into Snowflake. To

13:32query the data,

13:33you can actually query it directly in

13:35Amazon s3. And that allows you just to give

13:38you a little bit

13:38of a feel of what's going on. So let's

13:40figure out how to do that. Now, there's a

13:42couple of ways of

13:43doing this. And I kind of had to sort of,

13:45it wasn't it wasn't hard, I had to sort of

13:47find a way of,

13:48you know, kind of getting to this answer.

13:51And I found this, this solution here, this

13:54is a

13:54documentation, everything's in the

13:56documentation for the record, nothing I'm

13:58doing is, you know,

13:59like life changing, this is all well

14:01documented. So you could do this to I'll

14:03put a link to all

14:04of this in the description as well. And if

14:06I go down here, you'll see that there's a

14:08bunch of

14:08different examples. But if I go to the

14:11examples that I, you know, hold daily, you

14:13can see that

14:14there's this sort of example here. And this

14:16is really good, because it actually breaks

14:18out the

14:19whole the whole thing. So what it does is

14:21that you can create or replace a file

14:23format. And you're

14:23wondering why, why do I need to create a

14:25file format? Well, it's because it makes

14:27the subsequent

14:28steps a little bit easier. You see here,

14:30you can got you've got the stage, we've

14:32already done that.

14:32So we don't need to put anything in the

14:34stage, we can, we can already sort of

14:36almost disregard

14:38these two steps. But this is where I got

14:39really interested, because here you can see

14:42there's

14:42something really interesting going on. And

14:44I'm actually going to explain this in a

14:46slightly weird

14:46direction, you don't really explain this

14:48from the beginning, because first, you need

14:50to identify

14:51that you're working with a stage. And that

14:53's what we've got here. So for us, this is

14:55called AWS

14:56stage. Now, the next thing you can see is

14:58you've got this sort of brackets that goes

15:00all the way

15:01up to here. And what's going on in that

15:02space that I've highlighted is essentially

15:05you're specifying

15:06a file format. And you're also specifying

15:08which files to go and get using regex. So

15:11this particular

15:12part of this code is regex essentially

15:14describing how the file should look like.

15:17So this is

15:17basically saying go and find any file that

15:20has the term data dot CSV dot gz gz is just

15:24a compression

15:24format. So gz means dot gz. So just just

15:27the compression format, gz means that gz

15:30sounds like

15:30I'm talking about Kanye West or something.

15:33But nonetheless, that's just saying, look,

15:35use regex

15:36to find files that look like this. And this

15:39first fact is this first part is saying

15:41file format is

15:43equal to or greater than my format,

15:46whatever. Okay. And my format is actually

15:50been specified

15:51up here, you see, the file format is

15:53specified ahead of time. Okay. So my

15:56question is, well,

15:57how can you specify a file format, if you

16:00can't see what's in the file to know what

16:02delimiter to

16:03use that sort of that was just one of these

16:05things that had me in the loop. So you'd

16:07have to have

16:08access to the file, you'd have to have

16:10access to the information to look inside to

16:12know that you

16:12need that delimiter. But what if you didn't

16:14have access to that? What if you didn't

16:16know that? Well,

16:18it turns out that this still works

16:20perfectly fine. You see, if you just emit

16:23this file formatting

16:24capability, you can essentially just query

16:27the stage with just a single column. And so

16:29now if

16:30I explain this to you, what you're

16:31essentially doing is you're selecting you

16:33're just running

16:34a normal SQL query, select start from, but

16:36in this case, you're specifying the columns

16:39,

16:39and you're probably wondering, well, what

16:41does t dot pound signed one mean t dot

16:43pound sign two?

16:44Well, the best thing to do is to describe

16:47this t here at the very end, this is known

16:50as an alias.

16:50In essence, we're basically saying, call my

16:55AWS stage t. And from t, I would like to

16:58get column

16:59one and column two. So in essence, what

17:02this is saying is select column one, and

17:06column two,

17:06from my stage, the file format is this, it

17:10needs to match this pattern from regex,

17:13meaning it has

17:14to have data dot CSV in it. And the alias

17:17for the table is t. Now, if I change the

17:20alias to C,

17:21say, for whatever reason, I change this to

17:23C, then this would be C dot pound sign one

17:27and C

17:27dot dollar sign two, I don't know why I

17:29said pound sign, that's completely the

17:31wrong sign.

17:31So C dot dollar sign one and C dot dollar

17:35sign t. So hopefully, I've sort of

17:37explained that well.

17:38And all you're doing is calling the columns

17:41in essence. And that's how it works. Okay,

17:44so let's

17:44go back to our snowflake instance. And

17:46actually, what I'm going to do, I'm going

17:48to cheat, I'm going

17:49to be very, very bad here, I'm just going

17:51to copy this, because I don't want to get

17:52this wrong,

17:53whatsoever. So I'm going to paste it here,

17:54because we can kind of see this working.

17:56And we can just

17:57go ahead. And for now, we're just going to

17:59remove this because again, I want to query

18:01my stage. So

18:02I know what file format and delimiter to

18:04use, because how can I see the file? I let

18:06's say I

18:06don't have access to the file. How can I

18:09see it here. So let's go ahead. And what I

18:11'm going to do

18:12is remove this pouncer. And you see, the

18:14reason I'm doing that is because I don't

18:15know how many

18:16columns I need. So it makes sense to start

18:18with just one column and see what I get.

18:20And if we go

18:22ahead and replace at my stage one with at

18:25AWS underscore stage, you get this. So I

18:29have a very

18:30simple query, we're querying the in the

18:33external stage, everything is set up

18:35correctly. So hopefully,

18:36fingers crossed, this should work. And we

18:38can just hit run. You can see it's

18:40compiling. Now,

18:42what I should have done is put a limit. If

18:44you ever do this in a database, let's say

18:46someone's

18:46giving you an S3 bucket with billions of

18:48rows as hundreds of files, don't do that,

18:50because that will

18:51cost you a lot of money on Snowflake if you

18:53do that really badly. But nonetheless, here

18:55, you

18:55can see it worked. It worked really, really

18:58well. And it just returned one column. Okay

19:00. Now, what

19:00happens if you try two columns, let's go

19:04ahead and to do t dollar sign, or t dot

19:07dollar sign

19:08two for the second column. Let's try that

19:10again, hit run. When we do that, you'll see

19:14it returns

19:15a second column. But the second column is

19:17now which means it wasn't able to find it.

19:19And the

19:19reason Well, you can see that my CSV file,

19:21at least all three CSV files, I think in

19:24this case, it's

19:25has it loaded, it's loaded 3000 rows, there

19:27were three files, it has actually loaded

19:29all three

19:30CSV, I noticed 1000 rows in each of those

19:33files, because I created them in mockaroo,

19:36which has a maximum limit of 100 rows, 1000

19:39rows, sorry, when the free account, so I

19:41can, let me move

19:42this. Oh, let's not do that. Let's move

19:44this up here. Oh, I don't want it to get in

19:46the way of the

19:46run button. So I think I'll leave it there.

19:48So you can see here, we've got the pipe

19:50delimiter. Okay,

19:51so now I can see the data, I know what's

19:53going on, I can even see what column names

19:56I might need to

19:57use, and what data types I might need to

19:59use now that I can see the data. So now

20:01that we've got

20:02this, we can see here that we need to use a

20:05pound sign, sorry, a pipe sign as the delim

20:08iter. So now

20:09we need to create a file format. So let's

20:11go ahead and create a file format. Now, I'm

20:14not going to get

20:14the code for this correctly. So what I'm

20:17going to do is I'm going to grab this, okay

20:20. I'm going to

20:21grab this, which is essentially, it's

20:23actually quite straightforward. I could

20:25even type it,

20:25create or replace. And I think this is file

20:30format. I never know how you would know

20:34this,

20:34if that makes sense. I don't know why file

20:37format isn't one word. I don't know how

20:40they come up with

20:41this. But anyway, file format, create or

20:43replace file format. And now we need to

20:45give it a name.

20:46So in this particular case, I'll call this

20:50CSV with, what should we call this CSV_pipe

20:55.

20:55Okay, so CSV with the pipe delimiter, in

20:58essence. And what we're going to do is we

21:01're going to

21:01specify that that format is called CSV_pipe

21:05. And the type is equals to, and for this,

21:10you need to

21:10do this CSV. And then the next thing we

21:13also need to do is specify what field delim

21:16iter we're going

21:16to use. So field, now this one on my word,

21:19I have no hope of spelling delimiter right.

21:22I have slight

21:23dyslexia. So this is not my, this is like

21:25the worst thing I could be doing, trying to

21:27type the

21:28word delimiter out. Field delimiter equals,

21:30and for this one, what you need to do is

21:33just put the

21:33pipe in there. So that's the pipe character

21:36to tell the tool that this is the file

21:37format. And

21:38the other thing you can notice here,

21:40because we can see the data, we now know

21:41that we need to skip

21:42the first column, I don't want to import

21:44that first row as data, because it contains

21:46headers,

21:46we can actually specify this and create the

21:49table itself inside of Snowflake when we

21:52load the data

21:53in. So let's go ahead and actually specify

21:55that we're going to skip the first row. And

21:58the

21:58notation for that is essentially a boolean

22:01field, all you're going to do is just say,

22:04skip, let's say skip, see it doesn't come

22:07up, so I don't know how on earth you're

22:09supposed to know

22:10that. So skip header equals a one. So if it

22:13wasn't skip header, if skip header wasn't

22:16on, it would be

22:17zero, that's it, that's just a one or a

22:19zero for that field. And then we can do

22:20that. So let's just

22:21double check this. Create or replace file

22:25format CSV pipe, the type is a CSV file,

22:28the field

22:28delimiter is a pipe and skip the header

22:31equals one, essentially skip the first row.

22:34Okay, so I

22:35think that's pretty much it. Let's go ahead

22:37and run this, we should see a success. So

22:40CSV pipe

22:40is now created. That means we can now

22:43specify file format very, very easily. If I

22:46look at this

22:47notation, you can see here, this is a

22:49documentation for file format. If I go down

22:52to the examples,

22:55you can actually do quite a few things with

22:56the file format setup. So you can go ahead,

22:59specify a field delimiter, skip header,

23:01null if, essentially specifying what to do

23:04for if there's

23:06any null information, empty field is null

23:08equals two, compression if you want to

23:10specify that as

23:11well. All of that is in there. And it works

23:13with JSON as well. So it's a really nice

23:16sort of

23:16capability to be able to save those file

23:17formats, then call on them any way you're

23:19writing sequels.

23:20That's really sort of a really handy, handy

23:23trick to be aware of. Okay, so let's go

23:25back here. And

23:26let's go to the next row. So what we've

23:29done so far, if I just sort of do a recap,

23:31we created our

23:32database, we created a stage, we created a

23:34schema as well, we've described our stage

23:37to make sure

23:37it's there, we've listed the files in our

23:40stage, we've even queried the stage itself

23:43to have a look

23:43at the data and see what it looks like. We

23:45found that there was just one column with

23:47information.

23:48And in that column, we had one long piece

23:50of text with a delimiter separating the

23:53information. So

23:54we've still only got one column, but we've

23:56got information separated by pipes, that's

23:58our

23:58delimiter, what we're going to do is use a

24:00transformation to essentially break that

24:03out

24:03into columns, essentially split the columns

24:05as it were. And then what we've done is we

24:07've gone and

24:08specified that particular file format as a

24:10format. So here CSV pipe is going to be my

24:13file format.

24:14And that's what I'm going to use in the

24:16next few steps. So it's taken a while to

24:18get here. But I

24:19thought it was important to sort of really

24:20give you a thorough overview of how I've

24:22got here,

24:22because I had no clue. And I spent the best

24:24part of a couple of hours trying to figure

24:26it out using

24:27a bunch of SQL from lots of different

24:29places. So okay, now what I'm going to do

24:31is I'm going to be

24:32very naughty, I'm actually going to copy

24:34this, I'm really lazy, I hate repeating

24:36things. Because now

24:37that we've created our file format, what we

24:39can actually do is use that file format in

24:42our next

24:42step. And for our next step, what I'm going

24:44to do is write my SQL in a very weird way,

24:46I'm not going

24:47to write it all in one line, I'm actually

24:48going to create this space, because that T

24:50is still going to

24:51remain our alias, we're still going to call

24:53everything with that. But this time, what I

24:55can actually do is tell the query what my

24:58file format should be. And to do that, I

25:00can just go

25:01ahead and create a brackets, essentially,

25:03what the bracket does is it almost passes

25:06an instruction

25:06to this query, and it tells it to treat the

25:08data in a slightly different way. So for

25:10this one,

25:10I can just go ahead and say file_format

25:14equals, and then this time, instead of

25:18saying CSV,

25:20I can just go and say CSV_PIPE equals CSV_P

25:23IPE. So let's go ahead and do that. CSV_PIPE

25:29. Okay.

25:31And that basically means that the

25:33instruction I'm passing here is going to

25:35assume all of this

25:36information here. Okay. And now that we've

25:38done that, that's it, that's all we need to

25:40specify.

25:41Now when we run this query, you'll see

25:42something slightly different. Let's go

25:44ahead and run this

25:45and hope this works. Oh, something's broken

25:47. So here we go. Invalid argument. We have

25:50not done

25:50this correctly. Unresolve function. I think

25:54I need to change something in here. So let

25:57's see,

25:58select column one or column two from AWS

26:03stage. File_format CSV_PIPE. The table

26:07function argument

26:07is required to be constant. I think what I

26:11've done here is, I don't know why, but I

26:15think I need this.

26:16If you know why I need that particular

26:19greater than sign or whatever, let me know.

26:22I think that's

26:22what it's missing. I can't believe that.

26:26That's so annoying. But hey, look, this has

26:29worked because

26:29you see now, previously, when we run this

26:32query, we've got everything in one column.

26:35But now you

26:36can see here that T1 and T2 bring out all

26:38the individual fields. So I can actually

26:41kind of go

26:42crazy here and I can just copy these and we

26:44can just change these. I can copy these

26:47twice because

26:48I believe there were six rows. So we can go

26:51to, we can call this three and call this

26:54four, call this

26:55five, call this and six. That's it. That's

26:58everything we need. We don't need that last

27:01comma. So we can go

27:02ahead and call these six columns from our

27:04AWS stage using this file_format. Let's go

27:06ahead and select

27:07that. And here we go. This might take a bit

27:09longer, but no, there we go. There we have

27:12it. So we have

27:12our six columns. And now we are ready to

27:14load some information. Actually, we don't

27:17have a sixth column,

27:18so we can get rid of that final one there,

27:20which is five columns. And we're pretty

27:22much good to go.

27:23Okay. So, so far, so good. The next thing

27:27we're going to do is we need to actually

27:30start loading

27:31our data into a stage. And there's a couple

27:33ways we can do this. Now that we know our

27:35data is there,

27:36we know what it looks like, we've sampled

27:38it, we know there's 3000 rows, we know

27:40exactly what to

27:40expect. So we can actually, let me get rid

27:43of this, this is a bit confusing here,

27:45because we don't need

27:46this. We've actually already used our stage

27:48. So we don't need that there. What I'll do

27:50actually,

27:51is I'll take it and I'll put it up here

27:54where we need it. And then what I'll do is

27:57I'll leave this

27:57in GitHub. So you can grab this and have a

27:59look at it and replace it with your own

28:01code. So let's go

28:03ahead here and have a look at the next step

28:05. So loading data into stage from S3, we've

28:08kind of

28:08already done this, I didn't really observe

28:12what was going on with my notation. So we

28:16're still on

28:17the process for loading data. So now that

28:19we can see our data, we know what needs to

28:21be done. Now,

28:22you've got to think about this in a couple

28:25of different ways. Even this step here,

28:30previewing the data in the stage, I've

28:31actually kind of already done this here. So

28:34let me just

28:34put that there for now, loading the data

28:36and previewing the stages there. For this

28:39next step,

28:39what we need to be able to do is we need to

28:42create a table to load our data into

28:44because at the

28:45moment, we've got nowhere to load something

28:47into. So first, we need to create the table

28:50. So for this,

28:51because we can see our data, it should be a

28:53lot easier to go ahead and do that. And the

28:56trick here

28:56is to make sure that we get the context

28:58correct, because we're not going to load

28:59the data into our

29:00external staging, we're not going to load

29:02it into staging, we might actually load it

29:04into the public

29:05space. So we need to almost switch contexts

29:07here and go to a different place. So let's

29:10go ahead

29:11to, we can actually stay in staging and

29:13just go to the public schema here. And now

29:18that we're in

29:18the public schema, we can go ahead and

29:21create a table. Yeah, we can go ahead and

29:24create a table.

29:27Okay. And the create table function is

29:30quite a straightforward one. Let me just

29:32double check my

29:33notes. Basically, what we need to do is

29:35specify the name of the table, we're

29:37already in public.

29:38So let's go ahead and call this customer_

29:42table for now. And to make sure we get this

29:46correct,

29:48we need to specify the columns correctly.

29:50So what I'll do is I'll just put this over

29:52there,

29:52and I'll put a colon at the end of that so

29:54that we know that that's correct.

29:56Then we'll actually need to start

29:58specifying our columns. And this is quite

30:00straightforward,

30:01because we can just go ahead and type them

30:03in. So what I'm going to do is I'm actually

30:05just going to,

30:06I was going to copy it over, but I'll type

30:08it for the benefit of everyone. So we'll

30:11say id

30:12is integer. So we have whole numbers for

30:15the IDs. So we'll just go int. The next

30:19thing is going to

30:20be the first name, I believe. So what it

30:23doesn't help is that I skipped the first

30:26header. So what

30:27I can actually do is I can actually go here

30:30, just run this again. This won't work

30:32because I'm in the

30:33wrong context. So let's go back here and go

30:36to external stage, run this, we'll get the

30:39right

30:40table below. Still bringing two columns. So

30:43we've got first name, last name, favorite

30:45app,

30:45and favorite color. So that's good to know.

30:48Let's go back to public, get that context

30:51correct. Now

30:52we're in public, we can keep on writing

30:54here with that preview right below. So that

30:57's the first one.

30:58The next one is going to be first_name,

31:02first_name. Next one, I'll do this first,

31:06last_name.

31:10I'll call this favorite_app. I'll call this

31:17favorite_color. I'll do the American way

31:21for now.

31:22And for the data types, we're going to be

31:25using characters for all of these ones. So

31:28we just go

31:28ahead and type, I think it's varchar. I

31:31always remember it as like a Pokemon varch

31:33ar. I don't

31:34know what that is. So varchar, I think it

31:36just means variable character actually. It

31:40's one of

31:41these things that you just see a lot and

31:43you don't ever get it right. Everything

31:45here is going to be

31:46a variable character. I'm not going to

31:49specify the size or anything like that. We

31:52can kind of worry

31:53about that kind of stuff later. So there we

31:55go. We've got all our columns set up. I

31:57believe that's

31:57everything. So 1234512345. That looks

32:02pretty good to me. I can't see any other

32:05bits of information

32:06that we'd need. It looks about right. I

32:09think the only thing, I feel like I'm

32:12missing something,

32:13definitely missing something. So let's go

32:15back into my create table. Okay, yes, I

32:20knew I was

32:21missing something. You need to put a comma

32:23after each one, but the last one. So let's

32:25go ahead and

32:26do that. This is what's missing. Let's go

32:29and do that. And the last one doesn't need

32:31it. So

32:32we're in public. We're going to create the

32:34table called customer table. Let's go ahead

32:37. Oh, this

32:38is not right. Good spot. Let's go ahead and

32:42run this. They go customer table was

32:45successfully

32:46created. So you need to also get in the

32:48habit of checking your work before you run

32:49things.

32:50And we can kind of validate that here by

32:53hitting refresh, going to public, let's go

32:57to public,

32:57go to tables. And you can see the customer

32:59tables right there. And you can even see

33:02the columns have

33:02all been loaded. So ID, first name, last

33:05name, favorite, favorite color, those are

33:08all stored

33:08in that. Now, the fact that we've done that

33:11means we now have a table we can work with,

33:13we're now ready to copy some data into that

33:18table. So let's go ahead and do exactly

33:21that.

33:21Now to do that, we use something called the

33:24copy into command. And you can use the copy

33:27into

33:27command in a couple of different ways. If I

33:30go down again to our lovely examples, let's

33:34go here

33:34to a named external stage. So here you can

33:37say, for example, you can actually load the

33:39data directly

33:41from hold on loading files directly from an

33:45external location. So copy into my table

33:49from

33:49S3 bucket, right, right, right. So here,

33:51you can actually specify how to load it

33:53into a particular

33:54table, though, everything I've done here,

33:56we've already been querying it. So you're

33:58probably

33:58wondering, well, I already sort of have the

34:00data. Well, this query here is actually

34:03performing a

34:03transformation on the information, what I'm

34:06going to do when I load it in is actually

34:08apply that

34:09transformation at the point when I'm

34:11loading it. So even though I've done it up

34:13here, it's going to

34:14look a bit different. But there is also

34:15another way of loading data that allows us

34:17to just use

34:18this information straight from the get go.

34:21And it's essentially a copy into my table

34:23from this

34:24select statement. And that will actually do

34:26the transformation as well, because it's

34:28essentially

34:29the exact same piece of information. So

34:32what I'll do is I'll show you both ways

34:34that we can do that.

34:35So let's go ahead and grab this, I'm really

34:38bad with my annotations. I did not stick to

34:41them at

34:41all. Thing is, because my head is just

34:43running ahead of the tutorial. So we'll go

34:45ahead and

34:45close that. So the last item here is to

34:49load it into the table. So what I'll do is

34:52I'll start by

34:53typing the copy into command, let's just do

34:56that, okay. And copy into, and we're going

35:00to call this

35:00customer table. So customer table is

35:03already been created. Now, for this step,

35:06we've got two ways of

35:07doing it. Now, the first way, where we

35:10where we can basically take the data from

35:13our staging,

35:13and load it in. And the other way is to

35:16essentially just use our select statement.

35:19Now this this

35:20seems like almost bizarre to me. But you

35:22can just essentially do this. And this is

35:25by far the

35:26quickest way given how I've shown you this

35:28so far. So we can say copy into table and I

35:30'll just say

35:31from and then what you need to do is I must

35:33put two brackets, okay, and I'm just going

35:36to close

35:36that by doing this. And then inside of this

35:39from you need to basically go and get this

35:42select

35:42statement. You can copy that because we've

35:44done the work already. Okay, and then paste

35:46that in

35:46there. So what that's that's what that's

35:49going to do is saying copy this, copy the

35:51data into this

35:52table from this query that I've got going

35:54on, which is already calling a piece of

35:57information

35:58from a stage. And so that will mean that we

36:01can then go ahead and just load this up. So

36:03I believe

36:04this will work. I really hope it works. I

36:06think it will work. Let's go ahead and try

36:07that. Oh,

36:08there was an error. So what is the error?

36:11So I was because I think I've got this and

36:14it shouldn't be

36:15there. I don't need that there because it's

36:17actually over here. So because I copied it

36:20before

36:21it kind of got that wrong. So ah, here's

36:24this is great. This is a great error. St

36:26aging.public.aws

36:28stage does not exist or is not authorized.

36:30So what it's doing, what it's basically

36:32telling me is that,

36:32hey, over here in staging, in the public

36:35schema, I don't see an AWS stage. So what

36:38the hell are you

36:39talking about? Well, that's because here I

36:41've just referred to at AWS stage, but of

36:43course,

36:44my context is public. So what I really need

36:46to do is specify that I need to load the

36:49data from this

36:50other location. So we're going to have to

36:52just go ahead and spell it out for the tool

36:54to understand

36:55what's going on. So what we all do here is

36:58we'll call staging database and we'll call

37:01external

37:02stage, which is the schema. And then we'll

37:05say AWS underscore stage. Okay. So that's

37:08basically

37:09us specifying that, hey, this thing you're

37:11going to load is from another place in the

37:14system. So

37:15let's go ahead and try that now. We'll go

37:18ahead and run that. And this time we get a

37:20slightly

37:21different error. So a syntax error line

37:24five at position zero expected a close

37:28bracket. So what

37:29is going on here? File format. There's a

37:33close bracket. Let's try this. Let's try

37:36putting this

37:38here. Let me also do this because I'm

37:41losing track of what's going on. Let's just

37:44make it easier to

37:45see. So if I do this as well, then we can

37:48see that this closes there, that closes

37:52there, that closes

37:54there. Interesting. Let's go ahead and run

38:00that. Nope. Position line, line four at

38:05position 33.

38:08While line four is one, two, three, four,

38:13position 33.

38:15It expected a mother bracket. I'm fairly

38:22confident this is correct. I think I might

38:25have another

38:26issue elsewhere here. So let's go and look

38:29at the copy into documentation. This is

38:32what we should do.

38:33So in here, they don't give you an example

38:37where they're using

38:39all of these. Maybe it suggests you shouldn

38:43't be trying to do this. Maybe I got too

38:45ahead of

38:46myself. I got it working in one situation

38:48and then I didn't get it working in another

38:49.

38:50What are we looking for here? So copy into

38:55from what am I missing?

39:02I bet it's something tiny. When we queried

39:06the data,

39:06we queried it from @. Ah, this might be it.

39:13So it's really obvious, but of course, this

39:16query

39:16works with an @ sign. So let's just, I hope

39:19this isn't it. I hope this is not what I'm

39:22looking for.

39:22If that is it, I'll be furious. Let's go

39:25ahead and, nope, CSV pipe does not exist or

39:28is not

39:28authorized. Did I create the file format in

39:33a different context? I might've done. I

39:38might've

39:38done. So let's go and see if file formats

39:42are, nope, no, I don't believe so. If I go

39:47in here,

39:48nope. Okay, what I'm going to do, I'm going

39:51to try something cheeky. Let's see if we

39:54can create this

39:54file format, but we're going to create it

39:57in this context of public to see if that's

40:00potentially

40:00causing the issue. So let's go ahead. I'll

40:03put this here just for the fun of it. And

40:05because

40:05we're already in public, I'll go ahead and

40:09run this. Interesting. It uses a create or

40:13replace,

40:14that might not necessarily mean that this

40:17has worked. So it's CSV underscore pipe all

40:20lowercase.

40:21So now if I go right this, all my word. So

40:25that was a really good bit of troubles

40:29hooting. So it

40:31looks like you have to create the file

40:32format specific to the schema. That's

40:34something new.

40:35And then the other thing you have to bear

40:38in mind here is that you have to reference

40:40the

40:40staging correctly. I got it right up here.

40:44And for some bizarre reason, I decided to

40:46change the

40:47way I was referencing it here. And what I

40:50thought that the outside was doing was

40:53almost it's like

40:55Twitter mentions that I was thinking too

40:56much about social media. I thought the

40:58outside works

40:58globally. I'm incorrect in that specific

41:01instance, because I felt like I had to call

41:03it directly. Let

41:05let me I wonder, hmm, let's try it

41:10something else. Let's just make sure we

41:13absolutely

41:14clear about this. Let's say, let's let's

41:17create another table, we'll say customer v2

41:20,

41:20and call this v2 and hit run on that. Okay,

41:23so we've created another table, we'll go

41:26ahead and

41:26refresh. We'll go down, we'll go to public,

41:29you'll see tables, you should see two

41:31tables, customer v2.

41:32And then what I'm going to do is I'm

41:34actually going to get rid of this at sign.

41:37And for this

41:37one, I'm going to say copy into table v2,

41:42customer table v2, the file format is ready

41:47here, v2,

41:48table v2. Yep, yep, yep, yep, everything

41:51looks good. Let's go ahead and highlight

41:53and run.

41:53This time, I'm going to see if it works

41:55without the outside. Hit run. No, we get an

41:58error. So

41:59you need the outside irrespective of

42:01whatever you're going on. And otherwise, it

42:04causes an

42:05issue here. I don't know why it causes an

42:08issue there. When the issue is here, that

42:10makes no sense

42:11to me. But if I then go ahead and run that,

42:14it works. Okay, so a little bit of live

42:17troubleshooting,

42:19we've copied our data in, we've done it

42:21this way, you could do it another way. And

42:24the other way that

42:25I thought of doing it is you can just

42:27basically go ahead and specify the

42:30transformation at the point

42:32at which you load. So the other way of

42:34doing this is to say, we can actually just

42:37do this, let's say,

42:38let's create another V, let's create a v3.

42:41And show you that this works regardless, go

42:43ahead and hit

42:44run, good table v3 has been created, we'll

42:48go ahead and copy this. And we'll paste

42:51this over here.

42:52Call this v3. And then what you can say is

42:55from and now this time, what you're going

42:58to do is

42:58you're going to do a slightly different

43:00notation for referencing the stage in the

43:02first place. Okay,

43:03so for this one, let me just double check

43:05my notes. And what you want to do is make

43:08sure that

43:08you're basically going to say from this is

43:11the stage, we're going to grab that, okay.

43:14And what I'll do on a new line, what I

43:17should have actually done is this, I should

43:20have said

43:21is copy into table from this stage. And

43:24then here, you can specify the file format.

43:27So file format,

43:28I'll go ahead and copy this actually,

43:30because I think this is basically the same

43:33thing. So

43:33basically, you're specifying the file

43:36format right here. And for this, you don't

43:39need this

43:40brackets, you can just go ahead and do that

43:44. And I think I'm pretty sure that's it. I'm

43:48pretty sure

43:48that's it. You just need to close this off.

43:50I think that's it. Let's go ahead and try

43:53this.

43:53If we get any errors, we'll try and debug

43:55it anyway. And let's see what happens. So

43:57okay,

43:57we've got an error file format. I think

44:00maybe I do need the brackets. Let's go

44:02ahead and

44:02check that. Go ahead and run that. U

44:07nexpected, close brackets in line 46. So we

44:14don't need that.

44:14So what have I done wrong this time? Let's

44:17go and find out. I think potentially, this

44:24information

44:25here needs to come in here. So rather than

44:29referencing my file format, which I handily

44:33prepared, I need to put this here. And I

44:38think if I reference this, I need to put

44:42brackets around

44:43this. So let's go ahead and do that here.

44:47Right, so I think this might be it. Let's

44:51go ahead and try

44:52this. Here we go. I have no clue why. It's

44:56almost terribly inconsistent. So up here,

45:01you put the

45:01brackets round and out here, you put it

45:03round the file format. What I'm not clear

45:05about is why I

45:06couldn't just reference the file format

45:08directly. Or maybe I didn't do it right.

45:11Maybe I got

45:11something wrong. And I didn't reference the

45:14file format correctly here. And I would

45:16have I would

45:17have thought I would have thought that that

45:20would have worked. But maybe you know what

45:22's the issue

45:23is, let me know in the comments below. I

45:25think this video has been plenty long

45:26enough, we've

45:27done some live troubleshooting. And we've

45:29actually got there a couple of different

45:30ways. So if I

45:31actually go ahead and refresh, you see that

45:33we've actually loaded three tables now with

45:36information,

45:371000 rows. And of course, now we can

45:40actually just go and do a basically a basic

45:44query here just to

45:45show that this is done select star from

45:48customer table, we've got three 3000 rows,

45:53what do you

45:54want to choose? Go ahead, run that and we

45:57should see a lovely table of information.

46:01Let it load.

46:03There we go. So there's our loaded data. We

46:05've been able to load that. And it's been

46:08hopefully

46:08an interesting lesson and journey for

46:10everyone who's watched this. And yeah,

46:11thanks very much for

46:12tuning into the very end and watching even

46:14the troubleshoot. I think the troubles

46:16hooting is an

46:17important part because you kind of learn I

46:18think you learn a lot by troubleshooting.

46:20Now, when I

46:21learned this for the first time, what was

46:24really interesting to me is that this this

46:26whole middle

46:27section here was almost sort of not

46:29something that I saw a lot of people doing,

46:30they would create the

46:31stage and then they just go create the

46:33table and they just go load it into the

46:35table. So we've done

46:36a few extra steps. And we've actually used

46:38some of that information to help us do the

46:40loading.

46:41But hopefully you've seen all the different

46:43ways of doing stuff, writing different

46:44queries that

46:45help you do all these things. And yeah,

46:47hopefully you've also learned something

46:49interesting and

46:50new. Thanks for watching. And I'll catch

46:52you in the next video. And by the way, if

46:54you've got any

46:55feedback, please let me know I really,

46:56really appreciate it. There's a whole new

46:58different

46:59place. You know, when I'm showing Tableau

47:01or showing some visual tools, it's much

47:03easier

47:03for people to follow along. But when you're

47:05showing code, it's a completely different

47:07thing.

47:07And so I really appreciate any support and

47:10feedback that you can give to make these a

47:12little bit better. And do you like watching

47:15me troubleshoot live? Or would you rather a

47:17completely

47:18polished video with no mistakes that you

47:20can sort of follow along more clearly? Let

47:22me know in the

47:23comments below. We'll find out what people

47:25think. Thanks for watching, and I'll catch

47:27you in the next

In this video, I show you how to load data from a stage that holds data in AWS S3 into a table in Snowflake. This is an end-to-end tutorial, including the troubleshooting I had to do to get my SQL working.

Resources I mention in the video: Link to GitHub with the SQL https://bit.ly/3OeANzy Create table https://bit.ly/3tEJ6eu Create stage https://bit.ly/3XnBAmb Querying data stored in staged files https://bit.ly/3V0zE1n Copy into https://bit.ly/3EFWfdG File format https://bit.ly/3AqPQka

Timestamps: 0:00 Intro 0:29 Setting up 0:44 Comments and overview 1:48 Create a database in Snowflake 3:34 Create a schema in Snowflake 5:17 Create a stage in Snowflake 9:45 View metadata of your stage 11:22 View the files in your stage 12:42 How to query the data in your stage 19:50 File formats in Snowflake 24:33 Preview with a transformation 28:39 Create Table in Snowflake 33:11 Copy data into the table in Snowflake 36:04 A long spell of troubleshooting 42:25 The other way to copy into in Snowflake 45:30 Query our data in Snowflake 46:51 Outro ‍