Trying to go through for all the announcements for a product can be sometimes overwhelming. It takes time as you need to go through for all the individual press announcements. To ease the pain, I’ve gathered a curated list of new Snowflake features released at the first annual Snowflake Summit at sunny California at Hotel Hilton.
So let’s begin.
Snowflake released new features which can be broken down into to following categories: Core Data Warehouse, Data Pipelines, Global Snowflake and Secure Data Sharing. I’ll list the features and give a more precise description of the feature (if possible).
Core Data Warehouse
Core Data Warehouse features are the bread and butter of Snowflake. Everybody is already familiar with Snowflake features such as virtual warehouses, separation of storage and compute so Snowflake didn’t release any announcements on those. They though said that they are working on decreasing concurrency latency and making large ad-hoc queries to work even better. In larger scale this means that the boundaries of operational databases and operational data platforms are narrowing down.
On the security side, Snowflake told that they are working with multiple vendors on enabling identity passing from BI -tools to Snowflake. Basically this means that your user id which you use to log on to for example to Tableau, is passed straight to Snowflake. This will enable even better row-level security and secure views possibilities. As of today, Snowflake reminded, that Snowflake already supports OAuth 2.0 as an authentication method.
Guys and gals at Snowflake are also keen on providing a richer experience for the end users. They told that they had gathered information from the end users, to provide better worksheets, Worksheet 3.0 so to speak. The end result of that was that during the investigation phase they encountered company called Numeracy and eventually decided to make their first acquisition as Numeracy was working on creating even better UI for Snowflake. Now the features of Numeracy UI are being ported into Snowflake UI. Numecary UI provides better editor suggestions, visualizations and provides worksheet sharing. On text this doesn’t sound anything new, but my colleague took a video of the new UI and it is awesome.
Data Pipelines is a new feature coming into Snowflake. Basically Snowflake has now the capability to auto-ingest data coming into the cloud storage layer and the possibility to do a data transformation for the data based on user-defined tasks. This means that you no longer need any external scheduler to trigger the small ELT or scheduled jobs.
This is possible due to the following new features:
– Streams and Tasks
During the Data Pipelines presentation a Snowflake Connector for the Kafka was also announced.
Under the Data Pipelines headline, the concept of Snowflake as a Data Lake was introduced more in detail because Snowflake will now support external tables and SQL over external tables, which means that you don’t need to load the data into Snowflake to get the insight of the data (and data structure). To make things interesting, Snowflake will now also support materialized views on external tables.
As a surprise effect, all the features which were were listed above are available today in public preview.
Under the title of Global Snowflake the new regions were introduced. AWS got Canada Central, US-Central (Ohio) and Japan. For Azure, new regions will open in US-West-2, Canada Central, Government and Singapore.
The most anticipated release was the release of Google Cloud version of Snowflake, which will be on preview at Q3 at 2019. Google Cloud supported stages are though possible already, as noted earlier.
Snowflake Organizations was published as a new way to control your Snowflake instances within a large corporation. Now you have the possibility to deploy Snowflake instances in your chosen cloud and region through Snowflake UI and you can act as organization admin. Organizations will provide a dashboard of warehouse and storage costs of all your accounts. So, if your company has multiple Snowflake accounts, this the new way to go.
Finally, Database Replication was announced. Database replication offers the possibility to replicate your data, per database level, to another Snowflake account regardless whether the account is Azure, AWS or Google Cloud. It doesn’t even matter on which region you’re planning to replicate your data as database replication supports cross-region replications.
Database replication is a point-and-click version of the traditional Oracle Data Guard or Microsoft SQL Server AlwaysOn Cluster where you define database which you want to replicate to a different location for business continuity purposes or just to provide read-only data nearer to the end users. The difference to the Oracle and Microsoft versions is that Snowflake implementation works out-of-the-box without any hassle. Database replication also supports failover and failback for client application end user point, meaning that if you have Tableau Server connected to Snowflake and you do a failover the database, Tableau Server will reconnect to the new location.
Secure Data Sharing
On Secure Data Sharing field, Snowflake introduced their concept of a data marketplace, called Data Exchange. Basically, Data Exchange is a marketplace for data that can be used through Snowflake Data Sharing. You can buy data shares (for example consumer travel data, weather data etc.) through Data Exchange and use them straight away in your SQL queries.
There are two options for this, Standard and Personalized. Standard -data is your normal data set, available for everybody. Personalized -data requires authentication between you and the data provider, but it will grant you the possibility to ingest your own data (if possible) straight from the provider using Data Sharing -functionality.
If that’s not all, Snowflake even advertised in mid-sentence, a simpler way to deploy Tableau into AWS and configure it to use Snowflake, now in 30 minutes.
Hopefully, this list helps you to get a grasp on things to come on Snowflake landscape. I will update this list if new features will be launched tomorrow or Thursday.
Link list in raw format.